Multiple node processes support with a single datastore file #105

Closed
realguess opened this issue Dec 6, 2013 · 20 comments

@realguess

I was running two node servers, each on different ports, but both reading and writing to the same persistent datastore file. It looks like each node process is reading its own copy of the doc from the file.

For example, the first node process creates the first record below, and the second node process creates the second record. A dump of the data file contains both records, but each process can only read the one it created itself.

{"path":"/path","time":1386333352657,"_id":"FLRlkL2deQMA1RyZ"}
{"path":"/path","time":1386333358367,"_id":"BzTn0GRGq67Mwco8"}

@szwacz
Contributor

szwacz commented Dec 6, 2013

Yes, that's exactly what is happening. See #99.
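
A minimal sketch of the behavior, assuming a store that reads the file once at startup and afterwards serves every read from memory. `TinyStore` and `twoProcessDemo` are hypothetical stand-ins, not NeDB's API, and two instances in one script stand in for the two node processes.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical stand-in for a datastore that loads the file once and then
// answers every read from its in-memory copy.
class TinyStore {
  constructor(filename) {
    this.filename = filename;
    this.docs = [];
    if (fs.existsSync(filename)) {
      this.docs = fs.readFileSync(filename, 'utf8')
        .split('\n')
        .filter((line) => line.trim() !== '')
        .map((line) => JSON.parse(line));
    }
  }

  // Append to the shared file and update only this instance's memory.
  insert(doc) {
    fs.appendFileSync(this.filename, JSON.stringify(doc) + '\n');
    this.docs.push(doc);
  }

  // Reads never touch the file again.
  find() {
    return this.docs.slice();
  }
}

function twoProcessDemo() {
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'tinystore-'));
  const file = path.join(dir, 'data.db');

  // Both "processes" start while the file is still empty.
  const processOne = new TinyStore(file);
  const processTwo = new TinyStore(file);

  processOne.insert({ path: '/path', time: 1386333352657, _id: 'FLRlkL2deQMA1RyZ' });
  processTwo.insert({ path: '/path', time: 1386333358367, _id: 'BzTn0GRGq67Mwco8' });

  const linesOnDisk = fs.readFileSync(file, 'utf8').trim().split('\n').length;
  return {
    linesOnDisk,
    seenByOne: processOne.find().map((d) => d._id),
    seenByTwo: processTwo.find().map((d) => d._id),
  };
}
```

Calling `twoProcessDemo()` leaves two lines on disk, while each instance lists only the `_id` it inserted itself.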

@realguess
Author

So connection management will not be implemented at all, and I should look to MongoDB or something else as an alternative in this situation?

@louischatriot
Owner

That's what I would have said before, but a lot of people have asked me for this feature, and since it's not a lot of work I'll probably do it. One thing you should keep in mind, though, is that NeDB keeps a copy of the database in RAM, so it is not suited for large applications with more than 1M records. If that's your case, you should go with MongoDB.

@eldad87

eldad87 commented Feb 26, 2014

"so is not suited for large applications where you have more than 1M records"

Really?!

@louischatriot
Owner

@eldad87 of course, it's also written at the top of the README. Large-scale databases need to be external processes written in a fast language such as C; there are too many limitations in a Node.js environment.

@ghost

ghost commented Mar 12, 2014

Not sure whether to wake this or #99 but I was wondering if Louis had made any progress with the concept of multiple processes accessing a DB?

I don't think you'd ever consider multiple CRUD processes but the concept of having a process to perform updates and a separate process returning queries would be quite useful I think?

I have a Node game backend where I really need separate CRUD and Query processes because a single-threaded CRUD process is fine (and indeed has considerable integrity benefits) but a single-threaded query process couldn't cope with the likely demand...

In an ideal world I'd use MongoDB but there are reasons why I'd rather not in this case...

@louischatriot
Owner

I began working on this: https://github.com/louischatriot/nedb-server which is basically an Express server around a single NeDB instance, allowing concurrent access by different processes. I didn't have the time to finish it and I'm not sure when I will find the time, but it is pretty simple, just a REST wrapper. If anyone wants to take over the project and finish it, I'll be happy to review a PR!
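
A rough sketch of the "REST wrapper around a single instance" idea, for illustration only. It is not the code of nedb-server: it uses Node's built-in `http` module instead of Express to stay dependency-free, an in-memory `Map` stands in for the NeDB instance, and the `/docs` route, the function names and port 3000 are all assumptions.

```javascript
const http = require('http');

// In-memory collection owned by exactly one process (stand-in for a single
// NeDB instance; the names below are illustrative, not nedb-server's API).
function createStore() {
  const docs = new Map();
  let nextId = 1;
  return {
    insert(doc) {
      const stored = { ...doc, _id: String(nextId++) };
      docs.set(stored._id, stored);
      return stored;
    },
    findAll() {
      return Array.from(docs.values());
    },
  };
}

// Pure routing function: maps (method, url, body) to a status and payload.
function route(store, method, url, body) {
  if (url !== '/docs') return { status: 404, payload: { error: 'not found' } };
  if (method === 'GET') return { status: 200, payload: store.findAll() };
  if (method === 'POST') {
    let doc;
    try {
      doc = JSON.parse(body);
    } catch (err) {
      return { status: 400, payload: { error: 'invalid JSON' } };
    }
    return { status: 201, payload: store.insert(doc) };
  }
  return { status: 405, payload: { error: 'method not allowed' } };
}

// Thin HTTP layer: every client process talks to this one server, so all
// reads and writes go through the same in-memory copy.
function createServer(store) {
  return http.createServer((req, res) => {
    let body = '';
    req.on('data', (chunk) => { body += chunk; });
    req.on('end', () => {
      const { status, payload } = route(store, req.method, req.url, body);
      res.writeHead(status, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify(payload));
    });
  });
}

// Usage (not run here; the port is an arbitrary choice):
// createServer(createStore()).listen(3000);
```

Keeping the routing in a plain function makes the behavior easy to check without opening a socket; the HTTP layer only parses the request and serializes the result.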

@ghost

ghost commented Mar 13, 2014

I'm new to Express (and, indeed, Node as a server thing - I use it in bundled web apps mostly) but isn't that approach still 'single threaded' for all requests made to the Express 'server'?

You could multi-thread it using something like Node's cluster module, but then you'd run the risk of conflicting CRUD requests (one thread deletes a record whilst another is updating it, leaving it either undeleted and updated, or updated and then deleted, at random).
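
To make that race concrete, here is a small sketch. It assumes an append-only datafile in which the last line written for a given `_id` wins when the file is loaded, and in which a deletion is recorded as a line carrying `$$deleted: true`. That is how NeDB's datafile is usually described, but treat it as an assumption here; `replay` is a hypothetical helper, not part of NeDB.

```javascript
// Rebuild the in-memory state from an append-only log: later lines override
// earlier ones for the same _id, and a tombstone removes the document.
function replay(lines) {
  const docs = new Map();
  for (const line of lines) {
    const entry = JSON.parse(line);
    if (entry.$$deleted) {
      docs.delete(entry._id);
    } else {
      docs.set(entry._id, entry);
    }
  }
  return Array.from(docs.values());
}

const created = JSON.stringify({ _id: 'a1', score: 1 });
const updated = JSON.stringify({ _id: 'a1', score: 2 });          // written by worker A
const deleted = JSON.stringify({ _id: 'a1', $$deleted: true });   // written by worker B

// Same two operations, two possible arrival orders, two different outcomes.
const updateThenDelete = replay([created, updated, deleted]);
const deleteThenUpdate = replay([created, deleted, updated]);
```

With the update landing first the record ends up deleted; with the delete landing first the record survives in its updated form. Which one happens depends only on which worker reaches the file first.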

When I think about it, the best solution would be to have 2 separate DB files: a read-only one which is accessed by a multi-threaded query 'server' and a writable one which is accessed by a single-threaded CRUD server. But I've no idea how practical that is, or how we'd tell NeDB to write the file, copy the file and re-read the file on demand!?

@louischatriot
Owner

With the kind of load this kind of project is expected to have, a single-threaded instance for both reads and writes is more than enough, even without indexing.

@ghost

ghost commented Mar 13, 2014

I think what I'm saying is that there's a practical application of NeDB where you have

1 single-threaded CRUD process
Multiple multi-threaded query processes

It's not hard to think of about a million applications for that - any website using NeDB as its backend is a good example tho.

Someone else talked about simply copying the updated db file periodically - which would be fine if there were a mechanism to schedule a copy of the db file at a point it's guaranteed to be consistent (e.g. when all updates have been written and none are pending).

You'd also need a way to tell the query processes to re-read the file of course - and you'd lose any performance benefits from the query processes working from an in-memory copy - which may be considerable.
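
One way the copy-then-re-read idea could look, as a sketch under stated assumptions rather than an existing NeDB feature. `publishSnapshot`, `createReader` and `snapshotDemo` are hypothetical names; the writer is assumed to call `publishSnapshot` only when it has no pending updates, and how the readers are told to reload (a signal, an IPC message, a timer) is left open.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Writer side: publish a consistent snapshot. Writing to a temporary file and
// renaming it means readers see either the old or the new file, never a
// half-written one (rename is atomic on POSIX filesystems within one volume).
function publishSnapshot(docs, snapshotPath) {
  const tmpPath = snapshotPath + '.tmp';
  const body = docs.map((doc) => JSON.stringify(doc)).join('\n');
  fs.writeFileSync(tmpPath, body);
  fs.renameSync(tmpPath, snapshotPath);
}

// Reader side: keep an in-memory copy and replace it only on reload().
function createReader(snapshotPath) {
  let docs = [];
  return {
    reload() {
      const text = fs.readFileSync(snapshotPath, 'utf8');
      docs = text === '' ? [] : text.split('\n').map((line) => JSON.parse(line));
    },
    find(predicate) {
      return docs.filter(predicate);
    },
  };
}

function snapshotDemo() {
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'snapshot-'));
  const snapshotPath = path.join(dir, 'readonly.db');

  publishSnapshot([{ _id: '1', level: 3 }], snapshotPath);
  const reader = createReader(snapshotPath);
  reader.reload();

  // The writer moves on; the reader keeps serving the old snapshot...
  publishSnapshot([{ _id: '1', level: 3 }, { _id: '2', level: 7 }], snapshotPath);
  const before = reader.find(() => true).length;

  // ...until it is told to re-read.
  reader.reload();
  const after = reader.find(() => true).length;
  return { before, after };
}
```

In this sketch the reader still answers queries from memory between reloads, so the cost of re-reading is paid once per published snapshot rather than once per query.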

I'm talking myself into MongoDB here aren't I?

@louischatriot
Owner

I think so :) When you arrive at this level of requirements you definitely want a full-blown database solution!

@ghost

ghost commented Mar 13, 2014

Yeah - if for no other reason than the moment you start thinking about 'scaling' you really have to consider that you'll probably want your processes to run not just 'multi threaded' but as multiple completely separate processes - perhaps on completely separate machines in different places entirely!!

A bit of testing I did using Express did show that using Node Cluster would offer a considerable improvement in the query load a single server could manage, tho - so I still think there's room to consider a mechanism (if one doesn't already exist) to enable 'safe' copying and updating of the db file any NeDB app is using (e.g. force write and callback - force read and callback)???

@sewe75

sewe75 commented Mar 17, 2014

@shrewdlogarithm: Did you think about combining http://elasticsearch.org and MongoDB? Using them together would give you extremely fast queries on a large amount of data via ElasticSearch while MongoDB remains responsible for concurrent CRUD ops.
Here's a short blog entry about ES and MongoDB: http://www.usesold.com/blog/2013/08/06/In-2-Minutes-from-Zero-to-elasticsearch-with-Mongoose.html

@Meekohi

Meekohi commented Mar 18, 2014

I wish this had been clearly described in the initial README. Wasted a lot of time implementing a solution with this before realizing this major issue.

@ghost

ghost commented Mar 18, 2014

@Meekohi - I'm not sure what's not clearly described - the README outlines its use as a low-concurrency in-memory datastore suitable for desktop and low-access web apps - I just raised the issue of parallel 'read' threads because that has possible benefits within that space (IMO)

You won't find a 'native' solution for high-concurrency data access because Javascript just isn't a platform which would adapt well to that - I'm unaware of anything remotely like that right now (which is why I'm testing the boundaries of where NeDB could go, really).

Good news - it's pretty much 100% MongoDB compliant - switching to that would be relatively easy (from your code PoV, if not in installing, configuring and understanding MongoDB :)

It's a super-common mistake people make when storing data tho - not actually thinking-through how they're going to access/update it and what implications that has for their design and choice of tools.

I didn't find easy work as an SQL (Oracle) DBA for decades because this was an easy thing to get right, and that was before even a fraction of the solutions which exist now were available.

@Meekohi

Meekohi commented Mar 18, 2014

Hey don't get me wrong it's a great project :) Indeed I switched over to MongoDB in about 30 minutes.
I'm not sure I understand the distinction between "low-concurrency" and "high-concurrency" here. It seems to me that NeDB supports "no concurrency" across multiple processes, since each process gets its own copy of the database and there is no way I could discover to sync the view two processes have of the database.

@ghost

ghost commented Mar 18, 2014

Javascript is inherently single-threaded so you're right in that it's "no concurrency" really.

Understand, tho, that almost all DBMSs perform CRUD as a single-threaded task (it's almost impossible to consider any other option) - it's only queries where you have some freedom to multi-thread (with the proviso that maintaining a consistent view of the data can be expensive)

Actual cases of people with high-query-traffic databases which are also high-CRUD-traffic are very, very scarce tho - hence my suggesting ways of sharing the db file to enable queries/copy-over updates once the queries are done...

@louischatriot
Owner

Closing as no activity

@allain

allain commented Feb 20, 2016

If anyone's interested, I'm trying to do the same for nedb as level-party does for leveldb:

https://github.com/allain/nedb-party

It uses RPC to send requests to the process that "owns" the db. It's still early days, but I hope to commit the time to make the approach robust. At present even the owning process's requests are sent through the proxy.
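
How a process comes to "own" the db is not described in this thread. One common approach is an exclusive lock file, sketched below as an assumption and not as nedb-party's actual mechanism; `tryBecomeOwner`, `releaseOwnership` and the `.lock` file name are hypothetical.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Try to become the owner by creating the lock file exclusively. The 'wx'
// flag makes the open fail with EEXIST if the file is already there.
function tryBecomeOwner(lockPath) {
  try {
    const fd = fs.openSync(lockPath, 'wx');
    fs.writeSync(fd, String(process.pid));
    fs.closeSync(fd);
    return true;
  } catch (err) {
    if (err.code === 'EEXIST') return false;
    throw err;
  }
}

// The owner removes the lock on shutdown so another process can take over.
function releaseOwnership(lockPath) {
  fs.unlinkSync(lockPath);
}

function ownershipDemo() {
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'owner-'));
  const lockPath = path.join(dir, 'data.db.lock');

  const first = tryBecomeOwner(lockPath);   // wins, would open the datastore
  const second = tryBecomeOwner(lockPath);  // loses, would proxy to the owner
  releaseOwnership(lockPath);
  const third = tryBecomeOwner(lockPath);   // lock is free again
  return { first, second, third };
}
```

The sketch does not handle a stale lock left behind by an owner that crashed; a real implementation would need some way to detect and recover from that.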

@vangelov

vangelov commented May 6, 2017

Recently I struggled with this problem and came up with a solution that is similar to nedb-party but a bit more robust: https://github.com/vangelov/nedb-multi. Hope it helps someone.
