Multiple node processes support with a single datastore file #105
Comments
Yes, that's exactly what is happening. See #99. |
So connection management will not be implemented at all, and I should look to MongoDB or another database as an alternative in this situation? |
That's what I would have said before but a lot of people asked me this
|
@eldad87 of course, it's also written at the top of the README. Large-scale databases need to be external processes written in a fast language such as C; there are too many limitations in a Node.js environment. |
Not sure whether to wake this or #99 but I was wondering if Louis had made any progress with the concept of multiple processes accessing a DB? I don't think you'd ever consider multiple CRUD processes but the concept of having a process to perform updates and a separate process returning queries would be quite useful I think? I have a Node game backend where I really need separate CRUD and Query processes because a single-threaded CRUD process is fine (and indeed has considerable integrity benefits) but a single-threaded query process couldn't cope with the likely demand... In an ideal world I'd use MongoDB but there are reasons why I'd rather not in this case... |
I began working on this: https://github.com/louischatriot/nedb-server, which is basically an Express server around a single NeDB instance, allowing concurrent access by different processes. I didn't have the time to finish it and I'm not sure when I will find the time, but it is pretty simple, just a REST wrapper. If anyone wants to take the project and finish it I'll be happy to review a PR! |
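To illustrate the idea of a REST wrapper around a single datastore instance, here is a minimal sketch. It is not the nedb-server code: it swaps Express for Node's built-in `http` module to stay dependency-free, and the `docs` array, the `handle` function and the `/docs` route are hypothetical stand-ins for the one NeDB instance and its API.

```javascript
const http = require('http');

// Hypothetical in-memory stand-in for the single datastore instance that the
// server process owns. Client processes never open the data file themselves.
const docs = [];

// Routing function: every read and write from every client goes through here.
function handle(method, url, body) {
  if (method === 'POST' && url === '/docs') {
    let parsed;
    try {
      parsed = JSON.parse(body);
    } catch (err) {
      return { status: 400, payload: { error: 'invalid JSON' } };
    }
    const doc = Object.assign({ _id: String(docs.length + 1) }, parsed);
    docs.push(doc);
    return { status: 201, payload: doc };
  }
  if (method === 'GET' && url === '/docs') {
    return { status: 200, payload: docs };
  }
  return { status: 404, payload: { error: 'not found' } };
}

const server = http.createServer((req, res) => {
  let body = '';
  req.on('data', (chunk) => { body += chunk; });
  req.on('end', () => {
    const { status, payload } = handle(req.method, req.url, body);
    res.writeHead(status, { 'Content-Type': 'application/json' });
    res.end(JSON.stringify(payload));
  });
});

// The port is an arbitrary choice; uncomment to actually serve requests.
// server.listen(3000);
```

Because only the server process holds the data, two client processes that POST and then GET see the same documents, which is the behaviour the original report was missing.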
I'm new to Express (and, indeed, Node as a server thing - I use it in bundled WebAPPs mostly) but isn't that approach still 'single threaded' for all requests made to the Express 'server'? You could multi-thread it using something like Node Cluster but then you'd run the risk of conflicting CRUD requests (one thread deletes a record whilst another is updating it - leaving it undeleted and updated or updated and then deleted at-random). When I think about it - the best solution would be to have 2 separate DB files - a read-only one which is accessed by a multi-threaded query 'server' and the writable one which is accessed by a single-threaded CRUD server - but I've no idea how practical that is - how we'd tell NEDB to write the file/copy the file and re-read the file on-demand!? |
With the kind of load this kind of project is expected to have, a single-threaded instance for both read and write is more than enough, even without indexing. |
I think what I'm saying is that there's a practical application of NEDB where you have 1 single-threaded CRUD process. It's not hard to think of about a million applications for that - any website using NEDB as its backend is a good example tho. Someone else talked about simply copying the updated db file periodically - which would be fine if there were a mechanism to schedule a copy of the db file at a point it's guaranteed to be consistent (e.g. when all updates have been written and none are pending). You'd also need a way to tell the query processes to re-read the file of course - and you'd lose any performance benefits from the query processes working from an in-memory copy - which may be considerable. I'm talking myself into MongoDB here aren't I? |
I think so :) When you arrive at this level of requirements you definitely want a full-blown database solution ! |
Yeah - if for no other reason than the moment you start thinking about 'scaling' you really have to consider that you'll probably want your processes to run not just 'multi threaded' but as multiple completely separate processes - perhaps on completely separate machines in different places entirely!! A bit of testing I did using Express did show that using Node Cluster would offer considerable improvement in the load a single server could manage in query terms tho - so I still think there's room to consider a mechanism (if one doesn't already exist) to enable 'safe' copying and updating of the db file any NEDB app is using (e.g. force write and callback - force read and callback)??? |
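The "force write, then force read" idea can be sketched with plain Node file operations. This is only an illustration under stated assumptions, not an NeDB feature: `publishSnapshot` and `loadSnapshot` are hypothetical helpers, the documents are stored one JSON object per line, and the rename is assumed to be atomic, which holds on POSIX filesystems when both paths are on the same volume.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Writer side: dump the documents to a temporary file, then rename it over the
// snapshot. A reader that opens the snapshot gets the old version or the new
// one, never a half-written file.
function publishSnapshot(docs, snapshotPath) {
  const tmpPath = snapshotPath + '.tmp';
  fs.writeFileSync(tmpPath, docs.map((d) => JSON.stringify(d)).join('\n'));
  fs.renameSync(tmpPath, snapshotPath);
}

// Reader side: rebuild the in-memory copy from the latest snapshot.
function loadSnapshot(snapshotPath) {
  const text = fs.readFileSync(snapshotPath, 'utf8');
  return text.split('\n').filter(Boolean).map((line) => JSON.parse(line));
}

const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'snapshot-'));
const snapshot = path.join(dir, 'read-only.db');

// The single CRUD process publishes a consistent snapshot...
publishSnapshot([{ _id: 1, score: 10 }], snapshot);
let readerCopy = loadSnapshot(snapshot);

// ...and publishes again after more writes. A query process keeps serving its
// old in-memory copy until it is told to reload.
publishSnapshot([{ _id: 1, score: 10 }, { _id: 2, score: 7 }], snapshot);
const staleCount = readerCopy.length;   // 1: the reader has not reloaded yet
readerCopy = loadSnapshot(snapshot);
const freshCount = readerCopy.length;   // 2: the reader has picked up the update
console.log(staleCount, freshCount);    // prints "1 2"
```

The remaining piece, telling the query processes when to reload, is left out; it could be a timer, a file watcher or a message from the writer, each with its own trade-offs.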
@shrewdlogarithm: Did you think about combining http://elasticsearch.org and MongoDB? Using that together would give you extremely fast queries on a large amount of data via ElasticSearch while MongoDB keeps responsible for concurrent CRUD ops. |
I wish this had been clearly described in the initial README. Wasted a lot of time implementing a solution with this before realizing this major issue. |
+Meeohi - I'm not sure what's not clearly described - the README outlines its use as a low-concurrency in-memory datastore suitable for desktop and low-access WebApps - I just raised the issue of parallel 'read' threads because that has possible benefits within that space (IMO). You won't find a 'native' solution for high concurrency data access because Javascript just isn't a platform which would adapt well to that - I'm unaware of anything remotely like that right now (which is why I'm testing the boundaries of where NEDB could go, really). Good news - it's pretty-much 100% MongoDB compliant - switching to that would be relatively easy (from your code PoV if not in installing, configuring and understanding MongoDB :) It's a super-common mistake people make when storing data tho - not actually thinking-through how they're going to access/update it and what implications that has for their design and choice of tools. I didn't find easy work as an SQL (Oracle) DBA for decades because it was an easy thing to get right and that was before there were a fraction of the possible solutions which exist now. |
Hey don't get me wrong it's a great project :) Indeed I switched over to MongoDB in about 30 minutes. |
Javascript is inherently single-threaded so you're right in that it's "no concurrency" really. Understand, tho, that almost all DBMS's perform CRUD as a single-threaded task (it's almost impossible to consider any other option) - it's only queries where you have some freedom to multi-thread (with the proviso that maintaining a consistent view of the data can be expensive). Actual cases of people with high-query-traffic databases which are also high-CRUD-traffic are very, very scarce tho - hence my suggesting ways of sharing the db file to enable queries/copy-over updates once the queries are done... |
Closing as no activity |
If anyone's interested, I'm trying to do the same for nedb as level-party does for leveldb: https://github.com/allain/nedb-party. It uses RPC to send requests to the process that "owns" the db. It's still early days, but I hope to be committing the time to make the approach robust. At present even the owning process's requests are sent through the proxy. |
Recently I struggled with this problem and came up with a solution that is similar to |
I was running two node servers, each on different ports, but both reading and writing to the same persistent datastore file. It looks like each node process is reading its own copy of the doc from the file.
For example, node process one creates the first record below, and the second node process creates the second record. The dump of the data file contains both records, but each process can only read the one that it created itself.
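The behaviour described above can be reproduced with a deliberately simplified model. This is not NeDB's code: `ToyStore` is a hypothetical stand-in that, like an in-memory datastore with file persistence, reads the file once at load time, answers queries from memory, and appends each insert to the file. Two instances in one script stand in for the two node processes.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical toy datastore: in-memory documents backed by an append-only file.
class ToyStore {
  constructor(filename) {
    this.filename = filename;
    this.docs = [];
    // The file is read once, at load time. Later queries never touch the disk.
    if (fs.existsSync(filename)) {
      const lines = fs.readFileSync(filename, 'utf8').split('\n').filter(Boolean);
      this.docs = lines.map((line) => JSON.parse(line));
    }
  }

  insert(doc) {
    this.docs.push(doc); // update this instance's memory
    fs.appendFileSync(this.filename, JSON.stringify(doc) + '\n'); // persist
  }

  find() {
    return this.docs.slice(); // served from memory only
  }
}

const file = path.join(fs.mkdtempSync(path.join(os.tmpdir(), 'toystore-')), 'data.db');

// Both "processes" load the (still empty) datastore before any write happens.
const processOne = new ToyStore(file);
const processTwo = new ToyStore(file);

processOne.insert({ by: 'process one' });
processTwo.insert({ by: 'process two' });

const onDisk = fs.readFileSync(file, 'utf8').trim().split('\n').length;
console.log(processOne.find().length, processTwo.find().length, onDisk);
// prints "1 1 2": the file holds both records, each instance sees only its own
```

In this model neither instance ever learns about the other's writes, because nothing re-reads the file after startup.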