LevelDB & Node.js 'real world' use cases #49

Closed
navaru opened this issue Jan 7, 2013 · 38 comments
Labels
discussion Discussion

Comments

@navaru

navaru commented Jan 7, 2013

I've read the docs on LevelDB and some topics on the LevelDB Google Group & StackOverflow, and I understand what it was built for.

What I want to know is what some of your use cases are, and in which scenarios you think LevelDB & Node.js are a good fit.

I am not very experienced with DBs, but I would like to learn, and there's not much info on LevelDB & Node.js.

Thank you :)

@juliangruber
Member

It empowers you to write your own db, from a super light but fast filesystem abstraction to a beefed-up server with custom replication schemes and other application logic right at the heart of this super fast thing. And all in one language, on client, server and database.

@Raynos
Member

Raynos commented Jan 7, 2013

You could implement something like the lambda architecture completely on top of leveldb.

It's a basic building block: you can build your own db abstraction, with your own trade-offs, on top of it.

@rvagg
Member

rvagg commented Jan 7, 2013

@navaru LevelDB makes data persistence easy. Instead of storing your data structures in memory, store them in a LevelDB database and fetch them when needed. This way you can store a ton of data without worrying about RAM, and you get to keep that data across restarts. This is how I mainly use it. LevelUP makes it super simple, and fetching & processing large amounts of data via a readStream() is just so nice to work with. If you use the inbuilt JSON encoding then you get to pretend the data is in memory (except where serialisation/deserialisation may change the form of your data, like Date objects). Since LevelDB is so fast and all operations are async, the speed impact of storing on disk is hardly noticeable.
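
To illustrate the Date caveat, here is a minimal sketch. It assumes nothing about LevelUP itself: a plain `Map` plus `JSON.stringify`/`JSON.parse` stands in for a store with JSON value encoding, and the `put`/`get` helpers are hypothetical names used only for this example.

```javascript
// Stand-in for a store that uses JSON value encoding: values are serialised
// on write and parsed on read. The Map is only an illustration, not LevelUP.
const store = new Map()

function put (key, value) {
  store.set(key, JSON.stringify(value))
}

function get (key) {
  const raw = store.get(key)
  return raw === undefined ? undefined : JSON.parse(raw)
}

put('session:1', { user: 'john', created: new Date(0) })
const session = get('session:1')

// The Date went in as an object but comes back as an ISO string,
// so it has to be revived by hand.
const created = new Date(session.created)
```

Anything that JSON cannot represent directly (Dates, Buffers, undefined) needs this kind of manual revival after a read.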

@rvagg rvagg closed this as completed Jan 7, 2013
@dominictarr
Contributor

I propose leaving this issue open as a 'discussion issue'.

@dominictarr dominictarr reopened this Jan 8, 2013
@navaru
Author

navaru commented Jan 8, 2013

I've read the article on 'lambda architecture', and I think I've worked with those concepts when using other DBs (batch layer on CouchDB views, serving layer on Riak, etc.). Am I correct?

Because there are no examples: if I need to implement a namespace structure (collections in MongoDB or buckets in Riak), how should I proceed? Do I need to create an architecture first, like the one advised in 'lambda architecture', or can I create a simpler layer to work with?

Do you have any architecture layer available as Open Source or any apps where you've used LevelDB (LevelUP)?

Most of my apps are targeted at small companies, where I don't need to set up a cluster; a single server almost always does more than I need. When I deliver an app I don't want to include an external DB like Mongo or Riak. I want to build a layer on top of LevelDB that suits my needs, like a custom CMS or something in this area.

Thank you for your replies :)

@dominictarr
Contributor

There are two modules available for generating namespaces,
https://github.com/dominictarr/range-bucket and https://github.com/kesla/level-namespace
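
As a rough idea of what namespacing by key prefix looks like, here is a minimal sketch. It does not use the API of either module above; `namespace` is a hypothetical helper, the `!` separator is an assumption, and a `Map` stands in for the database.

```javascript
// Hypothetical helper: wraps a store and prefixes every key with a namespace.
function namespace (store, name, sep = '!') {
  const prefix = name + sep
  return {
    put: (key, value) => store.set(prefix + key, value),
    get: (key) => store.get(prefix + key),
    // List the keys of this namespace only, without the prefix.
    keys: () => [...store.keys()]
      .filter((k) => k.startsWith(prefix))
      .sort()
      .map((k) => k.slice(prefix.length))
  }
}

const store = new Map()
const users = namespace(store, 'users')
const posts = namespace(store, 'posts')

// The same key can live in both namespaces without clashing.
users.put('john', { age: 21 })
posts.put('john', { title: 'hello' })
```

Because all keys of a namespace share a prefix, they sit next to each other in sorted order, which is what makes range reads over one "collection" cheap.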

The way that leveldb is set up internally, it's pretty much its own lambda architecture.
I have a map-reduce module for levelup: https://github.com/dominictarr/map-reduce
Please give feedback if you decide to use it.

Also, see the module listing! https://github.com/rvagg/node-levelup/wiki/Modules It's brand new!

@rvagg
Member

rvagg commented Jan 9, 2013

It occurred to me after writing it that the LevelDB code in Level Session might be a good example of the basics of using LevelDB in Node. It covers the basic operations and a simple readStream() search.

https://github.com/rvagg/node-level-session/blob/master/lib/level-store.js

@jeremybenaim

Hi there,

Not sure that this is the right place to post this, but whatever...

I've put together a small app based on express which relies on leveldb and which might be a good illustration of what you can achieve pretty quickly w/ level-up (and level-namespace)

https://github.com/jeremybenaim/express-leveldb

Any feedback is more than welcome :)

Thanks folks

@juliangruber
Member

looks good, I'd only stream the results of findall to the client instead of buffering them

@jeremybenaim

Thanks, will do that. Btw, do not hesitate to open issues directly on the repo if you have any other suggestions! :)

@Raynos
Member

Raynos commented Jan 12, 2013

@juliangruber you can't stream results to the client with express; express is not built for streams.

@jeremybenaim

Right, but through sockets?

@Raynos
Member

Raynos commented Jan 13, 2013

Sure, through sockets, but then you either need to parse streamed JSON over XHR or websockets, or build a streaming templating engine.

@juliangruber
Member

@Raynos Express only augments the response object; res.write is still there. Express is really unobtrusive.
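
A minimal sketch of what writing results incrementally with res.write could look like. `streamJsonArray` is a hypothetical helper, a synchronous iterable stands in for the database read stream, and backpressure is ignored; the only assumption about `res` is that it has `write()` and `end()`, as Node's http response object does.

```javascript
// Writes rows to `res` one at a time as a JSON array, instead of collecting
// them into one big array first.
function streamJsonArray (rows, res) {
  let first = true
  res.write('[')
  for (const row of rows) {
    // Emit a comma before every row except the first.
    res.write((first ? '' : ',') + JSON.stringify(row))
    first = false
  }
  res.end(']')
}

// Fake response object that records what was written.
const chunks = []
const fakeRes = {
  write: (chunk) => { chunks.push(chunk) },
  end: (chunk) => { if (chunk !== undefined) chunks.push(chunk) }
}

streamJsonArray([{ name: 'John' }, { name: 'Bob' }], fakeRes)
```

With a real read stream the loop body would move into a 'data' handler and the closing bracket into an 'end' handler.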

@Raynos
Member

Raynos commented Jan 13, 2013

@juliangruber the fact that it augments node is really obtrusive.

The fact that it enforces a global lockstepped waterfall pipeline is about as anti-stream as you can get.

@juliangruber
Member

@Raynos it does neither.

@dominictarr
Contributor

This is not an issue about express, guys.

@navaru
Author

navaru commented Feb 10, 2013

Thank you for your help and examples. I've tried most of the modules that I could find on NPM.
I've started to write a CMSish node app. I'm looking to build something closer to mongoose on top of leveldb, tho' I have a lot to learn until I get there.

Taking an example like a 'users' collection, based on the examples I've got something like:

key: users:7blcacc80-72e.. value: { name: John, age: 21 }
key: users:7bcfsdg71-72e.. value: { name: Bob, age: 23 }
key: users:7bsdfsg50-72e.. value: { name: Mike, age: 25 }
...

How would I search for all the users under 22? I could do a map-reduce, but is there a better way when the number of records is very large?
Would implementing something like a 'view' from couchdb be a good approach? Tho I am thinking that whenever I want to do a custom filter, I would always need to have a view?

I've tried to look at how Riak uses leveldb and how search goes there but I found this comment on a post about Riak: "To be clear, search uses its own storage engine, merge_index, which is separate from the KV backend... The search indexes themselves are stored in a merge_index backend, and in the future we may also support a leveldb backend for those search indexes."

Search indexes == views?

TL;DR
How do I implement search in leveldb?

@juliangruber
Member

You can use level-hooks to store young users under a special field. Or store your own indexes, e.g. with B+ trees. Search is kinda hard and we don't have that implemented yet. This could be something where you have to resort to classical computer science.

@Raynos
Member

Raynos commented Feb 10, 2013

@navaru There are two ways you can do searching: incremental map-reduce or re-indexing.

You can create a hook that listens on put and does a second put at users~age:21:{{id}} with { name: John, age: 21 }
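
A minimal sketch of that second put, written as a plain function so it does not depend on any hook API. `userPutOps` is a hypothetical helper and the `{ type, key, value }` operation shape is an assumption modelled on batch operations; the key layout follows the users~age:21:{{id}} pattern above.

```javascript
// Hypothetical helper: turns one logical "put user" into two operations, the
// primary record plus an index entry keyed by age.
function userPutOps (id, user) {
  const value = JSON.stringify(user)
  return [
    { type: 'put', key: 'users:' + id, value },
    { type: 'put', key: 'users~age:' + user.age + ':' + id, value }
  ]
}

const ops = userPutOps('7b1c', { name: 'John', age: 21 })
```

Note that the age is not zero-padded here, so this layout only supports exact-match lookups by age; range queries need padded numbers.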

@jeremybenaim

@navaru or you can use level-hooks to index your documents in elasticsearch, which might not be the best way (since you need an elasticsearch server) but sounds like a solid solution.

Anyway, a full leveldb search layer would surely be better.

@dominictarr
Contributor

using map-reduce gives you basically the same thing as couch db has, except it's incremental and realtime.

If you are gonna use a hook type thing to index in elastic search (or any other db sort of thing) use level-trigger instead, because it will be reliably eventually consistent, even if your process crashes.

if you want to make a separate index you can do it like this with hooks:

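db.hooks.pre(function (batch) {
  // Walk the batch backwards so the index records pushed below are not visited.
  var l = batch.length
  while (l--) {
    if (batch[l].type !== 'put') continue // deletes carry no value to index
    var v = JSON.parse(batch[l].value)
    if ('number' === typeof v.age) {
      // Index key: AGES!<zero-padded age>!<name>; pad() is a zero-padding helper.
      batch.push({type: 'put', key: ['AGES', pad(v.age, 10), v.name].join('!'), value: batch[l].value})
    }
  }
  return batch
})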

Iterate over the batch backwards (so you don't hit the things you are adding)
and add extra records, indexed by the age. Pad age with leading zeros, so 22 is 0000000022,
otherwise 4 will sort after 34. Finally, we add it to the batch, so it's saved atomically with the other data.
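
The snippet above calls pad() without defining it. Here is one possible version, together with a small demonstration of why the padding matters; this `pad` is a hypothetical helper for non-negative integers, not necessarily the one the author had in mind.

```javascript
// Left-pads a non-negative integer with zeros to a fixed width.
function pad (n, width) {
  return String(n).padStart(width, '0')
}

const ages = [34, 4, 22]

// Keys are compared as strings, so unpadded numbers sort character by character.
const unpadded = ages.map(String).sort()
// → ['22', '34', '4']

// With a fixed width, string order and numeric order agree.
const padded = ages.map((n) => pad(n, 10)).sort()
// → ['0000000004', '0000000022', '0000000034']
```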

Join the key with ! so you can sort by range. ! is the first printable ASCII character after the space. For this to work you must not have ! in any part of the key. You could also use null ('\x00'), but I am using ! in this example for clarity.

This will save it twice, but each record will be the same.

Then you can query an age range with db.readStream({start: 'AGES!0000000000!', end: 'AGES!0000000022!~'}).
That will return every document with an age between 0 and 22, inclusive.

(~ is the last printable ASCII character, so use that to represent the end of the range; you could also use \xFF.)
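
To see how the ! and ~ bounds select a range, here is a minimal sketch over an in-memory list of keys. `range` is a hypothetical stand-in for a range read, assuming inclusive bounds and plain string comparison; the keys use the 10-digit padding from pad(v.age, 10).

```javascript
// Stand-in for a range read: returns the keys between start and end
// (inclusive) in sorted order.
function range (keys, start, end) {
  return keys.filter((k) => k >= start && k <= end).sort()
}

const keys = [
  'AGES!0000000004!Amy',
  'AGES!0000000021!John',
  'AGES!0000000022!Zed',
  'AGES!0000000023!Bob',
  'users:1'
]

// The trailing ~ sorts after any name, so every age-22 record is included,
// while age 23 and the unrelated users: key fall outside the range.
const upTo22 = range(keys, 'AGES!0000000000!', 'AGES!0000000022!~')
```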

However, age changes every year, so this will end up with records saved twice.
Use dateOfBirth instead (otherwise example is pretty much the same).

dateOfBirth will change as well - it might be entered incorrectly. If you use map-reduce it will handle this, remembering to delete the old map.
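
If you maintain the index by hand rather than through map-reduce, the old index entry has to be deleted in the same batch. A minimal sketch, assuming you have the previous version of the document at hand; `reindexOps`, the `DOB` prefix and the `users!` key layout are hypothetical names for this example, and dates are assumed to be ISO strings (which sort correctly as text).

```javascript
// Hypothetical helper: builds the operations needed when an indexed field
// changes, so the stale index entry is removed in the same batch.
function reindexOps (id, oldDoc, newDoc) {
  const indexKey = (doc) => ['DOB', doc.dateOfBirth, id].join('!')
  const value = JSON.stringify(newDoc)
  const ops = []
  if (oldDoc && oldDoc.dateOfBirth !== newDoc.dateOfBirth) {
    ops.push({ type: 'del', key: indexKey(oldDoc) })
  }
  ops.push({ type: 'put', key: 'users!' + id, value })
  ops.push({ type: 'put', key: indexKey(newDoc), value })
  return ops
}

const ops = reindexOps('42',
  { name: 'John', dateOfBirth: '1991-05-01' },
  { name: 'John', dateOfBirth: '1992-05-01' })
```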

@deanlandolt

I just released a library for bytewise structural sorting [1] to avoid having to use these kinds of hacks. It will sort numbers and arrays and such correctly, and exactly like couchdb -- except it supports an even wider range of types.

[1] https://github.com/deanlandolt/bytewise

@dominictarr
Contributor

levelup has support for custom comparator functions, but they need to be implemented in C.

@deanlandolt

Great -- I looked for that in the docs but hadn't come across it.

Still, this approach allows you to build pretty much any kind of sorting you could possibly need using a fast, simple binary serialization. It'll let you store anything you can serialize with json (and a lot more), except better, because it'll sort properly, including for numbers and component-wise for arrays.
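
To give a feel for how a binary serialization can make numbers sort properly, here is a minimal sketch of a well-known IEEE 754 trick. It is not bytewise's actual format or API; `encodeNumber` is a hypothetical function that only handles finite numbers and returns hex so the result can be compared as a string.

```javascript
// Encodes a finite number as 16 hex characters whose string order matches
// numeric order.
function encodeNumber (n) {
  const view = new DataView(new ArrayBuffer(8))
  view.setFloat64(0, n) // big-endian by default
  const negative = (view.getUint8(0) & 0x80) !== 0
  let hex = ''
  for (let i = 0; i < 8; i++) {
    let byte = view.getUint8(i)
    if (negative) byte ^= 0xff // negative: flip every bit
    else if (i === 0) byte ^= 0x80 // non-negative: flip only the sign bit
    hex += byte.toString(16).padStart(2, '0')
  }
  return hex
}

const numbers = [34, -1.5, 4, 0, -10, 22]

// Sort by the encoded key only, then read the numbers back out.
const sorted = numbers
  .map((n) => ({ n, key: encodeNumber(n) }))
  .sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0))
  .map((entry) => entry.n)
// → [-10, -1.5, 0, 4, 22, 34]
```

Unlike zero-padding, this handles negative numbers and fractions without picking a width up front.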

@dominictarr
Contributor

Hmm, the only problem is that a range query on those values will not come out in the same ordering...
Although! Maybe you could escape them... and add a crazy hack that converted them into something that would make them sort lexicographically in the desired order?

@dominictarr
Contributor

Ooops, I just read your docs and realized that was indeed what you had done!

@dominictarr
Contributor

@deanlandolt I am curious what the string versions look like, can you add a section showing that to the readme?

@deanlandolt

I'm not sure I understand exactly what you mean by "string versions" but I put a bunch of examples in the README with their corresponding hex values.

@dominictarr
Contributor

Oh, right - that was what I meant. How come you used hex instead of making it (semi) human-readable?

@deanlandolt

What encoding do you prefer? The type tags are just arbitrary (sometimes non-printable) bytes, and those bytes are actually shifted by one when nested in arrays (so 'abc' becomes 'bcd' + a null terminating byte). I figured anything other than hex would just look like noise, but maybe a few examples that are printable (even if a little confusing) would be in order.

@deanlandolt

Okay, I added a few examples showing the raw binary encoding. Thanks for the feedback. I'm really curious what the levelup community thinks about this lib -- my hope is that it makes some really difficult use cases a lot easier.

@rvagg
Member

rvagg commented Apr 4, 2013

This would work nicely with #51. We really need a simple way of hooking into encode & decode operations for all reads and writes.
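
A minimal sketch of what such an encode/decode hook could look like, purely as an illustration of the idea. `withEncoding` and the codec shape (`encode`/`decode`) are hypothetical names, not anything proposed in #51, and a `Map` stands in for the underlying store.

```javascript
// Hypothetical wrapper: applies a codec to every read and write of a store.
// `codec` is any object with encode(value) and decode(raw).
function withEncoding (store, codec) {
  return {
    put: (key, value) => store.set(key, codec.encode(value)),
    get: (key) => {
      const raw = store.get(key)
      return raw === undefined ? undefined : codec.decode(raw)
    }
  }
}

const jsonCodec = {
  encode: (value) => JSON.stringify(value),
  decode: (raw) => JSON.parse(raw)
}

const raw = new Map()
const db = withEncoding(raw, jsonCodec)
db.put('users!john', { age: 21 })
```

Swapping `jsonCodec` for a sort-preserving codec would change how values are stored without touching the calling code.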

@rvagg rvagg closed this as completed May 20, 2013
@balupton

It would be great to get something like https://github.com/karlseguin/the-little-redis-book for LevelDB. After reading that book (1.5 hours) I am now equipped with the knowledge of what redis does, how it does it, and what I can (and shouldn't) use it for. Right now, the leveldb stuff just talks about the API. It's good for people who already have the problem and know what they're looking for, but for people evaluating what's out there and where it fits in, leveldb is in the dark.

@andrewrk

I realize this issue is closed, but here's a "real world use case": Groove Basin - music player server and client

@rvagg rvagg reopened this Jun 16, 2014
@ralphtheninja
Member

@andrewrk Awesome! Do you accept bitcoin donations?

@andrewrk

> Do you accept bitcoin donations?

Sure, if you feel so inclined. https://www.gittip.com/superjoe30/

@ralphtheninja ralphtheninja reopened this Dec 18, 2018
@ralphtheninja ralphtheninja transferred this issue from Level/levelup Dec 18, 2018
@ralphtheninja ralphtheninja added the discussion Discussion label Dec 18, 2018
@vweevers
Member

vweevers commented Jan 4, 2020

Merging into #88

@vweevers vweevers closed this as completed Jan 4, 2020