LevelDB & Node.js 'real world' use cases #55

Open
ugin opened this Issue Jan 7, 2013 · 37 comments


ugin commented Jan 7, 2013

I've read the docs on LevelDB and some topics on the LevelDB Google Group & StackOverflow, and I understand what it was built for.

What I want to know is what some of your use cases are, and in what scenarios you believe LevelDB & Node.js is a good fit.

I am not very experienced with DBs but I would like to learn, and there's not too much info on LevelDB & Node.js.

Thank you :)

Owner

juliangruber commented Jan 7, 2013

It empowers you to write your own db, from a super light but fast filesystem abstraction to a beefed up server with custom replication schemes and other application-logic right at the heart of this super fast thing. And all in one language, on client, server and database.

Owner

Raynos commented Jan 7, 2013

You could implement something like lambda architecture completely on top of leveldb.

it's a basic building block, you can build your own db abstraction with your own trade offs on top of it.

Owner

rvagg commented Jan 7, 2013

@navaru LevelDB makes data persistence easy. Instead of storing your data structures in memory, store them in a LevelDB database and fetch them when needed. This way you can store a ton of data without worrying about RAM, and you get to keep that data across restarts. This is how I mainly use it; LevelUP makes it super simple, and fetching & processing large amounts via a readStream() is just so nice to work with. If you use the inbuilt JSON encoding then you get to pretend the data is in memory (except where serialisation/deserialisation may change the form of your data, like Date objects). Since LevelDB is so fast and all operations are async, the speed impact of storing on disk is hardly noticeable.
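
The Date caveat can be seen with plain JSON alone. This is a minimal sketch that does not touch LevelDB or LevelUP; it only shows what a JSON round trip does to a Date, on the assumption that a JSON value encoding behaves like JSON.stringify on write and JSON.parse on read. The field names are made up for the example.

```javascript
// Illustration only: plain JSON round-tripping, no LevelDB involved.
const original = { name: 'John', joined: new Date('2013-01-07T00:00:00.000Z') };

// Roughly what a JSON value encoding does on write and on read.
const stored = JSON.stringify(original);
const restored = JSON.parse(stored);

console.log(original.joined instanceof Date); // true
console.log(typeof restored.joined);          // 'string'

// Turning the string back into a Date is up to the application.
const revived = Object.assign({}, restored, { joined: new Date(restored.joined) });
console.log(revived.joined.getTime() === original.joined.getTime()); // true
```

So the data comes back with the same shape, but anything that is not a JSON type has to be revived by hand after reading.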

@rvagg rvagg closed this Jan 7, 2013

Owner

dominictarr commented Jan 8, 2013

I propose leaving this option open as a 'discussion issue'

@dominictarr dominictarr reopened this Jan 8, 2013

ugin commented Jan 8, 2013

I've read the article on 'lambda architecture', and I think I've worked with those concepts when using other DBs: batch layer on CouchDB views, serving layer on Riak, etc. Am I correct?

Because there are no examples: if I need to implement a namespace structure (collections in MongoDB or buckets in Riak), how should I proceed? Do I need to create an architecture first, like the one advised in 'lambda architecture', or can I create a simpler layer to work with?

Do you have any architecture layer available as Open Source or any apps where you've used LevelDB (LevelUP)?

Most of my apps are targeted at small companies, where I don't need to setup a cluster, a single server almost always does more than I need, when I deliver an app I don't want to include an external DB like Mongo or Riak. I want to build a layer on top of LevelDB that suits my needs, like a custom CMS or something in this area.

Thank you for your replies :)

Owner

dominictarr commented Jan 8, 2013

There are two modules available for generating namespaces,
https://github.com/dominictarr/range-bucket and https://github.com/kesla/level-namespace
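
The general idea behind namespacing in a sorted key-value store is a key prefix. The sketch below is not the API of either module above; it uses an in-memory Map as a stand-in for the database, and `namespace` is a hypothetical helper invented for the illustration.

```javascript
// In-memory stand-in for a sorted key-value store (NOT the LevelUP API).
const store = new Map();

// Hypothetical helper: returns put/get/keys bound to one key prefix.
function namespace(name) {
  const prefix = name + ':';
  return {
    put: (key, value) => store.set(prefix + key, value),
    get: (key) => store.get(prefix + key),
    // All keys in this namespace, in sorted order, with the prefix stripped.
    keys: () => [...store.keys()]
      .filter((k) => k.startsWith(prefix))
      .sort()
      .map((k) => k.slice(prefix.length)),
  };
}

const users = namespace('users');
const posts = namespace('posts');
users.put('bob', { age: 23 });
users.put('alice', { age: 30 });
posts.put('bob', { title: 'Hello' });

console.log(users.keys());     // [ 'alice', 'bob' ]
console.log(posts.get('bob')); // { title: 'Hello' }
```

Because keys with the same prefix sit next to each other in sort order, a "collection" or "bucket" is just a contiguous key range.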

The way that leveldb is set up internally, it's pretty much its own lambda architecture.
I have map-reduce module for levelup https://github.com/dominictarr/map-reduce
please give feedback if you decide to use it.

also, see the module listing! https://github.com/rvagg/node-levelup/wiki/Modules it's brand new!

Owner

rvagg commented Jan 9, 2013

It occurred to me after writing it that the LevelDB code in Level Session might be a good example of the basics of using LevelDB in Node. It covers the basic operations and a simple readStream() search.

https://github.com/rvagg/node-level-session/blob/master/lib/level-store.js

Hi there,

Not sure that this is the right place to post this but whatever...

I've put together a small app based on express which relies on leveldb and which might be a good illustration of what you can achieve pretty quickly w/ level-up (and level-namespace)

https://github.com/jeremybenaim/express-leveldb

Any feedback is more than welcome :)

Thanks folks

Owner

juliangruber commented Jan 12, 2013

looks good, I'd only stream the results of findall to the client instead of buffering them

Thanks, will do that. Btw, do not hesitate to open issues directly on the repo if you've got any other suggestions! :)

Owner

Raynos commented Jan 12, 2013

@juliangruber you can't stream results to the client with express, express is not built for streams.

Right, but through sockets?

Owner

Raynos commented Jan 13, 2013

Sure through sockets but then you either need to parse stream JSON over XHR or websockets or build a streaming templating engine

Owner

juliangruber commented Jan 13, 2013

@Raynos Express only augments the response object, res.write is still there. Express is really unobtrusive.

Owner

Raynos commented Jan 13, 2013

@juliangruber the fact that it augments node is really obtrusive.

The fact that it enforces a global lockstepped waterfall pipeline is about as anti-stream as you can get.

Owner

juliangruber commented Jan 13, 2013

@Raynos it does neither.

Owner

dominictarr commented Jan 13, 2013

This is not an issue about express, guys.

ugin commented Feb 10, 2013

Thank you for your help and examples. I've tried most of the modules that I could find on NPM.
I've started to write a CMSish node app, I'm looking to build something closer to mongoose on top of leveldb, tho' I have a lot to learn until I get there.

Taking an example like a 'users' collection, based on the examples I've got something like:

key: users:7blcacc80-72e.. value: { name: John, age: 21 }
key: users:7bcfsdg71-72e.. value: { name: Bob, age: 23 }
key: users:7bsdfsg50-72e.. value: { name: Mike, age: 25 }
...

How would I search for all the users under 22? I could do a map-reduce but is there a better way when the number of records is very large?
Would implementing something like a 'view' from couchdb be a good approach? Tho I am thinking that when I wanna do some custom filters I would always need to have a view?

I've tried to look at how Riak uses leveldb and how search goes there but I found this comment on a post about Riak: "To be clear, search uses its own storage engine, merge_index, which is separate from the KV backend... The search indexes themselves are stored in a merge_index backend, and in the future we may also support a leveldb backend for those search indexes."

Search indexes == views?

TL;DR
How do I implement search in leveldb?

Owner

juliangruber commented Feb 10, 2013

You can use level-hooks to store young users under a special field. Or store your own indexes, e.g. with B+ trees. Search is kinda hard and we don't have that implemented yet. This could be something where you have to resort to classical computer science.

Owner

Raynos commented Feb 10, 2013

@navaru There are two ways you can do searching, incremental map reduce or re-indexing.

You can create a hook that listens on put and does a second put at users~age:21:{{id}} with { name: John, age: 21 }
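
A rough sketch of that second put, using an in-memory Map in place of the database. `putUser` is a hypothetical helper, not part of LevelUP or level-hooks, and the ids are made up. In a real store the two writes would belong in one batch so they succeed or fail together.

```javascript
// In-memory stand-in for the database (NOT the LevelUP API).
const db = new Map();

// Hypothetical helper: writes the record plus an index entry keyed by age.
function putUser(id, user) {
  db.set('users:' + id, user);                      // primary record
  db.set('users~age:' + user.age + ':' + id, user); // index record
}

putUser('a1', { name: 'John', age: 21 });
putUser('b2', { name: 'Bob', age: 23 });
putUser('c3', { name: 'Mike', age: 21 });

// Everyone aged exactly 21: scan the keys sharing the index prefix.
const prefix = 'users~age:21:';
const aged21 = [...db.keys()]
  .filter((k) => k.startsWith(prefix))
  .sort()
  .map((k) => db.get(k).name);

console.log(aged21); // [ 'John', 'Mike' ]
```

Note that an unpadded age only works for exact matches like this one; range queries need the age encoded so that it sorts correctly as a string.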

@navaru or you can use level-hooks to index your documents in elasticsearch, which might not be the best way (since you need an elasticsearch server) but sounds like a solid solution.

Anyway, a full leveldb search layer would surely be better.

Owner

dominictarr commented Feb 12, 2013

using map-reduce gives you basically the same thing as couch db has, except it's incremental and realtime.

If you are gonna use a hook type thing to index in elastic search (or any other db sort of thing) use level-trigger instead, because it will be reliably eventually consistent, even if your process crashes.

if you want to make a separate index you can do it like this with hooks:

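db.hooks.pre(function (batch) {
  // pad(n, width) is assumed to left-pad n with zeros to `width` digits
  var l = batch.length
  // iterate backwards so the entries pushed below are never visited
  while (l--) {
    var v = JSON.parse(batch[l].value)
    if ('number' === typeof v.age) {
      // extra record keyed by age, written in the same batch
      batch.push({type: 'put', key: ['AGES', pad(v.age, 10), v.name].join('!'), value: batch[l].value})
    }
  }
  return batch
})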

Iterate over the batch backwards (so you don't hit the things you are adding)
and add extra records, indexed by the age. Pad age with leading zeros, so 22 is 0000000022,
otherwise 4 will sort after 34. Finally, we add it to the batch, so it's saved atomically with the other data.
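
To see why the padding matters, here is a small standalone sketch. The `pad` below is a hypothetical helper written for this illustration (the hook does not say which pad function it uses), and plain string sorting stands in for LevelDB's default key order, which it matches for ASCII keys.

```javascript
// Hypothetical helper: left-pad a non-negative integer with zeros.
function pad(n, width) {
  return String(n).padStart(width, '0');
}

const ages = [34, 4, 22, 100];

// Strings sort character by character, so '4' lands after '34'.
const unpadded = ages.map(String).sort();

// With a fixed width, string order and numeric order agree.
const padded = ages.map((a) => pad(a, 10)).sort();

console.log(unpadded); // [ '100', '22', '34', '4' ]
console.log(padded);   // [ '0000000004', '0000000022', '0000000034', '0000000100' ]
```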

Join the key with ! so you can sort by range. ! is the first printable ASCII character after space. For this to work you must not have ! in any part of the key. You could also use null ('\x00') but I am using ! in this example for clarity.

This will save it twice, but each record will be the same.

Then you can query an age range with db.readStream({start: 'AGES!0000000000!', end: 'AGES!0000000022!~'}).
That will return every document between ages 0 and 22.

(~ is the last printable ascii character, so use that to represent the end of the range, could also use \xFF)
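
A sketch of how the start/end bounds select keys, using a sorted array instead of a database. `range` is a hypothetical helper that mimics an inclusive key scan, and the sample keys assume the ten-digit padding from pad(v.age, 10) in the hook.

```javascript
// Sorted in-memory keys standing in for the index (NOT the LevelUP API).
const keys = [
  'AGES!0000000004!Ann',
  'AGES!0000000022!John',
  'AGES!0000000023!Bob',
  'AGES!0000000025!Mike',
  'USERS!7bc!x',
].sort();

// Hypothetical helper: every key k with start <= k <= end, in sorted order.
function range(sortedKeys, start, end) {
  return sortedKeys.filter((k) => k >= start && k <= end);
}

// '~' sorts after every letter and digit, so the end bound covers all names at age 22.
const hits = range(keys, 'AGES!0000000000!', 'AGES!0000000022!~');

console.log(hits);
// [ 'AGES!0000000004!Ann', 'AGES!0000000022!John' ]
```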

However, age changes every year, so this will end up with records saved twice.
Use dateOfBirth instead (otherwise example is pretty much the same).

dateOfBirth will change as well - it might be entered incorrectly. If you use map-reduce it will handle this, remembering to delete the old map.
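
If you maintain the index by hand instead, the stale entry has to be removed when the indexed field changes. A minimal sketch with an in-memory Map; `putUser`, `indexKey` and `pad` are hypothetical helpers, and this is not how the map-reduce module is implemented.

```javascript
// In-memory stand-in for the database (NOT the LevelUP API).
const db = new Map();
const pad = (n, width) => String(n).padStart(width, '0');
const indexKey = (user) => ['AGES', pad(user.age, 10), user.name].join('!');

// Hypothetical helper: replaces the record and keeps exactly one index entry.
function putUser(id, user) {
  const previous = db.get('users!' + id);
  if (previous) db.delete(indexKey(previous)); // drop the stale index entry
  db.set('users!' + id, user);
  db.set(indexKey(user), user);
}

putUser('a1', { name: 'John', age: 21 });
putUser('a1', { name: 'John', age: 22 }); // a year later

const indexKeys = [...db.keys()].filter((k) => k.startsWith('AGES!')).sort();
console.log(indexKeys); // [ 'AGES!0000000022!John' ]
```

Without the delete, both the age 21 and the age 22 entries would stay in the index and the user would show up in both ranges.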

I just released a library for bytewise structural sorting [1] to avoid having to use these kinds of hacks. It will sort numbers and arrays and such correctly, and exactly like couchdb -- except it supports an even wider range of types.

[1] https://github.com/deanlandolt/bytewise
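
For a sense of the problem being solved, the sketch below compares sorting keys by their JSON text with sorting them component by component. It does not use bytewise or reproduce its encoding; `compare` is a hypothetical comparator that only handles arrays whose elements at the same position have the same type.

```javascript
const keys = [[9], [10], [2, 'b'], [2, 'a']];

// Sorting the JSON text compares characters, so [10] lands before [2] and [9].
const byJson = keys.map((k) => JSON.stringify(k)).sort();
console.log(byJson); // [ '[10]', '[2,"a"]', '[2,"b"]', '[9]' ]

// Hypothetical component-wise comparator for arrays of numbers and strings.
function compare(a, b) {
  for (let i = 0; i < Math.min(a.length, b.length); i++) {
    if (a[i] < b[i]) return -1;
    if (a[i] > b[i]) return 1;
  }
  return a.length - b.length; // shorter array first when one is a prefix of the other
}

const structural = keys.slice().sort(compare);
console.log(structural); // [ [ 2, 'a' ], [ 2, 'b' ], [ 9 ], [ 10 ] ]
```

A sort-preserving encoding aims to make the stored bytes come out in the second order without needing a custom comparator in the database.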

Owner

dominictarr commented Apr 3, 2013

levelup has support for custom comparator functions, but they need to be implemented in C.

Great -- I looked for that in the docs but hadn't come across it.

Still, this approach allows you to build pretty much any kind of sorting you could possibly need using a fast, simple binary serialization. It'll let you store anything you can serialize with json (and a lot more), except better, because it'll sort properly, including for numbers and component-wise for arrays.

Owner

dominictarr commented Apr 4, 2013

hmm, the only problem is that a range query on those values will not come out in the same ordering...
although! maybe you could escape them... and add a crazy hack that converted them into something that would make them lexicographically sort in the desired order?

Owner

dominictarr commented Apr 4, 2013

Ooops, I just read your docs and realized that was indeed what you had done!

Owner

dominictarr commented Apr 4, 2013

@deanlandolt I am curious what the string versions look like, can you add a section showing that to the readme?

I'm not sure I understand exactly what you mean by "string versions" but I put a bunch of examples in the README with their corresponding hex values.

Owner

dominictarr commented Apr 4, 2013

Oh, right - that was what I meant. How come you used hex instead of making it (semi) human-readable?

What encoding do you prefer? The type tags are just arbitrary (sometimes non-printable) bytes, and those bytes are actually shifted by one when nested in arrays (so 'abc' becomes 'bcd' + a null terminating byte). I figured anything other than hex would just look like noise, but maybe a few examples that are printable (even if a little confusing) would be in order.

Okay, I added a few examples showing the raw binary encoding. Thanks for the feedback. I'm really curious what the levelup community thinks about this lib -- my hope is that it makes some really difficult use cases a lot easier.

Owner

rvagg commented Apr 4, 2013

this would work nicely with #51, we really need a simple way of hooking in to encode & decode operations for all reads and writes

@rvagg rvagg closed this May 20, 2013

It would be great to get something like https://github.com/karlseguin/the-little-redis-book for LevelDB. After reading that book (1.5 hours) I am now equipped with the knowledge of what redis does, how it does it, and what I can (and shouldn't) use it for. Right now, the leveldb stuff just talks about the API. It's good for people who already have the problem and know what they're looking for, but for people evaluating what's out there and where it fits in, leveldb is in the dark.

Contributor

andrewrk commented Jun 15, 2014

I realize this issue is closed, but here's a "real world use case": Groove Basin - music player server and client

@rvagg rvagg reopened this Jun 16, 2014

Owner

ralphtheninja commented Jun 29, 2014

@andrewrk Awesome! Do you accept bitcoin donations?

Contributor

andrewrk commented Jun 29, 2014

Do you accept bitcoin donations?

Sure, if you feel so inclined. https://www.gittip.com/superjoe30/
