
Custom encodings #149

Merged: 9 commits into Level:master, Aug 14, 2013

Conversation

dominictarr
Contributor

Adds support for custom encodings.

Just pass in an object with {encode: function, decode: function, buffer: boolean} instead of a string encoding type, for the keyEncoding, valueEncoding or encoding options.
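For illustration only (the database path and codec here are hypothetical, not part of this PR), a minimal sketch of what such an encoding object can look like and how it is passed in; this toy codec just mirrors the built-in 'json' encoding:

var levelup = require('levelup')

// a toy codec equivalent to the built-in 'json' encoding, shown only to illustrate the shape
var jsonCodec = {
  encode: function (obj) { return JSON.stringify(obj) },
  decode: function (str) { return JSON.parse(str) },
  buffer: false // false: levelup/leveldown can work with plain strings, no Buffer needed
}

var db = levelup('/tmp/custom-encoding-example', { valueEncoding: jsonCodec })

db.put('greeting', { hello: 'world' }, function (err) {
  if (err) throw err
  db.get('greeting', function (err, value) {
    console.log(value.hello) // 'world'
  })
})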

So, this could probably be cleaned up to use more abstraction, like var key = this.toKey(key_), but I did it this way because it was the simplest way to make the smallest change to the current code.

closes #51

@max-mapper

Speaking on behalf of the web browser lobby, I would like to ask whether this API would let me ensure that levelup/leveldown never try to create a Buffer. E.g. what are the semantics of the buffer: boolean option? Can there be a default in which leveldown returns the value from the db in whatever encoding it is stored in?

@dominictarr
Contributor Author

Yes! This is passed straight to leveldown, and leveldown returns whatever it wants; see the leveldown asBuffer option:

https://github.com/rvagg/node-leveldown#leveldowngetkey-options-callback

That should be flexible. We could use bops in the leveldown tests, so that a Buffer-ish object is acceptable.
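For reference, a rough sketch of the asBuffer option linked above (the database path and key are made up; error handling is minimal):

var leveldown = require('leveldown')
var db = leveldown('/tmp/asbuffer-example')

db.open(function (err) {
  if (err) throw err
  db.put('some-key', 'some-value', function (err) {
    if (err) throw err
    // asBuffer: false asks leveldown for a plain string instead of a Buffer,
    // which is what a browser-friendly backend would want in order to avoid Buffers entirely
    db.get('some-key', { asBuffer: false }, function (err, value) {
      console.log(typeof value) // 'string'
    })
  })
})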

@dominictarr
Contributor Author

@maxogden the reason for the option is to avoid a memcopy in leveldown, depending on whether the custom encoding wants to retrieve a buffer or a string (for example, JSON needs strings, msgpack needs buffers).
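To make the buffer flag concrete, a hedged sketch (msgpack-js is just one illustrative msgpack implementation; any module exposing encode() and decode() would do):

var msgpack = require('msgpack-js')

// msgpack serialises to binary, so this codec asks levelup to fetch raw Buffers;
// a JSON codec would instead set buffer: false and round-trip through strings
var msgpackEncoding = {
  encode: msgpack.encode,
  decode: msgpack.decode,
  buffer: true
}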

@max-mapper

@chrisdickinson have you benchmarked bops at all? There is a question of whether it is any slower than plain Buffers when running in node on top of Buffers.

@chrisdickinson

@maxogden I haven't, but I would expect to see a (very) small slowdown, as the node version of bops just delegates directly to the appropriate Buffer operations (so it's the overhead of an extra call that could potentially be inlined by the JIT).
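For context, a small sketch of the kind of bops call in question; on node each of these delegates to the equivalent Buffer operation, in the browser to typed arrays:

var bops = require('bops')

// convert a string to a binary container and back again
var buf = bops.from('hello world', 'utf8') // Buffer under node, Uint8Array in the browser
var str = bops.to(buf, 'utf8')             // back to a plain string
console.log(str) // 'hello world'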

@@ -389,6 +390,38 @@ buster.testCase('ReadStream', {
}.bind(this))
}

, 'test injectable encoding': function (done) {
Member

What is this testing other than the ability to pass in the encoding object? Can we do a utf8 readStream and collect the data and compare it to what it should look like in msgpack format?
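A rough sketch of the kind of round-trip check being asked for (standalone rather than the buster test that was added; msgpack-js is an illustrative codec choice, and it assumes the per-operation encoding override this PR allows):

var levelup = require('levelup')
var msgpack = require('msgpack-js')

var encoding = { encode: msgpack.encode, decode: msgpack.decode, buffer: true }
var db = levelup('/tmp/readstream-encoding-check', { valueEncoding: encoding })

db.put('key', { foo: 'bar' }, function (err) {
  if (err) throw err
  // stream the raw stored bytes back and compare them with a fresh msgpack encode
  db.createReadStream({ valueEncoding: 'binary' })
    .on('data', function (entry) {
      var expected = msgpack.encode({ foo: 'bar' })
      console.log(entry.value.toString('hex') === expected.toString('hex')) // true
    })
})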

@rvagg
Member

rvagg commented Jun 18, 2013

Yeah, I'm cool with this, I think. It gives us the flexibility to still pass alternative encodings to individual operations, so you could have crazy stuff like msgpack- and bytewise-encoded values/keys in the same store.
It needs some cleaning up, and some of the methods are getting pretty big now, so it'd be good to try to pull out common utility functions.
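A hedged sketch of the kind of helper being suggested (the names and the fallback chain are illustrative, not the refactor that eventually landed):

// built-in codecs, reduced to two entries for brevity
var builtinEncodings = {
  utf8: { encode: String, decode: String, buffer: false },
  json: { encode: JSON.stringify, decode: JSON.parse, buffer: false }
}

// resolve the key encoding for one operation, falling back to the db-wide default
function resolveKeyEncoding (options, defaults) {
  var enc = options.keyEncoding || options.encoding ||
            defaults.keyEncoding || defaults.encoding || 'utf8'
  // a string names a built-in codec; an object is used directly as a custom codec
  return typeof enc === 'string' ? builtinEncodings[enc] : enc
}

// usage inside put/get would then shrink to something like:
// var key = resolveKeyEncoding(options, this.options).encode(key_)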

@dominictarr
Contributor Author

Cool, agree on need to refactor.

@mcollina
Member

In node-level-ttl there is some need for an API for encoding/decoding keys, as it does not support keys that are not utf8. How about exposing the encoding/decoding functions in the API?

Level/level-ttl#3

(if you can figure out how to deal with that without some LevelUp support, I'll be very happy!)

@mikeal

mikeal commented Jul 7, 2013

@dominictarr just enlightened me to the fact that I could use this for validation, by just throwing if the key doesn't match certain parameters. +1
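For instance, a hypothetical key encoding along those lines (not something included in this PR):

// reject anything that is not a lowercase slug; the thrown error surfaces to the caller
var validatedKeys = {
  encode: function (key) {
    if (!/^[a-z0-9-]+$/.test(key)) throw new Error('invalid key: ' + key)
    return key
  },
  decode: function (key) { return key },
  buffer: false
}

// var db = levelup('/tmp/validated-db', { keyEncoding: validatedKeys })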

@rvagg
Member

rvagg commented Jul 17, 2013

Where are we at with this, @dominictarr? It would be a good one to get out soon. Do you need someone else to help get this finalised?

@dominictarr
Contributor Author

I want to put some time into this soon, and also into getting lt, gt, lte and gte into leveldown. Probably next week?

@rvagg
Member

rvagg commented Jul 17, 2013

Excellent, let me know if you need any help cleaning up the leveldown work; looking forward to that!

@dominictarr
Contributor Author

okay!

I've reorganised this to use some utility functions,
which didn't involve any changes to the public interface.
As mentioned before, there are some additional tests.

Also, https://github.com/eugeneware/byteup will break with this change, so @eugeneware should update it.
(Note: byteup adds another encoding to levelup, but not via the public API, so them's the breaks.)

Also, I added documentation to the readme.

@dominictarr
Contributor Author

okay, this is now passing the tests on master & good to merge!

@rvagg
Member

rvagg commented Aug 14, 2013

LATM ("Looks Awesome To Me")

If there are no objections by the time you're around next, go ahead and merge and I'll push it in a release. This is fantastic.

@eugeneware

@dominictarr no probs. I'll update it to work again. Can't wait for this sucker to land! :-)

@eugeneware

Can't find any easy way to update byteup. @dominictarr any ideas?

But that's fine. The new method is better. Byteup will be deprecated once this lands in any case. It was an interim hack.

Though it would be nice if you could register global types. There is a type field that gets output for debugging purposes. It would be nice if you could have a simple way to register a default encoding for a type string.

But it's not a big deal. Let's push this out :-)

dominictarr added a commit that referenced this pull request Aug 14, 2013
@dominictarr merged commit 79c974c into Level:master on Aug 14, 2013
@dominictarr
Contributor Author

merged.

@eugeneware to be honest, I'm a little confused about how byteup is used; I can see it's called by subindex.
I see that it adds a 'bytewise' encoding to levelup's utils, but I can't see it ever passing that as an encoding option to anything?

@eugeneware

@dominictarr presently you use byteup like so:

var bytewise = byteup(); 
// adds the type "bytewise" to the encodings list with the bytewise.encode / bytewise.decode fns

var db = levelup(dbPath, { keyEncoding: 'bytewise', valueEncoding: 'json' }); 
// the string 'bytewise' will map to the codec entry added.

Once again, not a big deal, but I'm checking whether there is some way with the new API to register an entry into the encoders/decoders data structure for a specific string 'type' (i.e. in the same way that 'utf8' and 'json' exist).

Hope I'm making sense. If not, no big dramas, but worth an ask.

@eugeneware

@dominictarr in regard to subindex: I started off using byteup, but I struggled to get sublevel to play well with Buffers as key encodings (like bytewise). I eventually just encoded the key as a hex string.

You're right, the byteup reference is not actually used in subindex any more.

@dominictarr
Contributor Author

Oh right, that makes sense. I would be opposed to registering encodings; better would be to just get modules like msgpack and bytewise to export the {encode, decode, buffer, type} pattern, and then you can just do

level(path, {keyEncoding: require('bytewise'), valueEncoding: require('msgpack')})

etc.
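To make that pattern concrete, a hedged sketch of what such a module might export (the wrapper module name is hypothetical; bytewise itself exports encode() and decode()):

// hypothetical wrapper module, e.g. bytewise-codec/index.js
var bytewise = require('bytewise')

module.exports = {
  encode: bytewise.encode,
  decode: bytewise.decode,
  buffer: true,    // bytewise produces binary keys, so levelup should hand over Buffers
  type: 'bytewise' // a name that shows up in debugging output
}

// which replaces byteup's registration step with a plain require:
// var db = levelup(dbPath, { keyEncoding: require('bytewise-codec'), valueEncoding: 'json' })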

I am planning on refactoring level-sublevel so that it has real support for binary encoded keys, bytewise, etc.

Although ranges are a higher priority for me. Once that is cleaned up, it will remove a lot of crud from level-sublevel.
