support arbitrary encoding #51
Comments
I think there are better names than "toEncoding" and "toBuffer"

I would agree on that
How about

```js
var db = levelup("uri", {
  transform: function (raw) {
    return new Type(JSON.parse(raw))
  }
})
```
```js
var db = levelup('uri', {
  encode : function (raw) { return JSON.stringify(raw) },
  decode : function (json) { return JSON.parse(json) }
})
```
what about

```js
var db = levelup('uri', {
  encoding : {
    encode | stringify : function () {...},
    decode | parse : function () {...}
  }
})
```

pretty much every node module that does this will have either encode & decode, or stringify & parse.

```js
var db = levelup('uri', {
  encoding : require('yamlish')
})
```

or JSON, or msg-pack or whatever. That is how I think you should do that, if you were going to do this.
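The "accept either naming convention" idea above could be sketched as a small normalization helper; the `normalizeCodec` name is hypothetical, not an existing levelup API:

```javascript
// Accept either { encode, decode } or { stringify, parse } and
// return a uniform { encode, decode } pair. Passing plain JSON
// works because it already has stringify/parse.
function normalizeCodec(codec) {
  var encode = codec.encode || codec.stringify
  var decode = codec.decode || codec.parse
  if (typeof encode !== 'function' || typeof decode !== 'function')
    throw new TypeError('codec needs encode/decode or stringify/parse')
  return { encode: encode, decode: decode }
}

var json = normalizeCodec(JSON)
console.log(json.decode(json.encode({ a: 1 })).a) // → 1
```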
@dominictarr With that approach we could check if it's a callback or not and have support for streams. Just pipe the data through a stream provided by the user and pipe it back as a duplex if the data should continue somewhere else. The user can then create any type of chain of streams.
@dominictarr the reason this needs to go into core is because there's a bunch of JSON / buffer encoding logic all over the place already. I just want that logic to allow me to use custom encoding / decoding functions. If we don't want this in core then we can remove all the encoding / decoding logic. I do agree that stating that you need to provide both methods by wrapping them in a config option is a good idea. What I don't like is big objects for configuration instead of methods and chaining, which are more beautiful and also semantically more correct.
Something to keep in mind: the binding can handle either Strings or Buffers, so that's fine, but coming back out it needs to know whether to put the contents into a String or a Buffer; putting everything into a Buffer is going to have a performance hit where Strings are all that's needed (the majority case) due to an extra copy. So there would need to be some way to signal to the binding what to return. Also, my vote would be for adding this to the options object.
@ralphtheninja streams are the wrong idea here, because the leveldb API needs a single buffer/string. I think we should remove the JSON encoding and just support the same base64, utf-8, ascii encodings that node core has. Small core, big userland. Note, it's beginning to become evident that node core is still too large. Easier to nip it in the bud than to prune it back when fully grown.
The correct thing would be to use buffers as the only data type, but that's not an option since it slows things down, and I think that 95% of users only need strings. So what about making strings the default type and having buffers as an option:

```js
db.put('foo', new Buffer('bar'))
db.get('foo', { buffer : true })
```
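A rough sketch of that "strings by default, Buffers on request" surface, again with an in-memory `Map` standing in for the real store (the `put`/`get` helpers here are illustrative, not levelup's implementation):

```javascript
// Strings are the default return type; a Buffer comes back
// only when the caller asks for one via { buffer: true }.
var store = new Map()

function put(key, value) {
  // Normalize everything to a Buffer internally for storage.
  store.set(key, Buffer.isBuffer(value) ? value : Buffer.from(String(value)))
}

function get(key, opts) {
  var buf = store.get(key)
  // Return a Buffer only when explicitly requested.
  return opts && opts.buffer ? buf : buf.toString('utf8')
}

put('foo', Buffer.from('bar'))
console.log(typeof get('foo'))                              // → 'string'
console.log(Buffer.isBuffer(get('foo', { buffer: true })))  // → true
```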
Also, do the encode/decode functions need to handle buffers, or should they just be ignored for them?
Maybe if you supply an encoder and/or decoder then you can't supply the encoding option for any operation (you can currently supply it for any operation). Then, at the same time as you give the encoder and decoder, you can supply the

There's also the issue that you can adjust the encoding for keys too, probably a < 1% use-case, but it's still there as an option.
The binding only needs the encoding when creating a string, or accepting a string, correct?

only when passing back into V8; otherwise it can detect whether it's a

Right, but if the encoding is base64 or something, you turn it into a different string. Also, can't you just make a Buffer that points to that particular memory range? Or are there memory-management problems there?

Making a new

hmm... so setting any value for encoding implies a String... this is complicated by arbitrary encodings,

So, the core issue here is how to handle whether an encoding goes to a buffer or a string,
The other issue I've found with custom encodings (see: http://github.com/eugeneware/byteup) is that the 'batch' event that gets emitted ignores the decoding, and you just get the raw buffer, which may or may not be what you want. While there is a performance cost to decoding the buffer, it was a little disconcerting to see the abstraction leak.
Another option for specifying custom encodings would be simply to override the existing keyEncoding and valueEncoding options. If it's text then use the built-in codecs. If it's a function use that:

```js
var db = levelup('uri', {
  keyEncoding: {
    encode : function (raw) { return JSON.stringify(raw) },
    decode : function (json) { return JSON.parse(json) }
  },
  valueEncoding: {
    encode : function (raw) { return JSON.stringify(raw) },
    decode : function (json) { return JSON.parse(json) }
  }
})
```

Then you could have custom libraries that just return the encode/decode pair:

```js
var mycodec = require('mycodec');
var db = levelup('uri', {
  keyEncoding: mycodec,
  valueEncoding: mycodec
})
```
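The "string means built-in, object means custom codec" dispatch being proposed could look roughly like this; `resolveEncoding` and the `builtins` table are hypothetical names, and only two built-ins are shown for brevity:

```javascript
// Map built-in encoding names to codec objects.
var builtins = {
  utf8: {
    encode: function (v) { return String(v) },
    decode: function (s) { return s }
  },
  json: { encode: JSON.stringify, decode: JSON.parse }
}

// A string selects a built-in; anything else is assumed to be
// a { encode, decode } codec object supplied by the user.
function resolveEncoding(opt) {
  if (typeof opt === 'string') {
    if (!builtins[opt]) throw new Error('unknown encoding: ' + opt)
    return builtins[opt]
  }
  return opt
}

var codec = resolveEncoding({
  encode: function (raw) { return JSON.stringify(raw) },
  decode: function (json) { return JSON.parse(json) }
})
console.log(codec.decode(codec.encode([1, 2]))[1]) // → 2
```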
Re your suggestion, yes I think that's the way to go, except for the issue of the codec needing to tell LevelDOWN whether it wants to receive Buffers or Strings; it matters for performance that we get the right type from the db. So there needs to be a 3rd param.
Okay, what about:

```js
var encoding = {
  stringify: function (...) {...},
  parse: function (...) {...},
  buffer: true | false
}
```

I like stringify/parse because then you can just do `keyEncoding: JSON`. Of course, some things use encode/decode... and even pack/unpack... because people are like that.
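A concrete instance of that three-field shape, with the `buffer` flag indicating what type the codec wants back from the db; the `hexCodec` example is made up for illustration:

```javascript
// A codec whose decoded form is a raw Buffer, so it sets
// buffer: true to ask the binding for Buffers, not strings.
var hexCodec = {
  stringify: function (buf) { return buf.toString('hex') },
  parse: function (str) { return Buffer.from(str, 'hex') },
  buffer: true
}

// JSON drops straight in because it already has stringify/parse;
// its decoded form is a plain JS value, so buffer: false.
var jsonCodec = { stringify: JSON.stringify, parse: JSON.parse, buffer: false }

var roundTripped = hexCodec.parse(hexCodec.stringify(Buffer.from('hi')))
console.log(roundTripped.toString()) // → 'hi'
```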
hmm, except that it's already called a value encoding...
Let's say I want encoding like
I basically want to wrap all objects coming out of the db in a certain encoding and clean up all objects I put into the db with a certain cleanup logic.