ruby implementation #14

Merged
merged 20 commits into from May 28, 2013


Contributor
jackdoe commented Dec 31, 2012

This is a simple Ruby implementation (including Snappy compression support).
It is still incomplete (no tracking support, and no refp/refn/alias/object/objectv), and its RB_OBJECT support is different: it works more like HASH (described in ext/sereal/encode.c) and uses the RESERVED + 1 tag.

RB_OBJECT tag: (RESERVED)
    class name <STR>
    instance variables count <VARINT>
    [instance variable name <STR>, instance variable value <ITEM>, ...]
SRL_HDR_SYM tag: (RESERVED + 1)
    len <VARINT>
    value - <len> bytes
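
The layout above can be sketched in plain Ruby. This is only an illustration under assumptions, not the code from ext/sereal/encode.c: the tag byte values are placeholders (the real ones come from the protocol headers), `<STR>` is assumed to be a varint byte length followed by the raw bytes, and the `<ITEM>` encoder is passed in as a block.

```ruby
# Placeholder tag values for illustration only; the real values are
# defined by the protocol headers, not here.
TAG_RB_OBJECT = 0x40
TAG_SYM       = 0x41

# Varint: 7 bits per byte, least significant group first, with the
# high bit set on every byte except the last one.
def varint(n)
  out = "".b
  loop do
    byte = n & 0x7f
    n >>= 7
    if n.zero?
      out << byte
      break
    end
    out << (byte | 0x80)
  end
  out
end

# Assumption: <STR> is a varint byte length followed by the raw bytes.
def str(s)
  bytes = s.to_s.b
  varint(bytes.bytesize) + bytes
end

# tag, class name, ivar count, then (ivar name, ivar value) pairs.
# The block stands in for the real <ITEM> encoder.
def encode_object(obj, &encode_item)
  ivars = obj.instance_variables
  out = "".b
  out << TAG_RB_OBJECT
  out << str(obj.class.name)
  out << varint(ivars.size)
  ivars.each do |name|
    out << str(name)
    out << encode_item.call(obj.instance_variable_get(name))
  end
  out
end

# tag, length, then <len> bytes of the symbol's name.
def encode_symbol(sym)
  bytes = sym.to_s.b
  "".b << TAG_SYM << varint(bytes.bytesize) << bytes
end

# Hypothetical example class with a single instance variable.
class Point
  def initialize(x)
    @x = x
  end
end

encoded = encode_object(Point.new("7")) { |value| str(value) }
```

Because every instance variable is written out by name, the decoder can rebuild the object without any per-class code, which is also why the later discussion about trusting the input matters.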

Due to the nature of the Sereal protocol, it fits very nicely in my work environment and will replace BSON because it is n times faster (especially for hashes with a few short_binary keys filled with small arrays).
In the future I will implement support for cyclic structures.

I'm a little worried by this patch. proto.h is generated from the specification file, so merging these headers will make your life awkward. (It is doable if you insist, but you should look at the precedents for generated files.)

Owner

I think you are right. When I decided to combine them it seemed like a good idea: there are a few macros operating on proto.h keywords, and because the protocol specification is very compact, merging them looked OK (it was 23:50 on New Year's Eve).

Now that I look at it, it looks awkward.

@jackdoe jackdoe split sereal.h into sereal.h and proto.h (protocol specification); buffer.c, sereal.c: rename DEFAULT_CALLBACK to s_default_reader; encode.c: fix s_append_integer() to work with ULL integers; correctly #undef RETURN_STRING
c697c31
Contributor
jackdoe commented Jan 2, 2013

BTW, I am not sure how people will feel about this way of encoding Ruby objects (it sends all instance variables), but it is very fast because it does not use to_json or to_msgpack methods to create a new object, etc.

It is a bit subjective in this case, because currently I transfer simple objects but encode/decode them to hashes.

So maybe it would be better to just check for a to_srl method and call it before encoding the object.
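
The two options discussed here can be contrasted in a short sketch. The helper names and the example classes are hypothetical, and the sketch only prepares a plain hash; it does not touch the actual encoder.

```ruby
# Option 1: dump every instance variable into a hash (fast and generic,
# but the object has no say in what gets sent).
def ivars_to_hash(obj)
  obj.instance_variables.each_with_object({}) do |name, hash|
    hash[name.to_s.sub(/\A@/, "")] = obj.instance_variable_get(name)
  end
end

# Option 2: let the object decide via to_srl, falling back to option 1.
def prepare_for_encoding(obj)
  obj.respond_to?(:to_srl) ? obj.to_srl : ivars_to_hash(obj)
end

# Hypothetical classes, for illustration only.
class Session
  def initialize(user, token)
    @user = user
    @token = token
  end
end

class SafeSession < Session
  # Only expose what should go over the wire.
  def to_srl
    { "user" => @user }
  end
end

everything = prepare_for_encoding(Session.new("bob", "s3cret"))
selected   = prepare_for_encoding(SafeSession.new("bob", "s3cret"))
```

With the hook, `selected` contains only the user name, while `everything` also carries the token.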

Owner
demerphq commented Jan 2, 2013

On 2 January 2013 09:42, borislav nikolov notifications@github.com wrote:

> BTW, I am not sure how people will feel about this way of encoding Ruby
> objects (it sends all instance variables), but it is very fast because it
> does not use to_json or to_msgpack methods to create a new object, etc.
>
> It is a bit subjective in this case, because currently I transfer simple
> objects but encode/decode them to hashes.
>
> So maybe it would be better to just check for a to_srl method and call it
> before encoding the object.

I don't understand why you can't use the OBJECT + HASH notation for objects,
however it is that you end up mapping from and to Ruby objects.

We would need a really good justification for adding a new tag. We would
like to make it possible to do cross language mappings, so adding a new tag
goes the wrong way.

As far as constructing a Ruby object goes, I can only say that in the Perl
version we "inflate" the object out of nothing, to look as expected.

Anyway, we are reviewing your changes and hopefully can work out the
details to get your patch sequence merged. You can expect further followup.

Thanks a lot for the effort!

cheers,
Yves

perl -Mre=debug -e "/just|another|perl|hacker/"

jackdoe added some commits Jan 3, 2013
@jackdoe jackdoe Improve decoder speed, cleanup
Move the decoder's function pointer registration from runtime to static
initialization; this improved its performance by 10-30%, depending on the input.

- cleanup
- create realloc_or_raise() that is used in s_alloc() and
  alloc_or_raise()
- benchmark using benchmark/ips, which tries to compare
  iterations/second
094696e
@jackdoe jackdoe Improve encoder performance
Use function pointers to avoid bad branch prediction with
switch(), because there is no way to know which types of objects
are most common in the input.
b0a1b76
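
A rough Ruby analogue of the dispatch-table idea in the two commits above: the table of writers is built once at load time and looked up by type, instead of walking a chain of branches for every value. The real change lives in the C extension; the table, the lambdas and the output strings below are made up for illustration.

```ruby
# Built once, when the file is loaded (the "static" registration).
# On old rubies Integer would be Fixnum/Bignum; this assumes a modern Ruby.
WRITERS = {
  NilClass => ->(_value) { "nil" },
  Integer  => ->(value) { "int(#{value})" },
  String   => ->(value) { "str(#{value})" },
  Array    => ->(value) { "arr[" + value.map { |e| write(e) }.join(",") + "]" }
}.freeze

# One table lookup per value; no assumption about which type is common.
def write(value)
  writer = WRITERS.fetch(value.class) do
    raise TypeError, "no writer for #{value.class}"
  end
  writer.call(value)
end

rendered = write([1, "a", nil])
```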
@jackdoe jackdoe Inline buffer functions 86b8cb2
@jackdoe jackdoe Disable SRL_RB_OBJECT decoding, unless requested
After the recent security problems with deserializing data into arbitrary objects,
it may be best to let the user decide whether the input data can be trusted.
Sereal.decode(data) will not decode the SRL_RB_OBJECT tag, but Sereal.decode(data, false)
will spawn arbitrary objects.
a5b9c8c
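
The opt-in behaviour described in this commit can be sketched as follows. This is not the extension's code: `ObjectRecord`, `inflate` and `UnsafeInputError` are hypothetical names, and the record stands for an object entry that has already been parsed from the input. Only the shape of the flag (safe by default, `false` to allow objects) follows the commit message.

```ruby
class UnsafeInputError < StandardError; end

# Hypothetical stand-in for an already-parsed object entry.
ObjectRecord = Struct.new(:class_name, :ivars)

# Safe by default: refuse to build objects unless the caller passes false.
def inflate(record, safe = true)
  raise UnsafeInputError, "object decoding is disabled" if safe

  # const_get on a name taken from the input is exactly why this is opt-in.
  obj = Object.const_get(record.class_name).allocate
  record.ivars.each { |name, value| obj.instance_variable_set(name, value) }
  obj
end

# Hypothetical example class.
class Account
  attr_reader :balance
end

record  = ObjectRecord.new("Account", { "@balance" => 10 })
account = inflate(record, false)
```

Note that `allocate` skips `initialize`, so the object is built purely from the instance variables in the input.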
guai commented Apr 12, 2013

Am I right in thinking that non-ASCII :šÿmbõłs are not supported in your implementation?

Owner

On 12 April 2013 19:06, guai notifications@github.com wrote:

> Am I right in thinking that non-ASCII :šÿmbõłs are not supported in your
> implementation?

Are you asking about Sereal in general or about the Ruby implementation?

Yves


jackdoe added some commits Apr 12, 2013
@jackdoe jackdoe cleanup, unused variables, remove FLT2NUM etc..
* find the maximum size of the ruby_value_type enum; we use its values
as indexes into the WRITER[] function pointer array

* FLT2NUM was removed; use DBL2NUM (casting float to double)
* remove some unused variables and signedness warnings
* add 'requirements' to README
22f60e1
@jackdoe jackdoe add rubinius (mri 1.9) support a9460f4
@jackdoe jackdoe cleanup rubinius port c2a62ac
@jackdoe jackdoe cleanup a20bcb2
guai commented Apr 15, 2013

demerphq, I was asking about the Ruby implementation. As far as I know, Sereal supports UTF-8 strings. But :symbols in Ruby may be in UTF-8 too, so the implementation does not seem right.

jackdoe added some commits Apr 15, 2013
@jackdoe jackdoe temporary: remove support for symbols/raw objects
* Convert symbols to strings
* call to_srl on objects and expect a serializable structure (hash,
  array, string, etc.)
2f84d8d
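
A minimal sketch of the behaviour this commit describes, written in plain Ruby rather than in the C extension. The `normalize` helper and the `Tag` class are hypothetical; the sketch assumes that hash keys are converted the same way as values.

```ruby
# Reduce a value to strings, numbers, nil/true/false, arrays and hashes.
def normalize(value)
  case value
  when Symbol
    value.to_s
  when String, Numeric, nil, true, false
    value
  when Array
    value.map { |element| normalize(element) }
  when Hash
    value.each_with_object({}) do |(key, val), out|
      out[normalize(key)] = normalize(val)
    end
  else
    unless value.respond_to?(:to_srl)
      raise TypeError, "#{value.class} does not respond to to_srl"
    end
    # The result of to_srl is expected to be serializable itself.
    normalize(value.to_srl)
  end
end

# Hypothetical example class.
class Tag
  def initialize(name)
    @name = name
  end

  def to_srl
    { name: @name }
  end
end

plain = normalize({ id: 1, tags: [:a, Tag.new(:b)] })
```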
@jackdoe jackdoe update readme e67e43f
Contributor
jackdoe commented Apr 15, 2013

@guai yes, this implementation did not support UTF-8 symbols, but it was unclear whether there would be a tag for Ruby symbols. Once that decision is made, fixing UTF-8 symbols is pretty easy.
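
For reference, a small sketch of what handling UTF-8 symbols involves on the Ruby side, assuming the wire format carries the symbol's name as UTF-8 bytes. The helper name is hypothetical, and the symbol is written with escapes so the example does not depend on the source file's encoding.

```ruby
# The symbol :šÿmbõłs, spelled with Unicode escapes.
sym = "\u0161\u00ffmb\u00f5\u0142s".to_sym

# What goes on the wire: raw bytes, with no encoding attached.
wire = sym.to_s.b

# On decode, the bytes must be marked as UTF-8 before calling to_sym,
# otherwise the result is a different (binary-encoded) symbol name.
def bytes_to_symbol(bytes)
  bytes.dup.force_encoding(Encoding::UTF_8).to_sym
end

restored = bytes_to_symbol(wire)
```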

@jackdoe jackdoe merged commit 7416ae4 into Sereal:master May 28, 2013