ruby implementation #14

Merged
merged 20 commits into from

3 participants

@jackdoe
Owner

This is a simple Ruby implementation (including Snappy compression support).
It is still incomplete (it has no tracking support: refp/refn/alias/object/objectv) and its RB_OBJECT support is different (it is more like HASH; it is described in ext/sereal/encode.c and uses the RESERVED + 1 tag).

RB_OBJECT tag: (RESERVED)
    class name <STR>
    instance variables count <VARINT>
    [instance variable name <STR>, instance variable value <ITEM>, ...]
SRL_HDR_SYM tag: (RESERVED + 1)
    len <VARINT>
    value - <len> bytes
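
To make the RB_OBJECT layout above concrete, here is a minimal Ruby sketch of an encoder that follows the same field order. It is not the code from ext/sereal/encode.c: the tag value, the `<STR>` form (varint byte length followed by raw bytes) and the helper names `encode_varint`, `encode_str` and `encode_object` are assumptions made for illustration, and values are limited to strings where a real encoder would recurse into any `<ITEM>`.

```ruby
# Placeholder only: the real RESERVED tag number is defined by the protocol headers.
RB_OBJECT_TAG = 0xFF

# Varint: 7 bits per byte, least significant group first,
# high bit set on every byte except the last.
def encode_varint(n)
  raise ArgumentError, "varint must be non-negative" if n < 0
  bytes = []
  loop do
    byte = n & 0x7f
    n >>= 7
    if n.zero?
      bytes << byte
      break
    end
    bytes << (byte | 0x80)
  end
  bytes.pack("C*")
end

# Assumed <STR> form for this sketch: byte length as a varint, then the raw bytes.
def encode_str(s)
  raw = s.to_s.b
  encode_varint(raw.bytesize) + raw
end

# tag, class name, instance variable count, then name/value pairs.
def encode_object(obj)
  ivars = obj.instance_variables
  out = [RB_OBJECT_TAG].pack("C")
  out << encode_str(obj.class.name)
  out << encode_varint(ivars.size)
  ivars.each do |name|
    out << encode_str(name)
    # Sketch only: values are written as strings instead of full <ITEM>s.
    out << encode_str(obj.instance_variable_get(name))
  end
  out
end

class Point
  def initialize(x, y)
    @x = x
    @y = y
  end
end

encoded = encode_object(Point.new("1", "2"))
```

Every instance variable ends up on the wire, which is what makes the approach fast but also indiscriminate.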

Due to the nature of the Sereal protocol, it fits very nicely in my work environment and will replace BSON there, because it is n times faster (especially for hashes with a few short_binary keys filled with small arrays).
In the future I will implement support for cyclic structures.

@demerphq
Owner

I'm a little worried by this patch. proto.h is generated from the specification file, so merging these headers will make your life awkward. (It is doable if you insist, but you should look at the precedents for generated files.)

Owner

I think you are right. When I decided to combine them it seemed like a good idea: there are a few macros operating on proto.h keywords, and because the protocol specification is very compact, merging them looked OK (it was 23:50 on New Year's Eve).

Now when I look at it, it looks awkward.

@jackdoe jackdoe split sereal.h into sereal.h and proto.h (protocol specification); buffer.c, sereal.c: rename DEFAULT_CALLBACK to s_default_reader; encode.c: fix s_append_integer() to work with ULL integers; correctly #undef RETURN_STRING
c697c31
@jackdoe
Owner

BTW, I am not sure how people will feel about this way of encoding Ruby objects (it sends all instance variables), but it is very fast because it does not use to_json or to_msgpack methods to create a new object, etc.

It is a bit subjective in this case, because currently I transfer simple objects but encode/decode them to hashes.

So maybe it would be better to just check for a to_srl method and call it before encoding the object.
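
A rough sketch of the to_srl idea, assuming a hypothetical helper `prepare_for_encoding` that runs before the encoder sees a value. The `User` class and its fields are invented for the example; only the `to_srl` method name comes from the discussion.

```ruby
# Hypothetical pre-encoding hook: use the object's own to_srl if it has one,
# otherwise hand the value to the encoder unchanged.
def prepare_for_encoding(value)
  value.respond_to?(:to_srl) ? value.to_srl : value
end

class User
  def initialize(name, age)
    @name = name
    @age = age
  end

  # The object decides what goes on the wire, instead of all instance variables.
  def to_srl
    { "name" => @name, "age" => @age }
  end
end

as_hash   = prepare_for_encoding(User.new("ann", 30))
unchanged = prepare_for_encoding([1, 2])
```

The trade-off is one `respond_to?` check per object in exchange for letting each class control its serialized form.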

@demerphq
Owner
jackdoe added some commits
@jackdoe jackdoe Improve decoder speed, cleanup
Move decoder's function pointer registration from runtime to static this
improved its performance dramatically (depending on the input 10-30%).

- cleanup
- create realloc_or_raise() that is used in s_alloc() and
  alloc_or_raise()
- benchmark using benchmark/ips, which tries to compare
  iterations/second
094696e
@jackdoe jackdoe Improve encoder performance
use function pointers to avoid bad branch prediction with
switch() because there is no way to know which type of objects
are most common in the input.
b0a1b76
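
The table-dispatch shape described in this commit can be sketched in Ruby as follows. This is only an analogue: the C code indexes an array of function pointers, and the branch-prediction argument applies to the C code, not to this sketch. The `WRITERS` table, `write_item`, and the toy output format are assumptions for illustration.

```ruby
# Built once at load time: one table lookup per value
# instead of walking a long case/when.
WRITERS = {
  Integer => ->(v) { "I:#{v}" },
  String  => ->(v) { "S:#{v}" },
  Array   => ->(v) { "A[" + v.map { |e| write_item(e) }.join(",") + "]" },
}.freeze

# Looks up the writer by exact class; subclasses would need their own entries.
def write_item(value)
  writer = WRITERS.fetch(value.class) do
    raise TypeError, "no writer registered for #{value.class}"
  end
  writer.call(value)
end

result = write_item([1, "a", [2]])
```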
@jackdoe jackdoe Inline buffer functions 86b8cb2
@jackdoe jackdoe Disable SRL_RB_OBJECT decoding, unless requested
After recent security problems with deserializing stuff into arbitrary objects,
maybe it will be best to let the user decide if he/she can trust the input data.
Sereal.decode(data) will not decode SRL_RB_OBJECT tag but Sereal.decode(data,false)
will spawn arbitrary objects.
a5b9c8c
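
A minimal sketch of the opt-in behaviour this commit describes, assuming a hypothetical decoder step `build_object` that receives an already-parsed class name and instance variables. The flag mirrors the second argument of `Sereal.decode` mentioned above (objects are only built when it is `false`). The error class and the `Account` class are invented for the example.

```ruby
class UnsafeObjectError < StandardError; end

# Hypothetical decoder step for an object record.
# safe = true is the default: refuse to instantiate anything from input data.
def build_object(class_name, ivars, safe = true)
  raise UnsafeObjectError, "refusing to instantiate #{class_name}" if safe
  # allocate skips initialize; instance variables are restored directly.
  obj = Object.const_get(class_name).allocate
  ivars.each { |name, value| obj.instance_variable_set(name, value) }
  obj
end

class Account
  attr_reader :owner, :balance
end

account = build_object("Account", { "@owner" => "ann", "@balance" => 10 }, false)
```

With the default, untrusted input cannot choose which class gets instantiated. The caller has to state explicitly that the data is trusted.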
@guai

Am I right in thinking that non-ASCII :šÿmbõłs are not supported in your implementation?

@demerphq
Owner
jackdoe added some commits
@jackdoe jackdoe cleanup, unused variables, remove FLT2NUM etc..
* find maximum size of ruby_value_type enum - those we use
as indexes in WRITER[] function pointer array

* FLT2NUM was removed, use DBL2NUM (casting float to double)
* remove some unused variables and signess warnings
* add 'requirements' in README
22f60e1
@jackdoe jackdoe add rubinius (mri 1.9) support a9460f4
@jackdoe jackdoe cleanup rubinius port c2a62ac
@jackdoe jackdoe cleanup a20bcb2
@guai

demerphq, I was asking about the Ruby implementation. As far as I know, Sereal supports UTF-8 strings. But :symbols in Ruby may be UTF-8 too, so the implementation does not seem right.

jackdoe added some commits
@jackdoe jackdoe temporary: remove support for symbols/raw objects
* Convert symbols to strings
* call to_srl on objects and expect serializeable structure (hash,
  array,string..etc)
2f84d8d
@jackdoe jackdoe update readme e67e43f
@jackdoe
Owner

@guai yeah, this implementation did not support UTF-8 symbols, but it was unclear whether there would be a tag for Ruby symbols. Once that decision is made, fixing UTF-8 symbols will be pretty easy.
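
For illustration, a sketch of what a UTF-8-safe symbol round trip could look like with the `len <VARINT>` plus bytes layout from the PR description. It is not the fix that was eventually made: the helper names are hypothetical, and lengths are limited to 127 bytes so the varint fits in one byte.

```ruby
# Assumed layout: len <VARINT>, then <len> bytes.
# The length is counted in bytes, not characters.
def encode_symbol_payload(sym)
  raw = sym.to_s.encode(Encoding::UTF_8).b
  raise ArgumentError, "sketch only handles symbols up to 127 bytes" if raw.bytesize > 127
  [raw.bytesize].pack("C") + raw
end

# Reads the length byte, slices that many bytes, and re-tags them as UTF-8
# before converting back to a symbol.
def decode_symbol_payload(data)
  len = data.getbyte(0)
  data.byteslice(1, len).force_encoding(Encoding::UTF_8).to_sym
end

sym = :"šÿmbõłs"
payload = encode_symbol_payload(sym)
round_tripped = decode_symbol_payload(payload)
```

The symbol from the question has 7 characters but 11 bytes, which is why the length field has to count bytes.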

@jackdoe jackdoe merged commit 7416ae4 into Sereal:master
Commits on Dec 31, 2012
  1. @jackdoe

    initial

    jackdoe authored
  2. @jackdoe

    forgot to remove old file

    jackdoe authored
  3. @jackdoe
  4. @jackdoe

    remove empty file

    jackdoe authored
  5. @jackdoe
  6. @jackdoe

    fix bm.rb require

    jackdoe authored
  7. @jackdoe

    Update ruby/README.md

    jackdoe authored
Commits on Jan 2, 2013
  1. @jackdoe

    split sereal.h into sereal.h and proto.h (protocol specification); buffer.c, sereal.c: rename DEFAULT_CALLBACK to s_default_reader; encode.c: fix s_append_integer() to work with ULL integers; correctly #undef RETURN_STRING

    jackdoe authored
Commits on Jan 3, 2013
  1. @jackdoe

    Improve decoder speed, cleanup

    jackdoe authored
    Move decoder's function pointer registration from runtime to static this
    improved its performance dramatically (depending on the input 10-30%).
    
    - cleanup
    - create realloc_or_raise() that is used in s_alloc() and
      alloc_or_raise()
    - benchmark using benchmark/ips, which tries to compare
      iterations/second
  2. @jackdoe

    Improve encoder performance

    jackdoe authored
    use function pointers to avoid bad branch prediction with
    switch() because there is no way to know which type of objects
    are most common in the input.
  3. @jackdoe

    Inline buffer functions

    jackdoe authored
Commits on Feb 3, 2013
  1. @jackdoe

    Disable SRL_RB_OBJECT decoding, unless requested

    jackdoe authored
    After recent security problems with deserializing stuff into arbitrary objects,
    maybe it will be best to let the user decide if he/she can trust the input data.
    Sereal.decode(data) will not decode SRL_RB_OBJECT tag but Sereal.decode(data,false)
    will spawn arbitrary objects.
Commits on Apr 12, 2013
  1. @jackdoe

    cleanup, unused variables, remove FLT2NUM etc..

    jackdoe authored
    * find maximum size of ruby_value_type enum - those we use
    as indexes in WRITER[] function pointer array
    
    * FLT2NUM was removed, use DBL2NUM (casting float to double)
    * remove some unused variables and signess warnings
    * add 'requirements' in README
Commits on Apr 14, 2013
  1. @jackdoe

    add rubinius (mri 1.9) support

    jackdoe authored
  2. @jackdoe

    cleanup rubinius port

    jackdoe authored
  3. @jackdoe

    cleanup

    jackdoe authored
Commits on Apr 15, 2013
  1. @jackdoe

    temporary: remove support for symbols/raw objects

    jackdoe authored
    * Convert symbols to strings
    * call to_srl on objects and expect serializeable structure (hash,
      array,string..etc)
  2. @jackdoe

    update readme

    jackdoe authored
  3. @jackdoe

    cleanup

    jackdoe authored
Commits on Apr 16, 2013
  1. @jackdoe

    Update README.md

    jackdoe authored