Variable length integers in serialization #741

kddnewton · 2023-03-30T20:37:45Z

kddnewton
Mar 30, 2023
Maintainer

Most of the integers that we store in serialization are related to offsets. For every integer we have, we use 4 bytes to represent the value. We don't need that much space for most cases.

One thing we could do instead is use variable-length integers. Protobuf describes the idea pretty well: https://protobuf.dev/programming-guides/encoding/#varints. This could save a fair amount of space, because I would imagine most files are small enough that every offset fits below 2^14 (2 bytes as a varint), and almost all of the rest fit below 2^21 (3 bytes).
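As a rough illustration of the varint idea from the protobuf guide, here is a minimal Ruby sketch. It is not the project's serializer, and `encode_varint` is a hypothetical helper name used only for this example. It shows where the 2^14 and 2^21 thresholds come from: each byte carries 7 bits of payload.

```ruby
# Encode a non-negative integer as a base-128 varint (protobuf style):
# 7 payload bits per byte, least-significant group first, and the high
# bit set on every byte except the last one.
def encode_varint(value)
  raise ArgumentError, "value must be non-negative" if value.negative?

  bytes = []
  loop do
    byte = value & 0x7F
    value >>= 7
    if value.zero?
      bytes << byte          # last byte: continuation bit clear
      break
    end
    bytes << (byte | 0x80)   # more bytes follow: continuation bit set
  end
  bytes.pack("C*")
end

# Print how many bytes a few sample offsets need.
[0, 127, 128, 2**14 - 1, 2**14, 2**21 - 1, 2**21].each do |offset|
  puts format("%8d -> %d byte(s)", offset, encode_varint(offset).bytesize)
end
```

Offsets below 2^14 need at most 2 bytes and offsets below 2^21 need at most 3, compared with a fixed 4 bytes today.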

@enebo, @eregon thoughts? Would decoding these slow deserialization down so much that it wouldn't be worth it?

enebo · 2023-03-30T22:15:31Z

enebo
Mar 30, 2023
Collaborator

I have not read this particular guide, but I have changed {start_offset, end_offset} to {start_offset, length}, since all our APIs use that format and length is usually a small value. I then combined that with using a short for every value that fits in 15 bits, with the highest value acting as a marker that says the value is stored as a full int. (I have also played with 6-bit byte values, short values, and long, but that is barely different from just short+int.) This reduced the size of the serialized file by 50% or so. In my case the speed got a tiny bit faster, so reducing the size was mildly beneficial to speed as well. It always works at byte boundaries, which I think makes it a simple scheme.

I will read that document and see if I can grok it :) It looks complicated but a lot of thought has been put into protobufs so I am optimistic.

We should try it and measure it, but I definitely think some scheme can be used without hurting perf.
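The short-or-int scheme with {start_offset, length} described in this comment could look roughly like the sketch below. The details are assumptions made for illustration, not taken from the JRuby branch: the marker value 0x7FFF, big-endian byte order, and the helper names `encode_short_or_int`, `decode_short_or_int`, and `encode_location` are all hypothetical.

```ruby
MARKER = 0x7FFF # assumed marker: the highest 15-bit value

# Write values below the marker as a 2-byte short; otherwise write the
# marker followed by a full 4-byte int. Big-endian order is an assumption.
def encode_short_or_int(value)
  raise ArgumentError, "out of range" unless (0..0xFFFF_FFFF).cover?(value)
  return [value].pack("n") if value < MARKER

  [MARKER, value].pack("nN")
end

# Read one value starting at byte position pos.
# Returns [value, next_position].
def decode_short_or_int(bytes, pos)
  short = bytes.byteslice(pos, 2).unpack1("n")
  return [short, pos + 2] if short < MARKER

  [bytes.byteslice(pos + 2, 4).unpack1("N"), pos + 6]
end

# Store a location as {start_offset, length} rather than
# {start_offset, end_offset}, so the second value is usually small.
def encode_location(start_offset, end_offset)
  encode_short_or_int(start_offset) + encode_short_or_int(end_offset - start_offset)
end
```

Under these assumptions, a node at offset 100_000 with length 12 takes 6 + 2 = 8 bytes, versus 12 bytes if the end offset were stored, and everything stays on byte boundaries.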

4 replies

eregon Apr 4, 2023
Maintainer

Ah yes this is a good argument for length instead of end_offset. Should we change to that everywhere then (including deserialized nodes and original nodes)?

enebo Apr 4, 2023
Collaborator

@eregon for JRuby all our methods seem to be designed around (index, length), so as it stands we are doing a little math. For Nodes it is definitely my preference (and it is trivial to change in serialize.c). TR is probably similar, but if not, it would be a decision about whether to make it compress a bit more or not. As for the C structs feeding serialization and compile.c, it probably just depends on how compile.c consumes those offsets.

I should point out in my playground branch I am generating JRuby stuff into a jruby directory. I am hoping we coalesce on important aspects of serialization (smaller/faster/less-processing) but we may not want to do everything the same. This is especially true once we start considering AST-less building or just even which extra interfaces/subtypes we want. We have a bigger discussion on best strategy for packaging too.

eregon Apr 5, 2023
Maintainer

Changing to start, length would be fine for TruffleRuby, it's already what Truffle uses internally actually.

eregon Apr 5, 2023
Maintainer

> We have a bigger discussion on best strategy for packaging too.

Any pointer to that?

enebo · 2023-03-31T14:28:18Z

enebo
Mar 31, 2023
Collaborator

Using a single high bit per byte for continuation, rather than what I am doing, seems like it would help on space. I suppose the space I waste by using a whole marker value for continuation is offset by not requiring the math that base-128 varints need, but simple int math is pretty cheap. The reduction in bytes to process is at odds with the math needed to reassemble the value. It is tough to know without trying whether we would notice that overhead or not.

0 replies

eregon · 2023-04-04T10:35:58Z

eregon
Apr 4, 2023
Maintainer

One note: while this can decrease the serialized size, it won't change anything about deserialized node size, e.g. for Java or Ruby nodes.

But it still seems highly worth it if we can make the serialized size smaller without hurting performance too much. I think we should also look at msgpack's varint encoding, which may be simpler. The protobuf encoding seems less efficient to decode, because you need to check the continuation bit of every byte to know how long the value is, versus knowing the length from the first byte read.
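To make "knowing the length from the first byte" concrete, here is a sketch of one possible prefix-based layout. The layout is an assumption chosen for illustration (the top 2 bits of the first byte hold the total byte count minus one), and `encode_prefixed` and `decode_prefixed` are hypothetical names. It is not claimed to match msgpack or any branch mentioned in this thread.

```ruby
# Hypothetical layout: the top 2 bits of the first byte hold (total bytes - 1);
# the remaining 6 bits plus any following bytes hold the value, big-endian.
# This caps values at 2**30 - 1.
def encode_prefixed(value)
  raise ArgumentError, "out of range" unless (0...2**30).cover?(value)

  size = 1
  size += 1 while value >= (1 << (8 * size - 2))
  tagged = value | ((size - 1) << (8 * size - 2))
  # Emit the low `size` bytes of `tagged`, most significant byte first.
  (size - 1).downto(0).map { |i| (tagged >> (8 * i)) & 0xFF }.pack("C*")
end

# Returns [value, next_position]. The total length is known after
# reading only the first byte, so there is no per-byte continuation check.
def decode_prefixed(bytes, pos)
  first = bytes.getbyte(pos)
  size = (first >> 6) + 1
  value = first & 0x3F
  (1...size).each { |i| value = (value << 8) | bytes.getbyte(pos + i) }
  [value, pos + size]
end
```

Under this assumed layout, values up to 63 take 1 byte, up to 2^14 - 1 take 2 bytes, up to 2^22 - 1 take 3 bytes, and up to 2^30 - 1 take 4 bytes. The trade-off against protobuf varints is a smaller maximum value in exchange for a length that is known up front.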

22 replies

eregon May 20, 2023
Maintainer

I used main...eregon:yarp:varint to try the various varint strategies. I only did the serialization part, and some strategies I only mocked by writing the right number of bytes but not necessarily the correct value (we don't need to deserialize to check sizes anyway).

Here are the results. core is TruffleRuby's core files; the second run, without an argument, covers all .rb files of top-100-gems.

Initial, 32-bits for each uint32_t:
$ ruby serialized_size.rb core
source:        917377
serialized:   2248206
serialized/source: 2.45
$ ruby serialized_size.rb
source:      90191454
serialized: 150875080
serialized/source: 1.67

BER length encoding with endOffset:
$ ruby serialized_size.rb core
source:        917377
serialized:   1689720
serialized/source: 1.84
$ ruby serialized_size.rb
source:      90191454
serialized: 134330027
serialized/source: 1.49

For below, all with length instead of endOffset:

BER length encoding:
$ ruby serialized_size.rb core
source:        917377
serialized:   1315840
serialized/source: 1.43
$ ruby serialized_size.rb
source:      90191454
serialized:  98883526
serialized/source: 1.10

short or int:
$ ruby serialized_size.rb core
source:        917377
serialized:   1388546
serialized/source: 1.51
$ ruby serialized_size.rb
source:      90191454
serialized: 103607718
serialized/source: 1.15

2-bits prefix:
$ ruby serialized_size.rb core
source:        917377
serialized:   1167889
serialized/source: 1.27
$ ruby serialized_size.rb
source:      90191454
serialized:  82751892
serialized/source: 0.92

protobuf:
$ ruby serialized_size.rb core
source:        917377
serialized:   1162465
serialized/source: 1.27
$ ruby serialized_size.rb
source:      90191454
serialized:  82353925
serialized/source: 0.91

BER length encoding is very simple and a nice gain, but not as good as 2-bits prefix and protobuf which are basically the same in terms of serialized size.
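For reference, BER length encoding in the ASN.1 sense (the definite-length form from X.690) can be sketched as below. Whether the branch measured above uses exactly this layout is an assumption, and `encode_ber_length` and `decode_ber_length` are hypothetical names for this example.

```ruby
# BER definite-length form: values up to 127 take one byte; larger values
# take a byte 0x80 | n followed by n big-endian value bytes.
def encode_ber_length(value)
  raise ArgumentError, "value must be non-negative" if value.negative?
  return [value].pack("C") if value < 0x80

  payload = []
  while value > 0
    payload.unshift(value & 0xFF) # build big-endian byte list
    value >>= 8
  end
  ([0x80 | payload.size] + payload).pack("C*")
end

# Returns [value, next_position].
def decode_ber_length(bytes, pos)
  first = bytes.getbyte(pos)
  return [first, pos + 1] if first < 0x80

  count = first & 0x7F
  value = 0
  count.times { |i| value = (value << 8) | bytes.getbyte(pos + 1 + i) }
  [value, pos + 1 + count]
end
```

With this form, 128..255 takes 2 bytes and 256..65535 takes 3 bytes, whereas a protobuf varint covers everything below 2^14 in at most 2 bytes. That difference in the mid-range would be consistent with BER coming out somewhat larger in the measurements.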

enebo May 21, 2023
Collaborator

@eregon cool. It feels like the time to read the bytes (not standing anything up, just decoding them back to ints) could be a deciding factor. Probably not, but if it is easy to try, it would probably settle the winner. Memory is also important, for laziness, as a second property we want.

eregon May 22, 2023
Maintainer

For time to decode in Ruby, protobuf is faster than raw and probably faster than all others (as they involve reading multiple bytes at once which causes allocations or the need to use IO::Buffer) so that's good: #836 (reply in thread)
That's not the same for Java decoding though, I'll post it here when I get some measurements about that.
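The Ruby-side point about reading one byte at a time can be illustrated with a small decoder built on String#getbyte, which returns an Integer rather than a new String. This is a sketch, not the loader from #836; `decode_varint` is a hypothetical name.

```ruby
# Decode one base-128 (protobuf-style) varint from `bytes` starting at `pos`.
# Returns [value, next_position]. Each read is a single String#getbyte call,
# so no intermediate String is created per value.
def decode_varint(bytes, pos)
  value = 0
  shift = 0
  loop do
    byte = bytes.getbyte(pos) or raise EOFError, "truncated varint"
    pos += 1
    value |= (byte & 0x7F) << shift   # low 7 bits are payload
    return [value, pos] if byte < 0x80 # high bit clear: last byte
    shift += 7
  end
end

# Decode two consecutive varints from one buffer.
buffer = [0xAC, 0x02, 0x05].pack("C*")
first, pos = decode_varint(buffer, 0)
second, pos = decode_varint(buffer, pos)
```

Reading a raw 4-byte integer in Ruby typically means slicing the string and unpacking it, which is the kind of multi-byte read the comment above refers to.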

eregon May 23, 2023
Maintainer

I created #927 and measured on Java.
Basically, it's a tiny bit slower with protobuf varints than with raw 4-byte integers, which intuitively makes sense: it's more work and less predictable than just reading 4 bytes and reinterpreting them as an int. On the other hand, there are simply fewer bytes to read, so that probably benefits performance to some degree too.
Protobuf varints are such a huge gain in serialized size that I think this is highly worth it.

I tried my own implementation of decoding protobuf varint, and then tried https://github.com/protocolbuffers/protobuf/blob/v23.1/java/core/src/main/java/com/google/protobuf/BinaryReader.java#L1507 which is a little bit faster.
I used TruffleRuby's core files and measured the time to deserialize all of them.

Java Loader:
Each group of timings below is, in order: raw 4-byte int vs my protobuf decoder vs the upstream protobuf decoder.

jvm (C2):
jt ruby -rbenchmark -e 'sources = Dir.glob("src/main/ruby/truffleruby/**/*.rb").map { |file| source=File.read(file); [source, Truffle::Debug.yarp_serialize(source)] }; 100.times { p Benchmark.realtime { 10.times { sources.each { |source, serialized| Truffle::Debug.yarp_load(source, serialized) } } } }'
0.04162979500142683
0.04219745499904093
0.04144576299950131
0.04172010699949169
0.04208778300017002
vs
0.04317428000103973
0.04366730699985055
0.04323257000032754
0.04333246200076246
0.0438804910008912
vs
0.04351711600065755
0.04308311800014053
0.04294450599991251
0.04293689600126527
0.04287302500051737

jvm-ce (Graal CE):
jt -u jvm-ce ruby -rbenchmark -e 'sources = Dir.glob("src/main/ruby/truffleruby/**/*.rb").map { |file| source=File.read(file); [source, Truffle::Debug.yarp_serialize(source)] }; 100.times { p Benchmark.realtime { 10.times { sources.each { |source, serialized| Truffle::Debug.yarp_load(source, serialized) } } } }'
0.03699708500062115
0.0370449149995693
0.03713681699991866
0.03724886900090496
0.03763073499976599
vs
0.04011225900103454
0.04194237899901054
0.04002559800028394
0.04003527900022164
0.04006649899929471
vs
0.03817277000052854
0.03817133099983039
0.03813794999950915
0.03880394000043452
0.03809320999971533


serialize:
jt ruby -rbenchmark -e 'sources = Dir.glob("src/main/ruby/truffleruby/**/*.rb").map { |file| source=File.read(file); [source, Truffle::Debug.yarp_serialize(source)] }; 30.times { p Benchmark.realtime { 10.times { sources.each { |source, serialized| Truffle::Debug.yarp_serialize(source) } } } }'
vs
0.5042316790004406
0.5033410449996154
0.505833053999595
0.5033581759998924
0.5047975580000639

It's pretty close for raw vs protobuf upstream (the 1st and 3rd groups of timings).

Also interesting is that deserializing is >10x faster than parsing + serializing (caveat: of course that includes JNI overhead and 2 copies for Java<->native), in other words, deserializing seems already pretty fast.

enebo May 23, 2023
Collaborator

@eregon Looks great. I am not terribly surprised by the ratio. If parsing were only for execution (and not also for representing the syntax), this ratio would decrease, but I could not give an estimate of by how much. Ruby parsing is a lot of work. Fast deserialization does give more power to pre-compilation.

I am fairly close to running more real stuff on my old WIP branch. Once I do I will see how much speeding up parsing would matter. Setting up a Ruby runtime has a bunch of other overhead.
