Analyze varints and zigzag encoding #1

olt · 2013-06-11T10:14:21Z

The 0.01 draft needs at least three bytes for all values above 128. Varint encoding, as used by protobuf, needs only two bytes for values from 128 to 16383.

https://developers.google.com/protocol-buffers/docs/encoding
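
To make the byte counts concrete, here is a minimal sketch of protobuf-style base-128 varint encoding for non-negative integers. The function name `encode_varint` is only illustrative and is not taken from the draft or from any implementation.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint (protobuf style)."""
    if value < 0:
        raise ValueError("zigzag-encode signed values before varint encoding")
    out = bytearray()
    while True:
        low7 = value & 0x7F          # lowest 7 payload bits
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # continuation bit set: more bytes follow
        else:
            out.append(low7)         # last byte: continuation bit clear
            return bytes(out)


# Print the encoded length and bytes around the 1/2/3-byte boundaries.
for n in (1, 127, 128, 300, 16383, 16384):
    encoded = encode_varint(n)
    print(n, len(encoded), encoded.hex())
```

Each byte carries 7 payload bits, so the two-byte range ends at 2^14 - 1 = 16383 and 16384 is the first value that needs three bytes.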

nicklasaven · 2013-06-11T10:48:58Z

Well, the first one will need three because you need one byte to give the new size. But the coordinates after that use only 2 bytes until the next change.

nicklasaven · 2013-06-11T10:55:25Z

That link about zigzag encoding was very interesting. But if I understand it right, you sacrifice one bit to tell that there is a next byte. Is that correct? Then one byte will only give you the range from -64 to 63. Or am I missing the point?
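
The single-byte range can be checked with a small sketch of zigzag encoding as described on the linked protobuf page. The helper names `zigzag_encode` and `zigzag_decode` are assumptions made for illustration.

```python
def zigzag_encode(n: int) -> int:
    """Map signed to unsigned: 0, -1, 1, -2, 2, ... -> 0, 1, 2, 3, 4, ..."""
    return (n << 1) if n >= 0 else ((-n << 1) - 1)


def zigzag_decode(z: int) -> int:
    """Inverse of zigzag_encode."""
    return (z >> 1) ^ -(z & 1)


# A one-byte varint has 7 payload bits, so it holds zigzag values 0..127.
one_byte = [n for n in range(-200, 200) if zigzag_encode(n) < 128]
print(min(one_byte), max(one_byte))
```

Running it shows that the signed values fitting in one byte are exactly -64 through 63.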

olt · 2013-06-11T11:06:26Z

Ok, I missed that the size is for all following coordinates.

For signed integers, it is right that one byte can store from -64 to 63. But you always store numbers in the smallest possible way and do not need to change the sizes back and forth.

nicklasaven · 2013-06-11T11:15:07Z

Yes, you are right. Very interesting. Here I think it comes down to statistics about real-world data. What I have noticed is that there are quite big differences between overview maps and more detailed maps. More detailed maps often have only a few meters between the points, and then almost all vertex points are stored as 1 byte even with a precision of 1 decimal. With 1 decimal it can hold values from -12.7 to +12.7.
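
As a rough illustration of how point spacing drives the size, the sketch below scales coordinates to 1 decimal, takes deltas, and counts the bytes each delta would need as a zigzag varint. The coordinate values are made up, and the helpers (`delta_sizes`, `varint_size`) are hypothetical; this is not the draft's actual layout.

```python
def zigzag_encode(n: int) -> int:
    """Map a signed integer to an unsigned one so small magnitudes stay small."""
    return (n << 1) if n >= 0 else ((-n << 1) - 1)


def varint_size(value: int) -> int:
    """Number of bytes a non-negative integer needs as a base-128 varint."""
    size = 1
    while value >= 0x80:
        value >>= 7
        size += 1
    return size


def delta_sizes(coords, decimals=1):
    """Bytes per value when storing the first coordinate, then deltas."""
    scale = 10 ** decimals
    scaled = [round(c * scale) for c in coords]
    deltas = [scaled[0]] + [b - a for a, b in zip(scaled, scaled[1:])]
    return [varint_size(zigzag_encode(d)) for d in deltas]


# Made-up x values in meters: steps of a few meters, then one 13.5 m jump.
xs = [512345.3, 512348.1, 512351.9, 512349.0, 512362.5]
print(delta_sizes(xs))
```

With these numbers the first coordinate takes 4 bytes, the steps of a few meters take 1 byte each, and the 13.5 m step takes 2 bytes, since a one-byte zigzag varint at 1 decimal covers only -6.4 to +6.3.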

olt · 2013-06-11T12:10:56Z

Yeah, only a comparison with real-world data will show which method is better. But varints could be used for all numbers (IDs and also first coordinates) and should benefit small geometries.

nicklasaven · 2013-06-11T12:44:31Z

Yes, that is an important point about ID and first coordinate.

Another question is what is fastest to parse in JavaScript, since that is another bottleneck.

Hmm, it just struck me.
With this varint, you sacrifice one bit per byte, not just one bit for the whole value.
So 16 bits will only give -8192 to +8191 instead of -32768 to +32767 with INT16.
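
The per-byte cost can be tabulated with a short sketch. It assumes zigzag plus base-128 varints on one side and plain two's-complement integers on the other; both function names are illustrative only.

```python
def signed_varint_range(num_bytes: int):
    """Signed range of a zigzag varint: 7 payload bits per byte."""
    half = 1 << (7 * num_bytes - 1)
    return -half, half - 1


def fixed_int_range(num_bytes: int):
    """Signed range of a fixed-width two's-complement integer."""
    half = 1 << (8 * num_bytes - 1)
    return -half, half - 1


# Compare the two encodings for 1, 2 and 4 bytes.
for nbytes in (1, 2, 4):
    print(nbytes, signed_varint_range(nbytes), fixed_int_range(nbytes))
```

Two bytes give 14 payload bits, so the varint range is a quarter of the INT16 range. In exchange, no separate size field is needed when the magnitude of the values changes.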

nicklasaven · 2013-08-17T22:50:35Z

I have added VarInt as method 1 in the spec, and it will very soon be in the PostGIS and JavaScript implementations.

It looks very promising :-)

nicklasaven · 2013-08-26T20:48:39Z

VarInt is now the one and only method in the TWKB spec and the PostGIS implementation.

olt mentioned this issue Jun 11, 2013

Support for int64 IDs #3

Closed

nicklasaven closed this as completed Aug 26, 2013
