
Quickdump size improvements #2788

Open
kodebach opened this issue Jun 14, 2019 · 10 comments

@kodebach (Contributor) commented Jun 14, 2019

There are a few ways the size of quickdump files can be improved.

String sizes

At the moment quickdump encodes a string of length n (without null terminator) as a 64-bit integer n followed by n bytes. Most of the time we don't need 64 bits, so a way to encode the lengths of short strings more efficiently would bring a lot of improvement.

To show how much space is wasted, think of a KeySet with 100 Keys, all of which have the metadata type. Both the metakey type and all of its allowed values are shorter than 255 bytes, so for each use of the type metadata we have 14 zero bytes of wasted space (7 in the length of the metakey name and 7 in the length of the metavalue). In the example with 100 Keys, the type metadata alone wastes 1400 bytes.
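For illustration, assuming the length is stored as a big-endian 64-bit integer, the metakey name type alone occupies twelve bytes on disk, and only the last of the eight length bytes carries any information:

```
00 00 00 00 00 00 00 04  74 79 70 65    length = 4, followed by "type"
```

The same pattern repeats for the metavalue, which is where the 14 wasted zero bytes per Key come from.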

Markers for short strings

In quickdump there are marker bytes that indicate what data comes next (e.g. m for a metakey and its value). We could use separate marker bytes to indicate that the following data uses strings shorter than 256 bytes and therefore encodes the string lengths in 8 bits instead of 64 bits.

This comes with only a minor performance hit during reading and writing. However, as soon as a string reaches 256 bytes (which might happen for key values, but is otherwise unlikely), we are back to wasting bytes.
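A minimal sketch of the write side of this idea (the function name, marker letters and error handling are made up for illustration; this is not the actual quickdump code):

```c
#include <endian.h> // htobe64; glibc-specific, assumed here for the sketch
#include <stdint.h>
#include <stdio.h>

// Hypothetical helper: write a string, choosing the short or long form.
// 's' marks an 8-bit length, 'S' a 64-bit length (marker letters invented).
static int writeString (FILE * file, const char * string, uint64_t length)
{
	if (length <= UINT8_MAX)
	{
		// short form: marker byte + 1 length byte
		if (fputc ('s', file) == EOF || fputc ((int) length, file) == EOF) return -1;
	}
	else
	{
		// long form: marker byte + 8 length bytes (big-endian assumed)
		uint64_t bigEndian = htobe64 (length);
		if (fputc ('S', file) == EOF) return -1;
		if (fwrite (&bigEndian, sizeof (bigEndian), 1, file) != 1) return -1;
	}
	return fwrite (string, 1, length, file) == length ? 0 : -1;
}
```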

Even more markers

The above solution could be extended. At the moment the marker bytes are expressive (s for a string value, b for a binary one, etc.). If we abandon this, we could encode the width of the string-size field in 2 bits of the marker and use only the remaining 6 bits for the marker itself.

Similar performance impact to the above, and no bytes are ever wasted. The downsides: reduced expressiveness (not that important in a binary format) and fewer possibilities for future markers (only 64 instead of 256 possible markers).
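A sketch of how those 2 bits could be used (the helper names and bit assignments are purely illustrative):

```c
#include <stdint.h>

// Low 6 bits carry the (no longer mnemonic) marker itself.
static inline uint8_t markerType (uint8_t marker)
{
	return marker & 0x3F; // at most 64 distinct markers remain
}

// Top 2 bits select the width of the following length field.
static inline unsigned lengthFieldBytes (uint8_t marker)
{
	return 1u << (marker >> 6); // 00 -> 1, 01 -> 2, 10 -> 4, 11 -> 8 bytes
}
```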

Variable Length Integer Encoding

We could also simply always use a variable-length integer encoding for string sizes, for example LEB128 or PrefixVarint from https://github.com/stoklund/varint.
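For reference, unsigned LEB128 is only a few lines in either direction; this is a generic sketch, not tied to the quickdump code:

```c
#include <stdint.h>
#include <stdio.h>

// Encode: 7 bits per byte, continuation bit set on all but the last byte.
// Lengths < 128 take 1 byte, < 16384 take 2 bytes, and so on.
static int writeVarint (FILE * file, uint64_t value)
{
	do
	{
		uint8_t byte = value & 0x7F;
		value >>= 7;
		if (value != 0) byte |= 0x80; // continuation bit
		if (fputc (byte, file) == EOF) return -1;
	} while (value != 0);
	return 0;
}

// Decode: at most 10 bytes for a 64-bit value.
static int readVarint (FILE * file, uint64_t * value)
{
	uint64_t result = 0;
	for (unsigned shift = 0; shift < 64; shift += 7)
	{
		int byte = fgetc (file);
		if (byte == EOF) return -1;
		result |= ((uint64_t) (byte & 0x7F)) << shift;
		if ((byte & 0x80) == 0)
		{
			*value = result;
			return 0;
		}
	}
	return -1; // malformed: too many continuation bytes
}
```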

Key names

quickdump currently takes a KeySet and for each Key it stores the full name minus the parentKey name. So with the parentKey user/test we store the key user/test/my/interesting/new/key as my/interesting/new/key.

This allows quick encoding and decoding and also moving a KeySet from one parent to another. However, since KeySets are supposed to be hierarchical, we waste a lot of space by repeating the same prefixes over and over again.

For example this key set:

user/parent/my/key/x
user/parent/my/key/y
user/parent/my/key/y/a
user/parent/my/key/y/a/b/c
user/parent/my/key/y/a/b/delta
user/parent/my/key/y/a/b/delta/e
user/parent/my/key/y/a/b/f
user/parent/my/key/z

With the parent key user/parent this is stored as:

my/key/x
my/key/y
my/key/y/a
my/key/y/a/b/c
my/key/y/a/b/delta
my/key/y/a/b/delta/e
my/key/y/a/b/f
my/key/z

You can easily see all the repetition. Instead we could use this:

my/key/x
[1]
y
/a
/b/c
[1]
delta
/e
[7]
f
[7]
z

Here, by default, key names are appended to the previous key name instead of to the parent key. Additionally, for the case where the next key is not below the previous one, there are special markers: [n] indicates that the last n characters of the previous key name are removed before the next name is appended.

This is quite easy to implement in the reading part of quickdump. The writing is a lot more complicated, however, as we have to compare each key against the previous one. I suspect that would be quite a big performance hit. The efficient way of writing this format would require an efficient implementation of a prefix tree; then the output procedure becomes rather trivial.
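A minimal sketch of that per-key comparison for keys written in sorted order (function names are hypothetical; the real plugin would operate on Key objects): it computes how many characters of the previous name have to be dropped (the [n] marker) and which suffix to append.

```c
#include <stdio.h>
#include <string.h>

// Length of the common prefix of two key names, counted in characters,
// matching the [n] markers above (which also count characters).
static size_t commonPrefix (const char * previous, const char * current)
{
	size_t i = 0;
	while (previous[i] != '\0' && previous[i] == current[i]) ++i;
	return i;
}

// Emit one key name relative to the previously written one:
// a [n] marker if n characters of the previous name must be removed,
// followed by the new suffix. (Textual output only, for illustration.)
static void emitRelativeName (FILE * out, const char * previous, const char * current)
{
	size_t prefix = commonPrefix (previous, current);
	size_t toRemove = strlen (previous) - prefix;
	if (toRemove > 0) fprintf (out, "[%zu]\n", toRemove);
	fprintf (out, "%s\n", current + prefix);
}
```

Applied to the example above (with the keys already in sorted order), this reproduces the compressed listing; the cost is one character-wise comparison of every key name against its predecessor.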

Since quickdump is used in specload precisely because of its super fast writing and reading, any slowdown of writing caused by this prefix elimination would be unwanted. That means there has to be a way to enable/disable this form of compression. Alternatively, we could implement only the reading part and always write the uncompressed form. In addition we would provide a tool that compresses quickdump files. This tool could be used on specifications mounted with specload, as these files would only ever be read.

@markus2330 (Contributor) commented Jun 14, 2019

Thank you for the detailed discussion. Yes, I agree: quickdump is meant to be fast, not small.

Trying to make it small now will probably make it slow, without much success in making it small enough.

So either we compress the whole quickdump as-is (if this is good enough) or we create a new format which is as small as possible.

For communication with specload, it might make sense to send some header, so that specload can use different formats (e.g. different compressions).

In any case: do we really need this, as we agreed to have compiler-options to completely remove the spec?

@haraldg commented Jun 14, 2019

In any case: do we really need this, as we agreed to have compiler-options to completely remove the spec?

I can't really know, because I don't have a full picture of the tradeoffs involved. I'd first benchmark the other improvements and variants.

Maybe I shouldn't be involving myself in this discussion, but:

Trying to make it small now will probably make it slow

Generally speaking I kind of doubt that. Often reading a small file from storage is so much faster that you can easily expand it while the big version is still being read.

Also, if you are actually concerned about speed, wouldn't you store the keys in some more efficient representation than raw strings internally anyway to make lookups/compares faster (while probably saving RAM at the same time because of reduced duplication)? The same internal representation might also provide what you need to write the keys out in compact form to storage quickly.

@markus2330 (Contributor) commented Jun 15, 2019

I'd first benchmark the other improvements and variants.

Exactly.

Maybe I shouldn't be involving myself in this discussion

You are always welcome to join discussions.

Generally speaking I kind of doubt that. Often reading a small file from storage is so much faster that you can easily expand it while the big version is still being read.

It was not a general statement but one about quickdump and the suggestions above. Both quickdump as it is now and a new plugin based on the suggestions above make sense, depending on the available memory, I/O speed and so on.

Also, if you are actually concerned about speed, wouldn't you store the keys in some more efficient representation than raw strings internally anyway to make lookups/compares faster (while probably saving RAM at the same time because of reduced duplication)?

The lookup currently uses either bsearch with memcmp or, if available, a hash lookup. Both obviously need the whole key name to find the correct key. I doubt that someone can beat this performance-wise without immense effort (comparable to what was needed to bring bsearch to where it is now).

@haraldg commented Jun 15, 2019

You are always welcome to join discussions.

I know. But as I know nothing about the code in question and can only remark in very general terms, there is a high risk this just adds noise.

The lookup currently uses either bsearch with memcmp or, if available, a hash lookup. Both obviously need the whole key name to find the correct key. I doubt that someone can beat this performance-wise without immense effort (comparable to what was needed to bring bsearch to where it is now).

Fair enough. Intuitively I'd expect a tree based data structure to give much faster lookups. But maybe the average key set just isn't big enough for this to actually show off.

If you have already sorted the keys for bsearch, prefix elimination should be fairly cheap though ...

@markus2330 (Contributor) commented Jun 15, 2019

Fair enough. Intuitively I'd expect a tree based data structure to give much faster lookups.

Why so? It is also a binary search (log n) but the data is much more distributed in RAM. As said before, with huge investments in performance improvements you might get the same lookup time. And you need to implement your own memory management, otherwise trees need lots of mallocs.

What would be very interesting is an alternative implementation of Elektra which completely works without mallocs.

But maybe the average key set just isn't big enough for this to actually show off.

I often have large key sets, e.g. 35216 keys is normal. But then you want a hash map for fast lookups.

@haraldg commented Jun 15, 2019

Why so? It is also a binary search (log n) but the data is much more distributed in RAM.

I didn't prove it, but I think the complexity of bsearch is O(m*log(n)) while tree search should be doable in O(m+log(n)), where m is the typical length of a key. The point of a tree structure of course would be that the memory footprint of the whole key set is smaller than storing all keys in full length, thus making RAM access more localized instead of less. If this doesn't hold, then the tree indeed is a bad idea.
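For a rough sense of the difference: with the 35216 keys mentioned above and an assumed average name length of m = 30 characters, m * log2(n) comes to roughly 30 * 15 ≈ 450 character comparisons in the worst case, while m + log2(n) is around 45.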

And you need to implement your own memory management, otherwise trees need lots of mallocs.

Yes, and also pointers (on 64-bit systems) would be a bit large. The nodes would need to be addressed by array indices instead.

What would be very interesting is an alternative implementation of Elektra which completely works without mallocs.

That seems like a hard challenge. But implementing only the most important core data structures without mallocs might be doable. (Actually I'm not sure what the benefit would be, why it is interesting to you.)

But then you want a hash map for fast lookups.

I'm not familiar with the tradeoffs of hash maps.

@kodebach (Contributor, Author) commented Jun 15, 2019

Fair enough. Intuitively I'd expect a tree based data structure to give much faster lookups.

Not to go too far off topic here, but a tree-like structure might actually help a lot with spec stuff. You could easily look up globbing expressions at runtime with a lot less processing. The spec plugin would also become much more efficient, since you could create one tree of the spec and one of the actual configuration and then try to match those two trees.

The point of a tree structure of course would be that the memory footprint of the whole key set is smaller than storing all keys in full length,

We would still have to store the full key names. Otherwise you would have to traverse the tree every time you need the full key name for something. But like I said above, at least in some situations a tree structure could improve performance or even enable new features.

Yes, and also pointers (on 64-bit systems) would be a bit large. The nodes would need to be addressed by array indices instead.

Elektra uses size_t or kdb_long_long_t (possibly kdb_unsigned_long_long_t as well) for array indices. So there wouldn't be a lot of space saved. However, array indices would still be better, because they mean the mmapstorage plugin has to do less pointer correction.

I'm not familiar with the tradeoffs of hash maps.

AFAIK the hash map implementation that Elektra uses provides essentially constant lookup times and also preserves ordering. On the other hand, there is a significant effort involved in calculating the hash map in the first place. @markus2330 please correct me if I am wrong.


Anyway, for the case of quickdump:

quickdump is meant to be fast, not small.
Trying to make it small now will probably make it slow, without much success in making it small enough.

The first improvement I suggested (an option for 1-byte string sizes) would have basically no impact on reading. It would just be an additional case in an already existing switch. During writing we would need 1 additional if at each of the 4 points where we write a marker byte. As all these ifs would only use constants or values that are already available now, this should not have any real-world impact either.

To show how much space this alone could save, here is a calculation for the LCDd spec.

The whole spec has 454 keys. The longest key name used is 64 bytes. As these are spec keys, none of them have values. All keys together have 2557 metakeys, the longest of which is 32 bytes long. Of the corresponding metavalues, 73 are longer than 255 bytes.

For each of the strings short enough for a 1-byte length (including the empty ones) we would save 7 bytes. In total this makes (454 + 454 + (2557 - 73)) * 7 = 23744 bytes, or 37.65%.
(Note: this is not the full LCDd spec, only the spec for the parts currently implemented.)

The longest string in the whole spec (one of the descriptions) was 1020 bytes, so the second and third suggestions would save another 73 * 7 = 511 bytes, although I am not a fan of using this many markers. The code complexity would also increase quite a bit (performance still shouldn't be an issue, it would just be a few more conditional jumps during writing).

Using an actual variable-length integer encoding would definitely have a performance impact that has to be benchmarked. However, considering that WebAssembly and various other projects (e.g. Protobuf) use such encodings while maintaining performance, this should be doable as well.

@markus2330 (Contributor) commented Jun 16, 2019

@haraldg wrote:

Actually I'm not sure what the benefit would be, why it is interesting to you.

Actually, @chrysn came up with the requirement and there seems to be a market for systems which do not support mallocs. It is currently not listed in the public topics for students.

@kodebach wrote:

The spec plugin would also become much more efficient, since you could create one tree of the spec and one of the actual configuration and then try to match those two trees.

It is definitely not prohibited to transform the KeySet into other, more suitable data structures. Keys are actually designed so that they can be inserted into other data structures while still being members of KeySets. E.g. the mounting logic is implemented with a trie because the KeySet is not suitable for suffix lookups. For parsers/generators it also makes sense to have a data structure which represents the file format. (For the hosts file we use an array in the order of the hosts entries. And maybe @sanssecours implemented new ones? Ini and yajl clearly show the limitations of exclusively using KeySets.)

means the mmapstorage plugin has to do less pointer correction.

According to @mpranj the pointer correction is quite cheap.

On the other hand, there is a significant effort involved in calculating the hash map in the first place.

Yes, because of that it only pays off for larger key sets and/or many lookups. The hash map is also stored in mmap, so the buildup does not need to pay off during a single execution of a binary. (This is not benchmarked, though.)

use such encodings while maintaining performance, this should be doable as well.

Of course it is doable, but it is a project on its own. And we should first check whether we can use one of the existing implementations.

@sanssecours (Member) commented Jun 17, 2019

And maybe @sanssecours implemented new ones?

Not really. Most of the YAML plugins use a Key stack, but apart from that, all of the plugins I wrote just use Key or KeySet data structures directly.

@kodebach referenced this issue Jun 17, 2019: Quickdump v3 #2791 (merged)
@kodebach (Contributor, Author) commented Jun 17, 2019

I implemented string size improvements in #2791.
