-
Notifications
You must be signed in to change notification settings - Fork 13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Memory efficient management of outputs #28
Comments
But, the approach can degrade time efficiency to extract members from a byte array. |
How about using the following bincode like variant encoding: That encoding cannot represent a large value (>= 0x1000000) in 4 byte sequence, but I think it is enough to represent lengths. For example match x {
0..=253 => [x],
254..=0xffff => [254, x & 0xff, x >> 8],
0x10000..=0xffffff => [255, x & 0xff, x >> 8 & 0xff, x >> 16],
} |
1 bit for a flag, so match x {
0..=125 => [x],
126..=0xffff => [126, x & 0xff, x >> 8],
0x10000..=0xffffff => [127, x & 0xff, x >> 8 & 0xff, x >> 16],
} |
Yeah, such variable encoding is a good alternative. Compared to a well-known variable byte encoding, the bincode scheme would be faster because it does not need For a terminator flag, I think embedding it into 1 bit of |
A memory-efficient representation of output sets is achieved by another solution (#30). |
Daachorse handles a value-length pair as a pattern's output (see https://github.com/daac-tools/daachorse/blob/main/src/lib.rs#L289-L292)
In the current implementation, 31 bits and 32 bits are assigned to length and value, respectively. (1 bit is used for the flag,)
But, in many cases, the assignment is too rich.
For example, when the maximum length is 255, 1 byte is sufficient to represent.
If we know the maximum length and value, we can memory-efficiently store members on byte-aligned memory.
For example, if a length is represented in 1 byte and a value (with flag) is represented in 3 bytes, we can interleave them in a byte array
outputs
as follows.The text was updated successfully, but these errors were encountered: