[QUESTION] Best way to store hashes? #2443
v4 had `FileStorage` available, which is presumably better for storing binary blobs:
File storage appears to be for dealing with large files. This is for numerous small blobs, and since they have a fixed, known length, in theory it should be possible to optimize for that. Doing this manually, i.e. by splitting a 128-bit hash into two 64-bit ints, doesn't help. Storing the data as a string does seem to be the best option. I'm guessing there are just more optimizations out there (in C# and/or in LiteDB) for string processing than for `byte[]` processing, and that's what improves the performance.
Good point. I read "128 KB", not "128 bits". Sorry.
Perhaps I'm testing the wrong way, but I'm getting roughly the same file sizes and insertion times (except for the Base64 string approach) with the following:
(Yes, I know using Benchmark would have been better, but I was just trying to get a rough feel for times and sizes.)
For my test using two longs, I made a struct instead of putting the longs directly into the model. I'm guessing that's the difference. GUID I hadn't even considered trying.
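For anyone comparing these options, here is a minimal sketch of the representations mentioned in this thread for a 128-bit value: a Base64 string, two longs, and a `Guid`. The `HashEncodings` class and its method names are made up for illustration and are not from anyone's test code here. It only uses the .NET base class library, so it says nothing about how LiteDB serializes each form.

```csharp
using System;

// Hypothetical helper: different in-memory shapes for the same 16-byte hash.
public static class HashEncodings
{
    // Base64 text form: 16 bytes always encode to 24 characters (including padding).
    public static string ToBase64(byte[] hash) => Convert.ToBase64String(Check(hash));

    // Two 64-bit halves, read in the machine's native byte order.
    public static (long Low, long High) ToLongs(byte[] hash)
    {
        Check(hash);
        return (BitConverter.ToInt64(hash, 0), BitConverter.ToInt64(hash, 8));
    }

    // Inverse of ToLongs on the same machine.
    public static byte[] FromLongs(long low, long high)
    {
        var bytes = new byte[16];
        BitConverter.GetBytes(low).CopyTo(bytes, 0);
        BitConverter.GetBytes(high).CopyTo(bytes, 8);
        return bytes;
    }

    // A Guid is also exactly 16 bytes; Guid.ToByteArray() returns the same bytes back.
    public static Guid ToGuid(byte[] hash) => new Guid(Check(hash));

    private static byte[] Check(byte[] hash)
    {
        if (hash == null || hash.Length != 16)
            throw new ArgumentException("Expected a 128-bit (16-byte) hash.", nameof(hash));
        return hash;
    }
}
```

Whether the longs sit directly on the model or inside a struct is a mapping question for LiteDB, so that difference would still have to be measured rather than assumed.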
I'm currently storing hashes in a LiteDB database for use as a file cache. I'm planning on expanding it, so I was thinking of revisiting the storage. Right now I'm just using Base64-encoded strings. I figured that changing the storage to bytes would be more efficient.
So I wrote a quick test program to find out. I made 1,000,000 integers, hashed them, and used a simple stopwatch to compare strings and `byte[]`. I'm using murmurhash for this; the output is 128 bits. The models are simple:
The results were not what I expected:
This is just from doing an `InsertBulk` on the models. Inserting bytes took longer, which wasn't what I expected. The database is smaller, so I assume this isn't doing any kind of binary->hex conversion for storage, but I would naively assume that the smaller bytes record would also insert faster.
Is there a better way to store small binary data in LiteDB? I'm more concerned about speed than file size, so should it be left as a Base64 string? Or am I setting up the models incorrectly in some way?
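For readers who want to reproduce this kind of comparison, below is a rough sketch of what such a test could look like. It is not the original test program: the model and class names are hypothetical, MD5 stands in for murmurhash only because it also produces 128 bits and ships with .NET, and the collection name `hashes` is arbitrary. It assumes the LiteDB NuGet package is referenced.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.Security.Cryptography;
using LiteDB;

// Hypothetical model storing the hash as a Base64 string.
public class StringHashModel
{
    public int Id { get; set; }
    public string Hash { get; set; }
}

// Hypothetical model storing the hash as raw bytes.
public class BytesHashModel
{
    public int Id { get; set; }
    public byte[] Hash { get; set; }
}

public static class InsertComparison
{
    // Stand-in for murmurhash (an assumption): MD5 also yields 16 bytes.
    public static byte[] StandInHash(int value)
    {
        using (var md5 = MD5.Create())
            return md5.ComputeHash(BitConverter.GetBytes(value));
    }

    // Bulk-inserts pre-built items into a fresh database file and
    // returns the elapsed time and the resulting file size in bytes.
    public static (TimeSpan Elapsed, long FileSize) Measure<T>(string path, IReadOnlyList<T> items)
    {
        if (File.Exists(path)) File.Delete(path);

        var stopwatch = Stopwatch.StartNew();
        using (var db = new LiteDatabase(path))
        {
            db.GetCollection<T>("hashes").InsertBulk(items);
        }
        stopwatch.Stop();

        return (stopwatch.Elapsed, new FileInfo(path).Length);
    }
}
```

Building the item lists before calling `Measure` keeps the hashing and Base64 encoding cost out of the timing, so only the insert itself is compared. A single stopwatch run is noisy, so it's worth repeating each measurement a few times before drawing conclusions.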