opt.Options.Filter question #8

Closed
JensRantil opened this issue Mar 9, 2013 · 7 comments

@JensRantil
Contributor

Since the documentation does not mention anything about this: is it possible to use a different filter each time I call leveldb.Open(...)? The reason I ask is that the bloom filter might need to be enlarged as the database grows.

@syndtr
Owner

syndtr commented Mar 9, 2013

Of course. The bloom filter is optional, and you may create your own filter implementation if you wish. The default value of opt.Options.Filter is nil (no filter).

No filter block is created if no filter is defined (the default).

@syndtr syndtr closed this as completed Mar 9, 2013
@JensRantil
Contributor Author

Thanks for your answer, Suryandaru.

So, if I reopen the database using a different bloom filter, will
rebuilding the filter block the opening of the database? Or will the
rebuild take place asynchronously? Also, is the filter state persisted if
I reuse the same (in this case, bloom) filter?


@syndtr
Owner

syndtr commented Mar 10, 2013

So, if I reopen the database using a different bloom filter, will rebuilding the filter block the opening of the database? Or will the rebuild take place asynchronously?

The new filter will take effect eventually. New sstables will be generated using the new filter (filter blocks in old sstables that were generated by the old filter will be ignored during read operations). Eventually, as compaction happens, the new filter will take effect entirely.

You may force a rebuild of the entire database by overwriting all keys (e.g. overwriting each key with its own value).
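As a rough illustration of the "overwrite each key with its own value" trick, here is a minimal sketch. It deliberately does not use goleveldb's API: `kvStore` and `memStore` are hypothetical stand-ins made up for this example, and in a real database the same loop would go through the database's iterator and write calls.

```go
package main

import "fmt"

// kvStore is a hypothetical minimal interface used only for this sketch.
// It is NOT goleveldb's API.
type kvStore interface {
	Keys() []string
	Get(key string) (string, bool)
	Put(key, value string)
}

// memStore is an in-memory stand-in that counts writes.
type memStore struct {
	data   map[string]string
	writes int
}

func (m *memStore) Keys() []string {
	keys := make([]string, 0, len(m.data))
	for k := range m.data {
		keys = append(keys, k)
	}
	return keys
}

func (m *memStore) Get(key string) (string, bool) {
	v, ok := m.data[key]
	return v, ok
}

func (m *memStore) Put(key, value string) {
	m.data[key] = value
	m.writes++
}

// rewriteAll overwrites every key with its current value and returns the
// number of keys rewritten. In an LSM store each such write ends up in a
// newly generated table, which is what lets the new filter be applied.
func rewriteAll(s kvStore) int {
	n := 0
	for _, k := range s.Keys() {
		if v, ok := s.Get(k); ok {
			s.Put(k, v)
			n++
		}
	}
	return n
}

func main() {
	s := &memStore{data: map[string]string{"a": "1", "b": "2"}}
	fmt.Println(rewriteAll(s), s.writes, s.data["a"])
}
```

Note that this rewrites the whole data set, so on a large database it costs a full pass of reads and writes.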

Also, is filter state persisted if I reuse the same (in this case, bloom) filter?

The filter state is written in the sstable, including the name of the filter that generated it. During read operations the filter block will be loaded and used iff it has the same name as the current filter.
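The name check described above can be modelled roughly as follows. This is a sketch under stated assumptions, not goleveldb's code: the `Filter` interface, `setFilter`, `filterBlock` and `needsTableRead` are all hypothetical names, and `setFilter` is an exact key set standing in for a real bloom filter so the behaviour is deterministic.

```go
package main

import (
	"fmt"
	"strings"
)

// Filter is a hypothetical, simplified filter interface for this sketch.
type Filter interface {
	Name() string
	Build(keys []string) []byte
	MayContain(data []byte, key string) bool
}

// setFilter is a toy exact-membership filter (a stand-in for a bloom filter).
type setFilter struct{ name string }

func (f setFilter) Name() string { return f.name }

func (f setFilter) Build(keys []string) []byte {
	return []byte(strings.Join(keys, "\x00"))
}

func (f setFilter) MayContain(data []byte, key string) bool {
	for _, k := range strings.Split(string(data), "\x00") {
		if k == key {
			return true
		}
	}
	return false
}

// filterBlock models what a table persists: filter data plus the name of
// the filter that generated it.
type filterBlock struct {
	Name string
	Data []byte
}

func buildBlock(f Filter, keys []string) filterBlock {
	return filterBlock{Name: f.Name(), Data: f.Build(keys)}
}

// needsTableRead reports whether the table itself must be searched for key.
// The stored block is consulted only if its name matches the current filter;
// otherwise it is ignored and the table has to be read.
func needsTableRead(current Filter, blk filterBlock, key string) bool {
	if current == nil || blk.Name != current.Name() {
		return true
	}
	return current.MayContain(blk.Data, key)
}

func main() {
	v1 := setFilter{name: "toy.v1"}
	v2 := setFilter{name: "toy.v2"}
	old := buildBlock(v1, []string{"apple", "pear"})

	fmt.Println(needsTableRead(v1, old, "kiwi"))  // false: same name, filter rules the key out
	fmt.Println(needsTableRead(v2, old, "kiwi"))  // true: name mismatch, block ignored
	fmt.Println(needsTableRead(nil, old, "kiwi")) // true: no filter configured
}
```

The practical consequence is that a renamed or differently named filter loses the benefit of every filter block written under the old name until those tables are rewritten.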

JensRantil added a commit to JensRantil/goleveldb that referenced this issue Mar 24, 2013
The documentation came out of issue/question syndtr#8.
@JensRantil
Contributor Author

I know it's been a while now, but I just wanted to thank you for your answer! I've had it on my TODO list to add this to the documentation and finally did it today. See pull request #9.

The new filter will take effect eventually. New sstables will be generated using the new filter (filter blocks in old sstables that were generated by the old filter will be ignored during read operations). Eventually, as compaction happens, the new filter will take effect entirely.

Revisiting this answer, it struck me that this could lead to a huge performance penalty during the migration period from one filter to another. This is even worse because, when you are changing filters, you usually depend on the fact that 1) querying needs to be fast and 2) you have a database with many keys.

A possible way to work around the above issue would be to introduce opt.Options.oldFilters (or similar), a list of old filters that have been used. On database creation, create a lookup map from filter name to filter.Filter instance (also including opt.Options.Filter). Then, for new sstables, use opt.Options.Filter, and for filter checks, first look up the filter instance by name (an O(1) operation) and use it if found. This would make it possible to reuse older filters during the migration period, and I don't think it would have a significant performance penalty.
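To make the proposal concrete, here is a minimal sketch of the name-to-filter lookup. Everything in it is hypothetical: `Options.OldFilters`, `newRegistry`, `lookupNeedsRead` and the toy `setFilter` are invented for illustration and are not part of goleveldb.

```go
package main

import (
	"fmt"
	"strings"
)

// Filter is a hypothetical, simplified filter interface for this sketch.
type Filter interface {
	Name() string
	Build(keys []string) []byte
	MayContain(data []byte, key string) bool
}

// setFilter is a toy exact-membership filter (a stand-in for a bloom filter).
type setFilter struct{ name string }

func (f setFilter) Name() string { return f.name }

func (f setFilter) Build(keys []string) []byte {
	return []byte(strings.Join(keys, "\x00"))
}

func (f setFilter) MayContain(data []byte, key string) bool {
	for _, k := range strings.Split(string(data), "\x00") {
		if k == key {
			return true
		}
	}
	return false
}

// filterBlock models a persisted filter block and its generating filter's name.
type filterBlock struct {
	Name string
	Data []byte
}

// Options mirrors the proposal: Filter is used for new tables, OldFilters
// (hypothetical) lists filters that may have written existing tables.
type Options struct {
	Filter     Filter
	OldFilters []Filter
}

// newRegistry builds the name -> filter map once, when the database is opened.
func newRegistry(o Options) map[string]Filter {
	reg := make(map[string]Filter, len(o.OldFilters)+1)
	for _, f := range o.OldFilters {
		if f != nil {
			reg[f.Name()] = f
		}
	}
	if o.Filter != nil {
		reg[o.Filter.Name()] = o.Filter
	}
	return reg
}

// lookupNeedsRead picks the filter by the block's stored name (one map
// lookup) and falls back to reading the table if no such filter is known.
func lookupNeedsRead(reg map[string]Filter, blk filterBlock, key string) bool {
	f, ok := reg[blk.Name]
	if !ok {
		return true
	}
	return f.MayContain(blk.Data, key)
}

func main() {
	v1 := setFilter{name: "toy.v1"}
	v2 := setFilter{name: "toy.v2"}
	old := filterBlock{Name: v1.Name(), Data: v1.Build([]string{"apple", "pear"})}

	migrating := newRegistry(Options{Filter: v2, OldFilters: []Filter{v1}})
	plain := newRegistry(Options{Filter: v2})

	fmt.Println(lookupNeedsRead(migrating, old, "kiwi")) // false: old block still usable
	fmt.Println(lookupNeedsRead(plain, old, "kiwi"))     // true: old block ignored
}
```

With the old filter registered, tables written before the switch keep their filtering benefit until compaction rewrites them with the new filter.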

Last, but not least, I am not experiencing this as an issue. In fact, I don't have goleveldb in production as of now. Rather, you should see this as a potential future improvement/feature. Would you like me to file a separate issue for it? I'd love to hear your input. Also, do you know if the mother project (LevelDB) has implemented something like this?

@syndtr
Owner

syndtr commented Mar 25, 2013

Revisiting this answer, it struck me that this could lead to a huge performance penalty during the migration period from one filter to another. This is even worse because, when you are changing filters, you usually depend on the fact that 1) querying needs to be fast and 2) you have a database with many keys.

A possible way to work around the above issue would be to introduce opt.Options.oldFilters (or similar), a list of old filters that have been used. On database creation, create a lookup map from filter name to filter.Filter instance (also including opt.Options.Filter). Then, for new sstables, use opt.Options.Filter, and for filter checks, first look up the filter instance by name (an O(1) operation) and use it if found. This would make it possible to reuse older filters during the migration period, and I don't think it would have a significant performance penalty.

Sounds like a good idea.

Would you like me to file a separate issue for it?

That would be great

Also, do you know if the mother project (LevelDB) has implemented something like this?

No, there is no such mechanism in the original leveldb implementation.

@JensRantil
Contributor Author

Would you like me to file a separate issue for it?

That would be great

Said and done - see #11.

@syndtr
Owner

syndtr commented Apr 4, 2013

Thanks
