
Explore using Bloom filter for id field index to optimize read performance #4489

Closed
itiyama opened this issue Sep 12, 2022 · 19 comments · Fixed by #11027
Labels: discuss, enhancement, Indexing & Search, v2.12.0

Comments


itiyama commented Sep 12, 2022

OpenSearch uses an FST data structure to store document ids and then searches it via a term query. For most use cases, document id queries are effectively "does_exist" checks, and a Bloom filter is known to perform better for such queries. We will explore this data structure for storing document ids and then measure read performance for various use cases:

  1. Mostly appends
  2. Upserts, but no updates. Equal number of appends and upserts
  3. Both updates and appends

This is a spin-off from a discussion captured here.
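To make the "does_exist" idea concrete, here is a minimal sketch (all names hypothetical; the real candidates are Lucene's FST terms dictionary and a Bloom filter postings format): a Bloom filter answers definite negatives cheaply, and only possible positives fall through to the authoritative, more expensive lookup.

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: k deterministic hashes over a bit array."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, key):
        # derive k independent positions via blake2b with distinct "person" tags
        for i in range(self.num_hashes):
            d = hashlib.blake2b(key.encode(), digest_size=8,
                                person=bytes([i]) + b"\x00" * 15).digest()
            yield int.from_bytes(d, "big") % self.num_bits

    def add(self, key):
        for p in self._positions(key):
            self.bits[p // 8] |= 1 << (p % 8)

    def might_contain(self, key):
        # False is definitive; True may be a false positive
        return all(self.bits[p // 8] & (1 << (p % 8))
                   for p in self._positions(key))

def doc_exists(doc_id, bloom, fst_lookup):
    """fst_lookup stands in for the slow authoritative terms-dict walk."""
    if not bloom.might_contain(doc_id):
        return False              # definite miss: FST lookup short-circuited
    return fst_lookup(doc_id)     # possible false positive: verify
```

The win comes from the short-circuit: for workloads dominated by ids that do not exist in a segment, most lookups never touch the terms dictionary at all.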

@itiyama itiyama added enhancement Enhancement or improvement to existing feature or request untriaged labels Sep 12, 2022
@dreamer-89 dreamer-89 added Indexing & Search discuss Issues intended to help drive brainstorming and decision making distributed framework and removed untriaged labels Sep 12, 2022

mgodwan commented Aug 28, 2023

While analysing updates, I found FST lookups to be taking time as well, so I started exploring Bloom filter based postings for the id field and worked on a POC: https://github.com/mgodwan/OpenSearch/compare/d76614cb4f071989da04d0976c1227b8943e5b61..02a344004413d39b08d882cc2b04321808682909

Running this against the nyc_taxis dataset with a 25% conflict probability, we see indexing throughput gains on the order of ~30%.

Server: 1 node r6g.2xlarge, 32G Heap, 64G RAM
Client: c4.4xlarge, 16 client instances
opensearch-benchmark execute-test --workload=nyc_taxis --target-hosts "<<HOST>>" --results-format csv --pipeline benchmark-only --test-procedure update --workload-params '{"index_settings": {"number_of_shards": 1, "number_of_replicas": 0, "codec": "best_compression", "refresh_interval": "60s", "translog.flush_threshold_size": "4g"}}'

Following are the benchmark results (7 runs for each code):

[Screenshot: benchmark results table]

Following is a differential flame graph of the CPU usage analysis, which clearly shows the advantage of the short-circuiting the Bloom filter introduces:

[Screenshot: differential CPU flame graph]

Memory Used Percentage (left: baseline, right: candidate): [screenshot]
JVM Used Percentage (left: baseline, right: candidate): [screenshot]
Total Young GC time (left: baseline, right: candidate): [screenshot]
CPU Used Percentage (left: baseline, right: candidate): [screenshot]


mgodwan commented Aug 28, 2023

Based on the above results, I think we should look into exposing this as an opt-in feature to start with, while working to optimize the usage of Bloom filters internally and potentially exposing this as a data structure for other fields.

In parallel, we can continue to investigate optimizations and the impact this introduces on JVM, storage, and the get/search paths (if any), to promote this towards a default feature for indices.

@shwetathareja @backslasht @dblock @itiyama
Looking forward to your thoughts on this.

@backslasht (Contributor)

@mgodwan - Thanks for the experiments. With an approx. 2% increase in disk size, a 20%+ decrease in latency is great.

We should definitely consider bringing this in as an opt-in feature. Are you thinking of exposing the conflict probability as an option as well?


itiyama commented Aug 29, 2023

@mgodwan These are great results. Any reason for choosing this as opt-in instead of default? I am aware that some users piggyback on term features on the id field, but I am thinking that going forward we should disable term features and just use id as a mechanism for "does_exist" or get style queries, so that the optimizations can be more tailored towards these types of queries.


shwetathareja commented Aug 29, 2023

The numbers look great. +1 @mgodwan to offer this as an opt-in/experimental feature with a setting to configure the false positive %. @itiyama, it is pending full-scale testing to evaluate the impact on JVM/storage with different workloads, and also how it impacts merging, before marking it production ready.

@mgodwan, let's try to make it into the 2.10 release as an opt-in feature.


mgodwan commented Aug 30, 2023

Thanks folks for the comments.

We should definitely consider to bring it as opt-in feature. Are you thinking of exposing conflict probability as an option as well?

@backslasht Yes, we can expose the false positive probability as an argument, as it provides a trade-off between JVM/storage usage and the throughput improvement we observe for update use cases.

Any reason for choosing this as an opt-in instead of default?

@itiyama For append-only use cases with auto-generated ids, this may not help and would add additional compute/memory/storage overhead. Also, we are still profiling the overall resource utilization of the new data structure in order to make a more informed decision.

I am thinking that going forward we should disable term features and just use id as a mechanism for does_exist

@itiyama I agree that it will help optimize the majority of use cases where the FST may be an overhead. I think we can track a separate issue for evaluating the underlying deterministic structure.
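For reference on the false-positive-probability setting discussed above, the textbook Bloom filter sizing formulas quantify the storage side of the trade-off (an illustration of the standard math, not necessarily the exact sizing logic the POC uses):

```python
import math

# Standard Bloom filter sizing: for a target false positive probability p,
# the optimal filter uses
#   m/n = -ln(p) / (ln 2)^2   bits per key
#   k   = (m/n) * ln 2        hash functions
def bloom_bits_per_key(p: float) -> float:
    return -math.log(p) / (math.log(2) ** 2)

def bloom_optimal_hashes(p: float) -> int:
    return round(bloom_bits_per_key(p) * math.log(2))

for p in (0.01, 0.05, 0.1, 0.2):
    print(f"fpp={p}: {bloom_bits_per_key(p):5.2f} bits/key, "
          f"k={bloom_optimal_hashes(p)}")
```

Halving the false positive probability costs roughly 1.44 extra bits per key, which is exactly the JVM/storage vs. update-throughput trade-off being discussed.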


mgodwan commented Aug 30, 2023

@shwetathareja @backslasht

The Bloom filter implementation in Lucene is part of Lucene's codec extensions, outside of Lucene core.

Digging further while exploring the existing implementation and reviewing past changes (example), it seems the implementation has changed in the past, is marked experimental, and does not provide direct support for backward compatibility across Lucene versions.

This may be concerning for us when supporting version upgrades, and may require an implementation which can ensure backward compatibility.

Hence, I wanted to discuss:

  1. Is it alright to provide this as an experimental, feature-flag-gated setting in 2.10 given these constraints, as it provides an opportunity to gather customer feedback on the feature? This will restrict upgrades for indices using the Bloom filter postings when Lucene implementations change, unless customers disable Bloom filters through re-indexing/force-merging old segments into new ones.

  2. We may need to explore whether it would be beneficial to use the code from Lucene's implementation and ensure backward compatibility on it. This adds extra maintenance overhead for us, but also provides more control and opportunities to add support for more forms of fuzzy sets (e.g. https://arxiv.org/pdf/1912.08258.pdf) to optimize with.


shwetathareja commented Aug 30, 2023

Is it alright to provide this as an experimental feature flag gated setting in 2.10 given these constraints as it provides opportunity to gather customer feedback on the feature? This will restrict the upgrades for indices using the bloom filter postings when Lucene implementations change, unless customers disable bloom filters through re-index/force merge old segments into new.

Generally, we don't provide backward compatibility guarantees for experimental features; we would need to call that out explicitly as well. Do you know the reason why it is not part of the Lucene core codecs? Also, considering our approach may change drastically as we look into providing backward compatibility, we shouldn't rush for the 2.10 release.

@backslasht (Contributor)

Backward compatibility is a must.

I agree with @shwetathareja: if backward compatibility is not supported, let's review our approach.


mgodwan commented Aug 30, 2023

Do you know the reason why it is not part of Lucene core codecs?

It looks like only codecs in core are supported for backward compatibility. This one is marked experimental along with the other extension codecs (although it has been present in the Lucene repository for a long time), due to which the guarantees for backward compatibility may not apply. I can open an issue in the Lucene repo to discuss this with the community.

If backward compatibility is not supported, lets review our approach.

I agree.
This can be achieved through an extension of Lucene's PostingsFormat which implements a postings format similar to Lucene's while adding the semantics needed to ensure backward compatibility. Are we open to introducing custom codecs like this in OpenSearch core?

@backslasht (Contributor)

This can be achieved through an extension of Lucene's PostingsFormat which can implement a postings format similar to Lucene's while adding needed semantics for ensuring backward compatibility. Are we open to introducing custom codecs like this in OpenSearch core?

We recently introduced ZStd as a custom codec (#9658) and there are more to come (Snappy). If we can support backward compatibility via a custom codec, it is worth giving it a try.


mgodwan commented Sep 19, 2023

For the bloom filter implementation, following is the throughput observed with varying false positive probabilities:

[Plot: indexing throughput vs. false positive probability]


The storage/memory variation for different false positive probabilities on the nyc_taxis dataset is as follows:

[Plot: storage/memory vs. false positive probability]



mgodwan commented Sep 19, 2023

I'm working on collating the results for an XOR filter implementation as well, with the same parameters.
I will update the issue with them.


mgodwan commented Oct 25, 2023

I worked on comparing different fuzzy set implementations over the past few days. Following is a comparison of them at similar size overhead:

[Screenshot: indexing throughput comparison of fuzzy set implementations]

We observe the XOR filter to provide the highest reported indexing throughput among the implementations. However, this comes at the cost of higher construction time and JVM overhead during construction (the XOR filter POC is based on https://github.com/FastFilter/fastfilter_java).


Following is a result of micro-benchmarks performed using JMH to profile filter construction times:

Filter Construction

Benchmark                                (numIds)  (setTypeAlias)  Mode  Cnt      Score       Error  Units
FilterConstructionBenchmark.buildFilter   1000000    bloom_filter  avgt    3     26.229 ±     9.165  ms/op
FilterConstructionBenchmark.buildFilter   1000000      xor_filter  avgt    3     78.331 ±    27.188  ms/op
FilterConstructionBenchmark.buildFilter   1000000   cuckoo_filter  avgt    3     52.045 ±     9.371  ms/op
FilterConstructionBenchmark.buildFilter  10000000    bloom_filter  avgt    3    422.899 ±   578.249  ms/op
FilterConstructionBenchmark.buildFilter  10000000      xor_filter  avgt    3   2363.247 ±   781.973  ms/op
FilterConstructionBenchmark.buildFilter  10000000   cuckoo_filter  avgt    3    943.940 ±  1622.361  ms/op
FilterConstructionBenchmark.buildFilter  50000000    bloom_filter  avgt    3   8545.999 ±  1791.090  ms/op
FilterConstructionBenchmark.buildFilter  50000000      xor_filter  avgt    3  20415.212 ± 26552.341  ms/op
FilterConstructionBenchmark.buildFilter  50000000   cuckoo_filter  avgt    3  10126.537 ±   887.455  ms/op
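As context for the "similar size overhead" comparison above, the theoretical space cost of the two main candidates at an equal false positive rate can be computed from standard formulas (illustrative arithmetic only, not measurements from the POC; an xor8 filter stores one 8-bit fingerprint per slot, with roughly 1.23 slots per key):

```python
import math

def bloom_bits_per_key(p: float) -> float:
    # optimal Bloom filter: -ln(p) / (ln 2)^2 bits per key
    return -math.log(p) / (math.log(2) ** 2)

xor8_fpp = 1 / 256            # 8-bit fingerprints give ~2^-8 false positives
xor8_bits_per_key = 1.23 * 8  # ~9.84 bits per key

bloom_bits = bloom_bits_per_key(xor8_fpp)  # ~11.54 bits per key
print(f"xor8:  {xor8_bits_per_key:.2f} bits/key at fpp ~{xor8_fpp:.4f}")
print(f"bloom: {bloom_bits:.2f} bits/key at the same fpp "
      f"(~{100 * (bloom_bits / xor8_bits_per_key - 1):.0f}% more)")
```

At a matched false positive rate the XOR filter is the smaller structure at query time; the benchmarks above show that its cost is shifted into construction instead.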


mgodwan commented Oct 25, 2023

Side note: While trying to use existing cuckoo filter implementation in OpenSearch, I encountered an issue: #10582

Will fix it separately.

@shwetathareja (Member)

@mgodwan: do you also see scope for improvement in XOR filter construction, considering it provides better throughput and lower latencies compared to the Bloom and cuckoo filters?


mgodwan commented Nov 6, 2023

@shwetathareja The implementation used for the XOR filter was based on https://arxiv.org/pdf/1912.08258.pdf, where the mapping step requires a few auxiliary data structures which consume high memory during construction. There was no evident way to eliminate these during the experiments.
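To make the construction-memory point concrete, here is a toy 3-wise XOR filter sketch along the lines of the paper (illustrative Python with made-up names, not the POC code; assumes unique keys). The peeling ("mapping") step needs per-slot key sets and an assignment stack that exist only during construction; that transient footprint is what dominates peak memory, while only the compact fingerprint table survives:

```python
import hashlib
from collections import defaultdict, deque

def _h(key: str, seed: int, which: int, mod: int) -> int:
    # deterministic keyed hash; the salt encodes (seed, hash index)
    d = hashlib.blake2b(key.encode(), digest_size=8,
                        salt=bytes([seed, which]) + b"\x00" * 14).digest()
    return int.from_bytes(d, "big") % mod

class ToyXorFilter8:
    def __init__(self, keys):
        # ~1.23 slots per key, split into 3 equal blocks (one per hash)
        self.block = (int(1.23 * len(keys)) + 32) // 3 + 1
        for seed in range(256):          # retry on (rare) peeling failure
            stack = self._peel(keys, seed)
            if stack is not None:
                break
        else:
            raise RuntimeError("peeling failed for all seeds")
        self.seed = seed
        self.table = [0] * (3 * self.block)
        # assign 8-bit fingerprints in reverse peeling order
        for key, slot in reversed(stack):
            val = _h(key, seed, 9, 256)
            for s in self._slots(key, seed):
                if s != slot:
                    val ^= self.table[s]
            self.table[slot] = val

    def _slots(self, key, seed):
        return [i * self.block + _h(key, seed, i, self.block)
                for i in range(3)]

    def _peel(self, keys, seed):
        # auxiliary structures: per-slot key sets plus a stack; discarded
        # after construction, but they dominate peak memory
        slot_keys = defaultdict(set)
        for k in keys:
            for s in self._slots(k, seed):
                slot_keys[s].add(k)
        queue = deque(s for s, ks in slot_keys.items() if len(ks) == 1)
        stack = []
        while queue:
            slot = queue.popleft()
            if len(slot_keys[slot]) != 1:
                continue
            (key,) = slot_keys[slot]
            stack.append((key, slot))
            for s in self._slots(key, seed):
                slot_keys[s].discard(key)
                if len(slot_keys[s]) == 1:
                    queue.append(s)
        return stack if len(stack) == len(keys) else None

    def might_contain(self, key) -> bool:
        s0, s1, s2 = self._slots(key, self.seed)
        return _h(key, self.seed, 9, 256) == (
            self.table[s0] ^ self.table[s1] ^ self.table[s2])
```

The finished table is small, but during construction the per-slot sets and the stack hold every key in heap-allocated structures at once, which matches the JVM overhead observed for the XOR filter POC.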

@kiranprakash154 (Contributor)

Hi, are we on track for this to be released in 2.12?

@shwetathareja (Member)

@kiranprakash154 Yes, we are targeting 2.12 for the release of Bloom filter support for document ids in OpenSearch.
