testuj-to/tantivy-merge-policy-demo

Tantivy - unexpected merge policy demo

This repository contains a demo of the unexpected behaviour of Tantivy's merge policy described in the issue quickwit-oss/tantivy#2454.

Experimental results

The following results were measured with a release-profile build on an M1 Max with 64 GB of RAM, indexing 1000 randomly generated documents:

| Run | Commit | Merge policy | Wait for merge threads | Time | Segment counts | compute_merge_candidates |
|---|---|---|---|---|---|---|
| A | Single | MergeWhenever | No | 245ms | .fast: 4x, .fieldnorm: 4x, .idx: 4x, .pos: 4x, .store: 4x, .term: 4x | Calls: 8x (0 args: 4x, 1 arg: 3x, 2 args: 1x) |
| B | Single | MergeWhenever | Yes | 488ms | .fast: 1x, .fieldnorm: 1x, .idx: 1x, .pos: 1x, .store: 1x, .term: 1x | Calls: 12x (0 args: 6x, 1 arg: 4x, 2 args: 2x) |
| C | Single | TargetDocs | No | 377ms | .fast: 6x, .fieldnorm: 5x, .idx: 5x, .pos: 5x, .store: 6x, .term: 5x | Calls: 10x (0 args: 6x, 1 arg: 4x) |
| D | Single | TargetDocs | Yes | Inf. loop | ??? | Calls: >63466x (0 args: >31734x, 1 arg: >31732x) |
| E | After every change | MergeWhenever | No | 198s | .fast: 5x, .fieldnorm: 5x, .idx: 5x, .pos: 5x, .store: 5x, .term: 5x | Calls: 5992x (0 args: 2282x, 1 arg: 2712x, 2 args: 998x) |
| F | After every change | MergeWhenever | Yes | 211s | .fast: 1x, .fieldnorm: 1x, .idx: 1x, .pos: 1x, .store: 1x, .term: 1x | Calls: 5998x (0 args: 2273x, 1 arg: 2726x, 2 args: 999x) |
| G | After every change | TargetDocs | No | 575s | .fast: 1004x, .fieldnorm: 1003x, .idx: 1003x, .pos: 1002x, .store: 1004x, .term: 1003x | Calls: 14548x (0 args: 8274x, 1 arg: 6274x) |
| H | After every change | TargetDocs | Yes | Inf. loop | ??? | Calls: >62218x (0 args: >32109x, 1 arg: >30109x) |
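
The per-call tallies in the compute_merge_candidates column could be collected with a small counter along the following lines. This is only a sketch using the Rust standard library, not the instrumentation this demo actually uses: `CallHistogram` and its methods are hypothetical names, and it assumes that "args" means the number of segments passed to a single call. A `Mutex` guards the map on the assumption that the policy may be invoked from more than one thread.

```rust
use std::collections::BTreeMap;
use std::sync::Mutex;

/// Hypothetical helper (not part of Tantivy): tallies how many segments
/// each merge-policy call received.
#[derive(Debug, Default)]
pub struct CallHistogram {
    // Key: number of segments passed in one call.
    // Value: how many calls were made with that many segments.
    counts: Mutex<BTreeMap<usize, usize>>,
}

impl CallHistogram {
    pub fn new() -> Self {
        Self::default()
    }

    /// Record one call that received `num_segments` segments.
    pub fn record(&self, num_segments: usize) {
        let mut counts = self.counts.lock().unwrap();
        *counts.entry(num_segments).or_insert(0) += 1;
    }

    /// Total number of recorded calls.
    pub fn total_calls(&self) -> usize {
        let counts = self.counts.lock().unwrap();
        counts.values().sum()
    }

    /// Number of recorded calls that received exactly `num_segments` segments.
    pub fn calls_with(&self, num_segments: usize) -> usize {
        let counts = self.counts.lock().unwrap();
        counts.get(&num_segments).copied().unwrap_or(0)
    }

    /// Render the tallies in the same shape as the table cells above.
    pub fn report(&self) -> String {
        let counts = self.counts.lock().unwrap();
        let total: usize = counts.values().sum();
        let mut lines = vec![format!("Calls: {}x", total)];
        for (num_segments, calls) in counts.iter() {
            let noun = if *num_segments == 1 { "arg" } else { "args" };
            lines.push(format!("{} {}: {}x", num_segments, noun, calls));
        }
        lines.join("\n")
    }
}
```

In a real policy, `record(segments.len())` would be the first statement of the merge-policy callback, and `report()` would be printed once indexing finishes (or when a stuck run is terminated).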

Observations

  • Runs D and H never finished; I terminated them manually after 45-50 minutes
  • Runs D and H share two settings: both use the TargetDocs merge policy and both wait for the merging threads
  • When TargetDocs is used, compute_merge_candidates is never invoked with more than one merge candidate, regardless of the other settings (number of commits or waiting for merging threads)
  • The TargetDocs merge policy is slightly heavier, in computation and memory, than the very simple MergeWhenever merge policy

Conclusion

Race condition.

Well...

  • When the merge policy is "heavier" than some threshold, a race condition occurs with some internal Tantivy procedure
  • This race condition somehow causes compute_merge_candidates never to be passed more than one merge candidate
  • Waiting for merging threads in combination with this race condition causes the program to get stuck in a (possibly) infinite loop
