Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MB-60844: MapReduce for PreSearch #1999

Merged
merged 13 commits into from
Mar 20, 2024
Merged

MB-60844: MapReduce for PreSearch #1999

merged 13 commits into from
Mar 20, 2024

Conversation

CascadingRadium
Copy link
Contributor

  • Instead of Merging all the PreSearch Results in one shot at the main coordinator node of an alias tree, merge them incrementally at each level of the tree instead, which would balance the reduction process across all the indexes in a distributed Bleve index, leading to a more even memory distribution.

index_alias_impl.go Outdated Show resolved Hide resolved
index_alias_impl.go Outdated Show resolved Hide resolved
search/scorer/scorer_knn.go Show resolved Hide resolved
knnProcessor *KnnPreSearchResultProcessor
}

func CreatePreSearchResultProcessor(req *SearchRequest) *PreSearchResultProcessor {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why can't return type here simply be KnnPreSearchResultProcessor? I don't really see the need for 2 structs above. We can drop PreSearchResultProcessor.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the aim is to allow Presearch to be generic construct that can be reused for other purposes. Lets say i have another logic for merging synonym results from multiple partitions, i would add the SynonymPreSearchResultProcessor as another struct in the PreSearchResult Processor, so any presearch result containing either knn or synonym data can be processed independently. By just keeping it as KnnPreSearchResultProcessor in PreSearchDataSearch, i would have to refactor it later on for synonym search.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand your rationale but we can adapt this or add a separate processor for synonyms later, we'll also want to refactor the name - preSearchDataSearch at that point as well, which is quite distasteful.

For now let's reduce the number of redirects, keep it simple.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

have made it an interface and one struct implementing it now ... cannot use it directly as the index_alias_impl.go file is not in the vectors tag so it cannot read the knn specific code like the knn store ...

index_alias_impl.go Outdated Show resolved Hide resolved
index_alias_impl.go Outdated Show resolved Hide resolved
index_alias_impl.go Outdated Show resolved Hide resolved
index_alias_impl.go Outdated Show resolved Hide resolved
@abhinavdangeti abhinavdangeti added this to the v2.4.0 milestone Mar 19, 2024
@abhinavdangeti abhinavdangeti merged commit e76f594 into master Mar 20, 2024
9 checks passed
@CascadingRadium CascadingRadium deleted the mapreduce branch March 20, 2024 18:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants