This repository has been archived by the owner on Sep 12, 2022. It is now read-only.

feat: performance improvements #5

Merged
merged 9 commits into from Nov 22, 2021

cristianmtr
Contributor

@cristianmtr cristianmtr commented Nov 18, 2021

Closes #6

Results before this PR:

indexing 1000 takes 0 seconds (0.22s)
rolling update 3 replicas x 2 shards takes 0 seconds (0.82s)
search with 10 takes 0 seconds (0.23s)
indexing 10000 takes 0 seconds (0.75s)
rolling update 3 replicas x 2 shards takes 9 seconds (9.08s)
search with 10 takes 0 seconds (0.22s)
indexing 100000 takes 7 seconds (7.59s)
rolling update 3 replicas x 2 shards takes 7 minutes and 17 seconds (437.44s)
search with 10 takes 0 seconds (0.22s)


Results with this PR:

indexing 1000 takes 0 seconds (0.44s)
rolling update 3 replicas x 2 shards takes 0 seconds (0.81s)
indexing 10000 takes 1 second (1.01s)
rolling update 3 replicas x 2 shards takes 2 seconds (2.63s)
indexing 100000 takes 8 seconds (8.10s)
rolling update 3 replicas x 2 shards takes 3 minutes and 27 seconds (207.14s)
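
For a quick read of the rolling-update numbers above, the speedup per dataset size can be worked out as below. This is only a sketch over the timings reported in this comment; the `before`, `after`, and `speedup` names are illustrative and not part of the PR.

```python
# Rolling-update timings in seconds, copied from the results above.
# Keys are the number of indexed documents.
before = {1_000: 0.82, 10_000: 9.08, 100_000: 437.44}
after = {1_000: 0.81, 10_000: 2.63, 100_000: 207.14}


def speedup(before_s: float, after_s: float) -> float:
    """Return how many times faster the new timing is than the old one."""
    return before_s / after_s


# Speedup factor per dataset size, rounded to two decimals.
speedups = {n: round(speedup(before[n], after[n]), 2) for n in before}

for n, factor in speedups.items():
    print(f"{n:>7} docs: {factor:.2f}x faster rolling update")
```

By these figures the rolling update is roughly 3.5x faster at 10,000 documents and about 2.1x faster at 100,000, while the 1,000-document case is essentially unchanged.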

More benchmarking:

indexing 500000 takes 30 seconds (30.07s)
rolling update 3 replicas x 2 shards takes 26 minutes and 57 seconds (1617.99s)
search with 10 takes 0 seconds (0.21s)

@cristianmtr cristianmtr requested a review from a team November 18, 2021 17:18
@cristianmtr cristianmtr changed the title from "test: benchmark" to "feat: performance improvements" Nov 19, 2021
@cristianmtr cristianmtr marked this pull request as ready for review November 19, 2021 13:14

@davidbp davidbp left a comment


LGTM

Resolved review threads: executor/hnswlib_searcher.py (2)
@numb3r3
Member

numb3r3 commented Nov 22, 2021

So, the performance improvement is achieved by reducing the values of ef_construct and max_connection, right?

Base automatically changed from feat-add-cleanup to main November 22, 2021 08:17
@cristianmtr
Contributor Author

> So, the performance improvement is achieved by reducing the values of ef_construct and max_connection, right?

No. The improvement comes from the batching during .sync_index
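
To illustrate what batching during a sync can look like, here is a minimal sketch. It is not the code from this PR: the row shape, the `CountingIndex` stand-in, the `batched` helper, and the `batch_size` parameter are all assumptions made for the example. The idea is just that rows are pulled from the source and handed to the index in chunks, so the index sees one bulk insert per chunk instead of one call per row.

```python
from itertools import islice
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical row shape: (document id, embedding vector).
Row = Tuple[int, List[float]]


def batched(rows: Iterable[Row], batch_size: int) -> Iterator[List[Row]]:
    """Yield consecutive lists of at most batch_size rows."""
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    it = iter(rows)
    while True:
        chunk = list(islice(it, batch_size))
        if not chunk:
            return
        yield chunk


class CountingIndex:
    """Stand-in for a vector index with bulk inserts (hypothetical, not the PR's class)."""

    def __init__(self) -> None:
        self.vectors: Dict[int, List[float]] = {}
        self.calls = 0  # number of bulk-insert calls received

    def add_items(self, vectors: List[List[float]], ids: List[int]) -> None:
        self.calls += 1
        for doc_id, vec in zip(ids, vectors):
            self.vectors[doc_id] = vec


def sync_index(rows: Iterable[Row], index: CountingIndex, batch_size: int = 1000) -> int:
    """Apply rows to the index one batch at a time; return the number of rows synced."""
    synced = 0
    for chunk in batched(rows, batch_size):
        ids = [doc_id for doc_id, _ in chunk]
        vectors = [vec for _, vec in chunk]
        index.add_items(vectors, ids)  # one call per batch, not per row
        synced += len(chunk)
    return synced
```

With a generator of 2,500 rows and `batch_size=1000`, this sketch makes three `add_items` calls (1000 + 1000 + 500) and never materializes the whole source in memory.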

Member

@makram93 makram93 left a comment


Resolved review threads: executor/hnswlib_searcher.py (2), executor/hnswpsql.py (2)
@cristianmtr
Contributor Author

@makram93 I will address the traversal path in another PR

@cristianmtr cristianmtr merged commit 0d1bc19 into main Nov 22, 2021
@cristianmtr cristianmtr deleted the test-performance branch November 22, 2021 10:07
Linked issue (may be closed by merging this pull request): performance(HNSWPSQL): syncing is slow
4 participants