bug: reactivate benchmarks with quick fixes#2766
Merged
Conversation
…into quickfix_benchmarks
… requests that exceed elastic's limits (happening in dense 500k runs)
bogdankostic
approved these changes
Sep 20, 2022
Contributor
bogdankostic
left a comment
There was a problem hiding this comment.
LGTM, one last step before merging would be to change the title such that it complies with the commit conventions.
brandenchan
suggested changes
Sep 20, 2022
| Run the benchmarks with the following command: | ||
|
|
||
|
|
||
| To run all benchmarks (e.g. for a new haystack release): |
Contributor
There was a problem hiding this comment.
Suggested change
| To run all benchmarks (e.g. for a new haystack release): | |
| To start all benchmarks (e.g. for a new Haystack release), run: |
| ``` | ||
|
|
||
| Results will be stored in this directory as | ||
| - retriever_index_results.csv (+ .md) |
Contributor
There was a problem hiding this comment.
Suggested change
| - retriever_index_results.csv (+ .md) | |
| - retriever_index_results.csv and retriever_index_results.md |
|
|
||
| Results will be stored in this directory as | ||
| - retriever_index_results.csv (+ .md) | ||
| - retriever_query_results.csv (+ .md) |
Contributor
There was a problem hiding this comment.
Suggested change
| - retriever_query_results.csv (+ .md) | |
| - retriever_query_results.csv and retriever_query_results.md |
| Results will be stored in this directory as | ||
| - retriever_index_results.csv (+ .md) | ||
| - retriever_query_results.csv (+ .md) | ||
| - reader_results.csv (+ .md) |
Contributor
There was a problem hiding this comment.
Suggested change
| - reader_results.csv (+ .md) | |
| - reader_results.csv and reader_results.md |
| Therefore, start them manually before you trigger the benchmark script and assign more memory to them: | ||
|
|
||
| `docker start opensearch > /dev/null 2>&1 || docker run -d -p 9201:9200 -p 9600:9600 -e "discovery.type=single-node" -e "OPENSEARCH_JAVA_OPTS=-Xms4096m -Xmx4096m" --name opensearch opensearchproject/opensearch:2.2.1` | ||
| and |
Contributor
There was a problem hiding this comment.
Having a line break either side of this "and" will make the paragraph be much more readable
6 tasks
brandenchan
pushed a commit
that referenced
this pull request
Sep 21, 2022
* quick fix benchmark runs to make them work with current haystack version * fix minor typo * update readme. fix minor things to make benchmarks run again * Update Documentation & Code Style * fix typo in readme * update result files for reader and retriever querying * reduce batch size for update embeddings to prevent xlarge bulk_update requests that exceed elastic's limits (happening in dense 500k runs) * change default memory allocation back to normal. add note to readme * add first indexing results * add memory to docker cmd * full benchmarks results on commit c5a2651 Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
6 tasks
6 tasks
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Related Issue(s): fixes #2813
In the past we were running benchmarks quite regularly for haystack and published results here. As the benchmarking script was not working anymore at some point, we stopped doing it and never found time to fix it.
Proposed changes:
text->content)Future work
Note
In order to successfully finish the 500k indexing runs on elastic + opensearch the docker containers we need to allocate more RAM. The default cmds used in haystack's utils
launch_opensearch()andlaunch_es()are not sufficient here. This means we need to start the two containers manually before running the benchmarks. Documented this in the benchmarks' README, but would be nice to automate this (e.g. by exposing the memory param in the util functions and passing a value from benchmarks).Pre-flight checklist