bug: reactivate benchmarks with quick fixes by tholor · Pull Request #2766 · deepset-ai/haystack

tholor · 2022-07-06T09:45:29Z

Related Issue(s): fixes #2813
In the past, we ran benchmarks for Haystack regularly and published the results here. At some point the benchmarking script stopped working, so we stopped running it and never found time to fix it.

Proposed changes:

Update readme for benchmarks
Adjust benchmark scripts to new document primitives (text -> content)
Enable logging again
Update the set of models used in benchmarking to more popular ones
Add fresh benchmark results for the latest main branch (c5a2651)
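
The `text -> content` adjustment listed above amounts to renaming one field on every document passed to the benchmark scripts. A minimal sketch, assuming documents are handled as plain dicts; the helper name `rename_text_to_content` is hypothetical and not part of Haystack or of this PR:

```python
from typing import Any, Dict, List


def rename_text_to_content(docs: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return copies of the dicts with the legacy "text" key renamed to "content".

    Hypothetical helper for illustration only, not a Haystack function.
    """
    converted = []
    for doc in docs:
        new_doc = dict(doc)  # shallow copy so the caller's dict is left untouched
        if "text" in new_doc and "content" not in new_doc:
            new_doc["content"] = new_doc.pop("text")
        converted.append(new_doc)
    return converted


legacy_docs = [{"text": "Berlin is the capital of Germany.", "meta": {"name": "doc1"}}]
print(rename_text_to_content(legacy_docs))
```

Dicts that already carry a `content` key are passed through unchanged, so the helper can be applied to mixed old and new data.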

Future work

Refactoring/cleaning up the code + any automated execution of it
Benchmark pipelines rather than "individual nodes"
Remove DPR from the benchmarks and go all for "sentence transformers" (?)
Add a deberta reader model to the benchmarks
Automatically allocate more memory to the Elasticsearch/OpenSearch containers for 500k runs (only a manual workaround for now)

Note
To finish the 500k indexing runs on Elasticsearch and OpenSearch successfully, we need to allocate more RAM to the Docker containers. The default commands used by Haystack's utils launch_opensearch() and launch_es() are not sufficient here, so the two containers have to be started manually before running the benchmarks. I documented this in the benchmarks' README, but it would be nice to automate it (e.g. by exposing the memory param in the util functions and passing a value from the benchmarks).
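
The automation idea in this note could look roughly like the sketch below. It only builds the shell command with a configurable JVM heap size; `opensearch_docker_cmd` and its `heap_mb` parameter are hypothetical names, not the signature of Haystack's `launch_opensearch()`. The command string mirrors the OpenSearch command from the benchmarks README quoted in the review comments.

```python
def opensearch_docker_cmd(heap_mb: int = 4096) -> str:
    """Build a shell command that starts OpenSearch with a configurable JVM heap.

    Hypothetical helper for illustration; heap_mb is an assumed parameter name.
    """
    if heap_mb <= 0:
        raise ValueError("heap_mb must be a positive number of megabytes")
    java_opts = f"-Xms{heap_mb}m -Xmx{heap_mb}m"
    return (
        # Reuse an existing container if there is one, otherwise create it.
        "docker start opensearch > /dev/null 2>&1 || "
        "docker run -d -p 9201:9200 -p 9600:9600 "
        '-e "discovery.type=single-node" '
        f'-e "OPENSEARCH_JAVA_OPTS={java_opts}" '
        "--name opensearch opensearchproject/opensearch:2.2.1"
    )


print(opensearch_docker_cmd(4096))
```

A benchmark script could then pass a larger value only for the 500k runs and keep the default for everything else.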

Pre-flight checklist

I have read the contributors guidelines
I have enabled actions on my fork
If this is a code change, I added tests or updated existing ones
If this is a code change, I updated the docstrings

bogdankostic

LGTM, one last step before merging would be to change the title such that it complies with the commit conventions.

brandenchan · 2022-09-20T08:24:11Z

test/benchmarks/README.md

-Run the benchmarks with the following command:

+
+To run all benchmarks (e.g. for a new haystack release):


Suggested change

To run all benchmarks (e.g. for a new haystack release):

To start all benchmarks (e.g. for a new Haystack release), run:

brandenchan · 2022-09-20T08:28:36Z

test/benchmarks/README.md

+```
+
+Results will be stored in this directory as
+- retriever_index_results.csv (+ .md)


Suggested change

- retriever_index_results.csv (+ .md)

- retriever_index_results.csv and retriever_index_results.md

brandenchan · 2022-09-20T08:29:03Z

test/benchmarks/README.md

+
+Results will be stored in this directory as
+- retriever_index_results.csv (+ .md)
+- retriever_query_results.csv (+ .md)


Suggested change

- retriever_query_results.csv (+ .md)

- retriever_query_results.csv and retriever_query_results.md

brandenchan · 2022-09-20T08:29:37Z

test/benchmarks/README.md

+Results will be stored in this directory as
+- retriever_index_results.csv (+ .md)
+- retriever_query_results.csv (+ .md)
+- reader_results.csv (+ .md)


Suggested change

- reader_results.csv (+ .md)

- reader_results.csv and reader_results.md

brandenchan · 2022-09-20T08:31:38Z

test/benchmarks/README.md

+Therefore, start them manually before you trigger the benchmark script and assign more memory to them: 
+
+`docker start opensearch > /dev/null 2>&1 || docker run -d -p 9201:9200 -p 9600:9600 -e "discovery.type=single-node" -e "OPENSEARCH_JAVA_OPTS=-Xms4096m -Xmx4096m" --name opensearch opensearchproject/opensearch:2.2.1`
+and


Having a line break on either side of this "and" would make the paragraph much more readable.

* quick fix benchmark runs to make them work with current haystack version
* fix minor typo
* update readme. fix minor things to make benchmarks run again
* Update Documentation & Code Style
* fix typo in readme
* update result files for reader and retriever querying
* reduce batch size for update embeddings to prevent xlarge bulk_update requests that exceed elastic's limits (happening in dense 500k runs)
* change default memory allocation back to normal. add note to readme
* add first indexing results
* add memory to docker cmd
* full benchmarks results on commit c5a2651

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
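
The batch-size reduction mentioned in this commit message follows a common pattern: split a large collection into smaller chunks so that no single bulk request grows too large. A minimal sketch of that pattern, assuming an in-memory sequence of items; `iter_batches` and the batch size used here are illustrative and not taken from the Haystack code:

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def iter_batches(items: Sequence[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive chunks of at most batch_size items.

    Hypothetical helper for illustration only.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


doc_ids = list(range(10))
for batch in iter_batches(doc_ids, batch_size=4):
    # Each batch would be sent as one bulk request instead of one huge request.
    print(len(batch), batch)
```

A smaller batch size means more requests but a smaller payload per request, which is the trade-off the commit makes for the dense 500k runs.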

tholor and others added 14 commits June 17, 2022 11:27

quick fix benchmark runs to make them work with current haystack version

cb16a57

fix minor typo

904c5f5

update readme. fix minor things to make benchmarks run again

afbdc6b

Merge branch 'master' into quickfix_benchmarks

4aee0cd

Update Documentation & Code Style

bd8a8b7

fix typo in readme

51c46e2

Merge branch 'quickfix_benchmarks' of github.com:deepset-ai/haystack into quickfix_benchmarks

b155848

update result files for reader and retriever querying

799e71c

reduce batch size for update embeddings to prevent xlarge bulk_update requests that exceed elastic's limits (happening in dense 500k runs)

d963bbe

change default memory allocation back to normal. add note to readme

bbf9931

add first indexing results

0ad8ff4

add memory to docker cmd

62d705a

Merge branch 'main' into quickfix_benchmarks

c5a2651

full benchmarks results on commit c5a2651

8785a4f

tholor marked this pull request as ready for review September 19, 2022 07:15

tholor requested review from a team as code owners September 19, 2022 07:15

tholor requested review from bogdankostic and removed request for a team September 19, 2022 07:15

bogdankostic added topic:speed topic:eval labels Sep 20, 2022

bogdankostic approved these changes Sep 20, 2022

tholor added the type:bug Something isn't working label Sep 20, 2022

tholor changed the title ~~Quickfix benchmarks~~ bug: reactivate benchmarks with quick fixes Sep 20, 2022

tholor merged commit 7e79a48 into main Sep 20, 2022

tholor deleted the quickfix_benchmarks branch September 20, 2022 08:22

brandenchan suggested changes Sep 20, 2022

tholor mentioned this pull request Sep 20, 2022

docs: improve readability of benchmarks readme #3247

Merged

tholor mentioned this pull request Oct 6, 2022

docs: add latest benchmark results for v1.9.0 #3339

Closed

tholor mentioned this pull request Oct 10, 2022

docs: add benchmark results for v1.9.0 (commit ce36be8) #3355

Merged

tholor merged 14 commits into main from quickfix_benchmarks
