Polish Evaluation Tutorial #2212

brandenchan · 2022-02-17T13:03:55Z

Proposed changes:

Since the evaluation tutorial has gotten quite long and complex, this PR makes some minor changes to improve readability:
Added link to explain new metrics
Added note that integrated eval mode also runs when isolated eval mode is engaged
Removed the initialize_device_settings
Showed that there is no discrepancy in retriever isolated eval

To Do

Reader eval discrepancy still exists
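
For readers unfamiliar with the two modes, here is a toy sketch of the relationship described in the note above: the integrated result is always computed, and the isolated result is added on top when isolated evaluation is switched on. This is not the tutorial's code. `toy_reader`, `eval_reader`, and the sample documents are hypothetical, and the flag name `add_isolated_node_eval` is only chosen to resemble the tutorial's option. The sketch assumes that "isolated" means the reader receives the gold documents instead of the retriever's output.

```python
from typing import Callable, Dict, List


def toy_reader(documents: List[str]) -> str:
    """Hypothetical reader: answers 'Berlin' only if some document mentions it."""
    return "Berlin" if any("Berlin" in doc for doc in documents) else ""


def eval_reader(
    reader: Callable[[List[str]], str],
    retrieved_docs: List[str],
    gold_docs: List[str],
    gold_answer: str,
    add_isolated_node_eval: bool = False,  # assumed flag name, for illustration only
) -> Dict[str, bool]:
    """Score the reader by exact match, in integrated and optionally isolated mode."""
    # Integrated mode always runs: the reader sees what the retriever returned.
    results = {"integrated": reader(retrieved_docs) == gold_answer}
    if add_isolated_node_eval:
        # Isolated mode runs in addition: the reader sees the gold documents.
        results["isolated"] = reader(gold_docs) == gold_answer
    return results


retrieved = ["Paris is the capital of France."]  # the retriever missed the gold document
gold = ["Berlin is the capital of Germany."]
results = eval_reader(toy_reader, retrieved, gold, "Berlin", add_isolated_node_eval=True)
print(results)
```

In this toy case the reader fails in integrated mode only because the retriever handed it the wrong document, which is the kind of gap that comparing the two modes is meant to expose.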

review-notebook-app · 2022-02-17T13:03:59Z

Check out this pull request on ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.

Powered by ReviewNB

julian-risch

Looks good to me so far, but I think the filtering of no_answers needs more explanation, and I would prefer to move it to a different (earlier) line.
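
To make the request concrete, here is a minimal sketch of filtering no_answers as its own commented step before any evaluation runs. `ToyLabel` and `drop_no_answers` are hypothetical stand-ins, not the tutorial's classes, and the sketch assumes each label carries a boolean `no_answer` attribute.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ToyLabel:
    """Hypothetical stand-in for an evaluation label (not a Haystack class)."""
    query: str
    answer: str
    no_answer: bool = False


def drop_no_answers(labels: List[ToyLabel]) -> List[ToyLabel]:
    """Return only the labels that have an annotated answer."""
    return [label for label in labels if not label.no_answer]


labels = [
    ToyLabel(query="Who wrote the report?", answer="the committee"),
    ToyLabel(query="What is the budget for 2050?", answer="", no_answer=True),
]

# Filter once, up front, so every later evaluation step sees the same label set.
answerable = drop_no_answers(labels)
print(f"kept {len(answerable)} of {len(labels)} labels")
```

Keeping the filter on a separate, early line leaves room for a comment that explains why the no_answer cases are excluded.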

tutorials/Tutorial5_Evaluation.py

julian-risch · 2022-02-22T13:26:33Z

tutorials/Tutorial5_Evaluation.ipynb

-     ]
-    }
-   ],
+   "outputs": [],


The output was left in the tutorial on purpose here, so that users can see the format of the Evaluation Report. I think we should keep it for that reason.

brandenchan · 2022-02-23

Sure, I have included the output from only those cells that print the Evaluation Report.
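
A selective cleanup like this can also be scripted. The sketch below clears the outputs of every code cell except those whose source mentions a marker string. It is not what was run for this PR. The marker `"Evaluation Report"` and the function `clear_outputs` are assumptions for illustration, and the sketch operates on the notebook's JSON structure using only the standard library.

```python
import json
from typing import Any, Dict

KEEP_MARKER = "Evaluation Report"  # hypothetical marker; adapt to the cells you want to keep


def clear_outputs(notebook: Dict[str, Any], keep_marker: str = KEEP_MARKER) -> Dict[str, Any]:
    """Clear code cell outputs in place, except for cells whose source mentions keep_marker."""
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # markdown and raw cells have no outputs
        source = cell.get("source", "")
        # The notebook format stores source either as one string or as a list of lines.
        text = "".join(source) if isinstance(source, list) else source
        if keep_marker in text:
            continue  # keep the output of report-printing cells
        cell["outputs"] = []
        cell["execution_count"] = None
    return notebook


notebook = {
    "cells": [
        {"cell_type": "code", "source": ["x = 1\n", "x"],
         "outputs": [{"output_type": "execute_result"}], "execution_count": 1},
        {"cell_type": "code", "source": "# Evaluation Report\nprint(report)",
         "outputs": [{"output_type": "stream"}], "execution_count": 2},
        {"cell_type": "markdown", "source": "Some text"},
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 5,
}
cleaned = clear_outputs(notebook)
print(json.dumps([cell.get("outputs") for cell in cleaned["cells"]]))
```

For a real notebook, the same function would be applied to the result of `json.load` on the `.ipynb` file and written back with `json.dump`.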

julian-risch

LGTM! 👍 I also did a quick test run on Colab and it works without any problems. With a fast internet connection in the office, it took two minutes to download cross-encoder/stsb-roberta-large, but other users might need to wait much longer. We could replace it with cross-encoder/stsb-roberta-base and mention in a comment that the large model would give even better results. Your choice. :)

Polish evaluation tutorial

5e3fd93

Clear notebook output

cbe07f6

brandenchan requested a review from julian-risch February 17, 2022 17:00

brandenchan mentioned this pull request Feb 21, 2022

Refactor Evaluation Section deepset-ai/haystack-website#250

Merged

brandenchan removed the request for review from julian-risch February 21, 2022 11:24

Cleanup tutorials

76ad041

brandenchan requested a review from julian-risch February 22, 2022 09:10

Fix discrepancy in isolated retriever eval results

7cdd479

brandenchan mentioned this pull request Feb 22, 2022

Discrepancy between two forms of isolated evaluation #2216

Closed

julian-risch approved these changes Feb 22, 2022

julian-risch requested changes Feb 22, 2022

julian-risch reviewed Feb 22, 2022

brandenchan added 2 commits February 23, 2022 14:11

Incorporate reviewer feedback

5d534f7

Clean notebook output

aa5f1c3

brandenchan requested a review from julian-risch February 23, 2022 13:13

julian-risch approved these changes Feb 24, 2022

brandenchan merged commit bb107e5 into master Feb 24, 2022

brandenchan deleted the evaluation_docs branch February 24, 2022 16:45
