higher accuracies for experiment two with modified prompt #4
Hi @sradc, thanks for pointing this out and glad you enjoyed our paper. What prompt did you use to achieve this accuracy?
Hey, the notebook is included in this MR (along with the predictions themselves). Will include the prompt below for convenience:
(Note that an example is removed from the prompt if it's for the celebrity being tested.) I'm also running this on gpt-3.5 currently. Edit: I also used this for the system prompt:
gpt-3.5-turbo seems to get around 45% accuracy with this prompt (results included in the previous commit).
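For readers following along, the note above about dropping a few-shot example when it belongs to the celebrity being tested could be sketched roughly as below. This is only an illustration, not the notebook's actual code: the function name `build_few_shot_prompt`, the dictionary keys, the `Q:`/`A:` layout, and the placeholder examples are all assumptions.

```python
from typing import Dict, List


def build_few_shot_prompt(
    examples: List[Dict[str, str]], celebrity: str, question: str
) -> str:
    """Assemble a few-shot prompt, skipping any example about `celebrity`.

    Hypothetical helper: the keys "celebrity", "question" and "answer"
    are illustrative and not taken from the PR's notebook.
    """
    # Leave out examples for the celebrity under test so the answer
    # cannot leak into the prompt.
    kept = [ex for ex in examples if ex["celebrity"] != celebrity]
    blocks = [f"Q: {ex['question']}\nA: {ex['answer']}" for ex in kept]
    # The final block is the actual query, left open for the model.
    blocks.append(f"Q: {question}\nA:")
    return "\n\n".join(blocks)


# Placeholder data, purely for illustration.
examples = [
    {"celebrity": "Celebrity A", "question": "Who is Parent A's child?", "answer": "Celebrity A"},
    {"celebrity": "Celebrity B", "question": "Who is Parent B's child?", "answer": "Celebrity B"},
]
prompt = build_few_shot_prompt(examples, "Celebrity A", "Who is Parent A's child?")
```

With the placeholder data, the resulting prompt contains only the Celebrity B example followed by the open question about Parent A.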
Pushed updates. Best results so far:
Best results are now in the latest commit:
(Probably going to stop now because it's expensive.)
…the parent either) and _slightly_ better caching...
…d a _bad_ timeout implementation...
Thanks for pointing these out! I'm not going to merge for now, since your change doesn't really integrate with the existing codebase, but it's cool to see that there are better prompts out there.
No prob, this PR was just to share and track the work. Let me know if you might want to integrate the prompt stuff.
Hi there, super interesting work.
Looks like the accuracy of gpt-4 might be higher with this prompt.