
Strange prompt(s) #20

Closed
sanderland opened this issue Jun 9, 2023 · 5 comments
@sanderland
Contributor

sanderland commented Jun 9, 2023

  1. Investigating some results, I came across this prompt, which is very strange.
     It is formatted as a conversation, with no actual instruction attached to it.
  {
    "instruction":"User : Hi dear \nAgent : helo , can ai help you \nUser : plaes tell me about what frequency conscious of ai\nAgent : the conscious ai is a complex actually need specific information to provide about conscious.\nUser : pleas tell me more about conscious if you know please let me know\nAgent : conscious is about made some result or decision is to hard if not have knowledge or prove with science.\nUser : Please tell more about that.\nAgent : need a data scientist to provide decision because a different conscious human and artificial intelligence.",
    "output":"The conscious AI requires data scientists to make decisions because the conscious of humans and artificial intelligence are different. This requires extensive knowledge and proof from science in order to make the correct decisions or achieve the desired results.",
    "generator":"text_davinci_003",
    "dataset":"oasst"
  },
  2. Can you elaborate on the prompt "templates" which use "Instruction: ... Output: ..." on models that are already instruction-finetuned? What is the best way to just have them use plain instructions?
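As a hypothetical illustration (not code from the thread or the repo), a simple heuristic could flag eval examples whose "instruction" field is actually a chat transcript like the one above. The field names ("instruction", "dataset") follow the JSON excerpt, and the "User :"/"Agent :" prefixes are assumed markers:

```python
# Hypothetical sketch: flag eval examples whose instruction is formatted
# as a User/Agent dialogue rather than a plain instruction.

examples = [
    {"instruction": "User : Hi dear \nAgent : helo , can ai help you", "dataset": "oasst"},
    {"instruction": "Summarize the following article in one sentence.", "dataset": "oasst"},
]

def looks_like_transcript(instruction: str) -> bool:
    """True if the instruction's first line starts with a dialogue marker."""
    first_line = instruction.lstrip().splitlines()[0]
    return first_line.startswith(("User :", "User:", "Agent :", "Agent:"))

flagged = [ex for ex in examples if looks_like_transcript(ex["instruction"])]
print(len(flagged))  # 1
```

A filter like this would only catch transcripts that begin with a dialogue marker; entries that embed a conversation mid-instruction would need a looser check.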
@rtaori
Collaborator

rtaori commented Jun 12, 2023

  1. Thanks for pointing this issue out! Definitely agree that this format looks weird. Most of the OASST instructions don't look like this, so it could be an exporting issue for the few that are affected. We will look more into this (but given that only 12 are affected, it shouldn't change the final win-rates too much).

  2. Can you clarify what you mean by using the prompt template as instructions? The prompt template for each model is provided in the respective model_configs directory.

@sanderland
Contributor Author

This prompt template is applied rather often:
https://github.com/tatsu-lab/alpaca_eval/blob/main/src/alpaca_eval/models_configs/text_davinci_003/prompt.txt

I think davinci-003 does not need any template at all; the input can just be an instruction. The same is true for many other models, and I think this may be hurting the Cohere model in particular.
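The contrast being discussed can be sketched as plain string formatting. The exact contents of the linked prompt.txt are in the repo; the template shape below ("Instruction: ... Output:") is an assumption based on the earlier comment, and `build_prompt` is a hypothetical helper, not part of alpaca_eval:

```python
# Illustrative only: contrast a templated prompt with passing the
# instruction through bare, as an instruction-finetuned model could accept.

TEMPLATE = "Instruction: {instruction}\nOutput:"  # assumed shape, per the discussion

def build_prompt(instruction: str, use_template: bool) -> str:
    """Wrap the instruction in the template, or pass it through unchanged."""
    if use_template:
        return TEMPLATE.format(instruction=instruction)
    return instruction

print(build_prompt("Name three primary colors.", use_template=True))
print(build_prompt("Name three primary colors.", use_template=False))
```

The concern in this thread is that the wrapped form may steer some instruction-finetuned models (e.g. Command) toward shorter, completion-style answers than the bare form would.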

@rtaori
Collaborator

rtaori commented Jun 13, 2023

So the current prompt works well for davinci-003, in the sense that it doesn't make mistakes in understanding the formatting. If you have a suggestion for updating the Cohere model template, please submit a PR with the updated config and results, and we'd be happy to incorporate it.

rtaori closed this as completed Jun 13, 2023
@YannDubs
Collaborator

Hi @sanderland, a quick follow-up: we went through the Cohere prompt-engineering page when making the prompt and didn't see any information about a special template, which is why we originally used a simple one like davinci-003's. Other models do have specific prompt templates, e.g. Claude. Let us know if we missed the Cohere prompt template!

@sanderland
Contributor Author

Hey @YannDubs

  • Command is an instruction-finetuned model, and when used with instructions it indeed does not need any template.
    • I think this is also true for davinci-003 and most instruction-finetuned models in general.
    • The template you used may induce shorter answers for Command in particular.
  • The Client.chat method in the SDK will automatically add a template suitable for inducing a more conversational style.
    • However, it is not clear what this benchmark is primarily testing (instruct models and conversational ones being a little different).
