
llama3 instruct and chat system prompts #950

Merged: 14 commits merged into main on Jun 30, 2024
Conversation

@oktie (Member) commented Jun 25, 2024

Issue #945

I removed the formats.llama3_chat_with_system_prompt system prompt, which was not being used anywhere and was also incorrect (e.g., it had \n between parts instead of \n\n).

I've updated the formats.llama3_chat definition so that:

  1. The demos also include the user/assistant prompts, which I believe is how we are supposed to provide examples.
  2. A default DEFAULT_SYSTEM_PROMPT is added; it is the one suggested/used for llama2, e.g. see https://developer.ibm.com/tutorials/awb-prompt-engineering-llama-2/ (a sketch of this prompt follows below).
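
For reference, the llama2-style default system prompt referred to in item 2 is the one that appears in the rendered example output later in this thread; as a constant it would look roughly like this (a sketch only, with wording taken from that output):

# Sketch of the llama2-style default system prompt; wording taken from the
# rendered example output shown later in this conversation.
DEFAULT_SYSTEM_PROMPT = (
    "You are a helpful, respectful and honest assistant. Always answer as helpfully "
    "as possible, while being safe. Your answers should not include any harmful, "
    "unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure "
    "that your responses are socially unbiased and positive in nature.\n\n"
    "If a question does not make any sense, or is not factually coherent, explain why "
    "instead of answering something not correct. If you don't know the answer to a "
    "question, please don't share false information."
)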

I've added a formats.llama3_instruct that I believe more accurately reflects the official guidance at https://huggingface.co/blog/llama3#how-to-prompt-llama-3, which states:

The Instruct versions use the following conversation structure:

<|begin_of_text|><|start_header_id|>system<|end_header_id|>

{{ system_prompt }}<|eot_id|><|start_header_id|>user<|end_header_id|>

{{ user_msg_1 }}<|eot_id|><|start_header_id|>assistant<|end_header_id|>

{{ model_answer_1 }}<|eot_id|>

This format has to be exactly reproduced for effective use.

Note: the instruction is not part of the demos. I believe a demo-specific instruction could improve the prompt, but I don't think it's possible the way unitxt templates work now (the instruction cannot be used in demo_format).
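
For illustration, here is a rough sketch of how this structure could be expressed with unitxt's SystemFormat. It is a hedged approximation, not the exact catalog entry; the placeholders ({system_prompt}, {instruction}, {demos}, {source}, {target_prefix}, {target}) follow unitxt's format conventions, and the layout mirrors the rendered example output later in this thread:

from unitxt.formats import SystemFormat

# Rough sketch only -- approximates the llama3 instruct structure quoted above;
# not the exact definition added in this PR.
llama3_instruct_sketch = SystemFormat(
    # Each demo is rendered as a full user/assistant exchange, ending with the
    # header that opens the next user turn.
    demo_format=(
        "{source}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
        "{target_prefix}{target}<|eot_id|><|start_header_id|>user<|end_header_id|>\n\n"
    ),
    # System prompt and instruction go in the system header; demos are spliced
    # in before the actual question.
    model_input_format=(
        "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n"
        "{system_prompt}{instruction}<|eot_id|><|start_header_id|>user<|end_header_id|>\n\n"
        "{demos}{source}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
        "{target_prefix}"
    ),
)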

@eladven (Member) left a comment:

Elron, the PR looks fine. Please add the json files so the consistency test can pass.

@yoavkatz (Member) left a comment:

To be consistent, the system prompt needs to be defined outside the format:

    "{instruction}{source}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
    "{target_prefix}{target}<|eot_id|>",
    model_input_format="<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n"
    "{DEFAULT_SYSTEM_PROMPT}<|eot_id|>{demos}<|start_header_id|>user<|end_header_id|>"
A member commented:
system_prompt should be taken from {system_prompt}, so it can be configured from the outside.

@oktie (Member, Author) commented:
@yoavkatz yes, this is how it is for llama3_instruct. For llama3_chat, it was defined without a system prompt in the past, and according to the guidelines that is not a good idea. So I thought it makes sense to add a default system prompt, so that if the user does not define one, we at least have a generic one. Does this make sense?

@oktie (Member, Author) commented:
Should we just put the system prompt there instead of using this DEFAULT_SYSTEM_PROMPT constant? I tested it on my side and it worked, but I am not sure whether it will work if unitxt is installed as a library.

A member commented:
If the recommended approach is to have a system prompt, then llama3_chat should have a system prompt, but it should be configurable via {system_prompt}.

By default, it should be format=formats.llama3_chat, system_prompt=system_prompt.llama3.

Users can always pass system_prompt.empty if they don't want a system prompt.

If there is any need for a format without the |system| special token, there should be a format called llama3_chat_without_system_prompt.
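
For illustration, a rough sketch of what that configuration could look like through unitxt's load_dataset; the card and template names here are placeholders for this example, not values from this PR:

from unitxt import load_dataset

# Hypothetical card/template names, shown only to illustrate that the same
# format can be paired with different system prompts from the catalog.
with_system_prompt = load_dataset(
    card="cards.boolq.classification",   # placeholder card
    template="templates.some_template",  # placeholder template
    format="formats.llama3_chat",
    system_prompt="system_prompts.models.llama2",  # explicit system prompt
)

without_system_prompt = load_dataset(
    card="cards.boolq.classification",
    template="templates.some_template",
    format="formats.llama3_chat",
    system_prompt="system_prompts.empty",  # assumes an empty system prompt entry exists in the catalog
)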

A member commented:
BTW, formats.llama3_chat_with_system_prompt was used in the internal project that uses unitxt.

A member commented:
I've added the instruction and default behavior to the system prompt. I created simple code under examples that we can all run to review the output:

python examples/evaluate_different_formats.py

source (str):
    <|begin_of_text|><|start_header_id|>system<|end_header_id|>
    
    You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature.
    
    
    If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.
    Answer the following questions in one word.<|eot_id|><|start_header_id|>user<|end_header_id|>
    
    Where is Spain?<|eot_id|><|start_header_id|>assistant<|end_header_id|>
    
    Europe<|eot_id|><|start_header_id|>user<|end_header_id|>
    
    When was IBM founded?<|eot_id|><|start_header_id|>assistant<|end_header_id|>
    
    1911<|eot_id|><|start_header_id|>user<|end_header_id|>
    
    What is the capital of Texas?<|eot_id|><|start_header_id|>assistant<|end_header_id|>

A member commented:
from unitxt import load_dataset
from unitxt.templates import InputOutputTemplate

# Template for the QA task (the card variable is defined earlier in the example script).
template = InputOutputTemplate(
    instruction="Answer the following questions in one word.",
    input_format="{question}",
    output_format="{answers}",
    postprocessors=["processors.lower_case"],
)

dataset = load_dataset(
    card=card,
    template=template,
    format="formats.llama3_chat",
    system_prompt="system_prompts.models.llama2",
    num_demos=2,
    demos_pool_size=3,
)
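
Assuming load_dataset returns the usual HuggingFace-style dataset with a rendered source field per instance (and a test split for this card), the prompt shown in the previous comment could then be inspected with something like:

# Print the fully rendered prompt for the first test instance
# (the split name is an assumption; adjust it to the card being used).
print(dataset["test"][0]["source"])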

@oktie (Member, Author) commented:
Thanks @yoavkatz, I was also working on a fix and testing it, but yours looks good to me, and I just tested it on a task and it works well (with llama3-70b-instruct I get a much better score with the llama3 format).

The only thing is that now llama3_chat and llama3_instruct are the same. I suggest removing llama3_chat and updating all the references to it. Do you want me to do it?

A member commented:
Yes. If they are the same, we can keep just one. I just wonder about the alternative format where the instruction and demos are in the same user prompt as the question. Granite's recommended ICL format works this way.

@oktie (Member, Author) commented Jun 28, 2024:
@yoavkatz okay, I just replaced llama3_chat with llama3_instruct everywhere it was used.

I also experimented with two alternatives: an alt1 option where the instruction goes into the user-role prompt, and an alt2 option where the demos all go into one user prompt, as you had suggested. The numbers I get on two example tasks, each with llama-3-70b-instruct and 3-shot prompts, are below:

Task                                  Option                          Score
cards.boolq.classification            formats.llama3_instruct         0.9119
cards.boolq.classification            formats.llama3_instruct_alt1    0.9081
cards.boolq.classification            formats.llama3_instruct_alt2    0.89
cards.summarize_from_human_feedback   no format                       0.64
cards.summarize_from_human_feedback   formats.llama3_instruct         0.68
cards.summarize_from_human_feedback   formats.llama3_instruct_alt1    0.67
cards.summarize_from_human_feedback   formats.llama3_instruct_alt2    0.56

So it seems like alt1 is not making much of a difference, but alt2 is worse. I believe we have the right prompt now. I suggest approving this PR and merging for now. I'll be doing more testing and will let you know if I find ways to improve the format.
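
For context, a rough sketch of how the alt2 idea differs from the main format: instead of one user/assistant turn per demo, all demos and the instruction are packed into a single user turn. This is a hypothetical illustration using unitxt's SystemFormat, not the exact alt2 catalog entry:

from unitxt.formats import SystemFormat

# Hypothetical sketch of "all demos in a single user turn" (the alt2 idea);
# placeholder names follow unitxt format conventions.
llama3_instruct_all_demos_in_one_turn = SystemFormat(
    # Demos rendered as plain text, without per-demo chat turns.
    demo_format="{source}\n{target_prefix}{target}\n\n",
    model_input_format=(
        "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n"
        "{system_prompt}<|eot_id|><|start_header_id|>user<|end_header_id|>\n\n"
        "{instruction}{demos}{source}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n{target_prefix}"
    ),
)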

@yoavkatz (Member) left a comment:

I approved. Make sure the old jsons are deleted.
Also, given the results, maybe we don't need alt1 and alt2 in the catalog. If we keep them, let's give them descriptive names (e.g. formats.llama3_instruct_all_demos_in_a_single_turn).

@yoavkatz merged commit c10f4ba into main on Jun 30, 2024 (8 checks passed).
@yoavkatz deleted the llama3_prompt_fix branch on June 30, 2024, 12:49.
gitMichal pushed a commit that referenced this pull request Jul 15, 2024
* llama3 instruct and chat system prompts

* fixing llama3_chat prompt + adding jsons

* fixing llama3 formats + a boolqa system prompt

* Start of example to check different formats.

Signed-off-by: Yoav Katz <katz@il.ibm.com>

* Added instructions to llama3 model.

Signed-off-by: Yoav Katz <katz@il.ibm.com>

* replacing llama3_chat with llama3_instruct + 2 alts

* Added example for multiple formats

Signed-off-by: Yoav Katz <katz@il.ibm.com>

* prepare artifacts (whitespace changes)

Signed-off-by: Yoav Katz <katz@il.ibm.com>

* Updated evaluation to use more examples

and show confidence intervals

Signed-off-by: Yoav Katz <katz@il.ibm.com>

* Do not run examples that require
large model inference

Signed-off-by: Yoav Katz <katz@il.ibm.com>

* Separated examples in example table

Signed-off-by: Yoav Katz <katz@il.ibm.com>

* Fixed test_examples to skip some files.

Signed-off-by: Yoav Katz <katz@il.ibm.com>

---------

Signed-off-by: Yoav Katz <katz@il.ibm.com>
Co-authored-by: Yoav Katz <katz@il.ibm.com>