
feat: Add LLaMA-3 instruct prompt strategies for fine-tuning #1553

Merged

Conversation

@0-hero (Contributor) commented Apr 20, 2024

Description

This builds on top of and includes the changes in the PRs below.

The FastChat PR from @TJ-Solergibert needs to be merged before this one.

Motivation and Context

Enables fine-tuning models in the LLaMA-3 instruct conversation format, or continuing fine-tuning of LLaMA-3 instruct models.

How has this been tested?

  • Successfully pre-processed a ShareGPT-style conversations dataset into the LLaMA-3 instruct format for fine-tuning
  • Model fine-tuning is in progress

Update

Added changes from TJ's monkeypatch to support the latest FastChat PR lm-sys/FastChat#3259.

@maziyarpanahi (Contributor)

@winglian Any chance this can be merged to unblock the other PR? (thanks for the PR @0-hero)

@TJ-Solergibert (Contributor)

@maziyarpanahi You can install THIS axolotl PR with THIS FastChat PR

@maziyarpanahi (Contributor)

> @maziyarpanahi You can install THIS axolotl PR with THIS FastChat PR

Hi @TJ-Solergibert,
Thanks, I will try that today. I just wanted to be sure they are both finished and ready to be merged before I do that. But it seems they are ready.
Thanks again, I'll give it a shot :)

@upr1ce commented Apr 22, 2024

I think this needs to be updated; at this point it generates some token duplication. This function might need to be changed:

```python
def register_llama3_template(system_message=None):
    system_message = system_message or "You are a helpful assistant."
    bos_token = "<|start_of_conversation|>"
    eos_token = "<|end_of_conversation|>"
    register_conv_template(
        Conversation(
            name="llama3",
            system_template=bos_token + "<|start_header_id|>system<|end_header_id|>\n\n{system_message}" + "<|eot_id|>",
            system_message=system_message,
            roles=["<|start_header_id|>user<|end_header_id|>", "<|start_header_id|>assistant<|end_header_id|>"],
            sep_style=SeparatorStyle.LLAMA3,
            sep="<|eot_id|>",
            sep2="<|start_header_id|>",
        )
    )
```

To (in line with the FastChat implementation):

```python
def register_llama3_template(system_message=None):
    system_message = system_message or "You are a helpful assistant."
    register_conv_template(
        Conversation(
            name="llama3",
            system_template="<|start_header_id|>system<|end_header_id|>\n\n{system_message}" + "<|eot_id|>",
            system_message=system_message,
            roles=["user", "assistant"],
            sep_style=SeparatorStyle.LLAMA3,
            sep="<|eot_id|>",
            sep2="<|start_header_id|>",
        )
    )
```
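
For anyone who wants to eyeball the rendered prompt, here is a minimal sketch using FastChat's public helpers, assuming a FastChat build that already includes SeparatorStyle.LLAMA3 from the PR above:

```python
# Sketch: inspect the prompt produced by the registered "llama3" template.
from fastchat.conversation import get_conv_template

conv = get_conv_template("llama3")
conv.append_message(conv.roles[0], "It is a test")  # user turn
conv.append_message(conv.roles[1], None)            # empty assistant turn -> generation prompt
print(conv.get_prompt())
```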

@0-hero (Contributor, Author) commented Apr 22, 2024

Thanks @upr1ce, yes it needs to be updated. Just waiting for the FastChat merge with the final changes; I'll update this once that happens.

@0-hero (Contributor, Author) commented Apr 24, 2024

@winglian Made the necessary changes; the FastChat PR is merged.

ShareGPT Original

[ { "from": "system", "value": "You are a helpful AI assistant." }, { "from": "user", "value": "Create a nested loop to print every combination of numbers between 0-9, excluding any combination that contains the number 5. Additionally, exclude any combination that contains a repeating digit. Implement the solution without using any built-in functions or libraries to check for repeating digits." }, { "from": "gpt", "value": "Here is an example of a nested loop in Python to print every combination of numbers between 0-9, excluding any combination that contains the number 5 or repeating digits:\n\n```python\nfor i in range(10): # First digit\n for j in range(10): # Second digit\n for k in range(10): # Third digit\n # Checking for the conditions\n if i != 5 and j != 5 and k != 5 and i != j and i != k and j != k:\n print(i, j, k)\n```\n\nThis code will generate and print every combination of three digits between 0-9 that do not contain the number 5 and do not have any repeating digits." } ]

Tokenised LLaMA-3-Instruct Format

````
<|begin_of_text|><|start_of_conversation|><|start_header_id|>system<|end_header_id|>
You are a helpful AI assistant.<|eot_id|><|start_header_id|>user<|end_header_id|> Create a nested loop to print every combination of numbers between 0-9, excluding any combination that contains the number 5. Additionally, exclude any combination that contains a repeating digit. Implement the solution without using any built-in functions or libraries to check for repeating digits.<|eot_id|><|start_header_id|>assistant<|end_header_id|> Here is an example of a nested loop in Python to print every combination of numbers between 0-9, excluding any combination that contains the number 5 or repeating digits:

```python
for i in range(10):  # First digit
    for j in range(10):  # Second digit
        for k in range(10):  # Third digit
            # Checking for the conditions
            if i!= 5 and j!= 5 and k!= 5 and i!= j and i!= k and j!= k:
                print(i, j, k)
```

This code will generate and print every combination of three digits between 0-9 that do not contain the number 5 and do not have any repeating digits.<|eot_id|><|end_of_text|>
````

The LLaMA-3 instruct format can be used with this config:

```yaml
datasets:
  - path: bjoernp/Vezora_Tested-22k-Python-Alpaca-sharegpt-filtered
    type: sharegpt
    conversation: llama3
```

Let me know if any changes are required

@maziyarpanahi (Contributor)

@0-hero isn't the actual Llama-3 template (as they have set in their HF tokenizer config) like this:

```
<|begin_of_text|><|start_header_id|>system<|end_header_id|>

You are a helpful assistant.<|eot_id|><|start_header_id|>user<|end_header_id|>

Who are you?<|eot_id|><|start_header_id|>assistant<|end_header_id|>

I am an AI<|eot_id|><|start_header_id|>user<|end_header_id|>

What's your name?<|eot_id|><|start_header_id|>assistant<|end_header_id|>
```

Condensed:

```
<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\nYou are a helpful assistant.<|eot_id|><|start_header_id|>user<|end_header_id|>\n\nWho are you?<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\nI am an AI<|eot_id|><|start_header_id|>user<|end_header_id|>\n\nWhat's your name?<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n
```

```python
if message:
    yield f"<|start_header_id|>{role}<|end_header_id|>\n\n", f"{message.strip()}<|eot_id|>"
else:
    yield f"<|start_header_id|>{role}<|end_header_id|>\n\n", ""
```
Review comment:
`return` is missing here.

Review comment:

As @ryj0902 pointed out, when there is no system_message, the BOS token is not added at the moment. Should we do something like the following?

```python
if self.sep_style == SeparatorStyle.LLAMA3:
    if self.system_message:
        # For llama3, the system message is NOT incorporated into the first human instruction
        # All messages follow <|start_header_id|> + role + <|end_header_id|>\n\n + message + <|eot_id|>
        yield "", system_prompt
    for i, (role, message) in enumerate(self.messages):
        if message:
            role_header = f"<|start_header_id|>{role}<|end_header_id|>\n\n"
            if i == 0:
                yield "<|begin_of_text|>" + role_header, f"{message.strip()}<|eot_id|>"
            else:
                yield role_header, f"{message.strip()}<|eot_id|>"
        else:
            yield f"<|start_header_id|>{role}<|end_header_id|>\n\n", ""
    return
```

Review comment:

Unfortunately that's not going to work; it will lead to duplication of <|begin_of_text|>. Example:

```
<|begin_of_text|><|start_header_id|>system<|end_header_id|>

asdf<|eot_id|><|begin_of_text|><|start_header_id|>user<|end_header_id|>

It is a test<|eot_id|><|start_header_id|>assistant<|end_header_id|>

WTF<|eot_id|><|end_of_text|>
```

@upr1ce commented Apr 24, 2024

After the changes, it will generate correct output. Example (formatted):

```
<|begin_of_text|><|start_header_id|>system<|end_header_id|>
You are a helpful assistant.<|eot_id|>
<|start_header_id|>user<|end_header_id|>

Can you please book a flight for me from New York to Los Angeles?<|eot_id|>
<|start_header_id|>assistant<|end_header_id|>

I'm sorry, but I'm unable to assist with booking flights.<|eot_id|><|end_of_text|>
```

```python
if self.system_message:
    # For llama3, the system message is NOT incorporated into the first human instruction
    # All messages follow <|start_header_id|> + role + <|end_header_id|>\n\n + message + <|eot_id|>
    yield "", "<|begin_of_text|>" + system_prompt
```
Review comment from @upr1ce (Apr 24, 2024):
This should be `yield "", system_prompt`; otherwise <|begin_of_text|> will be replicated twice, as BOS tokens are added automatically by axolotl. Sorry about the next message; I did not mark it correctly, so it was visible only to me.
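
For reference, a minimal sketch of how the LLAMA3 branch reads with that fix applied (my consolidation of the snippets above, assuming axolotl prepends <|begin_of_text|> itself, so the generator never emits it):

```python
if self.sep_style == SeparatorStyle.LLAMA3:
    if self.system_message:
        # The system message is NOT folded into the first user turn; every message
        # follows <|start_header_id|> + role + <|end_header_id|>\n\n + message + <|eot_id|>
        yield "", system_prompt
    for role, message in self.messages:
        if message:
            yield f"<|start_header_id|>{role}<|end_header_id|>\n\n", f"{message.strip()}<|eot_id|>"
        else:
            yield f"<|start_header_id|>{role}<|end_header_id|>\n\n", ""
    return
```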

Review comment from @0-hero (Author):

I was just working on that, but even with `yield "", system_prompt` I'm seeing <|begin_of_text|> twice. Trying to figure out what's going on there.

Review comment from @upr1ce (Apr 24, 2024):

I think that's a cache issue; clear your Hugging Face datasets cache and it should be fine, something like `rm -r ~/.cache/huggingface/datasets/`, depending on your OS.

@0-hero (Contributor, Author) commented Apr 24, 2024

Thanks @upr1ce, works as expected now:

````
<|begin_of_text|><|start_header_id|>system<|end_header_id|>

You are a helpful AI assistant.<|eot_id|><|start_header_id|>user<|end_header_id|>

Create a nested loop to print every combination of numbers between 0-9, excluding any combination that contains the number 5. Additionally, exclude any combination that contains a repeating digit. Implement the solution without using any built-in functions or libraries to check for repeating digits.<|eot_id|><|start_header_id|>assistant<|end_header_id|>

Here is an example of a nested loop in Python to print every combination of numbers between 0-9, excluding any combination that contains the number 5 or repeating digits:

```python
for i in range(10):  # First digit
    for j in range(10):  # Second digit
        for k in range(10):  # Third digit
            # Checking for the conditions
            if i!= 5 and j!= 5 and k!= 5 and i!= j and i!= k and j!= k:
                print(i, j, k)
```

This code will generate and print every combination of three digits between 0-9 that do not contain the number 5 and do not have any repeating digits.<|eot_id|><|end_of_text|>
````

Sorry, I missed the last 2 changes when I switched to the 2nd FastChat PR.
@maziyarpanahi @upr1ce this should be ready to use now.

@0-hero (Contributor, Author) commented Apr 26, 2024

@upr1ce I reran the workflows here: https://github.com/0-hero/axolotl/actions

@ryj0902 commented Apr 26, 2024

Shouldn't the bos_token be added regardless of whether the system prompt is present?
Is the instruct prompt implementation now exclusive to the single-message example?


According to the current implementation, if there is no system message (or it is an empty string), the BOS token is not added. Was this intentional?
Or are cases with no system message very rare, and am I just using an unusual use case as an example?

Case 1, system message exists:

```json
{"conversations": [{"from": "system", "value": "asdf"}, {"from": "user", "value": "It is a test"}, {"from": "assistant", "value": "C"}]}
```

Result:

```
[2024-04-26 18:15:33,159] [INFO] [axolotl.check_example_labels:37] [PID:30126] [RANK:0] <|begin_of_text|>(-100, 128000) <|start_header_id|>(-100, 128006) system(-100, 9125) <|end_header_id|>(-100, 128007)

(-100, 271) asdf(-100, 77715) <|eot_id|>(-100, 128009) <|start_header_id|>(-100, 128006) user(-100, 882) <|end_header_id|>(-100, 128007)

(-100, 271) It(-100, 2181)  is(-100, 374)  a(-100, 264)  test(-100, 1296) <|eot_id|>(-100, 128009) <|start_header_id|>(-100, 128006) assistant(-100, 78191) <|end_header_id|>(-100, 128007)

(271, 271) C(34, 34) <|eot_id|>(128009, 128009) <|end_of_text|>(128001, 128001)
```

Case 2, system message is empty:

```json
{"conversations": [{"from": "system", "value": ""}, {"from": "user", "value": "It is a test"}, {"from": "assistant", "value": "C"}]}
```

Result:

```
[2024-04-26 18:18:23,201] [INFO] [axolotl.check_example_labels:37] [PID:32252] [RANK:0] <|start_header_id|>(-100, 128006) user(-100, 882) <|end_header_id|>(-100, 128007)

(-100, 271) It(-100, 2181)  is(-100, 374)  a(-100, 264)  test(-100, 1296) <|eot_id|>(-100, 128009) <|start_header_id|>(-100, 128006) assistant(-100, 78191) <|end_header_id|>(-100, 128007)

(271, 271) C(34, 34) <|eot_id|>(128009, 128009) <|end_of_text|>(128001, 128001)
```

Case 3, system message does not exist:

```json
{"conversations": [{"from": "user", "value": "It is a test"}, {"from": "assistant", "value": "C"}]}
```

Result:

```
[2024-04-26 18:19:28,071] [INFO] [axolotl.check_example_labels:37] [PID:33302] [RANK:0] <|start_header_id|>(-100, 128006) user(-100, 882) <|end_header_id|>(-100, 128007)

(-100, 271) It(-100, 2181)  is(-100, 374)  a(-100, 264)  test(-100, 1296) <|eot_id|>(-100, 128009) <|start_header_id|>(-100, 128006) assistant(-100, 78191) <|end_header_id|>(-100, 128007)

(271, 271) C(34, 34) <|eot_id|>(128009, 128009) <|end_of_text|>(128001, 128001)
[2024-04-26 18:19:28,071] [INFO] [axolotl.check_example_labels:38] [PID:33302] [RANK:0]
```
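
Side note on reading these logs: each pair is (label, token_id), and -100 matches PyTorch's default ignore_index for the loss, so prompt tokens are masked and only the assistant tokens are trained on. A minimal illustration, with the token ids taken from the logs above:

```python
import torch
import torch.nn.functional as F

# -100 is the default ignore_index of cross_entropy, so positions
# labeled -100 (the prompt) contribute nothing to the loss.
logits = torch.randn(4, 128256)                  # 4 positions, llama-3 vocab size
labels = torch.tensor([-100, -100, 34, 128009])  # masked, masked, "C", <|eot_id|>
loss = F.cross_entropy(logits, labels)           # averaged over the 2 unmasked positions
print(loss.item())
```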

Review comment:
Should we pass system_message as one of the keyword arguments?


@winglian (Collaborator) commented May 8, 2024

Here is how llama-3-8b-instruct tokenizes:

```
<|begin_of_text|><|start_header_id|>user<|end_header_id|>

Hello, how are you?<|eot_id|><|start_header_id|>assistant<|end_header_id|>

Well, how about yourself?<|eot_id|>
```
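
This should be reproducible with the stock tokenizer; a sketch, assuming access to the gated meta-llama repo on the Hugging Face Hub:

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")
messages = [
    {"role": "user", "content": "Hello, how are you?"},
    {"role": "assistant", "content": "Well, how about yourself?"},
]
# tokenize=False returns the formatted string rather than token ids
print(tok.apply_chat_template(messages, tokenize=False))
```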

@upr1ce commented May 9, 2024

I am having trouble adjusting it to work in all 3 cases mentioned by @ryj0902. Cases 1 and 3 are easy to fix; case 2 seems weird. If the system message is empty, I am not able to force the inclusion of <|begin_of_text|> without breaking the other cases, e.g. leading to repetition of <|begin_of_text|>. Any help is appreciated @winglian @NanoCode012

@ryj0902 commented May 9, 2024

It would be great if the code could handle every case a user might enter, but I am also aware that case 2 is a very rare edge case.
As for case 2, I honestly don't know what format would be most appropriate, but wouldn't it be possible to respond with a guideline, warning, or assert? A sketch of that idea follows.
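
A minimal sketch of the warning idea, written as a hypothetical standalone guard; the function name and placement are illustrative, not part of this PR:

```python
import warnings

def warn_on_empty_system_message(system_message):
    # Hypothetical guard for case 2: an empty system message currently means
    # <|begin_of_text|> is never emitted, so surface that instead of failing silently.
    if not system_message:
        warnings.warn(
            "llama3 template: empty/missing system message; "
            "<|begin_of_text|> may be omitted from the tokenized prompt."
        )
```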

@winglian force-pushed the feature/llama-3-instruct-support branch from 2470724 to 73a03cd on May 9, 2024 18:55
@winglian (Collaborator) commented May 9, 2024

@0-hero I rebased this against main and added a sharegpt tokenization test. I believe that in order to tokenize properly, you have to include this in your configuration; otherwise it includes an <|end_of_text|> token after each assistant turn:

```yaml
special_tokens:
  eos_token: "<|eot_id|>"
```
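
Putting that together with the dataset stanza from earlier in the thread, a config sketch would look like this (dataset path as in the example above; adjust to your own data):

```yaml
datasets:
  - path: bjoernp/Vezora_Tested-22k-Python-Alpaca-sharegpt-filtered
    type: sharegpt
    conversation: llama3

special_tokens:
  eos_token: "<|eot_id|>"
```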

@maziyarpanahi (Contributor)

> @0-hero I rebased this against main and added a sharegpt tokenization test. I believe that in order to tokenize properly, you have to include this in your configuration; otherwise it includes an <|end_of_text|> token after each assistant turn:
>
> ```yaml
> special_tokens:
>   eos_token: "<|eot_id|>"
> ```

This is true! Unfortunately, instead of fixing their tokenizer_config and changing the eos_token to <|eot_id|>, Meta offered a workaround, which is to include <|eot_id|> in the terminators (stop strings). Thanks for catching this; I'm not sure how much impact it has on the fine-tune.

@MoonRide303

> @0-hero I rebased this against main and added a sharegpt tokenization test. I believe that in order to tokenize properly, you have to include this in your configuration; otherwise it includes an <|end_of_text|> token after each assistant turn:
>
> ```yaml
> special_tokens:
>   eos_token: "<|eot_id|>"
> ```
>
> This is true! Unfortunately, instead of fixing their tokenizer_config and changing the eos_token to <|eot_id|>, Meta offered a workaround, which is to include <|eot_id|> in the terminators (stop strings). Thanks for catching this; I'm not sure how much impact it has on the fine-tune.

They've just fixed the official tokenizer config in the Meta-Llama-3-8B-Instruct repo today:
https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct/blob/main/tokenizer_config.json#L2055
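
A quick way to confirm the fix locally (sketch, assuming access to the gated repo):

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")
print(tok.eos_token)  # expect "<|eot_id|>" with the updated tokenizer_config
```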

@winglian force-pushed the feature/llama-3-instruct-support branch from 73a03cd to 9ae5649 on May 10, 2024 14:44
@winglian merged commit 50421c8 into OpenAccess-AI-Collective:main May 11, 2024
7 checks passed
@0-hero (Contributor, Author) commented May 11, 2024

@winglian thanks for taking over! I had to go on a short medical break.
