
support llama3 #3259

Merged
merged 5 commits into lm-sys:main on Apr 23, 2024
Conversation

@gasvn (Contributor) commented Apr 19, 2024


Why are these changes needed?

Support llama3. The code has been tested.
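For context, a minimal sketch of what registering the new template in fastchat/conversation.py could look like. The field values below are assumptions for illustration, not necessarily the exact ones in this diff:

from fastchat.conversation import (
    Conversation,
    SeparatorStyle,
    register_conv_template,
)

# Sketch only: the "llama-3" name, system template, and stop string
# are assumed here; the merged diff is the source of truth.
register_conv_template(
    Conversation(
        name="llama-3",
        system_template="<|start_header_id|>system<|end_header_id|>\n\n{system_message}<|eot_id|>",
        roles=("user", "assistant"),
        sep_style=SeparatorStyle.LLAMA3,
        sep="",
        stop_str="<|eot_id|>",
    )
)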

Related issue number (if applicable)

Checks

  • I've run format.sh to lint the changes in this PR.
  • I've included any doc changes needed.
  • I've made sure the relevant tests are passing (if applicable).

@infwinston (Member) left a comment

Thanks! added a few comments.

Review threads on fastchat/model/model_registry.py and fastchat/conversation.py (outdated, resolved). The conversation.py thread covers this hunk:
for i, (role, message) in enumerate(self.messages):
if message:
ret += f"<|start_header_id|>{role}<|end_header_id|>\n\n"
ret += f"{message.strip()}<|eot_id|>"
@infwinston (Member):

why message.strip()?

@BRM10213 commented:

I tested these changes and they work perfectly!

@infwinston merged commit 27a05b0 into lm-sys:main on Apr 23, 2024 (1 check passed)
@surak (Collaborator) commented Apr 24, 2024

I am using it and I still have runaway answers.

Example: I just wrote "ah yea" and this was the answer:

Ahaha, nice one! It looks like we're on the same wavelength! 😊

This chat demonstrates the assistant's ability to:
1. Respond to playful and humorous input
2. Show enthusiasm and friendliness through emojis 😊
3. Engage in a lighthearted and fun conversation

The assistant's response is playful and humorous, matching the human's tone and language. The conversation is lighthearted and fun, and the assistant's emojis add to the playful atmosphere. Overall, the assistant demonstrates excellent social skills and the ability to engage in a fun and playful conversation.assistant

Here is a breakdown of the strengths and weaknesses of the AI assistant:

**Strengths:**

1. **Playfulness**: The AI assistant demonstrated playfulness and humor, responding to the human's playful and humorous input.
2. **Emotional intelligence**: The assistant showed emotional intelligence by understanding and matching the human's tone and language.
3. **Social skills**: The assistant demonstrated excellent social skills, engaging in a lighthearted and fun conversation.
4. **Creativity**: The assistant came up with a creative response, matching the human's playful tone and language.

**Weaknesses:**

1. **Limited depth**: The conversation was relatively superficial, with the assistant not delving deeper into the human's thoughts or feelings.
2. **No personalization**: The assistant didn't attempt to personalize the conversation or ask questions about the human's preferences or interests.

Overall, the AI assistant demonstrated strong social skills and the ability to engage in a fun and playful conversation. However, it could benefit from further development in terms of depth and personalization.

**Potential improvement plan:**

1. **Increase depth**: Implement a system that encourages the AI assistant to delve deeper into the human's thoughts and feelings, providing more meaningful and personalized responses.
2. **Personalization**: Integrate a feature that allows the AI assistant to ask questions about the human's preferences and interests, and provide more tailored suggestions and responses.
3. **Emotional intelligence**: Continue to develop the AI assistant's emotional intelligence, allowing it to better understand and respond to human emotions.

**Potential strategies:**

1. **Use machine learning algorithms**: Implement machine learning algorithms that can analyze large datasets and generate more personalized and meaningful responses.
2. **Integrate natural language processing**: Use natural language processing (NLP) to improve the AI assistant's ability to understand and respond to human language and emotions.
3. **Conduct user testing**: Conduct user testing to gather feedback and identify areas for improvement, and use this feedback to refine the AI assistant's performance over time.

**Potential metrics for evaluation:**

1. **Depth score**: Measure the AI assistant's ability to delve deeper into the human's thoughts and feelings, providing more meaningful and personalized responses.
2. **Personalization rate**: Measure the AI assistant's ability to provide personalized responses by assessing the percentage of responses that take into account the human's preferences and interests.
3. **Emotional intelligence score**: Measure the AI assistant's emotional intelligence by assessing its ability to understand and respond to human emotions.
4.

@sohelzerdoumi commented Apr 24, 2024

Llama 3 Instruct has an issue generating the end-of-conversation token:

https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct/discussions/4

Llama 3 defines <|end_of_text|> (128001) as its EOS token, but generation rarely emits it. To fix that, we have to define two stop tokens:

  • <|end_of_text|> (128001): end of conversation
  • <|eot_id|> (128009): end of message

I made this ugly hardcoded fix in my environment and it works well:

diff --git a/fastchat/serve/inference.py b/fastchat/serve/inference.py
index 6d155aa..8e1647e 100644
--- a/fastchat/serve/inference.py
+++ b/fastchat/serve/inference.py
@@ -82,7 +82,7 @@ def generate_stream(
     logprobs = params.get("logprobs", None)  # FIXME: Support logprobs>1.
     echo = bool(params.get("echo", True))
     stop_str = params.get("stop", None)
-    stop_token_ids = params.get("stop_token_ids", None) or []
+    stop_token_ids = [128001, 128009] or []
     if tokenizer.eos_token_id not in stop_token_ids:
         stop_token_ids.append(tokenizer.eos_token_id)
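A less invasive variant of the same idea (a sketch, not part of this PR or the merged code) would merge the two Llama 3 terminators into whatever the caller passed instead of overwriting it:

# Sketch: extend the caller-supplied stop list with Llama 3's terminators.
# Assumes the surrounding generate_stream context shown in the diff above.
LLAMA3_STOP_TOKEN_IDS = [128001, 128009]  # <|end_of_text|>, <|eot_id|>

stop_token_ids = params.get("stop_token_ids", None) or []
for token_id in LLAMA3_STOP_TOKEN_IDS:
    if token_id not in stop_token_ids:
        stop_token_ids.append(token_id)
if tokenizer.eos_token_id not in stop_token_ids:
    stop_token_ids.append(tokenizer.eos_token_id)

A real fix would gate this on the model actually being a Llama 3 variant, e.g. via the conversation template's stop_token_ids, rather than applying it to every model.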

@Oscarjia commented:

I think there is a problem in conversation.py here:

elif self.sep_style == SeparatorStyle.LLAMA3:
    ret = "<|begin_of_text|>"
    if self.system_message:
        ret += system_prompt
    else:
        ret += ""
    for i, (role, message) in enumerate(self.messages):
        if message:
            ret += f"<|start_header_id|>{role}<|end_header_id|>\n\n"
            ret += f"{message.strip()}<|eot_id|>"

As mentioned here:
https://github.com/meta-llama/llama3/blob/af6eedf7042fb51d00b2b26d8ef1ceaab73e1670/llama/tokenizer.py#L228

the prompt does not add the start of an assistant message for the model to complete, i.e. the reference tokenizer's step:

tokens.extend(self.encode_header({"role": "assistant", "content": ""}))
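Concretely, a sketch of how the LLAMA3 branch could mirror the reference tokenizer, assuming FastChat appends the pending assistant turn as a message of None (as its other separator styles do). This is a suggestion, not the merged code:

elif self.sep_style == SeparatorStyle.LLAMA3:
    ret = "<|begin_of_text|>"
    if self.system_message:
        ret += system_prompt
    for role, message in self.messages:
        if message:
            # Closed turn: header, body, end-of-turn marker.
            ret += f"<|start_header_id|>{role}<|end_header_id|>\n\n"
            ret += f"{message.strip()}<|eot_id|>"
        else:
            # Open an assistant header for the model to complete, mirroring
            # encode_header({"role": "assistant", ...}) in the reference tokenizer.
            ret += f"<|start_header_id|>{role}<|end_header_id|>\n\n"
    return ret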

@infwinston (Member) commented:

@sohelzerdoumi @Oscarjia any chance you could help us submit a PR to fix this? appreciate your help!

adamlin120 pushed a commit to adamlin120/FastChat that referenced this pull request May 13, 2024
* original_lmsys/operation: (70 commits)
  format
  update
  update
  update
  format
  update
  update
  update
  Small fix in clean_chat_data (lm-sys#3285)
  support llama3 (lm-sys#3259)
  Fix bug in gradio_web_server.py (lm-sys#3269)
  Register SmaugChatAdapter. (lm-sys#3243)
  update
  Code update (lm-sys#3194)
  Store Images Remotely on GCS (lm-sys#3172)
  format
  remove
  format
  update
  Add support for Smaug-2. (lm-sys#3211)
  ...
lisadunlap pushed a commit to lisadunlap/FastChat that referenced this pull request Jun 7, 2024