
prompt format? #30

Closed
silvacarl2 opened this issue Nov 6, 2023 · 9 comments
Labels: doc, doc-not-needed, question, regression

Comments

@silvacarl2

This is not really an issue, but I did not know where else to put it. Is there a specific prompt format to use?

@mallorbc commented Nov 6, 2023

Carl,

This is just a base model (for now), so there is no fancy prompting except for the BOS and EOS tokens, which are <|startoftext|> and <|endoftext|> respectively.
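A minimal sketch for verifying those tokens with Hugging Face transformers (the repo id below is an assumption; substitute the checkpoint you are actually using):

```python
# Minimal sketch: inspect the tokenizer's BOS/EOS tokens.
# "01-ai/Yi-6B" is an assumed repo id; replace it with your own checkpoint.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("01-ai/Yi-6B", trust_remote_code=True)
print(tokenizer.bos_token, tokenizer.bos_token_id)  # expected: <|startoftext|>
print(tokenizer.eos_token, tokenizer.eos_token_id)  # expected: <|endoftext|>
```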

@silvacarl2 (Author)

LOL THX

@loofahcus (Contributor) commented Nov 7, 2023

> Carl,
>
> This is just a base model (for now), so there is no fancy prompting except for the BOS and EOS tokens, which are <|startoftext|> and <|endoftext|> respectively.

Actually, just EOS (<|endoftext|>) in the base models :)

ZhaoFancy added the question, regression, and doc labels on Nov 7, 2023
@mallorbc commented Nov 8, 2023

> Carl,
> This is just a base model (for now), so there is no fancy prompting except for the BOS and EOS tokens, which are <|startoftext|> and <|endoftext|> respectively.
>
> Actually, just EOS (<|endoftext|>) in the base models :)

The tokenizer has a distinct BOS token. Should it not be used for finetuning?

@loofahcus (Contributor)

> Carl,
> This is just a base model (for now), so there is no fancy prompting except for the BOS and EOS tokens, which are <|startoftext|> and <|endoftext|> respectively.
>
> Actually, just EOS (<|endoftext|>) in the base models :)
>
> The tokenizer has a distinct BOS token. Should it not be used for finetuning?

I think it's OK to use it in finetuning.

@ericzhou571

@loofahcus I carefully checked the tokenizer's token map and found three special tokens: <|System|>, <|Human|>, and <|Assistant|>.
Are these the special tokens you plan to use to build multi-round conversations?
e.g.:
<|Human|>repeat "this is a multi-turn conversation1" pls<|Assistant|>this is a multi-turn conversation1<|endoftext|><|Human|>repeat "this is a multi-turn conversation2" pls<|Assistant|>this is a multi-turn conversation2<|endoftext|><|Human|>repeat "this is a multi-turn conversation3" pls<|Assistant|>this is a multi-turn conversation3<|endoftext|>
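A small helper that simply restates the format sketched above (the function name and structure are hypothetical, not from the repo):

```python
# Hypothetical helper that assembles the multi-turn format from the example above,
# using the <|Human|>/<|Assistant|> tokens found in the tokenizer map.
def build_prompt(turns):
    """turns: list of (human_text, assistant_text) pairs."""
    prompt = ""
    for human, assistant in turns:
        prompt += f"<|Human|>{human}<|Assistant|>{assistant}<|endoftext|>"
    return prompt

print(build_prompt([('repeat "this is a multi-turn conversation1" pls',
                     "this is a multi-turn conversation1")]))
```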

@loofahcus (Contributor)

> @loofahcus I carefully checked the tokenizer's token map and found three special tokens: <|System|>, <|Human|>, and <|Assistant|>. Are these the special tokens you plan to use to build multi-round conversations? e.g.: <|Human|>repeat "this is a multi-turn conversation1" pls<|Assistant|>this is a multi-turn conversation1<|endoftext|><|Human|>repeat "this is a multi-turn conversation2" pls<|Assistant|>this is a multi-turn conversation2<|endoftext|><|Human|>repeat "this is a multi-turn conversation3" pls<|Assistant|>this is a multi-turn conversation3<|endoftext|>

They're reserved for the chat models, but in the end we did not use them for certain reasons.

@ericzhou571

> @loofahcus I carefully checked the tokenizer's token map and found three special tokens: <|System|>, <|Human|>, and <|Assistant|>. Are these the special tokens you plan to use to build multi-round conversations? e.g.: <|Human|>repeat "this is a multi-turn conversation1" pls<|Assistant|>this is a multi-turn conversation1<|endoftext|><|Human|>repeat "this is a multi-turn conversation2" pls<|Assistant|>this is a multi-turn conversation2<|endoftext|><|Human|>repeat "this is a multi-turn conversation3" pls<|Assistant|>this is a multi-turn conversation3<|endoftext|>
>
> They're reserved for the chat models, but in the end we did not use them for certain reasons.

Would you mind explaining that in a little more detail? I am going to SFT a chat model with this format. Will it lead to poor conversation performance, or should I use another general chat format, e.g., ChatML?

@loofahcus (Contributor) commented Nov 9, 2023

@ericzhou571 It would not lead to poor conversation performance as long as you re-initialize the embedding and lm_head rows at those token positions~
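A rough sketch of what re-initializing those rows before SFT could look like with transformers (the repo id, the init scale, and the assumption that the input and output embeddings are untied are all assumptions, not from this thread):

```python
# Sketch: re-initialize the embedding and lm_head rows of the reserved chat
# tokens before finetuning. Repo id and init scale are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "01-ai/Yi-6B"  # assumption: substitute your base checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

special = ["<|System|>", "<|Human|>", "<|Assistant|>"]
ids = tokenizer.convert_tokens_to_ids(special)

embed = model.get_input_embeddings().weight      # [vocab_size, hidden]
lm_head = model.get_output_embeddings().weight   # assumes an untied output head
with torch.no_grad():
    for tok_id in ids:
        embed[tok_id].normal_(mean=0.0, std=0.02)
        lm_head[tok_id].normal_(mean=0.0, std=0.02)
```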

Yimi81 added the doc-not-needed label on Mar 8, 2024