why keep "###" before instruct text? #44
Video-LLaMA/video_llama/datasets/datasets/video_instruct_dataset.py
Lines 250 to 252 in 1728e14

I was reading _mask_targets(). I guess this function uses the mask to ignore the loss from the instruction text, but why do you manually keep [curr_idx: curr_idx+2], which is the "###" before the actual instruction text?

Comments
The assistant will learn to generate "###" if it wants to end the current round, so "###" can be understood as the EOS flag of each session.
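A minimal sketch of how such a "###"-separated session string might be assembled — the role prefixes and spacing here are illustrative assumptions, not the repo's exact template:

```python
# Sketch of a "###"-separated conversation template. The "Human:"/"Assistant:"
# prefixes and the spacing are assumptions based on this discussion, not
# necessarily the repo's literal format.

def build_conversation(rounds):
    """rounds: list of (human_text, assistant_text) pairs."""
    pieces = ["###"]  # the leading "###" discussed in this thread
    for human, assistant in rounds:
        pieces.append(f"Human: {human}")
        pieces.append("###")
        pieces.append(f"Assistant: {assistant}")
        pieces.append("###")  # the assistant learns to emit this to end its round
    return " ".join(pieces)

print(build_conversation([("Hi, what is in the video?", "A dog is running.")]))
# ### Human: Hi, what is in the video? ### Assistant: A dog is running. ###
```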
Thanks for the reply. I understand that "###" works like an EOS here, and the "###" in the assistant text does look like an EOS. However, why do we want the assistant to learn an EOS before the instruction text?
During inference, we will stop generating tokens once the assistant outputs "###".
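A stop condition like this is commonly implemented as a custom stopping criterion. The sketch below uses the HuggingFace transformers interface to illustrate the idea; it is not the repo's actual inference code, and the variable names are assumptions:

```python
import torch
from transformers import StoppingCriteria, StoppingCriteriaList

class StopOnString(StoppingCriteria):
    """Stop generation once the newly generated text contains `stop_str`."""
    def __init__(self, tokenizer, stop_str="###", prompt_len=0):
        self.tokenizer = tokenizer
        self.stop_str = stop_str
        self.prompt_len = prompt_len  # prompt tokens to skip when checking

    def __call__(self, input_ids: torch.LongTensor, scores, **kwargs) -> bool:
        # Decode only the tokens generated after the prompt.
        generated = self.tokenizer.decode(input_ids[0, self.prompt_len:])
        return self.stop_str in generated

# Usage (model, tokenizer, and input_ids are assumed to exist):
# criteria = StoppingCriteriaList([StopOnString(tokenizer, "###", input_ids.shape[1])])
# output = model.generate(input_ids, stopping_criteria=criteria)
```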
I mean that you also keep the first "###" (the one before "Human: Hi...").
Oh, I see. That is just for convenience; you could add a check and mask the first "###" as well.
Okay, thanks for the confirmation.
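To close the loop on the question, here is a minimal sketch of the masking pattern being discussed. It is an illustration under assumptions (the IGNORE_INDEX constant, the two-token width of "###", and the speaker labels come from this discussion, not from inspecting the repo); the referenced lines 250 to 252 are the place to check the real behavior:

```python
import torch

IGNORE_INDEX = -100  # positions set to this value are excluded from the loss

def _mask_targets(target, tokenized_lens, speakers):
    # Sketch, not the repo's literal code: ignore the loss on every
    # human/instruction span, but leave the two tokens of the "###"
    # that opens each span unmasked -- the detail this issue asks about.
    curr_idx = tokenized_lens[0]      # length of the header/system segment
    target[:curr_idx] = IGNORE_INDEX  # no loss on the header
    for tokenized_len, speaker in zip(tokenized_lens[1:], speakers):
        if speaker == "human":
            # [curr_idx : curr_idx + 2] -- the "###" opening the turn --
            # stays unmasked. Masking from curr_idx instead would mask
            # that "###" too, i.e. the extra "judgment" suggested above
            # for the very first "###".
            target[curr_idx + 2 : curr_idx + tokenized_len] = IGNORE_INDEX
        curr_idx += tokenized_len

# Toy usage with made-up lengths: a 3-token header, an 8-token human turn,
# and a 6-token assistant turn.
target = torch.arange(17)
_mask_targets(target, [3, 8, 6], ["human", "assistant"])
print(target)  # header and human tokens are -100, except positions 3 and 4 ("###")
```

Keeping those two tokens unmasked means the model is trained to predict "###" immediately after each assistant reply, which is how it learns the EOS behavior described above; the very first "###" then stays unmasked only as a side effect of applying the same rule to every human turn.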