The value of `output_ids` is `tensor([[1, 2]], device='cuda:0')`.
The rest of the demo script's output is:
Question: A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. USER:
Please provide a detailed description of the video, focusing on the main subjects, their actions, and the background scenes ASSISTANT:
Response:
The command: `bash scripts/video/demo/video_demo.sh lmms-lab/LLaVA-NeXT-Video-7B-DPO vicuna_v1 32 2 True xxx.mp4`
By the way, I found that using pool_stride=4 solves this: with stride=2 the input token length is 4673, which is larger than the max_length of the LLM (4096).
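For anyone checking whether their own settings fit, here is a rough sketch of the arithmetic. It is not code from the repository. It assumes each frame is encoded as a 24x24 patch grid (what a 336px, patch-14 vision tower would give) and that pooling divides each side by the stride. The helper names and the 65-token prompt length are hypothetical; 65 is only what remains of the reported 4673 after subtracting the estimated visual tokens.

```python
# Rough estimate of the input length for a video prompt.
# Assumptions (not taken from the repo): 24x24 patch grid per frame,
# spatial pooling that divides each side by pool_stride.

def visual_tokens(num_frames: int, pool_stride: int, grid_side: int = 24) -> int:
    """Estimated number of visual tokens after spatial pooling."""
    pooled_side = grid_side // pool_stride  # exact for strides 2 and 4
    return num_frames * pooled_side * pooled_side


def fits_context(num_frames: int, pool_stride: int,
                 text_tokens: int, max_length: int = 4096) -> bool:
    """True if visual plus text tokens stay within max_length."""
    return visual_tokens(num_frames, pool_stride) + text_tokens <= max_length


# Hypothetical prompt length, inferred from 4673 - 4608.
TEXT_TOKENS = 65

for stride in (2, 4):
    total = visual_tokens(32, stride) + TEXT_TOKENS
    print(stride, total, fits_context(32, stride, TEXT_TOKENS))
```

Under these assumptions, 32 frames at stride 2 come to 4673 tokens, which matches the number above and exceeds 4096, while stride 4 comes to 1217 tokens. Reducing the frame count would be another way to get under the limit.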