Performance of LLoVi with 7B llama2 #3

Open
Leo-Yuyang opened this issue Apr 3, 2024 · 2 comments

Comments

@Leo-Yuyang

Dear author, thank you for your work! I would like to know the performance of LLoVi on NExT-QA, NExT-GQA, and IntentQA when using 7B llama2 as the LLM.
Larger models like GPT-3.5 and GPT-4 are not open source, so we cannot do research on how to improve them on this task. I therefore think it would be beneficial for the community to report performance with a smaller LLM.

@CeeZh
Owner

CeeZh commented Apr 3, 2024

Hi Leo,

Thanks for reaching out. We did not test llama-7b on nextqa. However, it should be straightforward to modify this codebase to support it. You can refer to the settings for (EgoSchema + Llama-70B) and (NextQA + GPT-4).

@Leo-Yuyang
Author

Dear author, thank you for your reply!
I just tried to test llama-7b on nextqa, and the modification was indeed straightforward:
python main.py \
  --dataset nextqa \
  --data_path data/nextqa/llava1.5_fps1.json \
  --fps 0.5 \
  --anno_path data/nextqa/val.csv \
  --duration_path data/nextqa/durations.json \
  --prompt_type qa_next \
  --model gpt-4-1106-preview \
  --output_base_path output/nextqa \
  --output_filename gpt4_llava.json
I downloaded data.zip and changed the --model part of the above command to llama-2-7b-chat.
However, I don't have permission to access the llama2 series models on Hugging Face, so I couldn't run the code successfully.
I then tried downloading the model from https://github.com/meta-llama/llama.
However, that release is loaded differently from the Hugging Face version, so I had to modify a lot of code. In the end, I only got an accuracy of 2.02% on the nextqa task.
This does not seem like a reasonable number, and it may be due to the way I load the model.
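For reference, the standard Hugging Face loading path I was trying to match looks roughly like the sketch below. This is not the loader used in this codebase, just the usual transformers API; the repo id assumes gated access to meta-llama/Llama-2-7b-chat-hf has been granted, and the prompt is only a placeholder.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Gated Hugging Face repo id for the 7B chat model (requires approved access).
model_id = "meta-llama/Llama-2-7b-chat-hf"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision so the 7B model fits on a single GPU
    device_map="auto",
)

# Placeholder prompt; not the prompt template used by this codebase.
prompt = "Question: ...\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))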
So I'd like to know whether it would be convenient for you to help me reproduce this and report the performance. It looks like, once the environment is ready, all one needs to do is change the --model part of the above command to 'llama-2-7b-chat'.
