Tips 🚀: We decrease `lora_alpha` from 32 to 20 during inference to restore the model's language capabilities, which is very helpful for QA-style benchmarks.
Please refer to FAQ.md for details.
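To make the effect of this change concrete, here is a minimal sketch assuming the standard LoRA formulation, in which the low-rank update is scaled by `lora_alpha / r` before being added to the frozen weight. The rank `R = 16` and the helper names are hypothetical and chosen only for illustration; the model's actual rank and merging code may differ.

```python
def lora_scaling(lora_alpha: float, r: int) -> float:
    """Scaling factor applied to the low-rank update in standard LoRA."""
    return lora_alpha / r


def merged_weight(w0, delta, lora_alpha, r):
    """Return w0 + (lora_alpha / r) * delta, elementwise over nested lists.

    `delta` stands in for the product B @ A of the two LoRA matrices.
    """
    s = lora_scaling(lora_alpha, r)
    return [
        [w + s * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(w0, delta)
    ]


R = 16  # hypothetical rank, not taken from this repository

train_scale = lora_scaling(32, R)  # 2.0
infer_scale = lora_scaling(20, R)  # 1.25

# The ratio does not depend on R: the LoRA update is weighted at
# 20 / 32 = 0.625 of its training-time strength.
ratio = infer_scale / train_scale  # 0.625
```

Under this assumption, lowering `lora_alpha` shrinks the adapter's contribution relative to the frozen base weights, which is consistent with the tip's goal of recovering the base model's language behaviour.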
link: https://github.com/llyx97/TempCompass
leaderboard: https://huggingface.co/spaces/lyx97/TempCompass
evaluation scripts:
- First, set `MODEL_DIR`, `ANNO_DIR`, and `VIDEO_DIR` in `eval_tempcompass.sh`.
- Then run: `cd benchmark && sh eval_tempcompass.sh`
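Before launching an evaluation, it can help to confirm that the three variables have actually been filled in. The following is a hypothetical pre-flight check, not part of the repository: it assumes the script sets them with plain `NAME=value` shell assignments (optionally prefixed with `export`), which may not match the real file.

```python
import re

REQUIRED = ("MODEL_DIR", "ANNO_DIR", "VIDEO_DIR")


def read_assignments(script_text: str) -> dict:
    """Collect simple NAME=value assignments for the required variables."""
    found = {}
    for line in script_text.splitlines():
        match = re.match(r"\s*(?:export\s+)?([A-Za-z_]\w*)=(.*)$", line)
        if match and match.group(1) in REQUIRED:
            # Drop surrounding whitespace and quotes around the value.
            found[match.group(1)] = match.group(2).strip().strip("\"'")
    return found


def missing_variables(script_text: str) -> list:
    """Return required variables that are absent or assigned an empty value."""
    values = read_assignments(script_text)
    return [name for name in REQUIRED if not values.get(name)]


# Illustrative script contents; the paths are placeholders.
sample = 'MODEL_DIR="/path/to/model"\nANNO_DIR=/path/to/anno\nVIDEO_DIR=\n'
unset = missing_variables(sample)  # ['VIDEO_DIR']
```

To apply it to a real script, read the file's text (for example `benchmark/eval_tempcompass.sh`) and pass it to `missing_variables`; an empty list means all three variables have a value. The same check would apply to the other evaluation scripts if they follow the same layout.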
results:
link: https://github.com/OpenGVLab/Ask-Anything/tree/main/video_chat2
leaderboard: https://huggingface.co/spaces/OpenGVLab/MVBench_Leaderboard
evaluation scripts:
- First, set `MODEL_DIR`, `ANNO_DIR`, and `VIDEO_DIR` in `eval_mvbench.sh`.
- Then run: `cd benchmark && sh eval_mvbench.sh`
results:
link: https://github.com/egoschema/EgoSchema
leaderboard: https://www.kaggle.com/competitions/egoschema-public/overview
evaluation scripts:
- First, set `MODEL_DIR`, `ANNO_DIR`, and `VIDEO_DIR` in `eval_egoschema.sh`.
- Then run: `cd benchmark && sh eval_egoschema.sh`
results:
TBD