Request for NExTQA Dataset Evaluation Prompt and More Results on Challenging Datasets for Fair Comparison #3

patrick-tssn · 2024-05-10T14:59:12Z

To my knowledge, the videos in NExTQA dataset are relatively short, with an average video length of 44 seconds, and there is a noted static bias[1] in the ActivityNet QA dataset. Could you present further results on more demanding datasets for fair comparison, such as EgoSchema[2]? Additionally, Could I request that you supply the evaluation prompt for the NeXTQA dataset?

[1] Lei, Jie et al. “Revealing Single Frame Bias for Video-and-Language Learning.” ArXiv abs/2206.03428 (2022): n. pag.
[2] Mangalam, Karttikeya et al. “EgoSchema: A Diagnostic Benchmark for Very Long-form Video Language Understanding.” ArXiv abs/2308.09126 (2023): n. pag.

ZhangYuanhan-AI · 2024-05-14T05:21:29Z

Thanks for your advise. The evaluation on the EgoSchema is ongoing.

The prompt for the NeXTQA is: Answer the question using several words or phrase.'

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Request for NExTQA Dataset Evaluation Prompt and More Results on Challenging Datasets for Fair Comparison #3

Request for NExTQA Dataset Evaluation Prompt and More Results on Challenging Datasets for Fair Comparison #3

patrick-tssn commented May 10, 2024

ZhangYuanhan-AI commented May 14, 2024

Request for NExTQA Dataset Evaluation Prompt and More Results on Challenging Datasets for Fair Comparison #3

Request for NExTQA Dataset Evaluation Prompt and More Results on Challenging Datasets for Fair Comparison #3

Comments

patrick-tssn commented May 10, 2024

ZhangYuanhan-AI commented May 14, 2024