
Long video test results did not meet expectations #38

Open
ffiioonnaa opened this issue Jun 10, 2024 · 2 comments

@ffiioonnaa

Hi, thanks for your work!
When I used the demo script to test the highlight detection and temporal grounding tasks on my own video, I found that the output timestamps differ on every run. Also, when I input a 30-minute video, the predicted timestamps are often small, e.g., in the tens or low hundreds of seconds.

@rahulkrprajapati

Hey @ffiioonnaa, I ran into the same issue. It might be because frame sampling is capped at 96 frames, and changing the sampling rate would affect accuracy. What I did instead was split the video into 2-5 minute chunks, run the same prompt on each chunk, and then shift each chunk's timestamps back to global time by adding the number of seconds that had elapsed in the previous chunks (see the sketch below).

The accuracy and the timestamps were still not too good for me but it does seem to perform better this way for longer videos.
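
For reference, a minimal sketch of that chunking workflow in Python. It assumes ffmpeg is on PATH and that a `run_demo(chunk_path, prompt)` wrapper around the repo's demo script exists and returns `(start, end)` timestamps in seconds relative to the chunk; both are assumptions for illustration, not the repo's actual API.

```python
# Sketch of the chunking workaround described above. `run_demo` is a
# hypothetical wrapper around the demo script, not part of this repo.
import subprocess

CHUNK_SECONDS = 300  # 5-minute chunks, near the model's comfort zone


def split_video(path, total_seconds, chunk_seconds=CHUNK_SECONDS):
    """Cut the video into chunks; return (chunk_path, offset_seconds) pairs."""
    chunks = []
    for i, start in enumerate(range(0, total_seconds, chunk_seconds)):
        out = f"chunk_{i:03d}.mp4"
        subprocess.run(
            ["ffmpeg", "-y", "-ss", str(start), "-t", str(chunk_seconds),
             "-i", path, "-c", "copy", out],
            check=True,
        )
        chunks.append((out, start))
    return chunks


def grounded_timestamps(path, total_seconds, prompt):
    """Run the same prompt on every chunk, then shift results to global time."""
    results = []
    for chunk_path, offset in split_video(path, total_seconds):
        for start, end in run_demo(chunk_path, prompt):  # hypothetical wrapper
            results.append((start + offset, end + offset))
    return results
```

Note that stream-copied chunks (`-c copy`) cut on keyframes, so chunk boundaries can be off by a second or two; re-encoding instead of copying gives exact cuts at the cost of speed.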

@RenShuhuai-Andy
Owner

Hi, thanks for your interest.

As shown in Table 1 of our paper, the average video duration in the training data is 190 seconds, so the model performs best on videos around that length. When the video is much longer (e.g., half an hour), performance may deteriorate.
