Seeking through long videos is slow #98

@briandwagner

Hello! First of all, thank you so much for all of your work on this. This program has saved my tail multiple times. I've recently run into a problem and I don't know if it would be fixable or not.

I'm working with some large (multiple-hour), lossless FFV1 AVI files, and I want to detect scenes only at the end of the clip. Currently, I use the time --start command to start PySceneDetect in the last couple of minutes of the clip and let it run from there. However, PySceneDetect takes a long time just to find the time code within my file before it actually starts detecting scenes. When I run it with -v debug, it hangs on DEBUG: context.process_input(): Processing input... for several minutes. Once it finishes, it moves on to detect the scenes as normal.

I understand that seeking to a specific time code in a massive file is going to take a long time no matter what, but I think that in cases where a user only wants to detect scenes at the end of a file, this could be drastically sped up by reading the video file in reverse.

This is what I envision:

Let's say I want to look for scenes only in the last 5 minutes of a 4-hour lossless FFV1 clip, and the clip is 200 GB. Currently, if I use the time --start 03:55:00 command, PySceneDetect reads through the whole file until it finds that time code, and only then starts running. This means it reads almost the full 200 GB from my drive, which takes a very long time. Instead, I would like to be able to use a command such as time --reverse --duration 00:05:00. The --reverse option would cause PySceneDetect to jump directly to the end of the video stream, play it backwards, and detect scenes as normal. In this example, it would start at 04:00:00, then go to 03:59:59, and so on.
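For illustration, the proposed time --reverse --duration 00:05:00 covers the same range as time --start 03:55:00 on a 4-hour clip. A minimal sketch of that arithmetic, assuming the clip length is already known; the helper names are hypothetical and not part of PySceneDetect:

```python
def timecode_to_seconds(timecode: str) -> int:
    """Convert an HH:MM:SS time code to whole seconds."""
    hours, minutes, seconds = (int(part) for part in timecode.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def seconds_to_timecode(total_seconds: int) -> str:
    """Convert whole seconds back to an HH:MM:SS time code."""
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


def tail_start(clip_length: str, tail_duration: str) -> str:
    """Return the --start value that covers the last `tail_duration` of the clip.

    Hypothetical helper: clamps to 00:00:00 if the tail is longer than the clip.
    """
    start = timecode_to_seconds(clip_length) - timecode_to_seconds(tail_duration)
    return seconds_to_timecode(max(0, start))


print(tail_start("04:00:00", "00:05:00"))  # 03:55:00
```

This only computes the equivalent start point; it does not make the seek itself any faster.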

As for the results, they could be displayed in more than one way. Let's say there is a scene that runs from 03:59:50 to 04:00:00. Since it's running in reverse, PySceneDetect would find this scene as starting at 04:00:00 and ending at 03:59:50. When it lists scenes, it could display them this way, or I could use list-scenes --reversecorrection to flip them back to starting at 03:59:50 and ending at 04:00:00, exactly as PySceneDetect does now when I run it with the normal time --start command.
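The proposed --reversecorrection step could be sketched as follows. This is only an illustration of the idea, assuming scenes arrive as (start, end) time code pairs in the order a reverse pass would find them; the function name and the second scene in the sample data are made up for the example:

```python
from typing import List, Tuple

Scene = Tuple[str, str]  # (start, end) as HH:MM:SS time codes


def correct_reversed_scenes(scenes: List[Scene]) -> List[Scene]:
    """Flip scenes found during a reverse pass back to forward order.

    Each pair is swapped so the earlier time code comes first, and the
    list order is reversed so the earliest scene is listed first.
    """
    return [(end, start) for start, end in reversed(scenes)]


# Scenes as a reverse pass would report them (latest scene first).
found_in_reverse = [("04:00:00", "03:59:50"), ("03:59:50", "03:58:20")]

print(correct_reversed_scenes(found_in_reverse))
# [('03:58:20', '03:59:50'), ('03:59:50', '04:00:00')]
```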

Media Examples:
Unfortunately, I can't just share a massive lossless video file, but any file can be transcoded to FFV1 with ffmpeg by running ffmpeg -i videofile.mp4 -c:v ffv1 -level 3 -g 1 -threads THREADCOUNT -c:a pcm_s16le output.avi. Obviously transcoding a lossy file to lossless would be pointless in the real world, but it should be sufficient for testing.
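For anyone scripting the test transcode, the ffmpeg command above can be assembled like this. It is a small sketch: the flags are exactly the ones listed above, while the function name and the default thread count (taken from os.cpu_count()) are assumptions for the example:

```python
import os
from typing import List, Optional


def ffv1_command(src: str, dst: str, threads: Optional[int] = None) -> List[str]:
    """Build the ffmpeg argument list for an intra-only FFV1 test file."""
    if threads is None:
        # Assumption: fall back to the number of CPUs the OS reports.
        threads = os.cpu_count() or 1
    return [
        "ffmpeg",
        "-i", src,
        "-c:v", "ffv1", "-level", "3",
        "-g", "1",                   # GOP size of 1, as in the command above
        "-threads", str(threads),
        "-c:a", "pcm_s16le",
        dst,
    ]


cmd = ffv1_command("videofile.mp4", "output.avi", threads=4)
print(" ".join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```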

Alternative Solutions:

Currently, running with time --start 03:55:00 achieves the same output, but the time it takes PySceneDetect to read through the file is too long.

Starting with smaller video files should help speed things along from a drive-reading standpoint, but in the video archival world (where I work) or the broadcast world, starting from a non-lossless source isn't always an option. I am using FFV1 as my codec because it losslessly compresses better than any other lossless video codec. My files are as small as they can get.

I tried using ffmpeg to reverse the file and export it as a lossy H.264 file to act as a proxy for PySceneDetect to read from, but even with NVENC and the fastest settings it allows, ffmpeg still has to read the whole lossless file to transcode it.
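One further variation that might be worth testing is to have ffmpeg copy only the tail of the file without re-encoding, using its -sseof input option (seek relative to the end of the file) together with stream copy. The sketch below only builds the command; the function name and file names are placeholders, and whether this avoids reading most of a large FFV1 AVI depends on how well the container can be seeked, which has not been verified here:

```python
from typing import List


def tail_copy_command(src: str, dst: str, tail_seconds: int) -> List[str]:
    """Build an ffmpeg command that stream-copies the last `tail_seconds` of src."""
    if tail_seconds <= 0:
        raise ValueError("tail_seconds must be positive")
    return [
        "ffmpeg",
        "-sseof", f"-{tail_seconds}",  # negative offset: seek back from end of file
        "-i", src,
        "-c", "copy",                   # copy streams as-is, no transcoding
        dst,
    ]


cmd = tail_copy_command("input.avi", "tail.avi", 5 * 60)
print(" ".join(cmd))
# ffmpeg -sseof -300 -i input.avi -c copy tail.avi
```

The resulting short file could then be passed to PySceneDetect in place of the full clip.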

I'm terribly sorry if I'm asking for the impossible. I know very little about programming as I am a video engineer, not a computer engineer. Thank you so much again for all of your work on this project. The passion of the open source community just makes my heart sing.
