Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cut video without final frame #166

Merged

Conversation

SpangleLabs
Copy link
Contributor

Did a bit of experimenting this evening and got it figured out. You need to put the start and end arguments after the input argument. Also specify a frame before the end frame.

I have also swapped from using duration, to specifying the end timecode.
And using timecode for start, rather than number of seconds.

@jelias
Copy link

jelias commented Jun 2, 2020

Just wanted to leave a note, this build seems to be running much slower than the other method and current build of scenedetect.

Might be worth testing further before merging...

@SpangleLabs
Copy link
Contributor Author

Ah, yes, I hadn't wholly considered that, but it will be the case. FFmpeg docs about -ss say:

-ss position (input/output)
When used as an input option (before -i), seeks in this input file to position. Note that in most formats it is not possible to seek exactly, so ffmpeg will seek to the closest seek point before position. When transcoding and -accurate_seek is enabled (the default), this extra segment between the seek point and position will be decoded and discarded. When doing stream copy or when -noaccurate_seek is used, it will be preserved.

When used as an output option (before an output url), decodes but discards input until the timestamps reach position.

So basically, it's going to be reading and transcoding the entire file, rather than seeking through the raw file and reading from a point near the start. On my inputs this didn't make much difference, but on a large file, it will.

It might depend on your ffmpeg version, also. According to the ffmpeg Seeking article, before version 2.1, using combined seeking might be faster (-ss before and after the input).

I might need to try some benchmarks

@Breakthrough
Copy link
Owner

Breakthrough commented Jun 23, 2020

Hey @joshcoales;

Any further results by any chance? If this truly is the only possible option for frame accurate cuts, I might have to add an override option for it. That would suck though.

Outside of that, I don't see a way of accomplishing this outside of using the ffmpeg API directly (which may be where this is heading...). Either way, will be looking into this some more next week for the next release of PySceneDetect.

Thank you again for your help with this!!

@Breakthrough
Copy link
Owner

Will also look into adding specific unit tests to cover these cases. Not sure how to handle setting up ffmpeg in a test environment yet, but this is something I would like to look into to prevent regressions.

@SpangleLabs
Copy link
Contributor Author

Oh whoops, I only read this via email, so didn't see the edit to your first comment there!

I've not had time to try some performance tests, and my home server has been struggling, so not very good for doing comparisons.
As far as I can tell, this is the only way to do it, as ffmpeg is doing approximate seeking if you specify the start time before the input file. A bit of a waste to specify start time to the frame, only to then specify approximately. It is a little slower, but it does at least work.

As for trying it on travis, there's a few threads on it, this looks simple enough: https://discourse.julialang.org/t/how-to-add-ffmpeg-to-travis/9366/3
But I'm not sure how you will test that it has cut correctly, automatically? But adding some integration tests to ensure the library interacts with ffmpeg correctly, would be handy (though, then there might be questions about supporting multiple ffmpeg versions?)

@Breakthrough
Copy link
Owner

Awesome, thanks for the resources!

And yes, would pretty much be an integration test that compares the frame PySceneDetect got for the first/last frame, and compare with some metric to the frame that ffmpeg actually cut at (first/last of the split video). Then could use something like SSIM or MSE to compare the two frames, ensuring that the cuts PySceneDetect detected line up exactly with those ffmpeg produced. This is made easier with some of the manually created test cases I have, or could create a bank of test cases for this purpose.

This won't prevent all off-by-one errors, but will prevent "frame bleed" of adjacent scenes into other cuts.

It just sucks that it seems like the documentation for ffmpeg implies it produces exact seeking, but in practice it's approximate, so your concerns about version are also valid. I'll also try to estimate the performance overhead of this approach to see if it's a valid concern or not.

@SpangleLabs
Copy link
Contributor Author

Ahh, true, that would work reasonably well as a test, my fear was that the first reported frame of the cut video would be the one you wanted, but of course not. I've had a long day :P

In my earlier comment, I quote the docs, which do say that if -ss is used before -i it uses inexact seeking.. though hmm... Yeah, it does say that it's only inexact if it is doing stream copy rather than transcoding, and that doing transcoding it'll seek exactly. Might be worth testing it while specifying an output encoding to ensure that it actually transcodes it, and isn't copying the stream, but the docs say it defaults to re-encoding. I'll give that a go now, and see if it works

@SpangleLabs
Copy link
Contributor Author

It's -c:a copy, in the arg override in video splitter. Swap it for -c:a aac and it doesn't mess up.

Urgh, did a bunch of tests. No idea why I didn't see that option last time.
Here's the full batch, Each entry in here is the command, then the timings, then the result
https://gist.github.com/joshcoales/eda4c931e09a9dda194ad52cf83fa9bd

Basically:

Broken:
ffmpeg -ss 42.543 -i input.mp4 ... -c:a copy -t 24.992 -sn output.mp4
Fixed, but slow:
ffmpeg -i input.mp4 -ss 42.543 ... -c:a copy -t 24.992 -sn output.mp4
Fixed, still fast:
ffmpeg -ss 42.543 -i input.mp4 ... -c:a aac -t 24.992 -sn output.mp4

Damn! Wish I had found this earlier, frustrating how that works..
I'll push another commit!

@SpangleLabs
Copy link
Contributor Author

Very confused now. Going from slow seek to fast seek, my ffmpeg cuts went from 1m52s to 0m37s, but the full scene splitting went from 13m46s to 20m37s (it is 10m57s on release version!)
Oh gosh, heck. I also have to swap back from -to to using -t! This has made it even slower now.

I feel real bad about not getting that solution earlier, and being so confident that the slowdown was required. Sorry about that.

Also it's nearly 2am now, whoops.
I'll swap back to -t and give it another go

@SpangleLabs
Copy link
Contributor Author

SpangleLabs commented Jun 27, 2020

Yup, that cuts correctly, and speed looks good again:

real 6m20.370s
user 11m10.933s
sys 0m8.718s

Sorry again that I got this wrong at first, made it much slower, was convinced there was no other way. Thanks for convincing me to re-read the docs with a critical eye.

Could probably remove the previous_frame() method I added, but it seems semi-handy, and I really should head to sleep!

EDIT: Though, it's not a huge speedup. For contet though, my input video is 5m44s long, and 143MB. If you're working on bigger files, doing more cuts later in the file, I would assume it would be faster?

@Breakthrough
Copy link
Owner

Breakthrough commented Jun 27, 2020

No need to apologize - thank you so much for your time and effort! Strange how copying the audio codec is what messes it up, but now that I think about it, I can kind of see why that might make a difference. Kind of sucks that it seems like the best way to accomplish what PySceneDetect here needs would be to use the C API directly, but I'll leave that as a project for another day. I think what you have here is more than sufficient to get PySceneDetect rolling!

I'll get this merged up next week after some more investigation and addition of unit/integration tests on my end. Thank you so much for finding out what was causing the issue here, honestly I never would have thought it had anything to do with the audio codec copy.

@Breakthrough Breakthrough self-assigned this Jun 27, 2020
@Breakthrough Breakthrough added this to the v0.6 milestone Jun 27, 2020
@Breakthrough Breakthrough linked an issue Jun 27, 2020 that may be closed by this pull request
@SpangleLabs
Copy link
Contributor Author

Yeah, it was a bit of a "Oh, I can see how that makes sense I guess" moment.
I'm not sure how much more you would gain by using the C api, I guess it could let you seek through the video once, transcoding to different files as you go?

I know in my own testing, I've tried doing multiple cuts in one ffmpeg command, but generally it seems to lock up my server something fierce. I've lost all access to it for days at a time before because it has been trying to cut some video into multiple scenes at once, so I've no idea what it's doing under the hood.

ffmpeg does seem to be a tricky beast, hence the audio copy issue, the slow vs fast seek, etc.. A powerful tool, but a tricky one

Copy link
Owner

@Breakthrough Breakthrough left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just one small comment, then will get this merged up!

@@ -227,15 +227,14 @@ def split_video_ffmpeg(input_video_paths, scene_list, output_file_template, vide
call_list += [
'-y',
'-ss',
str(start_time.get_seconds()),
str(start_time.get_timecode()),
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just curious, why did you use get_timecode here as opposed to get_seconds?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

49c3258 is the reason I ask.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting.. There wasn't any particular reason, I simply find it neater, I didn't notice any difference in my testing. I'll swap it back, now that I'm using duration as well, makes sense for both to be in seconds

'-i',
input_video_paths[0]]
input_video_paths[0],
'-t',
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it matter if -t is specified before or after the codec?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so, as it still needs to transcode through the video to this point, it can't do file-based seeking to get the end point like it does the start

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I misunderstood, hence talking of file-based seeking.

I don't think it matters if it's before or after arg_override, would you prefer it go after?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No preference, was just wondering if it was intentional or not. As-is is fine! :)

@Breakthrough Breakthrough merged commit ee38df6 into Breakthrough:master Jul 3, 2020
@Breakthrough
Copy link
Owner

Merged - thank you so so so so so so so so much for figuring this out!!!!! Definitely owe you a beer if you happen to make your way to the Toronto area :)

@SpangleLabs
Copy link
Contributor Author

Awesome! I'm glad I could contribute! Thanks again tonnes for this library, certainly helps a lot with my projects! Would love to visit Toronto sometime, my partner's been before and loved it, though that might need to wait quite a while before it's feasible :P

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Extra frames show up when using split-video with ffmpeg
3 participants