
Use ffmpeg scene detection to improve chunked encoding #619

Open
tkozybski opened this issue Mar 22, 2021 · 9 comments
@tkozybski
Contributor

By splitting the frames evenly between chunks, the chunks will start/end in the middle of a scene, lowering compression efficiency and/or quality. I propose adding functionality to detect scene changes and split the chunks based on that. See here or here for how to do this.

For aomenc, the first-pass stats file could be parsed to get the keyframes with 100% accuracy, thus in theory improving quality and parallelism at the same time (by not using multi-threading options and encoding in chunks instead). Av1an does this.

@tkozybski
Contributor Author

Wrong label...

@JJKylee
Contributor

JJKylee commented Mar 23, 2021

This is very similar to what I've been pondering these days. I was thinking about composing a StaxRip-embedded PowerShell script that generates an I-frame index list. I found an obvious downside to using ffprobe for this purpose - too long a processing time - so I turned to the DGIndexNV index file (dgi) instead to get the desired result. See here.

But using the scene option in ffmpeg as proposed by this post seems a better fit for more general use cases. Extracting the PTS time is not difficult. You just need to run the following command line with INPUT to get OUTPUT.txt containing the result.

ffmpeg -hide_banner -i INPUT -vf "select='gt(scene,0.4)',metadata=print:file='OUTPUT.txt'" -f null NUL

OUTPUT.txt looks like this:

frame:0    pts:9510    pts_time:9.51
lavfi.scene_score=0.557776
frame:1    pts:15016   pts_time:15.016
lavfi.scene_score=0.691152
frame:2    pts:20021   pts_time:20.021
lavfi.scene_score=0.690279
frame:3    pts:21522   pts_time:21.522
lavfi.scene_score=0.532986
frame:4    pts:23524   pts_time:23.524
lavfi.scene_score=0.537670
frame:5    pts:28529   pts_time:28.529
lavfi.scene_score=0.619934
...

What is difficult, though, is how to put the generated pts_time info to work in StaxRip. We need to strip the unnecessary parts and convert pts_time to a workable format like HH:MM:SS.nnn. But since there's already a tool that does this - PySceneDetect - maybe we'd better find a way to make use of it. As you may know, Av1an also utilizes this tool to get the cut info.
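As a minimal sketch (assuming the OUTPUT.txt format shown above; function names are my own, not StaxRip code), stripping the unnecessary parts and converting pts_time to HH:MM:SS.nnn could look like this:

```python
import re

def parse_scene_times(metadata_text):
    # Extract the pts_time values printed by ffmpeg's metadata=print filter.
    return [float(m) for m in re.findall(r"pts_time:([0-9.]+)", metadata_text)]

def to_timecode(seconds):
    # Convert a pts_time in seconds to HH:MM:SS.nnn.
    total_ms = round(seconds * 1000)
    h, rest = divmod(total_ms, 3_600_000)
    m, rest = divmod(rest, 60_000)
    s, ms = divmod(rest, 1000)
    return f"{h:02d}:{m:02d}:{s:02d}.{ms:03d}"

sample = """frame:0    pts:9510    pts_time:9.51
lavfi.scene_score=0.557776
frame:1    pts:15016   pts_time:15.016
lavfi.scene_score=0.691152"""

print([to_timecode(t) for t in parse_scene_times(sample)])
# prints ['00:00:09.510', '00:00:15.016']
```

The scene_score lines are skipped automatically because only `pts_time:` is matched, though a real implementation might also keep the scores for filtering.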

That said, another big hurdle remains. Currently StaxRip uses frame-number info (the total frame count divided evenly) and puts it directly into each encoder's parameters for chunk encoding. But to adopt this new tool, an overhaul of the code is inevitable, since every chunk would have to be cut via mkvextract or ffmpeg to match the cut timecodes, not frame numbers. I think this is really a big matter and will take a lot of time. Big food for thought. 🙄

Last but not least, there's a critical problem with this ffmpeg scene-option approach: it fails on some sources. For example, this Dolby Vision trailer - Chameleon.m2ts on Dolby Trailers - does not work well with this method, even after the m2ts file is remuxed to mkv. It yields this error message, and the OUTPUT.txt file is simply empty.

[hevc @ 00000153cd5aca00] Invalid NAL unit 36, skipping.

I don't know whether PySceneDetect is free of this kind of issue, but if not, it's not reliable enough for general use. That's a big hurdle. 🤔

@stax76
Contributor

stax76 commented Mar 24, 2021

I wonder if the index file created by ffms2 and L-Smash-Works contains info about I-frames (I guess so) and whether the format of the index file is easy to understand. It could be useful not only for chunk encoding, but also for cutting without re-encoding.

@JJKylee
Contributor

JJKylee commented Mar 24, 2021

@stax76, that’s right. I’m wondering if the authors are willing to change the format. Hmm...

@stax76
Contributor

stax76 commented Mar 24, 2021

Probably not. Vapoursynth is modern and powerful and generally has rich metadata support, so a source filter could provide this info so that it can be accessed with the vapoursynth API. Maybe it's already supported, or it could be requested from ffms2, l-smash and dgdecnv. But reading it from the index file would be significantly faster, since it would not require requesting all frames - maybe the index format isn't so complex.

@DJATOM

DJATOM commented Mar 24, 2021

From my experience: don't try to split and merge open-GOP HEVC streams; it will produce bad results.

@JJKylee
Contributor

JJKylee commented Mar 24, 2021

Yeah, especially in stream copy.
In that respect, an I-frame list or a scene-detected frame (timecode) list alone may raise an issue for stream-copy cutting with open-GOP stream structures like HEVC.

Since chunk encoding also involves stream-copy cutting (either by the encoder itself at frame indexes, or via mkvextract/ffmpeg for timecode-based cutting), it may raise an issue in the same vein. 🙄

So at this point another question comes up: can we extract only IDR frames that have good scene values? To do that, maybe we need another criterion that identifies whether a given frame is an IDR frame or not. Food for thought. 🤔
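One rough way to sketch that criterion (a hypothetical helper, not existing StaxRip code): obtain the keyframe timestamps separately, e.g. via ffprobe with `-skip_frame nokey`, then keep only the scene-change timestamps that coincide with a keyframe within some tolerance. Note that ffprobe's keyframe flag does not distinguish IDR from other HEVC random-access points, so this is only an approximation of "IDR with a good scene value":

```python
def snap_to_keyframes(scene_times, keyframe_times, tolerance=0.1):
    # Keep only scene-change timestamps that land on (or very near) a
    # keyframe; everything else is unsafe for stream-copy cutting.
    cuts = []
    for t in scene_times:
        nearest = min(keyframe_times, key=lambda k: abs(k - t))
        if abs(nearest - t) <= tolerance and nearest not in cuts:
            cuts.append(nearest)
    return cuts

scenes = [9.51, 15.016, 20.021, 21.522]   # from the scene filter
keyframes = [0.0, 9.51, 20.0, 30.0]       # from ffprobe, for example
print(snap_to_keyframes(scenes, keyframes))
# prints [9.51, 20.0]
```

The tolerance is a guess; with open GOPs it would probably need to be zero, so that cuts only ever happen exactly on a safe keyframe.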

@JJKylee
Contributor

JJKylee commented Mar 24, 2021

On second thought, frame-index cutting by the encoder may not be a problem.
Since the encoder receives decoded frames served by the frameserver via an avs/vpy script, it's not stream-copy cutting.

OTOH, cutting by mkvextract or ffmpeg does not involve any prior decoding, so it's basically stream-copy cutting.

Therefore it seems that timecode-based cutting for chunk encoding raises another issue in this regard. Hmm...

@Ding-adong

Any update on this?

Presently I use a roundabout way of chunking at scene changes:

  1. Vdub to get the precise frame number.
  2. StaxRip: cut the first half and create a job - filename1.
  3. Cut the second half and create a job - filename2.
  4. Start filename1.
  5. Open another instance of StaxRip.
  6. Start filename2.
  7. Merge filename1 and filename2.

I wonder if this could be processed automatically?
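For what it's worth, the manual steps above could in principle be scripted. A hedged sketch that only builds the three command lines (filenames, the cut timecode, and the encoder settings are placeholders I chose for illustration; since both halves are re-encoded rather than stream-copied, the open-GOP concern above doesn't apply here):

```python
def chunk_commands(src, cut, dst1="chunk1.mkv", dst2="chunk2.mkv", merged="merged.mkv"):
    # Build three command lines: encode the first half up to the cut
    # timecode, encode the second half from the cut (the two encodes
    # could run in parallel), then append the chunks with mkvmerge.
    enc = ["-c:v", "libx265", "-an"]  # placeholder encoder settings
    first = ["ffmpeg", "-y", "-i", src, "-to", cut, *enc, dst1]
    second = ["ffmpeg", "-y", "-ss", cut, "-i", src, *enc, dst2]
    merge = ["mkvmerge", "-o", merged, dst1, "+" + dst2]  # '+' appends
    return first, second, merge

for cmd in chunk_commands("INPUT.mkv", "00:10:25.500"):
    print(" ".join(cmd))
```

Each list could then be run with `subprocess.run(cmd)`, launching the two encodes concurrently before merging.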
