-
Notifications
You must be signed in to change notification settings - Fork 53
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Slow lossy encoding #308
Comments
Looking at each crf=32 video two differences stand out.
Any idea what's changed @galenlynch? The default encoding settings going from 12s to 68s is a bit strong, so good if we can avoid it v0.8.4
v0.9.0-DEV
|
oof that's unfortunate. I haven't noticed anything similar to this. I normally use lossy encoding. |
Oh I might have an idea of what's happening, though I haven't done much testing myself so it could be wrong. I haven't tracked it down, but I think this is related to #283. In your lossy example for both versions you didn't specify the color range of the video, so the ffmpeg default is to use the mpeg color range (limited) and not the jpeg (full). Previously, any input values outside of the limited range would simply be clipped (which I think is "wrong"). In the new version of the code, if you don't specify the input color range to VIdeoIO it will assume you're using the full color range, and rescale your inputs to the limited color range to avoid clipping. If you specify either Some simple testing suggests this is the case. Both of the following are on the new version:
Here I specify the output color_range to be 2 (aka jpeg, full) to avoid the automatic rescaling:
So I think this is the unfortunate consequence of not "damaging" the input during encoding more than you need to. You can always opt out by either using the expanded color range in the encoding, or telling VideoIO that your inputs are already in the limited color range (and if they're not that you're ok with just clipping the values outside of that range). A side note on performance: it's best to use |
Just to clarify, the new version will rescale only if there's a mismatch between the input color range (matrices that the user supplies) and the encoding |
There's clearly something wrong with the rescaling code's performance. The allocations are ridiculous. I can track down the problem: very likely type instability. |
Oh yeah, I'm remembering thinking about this when I wrote it. The way that gray values are currently rescaled has some inherent type instability in it. Fixing it won't be low-hanging fruit, unfortunately. The relevant function is here: Lines 153 to 159 in b6790fa
Right now the source and destination types for the rescaling function are inherently unstable since they're based on enums from ffmpeg. We could lock them in when the writer object is made to reduce the type instability, but that would require a bit of "doing," and reduce flexibility. Alternatively we could get rid of this whole gray-rescaling stuff and try to use ffmpeg's sws_scale for this. I was having a fair amount of difficulty getting sws_scale to accurately rescale monochrome input, so wrote this gray recsaling stuff out of frustration. I figured at the time that getting the "right" results was worth the slowdown, and didn't want to spend a lot of time making it hyper-optimized or coercing sws_scale to do what I wanted since I had already put so much time into the PR, and figured it could be left for later. Don't know what to do here. It would take some effort to make this faster, and I probably won't be able to get to it that soon. I personally still feel like the accuracy outweighs the slowdown, but it's a subjective tradeoff. |
Actually, it wouldn't be that hard to make the existing rescaling code that performant by locking in the types when the object is created, but I remember thinking it was a bad idea, though I can't remember why... |
Quick thought. Are there existing enums we can provide for |
That's down to the way ffmpeg is wrapped with Clang.jl... |
Also I don't think that would solve the type instability, even though it would be easier to use. At the end of the day the source and destination types are determined by looking at integers stored in FFMPEG's AVFrame structs, and seeing if it's a 1 or 2. |
Whoops, I got mixed up... the relevant enum for the source and destination types is not the |
One more thing... here |
Given the default for Lines 442 to 449 in 76a0f64
I think it's ok for us to switch to a default of Basically it would make VideoIO more focused on numerical accuracy, than perceptual compression efficiency. But that requires a little namedtuple/dict fanangaling as it's not simply a kwarg that we can set a default for. That would buy us time to optimize the scaling function, by making it not invoked by default/most use cases. |
A lot of video players don't accept jpeg color range videos. |
ok. I think a narrative of this for the changelog would be ok: The default encoder settings got a little (x%) slower, because we now make the assumption that 1) the input data is full color range, 2) you want a video that will play in most video players. Therefore a color space transform will take place to compress the full color range input, to the limited "mpeg" color space. You have a few options to get faster encoding:
As long as we can make the default only a little slower. By the way, do you use the julia slack? It might be good for us to chat more informally on there about this stuff |
I... might have an account? I'll try to dig it up. |
Guidance now added to CHANGELOG.md on how to handle/avoid this |
@galenlynch Am I doing something wrong, or do we need to look into this before v0.9?
Lossless goes from 4s to 2s 👍🏻
Default lossy goes from 12s to 68s 🤔, and file size is repeatably smaller.. is v0.9 trying harder, and some hidden setting is different?
For
imgstack = map(x-> rand(UInt8, 2048, 1536), 1:100)
v0.8.4
v0.9.0-dev
Note that I see the same when setting
encoder_private_options
insteadSide note.. I love how quiet v0.9 is
The text was updated successfully, but these errors were encountered: