Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow toggling on "frame perfect" mode for the tone mapping code #40

Closed
haasn opened this issue Sep 18, 2018 · 1 comment
Closed

Allow toggling on "frame perfect" mode for the tone mapping code #40

haasn opened this issue Sep 18, 2018 · 1 comment

Comments

@haasn
Copy link
Owner

haasn commented Sep 18, 2018

The current tone mapping / peak detection algorithm has its results delayed by a frame for performance/simplicity reasons. Often times, it's undesirable to have results delayed like this - e.g. when transcoding offline, or when used on a still image scenario. We should have an option to allow turning on "frame perfect" mode at the cost of performance.

Possible ways to achieve this:

  1. Duplicate the current shader up to that point and run it in a second pass to measure the peak (more compute time, less texture bandwidth/memory required)
  2. Finalize the current shader and render it out to a cache texture while measuring the peak, then dispatch a lightweight second pass that just samples from this cache (no redundant computations, more texture sampling/bandwidth/memory required)

In either case, the intermediate step would require dispatching some shader - so the use of a pl_dispatch becomes unavoidable. As such, the tone mapping shader would have to be split up into two passes, each of which gets called separately by the parent (e.g. pl_renderer).

In order to determine which strategy of the above two is best, we could support both and apply some heuristics in pl_renderer to guess whether or not the shader would be cheaper to re-execute or not, either based on the settings or based on some property of the pl_shader itself (e.g. how many texture samples have been recorded into it)

Or we could just agree on one of the two strategies and use that always.

@haasn
Copy link
Owner Author

haasn commented Sep 23, 2018

A downside to approach 2 also is that it forces use of FBOs even when not otherwise necessary, which could very much hurt e.g. mobile GPUs. So if anything, we would need either method 1 exclusively or method 1 and 2 dynamically.

@haasn haasn closed this as completed Dec 4, 2018
haasn added a commit that referenced this issue Jan 5, 2019
The previous approach of using an FIR with tunable hard threshold for
scene changes had several problems:

- the FIR involved annoying dynamic buffer sizes, high VRAM usage,
  and the FIR sum was prone to numerical overflow which limited the
  number of frames we could average over.

- the hard scene change detection was prone to both false positives and
  false negatives, each with their own (annoying) issues.

Scrap this entirely and switch to a dual approach of using a simple
single-pole IIR low pass filter to smooth out noise, while using a
softer scene change curve (with tunable low and high thresholds), based
on `smoothstep`. The IIR filter is extremely simple in its
implementation and has an arbitrarily user-tunable cutoff frequency,
while the smoothstep-based scene change curve provides a good, tunable
tradeoff between adaptation speed and stability - without exhibiting
either of the traditional issues associated with the hard cutoff.

Another way to think about the new options is that the "low threshold"
provides a margin of error within which we don't care about small
fluctuations in the scene (which will therefore be smoothed out by the
IIR filter).

While redesigning the algorithm, I also redesigned the API - so that
peak detection and tone mapping are separate, discrete steps that can be
done as part of two different shaders. (Or as part of the same shader)
This is required for #40, and in particular, means that issue is now
within reach.

cf. mpv-player/mpv#6415
haasn added a commit that referenced this issue Jan 5, 2019
This also allows us to finally separate peak detection from color
management. The current place in the code actually has almost no
drawbacks, since it's effectively free unless FBOs are disabled.

One annoying consequence is that this means we will now always perform
peak detection at the source resolution, even if the display is smaller.
In the relatively common case of 4K video on 1080p displays, this is a
performance regression. To fix it, we could try investigating whether to
do the analysis after up/downscaling, but then we have more special
cases to think about, so I think I'll live with the status quo for now.
Peak detection isn't the end of the world even at 4K.

Closes #40.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant