Possible RAVU improvement ideas #10

Closed
haasn opened this issue Aug 3, 2017 · 15 comments

Comments

@haasn
Collaborator

haasn commented Aug 3, 2017

To make sure they don't get lost to time:

  • Add naive anti-ringing. You could do something like bilinear or bicubic sampling along the “edge direction” in a line (or a small bundle of offset lines), gathering a few samples, and clamping your output pixel to this value range. Alternatively, you could try adapting the “in-place” anti-ringing filter from my antiring.hook and adding it as a separate post-RAVU pass, slightly adjusted to account for the fact that you introduce an offset. (A rough sketch of the clamping step follows after this list.)

  • Train kernels differently? Right now, you said you use a linear regression to combine multiple scaler kernels - but how do you actually choose the kernels to combine to begin with? Are they jinc functions? Or am I misunderstanding how the algorithm works?
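
For illustration, a minimal numpy sketch of the clamping step (ignoring RAVU's half-pixel offset and the directional line sampling described above; the 3x3 source neighborhood and the name antiring_clamp are just my stand-ins):

import numpy as np

def antiring_clamp(upscaled, source, scale=2):
    # Clamp each upscaled pixel into the value range of the nearby source
    # pixels (here a 3x3 neighborhood around the co-located source pixel),
    # so the prescaler cannot overshoot the local range.
    out = np.empty_like(upscaled)
    H, W = source.shape
    for y in range(upscaled.shape[0]):
        for x in range(upscaled.shape[1]):
            sy, sx = min(y // scale, H - 1), min(x // scale, W - 1)
            block = source[max(sy - 1, 0):sy + 2, max(sx - 1, 0):sx + 2]
            out[y, x] = np.clip(upscaled[y, x], block.min(), block.max())
    return out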

@bjin
Owner

bjin commented Aug 3, 2017

Add naive anti-ringing.

If naive anti-ringing is not done in-place, it can always be applied independently of the prescaler: save the texture before prescaling, then filter after prescaling with the offset taken into account. So I'm not in favor of integrating naive anti-ringing unless it can be proven to have a reasonably small quality loss (in PSNR/SSIM, for example). But I will at least try some in-place anti-ringing method, probably with the edge direction taken into account. I'm currently working on smoothtest2, which uses an even more aggressive smoothing function, and there are a few other ideas I would also like to try to reduce ringing more fundamentally.

Right now, you said you use a linear regression to combine multiple scaler kernels - but how do you actually choose the kernels to combine to begin with? Are they jinc functions? Or am I misunderstanding how the algorithm works?

That's not how RAVU works. No mathematical functions like jinc are used during training, just plain linear regression. Once the key is obtained from the local gradient, the sample is added to the corresponding bucket. For each bucket (associated with a particular key, i.e. an angle+strength+coherence combination), the least-squares method is used to solve the linear regression problem, which minimizes the mean squared error (and thus maximizes PSNR).
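
In pseudo-Python, that step looks roughly like this (a sketch only: the gradient analysis and quantization thresholds below are illustrative, not the exact ones used by the trainer):

import numpy as np

QUANT_ANGLE, QUANT_STRENGTH, QUANT_COHERENCE = 24, 9, 3

def key_from_patch(patch):
    # Derive an (angle, strength, coherence) key from the local gradient of a
    # 2D patch via eigenanalysis of the 2x2 gradient covariance.
    gy, gx = np.gradient(patch)
    a, b, d = np.sum(gx * gx), np.sum(gx * gy), np.sum(gy * gy)
    eigvals, eigvecs = np.linalg.eigh(np.array([[a, b], [b, d]]))
    l1, l2 = max(eigvals[1], 0.0), max(eigvals[0], 0.0)   # l1 >= l2 >= 0
    angle = np.arctan2(eigvecs[1, 1], eigvecs[0, 1]) % np.pi
    strength = np.sqrt(l1)
    coherence = (np.sqrt(l1) - np.sqrt(l2)) / (np.sqrt(l1) + np.sqrt(l2) + 1e-9)
    # illustrative quantization only
    return (int(angle / np.pi * QUANT_ANGLE) % QUANT_ANGLE,
            min(int(strength * QUANT_STRENGTH), QUANT_STRENGTH - 1),
            min(int(coherence * QUANT_COHERENCE), QUANT_COHERENCE - 1))

# Each (patch, target pixel) sample is then appended to
# buckets[key_from_patch(patch)], and every bucket gets its own
# least-squares solve.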

I mentioned combining different existing kernel functions only to demonstrate an idea for making RAVU work with arbitrary scaling factors. But I later realized that the model would only be trained to "reverse" a certain downscaling kernel, so please ignore it.

@haasn
Collaborator Author

haasn commented Aug 3, 2017

It doesn't make sense to use a linear regression for a convolution, surely? Why not fit a cubic function or something instead?

@bjin
Owner

bjin commented Aug 3, 2017

It doesn't make sense to use a linear regression for a convolution, surely? Why not fit a cubic function or something instead?

Do you mean a cubic function as the model (a weighted sum of x^3 over all samples x), or as the cost function for training?

I assume you mean that we need to use an alternative cost function, like a cubic function. Then we would have to use an algorithm like gradient descent to solve it. Doable, but it requires more effort and a much, much longer training time.

@haasn
Collaborator Author

haasn commented Aug 3, 2017

I think I'm completely lost. What exactly is linear? What do the weights look like? My point is that I'd expect the trained weights themselves to look like (appropriately shaped) jinc functions, but your description makes it seem like they're linear functions or something.

@bjin
Owner

bjin commented Aug 3, 2017

Okay, I will try to explain. Suppose we are interpolating from four (known) points p1,p2,p3,p4 with a convolution kernel w1,w2,w3,w4; the final result is p_predict = p1*w1 + p2*w2 + p3*w3 + p4*w4. There is also a p_real, which is the actual value from the training image. Now we collect a huge number of (p1,p2,p3,p4) and p_real pairs (samples). We want to solve a linear regression problem: find a solution (w1_min,w2_min,w3_min,w4_min) such that, over all samples, the sum of (p_real - p_predict)^2 is minimized.

The final answer (w1_min,w2_min,w3_min,w4_min) that minimizes the cost function also maximizes the overall PSNR. But those four values (the weight function) are not linear; it's the model (the convolution kernel, i.e. the way p_predict is calculated from the samples) that is linear.
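
For example, with made-up data (the names and numbers here are just for illustration):

import numpy as np

rng = np.random.default_rng(0)
P = rng.random((10000, 4))                   # rows of (p1, p2, p3, p4) samples
true_w = np.array([-0.1, 0.6, 0.6, -0.1])    # pretend "real" kernel
p_real = P @ true_w + 0.01 * rng.standard_normal(len(P))

# least squares: find (w1_min, ..., w4_min) minimizing sum (p_real - p_predict)^2
w_min, *_ = np.linalg.lstsq(P, p_real, rcond=None)
print(w_min)                                 # close to true_w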

@haasn
Collaborator Author

haasn commented Aug 3, 2017

Oh, I see now. Hmm, I still wonder if you could find a way to visualize the kernels for different keys. It would almost surely help with optimizing RAVU if we could get a visual understanding of what's going on.

@bjin
Owner

bjin commented Aug 23, 2017

I still wonder if you could find a way to visualize the kernels for different keys.

Here is the visualization of all ravu-r3 and ravu-r4 convolution kernels grouped by (angle, strength, coherence) keys.

Red indicates negative weights, blue indicates positive weights. Both are normalized and sigmoidized.

There are 24 columns, from angle=0 to angle=23, indicating the direction of the main gradient contributor.

There are 27 rows, divided into three row groups, from coherence=0 to coherence=2, indicating whether the strength of the second gradient contributor is similar to that of the main gradient contributor.

Each row group is divided into 9 rows, from strength=0 to strength=8, indicating the strength of the main gradient contributor.

//!DESC RAVU r3 visualizer
//!HOOK MAIN
//!BIND HOOKED
//!BIND ravu_lut3
//!WIDTH 144
//!HEIGHT 162

const int radius = 3;
const int quant_angle = 24;
const int quant_strength = 9;
const int quant_coherence = 3;

const int n = radius * 2;
const int width = quant_angle * n; // 144
const int height = quant_strength * quant_coherence * n; // 162

const vec4 red = vec4(1.0, 0.0, 0.0, 0.0);
const vec4 blue = vec4(0.0, 0.0, 1.0, 0.0);
const vec4 white = vec4(1.0, 1.0, 1.0, 0.0);

vec4 hook() {
    ivec2 pos = ivec2(floor(HOOKED_pos * vec2(float(width), float(height))));
    // decode which kernel this output pixel belongs to:
    // column selects angle, row selects (coherence, strength)
    int angle = pos.x / n;
    int coherence = pos.y / n / quant_strength;
    int strength = pos.y / n % quant_strength;
    // position inside the n*n kernel, flattened to index the packed LUT
    int id = (pos.x % n) * n + (pos.y % n);
    float w = texelFetch(ravu_lut3, ivec2(id / 4, (angle * quant_strength + strength) * quant_coherence + coherence), 0)[id % 4];
    w *= n * n;           // normalize: scale by the number of taps
    w = w / (1 + abs(w)); // sigmoidize into (-1, 1)
    if (w < 0) {
        w = -w;
        return mix(white, red, vec4(w));
    }
    return mix(white, blue, vec4(w));
}

[image: ravu-r3-vis]

//!DESC RAVU r4 visualizer
//!HOOK MAIN
//!BIND HOOKED
//!BIND ravu_lut4
//!WIDTH 192
//!HEIGHT 216

const int radius = 4;
const int quant_angle = 24;
const int quant_strength = 9;
const int quant_coherence = 3;

const int n = radius * 2;
const int width = quant_angle * n; // 192
const int height = quant_strength * quant_coherence * n; // 216

const vec4 red = vec4(1.0, 0.0, 0.0, 0.0);
const vec4 blue = vec4(0.0, 0.0, 1.0, 0.0);
const vec4 white = vec4(1.0, 1.0, 1.0, 0.0);

vec4 hook() {
    ivec2 pos = ivec2(floor(HOOKED_pos * vec2(float(width), float(height))));
    int angle = pos.x / n;
    int coherence = pos.y / n / quant_strength;
    int strength = pos.y / n % quant_strength;
    int id = (pos.x % n) * n + (pos.y % n);
    float w = texelFetch(ravu_lut4, ivec2(id / 4, (angle * quant_strength + strength) * quant_coherence + coherence), 0)[id % 4];
    w *= n * n;
    w = w / (1 + abs(w));
    if (w < 0) {
        w = -w;
        return mix(white, red, vec4(w));
    }
    return mix(white, blue, vec4(w));
}

[image: ravu-r4-vis]

EDIT: fix ravu-r3-vis.png

@haasn
Collaborator Author

haasn commented Aug 24, 2017

As I thought, we could almost certainly make use of mirror symmetry in this file, and probably also rotational symmetry within each “section”. So we could reduce the width to one fourth of what it is currently.

@haasn
Collaborator Author

haasn commented Aug 24, 2017

Also, what does it change if you apply a linear factor to the negative weights? (e.g. 0.5 * w)

@bjin
Owner

bjin commented Aug 24, 2017

So we could reduce the width to one fourth of what it is currently.

Yes, the training time and LUT size could be reduced by 75%. However, it won't make rendering significantly faster, since the number of LUT calls is not reduced. Actually, it would become slower, since we would need to rotate/flip the weight matrix after it is fetched from the LUT.

Also, what does it change if you apply a linear factor to the negative weights? (e.g. 0.5 * w)

It will make the result noticeably blurrier (assuming all weights are re-normalized after that). If we want to clamp the negative weights, the proper way would be to regularize them through the cost function, for example using MSE(predict_i - actual_i) + alpha*MSE(negative_weights) instead of MSE(predict_i - actual_i) alone. But then we would have to use a general optimization method like gradient descent.
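
Roughly like this (my own sketch, plain gradient descent on the penalized objective, not the actual trainer):

import numpy as np

def train_bucket(X, y, alpha=0.1, lr=1e-2, steps=20000):
    # X: (num_samples, n*n) flattened patches, y: (num_samples,) target pixels.
    # Minimizes MSE(X @ w - y) + alpha * MSE(negative part of w).
    w = np.linalg.lstsq(X, y, rcond=None)[0]           # start from the plain LS solution
    for _ in range(steps):
        grad = 2 * X.T @ (X @ w - y) / len(y)          # gradient of the data term
        grad += 2 * alpha * np.minimum(w, 0) / w.size  # gradient of the penalty term
        w -= lr * grad
    return w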

@haasn
Collaborator Author

haasn commented Aug 24, 2017

Yes, the training time and LUT size could be reduced by 75%.

Well, you could still train on the “reduced” image set and then rotate it when generating the weight texture. That way you get more samples and fewer equations to train, which would in theory lead to a better result with fewer images, since the “identical” kernels will all share weights.

@bjin
Owner

bjin commented Aug 24, 2017

Well, you could still train on the “reduced” image set and then rotate it when generating the weight texture.

I did that already. With rotation and flipping we have 7 times more samples.
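
For reference, the augmentation is just the 8 dihedral orientations of each sample (a quick sketch; the target/offset geometry has to be transformed consistently, which is omitted here):

import numpy as np

def dihedral_variants(patch):
    # All 8 orientations of a square patch: 4 rotations, each optionally flipped.
    variants = []
    for k in range(4):
        r = np.rot90(patch, k)
        variants.append(r)
        variants.append(np.fliplr(r))
    return variants  # the original plus 7 extra samples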

@bjin
Owner

bjin commented Sep 25, 2017

Closing, since anti-ringing is solved to some degree by smoothtest.

@bjin bjin closed this as completed Sep 25, 2017
@haasn
Collaborator Author

haasn commented Sep 5, 2019

However, it won't make rendering significantly faster since the number of LUT calls is not reduced.

Rereading this and related issues to understand RAVU again, this comment stood out to me. Even if the number of LUT calls doesn't change, making the texture smaller improves cache locality, which can make those LUT calls significantly faster.

Also, we don't need to extend this with complicated flipping logic: if we can arrange for the texture to be laid out with perfect horizontal/vertical symmetry, we can simply use the "mirrored repeat" sampling mode, which reflects/mirrors any out-of-bounds texture read.

@bjin
Owner

bjin commented Sep 6, 2019

Currently, the weight texture isn't arranged exactly the way the visualization pictures above show. Instead, one dimension is the kernel ID (with size 24*9*3), and the other dimension is the location within the kernel (with size 4*radius*radius). Simple "mirrored repeat" probably won't work.

However, the weight texture was indeed reduced to half of its size some time ago, utilizing the symmetry. (Yes, the visualization shaders above won't work with the current weight texture.) At the time, though, this change was meant to reduce the shader file size.
