Better scale quality? #780
The export scaling in Gyroflow is Lanczos4 already.
Here is a video comparison. YouTube degrades video quality, but the difference is still visible.
Added the selector in 7b48ed3, in Export settings -> Advanced.
The selector is great, but nothing has changed :(
Yeah, that's why I think there's something else going on. Please do more testing:
Short results: all of the different export codecs (PNG, ProRes, H.265, etc.) at 720p show the stair-stepped edges.
Yep. It also produces the stair-stepped edges.
Can you send me the sample file?
I hope the link works. In the video, the effect is very noticeable at the beginning, on the bridge lines and fence lines.
Meanwhile, I've tested the mobile version of Gyroflow, to rule out any influence of my particular PC hardware on the result.
I have the same problem with GoPro videos. It looks like no filter is used for scaling.
Are you using a high enough FOV?
Ok I see it
We should look into changing our scaling code to be based on Pillow's: https://github.dev/zurutech/pillow-resize/blob/main/src/PillowResize/PillowResize.cc
/bounty $300 |
I've made some progress. I re-implemented Pillow's algorithm in Rust in a minimal example, comparing OpenCV's implementation (the code currently in Gyroflow) and Pillow's implementation. Pillow's version resizes images better; however, it's structured so that it calculates the coefficients up front for resizing from input to output. This is a problem, because in Gyroflow we need to feed input coordinates to the resampling function, since they will be rotated (stabilized), so it's different from simple resizing (where coordinates map in a simple linear way from input to output). Example project: resizing.zip

I'm reducing the bounty to $150 since most of the work is done; we just need to figure out a way to precompute the coefficients for that use case. The tricky thing is that it depends on the scale (which ideally should be per-frame, because of dynamic zoom).
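For context, here's a minimal sketch (my own names and simplifications, not Gyroflow's or Pillow's actual code) of the kind of one-dimensional coefficient precomputation Pillow does for plain resizing. Note that the whole table depends only on the input and output sizes, which is exactly what breaks once every output coordinate goes through a per-pixel rotation:

```rust
// Normalized sinc, the building block of the Lanczos kernel.
fn sinc(x: f64) -> f64 {
    if x == 0.0 {
        1.0
    } else {
        let px = std::f64::consts::PI * x;
        px.sin() / px
    }
}

// Lanczos-3 kernel: sinc(x) * sinc(x / 3) on [-3, 3), zero elsewhere.
fn lanczos3(x: f64) -> f64 {
    if (-3.0..3.0).contains(&x) { sinc(x) * sinc(x / 3.0) } else { 0.0 }
}

/// For each output pixel along one axis, precompute the first input index
/// and the normalized kernel weights, exactly once per axis.
fn precompute_coeffs(in_size: usize, out_size: usize) -> Vec<(usize, Vec<f64>)> {
    let scale = in_size as f64 / out_size as f64;
    // When downscaling, the kernel support grows with the scale factor;
    // when upscaling it stays at the base support of 3.
    let filter_scale = scale.max(1.0);
    let support = 3.0 * filter_scale;
    let mut coeffs = Vec::with_capacity(out_size);
    for out_x in 0..out_size {
        let center = (out_x as f64 + 0.5) * scale;
        let xmin = (center - support).floor().max(0.0) as usize;
        let xmax = ((center + support).ceil() as usize).min(in_size);
        let mut w: Vec<f64> = (xmin..xmax)
            .map(|x| lanczos3((x as f64 - center + 0.5) / filter_scale))
            .collect();
        // Normalize so the weights sum to 1 (preserves brightness).
        let total: f64 = w.iter().sum();
        if total != 0.0 {
            for v in w.iter_mut() {
                *v /= total;
            }
        }
        coeffs.push((xmin, w));
    }
    coeffs
}
```

Because the table is indexed purely by the output coordinate, a resize reuses one row of it for every scanline; the rotated/stabilized case gets no such reuse.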
Not sure if this would work, but what if a few pixels (say four in a square) at the center of the output image are sampled to get the input pixel coordinates (for determining the scaling factor), which is then used to precompute the coefficients for the whole image? Or is that also too slow?
I implemented it without precomputed coefficients in order to test, and the rendered video is now much nicer; check out the files here: https://drive.google.com/drive/folders/1brliOo0b4RLHOKbhraUBvIyRtsMIS-uj?usp=sharing

However, this improves things only when downscaling (massive improvement); for regular videos (where there is mostly slight upscaling) the results are about the same. This kind of makes sense, because the main difference between these implementations is that the sampling area is scaled up when resizing down, but it's never scaled down when resizing up, so the case where we upscale the video should be pretty much equivalent between the current implementation and this one.

This implementation is 2-3x slower to render.
This is a good idea, but I think this new implementation only makes sense if we use the fov (which changes per frame because of dynamic zoom), and precomputing coefficients for every frame might take too much memory (and transferring it to the GPU would be costly). It would have to be benchmarked, though.
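To put rough numbers on the memory concern, here is a back-of-envelope sketch (my own assumptions: Lanczos-3, separable f32 weights stored per output pixel, start indices not counted); it is not a measurement of Gyroflow:

```rust
/// Rough per-frame storage estimate for resampling weights when every
/// output pixel needs its own separable weight set (because each pixel
/// maps through a different rotation, nothing can be shared per row).
fn per_pixel_coeff_bytes(width: usize, height: usize, scale: f64) -> usize {
    // Lanczos-3 support grows with the downscale factor: 2 * 3 * scale taps per axis.
    let taps = (2.0 * 3.0 * scale.max(1.0)).ceil() as usize;
    // One weight list per axis (x and y), 4 bytes per f32 weight.
    width * height * 2 * taps * 4
}
```

For a 720p output downscaled from 4K (scale ≈ 3) this gives 1280 × 720 × 2 × 18 × 4 ≈ 132 MB per frame, which is why benchmarking (and probably recomputing on the GPU instead of transferring) seems warranted.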
Indeed.
For reference, here's the WGSL implementation if anyone wants to play with it:

```wgsl
fn bilinear_filter(x_: f32) -> f32 {
    let x = abs(x_);
    if x < 1.0 {
        return 1.0 - x;
    } else {
        return 0.0;
    }
}

fn hamming_filter(x_: f32) -> f32 {
    var x = abs(x_);
    if x == 0.0 {
        return 1.0;
    } else if x >= 1.0 {
        return 0.0;
    } else {
        x = x * 3.14159265359;
        return (sin(x) / x) * (0.54 + 0.46 * cos(x));
    }
}

fn bicubic_filter(x_: f32) -> f32 {
    let x = abs(x_);
    let A: f32 = -0.5;
    if x < 1.0 {
        return ((A + 2.0) * x - (A + 3.0)) * x * x + 1.0;
    } else if x < 2.0 {
        return (((x - 5.0) * x + 8.0) * x - 4.0) * A;
    } else {
        return 0.0;
    }
}

fn sinc_filter(x: f32) -> f32 {
    if x == 0.0 {
        return 1.0;
    } else {
        let xx = x * 3.14159265359;
        return sin(xx) / xx;
    }
}

fn lanczos_filter(x: f32) -> f32 {
    if x >= -3.0 && x < 3.0 {
        return sinc_filter(x) * sinc_filter(x / 3.0);
    } else {
        return 0.0;
    }
}

fn sample_input_at2(uv_param: vec2<f32>) -> vec4<f32> {
    let filter_support = 3.0;
    let scale = min(params.fov, 10.0);
    let filter_scale = max(scale, 1.0);
    let support = filter_support * filter_scale;
    let ss = 1.0 / filter_scale;
    var kx = array<f32, 64>();
    var ky = array<f32, 64>();
    let fix_range = bool(flags & 1);
    let bg = params.background * params.max_pixel_value;
    var sum = vec4<f32>(0.0);
    var uv = uv_param;
    if (params.input_rotation != 0.0) {
        uv = rotate_point(uv, params.input_rotation * (3.14159265359 / 180.0), vec2<f32>(f32(params.width) / 2.0, f32(params.height) / 2.0));
    }
    if (bool(flags & 32)) { // Uses source rect
        uv = vec2<f32>(
            map_coord(uv.x, 0.0, f32(params.width), f32(params.source_rect.x), f32(params.source_rect.x + params.source_rect.z)),
            map_coord(uv.y, 0.0, f32(params.height), f32(params.source_rect.y), f32(params.source_rect.y + params.source_rect.w))
        );
    }
    // Horizontal kernel weights
    let xcenter = uv.x + 0.5 * scale;
    let xmin = i32(floor(max(xcenter - support, 0.0)));
    let xmax = max(i32(ceil(min(xcenter + support, f32(params.width)))) - xmin, 0);
    var xw = 0.0;
    for (var x: i32 = 0; x < xmax; x = x + 1) {
        let f: f32 = (f32(x) + f32(xmin) - xcenter + 0.5) * ss;
        kx[x] = lanczos_filter(f);
        xw += kx[x];
    }
    if (xw != 0.0) { for (var x: i32 = 0; x < xmax; x = x + 1) { kx[x] /= xw; } }
    // Vertical kernel weights
    let ycenter = uv.y + 0.5 * scale;
    let ymin = i32(floor(max(ycenter - support, 0.0)));
    let ymax = max(i32(ceil(min(ycenter + support, f32(params.height)))) - ymin, 0);
    var yw = 0.0;
    for (var y: i32 = 0; y < ymax; y = y + 1) {
        let f: f32 = (f32(y) + f32(ymin) - ycenter + 0.5) * ss;
        ky[y] = lanczos_filter(f);
        yw += ky[y];
    }
    if (yw != 0.0) { for (var y: i32 = 0; y < ymax; y = y + 1) { ky[y] /= yw; } }
    // Weighted sum over the sampling window
    let sx = xmin;
    let sy = ymin;
    for (var yp: i32 = 0; yp < ymax; yp = yp + 1) {
        if (sy + yp >= params.source_rect.y && sy + yp < params.source_rect.y + params.source_rect.w) {
            var xsum = vec4<f32>(0.0, 0.0, 0.0, 0.0);
            for (var xp: i32 = 0; xp < xmax; xp = xp + 1) {
                var pixel: vec4<f32>;
                if (sx + xp >= params.source_rect.x && sx + xp < params.source_rect.x + params.source_rect.z) {
                    pixel = read_input_at(vec2<i32>(sx + xp, sy + yp));
                    pixel = draw_pixel(pixel, u32(sx + xp), u32(sy + yp), true);
                    if (fix_range) {
                        pixel = remap_colorrange(pixel, bytes_per_pixel == 1);
                    }
                } else {
                    pixel = bg;
                }
                xsum = xsum + (pixel * kx[xp]);
            }
            sum = sum + xsum * ky[yp];
        } else {
            sum = sum + bg * ky[yp];
        }
    }
    return vec4<f32>(
        min(sum.x, params.pixel_value_limit),
        min(sum.y, params.pixel_value_limit),
        min(sum.z, params.pixel_value_limit),
        min(sum.w, params.pixel_value_limit)
    );
}
```

It replaces the existing sampling function.
I tried implementing the same algorithm as is used in ImageMagick's distort operator: elliptical weighted average (EWA) with cubic BC filtering. This is just a test to evaluate performance; it does not do any real distortion, it only applies a given affine transformation: https://github.com/VladimirP1/gpu-warp. To use this in Gyroflow we'd have to calculate affine approximations of the transformation at each pixel of the undistorted image, which is not much harder than computing the transformation itself. A test transformation: 4000x3000 -> downscale by 2.2 onto a 1920x1080 canvas + 0.1 rad of rotation:
And it seems that there are some bugs left in my implementation.
Hmm, looks interesting. Would it be possible to implement some custom warping? Like, you know, lens distortion:
That's not related; it's a different issue.
Progress so far: I am using numeric differentiation now (so basically running the distort transformation three times per output pixel instead of once). It is possible to do it in one step, but that would require adding Jacobian calculation to the lens models.
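As a standalone sketch of that numeric-differentiation idea (my own illustration, not the gpu-warp code): evaluating the warp at a pixel and at its two one-pixel axis neighbours gives a forward-difference approximation of the local 2x2 Jacobian, i.e. the per-pixel affine approximation that EWA needs:

```rust
/// Forward-difference approximation of the 2x2 Jacobian of a 2-D warp,
/// using three warp evaluations per output pixel: the pixel itself plus
/// one-pixel steps in x and y.
fn local_jacobian<F>(warp: F, x: f64, y: f64) -> [[f64; 2]; 2]
where
    F: Fn(f64, f64) -> (f64, f64),
{
    let (u0, v0) = warp(x, y);
    let (ux, vx) = warp(x + 1.0, y); // step in x
    let (uy, vy) = warp(x, y + 1.0); // step in y
    // Row 0 holds du/dx and du/dy; row 1 holds dv/dx and dv/dy.
    [[ux - u0, uy - u0], [vx - v0, vy - v0]]
}
```

For a purely affine warp this approximation is exact; for a real lens model it is only first-order accurate, which is why the one-evaluation alternative would require analytic Jacobians in each lens model, as noted above.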
Is there an existing feature request for this?
Description
Good evening. Me again.
Is it possible to implement a better scale quality option?
Typical case: I have a 4K video, open it in Gyroflow, apply stabilization, and export to a 720p ProRes proxy for fast editing. This 720p video looks very "aliased", like in games when you turn anti-aliasing completely off.
For example, this picture. 4K video.
The upper one was exported from Gyroflow in 720p.
The lower one was exported from Gyroflow in native 4K and downscaled in the video editor, using its standard scale processing ("bilinear", or something similar).
We can clearly see the difference. The post (the tall, thin white metal thing) in the center: the upper one has the typical stair-stepped edges. The same stair-stepped edges are visible on the wires, and on the vertical lines of the beige building on the right. Meanwhile, in the lower picture all these lines are softer and smoother.
And in a moving video it is much more visible than in a static picture.
So I think a better scale quality option would be great, at least for exporting videos.
I guess bilinear would work fine. I also use "Lanczos" in XnView for scaling pictures; it gives good quality.