Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

What could be causing this -- ffmpeg related, but also zimg, I believe -- ? #177

Closed
bokeron2020 opened this issue Jun 14, 2022 · 13 comments
Closed

Comments

@bokeron2020
Copy link

Hi.
I'm asking it here, too, because I believe it may be related to some recent change in zimg that has cascaded down to ffmpeg and collided with my customized Wondows 10 x64 install.

This is my issue open at one of the current publishers of compiled ffmpeg binaries for windows. I'll post it here so I don't paste the whole thing.

GyanD/codexffmpeg#57

In short, it seems [to me] that recent changes in zimg now require something to be available in Windows that it didn't require before. So, using zscale for resizing image in ffmpeg throws an error in my customized OS install, an error that never appeared before... and what's weirder to me, it only happns to a limited set of files wich might be malformed... but the error doesn't happen in a vanilla Windows 10 install.

I hope this makes some sense, and I also hope is a matter than can be adressed here. I'm just a user without knowledge or experience to know how to deal with this.

@sekrit-twc
Copy link
Owner

	if (state.active_width <= 0 || state.active_height <= 0)
		error::throw_<error::InvalidImageSize>("active window must be positive");

Your image has negative dimensions.

@sekrit-twc
Copy link
Owner

FFMPEG started setting this field in FFmpeg/FFmpeg@d0aefc3 . There is probably a bug in the calculation.

@bokeron2020
Copy link
Author

bokeron2020 commented Jun 14, 2022

Your image has negative dimensions

What does this mean in layman terms for an image/video?

FFMPEG started setting this field in FFmpeg/FFmpeg@d0aefc3 . There is probably a bug in the calculation.

What's weird to me is latest version ffmpeg-zscale works OK with a vanilla Windows 10 x64 install but on my slimmed down install is showing this error.

Therefore I assume something I deleted or disabled on my install is also interfering but I can't imagine what does zimg/ffmpeg now needs from Windows that didn't need before [because previous version worked just fine].

Could you please explain a little bit about if zimg is depending on some protocol/library/whatever it expects to find in a Windows install in your latest version that didn't required before?

Thanks for taking the time to answer!

PS: I don't know programming but browsing commits to vf_zscale.c there's this commit made on a date which I suspect marked the beginning of the problem for me.
FFmpeg/FFmpeg@52a14b8

I know I'm asking too much but could you please take a quick look at so maybe you can see something causing the problem I'm having?

@sekrit-twc
Copy link
Owner

sekrit-twc commented Jun 15, 2022

Your image has negative dimensions

What does this mean in layman terms for an image/video?

It means that FFMPEG has a bug.

FFMPEG started setting this field in FFmpeg/FFmpeg@d0aefc3 . There is probably a bug in the calculation.

What's weird to me is latest version ffmpeg-zscale works OK with a vanilla Windows 10 x64 install but on my slimmed down install is showing this error.

Therefore I assume something I deleted or disabled on my install is also interfering but I can't imagine what does zimg/ffmpeg now needs from Windows that didn't need before [because previous version worked just fine].

Could you please explain a little bit about if zimg is depending on some protocol/library/whatever it expects to find in a Windows install in your latest version that didn't required before?

It could be the result of a Windows API returning a different value that is consumed by FFMPEG in a buggy way.

Thanks for taking the time to answer!

PS: I don't know programming but browsing commits to vf_zscale.c there's this commit made on a date which I suspect marked the beginning of the problem for me. FFmpeg/FFmpeg@52a14b8

I know I'm asking too much but could you please take a quick look at so maybe you can see something causing the problem I'm having?

I see that this commit introduces a call to av_rescale_rnd with nb_threads as input. Perhaps there is a difference in the number of threads that reveals an incorrect rounding calculation. Could it be that the errors occur when processing videos with very low vertical resolution (height) while running on a CPU with many threads?

@bokeron2020
Copy link
Author

Could it be that the errors occur when processing videos with very low vertical resolution (height) while running on a CPU with many threads?

Hmmm... not the files themselves, but I've observed and tested a weird behaviour that looks related to this!

I used ffmpeg to generate a 4K test file so I wasn't getting weird results because of a malformed file.
I've found that, applying only the zscale filter and any video codec, say x265, to process the test file, I'm reproducing the error consistently and in a particular number succession.

I'm using an AMD 5900x, 12 cores, 24 threads
I've tested resizing the 3840x2160 file following this steps:

  • First I tested scaling only the horizontal dimension 3840...1920... 1280... 99... 5321... any number. No problem at all.
  • Then I did the same with vertical dimension, keeping horizontal dimension at 3840... and I started finding errors [zscale error code 2050] on a few numbers, and error plus ffmpeg.exe crash in a range of numbers.
  • I've taken a couple snapshots where I try to resume what I've found and what numbers trigger what result. I hope is clear enough to be understood.

zscale_1

zscale_2

@sekrit-twc
Copy link
Owner

Please report these findings to FFMPEG. Thanks.

@richardpl
Copy link

FFMPEG started setting this field in FFmpeg/FFmpeg@d0aefc3 . There is probably a bug in the calculation.

Calculation is fine. Look at changes after that, latest version correctly use doubles for active region calculation.
Should zimg compare with 0.0 and not 0 in that code linked above?

@bokeron2020
Copy link
Author

@bokeron2020 used a virtual machine to check whether it is windows problem

Well, until a few days ago ffmpeg was a magic black box to me [still is], so my tests just tried to limit possible sources of problems... though not very well thought out, as you just pointed. I wasn't even considering cores to be a factor until recently.

Anyways...

I've made more tests, and once I realized CPU cores were factor, I made sure VMware Workstation was using 2 cores, so ffmpeg only detected 2 logical cores.
But I've also tested on my main machine but limiting threads ffmpg uses to 2.

I still need to test assigning more cores to VM and using vanilla Windows 10 to see what happens then, but it's being a PITA setting VM like that.

Results so far seem to indicate [at least my modified Windows combined with] 'high-ish' CPU cores/threads have a problem with changes made to ffmpeg after May 12, 2022.

Many-threads + specific target numbers chosen for height in zscale=size filter >> ERROR
Two threads + 'Custom' Win OS >> NO ERROR, no matter what target size I set for zscale=size

I've explained it better, or so I hope, at ffmpeg trac.

@sekrit-twc
Copy link
Owner

sekrit-twc commented Jun 16, 2022

I had the time to review vf_zscale.c in some more detail, and the problem is obvious:

static void slice_params(ZScaleContext *s, int out_h, int in_h)
{
    int slice_size;

    slice_size = (out_h + s->nb_threads - 1) / s->nb_threads;
    if (slice_size % 2)
        slice_size += 1;
    s->out_slice_start[0] = 0;
    s->out_slice_end[0] = FFMIN(out_h, slice_size);
    for (int i = 1; i < s->nb_threads - 1; i++) {
        s->out_slice_start[i] = s->out_slice_end[i-1];
        s->out_slice_end[i] = s->out_slice_start[i] + slice_size;
    }

    if (s->nb_threads > 1) {
        s->out_slice_start[s->nb_threads - 1] = s->out_slice_end[s->nb_threads - 2];
        s->out_slice_end[s->nb_threads - 1] = out_h;
    }

    for (int i = 0; i < s->nb_threads; i++) {
        s->in_slice_start[i] = s->out_slice_start[i] * in_h / (double)out_h;
        s->in_slice_end[i]   = s->out_slice_end[i]   * in_h / (double)out_h;
    }
}

Consider out_h=505, nb_threads=24:

    slice_size = (out_h + s->nb_threads - 1) / s->nb_threads;  //< slice_slice = 22

    s->out_slice_start[0] = 0;
    s->out_slice_end[0] = FFMIN(out_h, slice_size); //< out_slice_end[0] = 22
    for (int i = 1; i < s->nb_threads - 1; i++) {
        s->out_slice_start[i] = s->out_slice_end[i-1];
        s->out_slice_end[i] = s->out_slice_start[i] + slice_size; //< out_slice_end[22] = 22 + 22 * 22 = 506
    }

    if (s->nb_threads > 1) {
        s->out_slice_start[s->nb_threads - 1] = s->out_slice_end[s->nb_threads - 2]; //< out_slice_start[23] = 506
        s->out_slice_end[s->nb_threads - 1] = out_h; //< out_slice_end[23] = 505
    }

The code crashes on some resolutions, because out_slice_end[N-2] is beyond the end of the image. Error 2050 is because out_slice_start[N-1] = out_slice_end[N-2], which is beyond the end of the image.

Beside the logic error in this code, the benefit of running a filter with support=4*2160/505=17 (bicubic) in 21-line chunks is rather dubious. It appears that there was a check at some point to ensure the number of rows be at least 64, but it was deleted.

@sekrit-twc
Copy link
Owner

sekrit-twc commented Jun 16, 2022

Problem is just being hidden... Consider out_h=4158, nb_threads=64

slice_size = (out_h + (s->nb_threads / 2)) / s->nb_threads;  // slice_size = (4158+ 32) / 64 = 65
if (slice_size % 2)
    slice_size += 1;  // slice_size = 66
s->out_slice_start[0] = 0;
s->out_slice_end[0] = FFMIN(out_h, slice_size);  // out_slice_end[0] = 66
for (int i = 1; i < s->nb_threads - 1; i++) {
    s->out_slice_start[i] = s->out_slice_end[i-1];
    s->out_slice_end[i] = s->out_slice_start[i] + slice_size; // out_slice_end[62] = 66 + 62 * 66 = 4158
}

if (s->nb_threads > 1) {
    s->out_slice_start[s->nb_threads - 1] = s->out_slice_end[s->nb_threads - 2]; // out_slice_start[63] = 4158
    s->out_slice_end[s->nb_threads - 1] = out_h; // out_slice_end[63] = 4158
}

Without the statement if (slice_size % 2) slice_size += 1, then the code is safe, but only because MAX_THREADS=64. It fails again at out_h=8385, nb_threads=130.

BtbN pushed a commit to FFmpeg/FFmpeg that referenced this issue Jul 10, 2022
Do not insist on a fixed slice height, because that can still cause overflows
in corner cases as described in this comment:

sekrit-twc/zimg#177 (comment)

Signed-off-by: Marton Balint <cus@passwd.hu>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants
@richardpl @sekrit-twc @bokeron2020 and others