You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I'm a FreeBSD developer working on maintaining our copy of clang and llvm in the FreeBSD base system. While doing a test build run of all ports, the following libjpeg-turbo test failure was reported to me:
It might seem partially related, since it only occurs when compiling with gcc 4.9.1, and specifying -march=haswell. However, the original reporter also mentions it occurred on a 32-bit system.
In any case, after some analysis (using the Undefined Behavior Sanitizer, among others) it turns out that the quantize() function in jcdctmgr.c uses undefined behavior in some cases.
In particular, it sometimes tries to right-shift a 32-bit unsigned integer by 65551 bits, which is obviously illegal. For example, from a test run with UBSan enabled:
./cjpeg -sample 2x2 -quality 100 -dct fast -prog -outfile testout_420_q100_ifast_prog.jpg ./testimages/testorig.ppm
jcdctmgr.c:415:15: runtime error: shift exponent 65551 is too large for 32-bit type 'UDCTELEM2' (aka 'unsigned int')
jcdctmgr.c:410:15: runtime error: shift exponent 65551 is too large for 32-bit type 'UDCTELEM2' (aka 'unsigned int')
Initially the problem is triggered by the assignment on line 405, because the particular 'divisors' element chosen is -1 (!) at that point:
Since shift is an unsigned short (in case of this build), it gets assigned 65535. Then, on lines 410 and 415, this causes the right shift of product by 65535 + 2 * 8 = 65551 bits.
Of course, the next question is: how does one of the shift elements of divisors[] become -1? Then we end up in compute_reciprocal(), which does not seem to handle a divisor argument of 1 very well:
171 LOCAL(int)
172 compute_reciprocal (UINT16 divisor, DCTELEM * dtbl)
173 {
174 UDCTELEM2 fq, fr;
175 UDCTELEM c;
176 int b, r;
177
178 b = flss(divisor) - 1;
... if 'divisor' is 1, 'b' becomes 0 ...
179 r = sizeof(DCTELEM) * 8 + b;
... r becomes 16 ...
180
181 fq = ((UDCTELEM2)1 << r) / divisor;
182 fr = ((UDCTELEM2)1 << r) % divisor;
183
184 c = divisor / 2; /* for rounding */
185
186 if (fr == 0) { /* divisor is power of two */
187 /* fq will be one bit too large to fit in DCTELEM, so adjust */
188 fq >>= 1;
189 r--;
... since 'fr' will be 0, here 'r' becomes 15 ...
190 } else if (fr <= (divisor / 2U)) { /* fractional part is < 0.5 */
191 c++;
192 } else { /* fractional part is > 0.5 */
193 fq++;
194 }
195
196 dtbl[DCTSIZE2 * 0] = (DCTELEM) fq; /* reciprocal */
197 dtbl[DCTSIZE2 * 1] = (DCTELEM) c; /* correction + roundfactor */
198 dtbl[DCTSIZE2 * 2] = (DCTELEM) (1 << (sizeof(DCTELEM)*8*2 - r)); /* scale */
199 dtbl[DCTSIZE2 * 3] = (DCTELEM) r - sizeof(DCTELEM)*8; /* shift */
... and finally, this 'dtbl' element becomes -1.
The call to compute_reciprocal() with a divisor argument of 1 is done from start_pass_dctmgr(), in the JDCT_IFAST case:
At this point, qtbl->quantval[i] is 1, which appears to be legal, and aanscales[i] is 2446, resulting in a divisor of 1.
So it looks to me like a divisor of 1 is legal, but how should this be handled properly in compute_reciprocal()? The resulting "real" reciprocal should be 1, obviously, but in terms of the function's variables, they should likely be:
b = 0
r = (word size) [16, in our case]
f = 2^(word size) [65536, in our case]
Unfortunately, with DCTELEM being a 16-bit short, the resulting f value does not fit! So how to solve this conundrum? I am unsure at this point, and any advice is welcome. :-)
The text was updated successfully, but these errors were encountered:
This is partly related to an old issue (from 4 years ago.) Basically, it was discovered that, when using the IFAST forward DCT and quality >= 98, a similar overflow could occur because the x86 SIMD routines are using 16-bit reciprocal values. The solution to that was to detect the overflow in compute_reciprocal and, if it occurs, drop back to using the non-SIMD quantization routine. This slows performance considerably, which is why there's a draconian warning in README-turbo.txt that says basically "don't use IFAST with quality >= 98." Even when it's working properly, there are other degenerate behaviors with the IFAST algorithm (including the C version) with quality >= 98. The compression ratio is as much as 15% worse for those quality levels than when using ISLOW.
As you discovered, compute_reciprocal() isn't producing correct values when divisor==1, which is apparently a rare enough situation that it hasn't caused major problems so far. 4cfa3f4 should fix this issue-- it makes compute_reciprocal() generate special values for the divisor==1 case, and those values make the C quantization function behave like an identity function.
I'm a FreeBSD developer working on maintaining our copy of clang and llvm in the FreeBSD base system. While doing a test build run of all ports, the following libjpeg-turbo test failure was reported to me:
http://package22.nyi.freebsd.org/data/headamd64PR201377-default/2015-08-20_16h45m13s/logs/errors/libjpeg-turbo-1.4.0.log
This failure occurs on FreeBSD/amd64, not on FreeBSD/i386:
Searching the internet for similar test failures, I found this particular ticket, from February 2015:
http://sourceforge.net/p/libjpeg-turbo/bugs/85/
It might seem partially related, since it only occurs when compiling with gcc 4.9.1, and specifying
-march=haswell. However, the original reporter also mentions it occurred on a 32-bit system.In any case, after some analysis (using the Undefined Behavior Sanitizer, among others) it turns out that the
quantize()function injcdctmgr.cuses undefined behavior in some cases.In particular, it sometimes tries to right-shift a 32-bit unsigned integer by 65551 bits, which is obviously illegal. For example, from a test run with UBSan enabled:
Initially the problem is triggered by the assignment on line 405, because the particular 'divisors' element chosen is -1 (!) at that point:
Since
shiftis an unsigned short (in case of this build), it gets assigned 65535. Then, on lines 410 and 415, this causes the right shift ofproductby 65535 + 2 * 8 = 65551 bits.Of course, the next question is: how does one of the shift elements of
divisors[]become -1? Then we end up incompute_reciprocal(), which does not seem to handle adivisorargument of 1 very well:The call to
compute_reciprocal()with adivisorargument of 1 is done fromstart_pass_dctmgr(), in theJDCT_IFASTcase:At this point,
qtbl->quantval[i]is 1, which appears to be legal, andaanscales[i]is 2446, resulting in a divisor of 1.So it looks to me like a divisor of 1 is legal, but how should this be handled properly in compute_reciprocal()? The resulting "real" reciprocal should be 1, obviously, but in terms of the function's variables, they should likely be:
Unfortunately, with
DCTELEMbeing a 16-bit short, the resultingfvalue does not fit! So how to solve this conundrum? I am unsure at this point, and any advice is welcome. :-)The text was updated successfully, but these errors were encountered: