Unify kroundup macros and fix gcc warnings #1051

daviesrob · 2020-03-17T15:08:54Z

Since 29c294e, which fixed overflow bugs in kroundup32() and kroundup_size_t() among other things, samtools and bcftools have been getting gcc warnings like htslib/kstring.h:163:16: error: argument 1 value ‘18446744073709551615’ exceeds maximum object size 9223372036854775807 [-Werror=alloc-size-larger-than=]. These apparently appear due to jump threading optimisations which generate code paths for the case when kroundup avoids the overflow and then warns that realloc() has been called on the result. (See https://developers.redhat.com/blog/2019/03/13/understanding-gcc-warnings-part-2/ for an explanation.)

This is my best attempt to fix this. It replaces several kroundup32 and kroundup_size_t macro definitions with a new unified implementation. The new one will work correctly for all integer types (both signed and unsigned) up to 64 bits. It also avoids conditionals with large numbers in the macro definition to work around the jump threading problem. This mostly works - with this change all of the warnings go from bcftools and only one is left in samtools (where a signed value is rounded up and then multiplied by two in the realloc statement, so probably actually legitimate).

It's been tested on 32-bit and 64-bit builds, and works on both (barring the issue with samtools).

Includes extra tests to ensure the new kroundup works as desired.

Replaces #1048, which failed on 32-bit.

jkbonfield · 2020-03-17T15:32:23Z

What does all this ghastly code end up compiling to? This is heavily inlined. Will it be oveflowing the sizes of what is permissible for inline with the extra logic, or does the compiler manage to realise it's mostly const and just replace it with 0 or 1?

jkbonfield · 2020-03-17T17:21:43Z

Looking at godbolt it appears that a function doing:

size_t kround(size_t size)
{
       return kroundup_size_old(size);
}

adds 4 instructions for gcc9.2 (x86_64) and a whopping 24 instructions if using clang, compared to my branch. That may have implications for inlining.

I'm wondering if we can just do something to do disable the bogus warning, or completely ditch this and only round up if values are < some fixed size. The case the compilers are spotting are only for sizes which wouldn't actually work anyway, so failing to round up and just allocate what the user asked for isn't an issue as rounding up or otherwise it will result in a memory allocation failure. The difference is asking for 9 yotta bytes vs 18 I think :-)

My PR already had checks to stop it rounding up if it got too big, but apparently that still triggered the compiler warnings. Was that because the use of INT_MAX is still too large? I haven't managed to reproduce any of these warnings, even using an -m32 capable compiler.

Do you know which piece of code gave the warning originally? (Edit: no matter - it's bam_fastq.c:28. I'll try pasting that into godbolt to explore different compiler/arch.)

jkbonfield · 2020-03-17T17:45:51Z

The same macro (32 bit version at least) turns up in hts.h, khash.h, kseq.h, rbuf.h, bgzf.c and faidx.c also. Blergh.

I guess we could ignore the original problem which was memory failure when you overflow and round down to zero, becoming a free statement instead, but that feels like we're just ignoring known bugs. Still we're no worse off at least as they're already there.

However in every single use case it is prior to a realloc (occasionally 2 of the same size) or a malloc, with very variable degrees of bounds and overflow checking. It's not even appropriate to put size checking in kroundup either as sometimes we then take that rounded up value and do things like this:

            kroundup32(heap->mdat);                                     \
            heap->dat = (kheap_t*)realloc(heap->dat, heap->mdat*sizeof(kheap_t));

What this makes me think is that kroundup is a total red herring and what we want is simply a krealloc function which automatically does more some rounding up and has uniformly good overflow checks.

However, in the interests of expediency, I'm inclined to go with this as a temporary workaround for the over-zealous warnings reported by some gcc/system combos. Comments?

daviesrob · 2020-03-17T17:58:35Z

Did you remember to use -O2 in Godbolt, and change the language to 'C'? I get about 4 extra lines on clang 9.0.0 compared to #1048 (also checkout the ARM64 gcc 8.2 version - it's tiny!)

The same macro (32 bit version at least) turns up in hts.h, khash.h, kseq.h, rbuf.h, bgzf.c and faidx.c also. Blergh.

The version in rbuf.h would be an issue for bcftools. I think I found all of them in HTSlib.

It's not even appropriate to put size checking in kroundup either as sometimes we then take that rounded up value and do things like this (multiply by another value without checking for overflow)

Yes, we do need to sort out the places where the rounded-up value gets multiplied by something, although most are likely harmless because the array in question is growing slowly and you run out of memory before it gets the chance to overflow. Some of these (e.g. kstring) might work better with a slower growth factor. I think others (e.g. khash) rely on the size being a power of two so it's possible to use a bit-mask to restrict values to the size of the allocated array.

jmarshall · 2020-03-17T22:18:57Z

However in every single use case it is prior to a realloc (occasionally 2 of the same size) or a malloc, with very variable degrees of bounds and overflow checking.

Yep and for this reason IMHO there's little point in inlining the rounding up — you're about to do a relatively expensive memory allocation anyway. In the case of ks_resize() it's worth keeping the s->m >= size fast case inline, but the rest could be via a function call. See for example 04b71e6 — which suffices to avoid the warnings in samtools, and with the roundup now inside libhts.so the embiggening algorithm can be tweaked with impunity.

…which is basically equivalent to James's krealloc concept:

What this makes me think is that kroundup is a total red herring and what we want is simply a krealloc function which automatically does more some rounding up and has uniformly good overflow checks.

(IMHO the kroundupXYZ macro has always basically been a premature optimisation: it's written obscurely and tightly so that it can be a macro that's inlined all over the place, but there's really no reason to do that.)

Replace several kroundup32 and kroundup_size_t macro definitions with a new unified implementation. The new one will work correctly for all integer types (both signed and unsigned) up to 64 bits. An attempt to round up that would overflow results in the maximum value that can be stored in the given type. The implementation avoids conditionals involving large values in an attempt to fix some gcc warnings in samtools and bcftools. They appeared after commit 29c294e which fixed a bug where kroundup would wrap around to zero on large values. The warnings are caused by jump threading optimisations that create code paths for the large value cases, which gcc then warns about. See: https://developers.redhat.com/blog/2019/03/13/understanding-gcc-warnings-part-2/ Includes extra tests to ensure the new kroundup works as desired.

daviesrob · 2020-03-18T09:39:38Z

@jmarshall Yes, I've also considered moving most of ks_resize() out of the inline function. Let's go for it 😄

It doesn't make that much difference to the size of the binary:

File	Before	After
libhts.a	8388028	8349306
libhts.so	4440834	4403764

but checking with -Winline shows that quite a few missed inlines are fixed:

Before:

    148 htslib/kstring.h warning: inlining failed in call to 'kputc': --param inline-unit-growth limit reached [-Winline]
     70 htslib/kstring.h warning: inlining failed in call to 'kputsn': --param inline-unit-growth limit reached [-Winline]
     48 htslib/kstring.h warning: inlining failed in call to 'kputw': --param inline-unit-growth limit reached [-Winline]
     42 htslib/kstring.h warning: inlining failed in call to 'kputc_': --param inline-unit-growth limit reached [-Winline]
     28 htslib/kstring.h warning: inlining failed in call to 'kputsn_': --param inline-unit-growth limit reached [-Winline]
      6 htslib/kstring.h warning: inlining failed in call to 'kputs': --param inline-unit-growth limit reached [-Winline]
      6 htslib/kstring.h warning: inlining failed in call to 'kputll': --param inline-unit-growth limit reached [-Winline]
      4 htslib/kstring.h warning: inlining failed in call to 'kputuw': --param inline-unit-growth limit reached [-Winline]
      2 htslib/kstring.h warning: inlining failed in call to 'kputsn': call is unlikely and code size would grow [-Winline]

After:

     99 htslib/kstring.h warning: inlining failed in call to 'kputc': --param inline-unit-growth limit reached [-Winline]
     50 htslib/kstring.h warning: inlining failed in call to 'kputw': --param inline-unit-growth limit reached [-Winline]
     30 htslib/kstring.h warning: inlining failed in call to 'kputsn': --param inline-unit-growth limit reached [-Winline]
      9 htslib/kstring.h warning: inlining failed in call to 'kputc_': --param inline-unit-growth limit reached [-Winline]
      8 htslib/kstring.h warning: inlining failed in call to 'kputsn_': --param inline-unit-growth limit reached [-Winline]
      2 htslib/kstring.h warning: inlining failed in call to 'kputsn': call is unlikely and code size would grow [-Winline]

jkbonfield · 2020-03-19T09:41:47Z

Looks like we've reached a good consensus here. I'm happy, but will do a quick bit of compilation and testing and hopefully merge soon.

PS. Yes, I forgot to turn on -O2 in godbolt. Oops. I see what you mean about arm too. Nice instruction set!

jkbonfield · 2020-03-19T10:18:26Z

Merged, although the samtools cut_target.c:66 warning is still to be resolved. I suspect something like this would work:

diff --git a/cut_target.c b/cut_target.c
index e59f51b..8eae0d1 100644
--- a/cut_target.c
+++ b/cut_target.c
@@ -61,8 +61,8 @@ static uint16_t gencns(ct_t *g, int n, const bam_pileup1_t *plp)
     int i, j, ret, tmp, k, sum[4], qual;
     float q[16];
     if (n > g->max_bases) { // enlarge g->bases
-        g->max_bases = n;
-        kroundup32(g->max_bases);
+        g->max_bases = n*2; // prevent overflow in kroundup32
+        kroundup32(g->max_bases); g->max_bases /= 2;
         g->bases = realloc(g->bases, g->max_bases * 2);
     }
     for (i = k = 0; i < n; ++i) {

Although ideally we'd have kmalloc and krealloc and it'd just work without horrid shenanigans. Patch now to stop compilation errors and we can get more time to come up with a better solution.

daviesrob · 2020-03-19T10:26:05Z

Thanks. Easiest solution for cut_target.c for now is to simply cast to size_t in the realloc:

        g->bases = realloc(g->bases, (size_t) g->max_bases * 2);

As it goes from signed to unsigned, it always fits.

jkbonfield · 2020-03-19T10:27:52Z

Good point. I'm happy for you to make that 1 line tweak without PR if you wish provided it doesn't trigger any warnings. Looks like you've verified that already.

daviesrob · 2020-03-19T10:58:30Z

I've pushed the fix to samtools.

daviesrob and others added 2 commits March 18, 2020 09:11

Move the bulk of ks_resize() to a non-inlined helper function

04b71e6

daviesrob force-pushed the kroundup branch from 8f03b41 to 04b71e6 Compare March 18, 2020 09:40

jkbonfield merged commit 04b71e6 into samtools:develop Mar 19, 2020

jkbonfield mentioned this pull request Mar 19, 2020

Removed gcc9 warning in kroundup. #1048

Closed

daviesrob deleted the kroundup branch March 19, 2020 10:42

daviesrob mentioned this pull request Jul 29, 2020

kstring.h ks_free() use of free() crashes alternative compilers #1114

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Unify kroundup macros and fix gcc warnings #1051

Unify kroundup macros and fix gcc warnings #1051

daviesrob commented Mar 17, 2020

jkbonfield commented Mar 17, 2020

jkbonfield commented Mar 17, 2020 •

edited

jkbonfield commented Mar 17, 2020

daviesrob commented Mar 17, 2020

jmarshall commented Mar 17, 2020 •

edited

daviesrob commented Mar 18, 2020

jkbonfield commented Mar 19, 2020

jkbonfield commented Mar 19, 2020

daviesrob commented Mar 19, 2020

jkbonfield commented Mar 19, 2020

daviesrob commented Mar 19, 2020

Unify kroundup macros and fix gcc warnings #1051

Unify kroundup macros and fix gcc warnings #1051

Conversation

daviesrob commented Mar 17, 2020

jkbonfield commented Mar 17, 2020

jkbonfield commented Mar 17, 2020 • edited

jkbonfield commented Mar 17, 2020

daviesrob commented Mar 17, 2020

jmarshall commented Mar 17, 2020 • edited

daviesrob commented Mar 18, 2020

jkbonfield commented Mar 19, 2020

jkbonfield commented Mar 19, 2020

daviesrob commented Mar 19, 2020

jkbonfield commented Mar 19, 2020

daviesrob commented Mar 19, 2020

jkbonfield commented Mar 17, 2020 •

edited

jmarshall commented Mar 17, 2020 •

edited