When SIMH is compiled with UBSan and ASan enabled in a recent Clang, many, many undefined behavior bugs get smoked out. I have now lost count of how many I've fixed in my own ZIMH fork. Essentially none of these are harmless.
To reproduce this, try building with a recent Clang with AddressSanitizer and UBSan enabled through, say,
    -fsanitize=address,undefined,local-bounds
plus
    -fno-sanitize-recover=all
so that the first UBSan report aborts the run instead of being printed and ignored.
Runtime options should stop on the first report, symbolize stacks, enable leak detection when supported by the selected Clang runtime, and turn on strict AddressSanitizer checks.
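A minimal sketch of one way to set those runtime options, assuming you would rather bake them into the binary than export ASAN_OPTIONS and UBSAN_OPTIONS every time. The sanitizer runtimes call __asan_default_options() and __ubsan_default_options() at startup if the program defines them. The particular option strings below are an illustrative pick, not a list taken from any SIMH tree, and environment variables still override them.

```c
/* Default sanitizer runtime options, compiled into the program.
   The option selection here is an assumption made for illustration. */

/* Read by the AddressSanitizer runtime at startup.
   detect_leaks=1 only works where LeakSanitizer is supported by the
   selected Clang runtime; drop it on platforms where it is not. */
const char *__asan_default_options(void)
{
    return "halt_on_error=1"                  /* stop on the first report */
           ":symbolize=1"                     /* readable stack traces */
           ":detect_leaks=1"                  /* leak check at exit */
           ":strict_string_checks=1"          /* stricter str* argument checks */
           ":detect_stack_use_after_return=1";
}

/* Read by the UndefinedBehaviorSanitizer runtime at startup. */
const char *__ubsan_default_options(void)
{
    return "halt_on_error=1"                  /* stop on the first report */
           ":print_stacktrace=1";             /* show where the UB happened */
}
```

Without the sanitizers linked in these are just two unused functions, so the file can stay in the build unconditionally.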
Then try running things. The normal SIMH tests are a good start, but actually booting operating systems and running them is important if you want to get a sense of scale.
You will hit one thing after another, everything from out-of-bounds memory reads and writes to signed arithmetic problems caused by the mistaken belief that in C an int and an unsigned int are basically the same (no, they are not) and that thus int is a fine type for representing uninterpreted machine words and addresses (no, it is not). There are also a bunch of memory leaks, though these are less dangerous. I haven't yet hit fun things like addresses of stack variables outliving the call to the function that owns them, but I haven't been that thorough about code coverage yet.
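To make the int versus unsigned int point concrete, here is a small sketch of the kind of rewrite involved. The helper names are hypothetical and not taken from the SIMH sources; the point is only that simulated machine words belong in fixed-width unsigned types, where wraparound and shifts into the top bit are defined.

```c
#include <stdint.h>

/* Hypothetical helpers for a simulated 32-bit machine. */

/* Add two machine words with wraparound. Unsigned arithmetic wraps
   modulo 2^32 by definition; the same sum computed in a 32-bit int
   overflows, which is undefined behavior that UBSan reports. */
static uint32_t word_add(uint32_t a, uint32_t b)
{
    return a + b;
}

/* Produce the sign bit of a 32-bit word. Writing 1 << 31 shifts into
   the sign bit of an int, which is undefined in C; shifting an
   unsigned 32-bit constant is fine. */
static uint32_t word_sign_bit(void)
{
    return UINT32_C(1) << 31;
}

/* Sign-extend the low 16 bits of a word to a 32-bit bit pattern
   using only unsigned operations, so no signed overflow or
   negative shift operand is ever formed. */
static uint32_t word_sext16(uint32_t w)
{
    uint32_t low = w & UINT32_C(0xFFFF);
    return (low ^ UINT32_C(0x8000)) - UINT32_C(0x8000);
}
```

For example, word_add(0xFFFFFFFF, 1) wraps to 0 and word_sext16(0x8000) gives 0xFFFF8000, with nothing for the sanitizers to complain about.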
Fixing all the bugs needed to get a VAX simulator successfully booting NetBSD with these options on is especially entertaining; the VAX simulators have tons of UB in them.
And please don't blame the messenger, meaning the tools that find such bugs. They're good tools, and they're valuable to run on any project written in C. Claims to the effect that undefined behavior is the fault of the standards committee or evil compiler authors or the Illuminati, and that it's therefore fine to ignore it, are wrong, btw, though they always eventually appear in such conversations.