|
| 1 | +--- |
| 2 | +layout: post |
| 3 | +title: "Why C is a dangerous programming language" |
| 4 | +categories: |
| 5 | +- programming |
| 6 | +--- |
| 7 | + |
| 8 | +In this blog post, I describe one particular aspect of the C language that can lead to unpleasant surprises if you’re not careful. |
| 9 | + |
| 10 | +I was writing some code [for the Perlang standard library](https://github.com/perlang-org/perlang/issues/406) which implements printing of floating-point numbers, using the freely available [double-conversion library](https://github.com/google/double-conversion) from Google. To be able to write tests for this, I wrote a wrapper function which overrides `puts()` with some custom logic. The function looks like this; |
| 11 | + |
| 12 | +```c |
| 13 | +const char *captured_output = NULL; |
| 14 | + |
| 15 | +extern "C" void __real_puts(const char *s); |
| 16 | + |
| 17 | +extern "C" void __wrap_puts(const char *s) { |
| 18 | + if (puts_mocked) { |
| 19 | + captured_output = s; |
| 20 | + } |
| 21 | + else { |
| 22 | + __real_puts(s); |
| 23 | + } |
| 24 | +} |
| 25 | +``` |
| 26 | +
|
| 27 | +Then, in my calling code, I would do something like this (very simplified example with all error handling removed): |
| 28 | +
|
| 29 | +```c++ |
| 30 | +const int PRINT_BUFFER_SIZE = 100; |
| 31 | +
|
| 32 | +void print(double d) |
| 33 | +{ |
| 34 | + char buffer_container[PRINT_BUFFER_SIZE]; |
| 35 | + double_conversion::Vector<char> buffer(buffer_container, PRINT_BUFFER_SIZE); |
| 36 | + int length; |
| 37 | + int point; |
| 38 | + bool status = FastDtoa(d, double_conversion::FAST_DTOA_SHORTEST, 0, buffer, &length, &point); |
| 39 | +
|
| 40 | + char formatted_number[length + 1]; |
| 41 | + int i = 0; |
| 42 | +
|
| 43 | + while (i < length) { |
| 44 | + // TODO: Add decimal point at the right place |
| 45 | + formatted_number[i] = buffer[i]; |
| 46 | + i++; |
| 47 | + } |
| 48 | +
|
| 49 | + formatted_number[length] = '\0'; |
| 50 | +
|
| 51 | + puts(formatted_number); |
| 52 | +} |
| 53 | +``` |
| 54 | + |
| 55 | +This worked quite fine, in the sense that the expected value (`12345`) was being printed to the screen. |
| 56 | + |
| 57 | +There was just one problem: the assertion in my test was failing miserably. Here's the assertion (`CHECK_EQ` is an assertion method defined in the `double-conversion` library mentioned above): |
| 58 | + |
| 59 | +```c |
| 60 | +TEST(PrintDouble) { |
| 61 | + puts_mocked = true; |
| 62 | + perlang::print(123.45); |
| 63 | + puts_mocked = false; |
| 64 | + |
| 65 | + // This is the failing assertion |
| 66 | + CHECK_EQ("123.45", captured_output); |
| 67 | +} |
| 68 | +``` |
| 69 | +
|
| 70 | +The output from running the tests looked like this: |
| 71 | +
|
| 72 | +``` |
| 73 | +/home/per/git/perlang/src/stdlib/cmake-build-debug/cctest print |
| 74 | +/home/per/git/perlang/src/stdlib/test/print.cc:27: |
| 75 | + CHECK_EQ("123.45", captured_output) failed |
| 76 | +# Expected: 123.45 |
| 77 | +# Found: ␛ |
| 78 | +Signal: SIGABRT (Aborted) |
| 79 | +``` |
| 80 | +
|
| 81 | +What?? Why?! I don’t understand anything. I left the project at the end of the day with a feeling that not much was working as intended... |
| 82 | +
|
| 83 | +Then, the morning after when I was still in bed, I realized what was probably happening. It was a typical “C mistake”. |
| 84 | +
|
| 85 | +Remember my `__wrap_puts` method above? Take a close look at it again. It takes a parameter and saves the value of this parameter for later use. Then look at my `print()` method above: it declares a **stack variable**, builds up the correct string inside this variable, then calls `puts()`... and then returns. |
| 86 | +
|
| 87 | +I've highlighted "stack variable" here, because this is precisely the problem here. _Pointers to the stack cannot be saved like this_, it's doomed to fail. Unfortunately, the stack will be overwritten between the `puts()` call and the method which checks the `captured_output` content. |
| 88 | +
|
| 89 | +### The solution |
| 90 | +
|
| 91 | +```c |
| 92 | +extern "C" void __wrap_puts(const char* s) |
| 93 | +{ |
| 94 | + if (puts_mocked) { |
| 95 | + // Note: calling this multiple times will both leak memory and overwrite the captured |
| 96 | + // output from previous calls. Don't use this in the real world without adding suitable |
| 97 | + // free() calls. |
| 98 | + captured_output = strdup(s); // <-- calling strdup(s) here is the fix |
| 99 | + } |
| 100 | + else { |
| 101 | + __real_puts(s); |
| 102 | + } |
| 103 | +} |
| 104 | +``` |
| 105 | + |
| 106 | +By duplicating the string, which can be either located on the stack or on the heap, we are safe, and the assertion will now fail as expected instead: |
| 107 | + |
| 108 | +``` |
| 109 | +/home/per/git/perlang/src/stdlib/cmake-build-debug/cctest print |
| 110 | +/home/per/git/perlang/src/stdlib/test/print.cc:27: |
| 111 | + CHECK_EQ("123.45", captured_output) failed |
| 112 | +# Expected: 123.45 |
| 113 | +# Found: 12345 |
| 114 | +Signal: SIGABRT (Aborted) |
| 115 | +``` |
| 116 | + |
| 117 | +(The reason it fails is because the implementation is not yet complete; it doesn't add the decimal point at the right place. This is fine and a perfectly "normal" software bug. Figuring out why a garbage value was being printed is a much more "annoying" kind of bug to me.) |
| 118 | + |
| 119 | +### Conclusion |
| 120 | + |
| 121 | +> "C makes it easy to shoot yourself in the foot; C++ makes it harder, but when you do it blows your whole leg off" _Bjarne Stroustrup_ |
| 122 | +
|
| 123 | +With great power comes great responsibility. The fact that you _can_ easily allocate objects on the stack is actually a really nice and useful thing, and one of the things that has historically given (and perhaps still gives) C and C++ an advantage for writing performance-critical code. To be able to completely eliminate memory allocation and deallocation for short-lived objects is simply very nice. But: like many other advanced features in C (like pointer arithmetics...), it comes at a price. And the real nasty part is that there are very few "safety belts" in C. So arguably, the greatest strengths of the C language are also its greatest weaknesses. |
| 124 | + |
| 125 | +<sub>The careful reader might be wanting to point out that my example code above is technically not C but C++ - I know. Traditional pointers (e.g. `const char *`) are much more "C-style" than "C++-style" though. In C++, references are typically favoured over plain-old pointers. </sub> |
0 commit comments