Skip to content

Commit 029b5ac

Browse files
committed
en: New post (2023-08-19-why-c-is-a-dangerous-programming-language)
1 parent 81254d8 commit 029b5ac

File tree

1 file changed

+125
-0
lines changed

1 file changed

+125
-0
lines changed
Lines changed: 125 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,125 @@
1+
---
2+
layout: post
3+
title: "Why C is a dangerous programming language"
4+
categories:
5+
- programming
6+
---
7+
8+
In this blog post, I describe one particular aspect of the C language that can lead to unpleasant surprises if you’re not careful.
9+
10+
I was writing some code [for the Perlang standard library](https://github.com/perlang-org/perlang/issues/406) which implements printing of floating-point numbers, using the freely available [double-conversion library](https://github.com/google/double-conversion) from Google. To be able to write tests for this, I wrote a wrapper function which overrides `puts()` with some custom logic. The function looks like this;
11+
12+
```c
13+
const char *captured_output = NULL;
14+
15+
extern "C" void __real_puts(const char *s);
16+
17+
extern "C" void __wrap_puts(const char *s) {
18+
if (puts_mocked) {
19+
captured_output = s;
20+
}
21+
else {
22+
__real_puts(s);
23+
}
24+
}
25+
```
26+
27+
Then, in my calling code, I would do something like this (very simplified example with all error handling removed):
28+
29+
```c++
30+
const int PRINT_BUFFER_SIZE = 100;
31+
32+
void print(double d)
33+
{
34+
char buffer_container[PRINT_BUFFER_SIZE];
35+
double_conversion::Vector<char> buffer(buffer_container, PRINT_BUFFER_SIZE);
36+
int length;
37+
int point;
38+
bool status = FastDtoa(d, double_conversion::FAST_DTOA_SHORTEST, 0, buffer, &length, &point);
39+
40+
char formatted_number[length + 1];
41+
int i = 0;
42+
43+
while (i < length) {
44+
// TODO: Add decimal point at the right place
45+
formatted_number[i] = buffer[i];
46+
i++;
47+
}
48+
49+
formatted_number[length] = '\0';
50+
51+
puts(formatted_number);
52+
}
53+
```
54+
55+
This worked quite fine, in the sense that the expected value (`12345`) was being printed to the screen.
56+
57+
There was just one problem: the assertion in my test was failing miserably. Here's the assertion (`CHECK_EQ` is an assertion method defined in the `double-conversion` library mentioned above):
58+
59+
```c
60+
TEST(PrintDouble) {
61+
puts_mocked = true;
62+
perlang::print(123.45);
63+
puts_mocked = false;
64+
65+
// This is the failing assertion
66+
CHECK_EQ("123.45", captured_output);
67+
}
68+
```
69+
70+
The output from running the tests looked like this:
71+
72+
```
73+
/home/per/git/perlang/src/stdlib/cmake-build-debug/cctest print
74+
/home/per/git/perlang/src/stdlib/test/print.cc:27:
75+
CHECK_EQ("123.45", captured_output) failed
76+
# Expected: 123.45
77+
# Found: ␛
78+
Signal: SIGABRT (Aborted)
79+
```
80+
81+
What?? Why?! I don’t understand anything. I left the project at the end of the day with a feeling that not much was working as intended...
82+
83+
Then, the morning after when I was still in bed, I realized what was probably happening. It was a typical “C mistake”.
84+
85+
Remember my `__wrap_puts` method above? Take a close look at it again. It takes a parameter and saves the value of this parameter for later use. Then look at my `print()` method above: it declares a **stack variable**, builds up the correct string inside this variable, then calls `puts()`... and then returns.
86+
87+
I've highlighted "stack variable" here, because this is precisely the problem here. _Pointers to the stack cannot be saved like this_, it's doomed to fail. Unfortunately, the stack will be overwritten between the `puts()` call and the method which checks the `captured_output` content.
88+
89+
### The solution
90+
91+
```c
92+
extern "C" void __wrap_puts(const char* s)
93+
{
94+
if (puts_mocked) {
95+
// Note: calling this multiple times will both leak memory and overwrite the captured
96+
// output from previous calls. Don't use this in the real world without adding suitable
97+
// free() calls.
98+
captured_output = strdup(s); // <-- calling strdup(s) here is the fix
99+
}
100+
else {
101+
__real_puts(s);
102+
}
103+
}
104+
```
105+
106+
By duplicating the string, which can be either located on the stack or on the heap, we are safe, and the assertion will now fail as expected instead:
107+
108+
```
109+
/home/per/git/perlang/src/stdlib/cmake-build-debug/cctest print
110+
/home/per/git/perlang/src/stdlib/test/print.cc:27:
111+
CHECK_EQ("123.45", captured_output) failed
112+
# Expected: 123.45
113+
# Found: 12345
114+
Signal: SIGABRT (Aborted)
115+
```
116+
117+
(The reason it fails is because the implementation is not yet complete; it doesn't add the decimal point at the right place. This is fine and a perfectly "normal" software bug. Figuring out why a garbage value was being printed is a much more "annoying" kind of bug to me.)
118+
119+
### Conclusion
120+
121+
> "C makes it easy to shoot yourself in the foot; C++ makes it harder, but when you do it blows your whole leg off" _Bjarne Stroustrup_
122+
123+
With great power comes great responsibility. The fact that you _can_ easily allocate objects on the stack is actually a really nice and useful thing, and one of the things that has historically given (and perhaps still gives) C and C++ an advantage for writing performance-critical code. To be able to completely eliminate memory allocation and deallocation for short-lived objects is simply very nice. But: like many other advanced features in C (like pointer arithmetics...), it comes at a price. And the real nasty part is that there are very few "safety belts" in C. So arguably, the greatest strengths of the C language are also its greatest weaknesses.
124+
125+
<sub>The careful reader might be wanting to point out that my example code above is technically not C but C++ - I know. Traditional pointers (e.g. `const char *`) are much more "C-style" than "C++-style" though. In C++, references are typically favoured over plain-old pointers. </sub>

0 commit comments

Comments
 (0)