# Another pointless comparison of binary size between Rust and C++

I recently saw a YouTube video of someone going through a laundry list of complaints about Rust. The merits of the discussion notwithstanding, one of the points raised was about the difference in the size of the binaries produced by different compilers.

There are [many](https://github.com/johnthagen/min-sized-rust), [many](https://os.phil-opp.com/freestanding-rust-binary/), [many](https://lifthrasiir.github.io/rustlog/why-is-a-rust-executable-large.html), [many](https://kripken.github.io/blog/binaryen/2018/04/18/rust-emscripten.html), [many](https://stackoverflow.com/questions/29008127/why-are-rust-executables-so-huge), [many](https://github.com/rust-embedded/wg/issues/5) discussions about Rust binary sizes online and I'm not going to rehash any of that. I was just curious about reproducing the data in the YouTube video, and in particular I am interested in producing static binaries because they simplify deployment. I realise the static-vs-dynamic debate is a whole other flamewar online and I don't really care 😃

_This post was made in a Jupyter Notebook using the awesome [evcxr](https://github.com/google/evcxr) Rust interpreter. All the cells were executed live in the notebook._

## Utility functions

In [2]:
/// Tool to easily run shell commands.
/// Prints whatever the command wrote to stdout and stderr; the exit status is not checked.
fn sh(dir: &str, cmd: &str, args: &[&str]) -> Result<(), Box<dyn std::error::Error>> {
    let output = std::process::Command::new(cmd)
        .current_dir(dir)
        .args(args)
        .output()?;
    if !output.stdout.is_empty() {
        println!("stdout:\n{}", String::from_utf8(output.stdout)?);
    }
    if !output.stderr.is_empty() {
        println!("stderr:\n{}", String::from_utf8(output.stderr)?);
    }
    Ok(())
}
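
The sizes below are read off `ls -lah`, which rounds to units like `14K`. If exact byte counts are wanted, a companion helper could look something like the following. This is only a sketch: `file_size_bytes` is a hypothetical name that is not used anywhere else in this post, and the temporary file name is made up for the demo.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Return the exact size of a file in bytes.
/// Hypothetical helper, not part of the original notebook.
fn file_size_bytes(path: &Path) -> io::Result<u64> {
    Ok(fs::metadata(path)?.len())
}

fn main() -> io::Result<()> {
    // Write the same contents as the C++ example to a scratch file.
    let path = std::env::temp_dir().join("size_demo_main.cc");
    fs::write(&path, "int main() {}")?;

    println!("{} bytes", file_size_bytes(&path)?);

    // Clean up the scratch file again.
    fs::remove_file(&path)?;
    Ok(())
}
```

`std::fs::metadata` follows symlinks, so this reports the size of the file a link points to rather than the link itself.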

## C++ binary size

The example given in the video was like this:

In [3]:
std::fs::remove_dir_all("~/tmp/cppsize/").ok();
std::fs::create_dir_all("~/tmp/cppsize/")?;
std::fs::write("~/tmp/cppsize/main.cc", "int main() {}")?;
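
One thing to be aware of: Rust's `std::fs` and `std::process::Command` don't do shell-style `~` expansion, so a path like `~/tmp/cppsize/` is treated as a relative path that starts with a directory literally named `~`. The cells in this post presumably still work because every step uses the same relative path. A minimal sketch of expanding the tilde by hand, assuming a Unix-like system where `HOME` is set; `expand_tilde` is a hypothetical helper that isn't used elsewhere in this post:

```rust
use std::path::PathBuf;

/// Replace a leading `~/` with the given home directory.
/// Hypothetical helper; handles only the `~/...` form, not `~user/...`.
fn expand_tilde(path: &str, home: &str) -> PathBuf {
    match path.strip_prefix("~/") {
        Some(rest) => PathBuf::from(home).join(rest),
        None => PathBuf::from(path),
    }
}

fn main() {
    // Assumption: HOME is set, as it normally is on Linux and macOS.
    let home = std::env::var("HOME").unwrap_or_else(|_| String::from("/"));
    println!("{}", expand_tilde("~/tmp/cppsize/", &home).display());
}
```

Paths that don't start with `~/` are passed through unchanged, so the helper is safe to apply to absolute paths too.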

In [4]:
sh("~/tmp/cppsize", "ls", &vec!["-lah"])?;

stdout:
total 12K
drwxrwxr-x 2 caleb caleb 4.0K Mar 20 14:46 .
drwxrwxr-x 3 caleb caleb 4.0K Mar 20 14:46 ..
-rw-rw-r-- 1 caleb caleb   13 Mar 20 14:46 main.cc



Compiling this empty program with the flags used in the aforementioned video:

In [5]:
sh(
    "~/tmp/cppsize",
    "g++",
    &vec!["-o", "main", 
        "main.cc", 
        "-Ofast", 
        "-std=c++20", 
        "-s", 
        "-flto", 
        "-static-libgcc", 
        "-static-libstdc++"
    ]
)?;

In [6]:
sh("~/tmp/cppsize", "ls", &vec!["-lah"])?;

stdout:
total 28K
drwxrwxr-x 2 caleb caleb 4.0K Mar 20 14:46 .
drwxrwxr-x 3 caleb caleb 4.0K Mar 20 14:46 ..
-rwxrwxr-x 1 caleb caleb  14K Mar 20 14:46 main
-rw-rw-r-- 1 caleb caleb   13 Mar 20 14:46 main.cc



In the YouTube video, the very low binary size of `main` was as given above, around **14 kB**. However, this is not the whole truth. Even with `-static-libgcc` and `-static-libstdc++`, the binary above still links dynamically against the C library. You can see this with `ldd`:

In [7]:
sh("~/tmp/cppsize", "ldd", &vec!["main"])?;

stdout:
	linux-vdso.so.1 (0x00007ffe4d3fa000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fb549b1c000)
	/lib64/ld-linux-x86-64.so.2 (0x00007fb549d61000)



By default, Rust will include its standard library in the compiled binary, so a like-for-like comparison needs a C++ binary that also carries its libraries with it. We can add one small extra option to the compiler options for this empty C++ example to produce a fully static binary:

In [8]:
sh(
    "~/tmp/cppsize",
    "g++",
    &vec!["-o", "main", 
        "main.cc", 
        "-Ofast", 
        "-std=c++20", 
        "-s", 
        "-flto", 
        "-static-libgcc", 
        "-static-libstdc++",
        "-static"  // <--- THIS IS THE EXTRA OPTION
    ]
)?;

In [9]:
sh("~/tmp/cppsize", "ls", &vec!["-lah"])?;

stdout:
total 808K
drwxrwxr-x 2 caleb caleb 4.0K Mar 20 14:46 .
drwxrwxr-x 3 caleb caleb 4.0K Mar 20 14:46 ..
-rwxrwxr-x 1 caleb caleb 793K Mar 20 14:46 main
-rw-rw-r-- 1 caleb caleb   13 Mar 20 14:46 main.cc



Now the size is much bigger, around **793 kB**. `ldd` tells us the binary no longer links dynamically to anything:

In [10]:
sh("~/tmp/cppsize", "ldd", &vec!["main"])?;

stderr:
	not a dynamic executable



## Rust binary size

In the YouTube video, an empty Rust program was compiled in the following way:

In [11]:
std::fs::remove_dir_all("~/tmp/rustsize/").ok();
std::fs::create_dir_all("~/tmp/rustsize/")?;
std::fs::write("~/tmp/rustsize/a.rs", "fn main() {}")?;

In [12]:
sh("~/tmp/rustsize", "ls", &vec!["-lah"])?;

stdout:
total 12K
drwxrwxr-x 2 caleb caleb 4.0K Mar 20 14:47 .
drwxrwxr-x 4 caleb caleb 4.0K Mar 20 14:47 ..
-rw-rw-r-- 1 caleb caleb   12 Mar 20 14:47 a.rs



For some reason the video author didn't want to use Cargo 🤷. Anyway let's go with it:

In [13]:
sh(
    "~/tmp/rustsize",
    "rustc",
    &vec![
        "-O", 
        "-C", "strip=symbols",
        "a.rs", 
    ]
)?;

In [14]:
sh("~/tmp/rustsize", "ls", &vec!["-lah"])?;

stdout:
total 300K
drwxrwxr-x 2 caleb caleb 4.0K Mar 20 14:47 .
drwxrwxr-x 4 caleb caleb 4.0K Mar 20 14:47 ..
-rwxrwxr-x 1 caleb caleb 287K Mar 20 14:47 a
-rw-rw-r-- 1 caleb caleb   12 Mar 20 14:47 a.rs



Indeed, we find almost the exact same number as the author in the video, around **290 kB**. The author concludes that the Rust binary is 290/14 => roughly 20X bigger.

However, there are a couple of things missing from the Rust compiler flags. The C++ build used `-flto`, for instance, while the Rust build did not enable LTO. Let's add those in:

In [15]:
sh(
    "~/tmp/rustsize",
    "rustc",
    &vec![
        "-O", 
        "-C", "strip=symbols",
        "-C", "lto=on",
        "-C", "codegen-units=1",
        "a.rs", 
    ]
)?;

In [16]:
sh("~/tmp/rustsize", "ls", &vec!["-lah"])?;

stdout:
total 272K
drwxrwxr-x 2 caleb caleb 4.0K Mar 20 14:47 .
drwxrwxr-x 4 caleb caleb 4.0K Mar 20 14:47 ..
-rwxrwxr-x 1 caleb caleb 259K Mar 20 14:47 a
-rw-rw-r-- 1 caleb caleb   12 Mar 20 14:47 a.rs



Unsurprisingly, adding LTO gives a 1 - 259/287 => 10% reduction in binary size.

There are a few more extra options beyond what was used in the C++ example, but these don't make a significant difference to a completely-empty program.

In [17]:
sh(
    "~/tmp/rustsize",
    "rustc",
    &vec![
        "-O", 
        "-C", "strip=symbols",
        "-C", "lto=on",
        "-C", "codegen-units=1",
        
        // Extra
        "-C", "opt-level=s",  // Optimize for size
        "-C", "panic=abort",  // Disable stack-unwinding
        
        "a.rs", 
    ]
)?;

In [18]:
sh("~/tmp/rustsize", "ls", &vec!["-lah"])?;

stdout:
total 256K
drwxrwxr-x 2 caleb caleb 4.0K Mar 20 14:47 .
drwxrwxr-x 4 caleb caleb 4.0K Mar 20 14:47 ..
-rwxrwxr-x 1 caleb caleb 243K Mar 20 14:47 a
-rw-rw-r-- 1 caleb caleb   12 Mar 20 14:47 a.rs



Compared to the original YouTube example, the binary is now 1 - 243/287 => 15.3% smaller.
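
For anyone who wants to check the percentages, here is a small sketch of the arithmetic. The function name `reduction_pct` is made up for illustration, and the inputs are the rounded kB figures from the `ls -lah` outputs above, so the results are only as precise as those.

```rust
/// Percentage by which `after` is smaller than `before`.
/// Hypothetical helper for checking the numbers in this post.
fn reduction_pct(before: f64, after: f64) -> f64 {
    (1.0 - after / before) * 100.0
}

fn main() {
    // Sizes in kB as reported by `ls -lah`; 287 kB is the build from the video.
    let baseline = 287.0;
    println!("LTO + codegen-units=1: {:.1}%", reduction_pct(baseline, 259.0));
    println!("plus opt-level=s and panic=abort: {:.1}%", reduction_pct(baseline, 243.0));
}
```

The first figure comes out just under 10% and the second at about 15.3%.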

Anyway, these extra flags and settings are making marginal improvements. The biggest impact is of course what is being linked into the binary and what is not. Currently we have a Rust binary with the stdlib statically linked at **243 kB**, and the C++ empty-project binary at **793 kB**. It doesn't seem to me that C++ is doing all that well here!

Or is it?

We haven't yet checked whether the Rust binary is linking dynamically to anything. Apples-to-apples, remember? So let's look into that:

In [19]:
sh("~/tmp/rustsize", "ldd", &vec!["a"])?;

stdout:
	linux-vdso.so.1 (0x00007ffc4e1e0000)
	libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f070cb8d000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f070c965000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f070cbfd000)



Woah, the Rust binary is dynamically linked to libc! Why does it need to do that?  Here's a great concise explanation [from 2015](https://news.ycombinator.com/item?id=9436004):

> erickt on April 24, 2015:
>
> No it's not a stupid question. libc (or CRT on windows) really is the library that exposes all the user space system libraries. It contains the functions to do IO, sockets, threads, and etc. So we use it to expose that functionality to rust users.
>
> Now there are some languages, namely Go, that skip libc and just implement directly against the syscall interface. Go has the advantage of being able to draw from Google's vast experience interacting deep within the system, so it was comparatively cheap for them to do this.
>
> For rust, it never really felt like it was worth the effort for the benefit we'd get out of it. It was more important to get the language done. 

So Rust's standard library still goes through libc to interact with the OS. It doesn't need the C++ standard library, only libc (plus `libgcc_s`, which also shows up in the `ldd` output above).

Ok so for a fair comparison we should compare the sizes after statically linking against the CRT.


In [20]:
sh(
    "~/tmp/rustsize",
    "rustc",
    &vec![
        "-O", 
        "-C", "strip=symbols",
        "-C", "lto=on",
        "-C", "codegen-units=1",
        "-C", "opt-level=s",  // Optimize for size
        "-C", "panic=abort",  // Disable stack-unwinding
        
        // New
        "-C", "target-feature=+crt-static",
        
        "a.rs", 
    ]
)?;

In [21]:
sh("~/tmp/rustsize", "ldd", &vec!["a"])?;

stderr:
	not a dynamic executable



And the size?

In [22]:
sh("~/tmp/rustsize", "ls", &vec!["-lah"])?;

stdout:
total 1.2M
drwxrwxr-x 2 caleb caleb 4.0K Mar 20 14:48 .
drwxrwxr-x 4 caleb caleb 4.0K Mar 20 14:47 ..
-rwxrwxr-x 1 caleb caleb 1.2M Mar 20 14:48 a
-rw-rw-r-- 1 caleb caleb   12 Mar 20 14:47 a.rs



Now it's a whopping **1.2 MB**. So if we do all the work to produce a fully static executable from an empty `main()` for both Rust and C++, we indeed find that C++ produces the smaller binary. It's 1 - 793/1200 => **34%** smaller than the Rust binary, which is clearly a much smaller difference than the ~20X figure from the video.
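
To put the two comparisons side by side, here is a rough sketch of the ratios. `times_larger` is a hypothetical helper, the inputs are the rounded figures from `ls -lah`, and 1.2M is taken as roughly 1200 kB, so treat the results as approximate.

```rust
/// How many times larger `a` is than `b`.
/// Hypothetical helper for comparing the sizes in this post.
fn times_larger(a: f64, b: f64) -> f64 {
    a / b
}

fn main() {
    // The video's comparison: Rust binary (287 kB) vs C++ binary (14 kB),
    // both of which still link dynamically against libc.
    let as_in_video = times_larger(287.0, 14.0);

    // The fully static comparison: Rust (~1200 kB) vs C++ (793 kB).
    let fully_static = times_larger(1200.0, 793.0);

    println!("as in the video: {:.1}X", as_in_video);
    println!("fully static:    {:.1}X", fully_static);
}
```

With both binaries fully static, the Rust binary is about 1.5X the size of the C++ one rather than about 20X.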

## Conclusion

What do you think this is, a research paper? My parting words are that none of this really matters. Binary size is certainly a concern in certain environments, but in most cases the choice between Rust, C or C++ is not what is going to make the difference. The people working in the embedded space are already all over these issues. If you're interested in the embedded domain, do go check out the [Embedded Rust Working Group](http://blog.rust-embedded.org/).