Merged
@@ -1,20 +1,20 @@
---
title: Basics of Compilers
title: Compiler basics
weight: 2

### FIXED, DO NOT MODIFY
layout: learningpathall
---

## Introduction to C++ and Compilers
## Introduction to C++ and compilers

The C++ language gives the programmer the freedom to be expressive in the way they write code - allowing low-level manipulation of memory and data structures. Compared to managed languages, such as Java, C++ source code is generally less portable, requiring recompilation to the target Arm architecture. In the context of optimizing C++ workloads on Arm, significant performance improvements can be achieved without modifying the source code, simply by using the compiler correctly.
The C++ language gives you the freedom to be expressive in the way you write code - allowing low-level manipulation of memory and data structures. Compared to managed languages, such as Java, C++ source code is generally less portable, requiring recompilation to the target Arm architecture. In the context of optimizing C++ workloads on Arm, significant performance improvements can be achieved without modifying the source code, simply by using the compiler correctly.

Writing performant C++ code is a topic in itself and out of scope for this learning path. Instead we will focus on how to effectively use the compiler to target Arm instances in a cloud environment.
Writing performant C++ code is a complex topic, but you can learn how to effectively use the compiler to target the Arm architecture for a Linux application.

## Purpose of a Compiler
## What is the purpose of a compiler?

The g++ compiler is part of the GNU Compiler Collection (GCC), which is a set of compilers for various programming languages, including C++. The primary objective of the g++ compiler is to translate C++ source code into machine code that can be executed by a computer. This process involves several high-level stages:
The G++ compiler is part of the GNU Compiler Collection (GCC), which is a set of compilers for various programming languages, including C++. The primary objective of the g++ compiler is to translate C++ source code into machine code that can be executed by a computer. This process involves several high-level stages:

- Preprocessing: In this initial stage, the preprocessor handles directives that start with a # symbol, such as `#include`, `#define`, and `#if`. It expands included header files, replaces macros, and processes conditional compilation statements.

@@ -24,19 +24,25 @@ The g++ compiler is part of the GNU Compiler Collection (GCC), which is a set of

- Linking: The final stage involves linking the object code with necessary libraries and other object files. The linker merges multiple object files and libraries, resolves external references, allocates memory addresses for functions and variables, and generates an executable file that can be run on the target platform.
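
The stages can be observed one at a time by stopping `g++` after each step. The following is a minimal sketch and not part of the original change: the file name `hello.cpp` and the `GREETING` macro are hypothetical, and the build steps are skipped when `g++` is not installed.

```bash
# Write a small, hypothetical source file to work with.
cat > hello.cpp <<'EOF'
#include <iostream>
#define GREETING "Hello from g++"
int main() {
    std::cout << GREETING << std::endl;
    return 0;
}
EOF

if command -v g++ >/dev/null 2>&1; then
    g++ -E hello.cpp -o hello.ii   # preprocess only: expand #include and #define
    g++ -S hello.ii -o hello.s     # compile the preprocessed source to assembly
    g++ -c hello.s -o hello.o      # assemble into an object file
    g++ hello.o -o hello           # link the object file into an executable
    ./hello
else
    echo "g++ not found, skipping the build steps"
fi
```

Inspecting `hello.ii` and `hello.s` after each step shows what the preprocessor and the compiler produced before the linker creates the final executable.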

An interesting fact about the g++ compiler is that it is designed to optimize both the performance and the size of the generated code. The compiler performs various optimizations based on the knowledge it has of the program, and it can be configured to prioritize reducing the size of the generated executable.
An interesting fact about the GNU compiler is that it is designed to optimize both the performance and the size of the generated code. The compiler performs various optimizations based on the knowledge it has of the program, and it can be configured to prioritize reducing the size of the generated executable.

### Compiler versions

### Compiler Versioning
Two popular compilers of C++ are the GNU Compiler Collection (GCC) and LLVM - both of which are open-source compilers and have contributions from Arm engineers to support the latest architectures. Proprietary or vendor-specific compilers, such as `nvcc` for compiling for NVIDIA GPUs, are often based on these open-source compilers. Alternative proprietary compilers are often designed for specific use cases. For example, safety-critical applications may need to comply with various ISO standards, which also include the compiler. The functional safety [Arm Compiler for Embedded](https://developer.arm.com/Tools%20and%20Software/Arm%20Compiler%20for%20Embedded%20FuSa) is an example of a C/C++ compiler.

Two popular compilers of C++ are the GNU Compiler Collection (GCC) and LLVM - both of which are open-source compilers and have contributions from Arm engineers to support the latest architectures. Proprietary or vendor-specific compilers, such as `nvcc` for compiling for NVIDIA GPUs, are often based on these open-source compilers. Alternative proprietary compilers are often designed for specific use cases. For example, safety-critical applications may need to comply with various ISO standards, which also include the compiler. The functional safety [Arm Compiler for Embedded](https://developer.arm.com/Tools%20and%20Software/Arm%20Compiler%20for%20Embedded%20FuSa) is such an example of a C/C++ compiler.
If you are an application developer who is not working in the safety qualification domain, you can use the open-source GCC/G++ compiler.

Most application developers are not in this safety qualification domain so we will be using the open-source GCC/G++ compiler for this learning path.
There are multiple Linux distributions available to choose from. Each Linux distribution has a default compiler.

There are multiple Linux distribtions available to choose from. Each Linux distribution and operating system has a default compiler. For example after installing the default g++ on an `r8g` AWS instance, the default g++ compiler as of January 2025 is below.
Print the version information for your compiler:

``` output
```bash
g++ --version
```

For example, after installing `g++` on Ubuntu 24.04, the default compiler as of January 2025 is shown below.

```output
g++ (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0
Copyright (C) 2023 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
@@ -59,9 +65,9 @@ Red Hat EL8 | 8*, 9, 10 | 10
SUSE Linux ES15 | 7*, 9, 10 | 7


The biggest and most simple performance gain can be achieved by using the most recent compiler available. The most recent optimisations and support will be available through the latest compiler.
The easiest way to achieve a performance gain is by using the most recent compiler available. The most recent optimizations and support are available through the latest compiler.

Looking at the g++ documentation as an example, the most recent version of GCC available at the time of writing, version 14.2, has the following support and optimisations listed on their website [change note](https://gcc.gnu.org/gcc-14/changes.html).
Looking at the G++ documentation as an example, the most recent version of GCC, version 14.2, has the following support and optimizations listed in the [release notes](https://gcc.gnu.org/gcc-14/changes.html).

```output
A number of new CPUs are supported through the -mcpu and -mtune options (GCC identifiers in parentheses).
@@ -70,14 +76,13 @@ A number of new CPUs are supported through the -mcpu and -mtune options (GCC ide
- Arm Cortex-A720 (cortex-a720).
- Arm Cortex-X4 (cortex-x4).
- Microsoft Cobalt-100 (cobalt-100).
...
```
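
Whether an installed compiler is likely to accept these newer CPU identifiers can be checked from its version. The sketch below is an illustration only: the helper name `supports_gcc14_cpus` is hypothetical, and the threshold of 14 is taken from the release notes quoted above. Distribution packages may differ, so treat the result as a hint rather than a guarantee.

```bash
# Hypothetical helper: succeeds when the given major version is 14 or newer,
# the release whose notes list identifiers such as cortex-x4.
supports_gcc14_cpus() {
    [ "$1" -ge 14 ]
}

if command -v g++ >/dev/null 2>&1; then
    # -dumpversion prints either "13" or "13.3.0" depending on the build.
    major=$(g++ -dumpversion | cut -d. -f1)
    if supports_gcc14_cpus "$major"; then
        echo "g++ $major should accept -mcpu=cortex-x4"
    else
        echo "g++ $major predates GCC 14, so newer CPU identifiers may be rejected"
    fi
fi
```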

Sufficient due diligence should be taken when updating your C++ compiler because the process may reveal bugs in your source code. These bugs are often undefined behaviour caused by not adhering to the C++ standard. It is rare that the compiler itself will introduced a bug. However, in such events known bugs are made publicly available in the compiler documentation.
Sufficient due diligence should be taken when updating your C++ compiler because the process may reveal bugs in your source code. These bugs are often undefined behavior caused by not adhering to the C++ standard. It is rare that the compiler itself will introduce a bug. However, in such events, known bugs are made publicly available in the compiler documentation.

## Basic g++ Optimisation Levels
## Basic G++ optimization levels

Using the g++ compiler as an example, the most course-grained dial you can adjust is the optimisation level, denoted with `-O<x>`. This adjusts a variety of lower-level optimsation flags at the expense of increased computation time, memory use and debuggability. When aggresive optimisation is used, the optimised binary may not show expected behaviour when hooked up to a debugger such as `gdb`. This is because the generated code may not match the original source code or program order, for example from loop unrolling and vectorisation.
Using the G++ compiler as an example, the most coarse-grained dial you can adjust is the optimization level, denoted with `-O<x>`. This adjusts a variety of lower-level optimization flags at the expense of increased computation time, memory use, and debuggability. When aggressive optimization is used, the optimized binary may not show expected behavior when hooked up to a debugger such as `gdb`. This is because the generated code may not match the original source code or program order, for example from loop unrolling and vectorization.

A few of the most common optimization levels are in the table below.

@@ -90,4 +95,4 @@ A few of the most common optimization levels are in the table below.
| `-Os` | Optimizes code size, reducing the overall binary size. |
| `-Ofast` | Enables optimizations that may not strictly adhere to standard compliance. |

Please refer to your compiler documentation for full details on the optimisation level, for example [GCC](https://gcc.gnu.org/onlinedocs/gcc-14.2.0/gcc/Optimize-Options.html).
Please refer to your compiler documentation for full details on the optimization levels. For example, you can review the G++ [Options That Control Optimization](https://gcc.gnu.org/onlinedocs/gcc-14.2.0/gcc/Optimize-Options.html).
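
To see which lower-level flags a given level switches on, GCC can list them with `-Q --help=optimizers`. The following is a minimal sketch, assuming `g++` is on the PATH; the helper `flags_file` and the output file names are hypothetical.

```bash
# Hypothetical helper: name of the file that stores the flag list for a level.
flags_file() {
    printf 'flags%s.txt\n' "$1"
}

if command -v g++ >/dev/null 2>&1; then
    for level in -O1 -O2; do
        # -Q --help=optimizers lists each optimization flag and whether
        # the given level enables it.
        g++ -Q "$level" --help=optimizers > "$(flags_file "$level")"
    done
    # Show the flags whose state differs between the two levels.
    diff "$(flags_file -O1)" "$(flags_file -O2)" || true
fi
```

The same loop can be extended with other levels from the table, such as `-Os`, to compare what each one changes.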
@@ -1,22 +1,26 @@
---
title: Setup Your Environment
title: Set up Your Environment
weight: 3

### FIXED, DO NOT MODIFY
layout: learningpathall
---

If you are new to cloud computing, please refer to our learning path on [Getting started with Servers and Cloud Computing](https://learn.arm.com/learning-paths/servers-and-cloud-computing/intro/).
If you are new to cloud computing, please refer to [Getting started with Servers and Cloud Computing](https://learn.arm.com/learning-paths/servers-and-cloud-computing/intro/). It provides an introduction to the Arm servers available from various cloud service providers.

## Connect to an AWS Arm-based Instance
## Connect to an AWS Arm-based instance

In this example we will be building and running our C++ application on an AWS Graviton 4 (`r8g.xlarge`) instance running Ubuntu 24.04 LTS. Once connected run the following commands to confirm the operating system and archiecture version.
In this example you will build and run a C++ application on an AWS Graviton 4 (`r8g.xlarge`) instance running Ubuntu 24.04 LTS.

Create the AWS instance using your AWS account. Connect to the instance using SSH or AWS Session Manager so you can enter shell commands.

Once connected, run the following commands to confirm the operating system and architecture.

```bash
cat /etc/*lsb*
```

You will see an output such as the following:
You see output similar to:

```output
DISTRIB_ID=Ubuntu
@@ -25,58 +29,65 @@ DISTRIB_CODENAME=noble
DISTRIB_DESCRIPTION="Ubuntu 24.04.1 LTS"
```

Next, we will confirm we are using a 64-bit Arm-based system using the following command
Next, confirm you are using a 64-bit Arm-based system using the following command:

```bash
uname -m
```

You will see the following output.
You see the following output:

```output
aarch64
```
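
In a build script, the same check can be automated so that the script behaves sensibly on other machines. This is a sketch only: the helper name `is_arm64` is hypothetical, and `arm64` is included because some operating systems report that name instead of `aarch64`.

```bash
# Hypothetical helper: succeeds for the machine names used by 64-bit Arm.
is_arm64() {
    case "$1" in
        aarch64|arm64) return 0 ;;
        *) return 1 ;;
    esac
}

if is_arm64 "$(uname -m)"; then
    echo "64-bit Arm system detected"
else
    echo "Not a 64-bit Arm system: $(uname -m)"
fi
```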

## Enable Environment modules
## Enable environment modules

Environment modules is a tool to quickly modify your shell configuration and environment variables. For this activity, it allows you to quickly switch between different compiler versions to demonstrate potential improvements.

Environment modules are a tool to quickly modify your shell configuration and environment variables. For this learning path, it allows us to quickly switch between different compiler versions to demonstrate potential improvements.
First, you need to install the environment modules package.

Install Environment Modules
In your terminal, run the following command:

First, you need to install the environment modules package. Open your terminal and run the following command:
```bash
sudo apt update
sudo apt install environment-modules
```
```bash
sudo apt update
sudo apt install environment-modules
```

Load environment modules after the package is installed.
Load environment modules after the package is installed:

```bash
sudo chmod 755 /usr/share/modules/init/bash
source /usr/share/modules/init/bash
```
Reload your shell configuration.

Reload your shell configuration:

```bash
source ~/.bashrc
```

Install various compiler version on your Ubuntu system. For this example we will install version 9 of the gcc/g++ compiler to demonstrate potential improvements your application could achieve.
Install multiple compiler versions on your Ubuntu system. For this example you can install GCC version 9 to demonstrate potential improvements your application could achieve.

Install GCC version 9:

```bash
sudo add-apt-repository ppa:ubuntu-toolchain-r/test
sudo apt update
sudo apt install gcc-9 g++-9
sudo apt install gcc-9 g++-9 -y
```
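
After installation, it can help to confirm which versioned compilers are available on the PATH. The sketch below assumes the versions mentioned in this text (9 from the PPA and 13 as the Ubuntu 24.04 default); the helper `compiler_name` is hypothetical.

```bash
# Hypothetical helper: build the versioned binary name, for example g++-9.
compiler_name() {
    printf 'g++-%s\n' "$1"
}

for version in 9 13; do
    cxx=$(compiler_name "$version")
    if command -v "$cxx" >/dev/null 2>&1; then
        # Print only the first line of the version banner.
        "$cxx" --version | head -n 1
    else
        echo "$cxx is not installed"
    fi
done
```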

Create a module file for each compiler installed.

```bash
mkdir -p ~/modules/gcc
nano ~/modules/gcc/9
```
Copy and paste the text below into the nano text editor and save the file
```ouput

Use a text editor to modify the file `~/modules/gcc/9`.

Copy and paste the text below into the file and save it.

```console
#%Module1.0
prepend-path PATH /usr/bin/gcc-9
prepend-path PATH /usr/bin/g++-9