This repository has been archived by the owner on Sep 2, 2018. It is now read-only.

Getting Started

Dylan McKay edited this page May 3, 2017 · 21 revisions

NOTE

This backend has been completely merged into LLVM master. Please do not use this repository, use upstream LLVM instead.

This repository is not kept up to date! This tutorial also points to a very old version of the code.

Getting started with AVR-LLVM

Caveat: Do not trust AVR-LLVM to generate correct, unbroken, and working output files. It is a work in progress, and using it will more than likely bite off your ear. Be warned.

What does LLVM do?

LLVM provides the middle end and the back end of a compiler.

A compiler can usually be generalised into three stages:

|-----------|      |-------------|      |-------------|
|           |      | Middle End  |      | Back End    |
| Front End | ---> | (Optimizer) | ---> | (Generates  |
|  (Parser) |      |             |      | code for a  |
|-----------|      |-------------|      | real CPU)   |
                                        |-------------|

The front end parses a specific programming language. For one stage to interface with the next, the two must share a common representation. A naive way of implementing this would be to write a middle end and a back end for each front end, i.e. for each programming language. Concretely, the C++ parser would hand parsed C++ code to a C++-specific middle end, which would optimise it into simpler, faster C++, and the back end would then convert that into machine code.

This does not scale well. Rewriting two thirds of the compiler for each source language would be a nightmare. Imagine GCC, with the dozen or so programming languages it supports: that's the bulk of the code duplicated 12 times!

Most compiler authors solve this the same way that most computer scientists solve problems - by simply adding a layer of abstraction.

LLVM defines its own programming language for interoperation between the various ends of the compiler. This language is called LLVM Intermediate Representation (LLVM IR). LLVM IR looks quite a bit like assembly language, but it is designed to be independent of any particular CPU or source language.

This way, each front end only has the job of converting its source language into LLVM IR and passing it to the middle end, which optimises it, and then on to the back end, which outputs a target-specific file. Because of this, we only need one middle end for all programming languages and one back end per target CPU. These work with any programming language, provided that it can be converted to IR.

Because of all this saved time, much more effort can go into creating a good middle end and back end. The LLVM optimiser is normally on par with GCC's, and every now and then comes out on top. LLVM is a high-quality, easy-to-use compiler library, so any language can use it to produce fast and reliable code.

Of course, if you are reading this, you probably already know all of this.

What does AVR-LLVM add?

Once IR has been simplified and transformed in the middle end, it is the back end's job to convert it into assembly or machine code. AVR-LLVM simply adds another back end to LLVM: one that can lower IR into assembly code for the AVR architecture.

How do I build it?

The first step is to grab the sources.

# get a copy of the avr-llvm repository
git clone https://github.com/avr-llvm/llvm.git

At this point, you can optionally build avr-clang - a port of the LLVM C/C++ compiler for AVR.

To do this, you must clone two repos into the original llvm repository.

# first, `cd` into the `llvm` directory.
cd llvm/

# grab the AVR clang sources and put them in `llvm/tools/clang`.
cd tools/
git clone https://github.com/avr-llvm/clang.git
cd ../

# also grab the LLVM compiler runtime library, a clang dependency.
# this one goes in `llvm/projects/compiler-rt`.
cd projects/
git clone https://github.com/avr-llvm/compiler-rt.git
cd ../

# now go back out of the cloned `llvm/` directory
cd ../

So long as these repositories exist, they will automatically be built when compiling LLVM.

Note that tools/clang, projects/compiler-rt and tools/lld are in LLVM's .gitignore, so you won't face any versioning troubles in placing them inside the LLVM repository.
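
Before configuring the build, it can help to confirm that the optional checkouts ended up in the directories named above. The following is a minimal sketch; `check_layout` is a hypothetical helper written for this page and is not part of LLVM.

```shell
# Sketch: confirm the optional checkouts sit where the LLVM build looks for them.
# `check_layout` is a hypothetical helper, not an LLVM tool.
check_layout() {
    root="${1:-llvm}"
    rc=0
    for dir in tools/clang projects/compiler-rt; do
        if [ -d "$root/$dir" ]; then
            echo "found:   $root/$dir"
        else
            echo "missing: $root/$dir"
            rc=1
        fi
    done
    return "$rc"
}

# Run from the directory that contains the `llvm/` checkout.
check_layout llvm || echo "one or more optional checkouts are missing"
```

The function returns a non-zero status when either directory is absent, so it can also gate a build script.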

Now generate GNU Makefiles for LLVM, using CMake:

# place all output files in `build/`
mkdir build
cd build/

# generate makefiles for LLVM and clang
cmake -G "Unix Makefiles" ../

# OR, don't compile LLVM for other targets - we only
# want to use the AVR backend. this saves us compile time.
cmake -G "Unix Makefiles" -DLLVM_TARGETS_TO_BUILD="AVR" ../

You can now simply use the generated Makefiles:

make
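
A full LLVM build takes a long time, so running several `make` jobs in parallel is usually worthwhile. The sketch below assumes it is run from the `build/` directory created earlier; `detect_jobs` is a hypothetical helper, and the check on `bin/llc` assumes the build places its tools in `bin/`.

```shell
# Sketch: pick a job count for `make -j`. `detect_jobs` is a hypothetical helper.
detect_jobs() {
    if command -v nproc >/dev/null 2>&1; then
        nproc
    elif getconf _NPROCESSORS_ONLN >/dev/null 2>&1; then
        getconf _NPROCESSORS_ONLN
    else
        echo 1
    fi
}

njobs="$(detect_jobs)"
echo "building with $njobs parallel jobs"

# Only build when the current directory looks like a configured CMake build tree.
if [ -f Makefile ] && [ -f CMakeCache.txt ]; then
    make -j"$njobs"
fi

# After a successful build, the list of registered targets should mention AVR.
if [ -x bin/llc ]; then
    bin/llc --version | grep -i avr
fi
```

If the final `grep` prints nothing, the AVR back end was most likely left out when CMake was configured.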

How do I use it?

LLVM is primarily a library. However, it does come with several tools.

Assembling AVR assembly files

LLVM comes with a tool named llvm-mc for converting assembly files to machine code.

To assemble main.s:

# assemble `main.s` into `main.o`
llvm-mc -arch=avr -filetype=obj main.s -o main.o
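
If you have no assembly file at hand, the sketch below writes a tiny one and feeds it to the assembler. The file contents and the `answer` label are made up for illustration, and the script overwrites any existing `main.s` in the current directory. It assumes an `llvm-mc` with AVR support is on your `PATH`.

```shell
# Write a tiny AVR assembly file: load the constant 42 into r24 and return.
cat > main.s <<'EOF'
    .text
    .globl  answer
answer:
    ldi     r24, 42
    ret
EOF

if command -v llvm-mc >/dev/null 2>&1; then
    # Print each instruction together with its machine-code encoding.
    llvm-mc -arch=avr -show-encoding main.s \
        || echo "assembly failed; was the AVR target built?"

    # Assemble into an object file, as in the command above.
    llvm-mc -arch=avr -filetype=obj main.s -o main.o \
        || echo "assembly failed; was the AVR target built?"
fi
```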

Assembling or compiling LLVM IR files

The llc tool is used to lower LLVM IR into AVR assembly or machine code.

Given the LLVM IR file main.ll:

# lower `main.ll` into the AVR assembly file `main.s`
llc main.ll -march=avr -filetype=asm -o main.s

# lower `main.ll` into the AVR ELF object file `main.o`
llc main.ll -march=avr -filetype=obj -o main.o
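
To try these commands without a front end, you can write a small IR file by hand. The sketch below creates a `main.ll` containing one illustrative function, `@add`, which is an assumption of this example and not something from the AVR-LLVM sources. It overwrites any `main.ll` and `main.s` in the current directory, and only runs `llc` when one with AVR support is on your `PATH`.

```shell
# Write a minimal LLVM IR module: a function that adds two 16-bit integers.
cat > main.ll <<'EOF'
define i16 @add(i16 %a, i16 %b) {
entry:
  %sum = add i16 %a, %b
  ret i16 %sum
}
EOF

# `llc --version` lists the registered targets; only continue if AVR is one of them.
if command -v llc >/dev/null 2>&1 && llc --version | grep -qi avr; then
    llc main.ll -march=avr -filetype=asm -o main.s
    # Inspect the generated AVR assembly.
    cat main.s
fi
```

Comparing `main.ll` with the generated `main.s` is a quick way to see what lowering IR to a target means in practice.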

Compiling C/C++ sources

If you cloned clang and compiler-rt in the build process, you should then have clang and clang++ in the bin/ directory. Clang is designed to be compatible with GCC, so you should be able to swap invocations of gcc and g++ with clang and clang++ respectively.

To compile main.c:

# compile `main.c` into the AVR ELF object file `main.o`
clang -c --target=avr main.c -o main.o -mmcu=atmega328p

# compile `main.c` into the AVR assembly file `main.s`
clang -c -S --target=avr main.c -o main.s -mmcu=atmega328p

# compile `main.c` into the LLVM IR file `main.ll`
clang -c -S --target=avr -emit-llvm main.c -o main.ll -mmcu=atmega328p
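
For a quick experiment, the sketch below writes a small C file and asks clang for the LLVM IR. The `add` function is purely illustrative, and the script overwrites any `main.c` and `main.ll` in the current directory. It assumes the `clang` on your `PATH` is the one you built with the AVR back end; otherwise invoke it by its path under `bin/`.

```shell
# Write a small C source file with no header dependencies.
cat > main.c <<'EOF'
/* Add two 8-bit values; small enough to read the generated code by eye. */
unsigned char add(unsigned char a, unsigned char b) {
    return (unsigned char)(a + b);
}
EOF

if command -v clang >/dev/null 2>&1; then
    # Same invocation as above: emit LLVM IR for the AVR target.
    if clang -c -S --target=avr -emit-llvm main.c -o main.ll -mmcu=atmega328p; then
        cat main.ll
    else
        echo "clang could not target AVR; check that the AVR back end was built"
    fi
fi
```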

Linking

AVR-LLVM does not currently include a linker. In the meantime, it is possible to use avr-ld from the AVR-GCC package. If avr-ld is installed, clang will invoke it automatically whenever you omit the -c flag.
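
If you prefer to link by hand, one option is to let avr-gcc drive avr-ld, since it also supplies the startup code for the chosen MCU. This is a sketch under assumptions: `link_avr` is a hypothetical helper, it uses avr-gcc and avr-objcopy in place of calling avr-ld directly, and the object file must define `main` for the link to succeed.

```shell
# Sketch: link an object file with the AVR-GCC toolchain and produce an Intel HEX image.
# `link_avr` is a hypothetical helper; it needs avr-gcc and avr-objcopy on PATH.
link_avr() {
    obj="$1"
    mcu="${2:-atmega328p}"
    elf="${obj%.o}.elf"
    hex="${obj%.o}.hex"

    for tool in avr-gcc avr-objcopy; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "skipping link: $tool not found" >&2
            return 1
        fi
    done

    # avr-gcc runs avr-ld and adds the startup files and libraries for the MCU.
    avr-gcc -mmcu="$mcu" "$obj" -o "$elf" || return 1

    # Convert the ELF file into the HEX format that flashing tools usually expect.
    avr-objcopy -O ihex -R .eeprom "$elf" "$hex"
}

# Link `main.o` if an earlier step produced it.
if [ -f main.o ]; then
    link_avr main.o atmega328p
fi
```

The helper returns a non-zero status when a tool is missing or the link fails, so it is safe to call from a larger build script.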
