# ARM assembler in Raspberry Pi

Machine language is made up of discrete instructions implemented by a particular processor. ARM is a family of instruction set architectures for computer processors, and it is the one used by the processor of the Raspberry Pi. The computer interprets machine language as binary code, which is the only thing it can actually run. Binary code is composed of instructions encoded in a binary representation (the encodings are documented in the ARM manuals). You could write the binary encodings by hand, but that would be painstaking. Instead, we will write assembler language, which is just a thin syntax layer on top of the binary code.

Since the computer cannot run assembler directly, we have to turn it into binary code. The tool that does this is called, well, an assembler: it assembles the assembler code into binary code that we can run. We will use GNU Assembler, the assembler from the GNU project. Its command name is `as`, and it is sometimes also known as `gas`.

## Our first ARM Program

We will start with a simple program that does almost nothing: it only stores a value in a register.

```asm
.global main         /* 'main' is our entry point and must be global */

main:                /* This is main */
    mov r0, #2       /* Put a 2 inside the register r0 */
```

In the program above, we put a 2 inside register `r0`. We can check the registers with the following command:

```
>>> show registers
```

We can see a list of all registers from `r0` to `r12`. Notice that `r0` now contains a 2.

## Well, what happened?
We made things easier by using an emulated processor. Although the Raspberry Pi 5 we are using has a 64-bit ARM architecture, this notebook emulates an ARM Cortex-M3, which is a 32-bit ARM processor. We wrote a simple assembly program that only assigns `2` to `r0`, and then we inspected the values of the registers.

Let's review every line of our assembly code step by step:

Comments are enclosed in `/*` and `*/`. Use them to document your assembler code; the assembler ignores them, so they have no effect on execution. Do not nest a `/* ... */` comment inside another one, because nesting does not work.

```asm
.global main         /* 'main' is our entry point and must be global */
```

This is a directive for GNU Assembler. A directive tells GNU Assembler to do something special. Directives start with a dot (`.`) followed by the name of the directive and some arguments. In this case we are saying that `main` is a global name. This is needed because the C runtime will call `main`. If it is not global, the C runtime cannot call it and the linking phase will fail.

```asm
main:                /* This is main */
```

In GNU Assembler, every line that isn’t a directive typically follows the format `label: instruction`. Both `label:` and `instruction` can be omitted (empty or blank lines are ignored). A line containing only a `label:` applies that label to the next instruction. Multiple labels can refer to the same instruction in this way. The instruction itself is written in ARM assembly language. In the example above, we are simply defining the `main` label without any associated instruction.
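As a rough illustration of the `label: instruction` format, the sketch below combines the cases described above. The labels `start`, `first` and `second` are hypothetical names chosen for this example; only `main` comes from our program.

```asm
.global main

main:                    /* label only: applies to the next instruction */
start:  mov r0, #1       /* label and instruction on one line; 'main' and 'start' both name this mov */

first:
second:                  /* 'first' and 'second' both name the mov below */
        mov r1, #2
```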

```asm
mov r0, #2       /* Put a 2 inside the register r0 */
```
    
Whitespace at the beginning of the line is ignored, but the indentation visually indicates that this instruction belongs to the `main` function.

The `mov` instruction stands for move. ARM syntax places the destination on the left, so the instruction reads as "move the immediate value `2` into register `r0`" (this overwrites whatever `r0` held at that point). We'll cover more about registers and immediate values in the next chapter.
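A small sketch of the overwriting behaviour, assuming the same emulated setup as above. The register-to-register form `mov r1, r0` is not used elsewhere in this notebook; it is shown here only to illustrate that the destination always comes first.

```asm
.global main

main:
    mov r0, #2       /* r0 = 2 */
    mov r0, #5       /* r0 = 5, the previous 2 is overwritten */
    mov r1, r0       /* copy r0 into r1 (destination on the left), so r1 = 5 */
```

Inspecting the registers afterwards should show 5 in both `r0` and `r1`.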

```
>>> show registers
```

This line of code is included in the next cell. `show` commands are not part of the ARM Assembly language; rather they are part of a collection of commands that help us inspect the state of the processor and execute actions outside of ARM's capabilities. This command is built into the ARM Kernel within this notebook.


## Registers

At its core, a processor in a computer is nothing but a powerful calculator. Calculations can only be carried out on values stored in very tiny memories called registers. The ARM Cortex-M3 has 16 integer registers, which it uses to perform integer computations. Floating-point computations need separate hardware that the Cortex-M3 itself does not include, so we will put floating point aside for now and get back to it in a future installment. Let’s focus on the integer registers.

The integer registers are named `r0` to `r15`. Registers `r0`-`r12` are 32-bit general purpose registers. The remaining three, `r13`-`r15`, serve specific purposes: `r13` is the stack pointer (the Cortex-M3 keeps two of them, a main and a process stack pointer), `r14` is the link register, and `r15` is the program counter. In this lab we will use at most 3 registers. It is convenient to represent integers in two's complement, because there are instructions that perform computations assuming this encoding. So from now on, unless noted otherwise, we will assume our registers contain integer values encoded in two's complement.
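To make the two's complement assumption concrete, here is a minimal sketch. It uses `add` (introduced in the Instructions section) and `sub`, a subtraction instruction that takes its operands in the same order as `add`. `sub` does not appear elsewhere in this notebook, so we are assuming the notebook's kernel accepts it.

```asm
.global main

main:
    mov r1, #0
    sub r0, r1, #1   /* r0 = 0 - 1, stored as 0xFFFFFFFF, which is -1 in two's complement */
    add r2, r0, #1   /* r2 = -1 + 1 = 0: the 32-bit result wraps around to zero */
```

Whether the register listing prints `r0` as `-1`, `4294967295` or `0xFFFFFFFF` depends on how the kernel formats values; the bits in the register are the same in each case.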

## Instructions

Some ARM instructions differ from the x86 ones covered in lecture. One difference is that many instructions have an explicit destination register rather than an implicit one. For example, the addition instruction `add` takes 3 operands instead of 2: the first is the destination register where the result will be stored, and the next two are the values you want to add together. To store the result of `r1 + r2` in the register `r0`, you would use:

```asm
add r0, r1, r2
```
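The three-operand shape also allows the destination to double as a source, and the last operand to be an immediate value. A short sketch, assuming the emulated setup used earlier in this notebook:

```asm
.global main

main:
    mov r1, #3
    mov r2, #4
    add r0, r1, r2   /* r0 = r1 + r2 = 7 */
    add r0, r0, #1   /* r0 = r0 + 1 = 8: destination is also a source, last operand is an immediate */
    add r0, r0, r0   /* r0 = 8 + 8 = 16 */
```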

## Branching

In x86, the branch instructions use the “jump” nomenclature: conditional ones begin with `j`, and `jmp` is the unconditional branch. In ARM assembly, you branch with the instruction `b`, optionally followed by a condition suffix. You still use the `cmp` instruction to compare two numbers; only the branch instruction changes. For example, an unconditional jump to the label `exit` would use:

```asm
b exit
```

A jump to the label `top` that is taken only when the value in register `r0` is less than 0 would use:

```asm
cmp r0, #0
blt top
```

The following table shows the different conditions for x86 and ARM (we’ll stick to signed number comparisons):

| Condition                    | x86 Instruction | ARM Instruction |
| ---------------------------- | --------------- | --------------- |
| equal (==)                   | je              | beq             |
| not equal (!=)               | jne             | bne             |
| less than (<)                | jl              | blt             |
| less than or equal to (≤)    | jle             | ble             |
| greater than (>)             | jg              | bgt             |
| greater than or equal to (≥) | jge             | bge             |

## Exercise #1

Almost every processor can do some basic arithmetic computations using the integer registers, and ARM processors are no exception. You can **add** two registers. Let's revisit the `add` example from above:

```asm
.global main

main:
    mov r1, #3
    mov r2, #4
    add r0, r1, r2

>>> show registers[0-2]
```

If we assemble and run this program, `r0` will contain `7`.

## Exercise #2

Write an assembly language program that loops through the numbers 1 through 10, adding each to a running total, and exits the loop once all ten numbers have been added. When the loop is complete, the total should be left in a register, as in the previous exercise and the examples above, so that it can be examined with the `show registers` command. The result should be 55 (1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 + 10 = 55).

```asm
/* Code here */
```