# An Introduction to Assembly

In this lesson, you will learn to read and write assembly code, the instructions that a CPU executes to complete a task. Understanding assembly language is key to understanding how a machine executes a program, and it can help you discover security vulnerabilities. Read through the lesson and answer the questions at the end of each section.

## Part 1: Instruction Sets

What are instruction sets? Think of an instruction set as the list of things a CPU can do. We combine these instructions to create a program. Where other programming languages have syntax for logic and data structures, assembly languages have only this instruction set to modify data. Though assembly lacks the complex language structures of Python and C, it lets us write the instructions that execute directly on the chip. In fact, every compiled program you write is eventually translated to machine code, the binary form of assembly, before it is executed. It is important to understand how these programs work at the assembly level, since this is exactly how they execute on the processor.
		
Different types of processors have different instruction sets, just like different programming languages have different syntax. This means assembly written for an Intel CPU will not run on an ARM processor, and vice versa. Even different CPUs within the same family might not use the same instruction set.
		
Intel CPUs are CISC (Complex Instruction Set Computing) processors, meaning they can execute a larger number of instructions, but that also makes the hardware more complex. CISC processors are often used in PCs, servers, and consoles.
		
ARM processors are RISC (Reduced Instruction Set Computing) processors and have a smaller instruction set. With fewer instructions to choose from, a RISC program often needs more instructions to do the same work and must be written with efficiency in mind, while a CISC program may be shorter, since more operations are available as single instructions.
		
Additionally, CISC processors have many instructions that operate directly on memory, so accessing memory is common. While CISC processors have registers, RISC processors have more of them and operate on values loaded into these registers. Therefore, the only memory access instructions available on a RISC processor are those that load a value into a register or store a register's value to memory.

### Exercise

Why might an embedded device benefit from using a RISC unit over a CISC unit? Additionally, why might a more traditional unit utilize a CISC processor? Can you think of any examples of the opposite occurring?

## Part 2: The ARM Instruction Set
		
Writing a program in assembly is like assembling a set of building blocks. These blocks are the instructions that the CPU can follow. A general instruction looks like this:
		
    MNEMONIC{S}{condition} {Rd}, Operand1, Operand2
		
- MNEMONIC: The short name for the instruction
- {S}: An optional suffix for the instruction. When present, the instruction also updates the status flags based on its result.
- {condition}: An optional qualifier for the instruction. If the condition is met, the instruction is executed.
- {Rd}: Register where the result of the operation (instruction) is stored.
- Operand1: This can be a register or a constant value.
- Operand2: Also a register or constant value.
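
A few concrete instances may make the format easier to read. This is a small sketch in GNU assembler syntax for 32-bit ARM; the registers, the values, and the closing breakpoint are assumptions chosen for illustration, and text after an @ sign is a comment.

```asm
.text
.global _start
_start:
    mov   r1, #4          @ MNEMONIC Rd, Operand: r1 = 4
    mov   r2, #6          @ r2 = 6
    add   r0, r1, r2      @ MNEMONIC Rd, Operand1, Operand2: r0 = r1 + r2 = 10
    subs  r3, r0, #10     @ S suffix: r3 = 0, and the status flags are updated
    addeq r3, r3, #1      @ condition suffix: runs only if the last result was zero, so r3 = 1
    bkpt  #0              @ stop here when run under a debugger
```

Note that MOV takes a single operand after the destination register, which is why Rd and the operands are shown as separate parts of the format.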
		
Operands and values will be covered later, but now let's go over some common instructions:

![instructions](instructions.png)
		
With these instructions, you are almost ready to write a program! But first you must understand the structure of an assembly program. You define variables in the top part of the code, in a section labeled .data. The variables declared here don't have predefined memory addresses, but they can be referenced by name in the instructions. The instructions themselves are defined in the .text section. Here is an example program with both of these sections defined:
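
A minimal sketch of such a program, written in GNU assembler syntax for 32-bit ARM, might look like the following. The names var1, adr_var1, and _start, the initial value 5, and the closing breakpoint are assumptions for illustration.

```asm
.data                         @ variables are declared in the .data section
var1:     .word 5             @ one 32-bit word, initialised to 5

.text                         @ instructions are placed in the .text section
.global _start
_start:
    ldr r0, adr_var1          @ r0 = address of var1, read from the labeled value below
    ldr r1, [r0]              @ r1 = value stored in var1 (5)
    add r1, r1, #1            @ r1 = 6
    str r1, [r0]              @ write the new value back to var1
    bkpt #0                   @ stop here when run under a debugger

adr_var1: .word var1          @ a word in .text holding the address of var1
```

The assembler and linker decide where var1 ends up in memory; the code only ever refers to it by name.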

### Exercise

Define a .data section that can store a 128-bit symmetric key.

## Part 3: Operands and Values
		
Instructions can operate on constant values or on values held in registers.

A constant is written with a pound sign, like so: #123

A register is written as R followed by its number, like: R1, R2

The number of registers depends on the processor; this information can be found in its datasheet.
		
Each instruction, once translated to machine code, is 32 bits wide. The instruction name and its registers must fit in those 32 bits too, so the space left for a constant operand is limited. If you try to use a constant that cannot be encoded, the assembler will respond with an error. In that case, you can split the value into smaller parts and build it up over more than one instruction. The LDR instruction (mentioned below) can also handle this.
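
As a rough illustration of that limit, the sketch below uses 511, a value that does not fit the constant field of a 32-bit ARM MOV instruction. The registers and values are assumptions for illustration, and the last load relies on the GNU assembler's "ldr register, =value" shorthand, which typically places the value in memory next to the code and loads it from there.

```asm
.text
.global _start
_start:
    mov r0, #255          @ accepted: 255 fits in the space reserved for a constant
    @ mov r1, #511        @ rejected if uncommented: 511 cannot be encoded as a constant operand
    mov r1, #256          @ build 511 in two steps: start with 256 ...
    add r1, r1, #255      @ ... then add 255, so r1 = 511
    ldr r2, =511          @ or let the assembler arrange for 511 to be loaded into r2
    bkpt #0               @ stop here when run under a debugger
```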
		
You may wish to label values so that you can refer to them in multiple places, sort of like a variable. These are called literals, and are generally defined at the bottom of the text section (code). You label them in the same way you might label a subroutine (more on that later). Here is an example of a literal definition:
		
    adr_var1: .word var1
		
This assigns the address of var1 to a word-sized (4-byte) constant called adr_var1.


### Exercise

Write a line of assembly that uses the value stored in R5. You can use any instruction from the graphic above in part 2!

## Part 4: Memory Instructions
		
As mentioned earlier, ARM assembly allows you to read information from memory to a register, do some processing on it, and store it back in memory. This section will go over how to use the load and store instructions to achieve this. 
		
First, look at this example of incrementing a value from memory. The address of the value is currently stored in R0 (Register 0).
		
    ldr r1, [r0]        @ r1 = value at the address held in r0
    add r1, r1, #1      @ increment the value
    str r1, [r0]        @ write it back to the same address
		
The brackets are used to denote that the register holds an address that we want to read from or write to.
		
There are additional forms of addressing:
		
- Offset, [r1, #2]: The address used is the value in r1 plus 2, but r1 itself remains unmodified.
- Pre-indexed, [r1, #4]!: Like offset, the address used is the value in r1 plus 4, but r1 is also updated to r1+4.
- Post-indexed, [r1], #4: The address used is the value held in r1, and after the instruction r1 is updated to r1+4.
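
The three forms can be compared side by side. The sketch below assumes a hypothetical array named values in the .data section and uses the GNU assembler's "ldr register, =label" shorthand to obtain its address; both are assumptions for illustration.

```asm
.data
values: .word 10, 20, 30, 40     @ four words at consecutive addresses

.text
.global _start
_start:
    ldr r1, =values       @ r1 = address of the first word
    ldr r0, [r1, #4]      @ offset: r0 = 20, r1 still points at the first word
    ldr r2, [r1, #4]!     @ pre-indexed: r2 = 20, r1 now points at the second word
    ldr r3, [r1], #4      @ post-indexed: r3 = 20, then r1 moves on to the third word
    ldr r4, [r1]          @ r4 = 30, read through the updated r1
    bkpt #0               @ stop here when run under a debugger
```

All three indexed loads read the same word here; what differs is whether r1 changes, and whether it changes before or after the access.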

### Exercise

Determine the values stored in each register (R1, R2, R3) from this set of assembly instructions

## Part 5: Conditionals
	
Earlier, I mentioned that assembly differs from a programming language like C because it lacks C's program structure. Assembly does have structure, though. In fact, it can express the same logic as any high-level programming language through conditional execution and branching.
		
Each ARM processor has a Current Program Status Register (CPSR) that tracks the current program status. It contains status bits that record, among other things, whether an operation such as an addition produced a carry, whether an overflow occurred during an operation, and whether an operation resulted in 0. This last status bit is known as the zero bit, or zero flag. It is set to 1 when an operation produces a result of zero. For example, to check whether two values are equal, you can subtract one from the other. If the result is zero, the zero flag is set to 1. If the values are not equal, the zero flag is set to 0. The program can then make a decision based on this zero flag.
		
ARM processors use a set of condition codes that check these status bits. If the condition is met, the instruction executes; otherwise it is skipped. A condition code is added as a suffix to the instruction that depends on the condition. Here is a list of the possible condition codes in ARM assembly:

![conditionals](conditionals.png)

To use conditionals, compare two values using the CMP instruction. In this example, the value in r0 is compared to the value 3 twice. Each CMP sets flags in the CPSR, and the instructions that follow, like ADDLT, use these flags to decide whether to execute.
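
A sketch consistent with that description is shown here in GNU assembler syntax for 32-bit ARM. The starting value of 2 in r0 and the closing breakpoint are assumptions for illustration.

```asm
.text
.global _start
_start:
    mov   r0, #2          @ assumed starting value
    cmp   r0, #3          @ 2 - 3 is negative, so the "less than" condition holds
    addlt r0, r0, #1      @ executes: r0 = 3
    cmp   r0, #3          @ 3 - 3 is zero: the zero flag is set to 1, "less than" no longer holds
    addlt r0, r0, #1      @ skipped: r0 stays 3
    bkpt  #0              @ stop here when run under a debugger
```

CMP performs the subtraction only to set the flags; it does not store the result in a register.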

### Exercise 

Create a program that compares the values of two registers (R1 & R2). If R1's value is larger, add R2's value to R1's. If R2's value is larger, subtract R1's value from R2's. If they are equal, both registers should be set to the value 0.

## Part 6: Branches and Subroutines
		
Branching allows a program to "jump" to another part of the code and start executing instructions from there. The B instruction takes a memory address, usually written as a label. Just as when you call a function in a C program, the next instruction executed is no longer the one on the following line; it is somewhere else in the program.
		
You can define named sections of assembly code, much like a function, making them easier to jump to. Such a section is called a subroutine. A program has a main routine and a set of other subroutines. In reality, a subroutine is just a memory address at which to start executing code, but you can reference it by name, like this:
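
A minimal sketch of a branch to a named subroutine, in GNU assembler syntax for 32-bit ARM, could look like this. The label set_two and the values are hypothetical.

```asm
.text
.global _start
_start:
    mov r0, #1
    b   set_two           @ jump to the code labeled set_two
    mov r0, #99           @ never executed: the branch skips over this line

set_two:                  @ the label is just a name for this memory address
    mov r0, #2            @ r0 = 2
    bkpt #0               @ stop here when run under a debugger
```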

Branches do not need to be conditional; they can simply jump to another part of the code. However, what if you need to return to the point just after the branch? This is where the Link Register (LR) comes into play. When you use the B instruction with an L suffix (BL), the memory address of the instruction following the branch is stored in the LR. When the subroutine finishes, it branches to the address held in the LR, and the program continues from there.
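
The following sketch shows a call and return using the LR, in GNU assembler syntax for 32-bit ARM. The subroutine name double_r0 and the values are hypothetical; the return is written as BX LR, which branches to the address held in the LR.

```asm
.text
.global _start
_start:
    mov r0, #5
    bl  double_r0         @ LR = address of the next instruction, then jump to double_r0
    add r0, r0, #1        @ execution resumes here after the return: r0 = 11
    bkpt #0               @ stop here when run under a debugger

double_r0:
    add r0, r0, r0        @ r0 = 10
    bx  lr                @ return by branching to the address held in LR
```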

### Exercise

Re-write the solution to exercise 5 to use subroutines. That is, move each result into its own subroutine and jump to that subroutine based on the result of the comparison.

## Part 7: The Stack
		
When a computer is executing a program, it may not know in advance how much space it will need to store all its variables and information. To solve this, it uses a growing data structure called the stack. A stack is able to consume information and, when asked, will return the most recent thing it was given. Think about a stack of plates: you can add more and more plates, but you can only remove the top plate, then the next, and so on. These operations are called "pushing" onto the stack and "popping" off the stack.
		
In reality, a stack is just a sequential space in memory, sort of like an array. Since we have a limited number of registers, we can't remember the address of every piece of data we add, but we can give the stack that data and come back to it when we need it. To achieve this, we use a register called the Stack Pointer (SP). When we push a value onto the stack, the stack pointer increases or decreases to the next available memory address, depending on whether the stack grows up or down.
		
This is an example program that shows how we can use the stack to temporarily store data, then recover it when we need it:
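
A minimal sketch of such a program, in GNU assembler syntax for 32-bit ARM, is given here. It assumes the environment has already set the SP to a valid stack area before _start runs; the values are chosen for illustration.

```asm
.text
.global _start
_start:
    mov  r0, #2           @ the value we want to keep
    push {r0}             @ store r0 on the stack; SP moves to the next free slot
    mov  r0, #3           @ r0 is overwritten while we do other work
    pop  {r0}             @ recover the saved value: r0 = 2 again, SP moves back
    bkpt #0               @ stop here when run under a debugger
```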

Another aspect of the stack is the Frame Pointer (FP). When the assembly code jumps to another function, a section of the stack, called a frame, is set aside for any local variables created in that function. The FP points to the bottom of the frame, and the frame grows from there. The return address from the LR is saved alongside it to assist in returning to the calling function. The memory in the frame is used to store local values, which can be referenced using an offset from the FP. Creating frames on the stack helps functions stay organized.
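
One way a function might set up and tear down a frame is sketched below in GNU assembler syntax for 32-bit ARM. The exact layout depends on the compiler and calling convention, so treat this as one possible arrangement rather than a fixed rule. The name func, the 8 bytes of local space, and the value 7 are assumptions, and the SP is assumed to be valid on entry.

```asm
.text
.global _start
_start:
    bl   func             @ call func; LR holds the return address
    bkpt #0               @ back in the caller with r0 = 7

func:
    push {fp, lr}         @ save the caller's FP and the return address on the stack
    mov  fp, sp           @ FP now marks the base of this function's frame
    sub  sp, sp, #8       @ set aside 8 bytes of the frame for local values
    mov  r0, #7
    str  r0, [fp, #-4]    @ store a local value at an offset from FP
    ldr  r0, [fp, #-4]    @ read it back through the same offset
    mov  sp, fp           @ release the local space
    pop  {fp, pc}         @ restore the caller's FP and return via the saved address
```

Because the return address was saved on the stack, func could safely call another subroutine with BL without losing its own way back.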

### Exercise

How might the stack be used to save the value of one of the registers we have talked about so far? Think about how subroutines must return even if a program is multiple subroutines deep. Next, describe what this segment of code does:

## Review
a. ARM instruction sets are simpler than Intel instruction sets, and use registers to hold information.

b. ARM instructions are 32 bits wide and follow the format: MNEMONIC{S}{condition} {Rd}, Operand1, Operand2.

c. Immediate values are denoted with a pound sign; registers are written as an r followed by a number.

d. You can load and store information from memory to registers and vice versa using the LDR and STR instructions.

e. The CPSR is used by conditional instructions to determine execution.

f. The CMP instruction will impact the CPSR.

g. Branching conditionally allows a program to take multiple paths.

h. The stack grows as more values are added to it.

i. Frames are added to the stack with each function call.