

# ARM Assembly – Part 2

Jianjun Chen (Jianjun.Chen@xjtlu.edu.cn)

#### Contents

- Accessing the Main Memory
- Flags and the CPSR Register
  - Setting flags
  - Instructions that read flags
    - Branch instruction
    - Arithmetic with the carry bit.

# Accessing the Main Memory

Preparing data in memory
Store and Load



## Declaring Large Data in Memory

- You can use mov to assign a value to a register.
- The range of immediate values is limited.
- To create larger values, you can use the ORR instruction to combine small parts:

```
mov r0, #0x63000000        ; r0 = 0x63000000
orr r0, r0, #0x00640000    ; r0 = 0x63640000
orr r0, r0, #0x00006500    ; r0 = 0x63646500
orr r0, r0, #0x00000066    ; r0 = 0x63646566
```

- The process is tedious, though.

## Declaring Large Data in Memory

- Instead of combining immediate values, you can declare words holding data, constants, or empty words directly in memory and then load them into registers.

| Opcode | Meaning                                                        | Format                                    |
|--------|----------------------------------------------------------------|-------------------------------------------|
| DCD    | Declare word(s) in memory                                      | name DCD value_1, value_2, ..., value_N   |
| EQU    | Create a symbol in the symbol table. Can also create constants | name EQU expression                       |
| FILL   | Declare empty word(s) in memory                                | {name} FILL N (N must be a multiple of 4) |

- The names of these memory blocks are called symbols.
- DCD, EQU and FILL are called assembler directives.
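
As a minimal sketch of the three directives, the declarations below reuse the symbol names that appear in example 3.1 (integer1, var1, const1, const2). The word values are borrowed from example 3.2, and the size given to var1 is an assumption.

```
integer1  DCD   0x61626364, 0x65666768   ; two consecutive words with initial values
var1      FILL  4                        ; one zero-filled word (4 bytes; size is an assumption)
const1    EQU   0x14                     ; constant only: no memory is allocated
const2    EQU   0x29                     ; constant only: no memory is allocated
```

integer1 and var1 become symbols whose values are memory addresses, while const1 and const2 are symbols whose values are the constants themselves.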

## DCD and FILL

- DCD: Each integer is stored in a separate word.
- FILL: The named words in memory are filled with zeros.



## EQU

- Constants can be found in the "Symbols" window.



## Symbols and Values

- Each symbol is associated with a value.
- If the symbol is created with DCD or FILL, the value is the memory address allocated.
- If the symbol is created with EQU, the value is just a normal constant.



## Getting the Address of a Symbol

- To get the address/value associated with a symbol, use ADR:

ADR{cond} Rd, label

- Example 3.1: after running the three instructions below, the register values change as shown in the second table.
  - adr r0, integer1
  - adr r1, const1
  - adr r2, var1

Symbols window:

| Symbol   | Kind        | Value |
|----------|-------------|-------|
| integer1 | Data symbol | 0x200 |
| var1     | Data symbol | 0x208 |
| const1   | EQU symbol  | 0x14  |
| const2   | EQU symbol  | 0x29  |

Registers window after execution:

| Register | Value |
|----------|-------|
| R0       | 0x200 |
| R1       | 0x14  |
| R2       | 0x208 |

## Loading Data from Memory

- With a memory address in a register, you can load data from that address into a register:

```
LDR{type}{cond} Rt, [Rn {, #offset}]
LDR{type}{cond} Rt, [Rn, #offset]!
LDR{type}{cond} Rt, [Rn], #offset
```

#### Items:

- Rt is the register to load.
- Rn is the register on which the memory address is based.
- #offset is the amount of offset in bytes.
- {type}: can be either B (load a byte) or omitted (load a word).
- {cond}: explained in "Flags and Conditions".

## Storing Data to Memory

- STR stores the content of Rt to main memory:

```
STR{type}{cond} Rt, [Rn {, #offset}]
STR{type}{cond} Rt, [Rn, #offset]!
STR{type}{cond} Rt, [Rn], #offset
```

#### Items:

- Rt is the register holding the data to be stored into the memory.
- Rn is a register on which the memory address is based.
- #offset is an immediate value indicating the offset from the memory address of Rn.
- {type}: Can be either B (store a byte) or omitted (store a word).
- {cond}: explained in "Flags and Conditions".

## Store/Load with Immediate Offset

#### Syntax:

```
STR{type}{cond} Rt, [Rn {, #offset}]
LDR{type}{cond} Rt, [Rn {, #offset}]
```
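
The following sketch shows the immediate-offset form in use. It assumes the VisUAL emulator and reuses the symbol names of example 3.1; the word values come from example 3.2 and the size of var1 is an assumption.

```
integer1  DCD   0x61626364, 0x65666768   ; two words, values as in example 3.2
var1      FILL  4                        ; one zero-filled word (assumed size)

          adr   r0, integer1     ; r0 = address of integer1
          ldr   r1, [r0]         ; load the word at r0       -> r1 = 0x61626364
          ldr   r2, [r0, #4]     ; load the word at r0 + 4   -> r2 = 0x65666768
          adr   r3, var1         ; r3 = address of var1
          str   r2, [r3]         ; store r2 at var1          -> var1 = 0x65666768
```

The base registers r0 and r3 keep their values, because the plain immediate-offset form does not write the computed address back.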







## Store/Load with Immediate Offset

- See also example 3.1.
- LDRB clears the whole register, then loads the byte from memory into the least significant byte of the register.
- STRB writes the least significant byte of a register to the target memory byte.
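
A small sketch of the byte variants, assuming VisUAL and little-endian byte order (the least significant byte of a word sits at the lowest address). The symbol names follow example 3.1; the size of var1 is an assumption.

```
integer1  DCD   0x61626364       ; one word
var1      FILL  4                ; one zero-filled word (assumed size)

          adr   r0, integer1     ; r0 = address of integer1
          ldr   r1, [r0]         ; whole word               -> r1 = 0x61626364
          ldrb  r2, [r0, #3]     ; byte at r0 + 3           -> r2 = 0x00000061 (upper 24 bits cleared)
          adr   r3, var1         ; r3 = address of var1
          strb  r1, [r3]         ; only the byte 0x64 of r1 is written -> var1 = 0x00000064
```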



## LDR and STR: More {type}

- Only LDR/STR and LDRB/STRB are supported by VisUAL.
- Other modes are available on real ARM processors.

Table A4-12: Load/store instructions

| Data type                | Load  | Store | Load unprivileged | Store unprivileged | Load-Exclusive | Store-Exclusive |
|--------------------------|-------|-------|-------------------|--------------------|----------------|-----------------|
| 32-bit word              | LDR   | STR   | LDRT              | STRT               | LDREX          | STREX           |
| 16-bit halfword          | -     | STRH  | -                 | STRHT              | -              | STREXH          |
| 16-bit unsigned halfword | LDRH  | -     | LDRHT             | -                  | LDREXH         | -               |
| 16-bit signed halfword   | LDRSH | -     | LDRSHT            | -                  | -              | -               |
| 8-bit byte               | -     | STRB  | -                 | STRBT              | -              | STREXB          |
| 8-bit unsigned byte      | LDRB  | -     | LDRBT             | -                  | LDREXB         | -               |
| 8-bit signed byte        | LDRSB | -     | LDRSBT            | -                  | -              | -               |
| Two 32-bit words         | LDRD  | STRD  | -                 | -                  | -              | -               |
| 64-bit doubleword        | -     | -     | -                 | -                  | LDREXD         | STREXD          |

## LDR and Memory Alignment

- Example 3.2:
  - The memory address used by LDR must be aligned to words (it must be a multiple of 4).
  - The offset value must also be a multiple of 4:

```
ldr r2, [r0, #4]    ; offset is 4: correct usage
ldr r2, [r0, #1]    ; offset is 1: incorrect usage
```

## LDRB and Memory Alignment

- Example 3.2:

| Symbol   | Address | Value      |
|----------|---------|------------|
| integer1 | 0x200   | 0x61626364 |
|          | 0x204   | 0x65666768 |

- The memory address used by LDRB only needs to be aligned to bytes, so any address is valid.
- The offset value can be any integer.



## STR and Memory Alignment

- Example 3.2:
  - The memory address used by STR must be aligned to words (it must be a multiple of 4).
  - The offset value must also be a multiple of 4.

## STRB and Memory Alignment

- Example 3.2:
  - The memory address used by STRB only needs to be aligned to bytes, so any address is valid.
  - The offset value can be any integer.
  - See example 3.2 for details.

## S/L with Pre-Indexed Immediate Offset

#### Syntax:

```
STR{type}{cond} Rt, [Rn, #offset]!
LDR{type}{cond} Rt, [Rn, #offset]!
```

#### Mechanism:

- The offset is added to or subtracted from the base register to form the memory address.
- The base register is then updated with this new address, to permit automatic indexing through an array or memory block.
- See example 3.3
  - Works like Array[++x].
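
The pre-indexed mechanism can be sketched as follows, assuming VisUAL. The array symbol and its contents are hypothetical and chosen only for illustration.

```
array     DCD   10, 20, 30       ; hypothetical three-word array

          adr   r0, array        ; r0 = address of array[0]
          ldr   r1, [r0, #4]!    ; r0 = r0 + 4 first, then r1 = array[1] = 20
          ldr   r2, [r0, #4]!    ; r0 = r0 + 4 again, then r2 = array[2] = 30
```

After both loads, r0 holds the address of array[2]: the base register is updated before each access, which is why array[0] is never read here.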

## S/L with Post-Indexed Immediate Offset

#### Syntax:

- STR{type}{cond} Rt, [Rn], #offset
- LDR{type}{cond} Rt, [Rn], #offset

#### Mechanism:

- The value of the base register alone is used as the memory address.
- The offset is then added to or subtracted from the base register, and this value is stored back in the base register, to permit automatic indexing through an array or memory block.
- See example 3.3
  - Works like Array [x++].
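
The post-indexed mechanism can be sketched in the same way, again assuming VisUAL and a hypothetical array symbol.

```
array     DCD   10, 20, 30       ; hypothetical three-word array

          adr   r0, array        ; r0 = address of array[0]
          ldr   r1, [r0], #4     ; r1 = array[0] = 10, then r0 = r0 + 4
          ldr   r2, [r0], #4     ; r2 = array[1] = 20, then r0 = r0 + 4
          ldr   r3, [r0], #4     ; r3 = array[2] = 30, then r0 = r0 + 4
```

Because the access uses the old value of r0 and the update happens afterwards, this form walks the array from its first element.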

## LDR Pseudo Instruction

- You can use the LDR pseudo instruction to obtain the address of a label or to set a large value.
- The ARM assembler automatically converts each LDR pseudo instruction into its corresponding "true" instruction(s) at assembly time.
- See example 3.3.
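
A short sketch of both uses of the LDR pseudo instruction, assuming the `=` form is accepted by the assembler in use (as the slide states for example 3.3). The symbol name and the constant are illustrative only.

```
integer1  DCD   0x61626364       ; one word of data

          ldr   r0, =integer1    ; pseudo instruction: r0 = address of integer1
          ldr   r1, =0x63646566  ; pseudo instruction: r1 = a constant that a single mov cannot encode
          ldr   r2, [r0]         ; ordinary LDR: r2 = 0x61626364
```

The second line replaces the mov/orr sequence shown at the start of this part with a single instruction.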

## Other Variants

- There are more variants of STR/LDR supported on real ARM CPUs.

Check the official instruction sets:

https://www.keil.com/support/man/docs/armasm/armasm/dom1361289906890.htm

# Condition Flags and Related Instructions

Flags, Branch, Condition Codes

## Flags

- Flags are small pieces of memory (bits) that indicate some status.
  - They are stored in the CPSR and SPSR registers.
- The bits are set according to the most recently executed ALU instruction that has the special "s" suffix.
- Flags are kept until another s-suffix instruction is executed.
- Some other instructions, such as CMP, can also set flags.
- These bits can be used for conditional execution of subsequent instructions.

## Flags

N: Negative. The N flag is set by an instruction if the result is negative. In practice, N is set to the two's complement sign bit of the result (bit 31).

Z: Zero. The Z flag is set if the result of the flag-setting instruction is zero. For example, 0x7 - 0x7 = 0.

C: Carry (or Unsigned Overflow). The C flag is set when an operation results in a carry, or when a subtraction results in no borrow.

V: Overflow (or Signed Overflow). The V flag is set when an operation results in a signed overflow.

## Instructions that Set Flags (s suffix)

- Compare (CMP):

```
CMP{cond} Rn, #imm32
CMP{cond} Rn, Rm
```

- Compare negative (CMN):

```
CMN{cond} Rn, #imm32
CMN{cond} Rn, Rm
```

#### Mechanism:

- CMP calculates the result of (Rn - #imm32) or (Rn - Rm).
- If the result is negative, set the N flag to 1.
- If the result is zero, set the Z flag to 1.
- If the calculation produces a carry (for a subtraction: no borrow), set the C flag to 1.
- If the calculation overflows, set the V flag to 1.
- A similar instruction is called compare negative (CMN):
  - Calculates (Rn + #imm32) or (Rn + Rm).
  - Then sets the NZCV flags.
## Instructions that Set Flags

#### See example 3.4:

- 0xffffffff - 0x00000001 -> N and C
  - = 0xfffffffe
- 0xffffffff + 0x00000001 -> Z and C
  - = 0x0
  - If treated as unsigned numbers, the calculation overflows.
- 0x00000001 + 0x00000001 -> None set!
  - = 0x2
- 0x0000000f + 0x7fffffff -> N and V
  - = 0x8000000e (a negative number)
  - 0x7fffffff is the largest representable positive number.
  - If the operands are treated as signed numbers, the result is negative, so there is an overflow. That is why V is set.
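
The four cases above can be tried with a sketch like the one below. It is not the original example 3.4; the register choices are arbitrary, and MVN is used only as one convenient way to build 0xffffffff and 0x7fffffff.

```
          mvn   r0, #0           ; r0 = 0xffffffff
          subs  r1, r0, #1       ; 0xffffffff - 1 = 0xfffffffe    -> N and C set
          adds  r2, r0, #1       ; 0xffffffff + 1 = 0x0           -> Z and C set
          mov   r3, #1           ; r3 = 0x00000001
          adds  r4, r3, #1       ; 1 + 1 = 0x2                    -> no flag set
          mvn   r5, #0x80000000  ; r5 = 0x7fffffff
          adds  r6, r5, #0xf     ; 0x7fffffff + 0xf = 0x8000000e  -> N and V set
```

Stepping through one instruction at a time shows the flags being overwritten by each s-suffix instruction and left unchanged by mov and mvn.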

## Signed Overflow & Unsigned Overflow

- 0x7fffffff is the largest representable positive number when values are treated as signed, so 0x7fffffff + 0x7fffffff = 0xfffffffe triggers a signed overflow.
- But if 0x7fffffff is treated as an unsigned number, the result does not overflow.
  - Unsigned overflow is indicated by the carry bit being set.

## Instructions that Use Flags (cond)

- Branching to a target address:

B{cond} label

- Items:
  - Branch redirects CPU execution to a new position indicated by a label.
  - A label is a symbol that represents the memory address of an instruction or data.
  - {cond}: we have seen it many times; it will be explained soon.
- Example:
  - An unconditional branch back to an earlier label forms an infinite loop.


## Condition Codes

| Code            | Meaning (for CMP or SUBS)               | Flags Tested         |
|-----------------|-----------------------------------------|----------------------|
| eq              | Equal                                   | Z==1                 |
| ne              | Not equal                               | Z==0                 |
| cs or hs        | Unsigned higher or same (or carry set)  | C==1                 |
| cc or lo        | Unsigned lower (or carry clear)         | C==0                 |
| mi              | Negative (minus)                        | N==1                 |
| pl              | Positive or zero (plus)                 | N==0                 |
| vs              | Signed overflow (V set)                 | V==1                 |
| vc              | No signed overflow (V clear)            | V==0                 |
| hi              | Unsigned higher                         | (C==1) && (Z==0)     |
| ls              | Unsigned lower or same                  | (C==0) \|\| (Z==1)   |
| ge              | Signed greater than or equal            | N==V                 |
| lt              | Signed less than                        | N!=V                 |
| gt              | Signed greater than                     | (Z==0) && (N==V)     |
| le              | Signed less than or equal               | (Z==1) \|\| (N!=V)   |
| al (or omitted) | Always executed                         | None tested          |

## Branch

- You can apply any one of the condition codes to the {cond} field of an instruction (that supports {cond}).
- How can we redesign the program so that it finishes when r0 = r1?



- The answer is in example 3.5.
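
One possible shape of such a loop is sketched below. This is not the original example 3.5: the label name and the start values of r0 and r1 are hypothetical.

```
          mov   r0, #0           ; counter (hypothetical start value)
          mov   r1, #5           ; limit (hypothetical)
loop      add   r0, r0, #1       ; r0 = r0 + 1
          cmp   r0, r1           ; compute r0 - r1 and set the flags
          bne   loop             ; Z == 0 (r0 != r1): branch back to loop
                                 ; execution continues here once r0 == r1
```

CMP sets Z only when r0 - r1 is zero, so the `ne` condition keeps the loop running until the two registers are equal.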


## BL & BX

- The previous solution works for a small task.
- Now consider a function that calculates the sum of three numbers:
  - These three numbers are stored in r0, r1 and r2.
  - The sum should be put into r5.
  - We wish to call this function whenever needed.
- With branch (see the second part of example 3.5):
  - We can jump to the sum function with its label.
  - But how can we make sure the program returns to where the sum function was called if it is called from multiple places?
  - We could mark the caller in a register, then use if/else in the sum function to jump back to the corresponding position. (An elegant solution?)

## BL & BX

- Solution: use "Branch with Link" together with "Branch and Exchange".

BL{cond} label

- The BL instruction causes a branch to label, and copies the address of the next instruction into LR (R14, the link register).

BX{cond} Rm

- The BX instruction causes a branch to the address contained in Rm.
  - It also exchanges the instruction set, if required. (This is not supported in VisUAL.)
  - https://www.keil.com/support/man/docs/armasm/armasm_dom1361289866466.htm

## BL & BX

- In general, BL allows a program to remember its previous position before jumping. The position is saved in LR.
- BX allows a program to jump to any point indicated by a register, which can be LR.
- Can you fix the problem of the second part of example 3.5 now?

## Solution for get_sum()

- Check example 3.6.
- However, the VisUAL emulator does not support BX.
  - Alternative way? Change the program counter register directly!
  - See slides later.
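
A minimal sketch of the idea, assuming VisUAL accepts writing LR into PC with mov. This is not the original example 3.6; the input values and the registers r6 and r7 used to keep the results are hypothetical.

```
          mov   r0, #1
          mov   r1, #2
          mov   r2, #3
          bl    get_sum          ; lr = address of the next instruction, then branch
          mov   r6, r5           ; back from the first call: r6 = 6
          mov   r0, #10
          bl    get_sum          ; second call site, same function
          mov   r7, r5           ; back from the second call: r7 = 15
          end                    ; stop here so execution does not fall into get_sum

get_sum   add   r5, r0, r1       ; r5 = r0 + r1
          add   r5, r5, r2       ; r5 = r0 + r1 + r2
          mov   pc, lr           ; return: copy the saved address into the program counter
```

Each BL overwrites LR with its own return address, so the same function returns correctly to both call sites without any if/else bookkeeping.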

## Arithmetic with Carry

- The carry (C) flag is set when an operation results in a carry, or when a subtraction results in no borrow.
  - For an addition, including the comparison instruction CMN, C is set to 1 if the addition produced a carry (that is, an unsigned overflow), and to 0 otherwise.
  - For a subtraction, including the comparison instruction CMP, C is set to 0 if the subtraction produced a borrow (that is, an unsigned underflow), and to 1 otherwise.
  - For non-additions/subtractions that incorporate a shift operation, C is set to the last bit shifted out of the value by the shifter.
  - For other non-additions/subtractions, C is normally left unchanged.

## Carry Bit: Subtraction

- The carry (C) flag is set (to 1) when an operation results in a carry, or when a subtraction results in no borrow.
  - For a subtraction, including the comparison instruction CMP, C is set to 0 if the subtraction produced a borrow (that is, an unsigned underflow), and to 1 otherwise.
  - Produced a borrow: the first number is smaller than the second number.
- Some examples (more can be found in example 3.7):

| Operands                | C bit | Reason                                      |
|-------------------------|-------|---------------------------------------------|
| 10 - 11                 | 0     | 10 is less than 11, so a borrow is produced |
| 10 - 0                  | 1     | There is no borrow                          |
| 0xffffffff - 0x00000001 | 1     | There is no borrow (example 3.4)            |
## Carry Bit: Shift Operations

- For non-additions/subtractions that incorporate a shift operation, C is set to the last bit shifted out of the value by the shifter. Examples:

| Example                                 | C | Reason                                                   |
|-----------------------------------------|---|----------------------------------------------------------|
| mov r0, #0x00000001<br>lsls r0, r0, #1  | 0 | The 0 at the most significant bit is shifted out.        |
| mov r0, #0x00000001<br>lsrs r0, r0, #1  | 1 | The 1 at the least significant bit is shifted out.       |
| mov r0, #0x00000001<br>lsrs r0, r0, #2  | 0 | The 1 at the LSB is shifted out first, followed by a 0.  |
| mov r0, #0x00000001<br>rors r0, r0, #1  | 1 | The 1 at the LSB is rotated out.                         |

- Do more experiments by yourself.

## Arithmetic with Carry Bit

- There are several arithmetic instructions that can include the carry bit as part of their input.

| Name                        | Syntax (items between {} are optional) |
|-----------------------------|----------------------------------------|
| Add with Carry              | ADC{S}{cond} {Rd}, Rn, Operand2        |
| Subtract with Carry         | SBC{S}{cond} {Rd}, Rn, Operand2        |
| Reverse Subtract with Carry | RSC{S}{cond} {Rd}, Rn, Operand2        |

- Operand2 can be another register or an immediate.
- You can use ADC, SBC or RSC to synthesize multiword arithmetic.

## Arithmetic with Carry Bit

#### When the carry bit is 1:

- ADC r0, r1, #2 -> r0 = r1 + 2 + 1
- SBC r0, r1, r2 -> r0 = r1 - r2
- RSC r0, r1, r2 -> r0 = r2 - r1

#### When the carry bit is 0:

- ADC r0, r1, #2 -> r0 = r1 + 2
- SBC r0, r1, r2 -> r0 = r1 - r2 - 1
- RSC r0, r1, r2 -> r0 = r2 - r1 - 1
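
The effect of the carry bit on SBC and ADC can be checked with a sketch like this one. The operand values are hypothetical; CMP is used only to put the carry bit into a known state.

```
          mov   r1, #10
          mov   r2, #3
          cmp   r1, r2           ; 10 - 3: no borrow, so C = 1
          sbc   r0, r1, r2       ; C = 1: r0 = 10 - 3 = 7
          cmp   r2, r1           ; 3 - 10: borrow, so C = 0
          sbc   r0, r1, r2       ; C = 0: r0 = 10 - 3 - 1 = 6
          adc   r3, r1, #2       ; C is still 0: r3 = 10 + 2 = 12
```

SBC and ADC without the s suffix read the carry bit but do not change it, which is why the last instruction still sees C = 0.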

## Multiword Arithmetic Example

- Assume two 64-bit integers are stored in (r0, r1) and (r2, r3). The following two instructions perform a multiword addition:
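
A sketch of the addition, under the assumption that r0 and r2 hold the low words, r1 and r3 hold the high words, and the result goes to (r4, r5). The first four instructions only set up sample operands and are not part of the two-instruction addition itself.

```
          mvn   r0, #0           ; low word of the first number  = 0xffffffff
          mov   r1, #0           ; high word of the first number = 0x00000000
          mov   r2, #1           ; low word of the second number  = 0x00000001
          mov   r3, #0           ; high word of the second number = 0x00000000

          adds  r4, r0, r2       ; add the low words and record the carry in C  -> r4 = 0x0, C = 1
          adc   r5, r1, r3       ; add the high words plus the carry            -> r5 = 0x1
```

The s suffix on the first addition is essential: without it the carry out of the low words would not reach the ADC.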

