# Linker & Loader



#### Assembler

- The assembler turns the assembly language program into an object file.
- Symbol table: a table that maps label names to the addresses of the memory words that the instructions occupy.
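
As a rough illustration of how a symbol table could be built, the Python sketch below walks over a list of source lines and records the address at which each label appears. It assumes that every instruction occupies 4 bytes and that labels end with a colon; the function name `build_symbol_table` and the sample program are hypothetical and not taken from the slides.

```python
def build_symbol_table(lines, base=0, insn_size=4):
    """Map each label to the address of the instruction that follows it.

    Assumptions (not from the slides): labels end with ':' and every
    instruction is insn_size bytes long.
    """
    table = {}
    address = base
    for raw in lines:
        line = raw.split("#", 1)[0].strip()   # drop comments and whitespace
        if not line:
            continue
        if ":" in line:
            label, line = line.split(":", 1)
            table[label.strip()] = address    # label names the current address
            line = line.strip()
        if line:                              # an instruction follows the label
            address += insn_size
    return table


# Hypothetical two-procedure program in the spirit of the example table.
program = [
    "A:    ld  x10, 0(x3)",
    "      jal x1, B",
    "B:    sd  x11, 0(x3)",
    "      jal x1, A",
]
symbols = build_symbol_table(program)
print(symbols)
```

A real assembler would also record labels it cannot resolve yet, which is what the relocation information and the empty symbol table addresses in the object file express.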

| Object file header     |           |                  |            |
|------------------------|-----------|------------------|------------|
|                        | Name      | Procedure A      |            |
|                        | Text size | 100hex           |            |
|                        | Data size | 20hex            |            |
| Text segment           | Address   | Instruction      |            |
|                        | 0         | ld x10, 0(x3)    |            |
|                        | 4         | jal x1, 0        |            |
|                        |           |                  |            |
| Data segment           | 0         | (X)              |            |
|                        |           |                  |            |
| Relocation information | Address   | Instruction type | Dependency |
|                        | 0         | ld               | X          |
|                        | 4         | jal              | B          |
| Symbol table           | Label     | Address          |            |
|                        | X         | _                |            |
|                        | B         | _                |            |

Source instructions for this object file: `ld x10, X` followed by `jal x1, B`.

### Assembler (cont.)

- Pseudoinstruction: a common variation of an assembly language instruction, often treated as if it were an instruction in its own right.
  - `li x9, 123` -> `addi x9, x0, 123`
  - `mv x10, x11` -> `addi x10, x11, 0`
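
The two expansions above can be sketched as a small rewriting function. This is a minimal Python illustration, not how any particular assembler is implemented; the name `expand_pseudo` is hypothetical, and the `li` case only holds for constants that fit in the 12-bit immediate of `addi` (larger constants need more than one real instruction).

```python
def expand_pseudo(instruction):
    """Rewrite the two pseudoinstructions from the list above into addi.

    Anything else is returned unchanged. The parsing is deliberately naive
    and assumes operands are separated by commas.
    """
    mnemonic, _, rest = instruction.strip().partition(" ")
    operands = [op.strip() for op in rest.split(",")] if rest else []
    if mnemonic == "li" and len(operands) == 2:
        rd, imm = operands
        return f"addi {rd}, x0, {imm}"    # assumes imm fits in 12 bits
    if mnemonic == "mv" and len(operands) == 2:
        rd, rs = operands
        return f"addi {rd}, {rs}, 0"      # copy rs into rd by adding zero
    return instruction.strip()


for line in ["li x9, 123", "mv x10, x11", "add x1, x2, x3"]:
    print(line, "->", expand_pseudo(line))
```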

### Linker (Link editor)

- The linker takes all the independently assembled machine language programs and "stitches" them together to produce an executable file that can be run on a computer.
- There are three steps for the linker:
  - 1. Place code and data modules symbolically in memory.
  - 2. Determine the addresses of data and instruction labels.
  - 3. Patch both the internal and external references.

Memory layout:

| Pointer / address                       | Region                 |
|-----------------------------------------|------------------------|
| SP → 0000 003f ffff fff0<sub>hex</sub>  | Stack (grows ↓)        |
|                                         | Dynamic data (grows ↑) |
| x3 → 0000 0000 1000 0000<sub>hex</sub>  | Static data            |
| PC → 0000 0000 0040 0000<sub>hex</sub>  | Text                   |
|                                         | Reserved               |

Object file for Procedure A:

| Object file header     |           |                  |            |
|------------------------|-----------|------------------|------------|
|                        | Name      | Procedure A      |            |
|                        | Text size | 100hex           |            |
|                        | Data size | 20hex            |            |
| Text segment           | Address   | Instruction      |            |
|                        | 0         | ld x10, 0(x3)    |            |
|                        | 4         | jal x1, 0        |            |
|                        |           |                  |            |
| Data segment           | 0         | (X)              |            |
|                        |           |                  |            |
| Relocation information | Address   | Instruction type | Dependency |
|                        | 0         | ld               | X          |
|                        | 4         | jal              | B          |
| Symbol table           | Label     | Address          |            |
|                        | X         | _                |            |
|                        | B         | _                |            |

Object file for Procedure B:

| Object file header     |           |                  |            |
|------------------------|-----------|------------------|------------|
|                        | Name      | Procedure B      |            |
|                        | Text size | 200hex           |            |
|                        | Data size | 30hex            |            |
| Text segment           | Address   | Instruction      |            |
|                        | 0         | sd x11, 0(x3)    |            |
|                        | 4         | jal x1, 0        |            |
|                        |           |                  |            |
| Data segment           | 0         | (Y)              |            |
|                        |           |                  |            |
| Relocation information | Address   | Instruction type | Dependency |
|                        | 0         | sd               | Y          |
|                        | 4         | jal              | A          |
| Symbol table           | Label     | Address          |            |
|                        | Y         | _                |            |
|                        | A         | _                |            |

Executable file after linking Procedure A and Procedure B:

| Executable file header       |              |                 |
|------------------------------|--------------|-----------------|
|                              | Text size    | 300hex          |
|                              | Data size    | 50hex           |
| Text segment                 | Address      | Instruction     |
| (Procedure A)                | 0040 0000hex | ld x10, 0(x3)   |
|                              | 0040 0004hex | jal x1, 252ten  |
|                              |              |                 |
| (Procedure B)                | 0040 0100hex | sd x11, 32(x3)  |
|                              | 0040 0104hex | jal x1, -260ten |
|                              |              |                 |
| Data segment                 | Address      |                 |
|                              | 1000 0000hex | (X)             |
|                              |              |                 |
|                              | 1000 0020hex | (Y)             |
|                              |              |                 |
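
The patched values in the executable (252, -260, and the offset 32) follow from the three linker steps. The Python sketch below reproduces that arithmetic under simplifying assumptions: each object file is reduced to a dictionary with its sizes and relocation entries, register `x3` is assumed to point at the start of static data, and the names `objects`, `link`, and `patches` are hypothetical.

```python
TEXT_BASE = 0x0040_0000   # start of the text segment (from the memory layout)
DATA_BASE = 0x1000_0000   # start of static data; x3 is assumed to point here

# Simplified object files: sizes plus the relocation entries of the example.
objects = [
    {"name": "A", "text_size": 0x100, "data_size": 0x20, "data_label": "X",
     "relocs": [(0, "ld", "X"), (4, "jal", "B")]},
    {"name": "B", "text_size": 0x200, "data_size": 0x30, "data_label": "Y",
     "relocs": [(0, "sd", "Y"), (4, "jal", "A")]},
]


def link(objects):
    # Steps 1 and 2: place the modules one after another and record the
    # absolute address of every procedure and data label.
    symbols, text_start = {}, {}
    text_addr, data_addr = TEXT_BASE, DATA_BASE
    for obj in objects:
        text_start[obj["name"]] = text_addr
        symbols[obj["name"]] = text_addr
        symbols[obj["data_label"]] = data_addr
        text_addr += obj["text_size"]
        data_addr += obj["data_size"]

    # Step 3: patch each reference listed in the relocation information.
    patches = {}
    for obj in objects:
        for offset, kind, target in obj["relocs"]:
            site = text_start[obj["name"]] + offset
            if kind == "jal":
                patches[site] = symbols[target] - site       # PC-relative offset
            else:
                patches[site] = symbols[target] - DATA_BASE  # offset from x3
    return symbols, patches


symbols, patches = link(objects)
for site, value in sorted(patches.items()):
    print(f"{site:08x}: {value}")
```

The jump offsets are relative to the address of the `jal` itself, which is why the call from A to B is positive and the call from B back to A is negative.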

## Loading a Program

- Load from image file on disk into memory
  - 1. Read header to determine segment sizes
  - 2. Create virtual address space
  - 3. Copy text and initialized data into memory
    - Or set page table entries so they can be faulted in
  - 4. Set up arguments on stack
  - 5. Initialize registers (including sp, fp, gp)
  - 6. Jump to startup routine
    - Copies arguments to x10, ... and calls main
    - When main returns, do exit syscall

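```
# Pseudo-assembly sketch of the startup routine
_start_up:
    lw   x10, offset(sp)   # load arguments from the stack into x10, ...
    jal  x1, main          # call main()
    # when main returns, perform the exit system call
    exit
```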
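
The loading steps can also be modeled in a few lines of Python. This is only a toy sketch: the image layout (`header`, `text`, `data`, `args`), the 64 KiB memory, and the choice to place text at address 0 are all assumptions made for the example, and a real loader would set up page tables rather than copy bytes into one array.

```python
def load(image, memory_size=1 << 16):
    """Toy model of the loading steps; all field names are hypothetical."""
    header = image["header"]                      # 1. read the header
    if (len(image["text"]) != header["text_size"]
            or len(image["data"]) != header["data_size"]):
        raise ValueError("segment sizes do not match the header")

    memory = bytearray(memory_size)               # 2. create the address space
    text_end = header["text_size"]
    data_end = text_end + header["data_size"]
    memory[0:text_end] = image["text"]            # 3. copy text ...
    memory[text_end:data_end] = image["data"]     #    ... and initialized data

    sp = memory_size                              # 4. set up arguments on stack
    for arg in reversed(image["args"]):
        sp -= 8
        memory[sp:sp + 8] = arg.to_bytes(8, "little")

    registers = {"sp": sp, "pc": header["entry"]}  # 5. initialize registers
    return memory, registers                       # 6. caller jumps to pc


image = {
    "header": {"text_size": 8, "data_size": 4, "entry": 0},
    "text": bytes(8),                       # placeholder instruction bytes
    "data": (42).to_bytes(4, "little"),     # one initialized data word
    "args": [2, 7],
}
memory, registers = load(image)
print(registers)
```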

### Dynamically Linked Libraries (DLL)

- Disadvantages of traditional statically linked libraries
  - Library updates: an already linked program keeps using the old version of the library
  - The whole library is loaded even if only part of it is used
- Dynamically linked libraries
  - The libraries are not linked and loaded until the program is run.
  - Lazy procedure linkage
    - Each routine is linked only after it is called.

## Lazy Linkage

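Components shown in the figure:

- Indirection table
- Stub: loads the routine ID, then jumps to the linker/loader
- Linker/loader code
- Dynamically mapped code

On the first call to a DLL routine, the `JAL` in the text segment reaches a stub whose `LD` and `JALR` go through the indirection table in the data segment and end up in the linker/loader.

(b) Subsequent calls to DLL routine: the same `JAL`, `LD`, `JALR` sequence goes through the data segment directly to the DLL routine, which returns with `JALR`.

Example with `printf`:

- Text: the program calls `printf();` with `JAL`, which reaches the stub `printf() { LD x5, printf_addr; JALR x0, 0(x5) }`.
- Data, before the first call: `printf_addr .word L1`.
- Text, at `L1`: load the routine ID, then `JAL x1, DLL`.
- Data, after linking: `printf_addr .word 0x400000`.
- Text: the real `printf()` ends with `JALR x0, 0(x1)`; assume it is loaded at `0x400000`.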
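
The indirection-table idea can be mimicked in Python, where a dictionary plays the role of the table and a closure plays the role of the stub. This is a loose analogy under stated assumptions, not how a real dynamic linker is written; the class `LazyLibrary` and the routine `double` are hypothetical.

```python
class LazyLibrary:
    """Toy model of lazy procedure linkage through an indirection table."""

    def __init__(self, routines):
        self._routines = routines      # stands in for the DLL on disk
        self.table = {}                # indirection table: name -> callable
        self.link_count = 0            # how often the linker/loader ran
        for name in routines:
            self.table[name] = self._make_stub(name)

    def _make_stub(self, name):
        def stub(*args):
            # Stub: pass the routine ID to the linker/loader ...
            self._link(name)
            # ... then continue into the routine that is now mapped in.
            return self.table[name](*args)
        return stub

    def _link(self, name):
        # Linker/loader: overwrite the table entry with the real routine.
        self.link_count += 1
        self.table[name] = self._routines[name]

    def call(self, name, *args):
        # Every call is an indirect jump through the table.
        return self.table[name](*args)


lib = LazyLibrary({"double": lambda x: 2 * x})
first = lib.call("double", 21)    # goes through the stub and the linker
second = lib.call("double", 4)    # jumps straight to the routine
print(first, second, lib.link_count)
```

The linker runs once per routine, so only the first call pays the extra cost; later calls pay just the indirect jump.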

# **Starting Java Applications**





#### **MIPS Instructions**

- MIPS: commercial predecessor to RISC-V
- Similar basic set of instructions
  - 32-bit instructions
  - 32 general purpose registers, register 0 is always 0
  - 32 floating-point registers
  - Memory accessed only by load/store instructions
    - Consistent use of addressing modes for all data sizes
- Different conditional branches
  - For <, <=, >, >=
  - RISC-V: blt, bge, bltu, bgeu
  - MIPS: slt, sltu (set less than, result is 0 or 1)
    - Then use beq, bne to complete the branch
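
The difference between the two branch styles can be modeled in Python. In this sketch Python integers stand in for signed register values and each function returns whether the branch would be taken; the function names are hypothetical, and the comments show the instruction each line is meant to mirror.

```python
def slt(rs1, rs2):
    """MIPS-style set-less-than: the result is 1 if rs1 < rs2, else 0."""
    return 1 if rs1 < rs2 else 0


def mips_branch_if_less(a, b):
    t = slt(a, b)        # slt  t, a, b
    return t != 0        # bne  t, zero, target


def riscv_branch_if_less(a, b):
    return a < b         # blt  a, b, target


print(mips_branch_if_less(3, 5), riscv_branch_if_less(3, 5))
```

Both versions take the branch for exactly the same inputs; MIPS needs two instructions and a temporary register where RISC-V needs one instruction.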



# Instruction Encoding

| Format            | ISA    | Fields, most significant first: name(width) [bits]                                                                   |
|-------------------|--------|----------------------------------------------------------------------------------------------------------------------|
| Register-register | RISC-V | funct7(7) [31:25], rs2(5) [24:20], rs1(5) [19:15], funct3(3) [14:12], rd(5) [11:7], opcode(7) [6:0]                  |
|                   | MIPS   | Op(6) [31:26], Rs1(5) [25:21], Rs2(5) [20:16], Rd(5) [15:11], Const(5) [10:6], Opx(6) [5:0]                          |
| Load              | RISC-V | immediate(12) [31:20], rs1(5) [19:15], funct3(3) [14:12], rd(5) [11:7], opcode(7) [6:0]                              |
|                   | MIPS   | Op(6) [31:26], Rs1(5) [25:21], Rs2(5) [20:16], Const(16) [15:0]                                                      |
| Store             | RISC-V | immediate(7) [31:25], rs2(5) [24:20], rs1(5) [19:15], funct3(3) [14:12], immediate(5) [11:7], opcode(7) [6:0]        |
|                   | MIPS   | Op(6) [31:26], Rs1(5) [25:21], Rs2(5) [20:16], Const(16) [15:0]                                                      |
| Branch            | RISC-V | immediate(7) [31:25], rs2(5) [24:20], rs1(5) [19:15], funct3(3) [14:12], immediate(5) [11:7], opcode(7) [6:0]        |
|                   | MIPS   | Op(6) [31:26], Rs1(5) [25:21], Rs2(5) [20:16], Const(16) [15:0]                                                      |
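
To make the register-register formats concrete, the Python sketch below packs field values into a 32-bit word, most significant field first. The helper names (`pack`, `encode_riscv_rtype`, `encode_mips_rtype`) are hypothetical; the opcode and function-code values used in the example (`0x33` for RISC-V `add`, `0x20` for MIPS `add`) come from the respective ISA definitions rather than from these slides.

```python
def pack(fields):
    """Pack (value, width) pairs into one word, most significant field first."""
    word = 0
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"value {value} does not fit in {width} bits")
        word = (word << width) | value
    return word


def encode_riscv_rtype(funct7, rs2, rs1, funct3, rd, opcode):
    # funct7(7) rs2(5) rs1(5) funct3(3) rd(5) opcode(7)
    return pack([(funct7, 7), (rs2, 5), (rs1, 5), (funct3, 3), (rd, 5), (opcode, 7)])


def encode_mips_rtype(op, rs1, rs2, rd, const, opx):
    # Op(6) Rs1(5) Rs2(5) Rd(5) Const(5) Opx(6)
    return pack([(op, 6), (rs1, 5), (rs2, 5), (rd, 5), (const, 5), (opx, 6)])


# add x10, x11, x12 in RISC-V and add $10, $11, $12 in MIPS
riscv_add = encode_riscv_rtype(0, 12, 11, 0, 10, 0x33)
mips_add = encode_mips_rtype(0, 11, 12, 10, 0, 0x20)
print(f"{riscv_add:08x} {mips_add:08x}")
```

Both formats spend 32 bits, but RISC-V keeps the source and destination registers in the same bit positions across formats, while MIPS moves the destination depending on the format.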



#### The Intel x86 ISA

- Evolution with backward compatibility
  - 8080 (1974): 8-bit microprocessor
    - Accumulator, plus 3 index-register pairs
  - 8086 (1978): 16-bit extension to 8080
    - Complex instruction set (CISC)
  - 8087 (1980): floating-point coprocessor
    - Adds FP instructions and register stack
  - 80286 (1982): 24-bit addresses, MMU
    - Segmented memory mapping and protection
  - 80386 (1985): 32-bit extension (now IA-32)
    - Additional addressing modes and operations
    - Paged memory mapping as well as segments








- Further evolution...
  - i486 (1989): pipelined, on-chip caches and FPU
    - Compatible competitors: AMD, Cyrix, ...
  - Pentium (1993): superscalar, 64-bit datapath
    - Later versions added MMX (Multi-Media eXtension), about 57 instructions
    - The infamous FDIV bug
  - Pentium Pro (1995), Pentium II (1997)
    - New microarchitecture (see Colwell, The Pentium Chronicles)
  - Pentium III (1999)
    - Added SSE (Streaming SIMD Extensions), about 70 instructions, with associated 128-bit registers
  - Pentium 4 (2001)
    - New microarchitecture
    - Added SSE2, about 144 instructions




- And further...
  - AMD64 (2003): extended architecture to 64 bits
  - EM64T, Extended Memory 64 Technology (2004)
    - AMD64 adopted by Intel (with refinements)
    - Added SSE3, about 13 instructions
  - Intel Core (2006)
    - Added SSE4 (about 54 instructions) and virtual machine support
  - AMD64 (announced 2007): SSE5 instructions
    - Intel declined to follow, instead...
  - Advanced Vector Extensions (AVX, announced 2008)
    - Longer SSE registers (256 bits) and about 128 new instructions
- If Intel didn't extend with compatibility, its competitors would!
  - Technical elegance ≠ market success



## **Basic x86 Registers**





## **Basic x86 Addressing Modes**

#### Two operands per instruction

| Source/dest operand | Second source operand |
|---------------------|-----------------------|
| Register            | Register              |
| Register            | Immediate             |
| Register            | Memory                |
| Memory              | Register              |
| Memory              | Immediate             |

#### Memory addressing modes

- Address in register
- Address = $R_{\text{base}} + \text{displacement}$
- Address = $R_{\text{base}} + 2^{\text{scale}} \times R_{\text{index}}$ (scale = 0, 1, 2, or 3)
- Address = $R_{\text{base}} + 2^{\text{scale}} \times R_{\text{index}} + \text{displacement}$
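
The most general of these modes can be written as a small Python function. This is a sketch of the arithmetic only; the function name `effective_address` is hypothetical, and the wrap-around to 32 bits is an assumption made to match IA-32 address width.

```python
def effective_address(base, index=0, scale=0, displacement=0):
    """Address = base + 2**scale * index + displacement, modulo 2**32."""
    if scale not in (0, 1, 2, 3):
        raise ValueError("scale must be 0, 1, 2, or 3")
    # Shifting left by scale multiplies the index by 1, 2, 4, or 8.
    return (base + (index << scale) + displacement) & 0xFFFF_FFFF


# Element 3 of an array of 4-byte items that starts 8 bytes past the base.
address = effective_address(base=0x1000, index=3, scale=2, displacement=8)
print(hex(address))
```

Setting `index` and `displacement` to zero gives the plain "address in register" mode, so the other three modes are special cases of this one.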



# x86 Instruction Encoding



## **Implementing IA-32**

- Complex instruction set makes implementation difficult
  - Hardware translates instructions to simpler microoperations
    - Simple instructions: 1-to-1
    - Complex instructions: 1-to-many
  - Microengine similar to RISC
  - Market share makes this economically viable
- Comparable performance to RISC
  - Compilers avoid complex instructions





### **Effect of Compiler Optimization**

Compiled with gcc for Pentium 4 under Linux









### **Effect of Language and Algorithm**









### **Fallacies**

- Powerful instruction ⇒ higher performance
  - Fewer instructions required
  - But complex instructions are hard to implement
    - May slow down all instructions, including simple ones
  - Compilers are good at making fast code from simple instructions
- Use assembly code for high performance
  - But modern compilers are better at dealing with modern processors
  - More lines of code ⇒ more errors and less productivity




- Backward compatibility ⇒ instruction set doesn't change
  - But they do accrete more instructions

