# libasm - Learning assembly :
## Working with ASM on the notebook :

In [1]:
mov edx, 2
mov ebx, 16

| Type | Identifier | Value |
|---|---|---|
| Register | EBX | 16 |
| Register | EDX | 2 |


## asmtutor :

In [42]:
%%script bash --no-raise-error
# Create every lesson directory (-p: no error if it already exists)
for i in {1..36}; do
    mkdir -p "Lesson_$i"
done

### Lesson 1 : Hello world !

- Syscalls are a library built into the kernel that provides functions such as reading input from the keyboard and writing to the screen.
- When the program invokes a syscall, its execution is suspended: the kernel contacts the relevant drivers to do the work, then returns control to the program.
- Drivers are called drivers because the kernel literally uses them to drive the hardware.
- We invoke a syscall by loading `EAX` with the number of the function we want to execute (the syscall number, called OPCODE in these notes) and filling the remaining registers with the arguments we want to pass to it. We then request a software *INT*errupt with the `INT` instruction; the kernel takes over and calls the function from its library with our arguments.
- For instance, `sys_exit` is called when `EAX=1`, and `sys_write` when `EAX=4`.
- [Linux Syscalls table](https://chromium.googlesource.com/chromiumos/docs/+/HEAD/constants/syscalls.md#x86-32_bit)

#### Writing our program :
- First we create a `msg` variable in our `.data` section and assign it the string we want to output.
- In the `.text` section we tell the kernel where to begin execution with a global label `_start:` as the program's entry point.
- We'll use the `sys_write` syscall (OPCODE 4) to output our message. It takes 3 arguments, which are loaded into `EDX`, `ECX` and `EBX` before requesting the software interrupt that performs the task.

The arguments passed are as follows :
- `EDX` is loaded with the length (in bytes) of the string.
- `ECX` is loaded with the address of the variable we created in the `.data` section.
- `EBX` is loaded with the file descriptor we want to write to (in this case `STDOUT`, which is 1).

The datatype and meaning of the arguments passed can be found in the function's definition.

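We assemble, link and run the program using the commands below. The segmentation fault after the message is expected at this stage, because the program never calls `sys_exit` (Lesson 2 fixes this) :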

```
~$ nasm -f elf helloworld.asm
~$ ld -m elf_i386 helloworld.o -o helloworld
~$ ./helloworld
Hello World!
Segmentation fault
```

In [7]:
; ------------------------------ Lesson 1 : Hello world ! -------------------------
; Hello World Program - asmtutor.com
; Compile with: nasm -f elf helloworld.asm
; Link with (64 bit systems require elf_i386 option): ld -m elf_i386 helloworld.o -o helloworld
; Run with: ./helloworld

SECTION .data
msg db 'Hello World!', 0Ah ; Assign msg variable with our message string

SECTION .text
global _start

_start:
    mov EAX, 4 ; Load OPCODE into EAX
    mov EBX, 1 ; Load file descriptor (STDOUT) into EBX
    mov ECX, msg ; Load ECX with the address of our str.
    mov EDX, 13 ; Load EDX with the length to write.
    int 80h

Line 8: Invalid argument: 0Ah

This doesn't work, probably because the syntax is wrong for this environment (the notebook's assembly kernel rejects the `0Ah` literal).
Retrying with another syntax, the 64-bit registers, and the [x86_64 syscalls table](https://chromium.googlesource.com/chromiumos/docs/+/HEAD/constants/syscalls.md#x86_64-64_bit) :
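
The differences between the two conventions used in this notebook can be summarised as a small lookup. This is only a sketch: `syscall_info` is a hypothetical helper name, and it covers just the two syscalls used in these lessons, with the numbers taken from the two tables linked above.

```shell
# Syscall number, argument registers and trap instruction per ABI.
# syscall_info is a hypothetical helper; it only knows write and exit.
syscall_info() {
    case "$1:$2" in
        i386:write)   echo "eax=4 args=ebx,ecx,edx instr=int 80h" ;;
        i386:exit)    echo "eax=1 args=ebx instr=int 80h" ;;
        x86_64:write) echo "rax=1 args=rdi,rsi,rdx instr=syscall" ;;
        x86_64:exit)  echo "rax=60 args=rdi instr=syscall" ;;
        *)            echo "unknown"; return 1 ;;
    esac
}

syscall_info i386 write
syscall_info x86_64 write
```

The point to remember is that both the syscall numbers and the argument registers change between the 32-bit and 64-bit conventions, so a program cannot be ported by renaming registers alone.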

In [13]:
; --------------------- Lesson 1 : Attempt to convert above program to a compatible version --------------
section .data
    msg db 'Hello world !'
    
section .text
    global _start
    
_start:
    mov rax, 1      ; syscall number for sys_write on x86-64 goes in rax
    mov rdi, 1      ; 1st argument: file descriptor (STDOUT) goes in rdi
    mov rsi, msg    ; 2nd argument: address of the string goes in rsi
    mov rdx, 13     ; 3rd argument: number of bytes to write goes in rdx
    syscall         ; x86-64 enters the kernel with `syscall` instead of `int 80h`

Line 2: Invalid instruction: section

Still not working. Trying with `%%writefile` instead (and switching the notebook kernel to Python) :

In [20]:
%%writefile Lesson_1/helloworld.asm
; --------------------- Lesson 1 : Attempt to convert above program to a compatible version --------------
section .data
    msg db 'Hello world !'
    
section .text
    global _start
    
_start:
    mov rax, 1      ; syscall number for sys_write on x86-64 goes in rax
    mov rdi, 1      ; 1st argument: file descriptor (STDOUT) goes in rdi
    mov rsi, msg    ; 2nd argument: address of the string goes in rsi
    mov rdx, 13     ; 3rd argument: number of bytes to write goes in rdx
    syscall         ; x86-64 enters the kernel with `syscall` instead of `int 80h`

Overwriting Lesson_1/helloworld.asm


In [43]:
%%script bash --no-raise-error
nasm -f elf64 Lesson_1/helloworld.asm -o Lesson_1/helloworld.o
ld -m elf_x86_64 Lesson_1/helloworld.o -o Lesson_1/helloworld
./Lesson_1/helloworld

bash: line 3: 1079278 Segmentation fault      (core dumped) ./Lesson_1/helloworld


Hello world !
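
The same assemble, link and run steps come back in every lesson, so they can be wrapped in a small shell function. This is a minimal sketch: `build_and_run` is a hypothetical name, it assumes `nasm` and `ld` are on the `PATH`, and it expects a relative path to the `.asm` file. The flags are the ones already used in the cell above.

```shell
# Hypothetical helper: assemble, link and run one lesson's source file.
build_and_run() {
    src=$1                  # e.g. Lesson_1/helloworld.asm (relative path assumed)
    base=${src%.asm}        # strip the .asm suffix
    nasm -f elf64 "$src" -o "$base.o" || return 1
    ld -m elf_x86_64 "$base.o" -o "$base" || return 1
    "./$base"               # the function returns the program's exit status
}

# Usage (not run here, since it needs nasm and the source file):
# build_and_run Lesson_1/helloworld.asm
```

Stopping at the first failing step avoids running a stale binary left over from an earlier build.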

### Lesson 2 : Proper program exit
#### Some more background :
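Syscall for exiting : `sys_exit` (OPCODE on 64-bit : 60).
Once the program is loaded into memory, its instructions are executed sequentially starting from the entry point (`_start` here). Without an explicit exit, execution runs past the last instruction into whatever bytes follow in memory, which is what caused the segmentation fault in Lesson 1.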
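
From the shell's point of view, the difference between a clean exit and a crash shows up in the exit status `$?`. The sketch below uses stand-in child processes rather than the assembled binaries, so it runs without `nasm`. It assumes a POSIX shell, where a process killed by a signal is reported as 128 plus the signal number (SIGSEGV is 11 on Linux, so typically 139).

```shell
# Stand-in for a program that ends with sys_exit(0).
clean=0
sh -c 'exit 0' || clean=$?
echo "clean exit status: $clean"

# Stand-in for a program that crashes: the child sends itself SIGSEGV.
crashed=0
sh -c 'kill -s SEGV $$' 2>/dev/null || crashed=$?
echo "crash exit status: $crashed"
echo "signal number: $((crashed - 128))"
```

Running `echo $?` right after `./Lesson_2/helloworld` gives the same kind of check on the real program: the value loaded into `rdi` before the exit syscall is what the shell reports.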

In [44]:
%%writefile Lesson_2/helloworld.asm
; --------------------- Lesson 2 : Hello world with a proper exit --------------
section .data
    msg db 'Hello world !'
    
section .text
    global _start
    
_start:
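    mov rax, 1      ; syscall number for sys_write on x86-64 goes in rax
    mov rdi, 1      ; 1st argument: file descriptor (STDOUT) goes in rdi
    mov rsi, msg    ; 2nd argument: address of the string goes in rsi
    mov rdx, 13     ; 3rd argument: number of bytes to write goes in rdx
    syscall         ; x86-64 enters the kernel with `syscall` instead of `int 80h`

    mov rax, 60     ; syscall number for sys_exit on x86-64
    mov rdi, 0      ; exit status 0 (no error)
    syscall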

Overwriting Lesson_2/helloworld.asm


In [None]:
%%script bash --no-raise-error
nasm -f elf64 Lesson_2/helloworld.asm -o Lesson_2/helloworld.o
ld -m elf_x86_64 Lesson_2/helloworld.o -o Lesson_2/helloworld
./Lesson_2/helloworld

### Lesson 3 : Calculate string length


In [None]:
%%writefile Lesson_3/helloworld_len.asm
; --------------------- Lesson 3 : small modification of previous lesson --------------
section .data
    msg db 'Hello world, brave new world !'
    
section .text
    global _start
    
_start:
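    mov rax, 1      ; syscall number for sys_write on x86-64 goes in rax
    mov rdi, 1      ; 1st argument: file descriptor (STDOUT) goes in rdi
    mov rsi, msg    ; 2nd argument: address of the string goes in rsi
    mov rdx, 13     ; length still hardcoded to 13, so only "Hello world, " is written
    syscall         ; x86-64 enters the kernel with `syscall` instead of `int 80h`

    mov rax, 60     ; syscall number for sys_exit on x86-64
    mov rdi, 0      ; exit status 0 (no error)
    syscall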
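
The effect of keeping the length hardcoded at 13 can be previewed from the shell before touching the assembly. This sketch assumes an ASCII-only message, so that the character count given by `${#msg}` equals the byte count that `sys_write` expects in `rdx`.

```shell
msg='Hello world, brave new world !'

# Number of characters in the message (equal to bytes for ASCII text).
len=${#msg}
echo "full length: $len"

# What a write limited to 13 bytes would output.
truncated=$(printf '%.13s' "$msg")
echo "first 13 bytes: '$truncated'"
```

The gap between the two values is the reason for computing the length at run time rather than updating the constant by hand each time the string changes.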

In [1]:
%%writefile Lesson_3/helloworld_len.asm
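; ---------------------- Lesson 3 : calculate string length -------------------------

section .data
    msg db "Hello, brave new world !", 0Ah, 0 ; newline, then the nul terminator

section .text
    global _start

_start:
    mov rax, msg            ; rax = address of the first byte of the string

ft_strlen:
    mov rdi, rax            ; rdi = cursor, starts at the beginning of the string

ft_strlen_loop:
    cmp byte [rdi], 0       ; is the byte under the cursor the nul terminator?
    jz ft_strlen_end        ; yes: stop counting
    inc rdi                 ; no: move the cursor to the next byte
    jmp ft_strlen_loop

ft_strlen_end:
    sub rdi, rax            ; rdi = cursor - start = length in bytes

    mov rdx, rdi            ; 3rd argument: the length we just computed
    mov rax, 1              ; syscall number for sys_write
    mov rdi, 1              ; 1st argument: file descriptor (STDOUT)
    mov rsi, msg            ; 2nd argument: address of the string
    syscall

    mov rax, 60             ; syscall number for sys_exit
    mov rdi, 0              ; exit status 0 (no error)
    syscall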

Overwriting Lesson_3/helloworld_len.asm


In [None]:
%%script bash --no-raise-error
nasm -f elf64 Lesson_3/helloworld_len.asm -o Lesson_3/helloworld_len.o
ld -m elf_x86_64 Lesson_3/helloworld_len.o -o Lesson_3/helloworld_len
./Lesson_3/helloworld_len
