[02/17/2022]
1. Arm Assembly
    - Human-readable and writable form of the processor's machine code.
    - Assembler: translates assembly code into machine code.
    - Assembly language lets us write code as instructions, e.g. `ADD r0,r1`.
    - Registers: the processor's variables (16 of them, r0 - r15), plus the CPSR status register
        * r0 - r12 : general purpose, r13 : SP [Stack Pointer], r14 : LR [Link Register], r15 : PC [Program Counter]. CPSR [Current Program Status Register] is a separate register, not one of r0 - r15.
        * r0 - r3 : arguments and return values, not preserved
        * r4 - r11 : local variables, preserved
        * r12 : scratch / IP (intra-procedure-call register), not preserved
        * **Caller-saved regs** (r0-r3, r12) | **Callee-saved regs** (r4-r11, r13/sp, r14/lr)
        <img src="images/callercallee.png" align="center" width="30%" height="30%" />
    <img src="images/assemIns1.png" align="center" width="30%" height="30%" />
    - Labels
        * Mark a location in the code so that a branch can jump directly to it.
    - Directives
    - Data : In assembly every symbol is private (local to the file) unless it is made public, e.g. with the `.global` directive.
    - **To return a value from an assembly function, place it in r0 and then return.**
    - Instruction Types:
        * Data Processing
            * add, sub, mul, and, or
        * Memory
            * ldr(load data to register), str(store register)
        * Control
            * bx, b, beq (branch if equal), blt (branch if less than)
    - Branching :
        * `foo:`
            * `cmp r0,#0 @ compare the value in r0 with 0`
            * `ble else  @ if r0 <= 0, jump to the else part`
            * `add r0,r0`
            * `b end @ skip the else part by jumping to 'end'`
        * `else:`
            * `add r0,r0,#2`
        * `end:`
            * `bx lr`
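
As a rough C view of the `foo` routine above, the sketch below mirrors the compare-and-branch structure. It assumes that the two-operand `add r0,r0` is shorthand for `r0 = r0 + r0`; the names `foo_c` and `foo_c_examples` are hypothetical and only serve the illustration.

```c
#include <assert.h>

/* C view of the foo routine; foo_c is a hypothetical name. */
int foo_c(int x)
{
    if (x > 0) {      /* cmp r0,#0 then ble else: the else part runs when x <= 0 */
        x = x + x;    /* add r0,r0 (assumed to mean r0 = r0 + r0) */
    } else {
        x = x + 2;    /* else: add r0,r0,#2 */
    }
    return x;         /* end: bx lr, the result travels back in r0 */
}

/* Small self-check documenting what each path returns. */
void foo_c_examples(void)
{
    assert(foo_c(3) == 6);    /* positive input takes the "if" path */
    assert(foo_c(0) == 2);    /* zero takes the "else" path */
    assert(foo_c(-5) == -3);  /* negative input takes the "else" path */
}
```

Note that the `b end` in the assembly has no direct C counterpart: it is what the closing brace of the `if` block does implicitly.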
2. Simple Code:
    <img src="images/assemCode2.png" align="center" width="30%" height="30%" />
    * Code:
   <img src="images/assemCode3.png" align="center" width="25%" height="25%" />
    > as -o [filename].o [filename].s <br>
    > file [filename].o <br>
    > gcc -o [outputfile] [filename].o <br>
    > gcc -o [exec] [cpgm].c [assembly].s (compile the C file and link it with the assembly file) <br>
    > (gdb) x/1wx 0x10408 : examine 1 word, in hexadecimal, at address 0x10408
   
3. Combine C with Assembly:
    * Refer add_main.c | add_c.c | add_s.s
    * gcc -o add_main add_main.c add_c.c add_s.s | ./add_main
    * gdb [filename] | (gdb) disassemble add_s | b add_s [add a breakpoint] | l add_s [list the source of add_s] | info registers r0 r1 r2 | r [run] | stepi [execute one machine instruction] | quit
    * gcc -S -o add_c.s add_c.c [generate assembly code] | -S : emit assembly language code instead of machine code | -O3 : optimized code.
    * Refer to mul_main.c
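
A minimal sketch of how the C side could see an assembly routine. The actual contents of add_main.c, add_c.c and add_s.s are not shown in these notes, so the prototype `int add_s(int a, int b)`, the body of `add_c`, and the helper `agrees_with_add_c` are all assumptions made for illustration. Under the convention above, `a` arrives in r0, `b` in r1, and the result comes back in r0.

```c
#include <assert.h>

/* Assumed prototype of the routine implemented in add_s.s.
 * It is only declared here; the definition would come from the assembly file. */
int add_s(int a, int b);

/* Assumed C reference version (what add_c.c might contain). */
int add_c(int a, int b)
{
    return a + b;
}

/* Any function with the same signature, written in C or in assembly. */
typedef int (*add_fn)(int, int);

/* Hypothetical helper: returns 1 when candidate gives the same result as add_c. */
int agrees_with_add_c(add_fn candidate, int a, int b)
{
    assert(candidate != 0);
    return candidate(a, b) == add_c(a, b);
}
```

Once the program is linked together with the assembly file, a call such as `agrees_with_add_c(add_s, 2, 3)` would compare the two implementations.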


[02/22/2022]

## Assembly language:
1. ldr vs mov :
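    - ldr loads a value from memory into the register (and `ldr rX, =label` places the address of `label` in the register), whereas mov copies a value that is encoded in the instruction itself (an immediate) or held in another register.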
    <img src="images/ldrmov.png" align="center" width="30%" height="30%" />
2. Processor :
    - Processor state is the set of values currently held in the registers; the processor also defines how each instruction is executed.
    <img src="images/mem.png" align="center" width="30%" height="30%" />
    - E.g. `mov r0,#2` (one instruction word), `mov r1,#3`, `add r5,r0,r1`
    - The processor reads the PC [Program Counter], which points to the place in memory holding the code to be executed.
    - On a 32-bit processor the word size is 32 bits; registers are one word wide, and each instruction is also one word.
    - Memory is an array of bytes, each addressed by a `byte address`; a group of 4 bytes (a word) is addressed by a `word address`.
    <img src="images/mem2.png" align="center" width="40%" height="40%" />
    - **Byte Order**
        * word = 0x22BB55AA: each pair of hex digits occupies one byte (22, BB, 55, AA), and the four bytes are laid out at consecutive addresses. Big endian stores the most significant byte (22) at the lowest address; little endian stores the least significant byte (AA) at the lowest address.
        * An architecture has to decide which order it'll use.
        * gdb [executable filename], then x/4xb 0xbefff3e[mem addr] to examine 4 bytes in hexadecimal
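
The two byte orders can be checked from C. The sketch below uses the word 0x22BB55AA from the notes; all function names are hypothetical. `host_is_little_endian` inspects the byte at the lowest address, and the two `word_from_*` helpers rebuild the word from bytes listed in address order.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 if the machine running this code stores the least
 * significant byte at the lowest address, 0 otherwise. */
int host_is_little_endian(void)
{
    uint32_t word = 0x22BB55AAu;
    uint8_t bytes[4];
    memcpy(bytes, &word, sizeof word);   /* bytes[0] is the lowest address */
    return bytes[0] == 0xAAu;
}

/* Rebuild a word from 4 bytes given in little-endian address order. */
uint32_t word_from_little_endian(const uint8_t b[4])
{
    return (uint32_t)b[0]
         | ((uint32_t)b[1] << 8)
         | ((uint32_t)b[2] << 16)
         | ((uint32_t)b[3] << 24);
}

/* Rebuild a word from 4 bytes given in big-endian address order. */
uint32_t word_from_big_endian(const uint8_t b[4])
{
    return ((uint32_t)b[0] << 24)
         | ((uint32_t)b[1] << 16)
         | ((uint32_t)b[2] << 8)
         | (uint32_t)b[3];
}

/* Both layouts describe the same word, only the byte order in memory differs. */
void byte_order_examples(void)
{
    const uint8_t little[4] = {0xAA, 0x55, 0xBB, 0x22};
    const uint8_t big[4]    = {0x22, 0xBB, 0x55, 0xAA};
    assert(word_from_little_endian(little) == 0x22BB55AAu);
    assert(word_from_big_endian(big) == 0x22BB55AAu);
}
```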
3. Memory Instructions:
    - Control Instructions : b,ble,beq,bl(call a function) and bx(return from a function)
    - Memory Instructions:
        * ldr : load data into a register(word)
        * str : stores data into memory from a register
        * e.g. `ldr r0,[r1]` -> `r0 = *r1`; `str r0,[r1]` -> `*r1 = r0` (r0 is the source, [r1] is the destination)
        * RISC(Reduced Instruction Set Computer), CISC(Complex Instruction Set Computer).
        * ARM is called a load/store machine because memory can be accessed only through specific instructions (ldr, str).
            - Does this make it slow? Not necessarily: a machine that allowed something like `add r0,[r1],[r2]` would need a much larger instruction set and far more complicated decode logic than a simple load/store machine.
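
To make the load/store idea concrete, here is a sketch of what `*c = *a + *b` turns into when arithmetic may only touch registers. The register names in the comments are illustrative choices, not taken from the course code, and the function names are hypothetical.

```c
#include <assert.h>

/* Adds two values that live in memory and stores the result in memory.
 * The local variables stand in for registers. */
void add_in_memory(const int *a, const int *b, int *c)
{
    int r3 = *a;       /* ldr r3,[r0]  load the first operand  */
    int r12 = *b;      /* ldr r12,[r1] load the second operand */
    r3 = r3 + r12;     /* add r3,r3,r12 works on registers only */
    *c = r3;           /* str r3,[r2]  store the result         */
}

void load_store_examples(void)
{
    int a = 2, b = 3, c = 0;
    add_in_memory(&a, &b, &c);
    assert(c == 5);
    assert(a == 2 && b == 3);   /* the sources are only read */
}
```

One memory-to-memory addition therefore costs four simple instructions instead of one complex one.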
4. SUM ARRAY
    - array += 1 actually adds 4 to the address held in `array`: the pointer advances from one int to the next, i.e. by 4 bytes.
    - See how `int sum_array_c()` translates to assembly.
    <img src="images/memarray.png" align="center" />
    - sum = sum + array[i] -> `ldr r12,[r0,r2,lsl #2]` -> r12 = the word at byte address r0 + (r2 << 2), where r0 holds `array` and r2 holds `i`; the array pointer no longer needs to be updated.
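
The two ways of walking the array can be sketched in C as follows. The function names are hypothetical and the code is not the course's `sum_array_c`. The second version spells out the byte-address arithmetic that `ldr r12,[r0,r2,lsl #2]` performs, assuming a 4-byte `int`.

```c
#include <assert.h>
#include <stddef.h>

/* Version 1: advance the pointer. array += 1 moves it by sizeof(int) bytes. */
int sum_array_ptr(const int *array, int len)
{
    int sum = 0;
    for (int i = 0; i < len; i++) {
        sum += *array;
        array += 1;
    }
    return sum;
}

/* Version 2: keep the base fixed and compute base + (i << 2) in bytes,
 * which is what the scaled-index addressing mode does. */
int sum_array_indexed(const int *array, int len)
{
    assert(sizeof(int) == 4);               /* the shift by 2 assumes 4-byte ints */
    const char *base = (const char *)array; /* byte-addressed view of the array */
    int sum = 0;
    for (int i = 0; i < len; i++) {
        const int *element = (const int *)(base + ((size_t)i << 2));
        sum += *element;
    }
    return sum;
}

void sum_array_examples(void)
{
    const int values[4] = {1, 2, 3, 4};
    assert(sum_array_ptr(values, 4) == 10);
    assert(sum_array_indexed(values, 4) == 10);
    assert(sum_array_ptr(values, 0) == 0);
}
```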
   
[02/24/2022]
1. Allocate Space on Stack
    - `sub sp,sp,#16` reserves space for 4 words (16 bytes)
    - Rule : **sp must be a multiple of 8 (required by the ABI, Application Binary Interface; as ARM evolved, its optimised memory instructions came to expect 8-byte alignment)**
    - Deallocate:
        - `add sp,sp,#16`
    - Usage :
        - When we need more registers than are freely available, we can save the preserved ones on the stack and restore them afterwards.
        - Calling another function overwrites the 'lr' register, so we first save lr on the stack in order to still be able to return to our own caller.
        - When we want to allocate a local array, it can live on the stack.
        - With more than 4 arguments, the additional arguments are passed on the stack.
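
The multiple-of-8 rule means the amount subtracted from sp is rounded up even when fewer bytes are needed. A small sketch of that arithmetic, with hypothetical helper names and assuming sp was already 8-byte aligned on entry:

```c
#include <assert.h>
#include <stdint.h>

/* Round a byte count up to the next multiple of 8. */
uint32_t round_up_to_8(uint32_t bytes)
{
    return (bytes + 7u) & ~7u;
}

/* Bytes to subtract from sp in order to hold the given number of 32-bit words. */
uint32_t frame_bytes_for_words(uint32_t words)
{
    return round_up_to_8(words * 4u);
}

void stack_frame_examples(void)
{
    assert(frame_bytes_for_words(4) == 16);  /* sub sp,sp,#16 as in the notes */
    assert(frame_bytes_for_words(3) == 16);  /* 12 bytes needed, 16 reserved */
    assert(frame_bytes_for_words(1) == 8);   /* saving only lr still costs 8 */
    assert(frame_bytes_for_words(0) == 0);
}
```

The matching `add sp,sp,#N` on exit must use the same rounded value.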
2. Branch and Link
    <img src="images/brlnk.png" align="center" />
3. Code e.g. :
    - When we call another function we must preserve lr so that we can still return to our own caller, hence the `sub` and `str` at the beginning.
    <img src="images/callfn1.png" align="center" width="100%" height="20%" />
    <img src="images/codeeg2.png" align="center" width="100%" height="20%" />
    - **ldrb** loads just an 8-bit value (one byte, zero-extended into the register) instead of a 32-bit word
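
A byte load is what C does when it reads through a pointer to an 8-bit type. The sketch below is an illustration only, not the code from the images; the names are hypothetical and it assumes byte-oriented data such as a C string.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reads one byte and widens it to 32 bits with zeros, like ldrb r0,[r1]. */
uint32_t load_byte(const uint8_t *address)
{
    return *address;
}

/* Walks a zero-terminated string one byte at a time. */
size_t count_bytes_until_zero(const char *text)
{
    const uint8_t *cursor = (const uint8_t *)text;
    size_t count = 0;
    while (load_byte(cursor) != 0) {
        cursor++;          /* a byte pointer advances by 1, not by 4 */
        count++;
    }
    return count;
}

void load_byte_examples(void)
{
    const uint8_t high = 0xFF;
    assert(load_byte(&high) == 255u);          /* zero-extended, not -1 */
    assert(count_bytes_until_zero("arm") == 3);
    assert(count_bytes_until_zero("") == 0);
}
```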
    

[02/01/2022]
1. Caller and Callee Convention:
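    * Generalised conditional execution:
        - `movle r0, r1` ... `addgt` / `addle` etc.: the instruction executes only if its condition holds.
        - Why? A branch ('b') can kill performance (branch prediction).
        - Problem: with nested if conditions, an inner compare overwrites the CPSR flag bits that the outer condition still depends on.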
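
Simple, non-nested conditions are the natural fit for conditional instructions. The C sketch below shows two such cases; the comments give one possible instruction sequence, which is an illustration rather than guaranteed compiler output, and the function names are hypothetical.

```c
#include <assert.h>

/* Larger of two values: a single compare followed by one conditional move.
 *   cmp   r0, r1
 *   movle r0, r1      @ runs only when r0 <= r1
 */
int max_of(int a, int b)
{
    return (a <= b) ? b : a;
}

/* Absolute difference: both arms become conditional instructions
 * that share the flags from one compare.
 *   cmp   r0, r1
 *   subgt r0, r0, r1  @ runs only when r0 > r1
 *   suble r0, r1, r0  @ runs only when r0 <= r1
 */
int abs_diff(int a, int b)
{
    return (a > b) ? a - b : b - a;
}

void conditional_examples(void)
{
    assert(max_of(2, 5) == 5);
    assert(max_of(5, 2) == 5);
    assert(max_of(4, 4) == 4);
    assert(abs_diff(7, 3) == 4);
    assert(abs_diff(3, 7) == 4);
}
```

In both cases no instruction between the `cmp` and the conditional instructions changes the flags, which is exactly what breaks down once conditions are nested.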
2. Large Immediates:
    * E.g. mov r0, #7 (where #7 is an immediate value)
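    * **Immediates can't be used in the multiply instruction (mul takes register operands only)**
    * Every instruction is encoded as a single 32-bit word, so an immediate has to fit in the few bits left over for it.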
    <img src="images/instrctn1.png" align="center" />
    * Refer to row 1 to see how an instruction is laid out.
    * **OPCODE codes :**
    <img src="images/instrctn2.png" align="center" /> 
    * `ldr r0, =0xFFAA1122 @ load a full 32-bit value`
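
Whether a constant fits directly into `mov` can be checked mechanically. The sketch below assumes the classic 32-bit ARM (A32) data-processing encoding, in which an immediate is an 8-bit value rotated right by an even amount; that rule is stated here as background and is not spelled out in the notes. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 32-bit value left by n bits (n is taken modulo 32). */
static uint32_t rotate_left(uint32_t value, unsigned n)
{
    n &= 31u;
    return n ? (value << n) | (value >> (32u - n)) : value;
}

/* Returns 1 if value can be written as an 8-bit number rotated right
 * by an even amount, i.e. if some even left rotation brings it
 * back into the range 0..255. */
int fits_arm_immediate(uint32_t value)
{
    for (unsigned rot = 0; rot < 32u; rot += 2u) {
        if (rotate_left(value, rot) <= 0xFFu) {
            return 1;
        }
    }
    return 0;
}

void immediate_examples(void)
{
    assert(fits_arm_immediate(7u) == 1);           /* mov r0, #7 */
    assert(fits_arm_immediate(0xFF000000u) == 1);  /* 0xFF rotated right by 8 */
    assert(fits_arm_immediate(0x3FCu) == 1);       /* 0xFF rotated right by 30 */
    assert(fits_arm_immediate(0x101u) == 0);       /* spans 9 bits */
    assert(fits_arm_immediate(0xFFAA1122u) == 0);  /* needs the ldr r0, = form */
}
```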
3. C Types and Generated Code:
    * `int *xp; char *cp;` when we do `xp += 1` and `cp += 1`, 4 bytes are added to xp (it points to 'int') and 1 byte to cp.
    * `int x; unsigned y; x = x >> 2; y = y >> 2;` the former is an ASR (arithmetic shift right) and the latter is an LSR (logical shift right).
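
Both effects can be observed from C itself. In the sketch below the helper names are hypothetical. Note that the C standard leaves the result of right-shifting a negative signed value to the implementation; using an arithmetic shift for it is the behaviour described in these notes, so the self-check only pins down the cases the language guarantees.

```c
#include <assert.h>
#include <stddef.h>

/* Number of bytes the address changes when an int pointer is incremented. */
size_t int_pointer_step(void)
{
    int values[2] = {0, 0};
    int *xp = values;
    return (size_t)((char *)(xp + 1) - (char *)xp);
}

/* Number of bytes the address changes when a char pointer is incremented. */
size_t char_pointer_step(void)
{
    char letters[2] = {0, 0};
    char *cp = letters;
    return (size_t)((cp + 1) - cp);
}

/* Signed shift: sign-preserving (ASR) where the compiler shifts arithmetically. */
int shift_signed(int x)
{
    return x >> 2;
}

/* Unsigned shift: zeros are shifted in from the top (LSR). */
unsigned shift_unsigned(unsigned y)
{
    return y >> 2;
}

void c_types_examples(void)
{
    assert(int_pointer_step() == sizeof(int));   /* 4 on 32-bit ARM */
    assert(char_pointer_step() == 1);
    assert(shift_signed(16) == 4);
    assert(shift_unsigned(16u) == 4u);
    assert(shift_unsigned(~0u) == (~0u) / 4u);   /* top two bits become 0 */
}
```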