# Register

|  q   |   l   |   w   |   b   |     usage     |
|:----:|:-----:|:-----:|:-----:|:-------------:|
| %rax | %eax  |  %ax  |  %al  | return value  |
| %rbx | %ebx  |  %bx  |  %bl  | callee saved  |
| %rcx | %ecx  |  %cx  |  %cl  | 4th argument  |
| %rdx | %edx  |  %dx  |  %dl  | 3rd argument  |
| %rsi | %esi  |  %si  | %sil  | 2nd argument  |
| %rdi | %edi  |  %di  | %dil  | 1st argument  |
| %rbp | %ebp  |  %bp  | %bpl  | callee saved  |
| %rsp | %esp  |  %sp  | %spl  | stack pointer |
| %r8  | %r8d  | %r8w  | %r8b  | 5th argument  |
| %r9  | %r9d  | %r9w  | %r9b  | 6th argument  |
| %r10 | %r10d | %r10w | %r10b | caller saved  |
| %r11 | %r11d | %r11w | %r11b | caller saved  |
| %r12 | %r12d | %r12w | %r12b | callee saved  |
| %r13 | %r13d | %r13w | %r13b | callee saved  |
| %r14 | %r14d | %r14w | %r14b | callee saved  |
| %r15 | %r15d | %r15w | %r15b | callee saved  |

# Operand
&emsp;&emsp;Operands are the sources and destinations of an instruction. They can be immediate values, registers, or memory locations. In the memory forms below, the scale factor `sc` must be 1, 2, 4, or 8.

|     Type      | format         | value                   | name                         |
|:-------------:|:---------------|:------------------------|:-----------------------------|
| Immediate num | $Imm           | Imm                     | Immediate Addressing         |
|   Register    | ra             | R[ra]                   | Register Addressing          |
|  Memory loc   | Imm            | M[Imm]                  | Absolute Addressing          |
|  Memory loc   | (ra)           | M[R[ra]]                | Register Indirect Addressing |
|  Memory loc   | Imm(ra)        | M[Imm+R[ra]]            | Displacement Addressing      |
|  Memory loc   | (ra,rb)        | M[R[ra]+R[rb]]          | Base Index Addressing        |
|  Memory loc   | Imm(ra,rb)     | M[Imm+R[ra]+R[rb]]      | Base Index Addressing        |
|  Memory loc   | (,rb,sc)       | M[R[rb]*sc]             | Scale Index Addressing       |
|  Memory loc   | Imm(,rb,sc)    | M[Imm+R[rb]*sc]         | Scale Index Addressing       |
|  Memory loc   | (ra,rb,sc)     | M[R[ra]+R[rb]*sc]       | Scale Index Addressing       |
|  Memory loc   | Imm(ra,rb,sc)  | M[Imm+R[ra]+R[rb]*sc]   | Scale Index Addressing       |

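The memory forms in the table are all special cases of `Imm(ra,rb,sc)`. The following is a minimal sketch of that address calculation in C; the function name `effective_address` and its parameter order are illustrative assumptions, not part of any real API.

```c
#include <stdint.h>

/* Effective address for the general form Imm(ra,rb,sc):
 *   Imm + R[ra] + R[rb] * sc
 * Pass 0 for an omitted displacement, base, or index, and 1 for an
 * omitted scale. Arithmetic wraps modulo 2^64, like address arithmetic. */
uint64_t effective_address(int64_t imm, uint64_t base,
                           uint64_t index, unsigned scale)
{
    return (uint64_t)imm + base + index * scale;
}
```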
# Instruction
&emsp;&emsp;Instructions carry a suffix that gives the operand size: `b`, `w`, `l`, and `q` for 1, 2, 4, and 8 bytes respectively.
 
## Move
&emsp;&emsp;Copy data from source to destination. The operand sizes must match the suffix, and at most one of the two operands may be a memory location.

| Instruction | usage  |      description      |
|:-----------:|:------:|:---------------------:|
|  MOV S, D   | D <- S |         move          |
|    movq     |        |     move 8 bytes      |
|    movl     |        |     move 4 bytes      |
|    movw     |        |     move 2 bytes      |
|    movb     |        |      move 1 byte      |
|   movabsq   |        | move 64-bit immediate |

&emsp;&emsp;Move bytes with extension.
Zero extension.

| Instruction |        usage        |       description        |
|:-----------:|:-------------------:|:------------------------:|
|  MOVZ S, D  | D <- (zero extend)S | move with zero extension |
|   movzbw    |                     |    1 byte to 2 bytes     |
|   movzbl    |                     |    1 byte to 4 bytes     |
|   movzwl    |                     |    2 bytes to 4 bytes    |
|   movzbq    |                     |    1 byte to 8 bytes     |
|   movzwq    |                     |    2 bytes to 8 bytes    |
|    movl     |                     | 4 bytes to 8 bytes (writing a 32-bit register zeroes the upper 4 bytes) |

Sign extension

| Instruction |           usage            |       description        |
|:-----------:|:--------------------------:|:------------------------:|
|  MOVS S, D  |    D <- (sign extend)S     | move with sign extension |
|   movsbw    |                            |    1 byte to 2 bytes     |
|   movsbl    |                            |    1 byte to 4 bytes     |
|   movswl    |                            |    2 bytes to 4 bytes    |
|   movsbq    |                            |    1 byte to 8 bytes     |
|   movswq    |                            |    2 bytes to 8 bytes    |
|   movslq    |                            |    4 bytes to 8 bytes    |
|    cltq     | %rax <- (sign extend)%eax  |    4 bytes to 8 bytes    |

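The difference between the two families can be sketched with C casts: widening from an unsigned type fills the upper bits with zeros, while widening from a signed type copies the sign bit. The function names below simply borrow the instruction mnemonics for illustration. The narrowing casts to `int8_t` and `int32_t` assume the usual two's-complement wraparound.

```c
#include <stdint.h>

/* movzbl: widen an 8-bit value with zeros in the upper 24 bits. */
uint32_t movzbl(uint8_t src) { return (uint32_t)src; }

/* movsbl: widen an 8-bit value by copying its sign bit upward. */
uint32_t movsbl(uint8_t src) { return (uint32_t)(int32_t)(int8_t)src; }

/* movslq: widen a 32-bit value to 64 bits with sign extension.
 * cltq does the same with %eax as source and %rax as destination. */
uint64_t movslq(uint32_t src) { return (uint64_t)(int64_t)(int32_t)src; }
```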
## Pop and Push
&emsp;&emsp;Push 8 bytes onto the stack or pop 8 bytes off it. The stack grows toward lower addresses, and %rsp always points to the top element.

| Instruction |                   usage                    | description |
|:-----------:|:------------------------------------------:|:-----------:|
|  pushq   S  | R[%rsp] <- R[%rsp]-8 <br/> M[R[%rsp]] <- S |    push     |
|  popq    D  | D <- M[R[%rsp]] <br/> R[%rsp] <- R[%rsp]+8 |     pop     |

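The two-step behavior in the table can be modeled on a toy machine. This is only a sketch: the `Machine` struct, its 64-byte memory, and the absence of bounds checks are simplifying assumptions made for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical toy machine: a small byte-addressed memory plus %rsp. */
typedef struct {
    uint8_t  mem[64];
    uint64_t rsp;      /* offset into mem; the stack grows downward */
} Machine;

void pushq(Machine *m, uint64_t src)
{
    m->rsp -= 8;                          /* R[%rsp] <- R[%rsp] - 8 */
    memcpy(&m->mem[m->rsp], &src, 8);     /* M[R[%rsp]] <- S        */
}

uint64_t popq(Machine *m)
{
    uint64_t dst;
    memcpy(&dst, &m->mem[m->rsp], 8);     /* D <- M[R[%rsp]]        */
    m->rsp += 8;                          /* R[%rsp] <- R[%rsp] + 8 */
    return dst;
}
```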
## Arithmetic and Logic Operation
&emsp;&emsp;Arithmetic and logic operation between matched registers or memory locations.

| Instruction |   usage   |      description       |
|:-----------:|:---------:|:----------------------:|
|  leaq S, D  |  D <- &S  | load effective address |
|    INC D    | D <- D+1  |         add 1          |
|    DEC D    | D <- D-1  |         sub 1          |
|    NEG D    |  D <- -D  |         negate         |
|    NOT D    |  D <- ~D  |   bitwise complement   |
|  ADD S, D   | D <- D+S  |          add           |
|  SUB S, D   | D <- D-S  |          sub           |
|  IMUL S, D  | D <- D*S  |    signed multiply     |
|  XOR S, D   | D <- D^S  |          xor           |
|   OR S, D   | D <- D\|S |           or           |
|  AND S, D   | D <- D&S  |          and           |
|  SAL S, D   | D <- D<<S |       left shift       |
|  SHL S, D   | D <- D<<S | left shift (same as SAL) |
|  SAR S, D   | D <- D>>S | arithmetic right shift |
|  SHR S, D   | D <- D>>S |   logic right shift    |

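SAR and SHR differ only in what fills the vacated high bits. The sketch below models both for 64-bit operands. The helper names are illustrative, and SAR is written without shifting a negative signed value because C leaves that result implementation-defined.

```c
#include <stdint.h>

/* SHR: logical right shift, vacated high bits become 0.
 * For 64-bit operands only the low 6 bits of the count are used. */
uint64_t shr64(uint64_t d, unsigned k) { return d >> (k & 63); }

/* SAR: arithmetic right shift, vacated high bits copy the sign bit. */
uint64_t sar64(uint64_t d, unsigned k)
{
    k &= 63;
    uint64_t logical = d >> k;
    if ((d >> 63) == 0 || k == 0)
        return logical;                       /* non-negative, or no shift */
    return logical | (~UINT64_C(0) << (64 - k)); /* fill the top k bits with 1s */
}
```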
&emsp;&emsp;Special arithmetic operation.

| Instruction |                                 usage                                 |      description       |
|:-----------:|:---------------------------------------------------------------------:|:----------------------:|
|   imulq S   |                    R[%rdx]: R[%rax] <- S * R[%rax]                    |  signed full multiply  |
|   mulq S    |                    R[%rdx]: R[%rax] <- S * R[%rax]                    | unsigned full multiply |
|    cqto     |               R[%rdx]: R[%rax] <- (sign extend) R[%rax]               | sign-extend to 16 bytes |
|   idivq S   | R[%rax] <- R[%rdx]: R[%rax] / S <br/> R[%rdx] <- R[%rdx]: R[%rax] % S |     signed divide      |
|   divq S    | R[%rax] <- R[%rdx]: R[%rax] / S <br/> R[%rdx] <- R[%rdx]: R[%rax] % S |    unsigned divide     |

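As a rough model of the `R[%rdx]: R[%rax]` register pair, the sketch below computes the 128-bit unsigned product from 32-bit halves and shows how `cqto` prepares %rdx before a signed divide. The function signatures are assumptions made for illustration; registers are passed as plain pointers and values.

```c
#include <stdint.h>

/* mulq S: unsigned 64x64 -> 128-bit product of S and *rax.
 * The high half goes to *rdx and the low half back into *rax. */
void mulq(uint64_t s, uint64_t *rdx, uint64_t *rax)
{
    uint64_t a = *rax;
    uint64_t a_lo = (uint32_t)a, a_hi = a >> 32;
    uint64_t s_lo = (uint32_t)s, s_hi = s >> 32;
    uint64_t p0 = a_lo * s_lo, p1 = a_lo * s_hi;   /* partial products */
    uint64_t p2 = a_hi * s_lo, p3 = a_hi * s_hi;
    uint64_t mid = (p0 >> 32) + (uint32_t)p1 + (uint32_t)p2;
    *rax = (mid << 32) | (uint32_t)p0;
    *rdx = p3 + (p1 >> 32) + (p2 >> 32) + (mid >> 32);
}

/* cqto: the value placed in %rdx, i.e. 64 copies of the sign bit of %rax. */
uint64_t cqto(uint64_t rax) { return (rax >> 63) ? ~UINT64_C(0) : 0; }
```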
## Condition Code
&emsp;&emsp;The condition codes are single-bit flags held in the processor's flags register.

| Flag |   name   |           description            |
|:----:|:--------:|:--------------------------------:|
|  CF  |  Carry   |  set to 1 on unsigned overflow   |
|  ZF  |   Zero   | set to 1 when the result is 0    |
|  SF  |   Sign   | set to 1 when the result is negative |
|  OF  | Overflow |   set to 1 on signed overflow    |

Arithmetic and logic instructions set the condition codes according to their result. `leaq` and `NOT` are exceptions and leave the flags unchanged.

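To make the four flags concrete, here is a sketch of how they could be derived for a 64-bit addition `t = a + b`. The `Flags` struct and `flags_after_add` are hypothetical names; unsigned arithmetic is used throughout so that the wraparound is well defined in C.

```c
#include <stdint.h>
#include <stdbool.h>

typedef struct { bool cf, zf, sf, of; } Flags;

/* Flags after t = a + b on 64-bit operands, as ADD would set them. */
Flags flags_after_add(uint64_t a, uint64_t b)
{
    uint64_t t = a + b;                 /* wraps modulo 2^64 */
    bool a_neg = a >> 63, b_neg = b >> 63, t_neg = t >> 63;
    Flags f;
    f.cf = t < a;                       /* unsigned overflow: carry out */
    f.zf = (t == 0);
    f.sf = t_neg;
    /* signed overflow: operands share a sign that the result lacks */
    f.of = (a_neg == b_neg) && (t_neg != a_neg);
    return f;
}
```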
## Compare, Test and Set

| Instruction | based on |                  description                   |
|:-----------:|:--------:|:----------------------------------------------:|
| CMP S1, S2  | S2 - S1  | compare: sets flags like SUB, result discarded |
| TEST S1, S2 | S1 & S2  |  test: sets flags like AND, result discarded   |

| Instruction | usage               |       description        |
|:------------|:--------------------|:------------------------:|
| sete D      | D <- ZF             |      equal or zero       |
| setne D     | D <- ~ZF            |        not equal         |
| sets D      | D <- SF             |         negative         |
| setns D     | D <- ~SF            |       not negative       |
| setg D      | D <- ~(SF^OF) & ~ZF |       greater than       |
| setge D     | D <- ~(SF^OF)       |      greater equal       |
| setl D      | D <- SF^OF          |        less than         |
| setle D     | D <- (SF^OF) \| ZF  |        less equal        |
| seta D      | D <- ~CF & ~ZF      |     above (for uns)      |
| setae D     | D <- ~CF            | above or equal (for uns) |
| setb D      | D <- CF             |     below (for uns)      |
| setbe D     | D <- CF \| ZF       | below or equal (for uns) |

&emsp;&emsp;A `set` instruction only reads the condition codes, so a CMP, TEST, or other flag-setting instruction must run immediately before it. The destination D is a single-byte register or memory location.

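The reason there are separate signed (`setl`, `setg`) and unsigned (`setb`, `seta`) conditions shows up when the same bit patterns are compared both ways. The sketch below models `CMP S1, S2` followed by `setl` and `setb`; the `Flags` struct and function signatures are illustrative assumptions.

```c
#include <stdint.h>
#include <stdbool.h>

typedef struct { bool cf, zf, sf, of; } Flags;

/* CMP S1, S2: flags of S2 - S1; the difference itself is discarded. */
Flags cmpq(uint64_t s1, uint64_t s2)
{
    uint64_t t = s2 - s1;
    bool s1_neg = s1 >> 63, s2_neg = s2 >> 63, t_neg = t >> 63;
    Flags f;
    f.cf = s2 < s1;                                  /* unsigned borrow */
    f.zf = (t == 0);
    f.sf = t_neg;
    f.of = (s2_neg != s1_neg) && (t_neg != s2_neg);  /* signed overflow */
    return f;
}

bool setl(Flags f) { return f.sf != f.of; }   /* signed:   S2 < S1 */
bool setb(Flags f) { return f.cf; }           /* unsigned: S2 < S1 */
```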
## Jump
&emsp;&emsp;Jump to the target address.

| Instruction  | condition      |  description   |
|:-------------|:---------------|:--------------:|
| jmp L        | no condition   |  direct jump   |
| jmp *Operand | no condition   | indirect jump  |
| je L         | ZF             |     equal      |
| jne L        | ~ZF            |   not equal    |
| js L         | SF             |    negative    |
| jns L        | ~SF            |  not negative  |
| jg L         | ~(SF^OF) & ~ZF |  greater than  |
| jge L        | ~(SF^OF)       | greater equal  |
| jl L         | SF^OF          |   less than    |
| jle L        | (SF^OF) \| ZF  |   less equal   |
| ja L         | ~CF & ~ZF      |  above (uns)   |
| jae L        | ~CF            | above or equal (uns) |
| jb L         | CF             |  below (uns)   |
| jbe L        | CF \| ZF       | below or equal (uns) |

## Conditional Move
&emsp;&emsp;Copy the source to the destination register only when the condition holds; otherwise the destination is left unchanged.

| Instruction | condition      |  description   |
|:------------|:---------------|:--------------:|
| cmove S, R  | ZF             |  equal / zero  |
| cmovne S, R | ~ZF            |   not equal    |
| cmovs S, R  | SF             |    negative    |
| cmovns S, R | ~SF            |  not negative  |
| cmovg S, R  | ~(SF^OF) & ~ZF |  greater than  |
| cmovge S, R | ~(SF^OF)       | greater equal  |
| cmovl S, R  | SF^OF          |   less than    |
| cmovle S, R | (SF^OF) \| ZF  |   less equal   |
| cmova S, R  | ~CF & ~ZF      |  above (uns)   |
| cmovae S, R | ~CF            | above or equal (uns) |
| cmovb S, R  | CF             |  below (uns)   |
| cmovbe S, R | CF \| ZF       | below or equal (uns) |

# Floating Point Code
## Register
&emsp;&emsp;AVX provides 16 floating-point registers, %ymm0 ~ %ymm15, each 256 bits wide. The low 128 bits of each YMM register form the corresponding XMM register, %xmm0 ~ %xmm15.

&emsp;&emsp;%xmm0 (the low part of %ymm0) holds the first floating-point argument and the floating-point return value. %xmm1 ~ %xmm7 hold the 2nd ~ 8th floating-point arguments. All of these registers, including %ymm8 ~ %ymm15, are caller saved; none is callee saved.

## Instruction
### Move
| Instruction | source | target | description  |
|:------------|:-------|:-------|:-------------|
| vmovss      | M32    | xmm    | move 4 bytes |
| vmovss      | xmm    | M32    | move 4 bytes |
| vmovsd      | M64    | xmm    | move 8 bytes |
| vmovsd      | xmm    | M64    | move 8 bytes |
| vmovaps     | xmm    | xmm    | move aligned packed single precision |
| vmovapd     | xmm    | xmm    | move aligned packed double precision |

Where M32 and M64 are 4-byte and 8-byte memory locations respectively.

### Transform
| Instruction | source  | target | description      |
|:------------|:--------|:-------|:-----------------|
| vcvttss2si  | xmm/M32 | R32    | float32 -> int32 |
| vcvttsd2si  | xmm/M64 | R32    | float64 -> int32 |
| vcvttss2siq | xmm/M32 | R64    | float32 -> int64 |
| vcvttsd2siq | xmm/M64 | R64    | float64 -> int64 |

Where R32 and R64 are 32-bit and 64-bit general-purpose registers respectively. These four conversions truncate toward zero.

| Instruction | source1 | source2 | target | description      |
|:------------|:--------|:--------|:-------|:-----------------|
| vcvtsi2ss   | M32/R32 | xmm     | xmm    | int32 -> float32 |
| vcvtsi2sd   | M32/R32 | xmm     | xmm    | int32 -> float64 |
| vcvtsi2ssq  | M64/R64 | xmm     | xmm    | int64 -> float32 |
| vcvtsi2sdq  | M64/R64 | xmm     | xmm    | int64 -> float64 |

These four instructions take two sources. The second source only supplies the upper bytes of the result, so it is normally the same register as the target.

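C casts follow the same conversion rules, so they can serve as a quick mental model. This is a sketch only: the function names are borrowed loosely from the mnemonics, and the float-to-integer cast is only defined in C when the truncated value fits the target type.

```c
#include <stdint.h>

/* Model of vcvttsd2si: double -> int32, truncating toward zero.
 * Assumes the truncated value fits in int32_t. */
int32_t cvttsd2si(double x) { return (int32_t)x; }

/* Model of vcvtsi2sdq: int64 -> double. Values needing more than
 * 53 significant bits are rounded. */
double cvtsi2sdq(int64_t x) { return (double)x; }
```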

### Calculation
&emsp;&emsp;Scalar floating-point arithmetic. S1 may be an XMM register or a memory location; S2 and D are XMM registers.

| single | double | usage            | description |
|:-------|:-------|:-----------------|:------------|
| vaddss | vaddsd | D <- S2 + S1     | float add   | 
| vsubss | vsubsd | D <- S2 - S1     | float sub   |
| vmulss | vmulsd | D <- S2 * S1     | float mul   |
| vdivss | vdivsd | D <- S2 / S1     | float div   |
| vminss | vminsd | D <- min(S2,S1)  | float min   |
| vmaxss | vmaxsd | D <- max(S2,S1)  | float max   |
| vsqrtss | vsqrtsd | D <- sqrt(S1)   | float sqrt  |

### Bitwise Operation
&emsp;&emsp;Bitwise operations on XMM registers. They act on all 128 bits of the register.

| single | double | usage        | description |
|:-------|:-------|:-------------|:------------|
| vandps | vandpd | D <- S2 & S1 | bitwise and |
| vxorps | vxorpd | D <- S2 ^ S1 | bitwise xor |

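A common use of these instructions is to manipulate the sign bit of a float: XOR with a sign-bit mask negates, AND with the inverted mask takes the absolute value. The sketch below reproduces both on the raw bits in C. It assumes the IEEE 754 single-precision layout, and all function names are illustrative.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret the 32 bits of a float as an integer and back. */
static uint32_t float_bits(float f) { uint32_t u; memcpy(&u, &f, sizeof u); return u; }
static float bits_float(uint32_t u) { float f; memcpy(&f, &u, sizeof f); return f; }

/* XOR with the sign-bit mask flips the sign (negation). */
float negate_via_xor(float x) { return bits_float(float_bits(x) ^ 0x80000000u); }

/* AND with the inverted sign-bit mask clears the sign (absolute value). */
float abs_via_and(float x) { return bits_float(float_bits(x) & 0x7FFFFFFFu); }
```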
### Compare
&emsp;&emsp;Compare two xmm registers. 

| single   | double   | based on | description |
|:---------|:---------|:---------|:------------|
| vucomiss | vucomisd | S2 - S1  | compare     |

S1 may be an XMM register or a memory location; S2 is always an XMM register.
The comparison sets ZF, PF, and CF and stores no result. If either operand is NaN, the outcome is unordered and PF is set to 1.

|   Order   | CF | ZF | PF |
|:---------:|:--:|:--:|:--:|
| Unordered | 1  | 1  | 1  |
|  S2 < S1  | 1  | 0  | 0  |
|  S2 = S1  | 0  | 1  | 0  |
|  S2 > S1  | 0  | 0  | 0  |

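The flag table can be written out as a small C function. This is a sketch under stated assumptions: `FpFlags` and `ucomisd` are hypothetical names, and the argument order mirrors the `S1, S2` operand order used in the table.

```c
#include <math.h>
#include <stdbool.h>

typedef struct { bool cf, zf, pf; } FpFlags;

/* Flags after comparing S2 with S1, following the table above. */
FpFlags ucomisd(double s1, double s2)
{
    FpFlags f = { false, false, false };
    if (isnan(s1) || isnan(s2)) {
        f.cf = f.zf = f.pf = true;       /* unordered */
    } else if (s2 < s1) {
        f.cf = true;
    } else if (s2 == s1) {
        f.zf = true;
    }                                    /* S2 > S1: all flags stay 0 */
    return f;
}
```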