# Let's write some assembly code

In this chapter we will get going and write some simple assembly code, "build" it, and run it within a "debugger" so that we can get a sense of how everything fits together.

To do this we will use three terminal sessions: one to run an ASCII editor, one to run the shell on its own so that we can compile our source code, and one to run our debugger. We use three different terminals so that we can stay organized and avoid having to constantly stop and start the different programs. The editor that we will be using (emacs) has support for integrating all three tasks within itself, but for the moment we will keep things separate to make sure we know what is going on.

## EDITOR : Terminal to run our editor 

An editor allows us to create and update plain text ASCII files. An editor is the core tool of a programmer! Programming is all about writing software in the form of ASCII files that encode what we want the computer to do in some language or another (in our case assembly and C). So far you may have been taught to use various Integrated Development Environments (IDEs) that include an editor, build system, and debugger within them. In this class we will strip things down to their traditional bare essentials so you can get an idea of how things are really working and how IDEs are themselves constructed.

In [21]:
from IPython.display import IFrame
IFrame('http://localhost:8888/terminals/2', 1000, 600)

In the above terminal we will run the `emacs` editor to create and write our first assembly program. 
To do this, issue the following command:

`emacs simple.S`


In the above window you should enter a copy of the following code:

```assembly_x86
        .intel_syntax noprefix   // set assembly language format to intel

        .data                    // Place the following in the data area
myvar:  .quad 0x00000000000000ff // place the 8 byte value at a location whose
                                 // address we can refer to by the symbolic
                                 // name myvar

        .text                    // Place the following in the area where
                                 // instructions are encoded and stored;
                                 // for historical reasons it is called text
        .global _start
_start:
        mov rax, 0xdeadbeef      // mov the immediate value hex deadbeef into rax
        mov rdi, qword ptr myvar // mov the 8 byte value at address myvar into rdi
        and rax, rdi             // rax = rax AND rdi

        int3                     // trap to the debugger
```
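
Before building, it can help to predict what the three instructions should leave in `rax`. The sketch below redoes the same arithmetic in the shell; the shell variables `rax` and `rdi` are only stand-ins for the registers, not a way of reading the real CPU.

```shell
# Mimic the three instructions of simple.S with shell arithmetic.
rax=$(( 0xdeadbeef ))            # mov rax, 0xdeadbeef
rdi=$(( 0x00000000000000ff ))    # mov rdi, qword ptr myvar  (myvar holds 0xff)
rax=$(( rax & rdi ))             # and rax, rdi
printf 'rax = 0x%x\n' "$rax"     # prints: rax = 0xef
```

Only the low byte of `0xdeadbeef` survives the AND, so this is the value to look for in `rax` once the program reaches `int3`.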


## Building: Terminal to run our build commands

In [24]:
from IPython.display import IFrame
IFrame('http://localhost:8888/terminals/3', 1000, 200)

- `gcc --static -g -nostartfiles -nolibc simple.S -o simple`

To execute our code we must convert the "source" into a binary executable that can be loaded into memory and that contains all the data and instructions (at the right locations). To do this we must use programs that convert our assembly code into the "correct" raw binary values and assign those values to addresses. The OS will load these values to the specified locations when we ask it to run our program.

This process of converting human readable source code into a binary executable format is often referred to as "building".  The tools we will use are an assembler and a linker.  

The assembler has been written to convert the human names ("mnemonics") of the instructions in our source files into the binary code that our CPU understands. There is no magic! The manufacturer of the CPU publishes a manual that defines what instructions the CPU supports. Each instruction has a human "mnemonic" (e.g. `mov rax, <value>`) and a binary code that the CPU understands (e.g. `mov rax, 0xdeadbeef` is `0x48,0xb8,0xef,0xbe,0xad,0xde,0x00,0x00,0x00,0x00`). Given the manual, a programmer writes the assembler so that it goes over our source files and translates what we write into the CPU's binary code. The assembler's author also extends the mnemonics with what are called "directives", such as `.intel_syntax`, `.text`, `.global`, and `.quad`, that we can use to control and direct the assembler. To fully understand all the syntax and what we can do, one must look at both the manual for the CPU and the manual for the assembler. If all goes well and our program does not have any syntax errors, then the assembler will generate a file with its output. This file is called an object file.
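
To make the "no magic" point concrete, here is a minimal sketch of the translation for that single instruction. The function `encode_mov_rax_imm64` is a hypothetical helper written for this illustration, not part of any real assembler. It only reproduces the 10-byte form quoted above: a `0x48` prefix, the opcode `0xb8`, then the 8 bytes of the immediate with the least significant byte first. A real assembler may pick a shorter encoding when the immediate is small.

```shell
# Hypothetical helper: print the bytes for `mov rax, <64-bit immediate>`
# in the long form (prefix 0x48, opcode 0xb8, 8 immediate bytes).
encode_mov_rax_imm64() {
    value=$(( $1 ))                 # accept decimal or 0x-prefixed hex
    out="0x48,0xb8"                 # prefix and opcode for a 64-bit move into rax
    i=0
    while [ "$i" -lt 8 ]; do
        # The immediate is stored little-endian: lowest byte first.
        byte=$(( (value >> (8 * i)) & 0xff ))
        out="$out,$(printf '0x%02x' "$byte")"
        i=$(( i + 1 ))
    done
    printf '%s\n' "$out"
}

encode_mov_rax_imm64 0xdeadbeef
# prints: 0x48,0xb8,0xef,0xbe,0xad,0xde,0x00,0x00,0x00,0x00
```

Notice how `0xdeadbeef` shows up "backwards" as `ef be ad de`; that byte order is simply how this CPU stores multi-byte values.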

We use a tool called a linker to process the object files that make up our program and create a binary executable specific to our operating system and CPU. It is this file that is "really" our program. In our simple examples there is only one object file; later on we will have other object files, from libraries of functions, that we will want to use as well. The linker's job is to process all our object files and create the binary with knowledge of where our OS wants things to be placed in memory. Specifically, the developers of the OS provide information to the linker that tells it the rules of where instructions and data can go. It is the linker's job to figure out what address each of the values that make up our program should be given. As such, it also needs to fix up our code so that the final addresses are reflected in each of the places where we reference particular symbolic names. We will talk more about this later. Assuming all goes well and the linker does not flag any errors, it will produce a binary executable that the OS can load and run.

One special task of the linker is to mark in the binary the address of the first instruction, so that the OS can initialize the CPU to start executing instructions from the right location. This location is called the "entry point". Our linker by default assumes that our code contains a symbol named `_start`. If so, the address it assigns to `_start` is what it will write into the executable as the entry point so that the OS can load and start our program correctly. If we fail to define the `_start` label, the linker will produce a warning and make an assumption. It is a bad idea to ignore warnings when programming at this level ... after all, we know what assuming makes of you and me.
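
Once the executable exists, one way to check the entry point is to compare what the file header records with the address given to `_start`. The sketch below assumes the binutils tools `readelf` and `nm` are installed; `show_entry` is a hypothetical helper name chosen for this example.

```shell
# Hypothetical helper: show the entry point stored in an executable's header
# next to the address the linker assigned to the _start symbol.
show_entry() {
    readelf -h "$1" | grep 'Entry point address'   # from the file header
    nm "$1" | grep ' _start$'                      # from the symbol table
}
```

Running `show_entry simple` after a successful build should print two lines, and the address on both should be the same.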

So in the shell above we will run a command (`gcc --static -g -nostartfiles -nolibc simple.S -o simple`) that runs both the assembler and the linker for us. We will have to pay close attention to see if there are any errors. If so, we need to go up to the editor, make changes, and save those changes. Then we try building again. We repeat this until there are no build time errors. Remember that the executable is different from the source: any change we make to the program source code requires that we rerun the build process to update the binary. Remember too that just because there are no build time errors does not mean that our code is "right" or free of bugs.
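
To see the two stages separately, the single command can be split in two. This is a sketch rather than how the class builds: `build_two_stage` is a hypothetical helper, and it relies on the standard `gcc` option `-c`, which stops after assembling and leaves the object file on disk.

```shell
# Hypothetical helper: build NAME.S in two visible steps.
build_two_stage() {
    name=$1
    # Stage 1: assemble only; produces the object file NAME.o
    gcc -c -g "$name.S" -o "$name.o" &&
    # Stage 2: link the object file into the executable NAME
    gcc --static -g -nostartfiles -nolibc "$name.o" -o "$name"
}
```

Calling `build_two_stage simple` should leave both `simple.o` and `simple` in the current directory. The `&&` means the link step only runs if the assembler reported no errors.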

Later on we will see how to use another tool called make to further simplify and automate the building process.

## Debugger: Terminal to run our debugger -- actually it is much more than just a debugger

In [15]:
from IPython.display import IFrame
IFrame('http://localhost:8888/terminals/4', 1000, 800)

- `gdb-tui simple`
- `tui new-layout 210 src 2 regs 5 status 1 cmd 1`
- `layout 210`
- `break _start`
- `run`
- `step`  
Continue stepping until you get to the end.

`gdb` (or `gdb-tui`, which starts in a slightly more friendly mode) is a very powerful tool in the hands of a power user (which you are, or soon will be). `gdb` is complicated and cryptic, but it allows you not just to trace your program's execution but also to explore all aspects of the hardware that your program has access to. You can peek into the CPU and examine arbitrary memory locations. And perhaps even cooler, you can change the CPU registers and any memory location on the fly while your program is running! It is going to take us a while to fully explore all the power of gdb. But let's get started.

If you type `help` you will get a list of the major commands that gdb supports. For the moment we are going to focus on the basics of the following tasks:
- inspecting memory : examining memory, disassembling memory
- inspecting registers 
- setting breakpoints
- starting execution
- stepping instructions
- quitting
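
As a rough preview of those tasks, the sketch below runs a handful of standard gdb commands non-interactively using `gdb -batch`. The helper name `gdb_first_look` is hypothetical, and it assumes the executable was built from `simple.S` as shown earlier, so that the symbols `_start` and `myvar` exist.

```shell
# Hypothetical helper: run a short scripted gdb session on an executable.
gdb_first_look() {
    gdb -batch \
        -ex 'break _start' \
        -ex 'run' \
        -ex 'x/4i $pc' \
        -ex 'x/1gx &myvar' \
        -ex 'info registers rax rdi' \
        -ex 'step' -ex 'step' -ex 'step' \
        -ex 'info registers rax rdi' \
        "$1"
}
```

With `gdb_first_look simple`, the commands set a breakpoint, start execution, disassemble four instructions at the current location, examine the 8 bytes at `myvar`, show two registers, step three times, and show the registers again. In batch mode gdb exits by itself; in an interactive session you would type `quit`. After the three steps `rax` should hold `0xef`, since `0xdeadbeef` AND `0xff` keeps only the low byte.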


OK, let's write a new program that does something else.

In [23]:
from IPython.display import IFrame
IFrame('http://localhost:8888/terminals/2', 1000, 600)