## Reading 26-1 - Assembly Language Introduction

We've now arrived at the end of our introduction to C programming. The rest of the course will mostly look at higher-level concepts built atop the understanding we have developed.

But before we move up to those higher levels, which we will use C++ to investigate, we will briefly pull back the covers and see what happens <b>at the level below</b> C to make your programs run.

Understanding how complex programs boil down to bytes will help you debug your programs and appreciate why they behave the way they do.

    --- | Web sites, Google, Facebook, ChatGPT, etc.
     C  |-------------------------
     S  | Parallel programming    <-- block 4
     E  |-------------------------
     2  | C++ | Operating systems <-- block 3
     0  |-------------------------
     1  | C programming language  <-- block 1 - we discussed this so far
     3  |-------------------------
     3  | Assembly language       <-- block 2 - we will briefly cover this
    --- |-------------------------------------------
        | Hardware (chips)        <-- Prof. Morrison's Digital Integrated Circuits course

> The Electrical Engineers will take the CSE 20221 Logic Design course, where understanding of memory operations, assembly language, and instruction sets is crucial for success in that course. So part of my motivation for introducing assembly language now is to set you up for success in future coursework.

Each computer architecture (such as x86-64, which many modern laptops and desktops use and which we consider in this course) has an <b>instruction set</b> specified by its manufacturer. Different manufacturers define different instruction sets, each shaped by the manufacturer's own design goals and intellectual property.

The instruction set, first and foremost, defines what sequences of bytes <b>trigger specific behavior in the processor</b> (e.g., adding numbers, comparing them for equality, or loading data from memory). 

But hexadecimal bytes are hard for humans to read, so the instruction set also comes with a human-readable <b>assembly language</b> that consists of short, mnemonic instructions that correspond directly to a byte encoding (i.e., each of these instructions corresponds to a specific, unique set of hexadecimal bytes).

## Back into the Void: Generating the Assembly Code

We will repeat the process of building a program from the ground up, starting with <a href = "https://github.com/mmorri22/cse20133/blob/main/readings/lec26/void.c"><code>void.c</code></a>:

    void main(){

    }

After compiling this file into an object file, <code>void.o</code>, we can look at that file's contents using a tool called <b>objdump</b>.
    
    mkdir reading26
    cd reading26
    wget https://raw.githubusercontent.com/mmorri22/cse20133/main/readings/lec26/void.c
    gcc -c void.c 

If you are using a Windows machine or Linux, run the following command:

    objdump -d void.o

If you are using a Mac, run the following command:

    x86_64-linux-gnu-objdump -d void.o
    
> <b>On the difference between Windows and Mac</b> - If you are using a Mac, and you try running <code>objdump</code>, you will likely get the following error:<p>
> <code>trap1:     file format elf64-little</code><br>
> <code>objdump: can't disassemble for architecture UNKNOWN!</code><p>
> This is because the executable we provide is <b>compiled for the x86-64 architecture</b> and contains machine instructions that <b>only x86-64 machines understand</b>, but the computer you’re using <b>only understands ARM64 instructions</b>.<br>

## Reading Assembly Language 
    
<b>It's not important that you understand this yet. What matters is that you get the idea that we can view how programs work at the binary level</b>.
    
> <b>What does the assembly language below mean?</b>
> We haven't learned assembly language yet. But to give you an intuition, <code>retq</code> tells the processor to return to the calling function, which is what happens when <code>main</code> reaches its closing brace.
    
This is the resulting output, which may also be found at <a href = "https://github.com/mmorri22/cse20133/blob/main/readings/lec26/void_objdump.txt">void_objdump.txt</a>:

    0000000000000000 <main>:
       0:   f3 0f 1e fa             endbr64 
       4:   55                      push   %rbp
       5:   48 89 e5                mov    %rsp,%rbp
       8:   90                      nop
       9:   5d                      pop    %rbp
       a:   c3                      retq   
    
            ^                       ^
            | bytes in file         | their human-readable meaning in x86-64 assembly language
            | in hexadecimal        | (not stored in the file; objdump generated this)
            | notation

### <font color = "red">Class Introduction Question #1 - What is an Instruction Set, and how do instruction sets differ between computers made by different manufacturers?</font>

### <font color = "red">Class Introduction Question #2 - What is Assembly Language, and why do we need an assembly language when the instruction set already precisely describes what happens in the machine?</font>

### <font color = "red">Class Introduction Question #3 - What is the purpose of the objdump command? And why do Mac users need to run x86_64-linux-gnu-objdump instead?</font>

### The next reading for this lecture is <a href = "https://github.com/mmorri22/cse20133/blob/main/readings/lec26/Reading%2026-2.ipynb">Reading 26-2 - Generating Assembly Code</a>