## Basic Computer Architecture


_burton rosenberg, 23 june 2023_


### Table of contents.

1. <a href="#intro">Introduction</a>
1. <a href="#i85">A Model CPU: Intel 8085</a>
1. <a href="#ram">Random Access Memory</a>
1. <a href="#architecture">CPU Architecture</a>



### <a name=intro>Introduction</a>

Computational devices have existed for a very long time and take many forms. The slide rule is a device for computing the product of two numbers. It was in common use in the 1950's until definitively replaced by pocket calculators, such as the [Hewlett-Packard HP-35](https://en.wikipedia.org/wiki/HP-35) in 1972 or the [Texas Instruments SR-50](https://en.wikipedia.org/wiki/TI_SR-50) in 1974.

 
<div style="float:right;margin:2em;width:450px;border:1px solid green;padding:1em;">
<a title="Rama, CC BY-SA 3.0 FR &lt;https://creativecommons.org/licenses/by-sa/3.0/fr/deed.en&gt;, via Wikimedia Commons" href="https://commons.wikimedia.org/wiki/File:Keuffel_and_Esser-Model_4181-1_Log_log_Duplex_Decitrig_slide_rule-IMG_5821-white_(cropped).jpg"><img width="512" alt="Keuffel and Esser-Model 4181-1 Log log Duplex Decitrig slide rule-IMG 5821-white (cropped)" src="https://upload.wikimedia.org/wikipedia/commons/thumb/6/62/Keuffel_and_Esser-Model_4181-1_Log_log_Duplex_Decitrig_slide_rule-IMG_5821-white_%28cropped%29.jpg/512px-Keuffel_and_Esser-Model_4181-1_Log_log_Duplex_Decitrig_slide_rule-IMG_5821-white_%28cropped%29.jpg"></a>
    <center>The Keuffel and Esser Model 4181 slide rule</center>
</div>

Although the slide rule did a calculation, it would not today be called a computer because:

1. It lacked memory to store previous results and recall them when needed.
1. It carried out only a single operation.
1. Its operation was analog, working in the continuous value of length, rather than digital.

The Turing Machine (1936) has all of these features. It is digital in that it works by combining symbols from a finite set of symbols. It has a tape of cells for storing the symbols, which functions as a memory. And it has a state machine that can walk through a sequence of operations, deciding on the next operation according to the symbol on the tape.

Not too long afterwards, electronic machines were constructed that could undertake the basic operations of a Turing Machine. The [ENIAC](https://en.wikipedia.org/wiki/ENIAC) was possibly the first computer with all the features now understood to be essential. It was built in 1945 at the University of Pennsylvania and designed by John Mauchly and J. Presper Eckert. (See [YouTube](https://www.youtube.com/watch?v=bGk9W65vXNA&list=PLsMFdolQPga7ldPdlPqsPh6b4m7L4ik8q)).


At first, the computing elements were relays, switches that can be turned off and on by electromagnets. Here we see the preference for a binary device, one whose basic symbol set contains just two symbols: off and on (or true and false; or 0 and 1). The ENIAC used vacuum tubes, devices that allow a small signal to modulate a larger electron stream, hence amplifying the signal. Vacuum tubes are hot and unreliable but are still used in some guitar amps because the peculiarities of their amplification curves give a desirable sound.

A vacuum tube amp at [Usagi Electric](https://www.youtube.com/watch?v=HL5BIE5TD3s).

But a vast number of tubes or relays would be needed for a more capable machine, and would be physically impossible to assemble. The invention of the transistor replaced the tube with a device based on the physics of crystals, and manufacturing methods were developed to put many transistors onto a single silicon wafer. In 1976, when Intel achieved transistors about 3&mu;m in size, or 3 millionths of a meter, it placed 6,500 such devices into the computing device we shall take as the model architecture for a digital computer, the Intel 8085.

### <a name="i85">A Model CPU: Intel 8085</a>

By the mid-1970s much of the architecture for a digital computer had been worked out, and many such machines had been built. The ability to make smaller and smaller transistors created the LSI (Large Scale Integration) device, with enough transistors to put an entire computing unit on a chip. The computing unit is not the only device necessary to build a computer. Devices for memory and Input/Output (I/O) are also necessary. In creating the 8085, Intel introduced or integrated a number of different chips that together could make a computer on a single printed circuit board.



<div style="float:right;margin:2em;width:450px;border:1px solid green;padding:1em;padding-top:2em;">
<a title="Pdesousa359, CC BY-SA 3.0 &lt;https://creativecommons.org/licenses/by-sa/3.0&gt;, via Wikimedia Commons" href="https://commons.wikimedia.org/wiki/File:Intel_8085A_Die_CPU_Image.jpg"><img width="256" alt="Intel 8085A Die CPU Image" src="https://upload.wikimedia.org/wikipedia/commons/6/6c/Intel_8085A_Die_CPU_Image.jpg"></a>
    <p>
        <center>The Intel 8085 CPU chip</center>
</div>

We look at the 8085 and its family because it contains most of the concepts needed for modern computing architecture, is reasonably simple to describe, and is an actual device that the reader can, if they wish, investigate more fully.

The 8085 itself is a <u>Central Processing Unit</u> (CPU). The CPU can:

1. Carry out mathematical and logical operations on data, such as addition, or boolean operations such as OR.
1. Sequence the operations according to instructions that are fetched from memory and placed into the region of the CPU that sequences and gates the data flows across the CPU to carry out each operation.
1. Move data from on-CPU <u>memory registers</u> to a larger memory store called the <u>RAM</u>, and back.
1. Carry out communication to and from the memory, using various Input/Output techniques under the control of the CPU.

### <a name="ram">Random Access Memory</a>

While the Turing Machine has memory, it is arranged as a sequence of cells, each capable of holding a single symbol to be read or written, and accessed by a head that moves left or right along the sequence of cells. By the time of the 8085 a different model had been adopted for memory, that of the <u>Random Access Memory</u> (RAM). A RAM stores a uniform data element called the <u>byte</u>, which was by that time a collection of eight bits. Each byte is stored at an integer index, its <u>address</u>.

An individual byte can store one of 256 different symbols. Alternatively, it can be considered a number between 0 and 255 by taking the bits as the coefficients of a base two number representation. Note that if one does think of the byte as an integer, the leading zeros are part of the number, unlike the normal convention of dropping leading zeros. In ordinary writing it makes little sense to say 037 instead of 37.

__N.B.:__ In the C language, however, a number written with a leading zero is in octal notation, that is, in base 8. Hence 037 has value 3&times;8+7=31.
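Both points can be checked with a small C sketch. The function names are made up for illustration. The first reads eight binary digits as the coefficients of a base two number, leading zeros included. The second simply returns the literal `037`.

```c
#include <stdint.h>

/* Interpret eight characters '0'/'1' (most significant bit first)
   as a base two number. Leading zeros are part of the byte. */
uint8_t byte_from_bits(const char *bits) {
    uint8_t value = 0;
    for (int i = 0; i < 8; i++)
        value = (uint8_t)((value << 1) | (bits[i] == '1'));
    return value;
}

/* In C source, a leading zero makes the literal octal: 037 is 3*8 + 7. */
int octal_037(void) {
    return 037;
}
```

With these definitions, `byte_from_bits("00100101")` is 37 and `octal_037()` is 31.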

Data items created by the computer are stored in an interval of bytes in the RAM, and can be accessed either byte by byte or in a single access, depending on the hardware's ability. Typically, if a data item is stored in multiple bytes, the number of bytes is a power of two, and the lowest address of the data item is divisible by the number of bytes. For instance, a 32 bit integer needs 4 bytes, and it would be stored beginning at address 4 &times; i, occupying that and the next 3 addresses, for some i.
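As a rough sketch of this layout, the following C functions treat the RAM as an array of bytes. The names are hypothetical, and the byte order (lowest-order byte at the lowest address) is an assumption made for the example; the text above does not fix one.

```c
#include <stdint.h>

/* An address is aligned for an item of `size` bytes (size a power of two)
   when the address is divisible by size. */
int is_aligned(uint32_t address, uint32_t size) {
    return (address & (size - 1)) == 0;
}

/* Store a 32 bit value in four consecutive bytes, one byte per access.
   Assumed byte order: lowest-order byte at the lowest address. */
void store32(uint8_t *ram, uint32_t address, uint32_t value) {
    for (int k = 0; k < 4; k++)
        ram[address + k] = (uint8_t)(value >> (8 * k));
}

/* Read the four bytes back and reassemble the 32 bit value. */
uint32_t load32(const uint8_t *ram, uint32_t address) {
    uint32_t value = 0;
    for (int k = 3; k >= 0; k--)
        value = (value << 8) | ram[address + k];
    return value;
}
```

Address 8 is aligned for a 4 byte item (8 = 4 &times; 2), while address 6 is not.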



<div style="float:right;margin:2em;width:450px;border:1px solid green;padding:1em;">
<a title="MCS-80/85 Family User's Manual, Intel, October 1979" href="https://archive.org/details/Mcs80_85FamilyUsersManual/">
<img src="https://www.cs.miami.edu/home/burt/learning/csc421.241/images/8085-cpu-block-diagram.png">
    <center>MCS-80/85 Family User's Manual, Intel, October 1979.</center>
</a>
</div>


### <a name="architecture">CPU Architecture</a>

The CPU has several components, connected together internally.

#### The Registers:

The CPU keeps data in registers. These are collections of bits, like a byte in RAM, but are not always 8 bits wide. While RAM organizes larger data units as a sequence of bytes, a register has enough bits to store the data item directly. There are also far fewer registers than RAM locations, so registers are not indexed by address. Usually a register has a specific function, so it is used implicitly as part of an instruction.

Registers can hold data but they can also hold addresses in memory. The CPU must sequence through the instructions. It does this with the <u>Program Counter</u> (PC), a register whose value is the memory address of the next instruction. Normally, the PC is incremented with each instruction, so sequential instructions are placed in order in the RAM for execution.

However, the PC can be loaded directly with a `JMP` instruction. It can also be loaded
conditionally, based on the values of other registers. The `JZ` (jump on zero) instruction
will load the PC only if the Z bit in the Flag register is set. This flag is set when a computation results in a zero value. We discuss that later in the section on the ALU.
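The way the PC moves can be sketched in C. This is a toy model, not the 8085's actual logic: the struct and function names are invented, only `JMP` and `JZ` are modelled, and every other byte is treated as a one-byte instruction. The opcode values and the low-byte-first operand order are taken to be those of the 8085 and should be checked against the Intel manual.

```c
#include <stdint.h>

/* Hypothetical miniature machine state: just enough to show the PC. */
typedef struct {
    uint16_t pc;      /* program counter: address of the next instruction */
    uint8_t  z_flag;  /* Z bit of the flag register */
} mini_cpu_t;

enum { OP_JMP = 0xC3, OP_JZ = 0xCA };  /* assumed 8085 opcodes */

/* Execute the single instruction found at cpu->pc. */
void step(mini_cpu_t *cpu, const uint8_t *ram) {
    uint8_t opcode = ram[cpu->pc];
    if (opcode == OP_JMP || opcode == OP_JZ) {
        /* The 16 bit target follows the opcode, low byte first. */
        uint16_t target =
            (uint16_t)(ram[cpu->pc + 1] | (ram[cpu->pc + 2] << 8));
        if (opcode == OP_JMP || cpu->z_flag)
            cpu->pc = target;   /* load the PC directly */
        else
            cpu->pc += 3;       /* jump not taken: skip opcode and operand */
    } else {
        cpu->pc += 1;           /* normal sequential execution */
    }
}
```

A `JZ` with the Z flag clear just advances the PC past its three bytes; with the Z flag set it behaves like `JMP`.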

The Intel and many other architectures have a <u>stack pointer</u> register, SP, which holds a memory address and is used to facilitate subroutines. A `CALL` instruction provides an address to load into the PC. Before the load, the current value of the PC is stored in memory at the location given by the SP. In the Intel an address is 16 bits, or two bytes, so two memory locations are used. In fact they are at SP-1 and SP-2. Then the SP is updated to SP-2. This is called a push. Then the PC gets its new value.

The result is the same as a jump. However, if the programmer wishes to jump back to the instruction following the call, that address can be reloaded into the PC from the stack. This is done with the `RET` instruction. It moves the two bytes located at SP and SP+1 to the PC and updates the SP to SP+2.
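The push and pop just described can be written out as a short C sketch. The function names are hypothetical, and `*pc` is assumed to already hold the address of the instruction following the call. Placing the high byte at SP-1 and the low byte at SP-2 is an assumption about byte order that the description above does not spell out.

```c
#include <stdint.h>

/* CALL: push the return address held in *pc, then load the PC. */
void do_call(uint8_t *ram, uint16_t *pc, uint16_t *sp, uint16_t target) {
    ram[(uint16_t)(*sp - 1)] = (uint8_t)(*pc >> 8);    /* high byte at SP-1 */
    ram[(uint16_t)(*sp - 2)] = (uint8_t)(*pc & 0xFF);  /* low byte at SP-2 */
    *sp = (uint16_t)(*sp - 2);                         /* SP now marks the top */
    *pc = target;
}

/* RET: pop the two bytes at SP and SP+1 back into the PC. */
void do_ret(const uint8_t *ram, uint16_t *pc, uint16_t *sp) {
    *pc = (uint16_t)(ram[*sp] | (ram[(uint16_t)(*sp + 1)] << 8));
    *sp = (uint16_t)(*sp + 2);
}
```

After a `do_call` followed by a `do_ret`, both the PC and the SP are back at their values from before the call.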



CPUs differ in the number of registers, their sizes, and whether they have special purposes. The 8085 has the following registers:

1. __ACC__, the accumulator. An 8 bit register often used to receive output from the ALU.
1. __FLAG__, the flag register. A 5 bit register where each bit has a special meaning.
1. __PC__, the program counter. A 16 bit register pointing to the next instruction to execute.
1. __SP__, the stack pointer. A 16 bit register pointing to the top of the stack, the most recently pushed element on the stack.
1. __B, C, D and E__, general purpose 8 bit registers that can provide data to the ALU. They can also be used in the pairs BC and DE as 16 bit registers.
1. __H and L__, the high and low registers. These are 8 bit registers that can be used as a 16 bit pair. They are special because they can hold either data or an address. If the pair holds an address, the CPU can fetch from or store to the address held in the register pair HL.
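One way to picture this register set is as a C struct. This is only a sketch: the type and function names are invented for illustration, and the accumulator is taken to be 8 bits wide, as on the actual 8085.

```c
#include <stdint.h>

/* Hypothetical model of the 8085 register set. */
typedef struct {
    uint8_t  a;           /* accumulator */
    uint8_t  flags;       /* flag register; five of the bits carry meaning */
    uint8_t  b, c, d, e;  /* general purpose registers */
    uint8_t  h, l;        /* high and low halves of the HL pair */
    uint16_t pc;          /* program counter */
    uint16_t sp;          /* stack pointer */
} regs_t;

/* Combine two 8 bit registers into one 16 bit pair value. */
uint16_t pair(uint8_t high, uint8_t low) {
    return (uint16_t)((high << 8) | low);
}

/* Fetch the byte whose address is held in the HL pair. */
uint8_t load_via_hl(const regs_t *r, const uint8_t *ram) {
    return ram[pair(r->h, r->l)];
}

/* Store a byte at the address held in the HL pair. */
void store_via_hl(const regs_t *r, uint8_t *ram, uint8_t value) {
    ram[pair(r->h, r->l)] = value;
}
```

With H holding 0x20 and L holding 0x10, the pair HL addresses memory location 0x2010.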

#### The ALU: 
 
The Arithmetic Logic Unit (ALU) is a network of boolean logic gates that can compute a variety of functions. While the CPU works step by step, the ALU functions in one step. Input is presented to the ALU at the beginning of an instruction step, and the output is collected at the end of the instruction step. The ALU can guide the computation from one step to the next by setting condition flags. For instance, if a step results in a zero value, besides delivering that value to some output register, it sets the Z flag in the Flag register.
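A minimal sketch of an ALU addition that also sets condition flags might look as follows. The function name is made up, only the zero and carry flags are modelled, and the bit positions chosen for them are assumptions to be checked against the Intel manual.

```c
#include <stdint.h>

#define FLAG_Z  0x40  /* zero flag (assumed bit position) */
#define FLAG_CY 0x01  /* carry flag (assumed bit position) */

/* Add two bytes. Returns the 8 bit sum and sets *flags from the result. */
uint8_t alu_add(uint8_t x, uint8_t y, uint8_t *flags) {
    unsigned wide = (unsigned)x + (unsigned)y;  /* up to 9 significant bits */
    uint8_t sum = (uint8_t)wide;                /* keep the low 8 bits */
    *flags = 0;
    if (sum == 0)
        *flags |= FLAG_Z;    /* result is zero */
    if (wide > 0xFF)
        *flags |= FLAG_CY;   /* a carry left the top bit */
    return sum;
}
```

Adding 0xFF and 0x01 gives the byte 0x00 with both flags set, so a following `JZ` would be taken.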


#### The Instruction Fetch and Decode

We have mentioned that the PC register holds the address of the next instruction to execute. The instruction fetch and decode component of the CPU is responsible for fetching the instruction and, once the instruction has been brought into some internal memory on the CPU, for manipulating the CPU electronics to carry out the instruction.

For instance, a `JMP` instruction would carry a 16 bit value, and the decode would copy that value into the PC.

An `ADD` instruction includes a register indicator. The indicator is a 3 bit number, 0 through 7, which selects one of the 8 bit registers (on the 8085, one of the eight codes instead selects the memory byte addressed by HL). The decode will route the accumulator data and the data in the given register to the adder circuitry in the ALU, and will capture the result of the circuitry back into the accumulator.
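The decode of `ADD` can be sketched in C as below. The opcode pattern `10000rrr` and the register numbering are taken to be those of the 8085 and should be verified against the Intel manual. The function names and the idea of keeping the registers in an array indexed by register code are inventions for the example.

```c
#include <stdint.h>

/* Register codes in the low 3 bits of the opcode (assumed 8085 numbering).
   Code 6 is not a register: it names the memory byte addressed by HL. */
enum { REG_B, REG_C, REG_D, REG_E, REG_H, REG_L, REG_M, REG_A };

/* True when the opcode has the form 10000rrr, that is, ADD r. */
int is_add(uint8_t opcode) {
    return (opcode & 0xF8) == 0x80;
}

/* Execute ADD r: accumulator <- accumulator + regs[r].
   regs is indexed by register code; the caller is assumed to have copied
   the byte at address HL into regs[REG_M] beforehand. */
void exec_add(uint8_t opcode, uint8_t regs[8]) {
    uint8_t r = opcode & 0x07;  /* register indicator, 0 through 7 */
    regs[REG_A] = (uint8_t)(regs[REG_A] + regs[r]);
}
```

Under these assumptions opcode 0x80 adds register B to the accumulator, and 0x87 adds the accumulator to itself.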


#### Negative numbers

The data stored in a byte can be looked at in a few ways.

1. The 8 bits can be put into 256 zero-one combinations. So the byte can be said to store one of 256 different symbols. The symbols themselves can have an arbitrary meaning.
1. The 8 bits can be seen as the bits of an 8 bit binary number, so the byte is understood to store a number between 0 and 255.
1. The integer range can include positive and negative numbers using <u>sign magnitude</u>: one bit is set aside to mark the number given by the remaining 7 bits as positive or negative. The numbers represented range from -127 to +127.
1. The integer range can include positive and negative numbers using <u>two's complement</u>. The nature of modular arithmetic is used to pair up the numbers so that their sum is 256. Since 256 is 0 modulo 256, if x is paired with y we have x+y=0 and therefore x=-y. This is convenient because ordinary binary addition, with the carry out of the top bit discarded, then gives the correct result for signed numbers as well. This gives a range of -128 to +127.

In sign magnitude, only 255 of the 256 different bit combinations are used. This is because the number zero is represented twice &mdash; as a negative zero and a positive zero.
 
In two's complement, exactly two numbers pair with themselves, 0 and 128, as these are the solutions of 2x=0 (mod 256). The others pair with something distinct. We assign 0 as positive and 128 as negative. This gives a range of -128 to +127.
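The two signed readings of a byte can be compared with a few lines of C. This is a sketch with made-up function names, working only on unsigned bytes so that the modular arithmetic is explicit.

```c
#include <stdint.h>

/* Two's complement partner of x: the byte y with x + y = 0 modulo 256.
   0 and 128 are their own partners. */
uint8_t negate(uint8_t x) {
    return (uint8_t)(256u - x);
}

/* Read a byte as a two's complement number in the range -128..+127. */
int as_twos_complement(uint8_t x) {
    return (x < 128) ? (int)x : (int)x - 256;
}

/* Read a byte as sign magnitude: the top bit is the sign, the low 7 bits
   the magnitude. Both 0x00 and 0x80 come out as zero. */
int as_sign_magnitude(uint8_t x) {
    int magnitude = x & 0x7F;
    return (x & 0x80) ? -magnitude : magnitude;
}
```

The byte 0xFF reads as -1 in two's complement but as -127 in sign magnitude.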

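The following C program, saved as `boolean-adder.c`, adds two 8 bit numbers given as strings of binary digits. It applies a one-bit full adder to each bit position in turn, starting from the least significant bit and passing the carry along.

```c
#include<stdio.h>
#include<string.h>

#define N 8 

/* print n bits, most significant bit first */
void print_aux(char * a, int n) {
    int i ;
    for (i=0;i<n;i++) 
        printf("%c", (a[i])?'1':'0' ) ;
    return ; 
}

void print_results(char * a, char * b, char *c, char carry, int n) {
    printf("A: ") ;
    print_aux(a,n) ;
    printf("\nB: ") ;
    print_aux(b,n) ;
    printf("\nC: ") ;
    print_aux(c,n) ;
    printf("\n%scarry out\n\n", (!carry)?"no ":"") ;
    return ;
}


// ------------------------


/* one-bit full adder: returns the sum bit, stores the carry in *c_out */
char full_adder(char a, char b, char c_in, char * c_out) {
    *c_out = (a & b)|(b&c_in)|(a&c_in) ;
    return a ^ b ^ c_in ;
}

int main(int argc, char * argv[]) {
    char r_A[N], r_B[N], acc[N] ;
    char cy_flag, carry ;
    int i ;

    /* expect two operands of exactly N binary digits each */
    if (argc!=3 || strlen(argv[1])!=N || strlen(argv[2])!=N) {
        fprintf(stderr, "usage: boolean-adder <%d bits> <%d bits>\n", N, N) ;
        return 1 ;
    }

    for (i=0;i<N;i++) {
        r_A[i] = (argv[1][i]=='1') ;
        r_B[i] = (argv[2][i]=='1') ;
        acc[i] = 0 ;
    }
    cy_flag = 0 ; 
    
    /* index N-1 is the least significant bit; ripple the carry upward */
    for (i=(N-1);i>=0;i--) {
        acc[i] = full_adder(r_A[i],r_B[i],cy_flag,&carry) ;
        cy_flag = carry ;
    }
    
    print_results(r_A, r_B, acc, cy_flag, N) ;
    return 0 ;
}
```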


Compiling the program and running it on several test values from a notebook cell:

```python
program = "boolean-adder"
!cc -o {program} {program}.c
for test_value in ["00000000 00000000", "00000001 00000001", 
                   "00111111 00000001", "11101110 00000010", 
                   "11110000 00110000"]:
    !./{program} {test_value}
```

gives the following output:

```
A: 00000000
B: 00000000
C: 00000000
no carry out

A: 00000001
B: 00000001
C: 00000010
no carry out

A: 00111111
B: 00000001
C: 01000000
no carry out

A: 11101110
B: 00000010
C: 11110000
no carry out

A: 11110000
B: 00110000
C: 00100000
carry out
```

