# Computer Architecture

Modern computers are very complex; according to some, they are among the most complicated things created by humans. For our purposes, we want to understand computers in terms of three major components: a CPU, which performs computation; memory, which _remembers_ short-term data; and disk, which stores long-term data.



#### Laptop internals
![](images/EBMotherboard.JPG)

#### Motherboard
![](images/Supermicro-X12SCA-F-Overview.jpg)

#### Memory (RAM)
![](images/RAM-Modules.jpg)

#### CPU
![](images/How_to_stress_test_your_CPU-Hero.jpg)

#### Disk drive
![](images/Laptop-hard-drive-exposed.jpg)


## CPU (Central Processing Unit)
![](images/How_to_stress_test_your_CPU-Hero.jpg)

CPUs are often called the brains of a computer. The main purpose of a cpu is to perform computation. 
![](images/overview-fig1.png)


### Instruction (or function) execution

In general terms, _all_ computation, all functions, all fancy libraries written by anyone, must eventually be translated to the functions or instructions which come built-in to a CPU. You might be surprised to learn that modern cpus only have a few hundred instructions!

At a high level, for our purpose, think of cpus containing three types of instructions: arithmetic, control flow and memory related.

#### Arithmetic operations
Cpus contain _many_ instructions of type 

`add_int   arg1 arg2 output`

`add_float arg1 arg2 output`

`mult_int   arg1 arg2 output`

`mult_float arg1 arg2 output`


The syntax shown above is not standard, but it demonstrates that there are instructions for basic math. Since the code to add two integers must be different from the code to add two floating point numbers, there are separate instructions. Higher level languages, like Python, may provide a single interface and hide away type complexity, but low level code always differentiates. There are no optional arguments, no default values, no such niceties in low level programming.

Notice also that these functions have three parameters. The last parameter is the output location of the result. We will understand this better after we study registers.

#### Memory operations
Memory load/store instructions are perhaps not as obvious as arithmetic operations. Load/store operations can be thought of as these examples:

`load src_memory_index dst_register`

`store src_register dst_memory`

(Note that the above examples are simplified to illustrate a point)

#### Comparison operations
Several instructions are provided for comparison

`cmp arg1 arg2 out`

`lt arg1 arg2 out`

`lteq arg1 arg2 out`

The instructions above are simplified examples of `compare`, `less than` and `less than or equal to`.


#### Control flow operations
The last major type of instruction consists of control flow:

`jmp iftrue jmp_location`

These instructions correspond to if/else statements or loop constructs in higher level languages.
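To make the link to loops concrete, here is a minimal sketch of a toy interpreter for the simplified instructions above. Everything in it is an assumption made for illustration: the name `run_with_jumps`, the encoding of instructions as Python tuples, and the plain list standing in for the cpu's registers (covered in the next section). It adds the numbers 1 through 5 using nothing but an add, a comparison and a conditional jump.

```python
def run_with_jumps(program, registers):
    """Execute toy instructions until we run off the end of the program."""
    pc = 0  # "program counter": index of the next instruction to execute
    while pc < len(program):
        op, *args = program[pc]
        pc += 1
        if op == "add_int":
            arg1, arg2, output = args
            registers[output] = registers[arg1] + registers[arg2]
        elif op == "lteq":
            arg1, arg2, output = args
            registers[output] = registers[arg1] <= registers[arg2]
        elif op == "jmp":
            iftrue, jmp_location = args
            if registers[iftrue]:
                pc = jmp_location  # go back (or forward) instead of continuing
        else:
            raise ValueError(f"unknown instruction: {op}")
    return registers


# Slots: 0 = total, 1 = counter, 2 = limit, 3 = the constant 1, 4 = comparison result
program = [
    ("add_int", 0, 1, 0),  # total = total + counter
    ("add_int", 1, 3, 1),  # counter = counter + 1
    ("lteq", 1, 2, 4),     # flag = counter <= limit
    ("jmp", 4, 0),         # if flag is true, jump back to instruction 0
]
result = run_with_jumps(program, [0, 1, 5, 1, False])
print(result[0])  # 15, the sum of 1..5
```

The `jmp` at the end is all a `while` loop really is at this level: a comparison followed by a jump back to an earlier instruction.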

### Registers

In an earlier example, we show the following (simplified) instruction:

`add_int arg1 arg2 output`

In this instruction, `arg1` and `arg2` can be integers or locations of a register. The `output` parameter refers to a register location.

_Registers_ can be thought of as a tiny amount of memory attached to a cpu. Think of registers as a tiny array which can only hold a few tens of integers (32-128 are common numbers of registers). 

Since computers can't possibly run with such small amount of memory, many instructions, as we saw earlier, load data from main memory into these registers and copy contents of these registers back to main memory.
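The round trip between main memory and registers can be sketched with a few lines of Python. This is only a toy model: the function name `run_load_store`, the tuple encoding and the choice of four registers are assumptions for illustration, not how any real cpu is programmed.

```python
def run_load_store(program, memory, num_registers=4):
    """Run toy load / add_int / store instructions against a list used as memory."""
    registers = [0] * num_registers
    for op, *args in program:
        if op == "load":
            src_memory_index, dst_register = args
            registers[dst_register] = memory[src_memory_index]
        elif op == "add_int":
            arg1, arg2, output = args
            registers[output] = registers[arg1] + registers[arg2]
        elif op == "store":
            src_register, dst_memory = args
            memory[dst_memory] = registers[src_register]
        else:
            raise ValueError(f"unknown instruction: {op}")
    return registers


memory = [40, 2, 0]       # main memory: two inputs and a slot for the answer
program = [
    ("load", 0, 0),       # register 0 = memory[0]
    ("load", 1, 1),       # register 1 = memory[1]
    ("add_int", 0, 1, 2), # register 2 = register 0 + register 1
    ("store", 2, 2),      # memory[2] = register 2
]
registers = run_load_store(program, memory)
print(memory)     # [40, 2, 42]
print(registers)  # [40, 2, 42, 0]
```

Notice that the addition itself never touches `memory`; data has to be loaded into registers first and stored back afterwards.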

#### Programs are just data, loaded from memory

The cpu needs to load programs from main memory, into its local memory (different from data registers). These programs, themselves, are nothing more than the types of instructions we saw above. _All_ programs, C, Python, R, Java eventually get translated to the types of instructions we saw above.

#### Registers only contain zeros and ones

You may have heard the expression that everything is zeros and ones to a computer. This is where that expression becomes a reality.

Modern cpu registers are made up of 64 _bits_.

Register 1 `|_|_|_|_|_|_|_|_|_|..._|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|`

Register 2 `|_|_|_|_|_|_|_|_|_|..._|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|`

Register 3 `|_|_|_|_|_|_|_|_|_|..._|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|`

...

Each location in the set of registers shown above can only contain a zero or a one (or True/False or On/Off, however you want to think of it).

What does the number "42" mean? How can it be stored in a computer at the lowest level? You can probably tell from context, all numbers, strings, pictures, videos, video games, web pages, programs, must be converted to a set of zeros and ones. We will look at how to do so in a later section.

#### There are cache layers between registers and memory
Earlier we saw an example of a memory operation: `load src_memory_index dst_register`

In reality, this operation will not _necessarily_ pull a value from memory and put it in the target register. That would be an _extremely_ expensive operation. There are several layers of `caches` which try to reduce register to memory traffic. Later in this lecture we will see an important link about the cost of accessing data from cache vs memory.

**Exercise** What kind of CPU do you have?

In [None]:
import platform

platform.processor()

If numbers are represented by finite digits, is there a limit to the arithmetic we can do?

In [1]:
.1 + .2 == .3

False

In [4]:
.1 + .3

0.4

In [None]:
.2 + .3 == .5

In [None]:
.3 - (.1 + .2)

## Memory (RAM or Random Access Memory)
![](images/RAM-Modules.jpg)

Most modern consumer laptops come with 4-16 gigs of ram.

As stated earlier, modern CPUs operate on 64 bits at a time. For example, integers are stored in 64 bits (we will see how, later).

There are 8 bits in a byte, so an integer takes up 8 bytes.

There are 1,000,000 bytes in a megabyte (approx) so a megabyte has 125,000 integers.

There are (approx) 1,000 mb in a gigabyte so a gb has about 125,000,000 integers

In other words, a machine with just 8 gigs can hold 1,000,000,000 integers. **A billion integers**.

Keep in mind that technically, these numbers are counted in terms of powers of two, so a gigabyte actually has 1024 megabytes, not 1000. We (and the rest of the industry) have simplified the calculation.
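The arithmetic above is easy to check. The sketch below uses made-up names (`integers_that_fit` and friends) and simply divides the number of bytes by the 8 bytes each 64 bit integer needs, once with the simplified "1000" convention and once with the power-of-two convention.

```python
BITS_PER_INTEGER = 64
BYTES_PER_INTEGER = BITS_PER_INTEGER // 8  # 8 bits in a byte


def integers_that_fit(gigabytes, bytes_per_gigabyte):
    """How many 64 bit integers fit in the given amount of memory."""
    return gigabytes * bytes_per_gigabyte // BYTES_PER_INTEGER


simplified_count = integers_that_fit(8, 1000**3)    # gigabyte = 1000 * 1000 * 1000 bytes
power_of_two_count = integers_that_fit(8, 1024**3)  # gigabyte = 1024 * 1024 * 1024 bytes

print(f"{simplified_count:,}")    # 1,000,000,000
print(f"{power_of_two_count:,}")  # 1,073,741,824
```

Either way, the answer is "about a billion", which is why the simplification is harmless here.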

Further, your 8 gig laptop can't actually store 1 billion integers, since it also needs to store the operating system, the program processing all those integers, etc. In reality, it may be able to simulate holding more, since it will _swap_ to disk (pretend that the disk is part of the memory).

You can think of ram as a giant array. You can access the array via its index (location 0, location 1, location 23423, etc.). Each location can only hold 64 bits of information.
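Python's standard `array` module gives a rough feel for this picture. The sketch below is only an analogy (the variable name `ram` is ours, and real memory is managed by the operating system, not by a Python object): it creates a fixed-type array whose slots each hold one signed 64 bit integer, then reads and writes slots by index.

```python
from array import array

# Type code "q" means "signed 64 bit integer" on mainstream platforms.
ram = array("q", [0] * 1000)

ram[0] = 7
ram[23] = 42

print(ram[23])           # 42
print(ram.itemsize * 8)  # bits per slot
print(len(ram) * ram.itemsize)  # total bytes used by the 1000 slots
```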

## Disk

The final main component of a modern computer is the disk drive. You can easily buy disk drives of a terabyte or more. A terabyte can hold approximately 125 billion 64 bit integers.

Older disk drives are often the only physically moving parts on a computer. These disk drives contain spinning disks. These disks spin quite fast (7200 RPM is a common number) and a physical arm makes small movements to reach the location containing requested data.

Similar to RAM, you can think of your disk as an even more gigantic array, accessible by location, containing 64 bits of information at a time.

![](images/Laptop-hard-drive-exposed.jpg)

## Why registers, ram and disks? Answer: surprisingly important

Notice that CPU registers can only contain a handful of integers, ram can hold about a billion and disk can hold hundreds of billions or more. Why this separation?

There is a cost/speed tradeoff. According to the website pcpartpicker, current (2019, Q4) prices of ram and disk are as follows:

Ram:  $6.25/Gig

Disk: $0.39/Gig

Far more important is the performance difference. Below is a very well known table of latency numbers (the amount of time it takes to get a value):

https://gist.github.com/hellerbarde/2843375

("Latency numbers every programmer should know" and "Latency numbers humanized")

**Exercise** Read the link above

**Exercise** R and Pandas operate on data which fits in memory. Why not just extend them to data which fits on disk? (Hint: algorithms for such slow data are completely different.)

## Integers (and floats) to zeros and ones

Computers store numbers in "binary" format. Note that normal numbers (base 10) are made up of 10 digits:

`0, 1, 2, 3, 4, 5, 6, 7, 8, 9`

We can re-write these digits as such:

`00, 01, 02, 03, 04, 05, 06, 07, 08, 09`


When we need to go beyond the first 10 digits, we replace the initial zero with 1

` 1 0, 1 1, 1 2, ...`

Once we need to go beyond the first 20 numbers, we repeat the logic stated above:

` 2 0, 2 1, 2 2, ...`

Binary numbers (base 2) are made up of two digits:

`0, 1`

Similar to base 10, we can re-write these numbers as:

`00, 01`

When we need to go beyond 0 and 1, we continue to count up in the following manner:

`00, 01, 10, 11`

When we need to go beyond the first 4:

`000, 001, 010, 011, 100, 101, 110, 111`

The following is a good way of visualizing:

`..|_|_|_|_|_|_|_|_|`<br>
`...7.6.5.4.3.2.1.0 <= location`<br>
`2^.7.6.5.4.3.2.1.0 <= value`

Example: 00000011 = 0 + 0 + 0 + 0 + 0 + 0 + 2^1 + 2^0 = 2 + 1 = 3

![](images/calc.png)
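The counting scheme above can be written out as a short program. The helper names `to_binary` and `from_binary` are ours; Python also has built-ins (`format(n, "b")` and `int(bits, 2)`) that do the same job, and the sketch prints one of them for comparison.

```python
def to_binary(number, width=8):
    """Return the bits of a non-negative integer as a string of zeros and ones."""
    bits = []
    for _ in range(width):
        bits.append(str(number % 2))  # the lowest bit is the remainder after dividing by 2
        number //= 2                  # drop that bit and continue with the rest
    return "".join(reversed(bits))    # we collected bits from location 0 upward


def from_binary(bits):
    """Add up 2^location for every location that holds a 1."""
    total = 0
    for location, bit in enumerate(reversed(bits)):
        total += int(bit) * 2**location
    return total


print(to_binary(3))             # 00000011
print(to_binary(42))            # 00101010
print(from_binary("00000011"))  # 3
print(format(42, "08b"))        # 00101010, using the built-in
```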

### Limits of integers

Not many years ago, cpus were 32 bit (unlike 64 bit, which is today's register size). If all bits of a 32 bit integer are set to 1, you get 4,294,967,295.

That figure ignores negative numbers. In reality, the first bit is often the sign bit (an indicator that the number is positive or negative). This means that the actual max value of a signed 32 bit integer is about half of that: 2,147,483,647.

This has had real world implications! YouTube had to change their counter from 32 bits to 64 bits because the "Gangnam Style" view count exceeded 2 billion. Memory locations are also identified by integers, so a 32 bit machine can only point at about 4 billion distinct locations. If you want to jump to a value at, say, location 5 billion on a 32 bit machine, you simply can't do it! This is why 32 bit machines were limited to roughly 4 gigs of ram.

64 bits give us about 18 quintillion (roughly 1.8 × 10^19) distinct values, far more than most programs will ever need to count.
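Python integers never overflow, so the limits of fixed-size integers can be computed directly. This is a small sketch with illustrative variable names; it only does arithmetic and does not model how a cpu behaves when a value overflows.

```python
all_ones_32 = int("1" * 32, 2)  # a 32 bit pattern with every bit set to 1
signed_max_32 = 2**31 - 1       # one of the 32 bits is reserved for the sign
signed_max_64 = 2**63 - 1       # same idea with 64 bits
distinct_values_64 = 2**64      # how many different patterns 64 bits can hold

print(f"{all_ones_32:,}")         # 4,294,967,295
print(f"{signed_max_32:,}")       # 2,147,483,647
print(f"{signed_max_64:,}")       # 9,223,372,036,854,775,807
print(f"{distinct_values_64:,}")  # 18,446,744,073,709,551,616
```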

### Floating point numbers

With integers, we can easily understand limits of numbers. On a 32 bit machine, we will eventually run out of numbers (both on the positive end as well as the negative end).

What about numbers with decimals in them? How are they stored and what are their limits?

Given a decimal number, it first needs to be written in two parts: a mantissa (also called the significand) and an exponent.

`number = significand * 10 ^ exponent`

The hardware uses the same idea with base 2 instead of base 10. In a 64 bit float, 52 bits are allocated to the significand and 11 to the exponent (the remaining bit is for the sign).

Example: `1.23 = 1.229999.. * 10 ^ 0`
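You can look at the three fields of a real 64 bit float using the standard `struct` module. The helper names `float_parts` and `rebuild` are ours. The sketch assumes the usual IEEE 754 layout, in which the stored exponent is offset by 1023 and the significand has an implied leading 1, and `rebuild` only handles ordinary (normal) numbers, not zero, infinity or NaN.

```python
import struct


def float_parts(x):
    """Split a 64 bit float into its sign, exponent and fraction bit fields."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))  # reinterpret the 8 bytes as an integer
    sign = bits >> 63                  # 1 bit
    exponent = (bits >> 52) & 0x7FF    # 11 bits
    fraction = bits & ((1 << 52) - 1)  # 52 bits
    return sign, exponent, fraction


def rebuild(sign, exponent, fraction):
    """Reassemble a normal float: (-1)^sign * 1.fraction * 2^(exponent - 1023)."""
    return (-1) ** sign * (1 + fraction / 2**52) * 2.0 ** (exponent - 1023)


sign, exponent, fraction = float_parts(1.23)
print(sign, exponent - 1023)            # 0 0, i.e. positive, times 2^0
print(format(fraction, "052b"))         # the 52 stored significand bits
print(rebuild(sign, exponent, fraction) == 1.23)  # True
```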

#### Accumulation of errors

Notice that both the significand and the exponent are of limited size, so each of them has its limitations. Floating point numbers are, by definition, approximations. In a large calculation, such as one where you are multiplying numbers hundreds or thousands of times, these errors will accumulate.
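A small experiment shows the accumulation. Adding `0.1` ten times in a plain loop does not land exactly on `1.0`, because each addition rounds a little. The standard library's `math.fsum` tracks the lost pieces and returns the correctly rounded total. This is a minimal sketch; the variable names are illustrative.

```python
import math

running_total = 0.0
for _ in range(10):
    running_total += 0.1  # every addition rounds to the nearest representable float

careful_total = math.fsum([0.1] * 10)  # compensates for the rounding along the way

print(running_total)         # 0.9999999999999999
print(running_total == 1.0)  # False
print(careful_total == 1.0)  # True
```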

#### Comparison of floats

Recall from an earlier lecture that you should never compare floating point numbers directly. Since the numbers are not stored exactly, the comparison may not return the expected answer:

In [None]:
.1 + .2 

In [None]:
.3 == (.1 + .2)

Instead of comparing directly, you should always check whether the two numbers are close enough that the error doesn't matter:

In [None]:
.3 - (.1 + .2)

Such a tiny number is close enough to zero that you can take it to _mean_ zero. You can also use data types built specifically for decimals (Python's `Decimal`), but they will often be _much_ slower than native floating point operations.
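Python's standard library has helpers for both approaches. The sketch below uses `math.isclose` for the "close enough" check and `decimal.Decimal` for exact decimal arithmetic; the variable names are ours. One detail worth knowing: `math.isclose` uses a _relative_ tolerance by default, so when comparing against zero you need to pass an absolute tolerance (`abs_tol`) yourself.

```python
import math
from decimal import Decimal

difference = .3 - (.1 + .2)

exact_match = (.3 == .1 + .2)
close_match = math.isclose(.3, .1 + .2)
zero_default = math.isclose(difference, 0.0)               # relative tolerance only
zero_abs_tol = math.isclose(difference, 0.0, abs_tol=1e-9)  # explicit absolute tolerance
decimal_match = Decimal("0.1") + Decimal("0.2") == Decimal("0.3")

print(exact_match)    # False
print(close_match)    # True
print(zero_default)   # False
print(zero_abs_tol)   # True
print(decimal_match)  # True
```

Note that the `Decimal` values are built from strings; `Decimal(0.1)` would faithfully copy the already-inexact float.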

In financial domains, decimals such as cents are often represented as integers. So while you may think of representing "1 dollar and 23 cents" as `1.23`, a financial programmer is likely to represent it as `123` (the amount is multiplied by a fixed factor, such as 100).
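A brief sketch of the integer-cents idea follows. The variable names and the formatting choice are assumptions for illustration; real financial systems also have to decide how to round when dividing, which is not covered here.

```python
# Amounts are stored as whole cents, so addition is exact.
dime_in_cents = 10
twenty_cents = 20
total_in_cents = dime_in_cents + twenty_cents
cents_are_exact = (total_in_cents == 30)  # compare with .1 + .2 == .3 using floats

# Convert to dollars and cents only when displaying.
price_in_cents = 123
dollars, cents = divmod(price_in_cents, 100)
label = f"${dollars}.{cents:02d}"

print(cents_are_exact)  # True
print(label)            # $1.23
```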

You can also round both numbers to a fixed number of decimal places before comparing:

In [None]:
round(.3, 10) == round(.1 + .2, 10)

## Strings to zeros and ones

We have seen how integers and floating point numbers are translated to the "zeros and ones" which computers understand. But how are characters and strings encoded?

Quite simply, there are a few standard ways characters are converted to integers, which are then converted to zeros and ones.

#### ASCII

The most common such mapping is called ASCII, or as _no_ one calls it, "American Standard Code for Information Interchange."

![](images/ascii.png)

**Exercise** There are 128 items on the table above, is there a particular reason for such number?

Notice that not all items on the table are printable characters. The table has space for a bell (007), delete (127), new line (010), carriage return (013), etc.
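Python exposes this mapping through the built-in functions `ord` (character to integer) and `chr` (integer to character). The short sketch below, with illustrative variable names, follows a few characters all the way down to zeros and ones.

```python
message = "Hi!"
codes = [ord(character) for character in message]

for character, code in zip(message, codes):
    # character -> integer from the table -> that integer written in binary
    print(character, code, format(code, "08b"))

capital_a = chr(65)  # going the other way: integer -> character
new_line = chr(10)   # a non-printable entry from the table

print(codes)           # [72, 105, 33]
print(capital_a)       # A
print(repr(new_line))  # '\n'
```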

#### Unicode

Modern operating systems and programming environments now use a standard called Unicode. Unlike ASCII, Unicode has _thousands_ of character mappings. It covers almost all written languages, current or historical.

![](images/unicode_sample.png)

The first version of Unicode (1991) provided around 7,000 characters, covering English, Arabic, Armenian, Bengali, Bopomofo, Cyrillic, Devanagari, Georgian, Greek, Coptic, Gujarati, Gurmukhi, Hangul, Hebrew, Hiragana, Kannada, Katakana, Lao, Latin, Malayalam, Oriya, Tamil, Telugu, Thai and Tibetan. The version released in May 2019 has almost 138,000 characters.

Note that the first 128 characters of Unicode (numbered 0 to 127) match the ASCII characters, thereby preserving some backward compatibility.

By default, when you open a file to read as text (rather than binary), Python will attempt to decode it as Unicode text, using UTF-8 on most modern systems (the exact default depends on the platform's settings).
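Unicode assigns each character a number; an _encoding_ such as UTF-8 then decides how that number is written as bytes. The sketch below (variable names are ours) shows that ASCII characters still take one byte in UTF-8, while other characters take more. The two non-ASCII characters are written as escapes so the example does not depend on how this file itself is encoded.

```python
text = "na\u00efve \u2603"  # "naive" with a diaeresis on the i, a space, and a snowman

for character in text:
    # each character's Unicode number and how many bytes UTF-8 needs for it
    print(repr(character), ord(character), len(character.encode("utf-8")))

encoded = text.encode("utf-8")
round_trip = encoded.decode("utf-8")

print(len(text))     # 7 characters
print(len(encoded))  # 10 bytes
print(round_trip == text)  # True
```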

#### Other encodings

There are some other encodings floating around the internet. As mentioned earlier, Python will attempt to open a file as Unicode (for text reading). If the file was encoded in a different format, you may have to explicitly override the encoding:

`file = open(file_name, encoding="latin-1")`
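Here is a small, self-contained demonstration of why the override matters. It writes a file in Latin-1 to a temporary folder, fails to read it back as UTF-8, and then succeeds with the right encoding. The file name `menu.txt` and the variable names are made up for the example.

```python
import os
import tempfile

text = "caf\u00e9"  # "cafe" with an accented e

with tempfile.TemporaryDirectory() as folder:
    file_name = os.path.join(folder, "menu.txt")

    # Write the file using Latin-1, where the accented e is the single byte 0xE9.
    with open(file_name, "w", encoding="latin-1") as file:
        file.write(text)

    # Reading it back as UTF-8 fails: a lone 0xE9 byte is not valid UTF-8.
    try:
        with open(file_name, encoding="utf-8") as file:
            file.read()
        utf8_worked = True
    except UnicodeDecodeError:
        utf8_worked = False

    # Reading with the encoding the file was actually written in works.
    with open(file_name, encoding="latin-1") as file:
        recovered = file.read()

print(utf8_worked)        # False
print(recovered == text)  # True
```

In this case Python raised an error, which is the lucky outcome. Some wrong guesses decode without error and silently produce garbled characters instead.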

### Space considerations

Notice a subtle but important issue. The number 100 takes up 64 bits on modern computers (as shown above). The number 1,000 also takes up 64 bits. In fact, the largest number 64 (signed) bits can hold is 9,223,372,036,854,775,807!

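A string, in contrast, grows with every character. The string "100" has three characters and "1000" has four, and each character takes up its own fixed number of bits. NumPy, for example, uses 32 bits per character, so "1000" takes 32 * 4 = 128 bits, twice as large as the integer 1000, and every extra digit adds another 32 bits.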



In [13]:
import numpy as np

In [None]:
np.array([1]).nbytes, np.array(["1"]).nbytes

In [None]:
np.array([10]).nbytes, np.array(["10"]).nbytes

In [None]:
np.array([100]).nbytes, np.array(["100"]).nbytes

In [None]:
np.array([1000]).nbytes, np.array(["1000"]).nbytes

In [None]:
np.array([10000]).nbytes, np.array(["10000"]).nbytes

**Exercise** If your database, Pandas or R dataframe wrongly stores a numeric ID as a string instead of a number, how does this affect the storage and time performance of your work?

**Exercise** Many databases store large number of records in "row" format:

```
name,  age, profession,   niceness
homer, 38 , nuclear tech, 4
burns, 103, ceo         , 0
marge, 36 , home maker  , 8
bart , 10 , student     , 3
lisa , 8  , student     , 7
```

Some databases store data in column format:

```
profession: nuclear tech, ceo, home maker, student, student
niceness  : 4, 0, 8, 3, 7
name      : homer, burns, marge, bart, lisa
age       : 38, 103, 36, 10, 8
```
What is the cost of finding the average `niceness` score of students?
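As a starting point for the exercise, the two layouts can be mimicked with ordinary Python containers. This is only a sketch: the names `rows`, `columns` and the two helper functions are ours, and a real database stores these layouts on disk rather than in lists. While reading it, count how many values each function has to look at.

```python
# Row format: one complete record after another.
rows = [
    ("homer", 38, "nuclear tech", 4),
    ("burns", 103, "ceo", 0),
    ("marge", 36, "home maker", 8),
    ("bart", 10, "student", 3),
    ("lisa", 8, "student", 7),
]

# Column format: all values of one field stored together.
columns = {
    "name": ["homer", "burns", "marge", "bart", "lisa"],
    "age": [38, 103, 36, 10, 8],
    "profession": ["nuclear tech", "ceo", "home maker", "student", "student"],
    "niceness": [4, 0, 8, 3, 7],
}


def average_from_rows(rows):
    """Walk every record, even though only two of its four fields are needed."""
    scores = [niceness for _, _, profession, niceness in rows if profession == "student"]
    return sum(scores) / len(scores)


def average_from_columns(columns):
    """Read only the two columns involved; name and age are never touched."""
    pairs = zip(columns["profession"], columns["niceness"])
    scores = [niceness for profession, niceness in pairs if profession == "student"]
    return sum(scores) / len(scores)


print(average_from_rows(rows))        # 5.0
print(average_from_columns(columns))  # 5.0
```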


## Further reading:

What every programmer should know about floating-point arithmetic: https://floating-point-gui.de/

Python's docs on floating points: https://docs.python.org/3/tutorial/floatingpoint.html

## References:

- ASCII table: https://www.asc.ohio-state.edu/demarneffe.1/LING5050/material/characters.html (which credits Wikimedia Commons)
- Unicode sample: https://en.wikipedia.org/wiki/File:Unicode_sample.png
- Disk drive image: https://en.wikipedia.org/wiki/Hard_disk_drive#/media/File:Laptop-hard-drive-exposed.jpg
- Motherboard image credit https://www.servethehome.com/supermicro-x12sca-f-review-intel-xeon-w-1200-motherboard/
- Memory stick image credit https://www.premiumbeat.com/blog/ram-system-memory-for-video-editing/
- CPU image credit https://www.avg.com/en/signal/cpu-stress-test
- Laptop internals image credit https://en.wikipedia.org/wiki/Motherboard