## References, Pointers and RAM memory


_burton rosenberg, 23 june 2023_


### Table of contents.

1. <a href="#intro">Introduction</a>
1. <a href="#caveats">Caveats</a>




### <a name=intro>Introduction</a>


We have looked at the architecture of the Intel 8085 processor, in my opinion the most influential microprocessor design ever. We look at this microprocessor because it is simple enough to be documented in a few pages. Today's computers add a few concepts that the 8085 did not have, notably protection and virtual memory, but otherwise it is pretty much exactly the computer we compute on today.

<div style="float:right;margin:2em;width:450px;border:1px solid green;padding:1em;">
<a title="MCS-80/85 Family User's Manual, Intel, October 1979" href="https://archive.org/details/Mcs80_85FamilyUsersManual/">
<img src="../images/databuses_8085.png">
<br>MCS-80/85 Family User's Manual, Intel, October 1979.
</a>
</div>


We mentioned that a basic computer has four components:

- The CPU, containing instruction sequencing and decoding, registers, and an ALU for instruction execution,
- A RAM memory, an integer-addressable array of 8-bit bytes,
- An I/O system, which is quite varied in its functionality and implementation,
- And a bus to connect these three components.

This diagram from the MCS-80/85 Family User's Manual helpfully shows us that the bus has a channel for addresses, a channel for data, and a channel for control.

The C language provides almost direct contact with the RAM. Higher-level languages such as Java and Python do not, and in fact attempt to hide the RAM behind abstractions. We will therefore use C to understand RAM.

We claim that RAM is an integer-addressable sequence of 8-bit bytes. In C, the 8-bit byte is the data type `char`. Most of the time, the programmer does not deal with the address of a variable. It is the role of the compiler to lay out data in the RAM and keep track of where it is. The compiler will substitute the address of a variable for the name of the variable wherever it appears in the program.

There are, however, times when the programmer deals directly with the memory. Dynamically allocated memory, used for data structures that change over time, is an example. C does not leave the programmer completely without aid in dealing with memory, so it does not simply hand out integer addresses. It provides _references_, which can be used in two ways,

- A reference can be dereferenced to give a value. This happens in the context of an R-value.
- A reference can be dereferenced to give a storage location in which to store a value. This happens in the context of an L-value.

The names L-value and R-value come from left and right. A variable appearing on the left of an assignment is a location to store to; a variable appearing on the right of an assignment is a location to retrieve from.


### <a name=caveats>Many caveats</a>

The notion of an address being an integer glosses over many details. 

First off, there is a difference between the addresses that the CPU and the programmer see and the addresses that the bus and the RAM see. The former are _virtual addresses_ and the latter are _physical addresses_.

This is needed since many programs are running independently on the computer, each making references to memory locations. Each must work in its own independent memory space, so that it does not matter if addresses conflict across programs. This is accomplished with a _virtual memory system_ that is implemented partially in hardware and partially by the operating system in software.

This is an advanced topic, and we mention it only in passing so that you are aware of it. If you are interested, it is well worth learning about. Almost all computers now have virtual memory; our Intel 8085, however, did not.

Second, for efficiency, memory stores and fetches tend to be in "buckets" rather than single bytes. Much like grocery shopping, where one makes a list of all the needed items for a single trip to the store, a memory store or fetch carries a _cache line_ of bytes, which includes the byte of interest along with neighboring bytes, on the assumption that they too will soon be needed.

Third, this bucketing is carried out on various levels, creating level 1, level 2 and level 3 caches. The idea of an integer-addressable memory is not contradicted by these caches, but the notion that we travel to RAM with each memory request is. Data is batched even more than a cache line, to make sure that the slow process of accessing RAM is minimized when possible.

Fourth, the Intel architecture uses a _segmented memory_ scheme, in which memory is tagged with a few flavors, such as Data, Text, and Stack. A memory address is a pair: a segment descriptor identifying the segment, and an integer offset within that segment. This is complicated beyond words and specific to Intel. However, other architectures are free to create their own exotic memory structures. These complications are best hidden, not just for the purposes of teaching, but so that C code works on all architectures without source code changes.