# Unit 8 Part 1: Assignment and References
## Paul Curzon

## Interactive Programming Exercises

### Learning Outcomes
- Explain why assignment appears to work differently between values of simple types and complex ones.

*This notebook looks at code fragments. After completing these exercises you MUST then go on to write full programs - see the programming exercises in the workbook.*

*Answers to exercises are given at the end.*

**Always read the answers to exercises and compare them to your own. There are important things to learn from the answers.**

<span style="color: red;"> It is a really good idea to add your own notes throughout this notebook to reinforce what you have learnt and highlight important points. Click on the + in the toolbar above to make a new note area, and change it from Code to Markdown in the dropdown menu above if your note is not executable code. You may also want to highlight your notes in red as here, so they stand out. You change colours using span like this: </span>
```
<span style="color: red;">THE TEXT TO COLOUR RED</span>
```

## Assignment

We have seen that variables are like storage boxes 
- but that can only store one thing at once, so when a new thing is stored the old thing is shredded,
- and things are copied from them, never moved.

However, there is some apparently odd behaviour of assignment that suggests something more is going on. The above is a good model of what is happening, but something else comes into play - references. References are just pointers from one storage location in memory to another, and understanding them is key to understanding the odd behaviour.

Let us start by reviewing the basics from the early units. 

### Exercise 1<a id="Exercise1"></a>

Read the following code (from the first units) and predict exactly what it prints. After making your prediction run it to check you were right. Then explain what happens.

**Write your prediction and explanation here**


In [None]:
int num1 = 16;
int num2 = 32;

System.out.println("num1 holds " + num1);
System.out.println("num2 holds " + num2);

num2 = num1;

System.out.println("num1 holds " + num1);
System.out.println("num2 holds " + num2);

num1 = 8;

System.out.println("num1 holds " + num1);
System.out.println("num2 holds " + num2);

**NOW READ THE ANSWER** [Click here to jump to the solution to this exercise](#Solution1)

### Exercise 2<a id="Exercise2"></a>

The following code is identical to the above except that it now uses array variables instead of int variables to store the single integer values (in arrays of size 1). Read the code and predict exactly what it prints.

Then run the code to see if you were right.

**Write your prediction here**


In [None]:
int [] numarray1 = {16};
int [] numarray2 = {32};

System.out.println("numarray1 holds " + numarray1[0]);
System.out.println("numarray2 holds " + numarray2[0]);

numarray2 = numarray1;

System.out.println("numarray1 holds " + numarray1[0]);
System.out.println("numarray2 holds " + numarray2[0]);

numarray1[0] = 8;

System.out.println("numarray1 holds " + numarray1[0]);
System.out.println("numarray2 holds " + numarray2[0]);

**NOW READ THE ANSWER** [Click here to jump to the solution to this exercise](#Solution2)

## What is happening

Something odd is happening. You would expect (given what we understand about assignment from the previous units) the two examples to print the same results, but they don't. In the second example, changing a value in numarray1 has apparently changed numarray2 as well even though it is not mentioned. However, they are different variables - different storage spaces that are supposedly not connected. The example seems to suggest assignment has worked differently: as a result of the assignments the two variables appear to be now connected in a strange way.

**Do not jump to conclusions!**

Despite what it looks like, it is NOT assignment that is working differently in the two situations. **It is the way different values are stored that causes the difference.** A variable declaration creates a storage space in memory ('a box') and the variable name is attached to that box. That is the case whatever the type of value stored.

Simple values like integers are stored directly in the box. So the result of 
```java
num1 = 16;
```
is to put 16 in the storage space called num1.

| num1 |
| :---: |
| 16 |

An array is stored differently, however. It is stored in two parts so involves two different storage spaces not just one. The first storage space is similar to the one for integers. It is the place in the computer memory the variable name actually refers to. However the array value itself is not stored there. Instead a **reference** to the value is stored there. A reference is just a **pointer**. It points to the second storage space reserved for the array which is where the actual value is stored. In reality, a reference is a **memory address**: just a big number identifying a specific place in memory. Memory addresses are a bit like the index of an array but giving the position in the whole of memory, rather than just a position in an array.

Below, to help illustrate what is happening, we just use @1 as a shorthand for a particular memory address. We label the second, unnamed storage space by this memory address label. In reality it is just a memory address number.


| numarray1 | 
| :---: | 
| @1 |

<center>I</center>

<center>I</center>

<center>I</center>

<center>V</center>
    

| @1 | 
| :---: | 
| 16 |


The arrow shows how the stored memory address points to the other storage space.

## How does the way arrays are stored affect assignment?

When we assign a value to a variable such as
```java
num1 = 16;
```
We are just putting a value in the named storage space.

When we refer to a variable in an expression such as on the right hand side of
an assignment, we get a value out of the named storage space. Therefore,
```java
num2 = num1;
```
gets the value out of the storage space num1 and puts it in num2.

Exactly the same thing happens with assignment of arrays, except that this time what is in the storage spaces referred to by the variable names is not the data itself but a reference.

This means that when we assign a value to a variable such as
```java
numarray2 = numarray1;
```
the right hand side evaluates to give us a copy of the value in numarray1 - and that value is the **reference** that is stored in numarray1. The assignment then stores that reference into numarray2. We have copied a memory address from one place to the other. This means that we are changing where the variable numarray2 is pointing to. 
numarray2 now points to the same block of memory as numarray1. It now holds a new address.
The actual data the two array variables are pointing to is untouched by the assignment.

Both variables now point to the same storage space - the storage space that was originally set up for numarray1. The original storage space numarray2 was pointing to is lost (unless a copy of it was made). The memory concerned is eventually freed up to be used again.

When we then do an assignment to change an entry in the array, we change the information at the place both variables are pointing to.

So
```java
numarray1[0] = 8;
```
changes the value in position 0 of the block of storage that numarray1 points to, making it 8. However, numarray2 is now pointing to the same place, so when we follow its pointer by writing ```numarray2[0]``` we arrive at the same storage location and so get back the same value. Printing the values in both arrays therefore prints the same thing.


The upshot of this is that when we give an array variable name, it accesses the variable's value as
with any other variable. However, what it finds there is a reference, not the data. Assignment always
does the same thing, copying and storing new values in the named storage spaces,
but in the case of an array the values in those storage spaces being manipulated are references.
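
The difference between copying a reference and copying the data itself can be sketched as below. This is only an illustration, written as a full program: the class and method names are made up for the example, and the element-by-element loop is just one way of making a separate copy.

```java
class AliasVersusCopy {
    // Both variables hold the same reference, so they share one block of storage.
    static int readThroughAlias() {
        int[] numarray1 = {16};
        int[] numarray2 = numarray1;   // copies the reference only
        numarray1[0] = 8;              // changes the shared storage
        return numarray2[0];
    }

    // The second variable points to its own block of storage holding copied data.
    static int readFromSeparateCopy() {
        int[] numarray1 = {16};
        int[] numarray2 = new int[numarray1.length];   // a second block of storage
        for (int i = 0; i < numarray1.length; i++) {
            numarray2[i] = numarray1[i];               // copy the data, cell by cell
        }
        numarray1[0] = 8;              // only the first block changes
        return numarray2[0];
    }

    public static void main(String[] args) {
        System.out.println("alias sees " + readThroughAlias());             // alias sees 8
        System.out.println("separate copy sees " + readFromSeparateCopy()); // separate copy sees 16
    }
}
```

In the first method only a memory address is copied, so there is still only one array. In the second, the data is copied into a different place, so later changes to one array do not show up in the other.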


### Exercise 3<a id="Exercise3"></a>

Run the following code to see what happens. Given the above explanation of how arrays are stored, explain what happens.

**Write your prediction and explanation here**


In [None]:
int [] numarray1 = {16};
System.out.println(numarray1);
System.out.println(numarray1[0]);

**NOW READ THE ANSWER** [Click here to jump to the solution to this exercise](#Solution3)

## Following references

When we give an index to an array variable (eg ```numarray1[0]```) we are telling the computer to follow the pointer. 
You can think of a reference as being a signpost telling you how to get
to the real storage location. Then treat the ```[]``` as meaning "follow the signpost and go to
where it is pointing to find the actual data."

So ```numarray1[0]``` says go to numarray1 and note
the reference stored there, then "follow the signpost" ie go to the block of storage at the
given memory address. The ```[0]``` says once you get there go to the 0th place on from that. 

An expression ```a[5]```
 says look at the reference in a, then go to that storage space it points to, but then go on 5 places further from that point to find the actual data. The value found there is what ```a[5]``` evaluates to.

The print statement:
```java
System.out.println(numarray1[0]);
```
thus prints the value stored in the location where numarray1 is pointing. 16 is stored there 
so 16 is passed to the print method to be printed.
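
As a small illustration of "follow the signpost, then count on", here is a sketch using a six-element array. The class name, method name and the values stored are assumptions chosen purely for the example.

```java
class IndexDemo {
    // Follows the reference in a, then goes on 5 places to find the data.
    static int sixthValue() {
        int[] a = {10, 20, 30, 40, 50, 60};
        return a[5];
    }

    public static void main(String[] args) {
        int[] a = {10, 20, 30, 40, 50, 60};
        System.out.println(a[0]);          // 10: the place the reference points to
        System.out.println(a[5]);          // 60: five places further on
        System.out.println(sixthValue());  // 60
    }
}
```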


### Exercise 4<a id="Exercise4"></a>

Edit the following code so that the two arrays have length 2 and so store two values each. Run it to check that the same thing happens when the other values in the array are accessed.

In [None]:
int [] numarray1 = {16};
int [] numarray2 = {32};

System.out.println("numarray1 holds " + numarray1[0]);
System.out.println("numarray2 holds " + numarray2[0]);

numarray2 = numarray1;

numarray1[0] = 8;

System.out.println("numarray1 holds " + numarray1[0]);
System.out.println("numarray2 holds " + numarray2[0]);

**NOW READ THE ANSWER** [Click here to jump to the solution to this exercise](#Solution4)

### Exercise 5<a id="Exercise5"></a>

Predict what you think the following code will do. Explain what happens. 

Are records stored like integers or like arrays using references?

**Write your answer here**


In [None]:
class Animal
{
   String name;
}


Animal variable1 = new Animal();
Animal variable2 = new Animal();

variable1.name = "cat";
variable2.name = "dog";

variable2 = variable1;

variable1.name = "cow";

System.out.println("variable1 holds " + variable1.name);
System.out.println("variable2 holds " + variable2.name);

**NOW READ THE ANSWER** [Click here to jump to the solution to this exercise](#Solution5)

### Exercise 6<a id="Exercise6"></a>

Convert the following code so that the variables are of type char and store single characters in them (ie values
'a', 'b' and 'c').

Based on the results, are single characters stored like integers in the named storage space or as references like arrays?

**Write your answer here**


In [None]:
int variable1 = 16;
int variable2 = 32;

System.out.println("variable1 holds " + variable1);
System.out.println("variable2 holds " + variable2);

variable2 = variable1;

variable1 = 8;

System.out.println("variable1 holds " + variable1);
System.out.println("variable2 holds " + variable2);

**NOW READ THE ANSWER** [Click here to jump to the solution to this exercise](#Solution6)


## Strings
Strings are more complicated. They are actually stored as references, but in a way that hides the fact (basically they are defined as an abstract data type hiding the reference implementation). The key difference is that **String values cannot be changed once created**. Each new String value is stored in a different storage place. When you store a new String value in a variable you are first creating a completely new sequence of characters stored on the **heap** and then making a reference that points to it, which is stored in the variable. When you concatenate two strings like 
```
"hello " + "Paul"
```
to create the String ```"hello Paul"``` you are making a completely new string not changing either
 ```"hello "```  or  ```"Paul"```.
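
A minimal sketch of this, assuming variable names made up for the example: after the concatenation, the two original Strings are still exactly as they were, and the result is a third, separate String.

```java
class ConcatDemo {
    // Returns the two original Strings and the new one made from them.
    static String[] build() {
        String greeting = "hello ";
        String name = "Paul";
        String message = greeting + name;   // a completely new String is created
        return new String[] {greeting, name, message};
    }

    public static void main(String[] args) {
        String[] results = build();
        System.out.println("greeting holds " + results[0]);  // greeting holds hello 
        System.out.println("name holds " + results[1]);      // name holds Paul
        System.out.println("message holds " + results[2]);   // message holds hello Paul
    }
}
```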

### Exercise 7<a id="Exercise7"></a>

Run the following code, which is similar to that above. Given that Strings **are stored as references like arrays**, can you explain, based on the explanation above, why String assignment seems to have the same effect as with integers rather than as with arrays and records?

In [None]:
String variable1 = "cat";
String variable2 = "dog";

System.out.println("variable1 holds " + variable1);
System.out.println("variable2 holds " + variable2);

variable2 = variable1;

System.out.println("variable1 holds " + variable1);
System.out.println("variable2 holds " + variable2);

variable1 = "ant";

System.out.println("variable1 holds " + variable1);
System.out.println("variable2 holds " + variable2);

**NOW READ THE ANSWER** [Click here to jump to the solution to this exercise](#Solution7)

## The heap and the stack

When a Java program runs the memory it uses is divided into two areas called the stack and the heap that are both organised and used differently. All declared variables are allocated a place on the stack in the next free space there. Variables thus refer to the stack. It is thus a very organised area of memory in a way that supports localisation.

For variables of simple types like integers and characters, the variable is a place on the stack. When values are stored in a variable they are stored in that place on the stack. So a declaration like

```java
int count = 1;
```

creates a storage space on the stack, labels it count and the value 1 is stored in that position on the stack.

For variables holding reference types like arrays, records and Strings, the variable also corresponds to a place on the stack. However, the value stored there on the stack is the reference. All references then point to storage locations on the heap.

Therefore, in a declaration like

```java
int [] ages = {1,2,3};
```
one storage space is created on the stack. It is labelled ages, and it holds a reference. The values 1, 2 and 3 are stored together on the heap. The reference stored on the stack points to that new place on the heap.

The heap can hold references too, pointing to other places on the heap. If we declare an array of Strings, then all the values stored in the array (on the heap) are Strings. However, those Strings are themselves stored as references, so the values in the array are just references to other places on the heap.
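
The following sketch illustrates an array whose cells hold references. The names used are assumptions for the example. A reference copied out of a cell into its own variable carries on pointing to the old String even after the cell is overwritten with a reference to a different one.

```java
class StringArrayDemo {
    // Returns what names[0] and first hold after the cell is overwritten.
    static String[] run() {
        String[] names = {"cat", "dog"};
        String[] sameNames = names;   // two variables, one array on the heap
        String first = names[0];      // copies the reference held in cell 0
        sameNames[0] = "cow";         // cell 0 now refers to a different String
        return new String[] {names[0], first};
    }

    public static void main(String[] args) {
        String[] results = run();
        System.out.println("names[0] holds " + results[0]);  // names[0] holds cow
        System.out.println("first holds " + results[1]);     // first holds cat
    }
}
```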

## new

Given the above explanation of references, the stack and the heap, we can now start to understand what **new** does: it just allocates storage space on the heap.
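
One consequence worth seeing, sketched below with made-up names, is that every use of new allocates a different block of storage. Two arrays created by two separate uses of new are therefore not connected, even if they have the same size.

```java
class NewDemo {
    // Changes the first array and returns what the second one holds.
    static int secondArrayUnaffected() {
        int[] first = new int[3];    // one block of storage on the heap
        int[] second = new int[3];   // a different block of storage
        first[0] = 7;                // only the first block changes
        return second[0];            // still the default value
    }

    public static void main(String[] args) {
        System.out.println("second[0] holds " + secondArrayUnaffected());  // second[0] holds 0
    }
}
```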

### Exercise 8<a id="Exercise8"></a>

Explain what each line in the following code fragment does in terms of the stack and the heap.

**Write your explanation here**


In [None]:
int myage = 50;

int[] ages;
ages = new int[5];
ages[3] = myage;
System.out.println(ages[3]);

**NOW READ THE ANSWER** [Click here to jump to the solution to this exercise](#Solution8)

### Exercise 9<a id="Exercise9"></a>

What will the following print? Run it to see if you are right and explain what it is doing.

**Write your answer here**


In [None]:
System.out.println(new int[3]);

**NOW READ THE ANSWER** [Click here to jump to the solution to this exercise](#Solution9)

### Exercise 10<a id="Exercise10"></a>

Summarise what you know about the way values of different types are stored.

**Write your summary here**


**NOW READ THE ANSWER** [Click here to jump to the solution to this exercise](#Solution10)

*Once you have done the above exercises (and understand how the concepts work) move on to doing the actual programming exercises from the workbook, writing full programs. You must be able to write full programs, not just fragments.*

## Solutions

### Solution to Exercise 1<a id="Solution1"></a>

It prints
```
num1 holds 16
num2 holds 32
num1 holds 16
num2 holds 16
num1 holds 8
num2 holds 16 
```

The code stores 16 in num1 and then 32 into num2, printing both out.
It then makes a copy of the value in num1 and stores it in num2, leaving num1 alone.
```java
num2 = num1;
```
num2 as a result now holds 16 as does num1 still.
We then assign the value 8 to num1. That means num1 gets the value 8, losing its previous value.
```java
num1 = 8;
```
As this assignment does not mention num2, num2 retains the value 16

Thus the final values printed are 8 and 16.

[Return to Exercise](#Exercise1)

### Solution to Exercise 2<a id="Solution2"></a>

Despite looking essentially the same, this time it prints a different value at the end for numarray2
```
numarray1 holds 16
numarray2 holds 32
numarray1 holds 16
numarray2 holds 16
numarray1 holds 8
numarray2 holds 8
```
Changing the value in variable numarray1 seems to have also changed the value in numarray2 even though they are different variables so different storage spaces. It seems like the assignment
```java
numarray2 = numarray1;
```
has behaved completely differently here to the equivalent-looking
```java
num2 = num1;
```

It *seems* to have somehow made the two variables be the same box. It hasn't!

In fact, the assignments are behaving exactly the same in the two examples of exercise 1 and 2. The difference is not due to the assignment but due to the fact that integers and arrays are stored in different ways, as we will see.

[Return to Exercise](#Exercise2)

### Solution to Exercise 3<a id="Solution3"></a>
It prints something like:
```
[I@1032949d
16
```
where the first line printed is an apparently meaningless string of digits and letters. On the second line 16 is printed.

The code first creates an array variable, numarray1 storing in it a reference to another storage space
where the 16 is stored. When we print numarray1 itself: not one cell of it like ```numarray1[0]``` but the whole thing as in
```java
System.out.println(numarray1);
```
the reference is printed.

We are not actually printing the data here. We are printing the reference that is stored in the
storage space numarray1 itself - a memory address (so essentially a big number). Strictly, what Java prints is ```[I``` (its code for the type "array of int"), then ```@```, then a hexadecimal number (a hash code) that identifies this particular array and stands in for its address. Hexadecimal (base 16) is a number representation which has 16 digits so contains letters a-f as the extra digits that come after 9. Each time you run it a different place in memory will be **allocated** (ie chosen) for the array storage. That is why the hexadecimal number printed is usually different each time it is run.

By contrast, when we print ```numarray1[0]``` as in
```java
System.out.println(numarray1[0]);
```
we are going to that memory address and printing the value stored there. It is where the 16 was put so
that is what we print. You can think of the square brackets as meaning follow the pointer.

[Return to Exercise](#Exercise3)

### Solution to Exercise 4<a id="Solution4"></a>

The same thing does happen: all values in numarray1 and numarray2 are changed. All values in the array are accessed by following the reference (pointer), so once the variable is pointing to a different place all the array locations are affected.

In [None]:
int [] numarray1 = {16, 17};
int [] numarray2 = {32, 33};

System.out.println("numarray1 holds " + numarray1[0] + "," + numarray1[1]);
System.out.println("numarray2 holds " + numarray2[0] + "," + numarray2[1]);

numarray2 = numarray1;

numarray1[0] = 8;
numarray1[1] = 9;

System.out.println("numarray1 holds " + numarray1[0] + "," + numarray1[1]);
System.out.println("numarray2 holds " + numarray2[0] + "," + numarray2[1]);


[Return to Exercise](#Exercise4)

### Solution to Exercise 5<a id="Solution5"></a>

It prints
```
variable1 holds cow
variable2 holds cow
```
so even though we only changed variable1's field to cow, variable2's field has changed too.

This means that, like arrays, records are stored as references to the actual record.

Refer to the copy of the program below. It declares a new type, Animal, with a single field - its name. Two variables of type Animal are created, one storing the String "cat" and the other storing the String "dog". However, the two variables actually just store references to other storage spaces in memory. Those Strings "cat" and "dog" are stored elsewhere. The references stored in the variables indicate where. 

The assignment
```java
variable2 = variable1;
```
makes a copy of the reference that was in variable1 (which points to the storage holding "cat") and puts it in variable2. At this point both variables point to the same place - where "cat" is stored.

The next instruction, the assignment
```java
variable1.name = "cow";
```
says go to variable1 and follow the reference you find there. Where you end up you find a record with a single field called name. Go into that field and put the String "cow" there.

Now because variable2 contains the same reference as variable1, it is pointing at the same storage space where "cat" *was* stored, but where now "cow" is stored. That means when we print the name field of either variable1 or variable2, it is now "cow". Both variables point to the same storage location so both hold the same values.


In [None]:
class Animal
{
   String name;
}


Animal variable1 = new Animal();
Animal variable2 = new Animal();
variable1.name = "cat";
variable2.name = "dog";

variable2 = variable1;

variable1.name = "cow";

System.out.println("variable1 holds " + variable1.name);
System.out.println("variable2 holds " + variable2.name);

[Return to Exercise](#Exercise5)

### Solution to Exercise 6<a id="Solution6"></a>

Change the types to char and make the three values assigned to them different. If variable1 and variable2 remain different at the end then the variables are storing the actual values.

The following code prints
```
variable1 holds c
variable2 holds a
```
The last assignment to variable1 does not change the value printed when variable2 is printed, so they are NOT storing references.

Characters are stored like integers, actually in the named variable storage space, not as references, as are the other basic types such as booleans and doubles.

This means the final assignment only affects variable1 not variable2, so only variable1 gets the new value of the character ```'c'```


In [None]:
char variable1 = 'a';
char variable2 = 'b';

System.out.println("variable1 holds " + variable1);
System.out.println("variable2 holds " + variable2);

variable2 = variable1;

variable1 = 'c';

System.out.println("variable1 holds " + variable1);
System.out.println("variable2 holds " + variable2);

[Return to Exercise](#Exercise6)

### Solution to Exercise 7<a id="Solution7"></a>

Strings are stored as references. However, each separate string value ("cat", "dog" and "ant") is allocated its own specific storage space on the heap that does not change. 

The initialisations make the two variables point to different places. The assignment
```java
variable2 = variable1;
```
stores a copy of the reference from variable1 into variable2. That makes them both point to the same memory address so the same sequence of characters ie "cat".

When we do the final assignment,
```java
variable1 = "ant";
```
we are creating a completely new string "ant" at another place in memory. The reference to that address is then stored in the variable, variable1.

Notice here all the assignments to variables are concerned with the variable itself. With arrays and records, the last assignment did not change the memory address value stored in the variable, but followed the reference and changed a value in the place pointed to... 

With arrays we follow the reference by giving an index as in ```a[1]```. 

With records we follow the reference by giving a field name as in ```b.name```

The final assignment in the String example is not following a reference, just overwriting it with a new one.
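
The same contrast can be seen with arrays alone. The sketch below (class and method names are made up for the example) compares overwriting the reference held in an array variable with following that reference and changing the data it points to.

```java
class OverwriteVersusFollow {
    // The assignment is to the variable itself: the reference is overwritten.
    static int afterOverwritingReference() {
        int[] numarray1 = {16};
        int[] numarray2 = numarray1;
        numarray1 = new int[] {8};   // numarray1 now points to a new array
        return numarray2[0];         // numarray2 still points to the old one
    }

    // The assignment follows the reference: the shared data is changed.
    static int afterFollowingReference() {
        int[] numarray1 = {16};
        int[] numarray2 = numarray1;
        numarray1[0] = 8;            // changes the array both variables point to
        return numarray2[0];
    }

    public static void main(String[] args) {
        System.out.println("after overwriting: " + afterOverwritingReference());  // after overwriting: 16
        System.out.println("after following: " + afterFollowingReference());      // after following: 8
    }
}
```

The first method behaves just like the String example: the last assignment only changes what is stored in the variable numarray1, so numarray2 is unaffected.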


[Return to Exercise](#Exercise7)

### Solution to Exercise 8<a id="Solution8"></a>

```java
int myage = 50;
```
This creates a storage space on the stack, labels it myage and stores the value 50 in that place on the stack.

```java
int[] ages;
```

This creates a storage space on the stack and labels it ages. Nothing is explicitly stored in it - no reference - so a null pointer (a pointer that points nowhere) is stored in it. 

```java
ages = new int[5];
```

The new command finds and allocates a new block of memory on the heap that is big enough to store an array of five integers. It is reserved so will not be allocated to anything else. The storage spaces in memory are also initialised with default values (0 for integers). new returns the reference to it (its memory address) and that memory address is stored by the assignment into the storage location labelled ages on the stack.

So after it is executed, the variable ages (on the stack) is now pointing to the new block of memory (on the heap) and in that block of memory, five 0 values are stored.

```java
ages[3] = myage;
```

This goes to myage which is on the stack and takes a copy of the value (50) stored there. It then goes to the variable ages (also on the stack) follows the reference to the heap and counts on 3 memory locations (staying on the heap) to get to the position which ```ages[3]``` refers to. The value 50 is stored in that storage location (on the heap).

```java
System.out.println(ages[3]);
```
This 
- goes to the stack where ages is stored, 
- follows the reference from there to the heap, 
- goes on 3 places further
- pulls out the value from that location (the value 50) and 
- prints that value 50.
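
A short sketch, using a made-up class and method name, confirms two points from the explanation above: the cells of a new int array start out holding 0, and the assignment into the array stores a copy of the integer, so a later change to myage does not affect the array.

```java
class CopyIntoArrayDemo {
    // Builds the array as in the exercise, then changes myage afterwards.
    static int[] fill() {
        int myage = 50;
        int[] ages = new int[5];   // five cells on the heap, each holding 0
        ages[3] = myage;           // a copy of 50 is stored on the heap
        myage = 51;                // only the variable on the stack changes
        return ages;
    }

    public static void main(String[] args) {
        int[] ages = fill();
        System.out.println("ages[3] holds " + ages[3]);  // ages[3] holds 50
        System.out.println("ages[0] holds " + ages[0]);  // ages[0] holds 0
    }
}
```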

[Return to Exercise](#Exercise8)

### Solution to Exercise 9<a id="Solution9"></a>
It prints something like
```
[I@4971bf88
```
though it changes each time. 

new returns the memory address (a reference) to a new area of memory that has been allocated. Therefore, if we print the result of new being called we print the memory address. A similar thing happens if we call new when declaring a new record variable.
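
To see the record case for yourself, a sketch along the following lines can be used. It assumes the Animal record from Exercise 5, and the demo class and method names are made up for the example. The exact text printed varies from run to run, but it consists of the type name, then @, then a hexadecimal number.

```java
class Animal {
    String name;
}

class PrintRecordDemo {
    // Builds the same text that println would print for a new Animal.
    static String describeNewAnimal() {
        return "" + new Animal();
    }

    public static void main(String[] args) {
        System.out.println(new Animal());        // prints the reference, not a name
        System.out.println(describeNewAnimal()); // a second record, so a different value is likely
    }
}
```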

[Return to Exercise](#Exercise9)

### Solution to Exercise 10<a id="Solution10"></a>

Memory is divided into the stack and the heap. Whenever a variable is declared, whatever the type, space is allocated for it on the stack. The variable's storage space resides on the stack. For simple values like integers, booleans etc that is the only memory used to store the value. For complex values of array, record and String type, however, what is stored there is a reference to a second storage space. That second storage space is on the heap. It is allocated when new is executed, whether explicitly as in 
```java
int [] a = new int [100];
```
or implicitly as in
```java
int [] b = {1,2,3};
```
The latter is just shorthand for a more complex expression involving new. new is also called implicitly whenever a String value is created.

The above declaration of int array variable, a, first creates an array variable and then initialises it with a reference to the new storage space allocated on the heap.
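
One way to picture the shorthand is sketched below. The class and method names are assumptions for the example, and the longer version is only an equivalent way of getting the same result, not necessarily what Java does internally.

```java
class InitialiserDemo {
    // The shorthand form: new is called implicitly.
    static int[] withShorthand() {
        int[] b = {1, 2, 3};
        return b;
    }

    // The longer form: allocate on the heap, then fill in each cell.
    static int[] withNew() {
        int[] b = new int[3];
        b[0] = 1;
        b[1] = 2;
        b[2] = 3;
        return b;
    }

    public static void main(String[] args) {
        int[] first = withShorthand();
        int[] second = withNew();
        System.out.println(first[0] + "," + first[1] + "," + first[2]);     // 1,2,3
        System.out.println(second[0] + "," + second[1] + "," + second[2]);  // 1,2,3
    }
}
```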

This has consequences for assignment. An assignment to a variable always changes the storage space on the stack whatever the type. For variables storing values with reference types this means it changes the reference stored there so it points to a new place. This means it is possible to have two variables pointing to the same place on the heap. Then changing one value changes the other. The two variable names end up being aliases of the same array value. This can lead to very hard to understand programs, with very subtle bugs, so should be used with great care if at all.

To change the value on the heap you have to indicate in some way that the value on the heap is meant, by giving for example an index into an array or a field name.

It also has consequences for tests of equality and method call as we will see.


[Return to Exercise](#Exercise10)