# Week 1: Introduction to Data Types

# Advanced: Copying and References

Welcome to the *Week 1 Advanced Python Notebook*. This notebook is designed for students who already have substantial experience with Python and feel confident working with both the [`Beginner`](./week_01_intro_to_data_types_beginner.ipynb) and [`Intermediate`](./week_01_intro_to_data_types_intermediate.ipynb) material.  

Your task today is to carefully read through the content and complete the exercises at the end. These exercises are more challenging and are intended to deepen your understanding of how Python handles data behind the scenes.  

> **Important:** This notebook is only recommended if you are already very confident with Python. Before beginning, you must have attempted at least $4$ exercises from both the [`Beginner`](./week_01_intro_to_data_types_beginner.ipynb) and [`Intermediate`](./week_01_intro_to_data_types_intermediate.ipynb) notebooks. If you have not done so, please return to those notebooks first, as the material here builds directly on those foundations.  

In this notebook, you will explore the key differences between *immutable* and *mutable* data types in Python. Specifically, you will learn how Python handles *copying and references*, which is essential knowledge when it comes to debugging complex code.  

Work through the examples carefully, and take your time with the exercises. They are designed to stretch your understanding and prepare you for advanced applications of Python.  


### Table of Contents

 - [Welcome Page](./week_01_home.ipynb)

 - [Beginner: Basic Data Types](./week_01_intro_to_data_types_beginner.ipynb)
 - [Intermediate: Collections](./week_01_intro_to_data_types_intermediate.ipynb)
 - [**Advanced: Copying and References**](./week_01_intro_to_data_types_advanced.ipynb)
   - [Immutable vs Mutable](#Immutable-vs-Mutable)
   - [The id Function](#The-id()-Function)
   - [The is Operator](#The-is-Operator)
   - [Exercises](#Exercises)
 - [Slides](./week_01_slides.ipynb) ([Powerpoint](./Lecture1_Introduction_And_Data_Types.pptx))

## Immutable vs Mutable


In Python, all data types can be described as either "immutable" or "mutable".

To understand the difference between "immutable" and "mutable" types, it is useful to introduce the notion of a *reference*. You can think of a *reference* as an address which tells us where some data lives in a machine's memory. When we talk about variables, we are really talking about named *references* that point us to some data in memory.

When you reassign the value of a variable in your code, there are actually two possible things that could be happening. The *reference* could be changed (i.e. the variable now represents a different place in memory), or the data itself could be changed (i.e. the variable is still "looking" at the same location in memory, but the data that is stored there has changed).

What is important to know here is that the data held by an "immutable" object can never be changed in place, so whenever you change the value of a variable which has an "immutable" data type you are changing a *reference*. With "mutable" data types, on the other hand, you can also change the data itself. Examples of mutable data types in Python include the `list`, the `dict` (dictionary) and the `set`. Examples of immutable data types include the `int`, `float`, `Decimal`, `bool`, `str` and `tuple`. In general, most of the collection types discussed so far are "mutable", with the `tuple` being the notable exception.
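As a rough illustration of this distinction (a minimal sketch; the variable names are made up for this example), a list can be modified in place, whereas attempting the same item assignment on a tuple raises a `TypeError`:

```python
# A list is mutable: we can change one of its elements in place.
numbers_list = [1, 2, 3]
numbers_list[0] = 99
print(numbers_list)

# A tuple is immutable: item assignment is not allowed.
numbers_tuple = (1, 2, 3)
try:
    numbers_tuple[0] = 99
    tuple_was_modified = True
except TypeError as error:
    tuple_was_modified = False
    print("TypeError:", error)

# The tuple still holds its original values.
print(numbers_tuple)
```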

The distinction between a data type being "mutable" or "immutable" may seem dull and/or trivial but, in practice, can result in some very unexpected behaviour, especially when multiple variables are using the same *reference* (i.e. "looking" at the same place in memory)!

For example, in the code below we may expect `a` and `b` to end up with different values:

In [None]:
a = 7
b = a # a and b are now both looking in the same place in memory
a = 10 # Here we have changed the reference
print(a)
print(b) # Changing a has not changed b

And they do! However, if we change `a` from being `7` to a list containing `7` then, perhaps surprisingly, changing the value of `a` also changes the value of `b`!

In [None]:
a = [7]
b = a # a and b are now both looking in the same place in memory
a[0] = 10 # Here we have changed the data itself!
print(a)
print(b) # Changing a has changed b

In both the examples above, we start by assigning `a` and `b` as references to the same location in memory. 

In the first example, when we assign `a=10`, we are telling Python that `a` must change where it is "looking" in the computer's memory. This does not have any effect on the value of `b`.


In the second example, however, when we assign `a[0]=10`, we are telling Python that the data stored in the location which `a` is "looking" at must be changed. As `b` is also "looking" at this location in memory, this does have an effect on the value of `b`. It has changed!

It is worth noting, though, that when an operation produces a new object (as `a*2` does below), the result no longer shares a reference with the original:

In [None]:
a = [7]
b = a*2 # In this case b is a reference to a new object.
a[0] = 10
print(a)
print(b) # Changing a has not changed b

To ensure we are working with a copy of `a`, and not just another reference to the same object, we can use the `list` constructor (see below). 

 > **Note:** In this case `a` is a `list` so we use the `list` constructor. For other datatypes similar constructors exist and would be used in this situation (e.g. `set`, `dict`, etc...).

In [None]:
a = [7]
b = list(a) # This time, a and b are not both looking in the same place in memory!
a[0] = 10
print(a)
print(b) # Changing a has not changed b
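The same idea carries over to the other mutable types mentioned in the note above. The following is a small sketch (the variable names are invented for illustration) using the `dict` and `set` constructors, along with two other common ways of copying a list: the `.copy()` method and a full slice `[:]`.

```python
# Copying a dict with the dict constructor
prices = {"apple": 1.0}
prices_copy = dict(prices)
prices["apple"] = 2.5
print(prices)
print(prices_copy)  # Changing prices has not changed prices_copy

# Copying a set with the set constructor
colours = {"red"}
colours_copy = set(colours)
colours.add("blue")
print(colours)
print(colours_copy)  # Changing colours has not changed colours_copy

# Two alternatives to list(a) for copying a list
a = [7]
b = a.copy()
c = a[:]
a[0] = 10
print(a)
print(b)  # Changing a has not changed b
print(c)  # Changing a has not changed c
```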

 > **Warning:** Errors of this type can often cause extremely counter-intuitive behaviour, including unexpected interactions between functions. 
 >
 > For example, in the code below a function is called on a variable `b`, yet a seemingly unrelated variable `a` is affected by calling the function. This is because `a` and `b` are both references to the same object in memory, as opposed to being distinct copies of the object.  
 >
 > If you are not familiar with functions, do not worry; these will be covered in depth later in the course, so feel free to skip this example for now.

In [None]:
def function1(x):
    x.append(10)
    return x

# Create a variable a
a = [3]
print(a)

# Set b equal to a
b = a

# Run function1 on b; surely this couldn't affect a...
c = function1(b)

# In actual fact, as a and b are both names for the same object
# in memory, changing b was the same as changing a (note that
# the append operation in the function is where b was changed).
print(a)
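One possible way to avoid this kind of surprise is for the function to work on its own copy of the argument. The sketch below assumes a hypothetical variant of the function above, named `append_ten_to_copy` purely for illustration; it is one option rather than the only fix.

```python
def append_ten_to_copy(x):
    # Copy the incoming list so the caller's list is left untouched
    result = list(x)
    result.append(10)
    return result

a = [3]
b = a  # a and b are still two names for the same list

c = append_ten_to_copy(b)

print(a)  # a (and b) are unchanged this time
print(c)  # c is a new list with 10 appended
```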

## The `id()` Function


Now that we understand the difference between mutable and immutable types, let's take a look at the Python `id()` function, which can help us visualise what is happening with references and memory locations.

The `id()` function returns an integer which uniquely identifies an object for as long as that object exists. In CPython (the standard Python implementation) this identifier is the object's memory address, so it essentially shows us the reference that a variable is pointing to.

In [None]:
# Basic usage of id()
x = 42
print("ID of x:", id(x))
print("ID of the literal 42:", id(42))

With immutable data types, when you reassign a variable, you are changing the reference (the variable now points to a different location in memory). Let's see this with our integer example:

In [None]:
print("---------------")
print("Before a = 10:")
print("---------------")

# Integer example - reproducing the behaviour from earlier
a = 7
b = a # a and b are now both looking in the same place in memory
print("a = " + str(a) + ", id(a) = " + str(id(a)))
print("b = " + str(b) + ", id(b) = " + str(id(b)))
print("a and b reference the same object:", id(a) == id(b))

# What happens when we "change" the integer?
original_id = id(a)
a = 10 # Here we have changed the reference
print("---------------")
print("After a = 10:")
print("---------------")
print("a = " + str(a) + ", id(a) = " + str(id(a)))
print("b = " + str(b) + ",  id(b) = " + str(id(b)))
print("a and b now reference different objects:", id(a) != id(b))
print("id(a) changed:", id(a) != original_id)
print("This is why changing a did not change b!")

With mutable data types, when you modify the data, you are changing the data itself (the reference stays the same - the variable is still "looking" at the same location in memory). Let's reproduce our list example:

In [None]:
print("---------------")
print("Before a[0] = 10:")
print("---------------")

# List example - reproducing the behaviour from earlier
a = [7]
b = a # a and b are now both looking in the same place in memory
print("a = " + str(a) + ", id(a) = " + str(id(a)))
print("b = " + str(b) + ", id(b) = " + str(id(b)))
print("a and b reference the same object:", id(a) == id(b))

# Modifying the list in place
original_id = id(a)
a[0] = 10 # Here we have changed the data itself!
print("---------------")
print("After a[0] = 10:")
print("---------------")
print("a = " + str(a) + ", id(a) = " + str(id(a)))
print("b = " + str(b) + ", id(b) = " + str(id(b)))
print("a and b still reference the same object:", id(a) == id(b))
print("id(a) remained the same:", id(a) == original_id)
print("This is why changing a also changed b!")
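The `id()` function can also help to tell apart operations that look similar but behave differently. As a hedged sketch (not part of the original examples), for a list `a += [8]` extends the existing object in place, whereas `a = a + [9]` builds a new list and changes the reference:

```python
a = [7]
b = a  # a and b are now both looking in the same place in memory
id_before = id(a)

# In-place: the existing list is extended, so the reference is unchanged
a += [8]
same_after_iadd = id(a) == id_before
print("id(a) unchanged after a += [8]:", same_after_iadd)
print("b =", b)  # b sees the change

# New object: a + [9] builds a new list and a is pointed at it
a = a + [9]
same_after_add = id(a) == id_before
print("id(a) unchanged after a = a + [9]:", same_after_add)
print("a =", a)
print("b =", b)  # b still refers to the original list
```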

## The `is` Operator

The `is` operator and the `==` operator are often confused with one another, but they are not the same. The `is` operator checks whether both variables point to the same object in memory, whereas the `==` operator checks whether the values of the two variables are equal. 

If the `is` operator returns `True` then `==` will almost always return `True` as well (a rare exception is `float('nan')`, which is not equal to itself), but the opposite may or may not be the case. See the example below.

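 > **Warning:** Avoid using the `is` operator to compare values of "immutable" types such as strings and numbers; the result depends on implementation details (such as whether Python happens to reuse an existing object) and so can be unpredictable. In most cases `==` is the more appropriate choice. 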

In [None]:
# a and b are set to both represent the same object in memory.
a = b = [1,2,3]

# c represents a list in a different location in memory but with the same
# value as a and b
c = [1,2,3]
print(a)
print(b)
print(c)

# a, b and c are all equal
print('a == b: ', a == b)
print('a == c: ', a == c)

# But only a and b point to the same location in memory, c is treated as a separate 'copy' of [1,2,3]
print('a is b: ', a is b)
print('a is c: ', a is c)
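To round this off, here is a short sketch of three further cases (these go slightly beyond the examples above and are offered as illustration only): a copy made with the `list` constructor is equal to, but not the same object as, the original; `is` is commonly used to check for `None`; and `float('nan')` is a rare value that is not equal to itself.

```python
# A copy is equal to the original, but is a different object
a = [1, 2, 3]
d = list(a)
print('a == d: ', a == d)
print('a is d: ', a is d)

# Checking for None is a common, safe use of the is operator
result = None
print('result is None: ', result is None)

# A rare case where "is" is True but "==" is False
nan = float('nan')
print('nan is nan: ', nan is nan)
print('nan == nan: ', nan == nan)
```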

## Exercises

**Question 1:** Predict the output of the following code. What will be printed and why?

```
list_a = [1, 2, 3]
list_b = list_a
list_a.append(4)
print(list_b)
```

Run the code in the box below to verify your answer.

In [None]:
# Write your code here...

**Question 2:** Consider these two similar-looking pieces of code. Predict what each will output and explain the difference:

Code A:

```
a = [1, 2, 3]
b = a
a = [4, 5, 6]
print(b)
```

Code B:

```
a = [1, 2, 3]
b = a
a[0] = 4
a[1] = 5
a[2] = 6
print(b)
```

Run the code above to verify your answers.

In [None]:
# Write your code here...

**Question 3:** Below we have a list of 3 values, `x=1`, `y=2` and `z=3`. We want to work out the value of:

   > $x + y + z + x^2 + y^2 + z^2$ 
   > $= 1 + 2 + 3 + 1 + 4 + 9$
   > $= 20$
    
The code below should give us $20$ as an answer... but it doesn't - something has gone wrong! Can you see what is wrong with it? How would you fix it?
    

In [None]:
xyz = [1,2,3]

# Make a list of x squared, y squared, z squared
xyzsquared = xyz
xyzsquared[0] = xyzsquared[0]**2
xyzsquared[1] = xyzsquared[1]**2
xyzsquared[2] = xyzsquared[2]**2

# Get x, y and z from xyz list
x = xyz[0]
y = xyz[1]
z = xyz[2]

# Get x squared, y squared and z squared from
# xyzsquared list
xsquared = xyzsquared[0]
ysquared = xyzsquared[1]
zsquared = xyzsquared[2]

print(x + y + z + xsquared + ysquared + zsquared)

**Question 4:** You have a list of lists representing a grid. You want to create a backup copy before making changes, but the following approach doesn't work correctly. Explain what has gone wrong here.

In [None]:
grid = [[1, 2], [3, 4], [5, 6]]
backup = list(grid)

# Make a change to the grid
grid[0][0] = 999

print("grid =", grid)
print("backup =", backup)  # This should be unchanged, but it isn't!

**Question 5:** The following code creates a 3x3 grid filled with zeros, but when you try to modify one cell, unexpected behaviour occurs. Identify the problem and provide a solution:

In [None]:
# This code has a problem!
grid = [[0] * 3] * 3
print("Initial grid:", grid)

grid[0][0] = 1
print("After setting grid[0][0] = 1:", grid)
# Expected: [[1, 0, 0], [0, 0, 0], [0, 0, 0]]
# Actual: [[1, 0, 0], [1, 0, 0], [1, 0, 0]]

In [None]:
# Write your solution code here...