
# OOP - Part III b: Copying Objects

In this notebook, we look at the problems that arise when an object is copied naively, and at the correct ways to copy objects. Along the way, we learn more about how objects are stored in memory.

### What you will learn

In particular, we will cover:

* Deep and shallow copying of lists
* Deep and shallow copying of objects

<hr style="height: 2px">

*&#169; Pranav Singh, University of Bath 2021-2022. This problem sheet is copyright of Pranav Singh, University of Bath. It is provided exclusively for educational purposes at the University and is to be downloaded or copied for your private study only. Further distribution, e.g. by upload to external repositories, is prohibited.*

# Copying objects - shallow vs deep copy

Recall that if `a` is a `list` such as `[3,1,4]`, then `b=a` does not create a copy of the data inside the list. Instead, it just copies the reference or address - i.e. the location where the data `[3,1,4]` is stored in your computer's memory. Thus, both variables `a` and `b` end up referring to the same data.

We can see this behaviour by assigning `b=a` and then changing an element of `b`. We find that the list `a` is also modified! In fact, `a` and `b` are effectively the same object (as verified by `a is b`), since they store the same address (or location in memory).

In [None]:
a = [3,1,4]
b = a
b[1]=-5

print(a)
print(b)

a is b

Often this is not the behaviour we want or expect! Typically we want `b` to hold a *copy of the data in the list* `a`, so that any later changes to `b` do not affect `a`.

To achieve this, we should use the `copy` function from the `copy` module. We must first import the `copy` module and then call the `copy.copy` function with input `a`.

In [None]:
import copy

a = [3,1,4]
b = copy.copy(a)
b[1]=-5

print(a)
print(b)

a is b

We no longer have `a is b`: `a` and `b` now refer to different locations in memory where the list data is stored.
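One way to look at these "locations" directly is the built-in `id` function, which returns an integer identifying an object (in the standard CPython interpreter this happens to be its memory address). The short sketch below compares plain assignment with `copy.copy`; the variable names `alias` and `duplicate` are chosen here purely for illustration.

```python
import copy

a = [3, 1, 4]
alias = a                  # plain assignment: copies the reference only
duplicate = copy.copy(a)   # shallow copy: creates a new list object

# id() identifies an object; two names for the same object share one id.
print(id(a) == id(alias))      # True: same object
print(id(a) == id(duplicate))  # False: different objects
print(a == duplicate)          # True: equal contents
```

Note the difference between `is` (same object) and `==` (equal contents): the copy is equal to the original without being the same object.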

### Shallow copy

There is still an issue with this approach. Let us see this with another example.

In [None]:
a = [[3,1,4], [0,1,0]]
b = copy.copy(a)

b[1] = [-1,-1,-1]

print(a)
print(b)

a is b

This worked as expected! Changing the 2nd element of `b` (i.e. `b[1]`) does not change `a`. We can see that `a` and `b` are not the same object since we get `False` for `a is b` (they do not refer to the same location in memory). 

However, consider the following example:

In [None]:
a = [[3,1,4], [0,1,0]]
b = copy.copy(a)
b[1][1]=-5

print(a)
print(b)

a is b

Even though we used `copy.copy` to create a copy of `a`, modifying an element inside the second element of `b` still changed `a`!

What exactly is going on here? 

What is happening is that `a[0]` (the first element of `a`) stores the address (or location in memory) for the list `[3,1,4]`, while `a[1]` (the second element of `a`) stores the address (or location in memory) for the list `[0,1,0]`.

When we use `copy.copy`, `b` is indeed a new list distinct from `a`, hence `a is b` returns `False`. `b` stores a new copy of the data stored in `a`. 

However, as we have mentioned, the data actually stored in `a` is not the lists `[3,1,4]` and `[0,1,0]` themselves, but the addresses (or locations in memory) of `[3,1,4]` and `[0,1,0]`, which are stored in the elements `a[0]` and `a[1]`, respectively. Thus, `b = copy.copy(a)` ends up copying the values of the addresses (or locations) stored in `a[0]` and `a[1]`. Therefore, `b[0]` still refers to the same data as `a[0]`! We can see this by checking `b[0] is a[0]`.

In [None]:
b[0] is a[0]

In particular, no copies of `[3,1,4]` and `[0,1,0]` are created!
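The two earlier experiments can be put side by side to make the distinction explicit. The sketch below first checks that every element of the copy is shared with the original, then contrasts *rebinding* an element of `b` (which leaves `a` alone) with *mutating* a shared inner list (which shows up in `a` as well).

```python
import copy

a = [[3, 1, 4], [0, 1, 0]]
b = copy.copy(a)

# Each slot of b holds the same inner list object as the matching slot of a.
print([x is y for x, y in zip(a, b)])   # [True, True]

# Rebinding a slot of b points it at a brand new list; a is unaffected.
b[0] = [9, 9, 9]
print(a[0])   # [3, 1, 4]

# Mutating a shared inner list in place is visible through both a and b.
b[1][1] = -5
print(a[1])   # [0, -5, 0]
```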

What has happened here is called a shallow copy. That is, we have created a new copy of the data only at the topmost level: we create a new list `b` with two elements `b[0]` and `b[1]`, but we then assign the values of `a[0]` and `a[1]` to these elements. This is equivalent to the following assignments:

```Python
b[0]=a[0]
b[1]=a[1]
```

As we have already seen, doing so does not create a new copy of the list `a[0]` (or `a[1]`)! To do that, we would have to use `copy.copy()` at this level as well. How can we achieve this kind of recursive copying, which copies at all levels?
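For a list that is nested exactly two levels deep, applying `copy.copy()` at the inner level by hand might look like the following sketch. It only handles this one fixed depth, so it is an illustration of the idea rather than a general solution.

```python
import copy

a = [[3, 1, 4], [0, 1, 0]]

# Build a new outer list whose elements are shallow copies of the inner lists.
b = [copy.copy(row) for row in a]
b[1][1] = -5

print(a)             # [[3, 1, 4], [0, 1, 0]]
print(b)             # [[3, 1, 4], [0, -5, 0]]
print(b[1] is a[1])  # False
```

If the inner lists themselves contained further lists, those would again be shared, and we would need yet another level of copying.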


### Deep copy

The problems identified above can be fixed by using `copy.deepcopy`, which recursively goes through the data and creates a fresh copy at each level! 

In [None]:
a = [[3,1,4], [0,1,0]]
b = copy.deepcopy(a)
b[1][1]=-5

print(a)
print(b)

We now find that `a` is not changed when we change an element in the second list of `b`. In fact, the elements `b[0]` and `b[1]` are no longer the same objects as `a[0]` and `a[1]`:

In [None]:
b[0] is a[0]

In [None]:
b[1] is a[1]

Unless you have a good reason to create a shallow copy, it is much safer to create a deep copy. The trade-off is that a deep copy replicates all the data recursively, which increases the storage requirement.
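To make the word "recursively" concrete, here is a simplified hand-written version that works for nested lists only. The function name `deep_copy_list` is hypothetical and this is not how `copy.deepcopy` is implemented; the real function also deals with other object types and with structures that refer back to themselves.

```python
def deep_copy_list(data):
    """Hypothetical helper: recursively copy nested lists (illustration only)."""
    if isinstance(data, list):
        # Make a new list and copy each element in turn, at every level.
        return [deep_copy_list(item) for item in data]
    # Anything that is not a list is returned as-is; that is fine for
    # immutable values such as the ints used in this example.
    return data

a = [[3, 1, 4], [0, [1, 0]]]
b = deep_copy_list(a)
b[1][1][0] = -5

print(a)   # [[3, 1, 4], [0, [1, 0]]]
print(b)   # [[3, 1, 4], [0, [-5, 0]]]
```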

# Copying user defined types

To see the equivalent behaviour in the context of user-defined types, let us start by defining two new user-defined types, `A` and `B`.

In [None]:
class A(object):
    '''test class A'''
    
    def __init__(self, lst):
        self.lst = lst
        
    def __str__(self):
        return 'A{' + str(self.lst) + '}'

In [None]:
class B(object):
    '''test class B'''
    
    def __init__(self, a1, a2):
        self.a1 = a1
        self.a2 = a2
        
    def __str__(self):
        return 'B[' + str(self.a1) + ' ; '  + str(self.a2) + ']'

Let us create some instances of the class `A`. 

In [None]:
a1 = A([1,5,-1])
a2 = A([0,2])
a3 = A([-1,0,1])

Let us create instances of the class `B` where the attributes `a1` and `a2` are set to instances of class `A` that we have just created:

In [None]:
b1 = B(a1, a2)
b2 = B(a1, a3)

Since we have defined `__str__()` methods in both classes `A` and `B`, it becomes easy to print these objects:

In [None]:
print(a3)
print(b1)
print(b2)

### Copy by reference

If we simply assign the object `b1` to a new variable `b3`, we only copy the reference (or address). That is, `b3` and `b1` are the same object (as we can verify using `b3 is b1`), and modifying `b3` modifies the object `b1`.

In [None]:
b3 = b1
b3.a1 = a3

print(b3)
print(b1)

b3 is b1

### Shallow copy

We can create a shallow copy of `b1` by using `copy.copy`. Now `b3` and `b1` are not the same object!

In [None]:
b3 = copy.copy(b1)
b3.a1 = a2

print(b3)
print(b1)

b3 is b1

However, just as in the case of lists, see what happens if we modify an attribute of `b3.a1`:

In [None]:
b3 = copy.copy(b1)
b3.a1.lst = [0,0,0,0]

print(b3)
print(b1)

Changing `b3.a1.lst` has also changed `b1`. This is despite the fact that `b3` is *not* the same object as `b1`:

In [None]:
b3 is b1

However, this uniqueness only goes one level deep since we have used shallow copy, and `b3.a1` and `b1.a1` are the same object!

In [None]:
b3.a1 is b1.a1
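By analogy with the list case, the effect of a shallow copy on a simple object is similar to building a new object by hand and assigning the original's attributes to it. The sketch below uses two hypothetical stand-in classes, `Leaf` and `Pair` (stripped-down versions of `A` and `B`, renamed so that the definitions above are not overwritten), and makes no claim about how `copy.copy` works internally.

```python
import copy

class Leaf(object):
    '''hypothetical stand-in for class A'''
    def __init__(self, lst):
        self.lst = lst

class Pair(object):
    '''hypothetical stand-in for class B'''
    def __init__(self, a1, a2):
        self.a1 = a1
        self.a2 = a2

p1 = Pair(Leaf([1, 5, -1]), Leaf([0, 2]))

shallow = copy.copy(p1)
by_hand = Pair(p1.a1, p1.a2)   # new outer object, same attribute objects

# Both are new objects at the top level, but share the attributes of p1.
print(shallow is p1, shallow.a1 is p1.a1)   # False True
print(by_hand is p1, by_hand.a1 is p1.a1)   # False True
```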

### Deep copying objects

In order to copy the data in the object at all levels, we need to use the `copy.deepcopy` function.

In [None]:
b1 = B(a1, a2)
b2 = B(a1, a3)
b3 = copy.deepcopy(b1)
b3.a1.lst = [0,0,0,0]

print(b3)
print(b1)

In [None]:
b1 = B(a1, a2)
b2 = B(a1, a3)
b3 = copy.deepcopy(b1)
b3.a1.lst[0] = -100

print(b3)
print(b1)

We can verify that the attributes are unique even at deeper levels.

In [None]:
b3.a1 is b1.a1

In [None]:
b3.a1.lst is b1.a1.lst

## Check your understanding

The solutions to these exercises are provided at the very end of this notebook.

**Q1)** What does the following code print?

```Python
import copy
a = [3,1,4]
b = a
a[1] = 2
c = copy.copy(b)
b[2] = -1
c[0] = 0
print(a)
print(b)
print(c)
```

a.
```Python
[3, 2, 4]
[3, 2, -1]
[0, 1, 4]
```

b.
```Python
[3, 2, 4]
[3, 1, -1]
[0, 1, -1]
```

c.
```Python
[3, 2, -1]
[3, 2, -1]
[0, 2, 4]
```

d.
```Python
[3, 2, -1]
[3, 2, -1]
[3, 2, -1]
```



**Q2)** Consider the following definition of the class `Test`:

```Python
class Test(object):
    def __init__(self, lst):
        self.lst = lst
        
    def __str__(self):
        return str(self.lst)
```
What is the output of the following code?
```Python
a = [3,1,4]
b = a
f = Test(a)
a[1] = 0
g = Test(b)
g.lst[0] = -1
f.lst[2] = 1

print(b)
print(g)
```


a.
```Python
[-1, 0, 1]
[-1, 0, 1]
```


b.
```Python
[-1, 0, 4]
[-1, 0, 4]
```

c.
```Python
[0, 0, 4]
[-1, 1, 4]
```

d.
```Python
[3, 1, 4]
[-1, 1, 4]
```


**Q3)** With the same definition of `Test` class as above, what is the output of the following code?

```Python
import copy
a = [3,1,4]
f = Test(a)
g = copy.copy(f)
h = copy.deepcopy(f)
h.lst[1] = 3
g.lst[0] = -1
f.lst[2] = 1

print(f)
print(g)
print(h)
```



a.
```Python
[3, 1, 1]
[-1, 1, 4]
[3, 3, 4]
```

b.
```Python
[-1, 3, 1]
[-1, 3, 1]
[-1, 3, 1]
```

c.
```Python
[-1, 1, 1]
[-1, 1, 1]
[3, 3, 4]
```

d.
```Python
[3, 3, 1]
[-1, 1, 4]
[3, 3, 1]
```

<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>

## Solutions to "Check your understanding"

**Q1)** Answer: c.

**Q2)** Answer: a.

**Q3)** Answer: c.
