# Deep vs shallow copy

Copying a list in Python doesn't always work the way you expect it to. A quick overview!

In [1]:
a = 5

def add_one(x):
    x += 1
    return x

b = add_one(a)

print(a, b)

5 6


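Works exactly as expected. We pass the variable a to the function, the function adds one to its own local name x and returns the result, but the original stays the same. Integers are immutable, so "x += 1" cannot change the value itself; it only points x at a new one.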

In [4]:
c = [1, 2, 3]

def add_one_list(x):
    for i in range(len(x)):
        x[i] += 1
    return x

d = add_one_list(c)

print(c, d)

[2, 3, 4] [2, 3, 4]


Disturbing, no? The parameter x is not a copy of c; both names refer to the same list. Every change we make through x also shows up in c.
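
A quick way to check this is the "is" operator, which tells you whether two names point at the same object. The sketch below is only an illustration; the names show_alias, same_object and alias are made up for it and do not appear elsewhere in this notebook.

```python
c = [1, 2, 3]

def show_alias(x):
    # x is a second name for the caller's list, not a new list
    return x is c

same_object = show_alias(c)

alias = c          # plain assignment does not copy either
alias.append(4)    # mutating the list through one name...

print(same_object, c)  # ...is visible through the other name
```

The call returns True, and the appended 4 shows up when printing c even though the append went through alias.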

How can we fix this?

In [7]:
c = [1, 2, 3]

def add_one_list(x):
    for i in range(len(x)):
        x[i] += 1
    return x

d = add_one_list(c.copy())

print(c, d)

[1, 2, 3] [2, 3, 4]


By passing a copy of c instead of c itself. This way the function modifies the copy, and c is left untouched.
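
One caveat, and the reason for the title: the copy() method of a list makes a shallow copy. You get a new outer list, but its items are still the same objects. For a flat list of numbers that is all you need; for a list of lists, the inner lists are still shared. The standard library's copy.deepcopy copies the nested objects as well. A minimal sketch, with the names nested, shallow and deep invented for this example:

```python
import copy

nested = [[1, 2], [3, 4]]

shallow = nested.copy()        # new outer list, same inner lists
deep = copy.deepcopy(nested)   # new outer list and new inner lists

nested[0][0] = 99              # mutate an inner list in place

print(shallow[0][0])  # the shallow copy sees the change
print(deep[0][0])     # the deep copy does not
```

This prints 99 for the shallow copy and 1 for the deep copy.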

This is also a great trick if you want to loop over a list and delete certain items from it. Deleting an item shifts the indexes of everything after it, so looping over "range(len(..))" runs past the end of the shortened list. Looping directly over the list is tricky too: you are changing the list the loop is iterating over, so the item that slides into a deleted item's position gets skipped, and not everything is deleted. Looping over a copy works perfectly!

In the example we create a list of random numbers and delete all even numbers.

In [16]:
import random

random.seed(10) # always generate the same numbers
random_numbers = random.sample(range(0, 10), 5)

print(random_numbers)

# for i in range(len(random_numbers)):
#     if random_numbers[i] % 2 == 0:
#         del random_numbers[i]

# print(random_numbers) # -> index error, you're deleting items that are not there

# for item in random_numbers:
#     if item % 2 == 0:
#         random_numbers.remove(item)

# print(random_numbers) # -> what is 6 still doing in this list?

for item in random_numbers.copy():
    if item % 2 == 0:
        random_numbers.remove(item)

print(random_numbers)

[9, 0, 6, 3, 4]
[9, 3]
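
Looping over a copy is not the only option. A common alternative is to build the filtered list with a list comprehension, which avoids removing items during a loop altogether. If other names still refer to the original list, assigning to the full slice keeps the same list object and replaces only its contents. The sketch below hard-codes the numbers from the output above; the names numbers, alias and odd_numbers are chosen just for this example.

```python
numbers = [9, 0, 6, 3, 4]  # same values as the seeded sample
alias = numbers            # a second name for the same list

# Build a new list containing only the odd numbers
odd_numbers = [n for n in numbers if n % 2 != 0]

# Replace the contents of the original list object in place
numbers[:] = odd_numbers

print(numbers, alias is numbers)
```

Both numbers and alias now show [9, 3], because the slice assignment changed the list itself instead of pointing numbers at a new one.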


An afterthought: What if we rewrote the function to use a list comprehension?

In [6]:
e = [1, 2, 3]

def add_one_list(x):
    x = [i + 1 for i in x]
    return x

f = add_one_list(e)

print(e, f)

[1, 2, 3] [2, 3, 4]


A list comprehension creates a new list (based on the old list). (A reference to) this new list is then assigned to x, replacing x's reference to e (but leaving e itself alone). So now there's no problem.
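
The difference between giving x a new list and changing the list x already refers to can be made visible with "is". In this sketch the function names add_one_rebind and add_one_in_place, and the variable g, are hypothetical and only serve the comparison.

```python
e = [1, 2, 3]

def add_one_rebind(x):
    x = [i + 1 for i in x]     # x now names a brand-new list
    return x

def add_one_in_place(x):
    x[:] = [i + 1 for i in x]  # same list object, contents replaced
    return x

f = add_one_rebind(e)
print(f is e, e)   # a different object, e unchanged

g = add_one_in_place(e)
print(g is e, e)   # the same object, e changed
```

The first print shows False and [1, 2, 3]; the second shows True and [2, 3, 4]. Slice assignment brings back the behaviour of the very first list example, so use it only when changing the caller's list is what you want.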