To open this notebook in Google Colab and start coding, click on the Colab icon below.

<table style="border:2px solid orange" align="left">
  <td style="border:2px solid orange ">
    <a target="_blank" href="https://colab.research.google.com/github/neuefische/ds-meetups/blob/main/01_Python_Workshop_Revisiting_Some_Fundamentals/Copying_Exercises.ipynb"><img src="https://www.tensorflow.org/images/colab_logo_32px.png" />Run in Google Colab</a>
  </td>
 </table>

# Copying Exercises

The first notebook about copying in Python was just a "follow-along". You learnt the three ways of copying in Python:
- how to copy immutable objects
- how to create shallow copies of mutable objects
- how to create deep copies of mutable objects.

All of this was covered mainly using Python lists.
To build intuition, let's try these concepts on some other objects as well.

Our example will still be an aquarium. But this time in some other forms...


In [1]:
# some imports
import copy
import numpy as np
import pandas as pd


## Exercise 1: Can you copy dictionaries with the `=` operator?

Let's have the content of the aquarium in a dictionary... with a bit of a new order.

In [2]:
aquarium_dict = {
    'fishes': ['big fish', 'small fish', 'second big fish'],
    'plants': None,
    'black box': 1
}

So what happens when you copy this dictionary with the `=` operator?

1. check the id of `aquarium_dict` and your created copy!
    --> will it be the same?

Use code to confirm your opinion.

In [3]:
my_aquarium_dict = aquarium_dict

id(aquarium_dict) == id(my_aquarium_dict)  # True
# Both variables reference the same object at ID 4350994560.

True

<details><summary>
Click here for the solution.
    </summary>
    1. The id will be the same! The "=" operator does not create a new object; it creates a new variable that refers to the original object. Every change made through "aquarium_dict" will be visible through the new variable as well.

    Python code:
    aquarium_dict_copy = aquarium_dict

    id(aquarium_dict_copy) == id(aquarium_dict)
    
</details>

2. what will happen to `aquarium_dict` if you change the entry of `'black box'` in your copy to `"Party!!!"`? 

In [4]:
my_aquarium_dict['blackbox'] = 'Party!!!'
print(my_aquarium_dict)
print(aquarium_dict)

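# Note: the key 'blackbox' (without a space) did not exist yet, so this adds a new
# entry instead of changing 'black box'. Either way, the change is visible through
# both names, because they refer to the same dict.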

{'fishes': ['big fish', 'small fish', 'second big fish'], 'plants': None, 'black box': 1, 'blackbox': 'Party!!!'}
{'fishes': ['big fish', 'small fish', 'second big fish'], 'plants': None, 'black box': 1, 'blackbox': 'Party!!!'}


<details><summary>
Click here for the solution.
    </summary>
    2. "black box" will be changed to "Party!!!" in the original as well.

    Python code:
    aquarium_dict_copy['black box'] = "Party!!!"

    print(aquarium_dict_copy)
    print(aquarium_dict)
    
</details>

3. what will happen to `aquarium_dict_copy` if you append the `"Rainbow fish"` to the list of fishes in the original `aquarium_dict`?

In [5]:
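aquarium_dict['fishes'].append('Rainbow fish')  # append via the original name
print(my_aquarium_dict)
print(aquarium_dict)

{'fishes': ['big fish', 'small fish', 'second big fish', 'Rainbow fish'], 'plants': None, 'black box': 1, 'blackbox': 'Party!!!'}
{'fishes': ['big fish', 'small fish', 'second big fish', 'Rainbow fish'], 'plants': None, 'black box': 1, 'blackbox': 'Party!!!'}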


<details><summary>
Click here for the solution.
    </summary>
    3. "Rainbow fish" will be added to the fishes list for the copy as well.

    Python code:
    aquarium_dict['fishes'].append('Rainbow fish')

    print(aquarium_dict_copy)
    print(aquarium_dict)
    
</details>
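
As a small aside to Exercise 1, the `id(...) == id(...)` comparison can also be written with the `is` operator, which checks object identity, while `==` checks equality of contents. The sketch below uses a shortened, illustrative aquarium; the names `alias` and `rebuilt` are not part of the exercise.

```python
# A shortened, illustrative aquarium (not the exact dict from the exercise).
aquarium = {'fishes': ['big fish', 'small fish'], 'black box': 1}

alias = aquarium  # a second name for the same dict object
rebuilt = {'fishes': ['big fish', 'small fish'], 'black box': 1}  # a separate dict with equal content

alias_is_same_object = alias is aquarium        # identity: same object
rebuilt_is_same_object = rebuilt is aquarium    # identity: different object
rebuilt_is_equal = rebuilt == aquarium          # equality: same content

print(alias_is_same_object, rebuilt_is_same_object, rebuilt_is_equal)
```

Only `alias` behaves like the `=` "copy" from the exercise: a change made through it is a change to `aquarium` itself.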


## Exercise 2: Can you copy dictionaries as shallow copies?

We will start again with our typical aquarium_dict.
Try out some different ways to create a shallow copy of it.

In [6]:
aquarium_dict = {
    'fishes': ['big fish', 'small fish', 'second big fish'],
    'plants': None,
    'black box': 1
}

In [7]:
# Real copy 1:
aquarium_dict_1 = dict(aquarium_dict)
print(aquarium_dict_1)

# Real copy 2:
aquarium_dict_2 = aquarium_dict.copy()
print(aquarium_dict_2)

# Real copy 3:
aquarium_dict_3 = copy.copy(aquarium_dict)
print(aquarium_dict_3)

{'fishes': ['big fish', 'small fish', 'second big fish'], 'plants': None, 'black box': 1}
{'fishes': ['big fish', 'small fish', 'second big fish'], 'plants': None, 'black box': 1}
{'fishes': ['big fish', 'small fish', 'second big fish'], 'plants': None, 'black box': 1}


<details><summary>
Click here for the solution.
    </summary>
    What worked?

- creating a new dictionary with the dict() constructor:
    `dict(aquarium_dict)`

- creating a new dictionary with the .copy() method
    `aquarium_dict.copy()`

- creating a new dictionary with the copy module
    `copy.copy(aquarium_dict)`

What **didn't work**?
- slicing (lists can be sliced, dictionaries cannot)
    `aquarium_dict[:]`

- list() creates a list of the dictionary keys instead of a dictionary
    `list(aquarium_dict)`
    
</details>
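
The points from the solution above can be checked in one go. This is a minimal sketch with an illustrative dictionary; the `{**aquarium}` unpacking variant is an addition that the notebook itself does not mention. Which exception slicing raises depends on the Python version, so both plausible ones are caught.

```python
import copy

aquarium = {'fishes': ['big fish', 'small fish'], 'plants': None, 'black box': 1}

# Candidates that should each produce a new dict with equal content.
shallow_candidates = {
    'dict()': dict(aquarium),
    '.copy()': aquarium.copy(),
    'copy.copy()': copy.copy(aquarium),
    '{**d}': {**aquarium},  # unpacking, not covered in the notebook
}
all_new_and_equal = all(
    candidate == aquarium and candidate is not aquarium
    for candidate in shallow_candidates.values()
)

# Slicing is not supported for dictionaries.
try:
    aquarium[:]
    slicing_failed = False
except (TypeError, KeyError):
    slicing_failed = True

# list() iterates over the keys only.
keys_only = list(aquarium)

print(all_new_and_equal, slicing_failed, keys_only)
```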

Answer the three questions from above!
Use code to confirm your opinion.

1. check the id of `aquarium_dict` and your created copy!
    --> will it be the same?

2. what will happen to `aquarium_dict` if you change the entry of `'black box'` in your copy to `"Party!!!"`? 

3. what will happen to `aquarium_dict_copy` if you append the `"Rainbow fish"` to the list of fishes in the original `aquarium_dict`?

In [8]:
# 1.) Same ID?
print(id(aquarium_dict) == id(aquarium_dict_1))  # False
print(id(aquarium_dict) == id(aquarium_dict_2))  # False
print(id(aquarium_dict) == id(aquarium_dict_3))  # False

# 2.) Change entry of 'black box' in shallow copy to 'Party!!!'
aquarium_dict_1['black box'] = 'Party!!!'  
print(aquarium_dict_1)  # Only the shallow copy was changed.
print(aquarium_dict)

# 3.) Append 'Rainbow fish' to list of fishes in original:
aquarium_dict['fishes'].append('Rainbow fish')
print(aquarium_dict)    # 'Rainbow fish' was appended to the original ...
print(aquarium_dict_1)  # ... and shows up in the shallow copy too, because the list is shared.


False
False
False
{'fishes': ['big fish', 'small fish', 'second big fish'], 'plants': None, 'black box': 'Party!!!'}
{'fishes': ['big fish', 'small fish', 'second big fish'], 'plants': None, 'black box': 1}
{'fishes': ['big fish', 'small fish', 'second big fish', 'Rainbow fish'], 'plants': None, 'black box': 1}
{'fishes': ['big fish', 'small fish', 'second big fish', 'Rainbow fish'], 'plants': None, 'black box': 'Party!!!'}


<details><summary>
Click here for the solution of all three questions.
    </summary>
1. No it's not the same! These are real copies: a new dict object was created:

`aquarium_dict_shallow_copy = dict(aquarium_dict)`

`id(aquarium_dict) == id(aquarium_dict_shallow_copy)`

2. "black box" will only be changed in the copy.

`aquarium_dict_shallow_copy['black box'] = "Party!!!"`

`print(aquarium_dict_shallow_copy)`

`print(aquarium_dict)`

3. The value of the key "fishes" is a mutable list, i.e. a nested object. The contents of nested objects are shared between the original and its shallow copies.

`aquarium_dict['fishes'].append('Rainbow fish')`

`print(aquarium_dict_shallow_copy)`

`print(aquarium_dict)`

</details>
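
Why the nested list is shared can be made visible with an identity check. The following sketch uses its own illustrative variable names (`aquarium`, `shallow`): the shallow copy is a new dict, but its `'fishes'` entry is the very same list object as in the original.

```python
import copy

aquarium = {
    'fishes': ['big fish', 'small fish', 'second big fish'],
    'plants': None,
    'black box': 1,
}
shallow = copy.copy(aquarium)

# The outer dict is new, the nested list is not.
is_new_dict = shallow is not aquarium
shares_fish_list = shallow['fishes'] is aquarium['fishes']

# Rebinding a key only affects the dict it is done on ...
shallow['black box'] = 'Party!!!'
# ... while mutating the shared list is visible through both dicts.
aquarium['fishes'].append('Rainbow fish')

print(is_new_dict, shares_fish_list)
print(aquarium['black box'], shallow['fishes'][-1])
```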

## Exercise 3: Can you copy dictionaries as deep copies?

We will start again with our typical aquarium_dict.

Create a deep copy of the dictionary.

What will be the solution to the three questions?

In [9]:
aquarium_dict = {
    'fishes': ['big fish', 'small fish', 'second big fish'],
    'plants': None,
    'black box': 1
}

In [10]:
# 1.)
aquarium_deep_copy = copy.deepcopy(aquarium_dict)
print(id(aquarium_deep_copy) == id(aquarium_dict))  # False

# 2.) 
aquarium_deep_copy['black box'] = 'Party!!!'
print('Task 2')
print(aquarium_deep_copy)  # Only the deep copy was changed.
print(aquarium_dict)

# 3.) 
aquarium_dict['fishes'].append('Rainbow fish')
print('Task 3')
print(aquarium_deep_copy)  # The deep copy is unaffected; only the original was changed.
print(aquarium_dict)



False
Task 2
{'fishes': ['big fish', 'small fish', 'second big fish'], 'plants': None, 'black box': 'Party!!!'}
{'fishes': ['big fish', 'small fish', 'second big fish'], 'plants': None, 'black box': 1}
Task 3
{'fishes': ['big fish', 'small fish', 'second big fish'], 'plants': None, 'black box': 'Party!!!'}
{'fishes': ['big fish', 'small fish', 'second big fish', 'Rainbow fish'], 'plants': None, 'black box': 1}


<details><summary>
Click here for the solution of all three questions.
    </summary>
1. No it's not the same! These are real copies: a new dict object was created.

`aquarium_dict_deep_copy = copy.deepcopy(aquarium_dict)`

`id(aquarium_dict) == id(aquarium_dict_deep_copy)`

2. "black box" will only be changed in the copy.

`aquarium_dict_deep_copy['black box'] = "Party!!!"`

`print(aquarium_dict_deep_copy)`

`print(aquarium_dict)`

3. The value of the key "fishes" is a mutable list, i.e. a nested object. The contents of nested objects are not shared between the original and its deep copies.

`aquarium_dict['fishes'].append('Rainbow fish')`

`print(aquarium_dict_deep_copy)`

`print(aquarium_dict)`

</details>
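
`copy.deepcopy` works recursively, so the independence also holds for objects nested several levels down. The sketch below assumes a slightly more nested, purely illustrative aquarium (a dict inside a list inside a dict) to show this.

```python
import copy

# Illustrative structure with three levels of nesting.
aquarium = {
    'fishes': ['big fish', {'name': 'small fish', 'friends': ['snail']}],
    'black box': 1,
}
deep = copy.deepcopy(aquarium)

# No mutable object is shared on any level.
shares_any_level = (
    deep['fishes'] is aquarium['fishes']
    or deep['fishes'][1] is aquarium['fishes'][1]
    or deep['fishes'][1]['friends'] is aquarium['fishes'][1]['friends']
)

# A change deep inside the original does not reach the deep copy.
aquarium['fishes'][1]['friends'].append('shrimp')
deep_friends = deep['fishes'][1]['friends']

print(shares_any_level, deep_friends)
```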

## Extra

Some of you might use Pandas DataFrames regularly.

Of course, you might want to create copies of DataFrames as well.

We already imported pandas, so we can start trying out some ways to copy DataFrames right away...

To make this concept a bit clearer, we start with our data as a **nested list**. Each nested list will be a row in our DataFrame.

The column names are set when creating the DataFrame:
The DataFrame records whether each fish got food and whether it is ill.



In [11]:
list_of_lists = [['big fish', True, False], 
                 ['small fish', False, True], 
                 ['second big fish', False, True]]

In [33]:
aquarium_dataframe = pd.DataFrame(data=list_of_lists, columns=['fishes', 'got_food', 'is_ill'])

In [13]:
aquarium_dataframe

| | fishes | got_food | is_ill |
|---|---|---|---|
| 0 | big fish | True | False |
| 1 | small fish | False | True |
| 2 | second big fish | False | True |


Let's create a copy!
**Attention!** This will use the [**Pandas Method .copy()**](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.copy.html)

By default, the parameter `deep` is set to `True`.
The documentation states:

When deep=True (default), a new object will be created with a copy of the calling object’s data and indices. Modifications to the data or indices of the copy will not be reflected in the original object (see notes below).

When deep=False, a new object will be created without copying the calling object’s data or index (only references to the data and index are copied). Any changes to the data of the original will be reflected in the shallow copy (and vice versa).

Ok. We will stay true to this notebook and we will start with a "shallow copy".

In [14]:
aquarium_dataframe_shallowcopy = aquarium_dataframe.copy(deep=False)

It's a new object:

In [15]:
id(aquarium_dataframe) == id(aquarium_dataframe_shallowcopy)

False

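But: Any changes to the data of the original will be reflected in the shallow copy (and vice versa). This matches the outputs shown in this notebook; the pandas documentation notes that with Copy-on-Write (planned as the default from pandas 3.0) changes are no longer propagated between the original and a `deep=False` copy.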

In [16]:
aquarium_dataframe_shallowcopy.at[0, 'fishes'] = 'dead fish'

In [17]:
aquarium_dataframe_shallowcopy

| | fishes | got_food | is_ill |
|---|---|---|---|
| 0 | dead fish | True | False |
| 1 | small fish | False | True |
| 2 | second big fish | False | True |


In [18]:
aquarium_dataframe

| | fishes | got_food | is_ill |
|---|---|---|---|
| 0 | dead fish | True | False |
| 1 | small fish | False | True |
| 2 | second big fish | False | True |


Ok. Now to the "deep copy".

In [19]:
list_of_lists = [['big fish', True, False], 
                 [['small fish mama','small fish papa'], False, True], 
                 ['second big fish', False, True]]

In [20]:
aquarium_dataframe = pd.DataFrame(data=list_of_lists, columns=['fishes', 'got_food', 'is_ill'])

In [21]:
# by default, deep=True
aquarium_dataframe_copy = aquarium_dataframe.copy()

It's a new object:

In [22]:
id(aquarium_dataframe) == id(aquarium_dataframe_copy)

False

And: Modifications to the data or indices of the copy will not be reflected in the original object.

In [23]:
aquarium_dataframe_copy.at[0, 'fishes'] = 'dead fish'

In [24]:
aquarium_dataframe_copy

| | fishes | got_food | is_ill |
|---|---|---|---|
| 0 | dead fish | True | False |
| 1 | [small fish mama, small fish papa] | False | True |
| 2 | second big fish | False | True |


In [25]:
aquarium_dataframe

| | fishes | got_food | is_ill |
|---|---|---|---|
| 0 | big fish | True | False |
| 1 | [small fish mama, small fish papa] | False | True |
| 2 | second big fish | False | True |


**But** how deep is this deep copy?

In [26]:
aquarium_dataframe['fishes'][1][1] = 'small fish baby'

In [27]:
aquarium_dataframe


| | fishes | got_food | is_ill |
|---|---|---|---|
| 0 | big fish | True | False |
| 1 | [small fish mama, small fish baby] | False | True |
| 2 | second big fish | False | True |


In [28]:
aquarium_dataframe_copy

| | fishes | got_food | is_ill |
|---|---|---|---|
| 0 | dead fish | True | False |
| 1 | [small fish mama, small fish baby] | False | True |
| 2 | second big fish | False | True |


Note that when copying an object containing Python objects, a deep copy will copy the data, but will not do so recursively. Updating a nested data object will be reflected in the deep copy.
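
One way to picture why the copy is not recursive, assuming the column is backed by a NumPy array of dtype `object`: such an array stores references to Python objects, and copying the array copies those references, not the objects behind them. The sketch below shows this on a plain NumPy array (the name `cells` is illustrative); it is a simplified model, not a description of pandas internals.

```python
import copy
import numpy as np

# An object array holding a string and a list, similar to the 'fishes' column.
cells = np.empty(2, dtype=object)
cells[0] = 'big fish'
cells[1] = ['small fish mama', 'small fish papa']

array_copy = cells.copy()              # new array, same list object inside
recursive_copy = copy.deepcopy(cells)  # new array, new list object inside

same_list_after_array_copy = array_copy[1] is cells[1]
same_list_after_deepcopy = recursive_copy[1] is cells[1]

# Mutating the list in the original reaches the plain array copy only.
cells[1][1] = 'small fish baby'

print(same_list_after_array_copy, same_list_after_deepcopy)
print(array_copy[1], recursive_copy[1])
```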

You might not often have a list or a similar object stored in a DataFrame. But if you do... there is again a way to create a real deep copy.

To take a truly deep copy of a DataFrame containing a list (or other Python objects), so that it is independent of the original, you can use the approach below.

Create a new DataFrame with a deepcopy of the values in the DataFrame.

In [29]:
aquarium_dataframe_deep_copy = pd.DataFrame(columns=aquarium_dataframe.columns, data=copy.deepcopy(aquarium_dataframe.values))

# Here the module function copy.deepcopy() is used, as a third way to create a deep copy.

In [30]:
aquarium_dataframe_deep_copy['fishes'][1][1] = 'small fish papa again'

In [31]:
aquarium_dataframe

| | fishes | got_food | is_ill |
|---|---|---|---|
| 0 | big fish | True | False |
| 1 | [small fish mama, small fish baby] | False | True |
| 2 | second big fish | False | True |


In [32]:
aquarium_dataframe_deep_copy

| | fishes | got_food | is_ill |
|---|---|---|---|
| 0 | big fish | True | False |
| 1 | [small fish mama, small fish papa again] | False | True |
| 2 | second big fish | False | True |
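
The approach above rebuilds the DataFrame from its values and column names, so a custom index would have to be passed along separately. As a possible alternative that this notebook does not cover, a pickle round trip should also give a fully independent DataFrame and keeps the index. This is only a sketch: it assumes every object stored in the DataFrame can be pickled, and the index labels `tank_a` and `tank_b` are made up for illustration.

```python
import pickle

import pandas as pd

frame = pd.DataFrame(
    [['big fish', True, False],
     [['small fish mama', 'small fish papa'], False, True]],
    columns=['fishes', 'got_food', 'is_ill'],
    index=['tank_a', 'tank_b'],  # hypothetical labels
)

# Serialising and deserialising rebuilds every nested object from scratch.
independent = pickle.loads(pickle.dumps(frame))

# Mutate the nested list in the original only.
frame.at['tank_b', 'fishes'][1] = 'small fish baby'

print(frame.at['tank_b', 'fishes'])
print(independent.at['tank_b', 'fishes'])
```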


Complicated! But now you have also seen all three ways to create copies of DataFrames.
And it is always good to have an intuition for what could have gone wrong when your DataFrames don't look the way you expected.