# Tuples, Lists, Aliasing, Mutability, and Cloning

So far we have mostly talked about a handful of object types:

* integers
* floats
* booleans
* None
* strings

In contrast to the other types, strings are structured. That means that we can use indexing to extract individual characters, or slicing to extract substrings. Now the time has come (unlike Winter) to talk about other structured types in _Python_.

## Tuples

We have already seen tuples when we were writing test functions. In general, you might think about them as a generalization of strings. The only difference is that the elements do not necessarily need to be characters. In fact, the individual elements may be of any type and need not be of the same type as each other. So let's see what they look like in practice.

In [1]:
## Let's define a couple of tuples
t1 = ()
t2 = (1, 'two', 3)

print(t1)
print(t2)

()
(1, 'two', 3)


You can perform more or less the same operations on tuples as on strings (IMPORTANT: they don't have the same methods defined on them). There is only one catch. If you would like to create a tuple that contains only one value, it is not enough to put an object in parentheses; you have to add an extra comma.

In [5]:
## Let's define a tuple with only one element
t1 = (1,)
print(type(t1))

## And what would happen without this comma?
t1 = (1)
print(type(t1))

<class 'tuple'>
<class 'int'>


Other than that, tuples can be concatenated, indexed, and sliced.

In [6]:
## Let's define a couple of tuples
t1 = (1, 'two', 3)
t2 = (t1, 3.25) ## tuples can contain other tuples

## Concatenation
t1 + t2

(1, 'two', 3, (1, 'two', 3), 3.25)

In [7]:
## Indexing
(t1 + t2)[3]

(1, 'two', 3)

In [8]:
## Slicing
(t1 + t2)[2:5]

(3, (1, 'two', 3), 3.25)
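
Beyond concatenation, indexing, and slicing, a few more operations behave the same way on strings and tuples, while string methods are simply not available on tuples. The following is a minimal sketch of that point; the names `word` and `numbers` are made up for this illustration.

```python
word = 'abc'
numbers = (1, 2, 3)

## Operations shared by strings and tuples
lengths = (len(word), len(numbers))        ## number of elements
membership = ('b' in word, 2 in numbers)   ## membership test
repeated = (word * 2, numbers * 2)         ## repetition

print(lengths)
print(membership)
print(repeated)

## String methods such as upper are not defined on tuples
print(hasattr(word, 'upper'))
print(hasattr(numbers, 'upper'))
```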

Just as with strings, you can also iterate over the elements of a tuple, for example:

In [9]:
t3 = (t1,t2,1)

for item in t3:
    print(item)

(1, 'two', 3)
((1, 'two', 3), 3.25)
1


Let's now consider the following function. It returns a string that is the intersection of two strings.

In [29]:
def intersect_str(s1, s2):
    """
    Returns a string containing characters that are in both s1 and s2.

    Args:
        s1 (str): non-empty string
        s2 (str): non-empty string

    Returns:
        result (str): contains characters that are in both s1 and s2.
    """
    result = ''
    if len(s2) < len(s1):
        s1, s2 = s2, s1
    for s in s1:
        if s in s2:
            result += s
    return result
	
intersect_str('aaaaak', 'a')
      
     

'a'

Now let's try to write a similar function but for tuples.

In [28]:
def intersect_tuple(t1, t2):
    """
    Returns a tuple containing elements that are in both t1 and t2.

    Args:
        t1 (tuple): non-empty tuple
        t2 (tuple): non-empty tuple

    Returns:
        result (tuple): contains elements that are in both t1 and t2.
    """
	
    

### Multiple assignments

In HW2, we used multiple assignment to initialize two variables at the same time. It looked something like this.
```python
low, num_guesses = 0, 0
```
It looks straightforward, but what actually happened was that _Python_ interpreted both sides of the assignment sign as tuples. It would be exactly the same if we had written the following.
```python
(low, num_guesses) = (0,0)
```
But out of convenience we don't write it like that. What is more, we can use the same kind of assignment with strings, for example.

In [47]:
x, y, z = 'xyz'

print(f'x = {x}')
print(f'y = {y}')
print(f'z = {z}')

x = x
y = y
z = z
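
A short sketch of two consequences of this mechanism (the variable names are made up for this illustration). First, because the right-hand side is evaluated as a tuple before anything is assigned, two values can be swapped in one line, exactly as `intersect_str` does with `s1, s2 = s2, s1`. Second, the number of names on the left has to match the number of elements on the right.

```python
## Swapping two values without a temporary variable
a, b = 1, 2
a, b = b, a
print(a, b)

## The right-hand side can be any sequence of matching length
first, second = ('left', 'right')
print(first, second)

## If the lengths differ, Python raises a ValueError
message = None
try:
    p, q = 'xyz'
except ValueError as error:
    message = str(error)
    print(f'ValueError: {message}')
```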


This mechanism of multiple assignment is particularly convenient when used with functions that return multiple values. Consider the following function.

In [51]:
def find_extreme_divisors(n1, n2):
    """
    Assumes that n1 and n2 are positive integers and returns the smallest common
    divisor > 1 and the largest common divisor of n1 and n2. If there is no common
    divisor other than 1, returns (None, None).

    Args:
        n1 (int): positive integer
        n2 (int): positive integer

    Returns:
        min_val, max_val (tuple): two integers > 1 or a pair of None
    """
    min_val, max_val = None, None
    for val in range(2, min(n1, n2) + 1):
        if n1 % val == 0 and n2 % val == 0:
            ## The first common divisor found is the smallest one
            if min_val is None:
                min_val = val
            ## The last common divisor found is the largest one
            max_val = val
    return min_val, max_val

smallest_divisor, largest_divisor = find_extreme_divisors(100,200)
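
The same pattern works with any function that returns a tuple. As a small illustration that is not part of the original notes, the built-in `divmod` returns a quotient and a remainder as a pair, which can be kept as a single tuple or unpacked right away.

```python
## divmod(a, b) returns the tuple (a // b, a % b)
result = divmod(17, 5)
print(result)
print(type(result))

## The returned tuple can be unpacked into two names at once
quotient, remainder = divmod(17, 5)
print(f'quotient = {quotient}')
print(f'remainder = {remainder}')
```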


Write a function that, for an integer bigger than 2, returns its smallest and biggest divisors (look at [HW2](https://github.com/MikoBie/ppss/blob/main/notebooks/HW2.ipynb) for reference). If the integer is a prime, the function should return `(None, None)` and print a message saying that the number is a prime.

In [54]:
def find_divisor(n1):
    """
    Assumes that n1 is an integer bigger than 2 and returns its smallest divisor > 1
    and the largest divisor. If n1 is a prime it prints the message 'n1 is a prime'
    and returns (None, None).

    Args:
        n1 (int): integer bigger than 2

    Returns:
        smallest_divisor, largest_divisor (tuple): two integers > 1 or a pair of None
    """

## Lists

I would say that lists are the most important data structure in _Python_. They are similar to tuples: a list is an ordered sequence of values, where each value is identified by an index. However, instead of parentheses you use square brackets, for example:

In [59]:
## An empty list
L1 = []
## A list with three elements
L2 = ['I did it all', 4, '<3']
## You don't need an extra comma when a list has a single value, but nothing bad happens if you add one
L3 = [L2,]

print(L1)
print(L2)
print(L3)

[]
['I did it all', 4, '<3']
[['I did it all', 4, '<3']]


Similarly to strings and tuples we can index, slice and iterate over lists.

In [62]:
## Let's define a couple of lists
L1 = [1, 2, 3]
L2 = L1[::-1]

for i in range(len(L1)):
    print(L1[i] * L2[i])


3
4
3


And now comes the most important distinction between the data structures we have discussed so far and the ones that are ahead of us (including lists). Tuples and strings are immutable, while lists are mutable. What does that mean and what consequences does it have? Immutable objects cannot be changed once they are created. Mutable objects, on the other hand, can be modified after they are created. For now it sounds simple, but let's see some of its implications.
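
Before the longer example, here is a minimal sketch of the difference itself (the variable names are made up for this illustration). Assigning to an index works on a list, raises a `TypeError` on a tuple, and methods that seem to change a string actually return a new one.

```python
numbers_list = [1, 2, 3]
numbers_tuple = (1, 2, 3)

## A list can be changed in place
numbers_list[0] = 100
print(numbers_list)

## A tuple cannot: item assignment raises a TypeError
tuple_error = None
try:
    numbers_tuple[0] = 100
except TypeError as error:
    tuple_error = error
    print(f'TypeError: {error}')

## A string method returns a new string and leaves the original untouched
greeting = 'hello'
shouted = greeting.upper()
print(greeting, shouted)
```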

In [99]:
OneD = ['Niall Horan', 'Liam Payne', 'Harry Styles', 'Louis Tomlinson', 'Zayn Malik']
BTS = ['Jin', 'Suga', 'J-Hope', 'RM', 'Jimin', 'V', 'Jungkook']

In [100]:
BBands = [OneD, BTS]
BBands1 = [['Niall Horan', 'Liam Payne', 'Harry Styles', 'Louis Tomlinson', 'Zayn Malik'],['Jin', 'Suga', 'J-Hope', 'RM', 'Jimin', 'V', 'Jungkook']]

So we created two lists, `BBands` and `BBands1`, and bound them to variables. The elements of these lists are themselves lists. So far so good. Let's print them and check whether they are the same.

In [101]:
print(f'BoysBands = {BBands}')
print(f'BoysBands1 = {BBands1}')
print(BBands == BBands1)

BoysBands = [['Niall Horan', 'Liam Payne', 'Harry Styles', 'Louis Tomlinson', 'Zayn Malik'], ['Jin', 'Suga', 'J-Hope', 'RM', 'Jimin', 'V', 'Jungkook']]
BoysBands1 = [['Niall Horan', 'Liam Payne', 'Harry Styles', 'Louis Tomlinson', 'Zayn Malik'], ['Jin', 'Suga', 'J-Hope', 'RM', 'Jimin', 'V', 'Jungkook']]
True


At first glance it seems that both variables are bound to the same value. But [your eyes can deceive you, don't trust them](https://www.youtube.com/watch?v=oDIrOE_fnl8). In fact, `BBands` and `BBands1` are bound to different objects. The `==` operator only compares values; a simple way to test for object identity is the `is` operator. Let's now see what it returns.

In [102]:
print(BBands1 is BBands)

False
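
The difference between the two comparisons can be sketched on a smaller scale (the names `a`, `b`, and `c` are made up for this illustration): `==` asks whether two objects have equal values, while `is` asks whether two names refer to the very same object.

```python
a = [1, 2, 3]
b = [1, 2, 3]   ## a separate list with equal contents
c = a           ## a second name for the very same list

print(a == b, a is b)
print(a == c, a is c)

## id returns an identifier of the object; it is equal only for the same object
print(id(a) == id(b))
print(id(a) == id(c))
```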


This is pretty cool, but why does it matter? Because it has grave consequences for lists. Let's use a simple example: that horrible morning of March 25th, 2015, when [Malik announced that he was leaving One Direction](https://www.facebook.com/onedirectionmusic/posts/869295683125227).

In [103]:
## This list method removes the first occurrence of an element from the list
OneD.remove('Zayn Malik')

## Let's see what happened to our lists
print(f'One Direction without Zayn Malik {OneD}')
print(f'Boys Bands = {BBands}')
print(f'Boys Bands1 = {BBands1}')


One Direction without Zayn Malik ['Niall Horan', 'Liam Payne', 'Harry Styles', 'Louis Tomlinson']
Boys Bands = [['Niall Horan', 'Liam Payne', 'Harry Styles', 'Louis Tomlinson'], ['Jin', 'Suga', 'J-Hope', 'RM', 'Jimin', 'V', 'Jungkook']]
Boys Bands1 = [['Niall Horan', 'Liam Payne', 'Harry Styles', 'Louis Tomlinson', 'Zayn Malik'], ['Jin', 'Suga', 'J-Hope', 'RM', 'Jimin', 'V', 'Jungkook']]


So what happened here? Zayn Malik was removed not only from the squad (`OneD`) but also from `BBands`. However, he was not removed from `BBands1`. Why is that? The method call `OneD.remove('Zayn Malik')` mutates the list `OneD`, and `BBands` does not hold a new list with the elements of `OneD` but rather a reference to `OneD` itself. Therefore, whenever we change `OneD`, the change is visible in `BBands` as well, but not in `BBands1`. In _Python_ the fact that there is more than one path to the same object is called aliasing. One path goes through the variable `OneD` and the second through the first element of the list `BBands`. We can mutate the object via either path, and the effect of the mutation will be visible through both paths.
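
The "either path" claim can be checked with a minimal sketch (the names `inner` and `outer` are made up for this illustration). It also shows the flip side: assigning a new list to a name rebinds that name and mutates nothing, so the alias is broken.

```python
inner = ['x', 'y']
outer = [inner, 'z']        ## outer[0] and inner are two paths to one list

inner.append('via inner')          ## mutate through the first path
outer[0].append('via outer[0]')    ## mutate through the second path

print(inner)
print(outer[0])
print(outer[0] is inner)

## Rebinding the name inner does not touch the object stored in outer
inner = ['brand', 'new']
print(outer[0])
print(outer[0] is inner)
```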

OK, so now that we know that lists are mutable and we know the consequences, let's talk about the different methods that are defined on them.

In [16]:
## Let's first define a list
SG = ['Melanie Brown', 'Melanie Chisholm', 'Emma Bunton', 'Gerri Halliwell']
print(f'Initial list SG consists of {SG}')

## Add an object to the end of the list
SG.append('Victoria Adams')

print(f'SG after the addition of "Victoria Adams" consists of {SG}')

Initial list SG consists of ['Melanie Brown', 'Melanie Chisholm', 'Emma Bunton', 'Gerri Halliwell']
SG after the addition of "Victoria Adams" consists of ['Melanie Brown', 'Melanie Chisholm', 'Emma Bunton', 'Gerri Halliwell', 'Victoria Adams']


In [17]:
## Return the number of times that an object occurs in the list
print(f'Number of times Victoria Adams occurs in the list SG is {SG.count("Victoria Adams")}')

## Let's add a new element to the end of the list
SG.append('Victoria Adams')

## And count it again
print(f'Number of times Victoria Adams occurs in the list SG is {SG.count("Victoria Adams")}')

Number of times Victoria Adams occurs in the list SG is 1
Number of times Victoria Adams occurs in the list SG is 2


In [18]:
## Insert an object into list at index i
SG.insert(1, 'Victoria Adams')

## Print the list
SG

['Melanie Brown',
 'Victoria Adams',
 'Melanie Chisholm',
 'Emma Bunton',
 'Gerri Halliwell',
 'Victoria Adams',
 'Victoria Adams']

In [35]:
## Let's define two lists now
DC = ['Beyoncé Knowles', 'Kelly Rowland', 'LaTavia Roberson', 'LeToya Luckett']
DC_2000 = ['Michelle Williams', 'Farrah Franklin']

## Add items of a list at the end of the list
DC.extend(DC_2000)
print(DC)

## It is equivalent to concatenating the lists in place with +=
DC += DC_2000
print(DC)


['Beyoncé Knowles', 'Kelly Rowland', 'LaTavia Roberson', 'LeToya Luckett', 'Michelle Williams', 'Farrah Franklin']
['Beyoncé Knowles', 'Kelly Rowland', 'LaTavia Roberson', 'LeToya Luckett', 'Michelle Williams', 'Farrah Franklin', 'Michelle Williams', 'Farrah Franklin']
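
One subtlety is worth a quick sketch (the names `base` and `alias` are made up for this illustration): `+=` changes the existing list in place, just like `extend`, whereas `base = base + [...]` builds a brand new list and rebinds the name. The difference only becomes visible when a second name refers to the same list.

```python
base = [1, 2]
alias = base

## += mutates the existing list, so the alias sees the change
base += [3]
print(alias)
print(base is alias)

## + creates a new list; the name base is rebound and alias is left behind
base = base + [4]
print(base)
print(alias)
print(base is alias)
```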


In [36]:
## Remove the first occurrence of an item from the list
DC.remove('Farrah Franklin')

print(DC)

['Beyoncé Knowles', 'Kelly Rowland', 'LaTavia Roberson', 'LeToya Luckett', 'Michelle Williams', 'Michelle Williams', 'Farrah Franklin']


In [37]:
## Return the index of the first occurrence of the item
DC.index('Michelle Williams')

4

In [38]:
## Remove and return the item at index i in the list
## By default i is -1 (the last item)
DC.pop()

'Farrah Franklin'

In [40]:
## Sort the elements of the list in ascending order
DC.sort()

print(DC)

['Beyoncé Knowles', 'Kelly Rowland', 'LaTavia Roberson', 'LeToya Luckett', 'Michelle Williams', 'Michelle Williams']


In [41]:
## Reverse the order of the elements in the list
DC.reverse()

print(DC)

['Michelle Williams', 'Michelle Williams', 'LeToya Luckett', 'LaTavia Roberson', 'Kelly Rowland', 'Beyoncé Knowles']
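
A common stumbling block with these methods, shown here as a minimal sketch with made-up names: methods such as `sort`, `reverse`, `append`, and `remove` mutate the list in place and return `None`. If you need a new list and want to keep the original untouched, the built-in `sorted` function or a reversed slice can be used instead.

```python
letters = ['c', 'a', 'b']

## sort works in place and returns None, so do not assign its result
result = letters.sort()
print(result)
print(letters)

## sorted and slicing build new lists and leave the original alone
original = ['c', 'a', 'b']
ordered = sorted(original)
backwards = original[::-1]
print(original)
print(ordered)
print(backwards)
```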


Write a function that takes two lists L1 and L2 and removes any element from L1 that also occurs in L2. **Hint**: use a `for-loop` and `in` operator.  

In [42]:
def remove_dups(L1, L2):
	"""
	Takes two lists L1 and L2. Removes from L1 every element that also occurs in L2.
 
	Args:
		L1 (list): non-empty list
		L2 (list): non-empty list
	"""
	for item in L1:
		if item in L2:
			L1.remove(item)



In [45]:
TrzyHa = ['Jacek Graniecki', 'Michał Witak', 'Mariusz Sobolewski', 'Sebastian Imbierowicz', 'Jakub Sawicki']
Remove = ['Michał Witak', 'Mariusz Sobolewski', 'Sebastian Imbierowicz', 'Jakub Sawicki']
remove_dups(L1 = TrzyHa, L2 = Remove)
print(TrzyHa)

['Jacek Graniecki', 'Mariusz Sobolewski', 'Jakub Sawicki']


Is it what we expected? Not really. This task was designed for you to fail. We wanted to mutate the list `TrzyHa` so that it contains only one name, `Jacek Graniecki`, but we failed miserably. Why is that? During a `for-loop` (even though we are iterating over elements) _Python_ uses an internal counter that is incremented at the end of each iteration. When the value of the counter reaches **the current length of the list**, the loop terminates. It is quite simple, right? However, when the list is mutated on the fly, strange things happen. Let's examine what happened in our example, iteration after iteration. At the beginning our lists look like the ones below.

![list-0](png/list-0.png)

1. In the first iteration the internal counter is `0`. Under index `0` we have `Jacek Graniecki`. We check whether `Jacek Graniecki` is also present in the second list; he is not, so we just increment the counter. The list `TrzyHa` has not changed.

![list-0](png/list-0.png)

2. In the second iteration the internal counter is `1`. Under index `1` we have `Michał Witak`. We check whether he is also present in the second list. He is, so we remove him from the `TrzyHa` list and increment the counter. Every element to his right moves one position to the left, so `Mariusz Sobolewski` now sits under index `1`. The list `TrzyHa` now looks as follows.

![list-1](png/list-1.png)

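3. In the third iteration the internal counter is `2`. Because the earlier removal moved the remaining elements one position to the left, under index `2` we now have `Sebastian Imbierowicz`, and `Mariusz Sobolewski` (now under index `1`) is never examined. We check whether `Sebastian Imbierowicz` is also present in the second list. He is, so we remove him from the `TrzyHa` list and increment the counter. The list `TrzyHa` now looks as follows.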

![list-2](png/list-2.png)
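
The three iterations above can be reproduced with an explicit counter. This `while` loop is only a model of what the `for-loop` does, not its actual implementation, and the names `counter` and `visited` are made up for this illustration.

```python
TrzyHa = ['Jacek Graniecki', 'Michał Witak', 'Mariusz Sobolewski', 'Sebastian Imbierowicz', 'Jakub Sawicki']
Remove = ['Michał Witak', 'Mariusz Sobolewski', 'Sebastian Imbierowicz', 'Jakub Sawicki']

counter = 0
visited = []
## The current length of the list is checked again before every iteration
while counter < len(TrzyHa):
    item = TrzyHa[counter]
    visited.append(item)
    if item in Remove:
        ## Removing shifts everything to the right one position to the left
        TrzyHa.remove(item)
    counter += 1

print(f'Examined elements: {visited}')
print(f'What is left: {TrzyHa}')
```

Only three elements are ever examined, which is why two names that should have been removed survive.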

The counter is now `3`, which equals the current length of the list, so the `for-loop` terminates and we end up with something other than what we expected. To avoid this kind of problem we should have iterated over a copy of the `TrzyHa` list. Easier said than done, right? Because we know that if we wrote the following, it would not solve the issue, right?

In [49]:
def remove_dups_copy(L1, L2):
	"""
	Takes two lists L1 and L2. Removes from L1 every element that also occurs in L2.
 
	Args:
		L1 (list): non-empty list
		L2 (list): non-empty list
	"""
	L1_copy = L1
	for item in L1_copy:
		if item in L2:
			L1.remove(item)

In [50]:
TrzyHa = ['Jacek Graniecki', 'Michał Witak', 'Mariusz Sobolewski', 'Sebastian Imbierowicz', 'Jakub Sawicki']
Remove = ['Michał Witak', 'Mariusz Sobolewski', 'Sebastian Imbierowicz', 'Jakub Sawicki']
remove_dups_copy(L1 = TrzyHa, L2 = Remove)
print(TrzyHa)

['Jacek Graniecki', 'Mariusz Sobolewski', 'Jakub Sawicki']


The problem remains because `L1_copy` is not a new, separate list but just an alias of `L1`. So how else can we copy a list? There are two ways: by slicing or by using the `copy` method.

In [51]:
## Slicing
TrzyHa_copy = TrzyHa[:]

print(TrzyHa is TrzyHa_copy)

False


In [52]:
## Copy method
TrzyHa_copy2 = TrzyHa.copy()

print(TrzyHa is TrzyHa_copy2)

False


Both methods described above produce a **shallow copy** of a list. That is, they create a new list and then insert the objects themselves (not copies of the objects) from the original list into the new one. However, if the list contains mutable objects, you may want to make a **deep copy**, which we are not going to cover right now.
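
Without going into deep copies in detail, a minimal sketch can show why the distinction matters for nested lists. It uses `copy.deepcopy` from the standard library `copy` module, which is not used elsewhere in these notes, and the variable names are made up for this illustration.

```python
import copy

nested = [['a', 'b'], ['c']]
shallow = nested[:]             ## new outer list, same inner lists
deep = copy.deepcopy(nested)    ## new outer list and new inner lists

## Mutate one of the inner lists of the original
nested[0].append('z')

print(shallow[0])
print(deep[0])
print(shallow[0] is nested[0])
print(deep[0] is nested[0])
```

In `remove_dups` a shallow copy is enough, because only the outer list is mutated.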

To fix the `remove_dups` function we just need to iterate over a shallow copy of the `L1` list. It is an easy fix.

In [53]:
def remove_dups_copy(L1, L2):
	"""
	Takes two lists L1 and L2. Removes from L1 every element that also occurs in L2.
 
	Args:
		L1 (list): non-empty list
		L2 (list): non-empty list
	"""
	L1_copy = L1[:]
	for item in L1_copy:
		if item in L2:
			L1.remove(item)

In [None]:
TrzyHa = ['Jacek Graniecki', 'Michał Witak', 'Mariusz Sobolewski', 'Sebastian Imbierowicz', 'Jakub Sawicki']
Remove = ['Michał Witak', 'Mariusz Sobolewski', 'Sebastian Imbierowicz', 'Jakub Sawicki']
remove_dups_copy(L1 = TrzyHa, L2 = Remove)
print(TrzyHa)

['Jacek Graniecki']


### List comprehension

Very often in _Python_ you will see a very concise way of producing a list. It is called a list comprehension and is basically a `for-loop` written inside a list literal. Let's consider the following code.

In [55]:
L1 = []
for num in range(2,100):
    if num % 2 == 0:
        L1.append(num)
        
print(L1)

[2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98]


This was quite simple, right? We just iterated over the numbers from `2` up to (but not including) `100` and appended to our list only the ones that are even. It is a perfectly fine solution. However, in _Python_ it is possible to write it in one line using a list comprehension.

In [57]:
## List comprehension
L1 = [ num for num in range(2,100) if num % 2 == 0 ]
print(L1)

[2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98]


The result is exactly the same, but at first list comprehensions might be a bit tricky to understand because the resulting expression is written first. In general, it takes practice to get comfortable with them, but they can be quite useful. For example, if we were to find all prime numbers between `2` and `100`, we could write something like this.

In [61]:
primes = []
for x in range(2,100):
    is_prime = True
    ## Check every candidate divisor from 2 up to x - 1
    for y in range(2,x):
        if x%y == 0:
            is_prime = False
    if is_prime:
        primes.append(x)

print(primes)

[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97]


In [66]:
## or something like this: x is prime if no y from 2 up to x - 1 divides it
primes = [ x for x in range(2, 100) if all( x%y != 0 for y in range(2, x)) ]

print(primes)

[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97]
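
The one-liner relies on the built-in `all`, which has not been introduced above, so here is a minimal sketch of how it behaves (the variable names are made up for this illustration). `all` returns `True` when every element it receives is truthy, including when it receives no elements at all, and its sibling `any` returns `True` when at least one element is truthy.

```python
## 9 has a divisor between 2 and 8, so not every remainder is non-zero
divisors_of_9 = [y for y in range(2, 9) if 9 % y == 0]
nine_is_prime = all(9 % y != 0 for y in range(2, 9))

## 7 has no divisor between 2 and 6
seven_is_prime = all(7 % y != 0 for y in range(2, 7))

## range(2, 2) is empty, and all of an empty sequence is True
two_is_prime = all(2 % y != 0 for y in range(2, 2))

## any is True as soon as one element satisfies the condition
has_even = any(num % 2 == 0 for num in [1, 3, 4])

print(divisors_of_9, nine_is_prime, seven_is_prime, two_is_prime, has_even)
```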
