# Python catalog: Data Structures and Mutability

### Primitive data types in Python

- integer
- float
- string 
- boolean

- primitive data types are **immutable**

In [12]:
# floats are immutable
# "changing" x does not modify the float object; it rebinds x to a new object

x = 4.1
print(id(x))
x = x + 3
print(id(x))
print(x)

4447673104
4447672840
7.1


In [13]:
# strings are also immutable

s = 'hello'
print(id(s))
s = s + ' there'
print(id(s))
print(s, "\n")

s2 = 'second'
s3 = s2 + ' string'
print(f"s2: {s2}\ns2 location: {id(s2)}")
print(f"s3: {s3}\ns3 location: {id(s3)}")

4450660112
4450002544
hello there 

s2: second
s2 location: 4419506784
s3: second string
s3 location: 4450666544


Python is a dynamically typed language.

- You do not have to explicitly state the type of the variable. 
- A variable is just a name bound to an object, so it can be rebound to an object of a different type at any time.
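
As a minimal sketch of what dynamic typing means in practice, the cell below rebinds one name to objects of three different types. The names `value`, `first_type`, `second_type` and `third_type` are illustrative only.

```python
# A name is not tied to a type; it can be rebound to any object.
value = 4
first_type = type(value).__name__    # the name refers to an int

value = "four"
second_type = type(value).__name__   # the same name now refers to a str

value = [4]
third_type = type(value).__name__    # and now to a list

print(first_type, second_type, third_type)
```

The objects themselves never change type; only the binding of the name changes.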

In [14]:
# two variables set to the same small integer point to the same object in memory (a CPython caching detail)

a = 4 
b = 5
c = 5
print(f"a location: {id(a)}, \nb location: {id(b)}, \nc location: {id(c)}")
c = 12
print(f"new c location: {id(c)}, \nb location remains the same: {id(b)}")


a location: 4413019584, 
b location: 4413019616, 
c location: 4413019616
new c location: 4413019840, 
b location remains the same: 4413019616


In [15]:
f = 24 
print(f"f location: {id(f)}")
f = 25
g = 24
print(f"g location: {id(g)}")


f location: 4413020224
g location: 4413020224


In [16]:
# NOTE: CPython keeps a cache of integer objects for all integers between -5 and 256 (inclusive).
# When you create an integer in that range, you get back a reference to the already existing object.
# Otherwise, a new integer object is usually created, so two equal values can have different ids.

q = 1
w = 1
print(id(q) == id(w))

r = 280
t = 280
print(id(r) == id(t))


True
False
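
Because integer identity depends on the interpreter's caching, the practical rule is to compare numbers by value with `==`, never by `id()` or `is`. The sketch below uses illustrative names and builds the integers with `int()` at run time so that both are created while the code executes.

```python
# Illustrative names; int("...") builds the values at run time.
r = int("280")
t = int("280")

values_equal = (r == t)          # equality compares values: True here
same_object = (id(r) == id(t))   # identity depends on the interpreter's caching

print(f"values equal: {values_equal}")
print(f"same object:  {same_object}  (implementation detail, do not rely on it)")
```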


### 4 built-in data structures in Python

1. List
2. Dictionary
3. Tuple 
4. Set

List

- Can contain any data type: ints, strings, lists, etc. 
- **Mutable**; Variable length (can grow and shrink)
- Accessed like strings: slicing and concatenation
- Implemented in CPython as a resizable C array of pointers to the contained objects
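
The points above can be sketched briefly: a list may hold mixed types, and it supports the same slicing and concatenation syntax as a string. All names here are illustrative.

```python
letters = ['a', 'b', 'c', 'd']
mixed = [1, 'two', [3, 4]]          # ints, strings and lists can be mixed

first_two = letters[:2]             # slicing, as with strings
last_item = letters[-1]             # negative indexes count from the end
combined = letters + ['e', 'f']     # concatenation builds a new list

print(first_two, last_item, combined)
print(len(mixed), mixed[2][0])
```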

In [17]:
my_list = [1, 2, 3, 5, 6, 'seven', 'eight', 'nine', 'ten']

In [18]:
print(my_list)
my_list.append(11)
print(my_list)
my_list.pop(6)
print(my_list)
my_list.remove(5)
print(my_list)

[1, 2, 3, 5, 6, 'seven', 'eight', 'nine', 'ten']
[1, 2, 3, 5, 6, 'seven', 'eight', 'nine', 'ten', 11]
[1, 2, 3, 5, 6, 'seven', 'nine', 'ten', 11]
[1, 2, 3, 6, 'seven', 'nine', 'ten', 11]


In [19]:
print(my_list)
print(id(my_list[0]))
print(id(my_list))
# assigning to an index makes that slot refer to a different object, so id(my_list[0]) changes; the list is modified in place, so id(my_list) does not.
my_list[0] = 12
print(my_list)
print(id(my_list[0]))
print(id(my_list))

[1, 2, 3, 6, 'seven', 'nine', 'ten', 11]
4413019488
4450385800
[12, 2, 3, 6, 'seven', 'nine', 'ten', 11]
4413019840
4450385800


In [20]:
# NOTE that list comprehensions create a new list

list_for_comprehension = [1, 2, 3]
print(f"list location: {id(list_for_comprehension)}")
list_for_comprehension = [x*2 for x in list_for_comprehension]
print(f"list location: {id(list_for_comprehension)}")


list location: 4450665032
list location: 4447226568


In [37]:
# Slicing a list also creates a new list
list_y = [1, 2, 3]
list_x = list_y[1:]
list_z = list_y[0:]
print(id(list_x),'\n',id(list_y),'\n',id(list_z))

4450631752 
 4450698760 
 4450608008
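
One detail worth sketching: a slice is a new list, but it is a shallow copy, so nested mutable elements are still shared with the original. The names below (`outer`, `alias`, `shallow`) are illustrative.

```python
outer = [[1, 2], [3, 4]]

alias = outer        # plain assignment: same list object, no copy
shallow = outer[:]   # full slice: new outer list, same inner lists

shallow.append([5, 6])   # only the copy grows
shallow[0].append(99)    # the inner list is shared, so outer sees this change

print(alias is outer)            # True
print(shallow is outer)          # False
print(shallow[0] is outer[0])    # True
print(outer)                     # [[1, 2, 99], [3, 4]]
print(shallow)                   # [[1, 2, 99], [3, 4], [5, 6]]
```

If fully independent nested data is needed, `copy.deepcopy` from the standard library is the usual tool.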


Dictionary
 
- Similar to a **hash** or **map** in other languages. 
- A mapping object maps hashable values to arbitrary objects. 
    - Mappings are **mutable** objects. 
    - Dictionaries are the standard mapping objects in python.


- Dictionaries are indexed by keys (unlike sequences, which are indexed by numbers). 
- Keys must be **hashable**, which for built-in types means **immutable**; therefore, a key can be:
    - number
    - string
    - tuple containing only hashable (immutable) datatypes
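
A short sketch of the key restriction: a tuple works as a key, while a list, or a tuple that contains a list, is rejected with a `TypeError`. The dictionary name `lookup` and the flag variables are illustrative.

```python
lookup = {}
lookup[(1, 'one')] = 'tuple key works'

# A list is mutable and therefore unhashable.
try:
    lookup[[1, 'one']] = 'list key'
    list_key_rejected = False
except TypeError:
    list_key_rejected = True

# A tuple is only hashable if everything inside it is hashable.
try:
    lookup[(1, ['one'])] = 'tuple holding a list'
    nested_key_rejected = False
except TypeError:
    nested_key_rejected = True

print(list_key_rejected, nested_key_rejected)
print(lookup)
```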
    

In [23]:
dict_a = {1: 'one', 2: 'two'}
dict_b = {'one': 'one', 'two': 'two'}
dict_c = {(1, 'one'): 'one', (2, 'two'): 'two'}
print(f"dict_a: {dict_a},\ndict_b: {dict_b},\ndict_c: {dict_c}")
print(dict_c[(1, 'one')])

dict_a: {1: 'one', 2: 'two'},
dict_b: {'one': 'one', 'two': 'two'},
dict_c: {(1, 'one'): 'one', (2, 'two'): 'two'}
one


In [24]:
# Dictionaries can be created using {} or the dict() constructor
dict_d = {3: 'three', 4: 'four'}
dict_e = dict(three='three', four='four')
print(f"dict_d: {dict_d}\ndict_e: {dict_e}")

dict_d: {3: 'three', 4: 'four'}
dict_e: {'three': 'three', 'four': 'four'}


In [25]:
# dict() accepts several argument forms: keyword arguments or an iterable of key/value pairs
print(dict(three='three', four='four'))
print(dict(zip(['three', 'four'], ['three', 'four'])))
print(dict(zip([1, 2], ['one', 'two'])))

{'three': 'three', 'four': 'four'}
{'three': 'three', 'four': 'four'}
{1: 'one', 2: 'two'}


In [26]:
# can use a dict comprehension to build a dictionary!
{x: x**2 for x in (2, 4, 6)}

{2: 4, 4: 16, 6: 36}

In [27]:
# Accessing keys and values
for key in dict_a.keys():
    print(key)
for value in dict_a.values():
    print(value)
for k, v in dict_a.items():
    print(k, v)
print(dict_a[1])

1
2
one
two
1 one
2 two
one
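
Since mappings are mutable, entries can be added, updated and deleted in place without creating a new dictionary. A minimal sketch, using an illustrative `inventory` dictionary:

```python
inventory = {'apples': 3}
id_before = id(inventory)

inventory['pears'] = 5        # add a new key
inventory['apples'] += 1      # update an existing value
del inventory['pears']        # remove a key

id_after = id(inventory)

print(inventory)
print(id_before == id_after)  # True: same object, modified in place
```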


Tuples

- Like python lists, but **immutable**

In [28]:
t1 = (0, )
t2 = (0, 1)
t3 = (0, 1, 4, 5)
# tuples don't require parentheses (but parentheses often make them easier to distinguish)
t4 = 0, 1, 4, 5
t5 = 0,

In [29]:
print(type(t3))
print(type(t4))
print(type(t5))

<class 'tuple'>
<class 'tuple'>
<class 'tuple'>


In [30]:
# Can index tuple
print(t3[2])
# Can't mutate tuple (the line below raises a TypeError)
# t3[2] = 5

4


In [31]:
# Nested tuple

t6 = (1, (3, 4), (4, 5))
print(t6[2][1])

5
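
A sketch of what tuple immutability does and does not cover: item assignment fails, but a mutable object stored inside a tuple can still be changed. The names `point` and `holder` are illustrative.

```python
point = (0, 1, 4, 5)

# Item assignment on a tuple raises TypeError.
try:
    point[2] = 5
    assignment_failed = False
except TypeError:
    assignment_failed = True

# The tuple's slots are fixed, but the list it refers to is still mutable.
holder = (1, [2, 3])
holder[1].append(4)

print(assignment_failed)   # True
print(point)               # (0, 1, 4, 5)
print(holder)              # (1, [2, 3, 4])
```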


Sets

- Unordered collection of unique objects
    - because set is unordered, it does not support indexing
- **mutable**
- methods: difference(), intersection(), etc.

In [32]:
s1 = set([1, 2, 3])
s2 = set([1, 1, 1])

In [33]:
print(s1)
print(s2)

{1, 2, 3}
{1}


In [34]:
print(s1)
s1.add(9)
print(s1)
# Adding a duplicate value has no effect
s1.add(9)
print(s1)

{1, 2, 3}
{1, 2, 3, 9}
{1, 2, 3, 9}
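
The set methods mentioned above can be sketched as follows. The two sets are illustrative, and `sorted()` is used only to print the results in a predictable order.

```python
evens = {2, 4, 6, 8}
small = {1, 2, 3, 4}

common = evens.intersection(small)           # in both sets
only_evens = evens.difference(small)         # in evens but not in small
either = evens.union(small)                  # in at least one set
not_shared = evens.symmetric_difference(small)  # in exactly one set

print(sorted(common))       # [2, 4]
print(sorted(only_evens))   # [6, 8]
print(sorted(either))       # [1, 2, 3, 4, 6, 8]
print(sorted(not_shared))   # [1, 3, 6, 8]
print(3 in small)           # True: membership tests work without indexing
```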


### Immutable and mutable objects as function parameters

Why you should be careful with mutable data types as function parameters (especially as default values)

- For **immutable** arguments, a function cannot change the caller's object, so calling it repeatedly with the same value gives the same output (assuming the result depends only on its arguments). 
- For **mutable** arguments, calling the same function with the same variable does not guarantee the same output, because the object can be mutated at any time, by the function itself or by other code.

In [68]:
# This is a bad idea! The default list is created once, when the function is defined, and is shared by every call.
def add_to_list_incorrectly(a=[0, 1, 2]):
    a.append(3)
    return a

a = add_to_list_incorrectly()
print(a)
b = add_to_list_incorrectly()
print(b)
print(a)

[0, 1, 2, 3]
[0, 1, 2, 3, 3]
[0, 1, 2, 3, 3]


In [69]:
print(add_to_list_incorrectly())
print(add_to_list_incorrectly())
print(a)
add_to_list_incorrectly()
print(a)

[0, 1, 2, 3, 3, 3]
[0, 1, 2, 3, 3, 3, 3]
[0, 1, 2, 3, 3, 3, 3]
[0, 1, 2, 3, 3, 3, 3, 3]
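
To see why the output keeps growing, it can help to inspect where Python stores default values. The sketch below uses a hypothetical function, `append_marker`, and reads its `__defaults__` attribute, which holds the default objects created when the function was defined.

```python
# Hypothetical function with a mutable default, for illustration only.
def append_marker(bucket=[]):
    bucket.append('x')
    return bucket

defaults_before = list(append_marker.__defaults__[0])  # snapshot: []

first = append_marker()
second = append_marker()

print(defaults_before)                   # []
print(first is second)                   # True: both calls returned the same list
print(append_marker.__defaults__[0])     # ['x', 'x']: the stored default was mutated
```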


What you should do instead

In [70]:
def add_to_list_correctly(a=None):
    if a is None:
        a = [0, 1, 2]
    a.append(3)
    return a

print(add_to_list_correctly())
print(add_to_list_correctly())
print(add_to_list_correctly())
print(add_to_list_correctly(a=[0, 1, 2, 3]))
print(add_to_list_correctly(a=[0, 1, 2, 3]))

[0, 1, 2, 3]
[0, 1, 2, 3]
[0, 1, 2, 3]
[0, 1, 2, 3, 3]
[0, 1, 2, 3, 3]


In [75]:
# Another demonstration that a list is mutable
d_list = [6, 5, 4]
print(d_list)
e_list = add_to_list_correctly(d_list)
print(e_list)
print(d_list)

[6, 5, 4]
[6, 5, 4, 3]
[6, 5, 4, 3]
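
If the caller's list should stay untouched, one common approach is to copy the argument before modifying it. This is a sketch with a hypothetical function name, `append_three_to_copy`, not a replacement for the function above.

```python
# Hypothetical helper: works on a copy so the caller's list is not mutated.
def append_three_to_copy(items):
    result = list(items)   # shallow copy of the incoming list
    result.append(3)
    return result

original = [6, 5, 4]
extended = append_three_to_copy(original)

print(original)   # [6, 5, 4]
print(extended)   # [6, 5, 4, 3]
```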


Immutable variables as function params are OK

In [71]:
def add_to_int(c=4):
    return c+8

print(add_to_int())
print(add_to_int())   
print(add_to_int())    

12
12
12


### Immutable and mutable objects and efficient code

You need to consider whether a data type is mutable or immutable to write efficient code:

- Python handles mutable and immutable objects differently.
- Immutable objects can be slightly quicker to create and access than comparable mutable ones (for example, tuples versus lists in CPython).
- Immutable objects are fundamentally expensive to "change", because doing so involves creating a new object. Changing mutable objects in place is cheap.
- For example, concatenating strings in a loop can waste a lot of time and memory (strings are immutable, so concatenating two strings creates a third string). Collect the pieces in a list (for example with a list comprehension) and combine them once with join() instead.
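
The string-building advice can be sketched as follows. The word list is illustrative; both approaches give the same text, but the loop creates a new string object on every iteration, while `join()` builds the result in one pass.

```python
words = ['immutable', 'strings', 'are', 'joined', 'once']

# Repeated concatenation: each += produces a new string object.
sentence_loop = ''
for word in words:
    sentence_loop += word + ' '
sentence_loop = sentence_loop.rstrip()

# Preferred: collect the pieces, then join them in a single call.
sentence_join = ' '.join(words)

# join() also accepts a comprehension or generator of strings.
squares = ', '.join(str(n * n) for n in range(5))

print(sentence_loop == sentence_join)   # True
print(sentence_join)
print(squares)                          # 0, 1, 4, 9, 16
```

For a handful of short strings the difference is negligible; it matters mainly when many pieces are combined in a loop.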

