In [3]:
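# the names avoid list, tuple, set and dict, which would shadow the built-in types
numbers_list  = [1, 2, 3]
numbers_tuple = (1, 2, 3)
numbers_set   = {1, 2, 3}
numbers_dict  = {"first": 1, "second": 2, "third": 3}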

# Why more data structures?

While lists are widely used, they can be inefficient for certain tasks, particularly when searching for a specific value. Accessing an element by its index is fast, but checking whether a value is present requires scanning the list from the beginning until the value is found, which takes time proportional to the length of the list. To address such limitations and to organize and retrieve data more efficiently, Python offers further built-in data structures: sets, dictionaries, and tuples.

---
# Set

### Motivation
Look at the following problem. We only want to know whether an element is present; we do not care where in the list it is. If the elements were organized in a suitable way, we could find an element faster. We could also drop duplicates, since they carry no extra information here. This is where sets come in.

In [4]:
favorite_colors = ["blue", "green", "red", "blue"]
x = "blue"
if x in favorite_colors:
    print("wheeee")

wheeee


## Introduction
<div class="alert alert-block alert-info">
<ol>
<li>Unique elements: sets automatically eliminate duplicates, so each element appears only once.</li>
<li>Efficient membership testing: searching for an element takes O(n) time in a list but O(1) time on average in a set, thanks to <a href="https://en.wikipedia.org/wiki/Hash_function">hashing</a>.</li>
</ol>
</div>
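The difference in membership testing can be observed with a rough timing sketch. The names `data_list`, `data_set`, and `target` are illustrative, and the measured times depend on the machine, so no specific numbers are claimed here.

```python
import timeit

n = 100_000
data_list = list(range(n))   # membership test scans the elements one by one
data_set = set(data_list)    # membership test uses the hash of the element
target = n - 1               # worst case for the list: the last element

# repeat each membership test 100 times and measure the total time
list_time = timeit.timeit(lambda: target in data_list, number=100)
set_time = timeit.timeit(lambda: target in data_set, number=100)

print(f"list: {list_time:.5f} s")
print(f"set:  {set_time:.5f} s")
```

On typical hardware the set lookup should be faster by several orders of magnitude, and the gap grows with `n`.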

In [5]:
s = {"blue", "green", "red", "blue"}
print(s)

{'green', 'red', 'blue'}


In [6]:
"red" in s

True

In [7]:
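# set() builds a set from any iterable; this requires that the built-in name `set` is not shadowed
print(set("abca"))
print(set(["blue", "green", "red", "blue"]))

{'a', 'b', 'c'}
{'green', 'red', 'blue'}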

In [None]:
set() # an empty set; {} would create an empty dictionary instead, see below

set()

In [None]:
sorted(s)

['blue', 'green', 'red']

In [None]:
s1 = {"blue", "green", "red"}
s2 = {"black", "red"}

In [None]:
print(s1 | s2)  # s1.union(s2)
print(s1 & s2)  # s1.intersection(s2)
print(s1 - s2)  # s1.difference(s2)
print(s1 ^ s2)  # s1.symmetric_difference(s2)

print(s1 <= s2) # s1.issubset(s2)
print(s1 >= {"a"}) # s1.issuperset({"a"})
print(s2=={"red","black"})

{'red', 'black', 'green', 'blue'}
{'red'}
{'green', 'blue'}
{'green', 'black', 'blue'}
False
False
True


In [None]:
print(s1)
s1.add("purple")
s1.remove("red")
print(s1)

{'green', 'red', 'blue'}
{'green', 'purple', 'blue'}


Now `"red"` is no longer in the set, so removing it again raises a `KeyError`. The error type is shared with dictionaries because set elements behave like dictionary keys without values.

In [None]:
s1.remove("red")

KeyError: 'red'
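When it is not known whether an element is present, the set method `discard` removes it without raising an error. A minimal sketch (the variable name `colors` is illustrative):

```python
colors = {"green", "purple", "blue"}

colors.discard("red")      # "red" is absent: nothing happens, no KeyError
colors.discard("purple")   # "purple" is present: it is removed
print(sorted(colors))      # ['blue', 'green']

try:
    colors.remove("red")   # remove() raises KeyError for a missing element
except KeyError as err:
    print("missing element:", err)
```

Which of the two to use is a design choice: `remove` is useful when a missing element signals a bug, `discard` when absence is a normal situation.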

---
# Dictionary
### Motivation
Look at the following structure:
```python
book = ["The Lord of the Rings", "J.R.R. Tolkien", 1954]
```
One needs to remember that the first element is the title, the second is the author, and the third is the year. One solution, as we already know, is to design a `class Book`. Dictionaries offer similar functionality in many cases, without the need to write a class.

### Introduction
<div class="alert alert-block alert-info">
    <ol>
        <li>data have the form `{key1: value1, key2: value2, ...}`</li>
        <li>keys are unique and must be hashable, which in practice means immutable types such as strings, numbers, or tuples (a list would not work)</li>
        <li>values can be of any type</li>
    </ol>
</div>

**Complexity:**
- operations on a single element (lookup, insertion, deletion by key) take O(1) time on average, because the hash of the key tells the computer where to look
- operations on the whole dictionary run in linear time O(n), because all elements must be visited
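The restriction on keys can be checked directly. In this sketch (the name `grid` and the stored strings are made up for illustration), a tuple is accepted as a key while a list is rejected:

```python
grid = {}
grid[(0, 1)] = "tuple key works"   # tuples of hashable items are hashable

try:
    grid[[0, 1]] = "list key"      # lists are mutable and therefore unhashable
except TypeError as err:
    print("TypeError:", err)

print(grid)                        # only the tuple key was stored
```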

In [None]:
# dictionary of countries and their population in millions
countries = {
    "Czechia": 11, 
    "Italy": 59,
    "Turkey": 85, 
    "Poland": 38
      }
countries["Czechia"] # value for key "Czechia"

11

In [None]:
"Uzbekistan" in countries

False

Now when we try to obtain the value of a key that does not exist, we get a KeyError:

In [None]:
countries["Uzbekistan"]

KeyError: 'Uzbekistan'

Or we can ask nicely with `get` and obtain `None`, or a default value of our choice, instead of an error:

In [None]:
a = countries.get("Uzbekistan")
b = countries.get("Uzbekistan", "did not find")
print(a)
print(b)

None
did not find


In [None]:
countries["Uzbekistan"] = 35
countries["Uzbekistan"]

35

In [None]:
countries.items()

dict_items([('Czechia', 11), ('Italy', 59), ('Turkey', 85), ('Poland', 38), ('Uzbekistan', 35)])

#### Example: traffic light
Let's implement the `next_color` function using dictionaries. The construction using classes was safe and robust, but there are cases where a dictionary comes in handy, especially when communicating between multiple programming languages.
```python
def next_color(col: str) -> str:
    """Takes a light color and returns the next one in the sequence.
    "red"->"orange"->"green"
    """
    if col=="red":
        return "orange"
    elif col=="orange":
        return "green"
    else:
        return "red"
```
1. What changes need to be made in the following code to add another color to the traffic light?

In [15]:
colors = {
    "red": 1,
    "orange": 2,
    "green": 3
    }

def next_color(col: str) -> str:
    """Takes a light color and returns the next one in the sequence.
    "red"->"orange"->"green"

    Examples:
        >>> next_color("red")
        'orange'
        >>> next_color("orange")
        'green'
        >>> next_color("green")
        'red'
    """
    if col not in colors:
        raise ValueError("Invalid color")
    
    next_value = (colors[col] % len(colors)) + 1
    
    return color_from_value(next_value)

def color_from_value(val: int) -> str:
    """Returns color corresponding to the value in colors.
    
    Examples:
        >>> color_from_value(1)
        'red'
        >>> color_from_value(2)
        'orange'
        >>> color_from_value(3)
        'green'
    """
    return [col for col, v in colors.items() if v == val][0]

import doctest
doctest.testmod()

TestResults(failed=0, attempted=6)
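A simpler variant, offered here only as a sketch and not as the intended solution, maps each color directly to its successor. The names `NEXT_COLOR` and `next_color_direct` are hypothetical. Adding a color then means editing two dictionary entries, and no reverse lookup by value is needed.

```python
# Hypothetical alternative: each color maps directly to the color that follows it.
NEXT_COLOR = {
    "red": "orange",
    "orange": "green",
    "green": "red",
}

def next_color_direct(col: str) -> str:
    """Return the color that follows `col`; raise ValueError for unknown colors."""
    if col not in NEXT_COLOR:
        raise ValueError(f"Invalid color: {col!r}")
    return NEXT_COLOR[col]

print(next_color_direct("red"))    # orange
print(next_color_direct("green"))  # red
```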

#### Iterating over dictionaries

In [None]:
[k for k in countries.keys()]

['Czechia', 'Italy', 'Turkey', 'Poland', 'Uzbekistan']

In [None]:
[v for v in countries.values()]

[11, 59, 85, 38, 35]

In [None]:
a = [print(k,"has about", v, "million people") for k,v in countries.items()]
print(a)

Czechia has about 11 million people
Italy has about 59 million people
Turkey has about 85 million people
Poland has about 38 million people
Uzbekistan has about 35 million people
[None, None, None, None, None]


#### Creating sets and dictionaries using comprehensions

In [None]:
{k for k in range(5)}

{0, 1, 2, 3, 4}

In [None]:
powers = {x: x**3 for x in range(5)}
powers[3]

27
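Comprehensions can also filter and transform an existing dictionary. The sketch below uses a made-up `populations` dictionary to keep only some entries and to swap keys with values:

```python
populations = {"Czechia": 11, "Italy": 59, "Turkey": 85, "Poland": 38}

# keep only the entries whose value passes a condition
large = {name: pop for name, pop in populations.items() if pop > 50}
print(large)   # {'Italy': 59, 'Turkey': 85}

# swap keys and values (safe here because the values are unique)
by_population = {pop: name for name, pop in populations.items()}
print(by_population[38])   # Poland
```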

#### Initializing dictionaries
For computing the frequency of words in a text, we can use a dictionary to store the words and their counts `{word: count}`. If the word is not in the dictionary, we add it with a count of 1. If it is already in the dictionary, we increment its count.

<div class="alert alert-block alert-info">
A `defaultdict` from the `collections` module adds a missing key automatically, with a default value, the first time the key is accessed.
</div>

In [None]:
from collections import defaultdict
d = defaultdict(int) # the argument is a function called with no arguments to create the default value; int() returns 0
print(d)
d["something"] # returns 0 and also stores the key "something" with that value

defaultdict(<class 'int'>, {})


0

If we choose a different default factory, another typical choice being `list`, missing keys start as an empty list:

In [None]:
l = defaultdict(list)
l["a"]

[]

In [None]:
d["a"] += 1
d["d"] += 2
print(d)
print(list(d))
print(list(d.items()))

defaultdict(<class 'int'>, {'something': 0, 'a': 1, 'd': 2})
['something', 'a', 'd']
[('something', 0), ('a', 1), ('d', 2)]


In [None]:
word_occurrences = defaultdict(int)
for w in "hello hello world worldy world".split():
    word_occurrences[w] += 1 # with a plain dict: word_occurrences[w] = word_occurrences.get(w, 0) + 1
word_occurrences.items()

dict_items([('hello', 2), ('world', 2), ('worldy', 1)])
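For the specific task of counting, the standard library also provides `collections.Counter`, which can serve as a shorter alternative to the `defaultdict(int)` loop. A minimal sketch on the same sentence:

```python
from collections import Counter

words = "hello hello world worldy world".split()
counts = Counter(words)          # counts every element of the iterable

print(counts["hello"])           # 2
print(counts["missing"])         # 0, missing keys count as zero
print(counts.most_common(1))     # the single most frequent word with its count
```

Unlike `defaultdict`, looking up a missing key in a `Counter` does not store that key.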

In [None]:
word_lengths = defaultdict(list)
for word in "hello hello my something world".split():
    word_lengths[len(word)].append(word)

word_lengths.items()
word_lengths[5]

['hello', 'hello', 'world']

#### A more complex example: nested dictionaries and lists

In [None]:
contacts = [
    {
        "name": "John",
        "email": ["john123@seznam.cz", "john666@de.com"],
        "address": {
            "street": "Karlovo namesti",
            "number": 1
        }
    },
    {
        "name": "Dohn",
        "email": ["do@h.n"],
        "address": {
            "street": "Somewhere",
            "number": 11
        }
    }
]
print(contacts[0]["email"][1])
print(contacts[1]["address"]["street"])

john666@de.com
Somewhere


In [None]:
# find Dohn in contacts and print his email, without knowing the index of Dohn
for contact in contacts:
    if contact["name"] == "Dohn":
        print(contact["email"])
        break
    
# and using list comprehension
[contact["email"] for contact in contacts if contact["name"] == "Dohn"]

['do@h.n']


[['do@h.n']]
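If contacts are looked up by name repeatedly, scanning the list each time costs O(n) per lookup. One possible approach, sketched here with a shortened copy of the data and the hypothetical name `contacts_by_name`, is to build a dictionary index once. It assumes that names are unique.

```python
contacts = [
    {"name": "John", "email": ["john123@seznam.cz", "john666@de.com"]},
    {"name": "Dohn", "email": ["do@h.n"]},
]

# build the index once: name -> contact record (assumes names are unique)
contacts_by_name = {contact["name"]: contact for contact in contacts}

print(contacts_by_name["Dohn"]["email"])          # ['do@h.n']
print(contacts_by_name.get("Jane", "not found"))  # not found
```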

### Choose function based on a string
When we expect new functionality to be added over time, we can store functions in a dictionary and call them based on a string. Another use is in games, where the functions for moving the player and similar actions can be stored in a dictionary.

In [None]:
def f2():
    print(5**2)
def f3():
    print(5**3)

function_choice = {
    "s": f2,
    "t": f3
}

def execute(order: str):
    if order in function_choice:
        function_choice[order]()
    else:
        print("I do not know this order!")


while True:
    order = input("order: ")
    if order == "end":
        break
    execute(order)

25
125
125
25
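The same idea also works without interactive input and with functions that take arguments. The sketch below uses hypothetical names (`square`, `cube`, `operations`, `run`) and relies on `dict.get` returning `None` for an unknown order:

```python
def square(x: int) -> int:
    return x ** 2

def cube(x: int) -> int:
    return x ** 3

# hypothetical command table: command string -> function
operations = {"s": square, "t": cube}

def run(order: str, value: int):
    """Look up `order` and apply the matching function; return None if unknown."""
    func = operations.get(order)   # get() returns None for an unknown order
    if func is None:
        return None
    return func(value)

print(run("s", 5))   # 25
print(run("t", 5))   # 125
print(run("x", 5))   # None
```

Adding a new command then only requires a new function and one more dictionary entry; `run` itself stays unchanged.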


---
# Tuples

### Motivation
Tuples are used when we want to group elements together, but we do not want to change them. This can be useful when we want to ensure that the data is not changed by accident.

In [None]:
immortal_coordinate =  (4,5,1)
immortal_coordinate[1] = 3

TypeError: 'tuple' object does not support item assignment

In [None]:
# in a snake game, we have obstacles and a dictionary of messages that appear at each of them
obstacles = {
    (0,0): "you are at the origin",
    (3,1): "You like stones",
    (4,1): "You like stones"
}
obstacles[(0,0)]

'you are at the origin'

### Introduction
<div class="alert alert-block alert-info">
    <ol>
        <li>immutable</li>
        <li>can be used as keys in dictionaries (as long as all their elements are hashable)</li>
    </ol>
</div>

In [None]:
t = (1,2,3) # equivalent to t = 1,2,3
print(t)

This can be used to assign several variables at once or to swap them:

In [None]:
a,b = 1,2
a,b = b,a # swap values
print(a)
print(b)

In [None]:
print(t)
a,b,c = t # unpacking tuple t to a,b,c
print(a)

Tuples can also be used to return more than one value from a function:

In [None]:
def make_complex(real: float, imag: float) -> tuple[float, float]:
    # named make_complex to avoid shadowing the built-in complex type
    return real, imag

print(make_complex(1,2))
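A returned tuple is often unpacked immediately at the call site. A small sketch with a hypothetical helper `min_max`:

```python
def min_max(values: list[float]) -> tuple[float, float]:
    """Return the smallest and the largest value as a tuple."""
    return min(values), max(values)

result = min_max([3, 1, 4, 1, 5])
print(result)            # (1, 5)

lowest, highest = min_max([3, 1, 4, 1, 5])   # unpack directly into two names
print(lowest, highest)   # 1 5
```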

Conversions between lists and tuples are done using `list()` and `tuple()`:

In [None]:
print(t)
print(list(t))
print(tuple(list(t)))

### Unpacking 
We can even collect several arguments into one parameter using `*args`. This lets us pass as many positional arguments as we want to the function; they are received as a tuple.

In [22]:
def f_args(*a):
    return a[0]
print(f_args(1,2,3))

1
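The star also works in the opposite direction: at the call site it unpacks a list or tuple into separate positional arguments. A minimal sketch with a hypothetical function `total`:

```python
def total(*numbers: float) -> float:
    """Sum any number of positional arguments."""
    print(type(numbers).__name__)   # tuple
    return sum(numbers)

print(total(1, 2, 3))      # 6

values = [10, 20, 30]
print(total(*values))      # 60, the star unpacks the list into three arguments
```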


With a double star, as in `**kwargs`, the function receives its keyword arguments as a dictionary.

In [30]:
def f_kwargs(**kwargs):
    return kwargs
print(f_kwargs(a=1,b=2,c=3))
print(f_kwargs(**{'a':1,'b':2,'c':3}))

{'a': 1, 'b': 2, 'c': 3}
{'a': 1, 'b': 2, 'c': 3}


Unpacking also works for lists, as shown below. This is handy, for example, when processing a list recursively.

In [None]:
lst = [1, 2, 3, 4, 5]
first, *rest = lst
print(first)
print(rest)

1
[2, 3, 4, 5]
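The recursive use of `first, *rest` can be sketched as follows. The function name `recursive_sum` is illustrative. For long lists a plain loop or `sum()` is preferable, because each call copies the rest of the list and Python limits the recursion depth.

```python
def recursive_sum(items: list[int]) -> int:
    """Sum a list by splitting it into its first element and the rest."""
    if not items:              # base case: an empty list sums to 0
        return 0
    first, *rest = items       # rest is always a list, possibly empty
    return first + recursive_sum(rest)

print(recursive_sum([1, 2, 3, 4, 5]))   # 15
```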


#### Example:
Imagine you're building a simple pricing calculator for a product. The base price is fixed, but the customer may add extras. A single extra is free; when both extras are added, the customer pays for the more expensive one.

In [42]:
def calculate_price(base_price, **options):
    try:
        return base_price + max(options["extra_storage"],options["extended_warranty"])
    except KeyError:
        return base_price

# base model
base_cost = calculate_price(500)
print(base_cost)

# both extras: the customer is charged only for the more expensive one (extra_storage)
both_extras_cost = calculate_price(500, extra_storage=100, extended_warranty=50)
print(both_extras_cost)

# one extra is for free
one_extra_cost = calculate_price(500, extra_storage=100)
print(one_extra_cost)



500
600
500
