<div class="frontmatter text-center">
<h1>Introduction to Data Science and Programming</h1>
<h2>Lecture 5: Python Crash Course - Dictionaries</h2>
<h3>IT University of Copenhagen, Fall 2023</h3>
<h3>Instructor: Anastassia Vybornova</h3>
</div>

# Recap of last time + additional info

* loops: `for` and `while` (+ their scope!) 
* data types: `tuple, set, range`
* `methods` (for tuples, sets, and lists)

## `for` loop syntax in Python

With a `for` loop, we can iterate over an iterable object and execute some code (the body of the for-loop) repeatedly (once for each item in the iterable object). 

**iterable objects:** `list`; `range`; `str`; `tuple` (these are sequences); `set` (iterable, but not a sequence)

```python
for variable in iterable_object:
    statement
# in this for loop, the statement will be executed len(iterable_object) times.

for variable in iterable_object:
    statement
    if condition:
        break
# this for loop will be executed at most len(iterable_object) times;
# but if the condition becomes True, the for loop execution will stop earlier
```


In [None]:
for item in [1, 2, 3, 4, 5]:
    print(item)

In [None]:
for item in [1, 2, 3, 4, 5]:
    print(item)
    if item > 2:
        break


In [None]:
# the order matters! if we first "break", then the code 
# *after* the break statement will not be executed:
for item in [1, 2, 3, 4, 5]:
    if item > 2:
        break
    print(item)

## for-loops "leak":

The current value of the loop variable will "leak" (become available) once the for-loop ends. 

In [None]:
for loop_var in ["a", "b", "c", "d"]:
    print("in for-loop:", loop_var) 

# the last value of loop_var has "LEAKED":
print("after for-loop:", loop_var)

In [None]:
# THE LOOP VARIABLE "LEAKS"
# ... even if it was defined with a different value before!
x = "efghij"
print("before for-loop:", x)

for x in ["a", "b", "c", "d"]:
    print("in for-loop:", x)
    # x takes on the values "a", "b", "c", "d" at each iteration step

# the last value of x has "LEAKED":
print("after for-loop:", x)

## `while` loop syntax in Python

With a `while` loop, we execute some code (the body of the while-loop) conditionally and repeatedly - until the condition of the while loop becomes False; or until we break out of the while loop.
```python
while condition:
    statement # (which modifies the condition!)
# this while loop will check the condition; 
# if it is True, it will execute the statement UNTIL the condition becomes False.

while condition1:
    statement 
    if condition2:
        break # make sure to "break out" at some point!
```

In [None]:
i = 0
while i < 5:
    i = i + 1
print(i) # this line will be executed AFTER the while loop is over

In [None]:
i = 0
while True: # will run forever,
    i = i + 1
    if i == 5: # UNLESS i becomes EXACTLY 5
        break
print(i) # this line will be executed AFTER the while loop is over

## Data types you know so far

* numeric: `int`, `float`
* text: `str`
* boolean: `bool`
* sequences: `list`, `range`, `tuple` (order matters!)
* sets: `set` (unordered collection of unique items!)

## Data types you learned last time 

`range()`, `tuple` and `set`

```python
### RANGE
range(x) # defines a "range" of length x; goes from 0 to x-1

### TUPLE
(1, 1, 0, 0) # defines the IMMUTABLE ordered sequence of items (1, 1, 0, 0)

### SET
{1, 1, 0, 0} # defines the set {0,1} of UNIQUE items
```

# Methods in Python

A method is a function that you can **call on an object**, using the notation

# `object.method()`

We learned *some* methods for:
* lists: `.index()`, `.sort()`, `.count()`, `.remove()`...
* tuples: `.index()`, `.count()`
* sets: `.add(), .remove(), .pop()`
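
As a quick refresher, the methods listed above can be sketched as follows. The variable names (`numbers`, `point`, `letters`) are made up for illustration only.

```python
# list methods
numbers = [3, 1, 2, 3]
first_three = numbers.index(3)      # position of the FIRST 3 in the list
how_many_threes = numbers.count(3)  # how often 3 appears in the list
numbers.remove(3)                   # removes only the first 3
numbers.sort()                      # sorts the list in place

# tuple methods (tuples are immutable, so there is no .remove() or .sort())
point = (1, 1, 0, 0)
first_zero = point.index(0)
how_many_ones = point.count(1)

# set methods
letters = {"a", "b"}
letters.add("c")      # add a new item
letters.add("c")      # adding it again changes nothing (items are unique)
letters.remove("a")   # remove an existing item

print(numbers, first_three, how_many_threes)
print(first_zero, how_many_ones)
print(letters)
```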

<hr>

Today we will learn:

* update on operators (modulo `%` and assignment `+=, -=, *=, /=`)
* update on data types: the None object (absence of everything)
* update on functions (parameters, arguments)
* a new data type: dictionaries `dict`

# Update on your operators skill set!

### Arithmetic operators in Python:
* Addition `+`, 
* subtraction  `-`, 
* multiplication `*`, 
* division `/`, 
* exponentiation `**`, 
* integer division `//`
* **Modulus** `%`

> For the purposes of this crash course, it is enough for you to know how to use modulus `x % y` to check whether `x` is divisible by `y`. If you're curious about what else modulo can do for you (and it can do a lot!), [click here](https://realpython.com/python-modulo-operator/)

In [None]:
# it gives us the *remainder* of a division
33 % 10

In [None]:
# for even numbers, the *remainder* of a division by 2 is 0 (by definition)
16 % 2

In [None]:
# for odd numbers, the *remainder* of a division by 2 is NOT 0 (by definition)
17 % 2

In [None]:
# for even numbers, the *remainder* of a division by 2 is 0 (by definition)
16 % 2 == 0 # True: the remainder is 0, so the number is even

In [None]:
# for odd numbers, the *remainder* of a division by 2 is NOT 0 (by definition)
17 % 2 != 0 # True: the remainder is not 0, so the number is odd

In [None]:
# and more generally speaking, we can test whether x is divisible by y
# (if it is divisible, x % y will be 0)
49 % 7

In [None]:
# and more generally speaking, we can test whether x is divisible by y
# (if it is divisible, x % y will be 0)
x = 49
y = 7
x % y == 0 # the equivalent of saying: x is divisible by y
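
The divisibility check combines naturally with the `for` loop from the recap. The following is a minimal sketch; `is_divisible` is a hypothetical helper name, not something defined elsewhere in this lecture.

```python
# hypothetical helper: True if x is divisible by y
def is_divisible(x, y):
    return x % y == 0

# collect all even numbers between 1 and 10
even_numbers = []
for number in range(1, 11):
    if is_divisible(number, 2):
        even_numbers.append(number)

print(even_numbers)
print(is_divisible(49, 7))
print(is_divisible(50, 7))
```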

# Update on your operators skill set!

### Assignment operators in Python:
* `=` assign value of expression (right) to variable (left)
* `+=, -=, *=, /=` **add and assign; subtract and assign; multiply and assign; divide and assign**

In [None]:
# What if I want to add 1 to a variable, but keep the variable name?
i = 4
i = i + 1 # we can make this shorter and more elegant:
i

In [None]:
i = 4
i += 1 # equivalent to: i = i + 1 ; can read as: "add 1 and assign"
i

In [None]:
# same logic for all other mathematical operators:
a = 9
a = a - 2 # we can use -= instead!
a

In [None]:
# same logic for all other mathematical operators:
a = 9
a -= 2
a

In [None]:
# same logic for all other mathematical operators:
b = 15
b *= 2
b

In [None]:
# same logic for all other mathematical operators:
b = 15
b /= 3
b
# note how the variable type changed to float!

In [None]:
# this also works for lists
my_list = [3,6,9]
my_list += [8,4,7]
my_list

In [None]:
# ... and for strings
my_word = "tusind "
my_word += "tak"
my_word

# Update on data types: `None` 

`None` represents the absence of a value.

In [None]:
# the None object, with a capital N:
x = None
print(x)
print(type(x))

In [None]:
# ONLY NONE CAN BE NONE! 
# None is not the same as empty
x = None
empty_string = ""
empty_list = []
empty_tuple = ()
print(x == empty_string)
print(x == empty_list)
print(x == empty_tuple)

In [None]:
# Only None is None!
print(None == True)
print(None == False)
print(None == None)
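
Because only `None` is `None`, a common Python idiom is to test for it with `is None` rather than `==`. A minimal sketch, where `describe` is a hypothetical function name used only for illustration:

```python
# hypothetical helper that distinguishes None from "empty" values
def describe(value):
    if value is None:
        return "nothing here"
    return "got a value"

print(describe(None))  # None itself
print(describe(""))    # an empty string is NOT None
print(describe([]))    # an empty list is NOT None
print(describe(0))     # zero is NOT None
```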

## Update on functions
* returning `None` (if no return statement!)
* returning multiple values
* **parameter**: temporary variable declared in the function definition
* **argument**: value passed into the function call
* using several parameters
* required and optional parameters (that have a default)
* mixing required and optional parameters in a function call
* assigning arguments explicitly (by parameter name)

In [None]:
# if there is no return statement,
# the function returns None:
def function_without_return():
    1 + 2

function_without_return()

In [None]:
# if there is no return statement,
# the function returns None:
def function_without_return():
    1 + 1

print(function_without_return())

In [None]:
# if there is no return statement,
# the function returns None:
def function_without_return():
    1 + 1

result = function_without_return()
print(result)

In [None]:
# some functions that we already learned 
# (and some methods!) return None!
result = print("hello world!")
print(result)

In [None]:
# some functions that we already learned 
# (and some methods!) return None!
mylist = [1, 2, 3]
result = mylist.append(4)
print(result) # the .append() method has no return value
print(mylist) # but the appending worked!

In [None]:
mylist = [1, 2, 3]
mylist = mylist.append(4) # THIS IS WRONG!!
print(mylist) # we overwrote "mylist" with the return of the .append() method, which is None

In [None]:
# functions can return multiple values
# just write them all after your "return" keyword, separated by commas:
def square_and_cube(x):
    return x ** 2, x ** 3 

result = square_and_cube(2)
print(result) # the result is a TUPLE!
print(type(result))

In [None]:
# functions can return multiple values
# just write them all after your "return" keyword, separated by commas:
def square_and_cube(x):
    return x ** 2, x ** 3 

# we can save each item within the returned tuple into a separate variable:
a, b = square_and_cube(2)  
print(a)
print(b)

In [None]:
# functions can return multiple values
# just write them all after your "return" keyword, separated by commas:
def square_and_cube(x):
    return x ** 2, x ** 3 

# if we don't need one of the returned values, use underscore as a throwaway name:
_, b = square_and_cube(2)  # let's say I don't care about the square, I just want the cube
print(b)

In [None]:
# HOW NOT TO DO IT:
def square_and_cube(x):
    return x ** 2
    return x ** 3 
# this is WRONG!! the function is exited as soon as a return statement is executed
# THE 2ND RETURN STATEMENT NEVER GETS EXECUTED 

# the function call still works... but this is not what we wanted
square_and_cube(2)



In [None]:
# function definition
def add_5(x): 
    x = x + 5
    return x

# function call
add_5(4) 

In [None]:
# function definition
def add_5(x): # x ...is the PARAMETER (arbitrarily named)
    x = x + 5
    return x

# function call
add_5(4) # 4 ... is the ARGUMENT (value passed to function)

In [None]:
# a function can have several PARAMETERS
def add_numbers(x, y, z):
    my_sum = x + y + z
    return my_sum
# parameters in function def: x, y, z

# function call
add_numbers(1, 2, 3)
# arguments in function call: 1, 2, 3

# Q: How does Python know which argument corresponds to which parameter?

In [None]:
# a function can have several parameters
def add_numbers(x, y, z):
    my_sum = x + y + z
    return my_sum
# parameters in function def: x, y, z

# in this function call
add_numbers(1, 2, 3)
# python knows BY POSITION: 
# first argument (1) is x 
# second argument (2) is y
# third argument (3) is z

In [None]:
# a function can have several parameters
def add_numbers(x, y, z):
    my_sum = x + y + z
    return my_sum

# function call
# we can explicitly "assign the arguments by parameter name"
add_numbers(x = 1, y = 2, z = 3)

In [None]:
# why is it useful to assign arguments by NAME?
# because ORDER MATTERS!
def compute_power(x, y):
    return x**y

print(compute_power(5,2)) # positional arguments >> computes 5 * 5
print(compute_power(2,5)) # positional arguments >> computes 2 * 2 * 2 * 2 * 2

In [None]:
# why is it useful to assign arguments by NAME?
# because ORDER MATTERS!
def compute_power(x, y):
    return x**y

# to be sure that we use the parameters in the way we intended,
# assign by name! if we want to compute 5 to the power of 2,
# compute_power(x = 5, y = 2) # named arguments >> computes 5 ** 2
compute_power(y = 2, x = 5) # named arguments >> ALSO computes 5 ** 2

In [None]:
# we can have "default values" for "optional parameters":
def compute_power(x, y = 2):
    return x ** y

# what happens if we don't provide an argument for the second parameter?
compute_power(5)

# what happens if we DO provide an argument for the second parameter?
# compute_power(3, 5)

In [None]:
# we can have "default values" for "optional parameters":
def compute_power(x, y = 2):
    # x is required (must be provided)
    # y is optional (if not provided, gets the default value)
    return x ** y

# compute_power(3) # if we DO NOT specify y, it will return x ** 2
# compute_power(3,3) # if we DO specify y, it will return x ** y
compute_power(y=3) # if we DO NOT specify x, it will complain

In [None]:
# RULE FOR FUNCTION DEFINITIONS:
# required parameters must ALWAYS COME FIRST
# and default parameters come last.
# this will throw an error message:
def compute_power(x = 10, y):
    return x ** y

In [None]:
# SIMILAR RULE FOR FUNCTION CALLS:
def compute_power(x, y = 2): 
    return x ** y

### POSSIBLE OPTIONS
# compute_power(3, 3) # this works (all arguments are positional)
# compute_power(x = 3, y = 3) # this works (all arguments are named)
# compute_power(y = 3, x = 3) # this works (all arguments are named, so position doesn't matter)
# compute_power(3, y = 3) # this works! (first positional, then named) 

### IMPOSSIBLE OPTIONS
# compute_power(x = 3, 3) # THIS DOESN'T WORK! (first named, then positional is a NO)
# compute_power(y = 3, 3) # THIS DOESN'T WORK! (first named, then positional is a NO)


# Try it out yourself!

1. Define a function `compute_fraction` that takes 3 input parameters `a, b, c` and computes $\frac{a+b}{c}$. Try it out for different values of `a, b, c`. Run the cell with the `assert` statements to see if you defined the function correctly.

2. Define a function `compute_sum_or_fraction` that takes 3 input parameters and computes $\frac{a+b}{c}$; where a and b are **required parameters** and **c is an optional parameter with the default value `1`**. Try it out for different values `a, b, c`; try it out with providing only the parameters `a, b`. Run the cell with the `assert` statements to see if you defined the function correctly.

**`compute_fraction`**

In [None]:
# define your function
def compute_fraction():
    # YOUR CODE HERE
    raise NotImplementedError()


In [None]:
# test it out for different values of a, b, c

In [None]:
# test your function
assert compute_fraction(1,2,3)==1
assert compute_fraction(2,2,2)==2
assert compute_fraction(10,60,7)==10

**`compute_sum_or_fraction`**

In [None]:
# define your function
def compute_sum_or_fraction():
    # YOUR CODE HERE
    raise NotImplementedError()

In [None]:
# try out your functions for different values a,b,c 
# and for different values a,b

In [None]:
# test your function
assert compute_sum_or_fraction(1,2,3)==1
assert compute_sum_or_fraction(60,10,7)==10

assert compute_sum_or_fraction(1,2)==3
assert compute_sum_or_fraction(10,5)==15



## Update on functions - recap
* returning multiple values: `return value1, value2`
* **"parameter"** = temporary variable declared in the function definition `def function_name(parameter_name): ...`
* **"argument"** = value passed into the function call `function_name(argument_value)`
* functions can take several parameters `def function_name(param1, param2, param3, ...)`
* required and optional parameters (that have a default) 
    * required: `def function_name(req_param)`
    * optional: `def function_name(opt_param=default_value)`
* mixing required and optional parameters in a function call - **watch out for the right order**
* assigning arguments explicitly (by parameter name)
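
The points of this recap can be put together in one small sketch. The function `scale_and_shift` and its parameter names are invented for illustration.

```python
# x is required; factor and offset are optional (they have defaults)
def scale_and_shift(x, factor=2, offset=0):
    scaled = x * factor
    shifted = scaled + offset
    return scaled, shifted  # returns a tuple of two values

only_required = scale_and_shift(5)                # defaults used for factor and offset
positional = scale_and_shift(5, 3)                # factor given by position
mixed = scale_and_shift(5, offset=1)              # first positional, then named
named = scale_and_shift(offset=1, factor=3, x=5)  # all named: order doesn't matter

# unpack the returned tuple into two variables
scaled, shifted = scale_and_shift(5, 3, 1)

print(only_required, positional, mixed, named)
print(scaled, shifted)
```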

# The data type dictionary `dict`

"A collection of key-value pairs, with unique keys"

In [None]:
# how can we store PAIRS of data?
countries = ["ethiopia", "indonesia", "peru", "czech republic"]
capitals = ["addis ababa", "jakarta", "lima", "prague"]

# pairs of country-capital have the same index:
print(countries[2])
print(capitals[2])
print("the capital of", countries[2],"is", capitals[2])


In [None]:
# how can we store PAIRS of data?
countries = ["ethiopia", "indonesia", "peru", "czech republic"]
capitals = ["addis ababa", "jakarta", "lima", "prague"]

i = 0 # the items that belong together have the same list index

print(
    "the capital of", 
    countries[i], 
    "is", 
    capitals[i]
    )


In [None]:
# "there MUST be a better way to do this!"
# enter the data type DICTIONARY: 
# denoted by curly brackets. 
# each ITEM of the dictionary is a key:value pair.
my_dict = {
    "ethiopia": "addis ababa", # key is "ethiopia", the corresponding value is "addis ababa"
    "indonesia": "jakarta",
    "peru": "lima",
    "czech republic": "prague"
} 
my_dict

In [None]:
# check the new data type:
type(my_dict)

In [None]:
# inside the dict, the items are structured by key:value;
# now we can LOOK UP a VALUE by its KEY:
my_dict["ethiopia"] 

In [None]:
# inside the dict, the items are structured by key:value;
# now we can LOOK UP a VALUE by its KEY:
my_dict["peru"] 

In [None]:
# we can access the keys by the DICT METHOD .keys():
my_dict.keys()

In [None]:
# we can access the values by the DICT METHOD .values():
my_dict.values()

In [None]:
# we can access the ITEMS (key-value pairs) by the DICT METHOD .items():
my_dict.items()

In [None]:
# we can iterate over the keys of a dictionary:
for x in my_dict.keys():
    print(x)

In [None]:
# we can iterate over the values of a dictionary:
for v in my_dict.values():
    print(v)

In [None]:
# we can iterate over the keys and look up each value by its key:
for key in my_dict.keys():
    print("the dict key is:", key, "; the dict value is:", my_dict[key]) 

In [None]:
# we can iterate over the ITEMS (keys AND values) of a dictionary.
# the ITEMS come as (key, value) pairs, so we need 2 for loop variables:
for key, value in my_dict.items():
    print(key)
    print(value)

In [None]:
# we can iterate over the ITEMS (keys AND values) of a dictionary:
for key, value in my_dict.items():
    print("the capital of " + key + " is " + value)

In [None]:
# COMPARE THIS TO OUR INITIAL ATTEMPT TO STORE THE DATA IN 2 LISTS:

my_dict = {"ethiopia": "addis ababa", "indonesia": "jakarta", "peru": "lima", "czech republic": "prague"} 
for key, value in my_dict.items():
    print("the capital of " + key + " is " + value)

### VERSUS:

countries = ["ethiopia", "indonesia", "peru", "czech republic"]
capitals = ["addis ababa", "jakarta", "lima", "prague"]

for i in range(len(countries)):     
    print("the capital of " + countries[i] + " is " + capitals[i])

In [None]:
# Values can be of any type!
# bank movements per weekday
bank_movements = {
    "monday" : [15000, -750, -100],
    "tuesday" : 250,
    "wednesday" : [-200, 300]
}
# bank_movements
#bank_movements["monday"]
bank_movements["tuesday"]

In [None]:
# if we try to use a non-existing key,
# we get an error:
bank_movements["thursday"]

In [None]:
# To add an entry to the dictionary: 
# just assign a value to the new key like so:
bank_movements["thursday"] = "on thursday, nothing happened"
bank_movements

In [None]:
# To update a key-value pair:
bank_movements["tuesday"] = [125, 125] # here we are reassigning a new value to this key
bank_movements

In [None]:
# To update a key-value pair:
bank_movements["monday"] += [30] # here we are updating the list stored as value
bank_movements

In [None]:
# To update a key-value pair:
bank_movements["wednesday"].remove(300)
bank_movements

In [None]:
# To DELETE an entry from the dictionary:
# del(bank_movements["thursday"])
#bank_movements

In [None]:
# To test if a key is in the dictionary:
"monday" in bank_movements
# this is short for:
"monday" in bank_movements.keys()
# print(bank_movements)

In [None]:
# To test if a value is in the dictionary:
[-200] in bank_movements.values()

In [None]:
# Defining an empty dictionary, option 1:
d = {} 
type(d)

In [None]:
# Defining an empty dictionary, option 2:
d = dict()
type(d)

In [None]:
# We can also use the dict() function to create a dictionary:
# dict(name1=value1, name2=value2, ...)

# note that in this case, the keys (name1, name2, ...) 
# have to follow the NAMING RULES FOR VARIABLES!

my_dict = dict(
    ethiopia = "addis ababa", # key is "ethiopia", the corresponding value is "addis ababa"
    indonesia = "jakarta",
    peru = "lima",
    czech_republic = "prague" 
) 
my_dict

# Try it out yourself

Create a dictionary `menu` with days of the week as keys, and your dinner menu for that day as value. Then:
* print out all keys of the `menu`
* print out all values of `menu`
* print out what you will eat on Friday
* change your dinner on Saturday to `"dining out!"` 
* delete the key-value pair for Sunday from the dictionary

In [None]:
menu = {
    # your menu here: 
    # "monday": "...", etc.
}

# Dictionaries in Python

* dictionaries are **collections of key-value pairs** with **unique keys**
* store pairs of data in a **structured** way, by entries of the form `key:value` 
* looking up a value by its key in a dictionary is **much more efficient** than searching for an item in a list
* denoted by curly brackets `{}` or `dict()`
    * {key1:value1, key2:value2, ...}
    * dict(name1=value1, name2=value2, ...)
* rules for dictionary **keys**:
    * *should* be of type `int`, `str` or `tuple` (more details later)
    * have to be unique (the same way a word can't appear twice in a dictionary)
* rules for dictionary **values** 
    * can be of any type 
    * do not need to be unique (the same way two words (synonyms) can have the same meaning)
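
The rules for keys and values can be checked with a short sketch. All names and values below are made up for illustration.

```python
# keys are unique: if a key is repeated, the LAST value wins
grades = {"anna": 7, "anna": 12}

# values do not need to be unique
same_values = {"a": 1, "b": 1}

# tuples work as keys
grid = {(0, 0): "origin", (1, 0): "east"}

# lists do NOT work as keys: Python raises a TypeError
list_key_failed = False
try:
    bad = {[0, 0]: "origin"}
except TypeError:
    list_key_failed = True

print(grades)
print(same_values)
print(grid[(1, 0)])
print(list_key_failed)
```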

# Dictionaries in Python

* use a key to "look up" its value: `mydict[mykey]` returns the value that corresponds to `mykey`
* to add or update a key:value pair: `mydict[mykey]=myvalue`
* to delete a key:value pair: `del(mydict[mykey])`
* to check whether a key is in the dictionary: `mykey in mydict` (short for: `mykey in mydict.keys()`)
* to check whether a value is in the dictionary: `myvalue in mydict.values()`
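
A minimal sketch that runs through these operations in order; the dictionary `opening_hours` and its contents are invented for illustration.

```python
opening_hours = {"monday": "9-17", "tuesday": "9-17"}

# look up a value by its key
monday_hours = opening_hours["monday"]

# add a new key:value pair
opening_hours["wednesday"] = "10-14"

# update an existing key:value pair
opening_hours["monday"] = "closed"

# delete a key:value pair
del opening_hours["tuesday"]

# check for keys and values
has_tuesday = "tuesday" in opening_hours
has_closed = "closed" in opening_hours.values()

# checking for the key first avoids an error for non-existing keys
if "sunday" in opening_hours:
    sunday_hours = opening_hours["sunday"]
else:
    sunday_hours = "unknown"

print(opening_hours)
print(monday_hours, has_tuesday, has_closed, sunday_hours)
```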

# Reminder: how to create...

<style type="text/css">
.tg  {border-collapse:collapse;border-spacing:0;}
.tg td{border-color:black;border-style:solid;border-width:1px;font-family:Arial, sans-serif;font-size:14px;
  overflow:hidden;padding:10px 5px;word-break:normal;}
.tg th{border-color:black;border-style:solid;border-width:1px;font-family:Arial, sans-serif;font-size:14px;
  font-weight:normal;overflow:hidden;padding:10px 5px;word-break:normal;}
.tg .tg-0lax{text-align:left;vertical-align:top}
</style>
<table class="tg">
<thead>
  <tr>
    <th class="tg-0lax">AN EMPTY ....</th>
    <th class="tg-0lax">with brackets</th>
    <th class="tg-0lax">with function()</th>
  </tr>
</thead>
<tbody>
  <tr>
    <td class="tg-0lax">list</td>
    <td class="tg-0lax">[]</td>
    <td class="tg-0lax">list()</td>
  </tr>
  <tr>
    <td class="tg-0lax">tuple</td>
    <td class="tg-0lax">()</td>
    <td class="tg-0lax">tuple()</td>
  </tr>
  <tr>
    <td class="tg-0lax">set</td>
    <td class="tg-0lax"></td>
    <td class="tg-0lax">set()</td>
  </tr>
  <tr>
    <td class="tg-0lax">dict</td>
    <td class="tg-0lax">{}</td>
    <td class="tg-0lax">dict()</td>
  </tr>
</tbody>
</table>

# Reminder: how to create...

<style type="text/css">
.tg  {border-collapse:collapse;border-spacing:0;}
.tg td{border-color:black;border-style:solid;border-width:1px;font-family:Arial, sans-serif;font-size:14px;
  overflow:hidden;padding:10px 5px;word-break:normal;}
.tg th{border-color:black;border-style:solid;border-width:1px;font-family:Arial, sans-serif;font-size:14px;
  font-weight:normal;overflow:hidden;padding:10px 5px;word-break:normal;}
.tg .tg-0lax{text-align:left;vertical-align:top}
</style>
<table class="tg">
<thead>
  <tr>
    <th class="tg-0lax">WITH ITEMS</th>
    <th class="tg-0lax">with brackets</th>
    <th class="tg-0lax">with function()</th>
  </tr>
</thead>
<tbody>
  <tr>
    <td class="tg-0lax">list</td>
    <td class="tg-0lax">[1, 2, 3]</td>
    <td class="tg-0lax">list((1,2,3))</td>
  </tr>
  <tr>
    <td class="tg-0lax">tuple</td>
    <td class="tg-0lax">(1, 2, 3)</td>
    <td class="tg-0lax">tuple([1,2,3])</td>
  </tr>
  <tr>
    <td class="tg-0lax">set</td>
    <td class="tg-0lax">{1, 2, 3}</td>
    <td class="tg-0lax">set([1,2,3])</td>
  </tr>
  <tr>
    <td class="tg-0lax">dict</td>
    <td class="tg-0lax">{key1: value1, <br>key2: value2, <br>key3: value3}</td>
    <td class="tg-0lax">dict(key1 = value1, <br>key2 = value2, <br>key3 = value3)</td>
  </tr>
</tbody>
</table>

**Note** that `list(), tuple(), set()` take only ONE argument (an iterable object)!

In [None]:
# list(), tuple(), set() take only ONE argument!
# e.g. to create a tuple with the items 1, 2, 3:

# mytuple = tuple(1,2,3) # WRONG
# mytuple = tuple( [1,2,3] ) # RIGHT
mytuple = (1,2,3) # ALSO RIGHT
mytuple
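
The single argument can be any iterable object, as this short sketch shows. The variable names are chosen for illustration only.

```python
letters = list("abc")              # a string is iterable: one item per character
unique_digits = set([1, 1, 0, 0])  # duplicates are dropped
first_three = tuple(range(3))      # a range is iterable too

# passing several arguments raises a TypeError
too_many_arguments = False
try:
    tuple(1, 2, 3)
except TypeError:
    too_many_arguments = True

print(letters)
print(unique_digits)
print(first_three)
print(too_many_arguments)
```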