# Data Structures - Dictionaries, Sets and Tuples

## Content
### Data types
[X] Variables    
[X] Types (Boolean, Int, Float, Strings)     
### Data transformation
[X] Arithmetic (+,  -, *, /, ** , %, )    
[X] Type conversion       
[X] Conditionals  (if/elif/else)  
[X] Methods    
### Handling collections
[X] Collection types: list, string,...    
[X] Loops    
[X] Lists   

### Functions
[X] 

### More Data Structures
[_] Dictionaries   
[_] Sets   
[_] Tuples  

### Next up: Classes
[_] ...


---

## Lists and functions recap

**Why do we have them?**

Creating a function:



``` python
def function_name(argument1, argument2):
  # do stuff
  # don't forget the indentation
  something = None  # placeholder for the computed result
  return something
```



In [None]:
# Example from last class
def add_two_numbers(number1, number2):
  number_sum = number1 + number2
  print("number_sum inside function:", number_sum)
  
  return number_sum

In [None]:
sum_3_4 = add_two_numbers(4 , 3)

In [None]:
print(sum_3_4)

### Exercise 1 - Removing duplicates

Notice that the list below contains some duplicate items. Write a function that removes all duplicates from a list and prints each item it removes.

In [None]:
groceries_list = ["milk", "eggs", "coffee", "eggs", "chocolate", "milk"]

Expected output:
  * "eggs already in the list"
  * "milk already in the list"
  
Expected return:
  * ["milk", "eggs", "coffee", "chocolate"]

In [None]:
"milk" in groceries_list

In [None]:
def remove_duplicates(my_list):
  ''' Removes duplicates from a list, printing each duplicate it skips '''
  my_new_shiny_unduplicated_list = []
  for item in my_list:
    if item in my_new_shiny_unduplicated_list:
      # Report the duplicate instead of silently skipping it
      print(item, "already in the list")
    else:
      my_new_shiny_unduplicated_list.append(item)
  return my_new_shiny_unduplicated_list

In [None]:
remove_duplicates(groceries_list)

### Exercise 2 - creating similar lists

Suppose every time we go to the supermarket, we always buy the items in **groceries_list** above, plus some items that change every time.

To make our lives easier, write a function that receives only these new items and gives us a complete list.

input:
  * `["fruits", "cleaning products"]`
  
Expected return:
  * `["milk", "eggs", "coffee", "chocolate", "fruits", "cleaning products"]`

In [None]:
def expand_shopping_list(new_items, basic_groceries):
  ''' Creates a complete shopping list '''
  # Copy the basics so the caller's list is not modified
  expanded_list = list(basic_groceries)
  for item in new_items:
    if item not in expanded_list:
      expanded_list.append(item)
  return expanded_list

In [None]:
basics = remove_duplicates(["milk", "eggs", "coffee", "eggs", "chocolate", "milk"])
additional = ["fruits", "cleaning products"]
print(expand_shopping_list(additional, basics))

---

# Data structures


## Tuples

* Sequence of elements of any type (like a list)
* Immutable (like a string): once initialized, a tuple can’t be changed, only re-assigned
* Create with parentheses or the built-in `tuple()` constructor
* Supports the same read-only operations as a list (indexing, slicing, `len`, `in`, `count`, `index`)

Advantages over a list? 
  * a bit faster 
  * safer, if variable should never be changed


In [None]:
base_groceries = ("milk", "eggs", "coffee", "chocolate", "oo")

for item in base_groceries:
  print(item)

### Creating new tuples

To create a new tuple from pre-existing tuples, you can use the addition operator. 

In [None]:
new_groceries = ("fruits", "vegetables")
all_groceries = base_groceries + new_groceries

In [None]:
all_groceries

Or you can transform it into a list, modify it and make it a tuple again 

In [None]:
groceries_list = list(base_groceries)
groceries_list[2] = 'fruits'

all_groceries = tuple(groceries_list)

In [None]:
all_groceries

Be careful when working with tuples. Accessing individual entries is not a problem, but assigning to an entry
raises a `TypeError`.

In [None]:
print(all_groceries[0])
all_groceries[0] = 'water'

Since tuples can contain arbitrary types, you can still modify mutable objects (such as dicts or lists) stored inside a tuple:

In [None]:
some_tuple = {'name': 'John'}, {'name': 'Sally'}
print(some_tuple)
some_tuple[0]['name'] = 'Jim'
print(some_tuple)

### Packing and unpacking

As we've seen, you can create tuples using parentheses or by converting a list to a tuple. Tuples
are also created implicitly from comma-separated values (this is called packing).

In [None]:
coordinates_3d =  10, 20, -5
print(type(coordinates_3d))

Unpacking the tuple is also quite useful when individual variables are required. Just make sure the number of variables
is equal to the number of elements in the tuple.

In [None]:
x, y, z = coordinates_3d
print(f'X = {x}, Y = {y}, Z = {z}')

In [None]:
# Careful!
x,y = coordinates_3d
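
The cell above fails because the tuple holds three values but only two names are given. As a minimal sketch (the `error_message` variable is only there for illustration), the resulting `ValueError` can be caught and inspected like this:

```python
coordinates_3d = 10, 20, -5

try:
  # Three values on the right, only two names on the left
  x, y = coordinates_3d
  error_message = None
except ValueError as error:
  error_message = str(error)
  print("ValueError:", error_message)
```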

---

## Sets

* **Unordered** collection of elements of different types, but all elements within the set are **unique**

* Can contain only “hashable” objects (built-in immutable types such as numbers, strings and tuples of hashable items are hashable; mutable types such as lists, dicts and sets are not)

* Supports mathematical set operations: 
  * union 
  * intersection 
  * difference 
  * symmetric difference 

Convenient for removing duplicates from a sequence
 
No indexing of elements: You can’t rely on the order of the elements within the set
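
To illustrate the hashability rule above, here is a small sketch. The names `points`, `bad_set` and `error_message` are made up for this example: a tuple can be stored in a set, while a list cannot.

```python
# Tuples of immutable values are hashable, so they can be set elements
points = {(0, 0), (1, 2), (0, 0)}
print(points)  # the duplicate (0, 0) is stored only once

# Lists are mutable and therefore unhashable
try:
  bad_set = {[1, 2], [3, 4]}
  error_message = None
except TypeError as error:
  error_message = str(error)
  print("TypeError:", error_message)
```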


In [None]:
groceries_list = ["milk", "eggs", "coffee", "eggs", "chocolate", "milk"]

We build sets using the built-in *set* constructor:

In [None]:
groceries_products = set(groceries_list)

print(groceries_list, len(groceries_list))
print(groceries_products, len(groceries_products))

Or with curly brackets

In [None]:
groceries_products = {"milk", "eggs", "coffee", "eggs", "chocolate", "milk"}
groceries_products

In [None]:
# Sets don't support indexing, so this raises a TypeError
groceries_products[2]

Or with set comprehension:

In [None]:
letters = {c for c in 'abracadabra' if c not in 'abc'}
print(letters)

### Set operations

You can call these operations by the symbols below or using the dot notation
- union ( | )
- intersection ( & )
- symmetric difference ( ^ )
- difference ( - )

In [None]:
base_groceries = {"milk", "eggs", "coffee", "eggs", "chocolate", "milk"}

new_groceries = {"fruits", "vegetables", "musli", "eggs"}

In [None]:
# Union - What elements are in either set (think of OR - in set 1 OR set 2)
base_groceries | new_groceries

Or:

In [None]:
base_groceries.union?

In [None]:
base_groceries.union(new_groceries)

In [None]:
# Intersection - What elements are in both sets (Think of AND - in set 1 AND set 2)
base_groceries & new_groceries

# Dot-notation
base_groceries.intersection(new_groceries)

In [None]:
# Symmetric difference - What elements are in either of the sets exclusively (Think of XOR - in either set 1 or set 2, but not both.)
base_groceries ^ new_groceries

# Dot-notation
base_groceries.symmetric_difference(new_groceries)

In [None]:
# Difference - What elements are in set 1 but not set 2
base_groceries - new_groceries

# Dot-notation
base_groceries.difference(new_groceries)

### Set methods

These methods can also be called on the `set` class itself, passing the sets as arguments. Each operation also has an in-place variant with the
`_update` suffix, e.g. `intersection_update`, which modifies an existing set instead of returning a new one (the in-place variant of `union` is simply `update`).
So, `base_groceries.intersection_update(new_groceries)` would update the set `base_groceries` to contain only `"eggs"`, which is the intersection of the two sets.

In [None]:
set.union(base_groceries, new_groceries)

set.intersection(base_groceries, new_groceries)

set.symmetric_difference(base_groceries, new_groceries)

set.difference(base_groceries, new_groceries)
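
The difference between `intersection` and `intersection_update` can be sketched as follows. The sets are redefined here so the example runs on its own; `common` and `result` are illustrative names.

```python
base_groceries = {"milk", "eggs", "coffee", "chocolate"}
new_groceries = {"fruits", "vegetables", "musli", "eggs"}

# intersection() returns a new set and leaves base_groceries untouched
common = base_groceries.intersection(new_groceries)
print(common)
print(len(base_groceries))  # still 4 elements

# intersection_update() changes base_groceries in place and returns None
result = base_groceries.intersection_update(new_groceries)
print(result)
print(base_groceries)
```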

### Exercise
We asked the students in each REDI class to state their native languages and got the results: 

  * Python class: russian, french, arabic, arabic, arabic, arabic, farsi
  * Java class: english, kurdish, arabic, arabic, french 

Based on these data sets, we would like to answer the following questions:  

  1. How many different languages do we have in each class?
  1. What are all the languages spoken across the two classes? 
  1. What are the common languages between the two classes?
  1. What are the languages that are found in only one class (not in both)?
  1. What are languages found in Python class but not in Java class?


In [None]:
python_languages = ['russian', 'arabic', 'french',  'arabic', 'arabic', 'arabic', 'farsi']
java_languages = ['english', 'kurdish', 'arabic', 'arabic', 'french']

In [None]:
python_languages_set = set(python_languages)
java_languages_set = set(java_languages)

In [None]:
python_languages_set | java_languages_set

In [None]:
python_languages_set.intersection(java_languages_set)

In [None]:
# What are all the languages spoken across the two classes?
print('All languages:', python_languages_set | java_languages_set)

# What are common languages between the two classes?
print('Common languages:', python_languages_set.intersection(java_languages_set))

# What are the languages that are found in only one class (not in both)?
print('One-class languages:', python_languages_set ^ java_languages_set)

# What are languages found in Python class but not in Java class?
print('Python class specific languages:', python_languages_set - java_languages_set)

---

## Dictionaries

Mutable collection of pairs: *key - value*

* Keys must be hashable (in practice, immutable types such as strings, numbers and tuples) 
* Keys can have different types
* Keys of a dict are unique


Values can be of any type (including other dictionaries), making dictionaries very versatile to store complex data.


In [None]:
populations = {
    "China": 1433783686, "India": 1366417754, "United States": 329064917,
    "Indonesia": 270625568, "Pakistan": 216565318, "Brazil": 211049527,
    "Nigeria": 200963599, "Bangladesh": 163046161, "Russia": 145872256,
    "Mexico": 127575529
}

In [None]:
populations

In [None]:
# Getting a value
russian_population = populations["Russia"]

In [None]:
russian_population

In [None]:
# Retrieving a value for a nonexistent key raises a KeyError
populations["Germany"]

In [None]:
# Inserting a new value
populations["Japan"] = 126860301
populations

In [None]:
# Updating a value looks suspiciously similar...
populations["India"] = 1352642280
populations['India']

Since Python 3.7, dicts preserve insertion order. In older Python versions, and in the hash maps of many other programming languages, no order is
guaranteed; types like `collections.OrderedDict` are used when a specific order of key-value pairs must be preserved.
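
A small sketch of what "ordered" means in practice, assuming Python 3.7 or newer and using a shortened version of the `populations` dict: new keys are appended at the end, and updating an existing key does not move it.

```python
populations = {"China": 1433783686, "India": 1366417754}

# A new key is added after the existing ones
populations["Japan"] = 126860301

# Updating an existing key keeps its original position
populations["India"] = 1352642280

print(list(populations))
```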

### Iterating on dicts
We have 3 important methods to iterate on dicts:

* keys
* values   
* items - returns pairs of (key,value)



In [None]:
for country in populations.keys():
  if country.startswith("B"):
    print(country)

In [None]:
counter = 0
for population in populations.values():
  if population > 1000000000:
    counter +=1

print("There are {} countries with a population above 1 billion".format(counter))

**What if I want to know which countries they are??**

In [None]:
for country, population in populations.items():
  print(country, "has a population of", population)

In [None]:
one_billion_countries = list()
for country, population in populations.items():
  if population > 1000000000:
    one_billion_countries.append(country)

print("There are {} countries with a population above 1 billion".format(len(one_billion_countries)))
print("They are {}".format(one_billion_countries))

### Exercises
#### Warming up

Given the following dictionary:


In [None]:
inventory = {
    'gold' : 500,
    'pouch' : ['flint', 'twine', 'gemstone'],
    'backpack' : ['xylophone','dagger', 'bedroll','bread loaf']
}

Try to do the following:

1. Add a key to inventory called 'pocket'.
2. Set the value of 'pocket' to be a list consisting of the strings 'seashell', 'strange berry', and 'lint'.
3. *.sort()* the items in the list stored under the 'backpack' key.
4. Then .remove('dagger') from the list of items stored under the 'backpack' key.
5. Add 50 to the number stored under the 'gold' key.


In [None]:
inventory["pocket"] = ['seashell', 'strange berry','lint'] 

In [None]:
inventory

In [None]:
backpack_value = inventory["backpack"]
backpack_value

In [None]:
# Sort the items of the list
backpack_value.sort()


In [None]:
inventory["backpack"] = backpack_value

In [None]:
inventory

In [None]:
gold_value = inventory["gold"]
gold_value = gold_value + 50

In [None]:
inventory["gold"] = gold_value

In [None]:
# Shorthand for the two cells above; running it in addition to them adds another 50
inventory["gold"] += 50
inventory

In [None]:
backpack_value.remove('dagger')

In [None]:
inventory["backpack"] = backpack_value

In [None]:
inventory

In [None]:
inventory["backpack"][1] += '!!'

In [None]:
inventory

In [None]:
inventory["backpack"][1][2]

#### Moving on

Create a new dictionary called "prices" using {} format like the example above.

Put these values in your prices dictionary:

```
"banana": 4,
"apple": 2,
"orange": 1.5,
"pear": 3
```

Create another dictionary called "stock":

```
"banana": 6,
"apple": 0,
"orange": 32,
"pear": 15,
"mango": 700
```


1. Loop through each key in prices. For each key, print out the key along with its price and stock information. Print the answer in the following format:
```
apple
price: 2
stock: 0
```

2. Let's determine how much money you would make if you sold all of your food.

  * Create a variable called total and set it to zero.
  * Loop through the prices dictionary. For each key in prices, multiply the number in prices by the number in stock. Print that value into the console and then add it to total.
  * Finally, outside your loop, print total.


In [None]:
price = {
  "banana": 4,
  "apple": 2,
  "orange": 1.5,
  "pear": 3
}

In [None]:
stock = {
    "banana": 6,
    "apple": 0,
    "orange": 32,
    "pear": 15,
    "mango": 700
}

In [None]:
for fruit in price.keys():
  print(fruit)
  print("price: {}".format(price[fruit]))
  print("stock: {}".format(stock[fruit]))
  print()
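
For the second part of the exercise (the total value of the stock), one possible sketch is shown below. It redefines both dicts so that it runs on its own, and it loops over `prices`, so "mango" (which has no price) is skipped.

```python
prices = {"banana": 4, "apple": 2, "orange": 1.5, "pear": 3}
stock = {"banana": 6, "apple": 0, "orange": 32, "pear": 15, "mango": 700}

total = 0
for fruit in prices:
  # Value of everything we have in stock for this fruit
  value = prices[fruit] * stock[fruit]
  print(fruit, value)
  total += value

print("total:", total)
```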

#### Dictionary out of a list

Given the list of fruit: ["banana", "apple", "apple", "banana", "banana", "orange", "pear", "orange", "orange", "orange"]
create a dictionary "stock", where the name of the fruit is a key, and the value is the amount of that fruit in the list.

1. loop over the given list
2. if the next fruit from the list is in the stock dictionary already, increase its value in the dictionary
3. if it's not in the dictionary, add it there with value 1

You can use the `in` operator to check whether a key exists in the dictionary, or the dictionary method .get() to read a value with a default when the key is missing

In [None]:
fruit = ["banana", "apple", "apple", "banana", "banana", "orange", "pear", "orange", "orange", "orange"] 
stock = {}

for item in fruit:
  if item in stock:
    stock[item] += 1
  else:
    stock[item] = 1

The other approach - with get() method:

In [None]:
fruit = ["banana", "apple", "apple", "banana", "banana", "orange", "pear", "orange", "orange", "orange"] 
stock = {}

for item in fruit:
  stock[item] = stock.get(item, 0) + 1


#### Selling everything
Given the following dicts:

In [None]:
stock = {
    "banana": 6,
    "apple": 0,
    "orange": 32,
    "pear": 15
}

prices = {
    "banana": 4,
    "apple": 2,
    "orange": 1.5,
    "pear": 3
}

1. Make a list called groceries with the values "banana","orange", and "apple"

2. Define a function compute_bill that takes one argument food as input. 

3. In the function, create a variable total with an initial value of zero.

4. For each item in the food list, add the price of that item to total. 

5. Finally, return the total. Ignore whether or not the item you're billing for is in stock.

Note that your function should work for any food list.

In [None]:
groceries = ["banana","orange", "apple"]

def compute_bill(food):
  total = 0
  for item in food:
    total += prices[item]
  return total

total = compute_bill(groceries)
print(total)

**Bonus**

Make the following changes to your compute_bill function:
1. While you loop through each item of food, only add the price of the item to total if the item's stock count is greater than zero.
2. If the item is in stock, subtract one from the item's stock count after you add its price to the total.


In [None]:
stock = {
    "banana": 6,
    "apple": 0,
    "orange": 32,
    "pear": 15
}

prices = {
    "banana": 4,
    "apple": 2,
    "orange": 1.5,
    "pear": 3
}
groceries = ["banana","orange", "apple"]

def compute_bill(food):
  total = 0
  for item in food:
    if stock[item] > 0:
      total += prices[item]
      stock[item] -= 1
  return total

total = compute_bill(groceries)
print(total)
print(stock)

#### Merging dicts

Write a function that receives two dicts and returns a single dict with all information from the two given dicts.

**Example**:

input:
  * {"name": "rodrigo", "age": 31}
  * {"dog": "Kiara"}
  
output:
  * {"name": "rodrigo", "age": 31, "dog": "Kiara"}

In [None]:
def merge_dicts(dict_left, dict_right):
  merged_dict = {}
  for item in dict_left:
    merged_dict[item] = dict_left[item]
  
  for k, v in dict_right.items():
    merged_dict[k] = v

  return merged_dict


dict_1 = {"name": "rodrigo", "age": 31}
dict_2 = {"dog": "Kiara"}

merged = merge_dicts(dict_1, dict_2)

print(merged)



In [None]:
def merge_dicts(dict_left, dict_right):
  merged_dict = dict_left.copy()
  merged_dict.update(dict_right)
  return merged_dict


dict_1 = {"name": "rodrigo", "age": 31}
dict_2 = {"dog": "Kiara"}

merged = merge_dicts(dict_1, dict_2)

print(merged)
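
One detail worth checking is what happens when both dicts share a key. The sketch below reuses the `copy()` / `update()` version of `merge_dicts`; the second dict with an `"age"` key is a made-up input for illustration. The value from the right dict wins, and the inputs are left unchanged.

```python
def merge_dicts(dict_left, dict_right):
  merged_dict = dict_left.copy()
  merged_dict.update(dict_right)
  return merged_dict


dict_1 = {"name": "rodrigo", "age": 31}
dict_2 = {"age": 32, "dog": "Kiara"}  # hypothetical input with a clashing key

merged = merge_dicts(dict_1, dict_2)

print(merged)
print(dict_1)  # the original dict is not modified
```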


#### Character count

Write a function to count the number of times each character occurs in a string.

**Example**

input:
  * "I totally got this"

output:
  * {' ': 3, 'I': 1, 'a': 1, 'g': 1, 'h': 1, 'i': 1, 'l': 2, 'o': 2, 's': 1, 't': 4, 'y': 1}

Hints: 
* use a dict to store the values you're counting
* the method *.get* can help you when a key is not in the dict

In [None]:
def count_letters(s):
  dict_x = {}
  for letter in s:
    if letter in dict_x:
      dict_x[letter] += 1
    else:
      dict_x[letter] = 1

  return dict_x

 
print(count_letters("I totally got this"))


In [None]:
def count_letters(s):
  dict_x = {}
  for letter in s:
    old_value = dict_x.get(letter, 0)
    dict_x[letter] = old_value + 1

  return dict_x

 
print(count_letters("I totally got this"))


#### Grades
We want to build a program which stores semester grades of a single student - one grade per subject, for example:
* Math => 5
* Science => 3
* English => 10
* German => 6
 
We want our program to ask the user whether they want to add a grade or to read one: 
* If read: ask the user for a subject, and print the grade for the respective subject.
* If add, ask the user for a subject and a grade. Then:
  * if the subject already has a grade - update it
  * if there is no such subject stored - add it with the given grade


**Bonus**

If the user asks to read a grade for a subject you have no grade for, print all subjects that we have grades for.  
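
A minimal sketch of the core logic, under a few assumptions: the function names `add_grade` and `read_grade` are invented for this example, and the `input()` prompts are left out so the logic can be run and checked directly. In the real program you would call these functions with the values typed by the user.

```python
grades = {"Math": 5, "Science": 3, "English": 10, "German": 6}

def add_grade(grades, subject, grade):
  ''' Adds a new subject or updates the grade of an existing one '''
  grades[subject] = grade

def read_grade(grades, subject):
  ''' Returns the grade as text, or the known subjects if there is no grade '''
  if subject in grades:
    return "{}: {}".format(subject, grades[subject])
  return "No grade for {}. Known subjects: {}".format(subject, ", ".join(grades))

add_grade(grades, "Math", 7)      # update an existing subject
add_grade(grades, "History", 8)   # add a new subject

print(read_grade(grades, "Math"))
print(read_grade(grades, "Art"))  # bonus: lists the subjects we do have
```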

#### Grades Challenge!



The aim of this exercise is to make a gradebook for a teacher's students.

Given the following example dicts:

In [None]:
lloyd = {
  "name": "Lloyd",
  "homework": [90.0,97.0,75.0,92.0],
  "quizzes": [88.0,40.0,94.0],
  "tests": [75.0,90.0]
}
alice = {
  "name": "Alice",
  "homework": [100.0, 92.0, 98.0, 100.0],
  "quizzes": [82.0, 83.0, 91.0],
  "tests": [89.0, 97.0]
}
tyler = {
  "name": "Tyler",
  "homework": [0.0, 87.0, 75.0, 22.0],
  "quizzes": [0.0, 75.0, 78.0],
  "tests": [100.0, 100.0]
}

1. Create a list of students. For each student in your students list, print out that student's data, as follows:

  - print the student's name
  - print the student's homework
  - print the student's quizzes
  - print the student's tests


2. Write a function average that takes a list of numbers and returns the average.

  - Define a function called average that has one argument, numbers.
  - Use the built-in sum() function with the numbers list as its argument and store the result in total.
  - Use float() to convert total and store the result in total.
  - Divide total by the length of the numbers list. Use the built-in *len()* function to calculate that.



3. Write a function called get_average that takes a student dictionary (like lloyd, alice, or tyler) as input and returns that student's weighted average.

  - Define a function called get_average that takes one argument called student.
  - Make a variable homework that stores the average() of student["homework"].
  - Do the same for "quizzes" and "tests".
  - Multiply the 3 averages by their weights and return the sum of those three. The weight for homework is 10%, for quizzes it's 30% and for tests it's 60%.


4. Define a new function called get_letter_grade that has one argument called score. Expect score to be a number.

  - Inside your function, test score using a chain of if / elif / else statements, like so:

```
If score is 90 or above: return "A"
Else if score is 80 or above: return "B"
Else if score is 70 or above: return "C"
Else if score is 60 or above: return "D"
Otherwise: return "F"
```

  - Finally, test your function. Call your get_letter_grade function with the result of get_average(lloyd). Print the resulting letter grade.

5. Define a function called get_class_average that has one argument, students. You can expect students to be a list containing your three students.

  - First, make an empty list called results.
  - For each student item in the class list, calculate get_average(student) and then call results.append() with that result.
  - Finally, return the result of calling average() with results.


6. Finally, print out the result of calling get_class_average with your students list. Your students should be [lloyd, alice, tyler].

7. Then, print the result of get_letter_grade for the class's average.
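
The steps above can be sketched as follows. This is one possible solution, not the only one; it follows the function names given in the exercise and repeats the three student dicts so that it runs on its own.

```python
lloyd = {"name": "Lloyd", "homework": [90.0, 97.0, 75.0, 92.0],
         "quizzes": [88.0, 40.0, 94.0], "tests": [75.0, 90.0]}
alice = {"name": "Alice", "homework": [100.0, 92.0, 98.0, 100.0],
         "quizzes": [82.0, 83.0, 91.0], "tests": [89.0, 97.0]}
tyler = {"name": "Tyler", "homework": [0.0, 87.0, 75.0, 22.0],
         "quizzes": [0.0, 75.0, 78.0], "tests": [100.0, 100.0]}

def average(numbers):
  total = float(sum(numbers))
  return total / len(numbers)

def get_average(student):
  homework = average(student["homework"])
  quizzes = average(student["quizzes"])
  tests = average(student["tests"])
  # Weights: homework 10%, quizzes 30%, tests 60%
  return 0.1 * homework + 0.3 * quizzes + 0.6 * tests

def get_letter_grade(score):
  if score >= 90:
    return "A"
  elif score >= 80:
    return "B"
  elif score >= 70:
    return "C"
  elif score >= 60:
    return "D"
  else:
    return "F"

def get_class_average(students):
  results = []
  for student in students:
    results.append(get_average(student))
  return average(results)

students = [lloyd, alice, tyler]
class_average = get_class_average(students)

print(get_letter_grade(get_average(lloyd)))
print(class_average)
print(get_letter_grade(class_average))
```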


### Bonus Challenge

#### Greetings bot

Write a program that tells the user how to greet people in different languages. Your program should support: 
* English 
* German
* Italian
* French
* Portuguese

**Example:**
```
program > Which language?
user > English
program > Greeting: Hello
```
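
A dictionary is a natural fit here. The sketch below is one way to do it; the `greet` function, the lowercase keys and the fallback message are assumptions of this example, and the `input()` call is left out so the lookup can be tested directly.

```python
greetings = {
  "english": "Hello",
  "german": "Hallo",
  "italian": "Ciao",
  "french": "Bonjour",
  "portuguese": "Olá"
}

def greet(language):
  ''' Looks up the greeting, ignoring case and surrounding spaces '''
  key = language.strip().lower()
  if key in greetings:
    return "Greeting: " + greetings[key]
  return "Sorry, I don't know " + language

print(greet("English"))
print(greet("Klingon"))
```

To make it interactive, the argument could come from `input("Which language? ")`.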



#### Substitution cipher

Write a program that implements a simple [substitution cipher](https://en.wikipedia.org/wiki/Substitution_cipher). 

It should take 3 strings from the user: 
* the original allowed alphabet, 
* the ciphertext alphabet (of the same length as the original allowed alphabet),
* the text to encipher (any length, containing symbols from the original alphabet). 

The program should output the enciphered text (all symbols of the original alphabet in the text are substituted by the corresponding symbols of the ciphertext alphabet, see the example).

**Example:**

Input:  
(original allowed alphabet): 
```
abcd
```
(ciphertext alphabet): 
```
*d%#
```

(text to encipher): 
```
abacabadaba
```

Output:  
```
*d*%*d*#*d*
```
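
One possible sketch builds a dictionary that maps each original symbol to its ciphertext symbol. The function name `encipher` is invented for this example, and leaving symbols outside the original alphabet unchanged is an assumption; the exercise does not say what to do with them.

```python
def encipher(text, original_alphabet, cipher_alphabet):
  ''' Substitutes every symbol of text using the two alphabets '''
  if len(original_alphabet) != len(cipher_alphabet):
    raise ValueError("Both alphabets must have the same length")

  # Pair up the symbols position by position: 'a' -> '*', 'b' -> 'd', ...
  substitution = dict(zip(original_alphabet, cipher_alphabet))

  # Symbols that are not in the original alphabet are kept as they are
  return "".join(substitution.get(symbol, symbol) for symbol in text)

print(encipher("abacabadaba", "abcd", "*d%#"))
```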
