# Dictionaries and Structuring Data

## Introduction

The last "drink generator" exercise was a little bit tedious. This is because the values for drinks and their respective ingredients were not linked.

A better data structure for mapping a recipe name to its recipe is the *dictionary*. Dictionaries resemble their real-world counterpart: they map a certain *key* to a certain *value*, like:

- hello -> Hallo
- world -> Welt

There are many other names for the same concept in other languages - for example, associative arrays or maps (since a dictionary *maps* or *associates* values to each other).

In addition to the book chapter about dictionaries, we will also take a look at *sets*. Sets are similar to the same concept in mathematics ("Mengen"): A collection of values, but without any fixed order and without duplicates.

**This notebook covers the [fifth chapter](https://automatetheboringstuff.com/2e/chapter5/) of the book.**

### Optional resources

You can find more information about dictionaries and structuring data in the Python documentation:
* [Dictionaries](https://docs.python.org/3/tutorial/datastructures.html#dictionaries)
* [Sets](https://docs.python.org/3/tutorial/datastructures.html#sets)

Relevant Real Python tutorials:
* [Dictionaries in Python](https://realpython.com/python-dicts/)
* [Sets in Python](https://realpython.com/python-sets/)

While not required for the exercises, it also pays off to familiarize yourself with the `collections` module in Python. It provides useful specialized dictionaries such as [`collections.defaultdict`](https://docs.python.org/3/library/collections.html#collections.defaultdict) (which automatically provides a default value for unknown keys) or [`collections.Counter`](https://docs.python.org/3/library/collections.html#collections.Counter) (which counts the elements of an iterable such as a string and provides the total count of each element):

* [Python documentation for `collections`](https://docs.python.org/3/library/collections.html)
* [collections — Container Data Types — PyMOTW 3](https://pymotw.com/3/collections/index.html)
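
As a rough illustration of the two classes mentioned above, the following sketch groups words by their first letter with a `defaultdict` and counts letters with a `Counter`. The variable names and the sample data are made up for this example.

```python
from collections import Counter, defaultdict

# Hypothetical sample data, not taken from the book.
words = ["apple", "avocado", "banana", "blueberry", "cherry"]

# defaultdict(list) creates an empty list the first time a key is accessed.
by_letter = defaultdict(list)
for word in words:
    by_letter[word[0]].append(word)

print(dict(by_letter))

# Counter counts how often each element occurs in an iterable.
letter_counts = Counter("banana")
print(letter_counts["a"])            # 3
print(letter_counts.most_common(1))  # [('a', 3)]
```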

## Summary

### Dictionaries
Reconsider the example where we assigned some characteristics to a person. Now, in dictionary-form:

In [10]:
harry = {
    "hair": "black",
    "eyes": "green",
    "feature": "scar",
}
eyecolor = harry["eyes"]

print(f"Harry has {eyecolor} eyes.")

Harry has green eyes.


As you can see, dictionaries are similar to lists, but we always specify pairs of *key* and *value*. In this example, we're using strings as keys - but many other data types can also be used as keys (some, like lists, can't).
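
To make the remark about key types more concrete, here is a small sketch with made-up data. It assumes nothing beyond built-in types: tuples and numbers work as keys, while a list is rejected with a `TypeError`.

```python
# Hypothetical example: coordinates as tuple keys.
grid = {(0, 0): "start", (2, 3): "treasure"}
print(grid[(2, 3)])  # treasure

# Numbers work as keys as well.
squares = {1: 1, 2: 4, 3: 9}
print(squares[3])  # 9

# Lists are mutable and therefore not hashable, so they cannot be keys.
try:
    broken = {[0, 0]: "start"}
except TypeError as error:
    print(f"TypeError: {error}")
```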

Curly braces are used to start and end a dictionary. Inside the curly braces, there are key-value-pairs separated by commas. Each pair consists of a *key*, the colon (`:`) as a separator, and the *value*.

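Dictionaries can be modified after creating them, so they are *mutable*. But, unlike lists, they cannot be sliced: items are accessed by key, not by position. (Since Python 3.7, dictionaries remember the order in which items were inserted, but they still do not support indexes or slices.)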

Changing or adding an element works much like it does with lists:

In [14]:
# Modify
harry["eyes"] = "grey-green"
print(f"Harry has {harry['eyes']} eyes.")

# Add
harry["favorite_drink"] = "fire whiskey"
print(f"Harry drinks {harry['favorite_drink']} for breakfast.")

Harry has grey-green eyes.
Harry drinks fire whiskey for breakfast.


#### Keys, Values, Items
Have a look at the following methods by running the code. Study the output.

In [3]:
print(harry.keys())
print(harry.values())
print(harry.items())

dict_keys(['hair', 'eyes', 'feature'])
dict_values(['black', 'green', 'scar'])
dict_items([('hair', 'black'), ('eyes', 'green'), ('feature', 'scar')])


- `dict.keys()`: Returns all the keys in the dictionary.
- `dict.values()`: Returns all the values in the dictionary.
- `dict.items()`: Returns all the items in (key, value)-tuples.

You can treat these return values much like a list: you can loop over them, check their length and test for membership. They are of a special type (`dict_keys`, `dict_values` and `dict_items`), a so-called *view* on the existing data, which avoids copying the data into a new list. Unlike lists, views cannot be indexed; use `list(...)` to convert them if needed.
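
The following sketch, using a made-up `pet` dictionary, shows how the result of `keys()` stays connected to the dictionary and how it can be turned into a regular list.

```python
pet = {"name": "Hedwig", "species": "owl"}  # hypothetical sample data
keys = pet.keys()

# The result can be tested for membership ...
print("name" in keys)  # True

# ... and it reflects later changes to the dictionary.
pet["color"] = "white"
print(len(keys))  # 3

# Indexing is not supported; convert to a list first if you need it.
key_list = list(keys)
print(key_list[0])  # name
```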

#### `get()` and `setdefault()`
A `KeyError` is raised if you access a key that does not exist. There are multiple ways to avoid such an error.

In [None]:
# Check for the key:
if "size" in harry:
    print(harry["size"])
else:
    print("unknown")

# Use get():
size = harry.get("size", "unknown")
print(size)

The `get()`-method can have a default value as the second argument which will be returned if the key does not exist. The `get()`-method simplifies accessing data.

The `setdefault()`-method, on the other hand, simplifies data insertion. Use it when you want to insert a new key-value-pair without overwriting existing data: The item only gets created if there is not already an item with the same key. In either case, the method returns the value that is stored under the key afterwards.

In [11]:
harry.setdefault("size", 178)

178
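
A short sketch with made-up data shows that a second call to `setdefault()` leaves the existing value alone. The second half uses the method to count characters, which is one common use case.

```python
wizard = {"name": "Hermione"}  # hypothetical sample data

# The key is missing, so it is inserted and the new value is returned.
first = wizard.setdefault("house", "Gryffindor")

# The key exists now, so the stored value is kept and returned.
second = wizard.setdefault("house", "Slytherin")
print(first, second)  # Gryffindor Gryffindor

# Counting characters: make sure the key exists, then increment it.
counts = {}
for char in "hello":
    counts.setdefault(char, 0)
    counts[char] += 1
print(counts)  # {'h': 1, 'e': 1, 'l': 2, 'o': 1}
```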

#### Delete Items
To delete an element, just use the `pop()` method or the `del` statement.

In [12]:
harry.pop("hair")
del harry["size"]
harry

{'eyes': 'green', 'feature': 'scar'}
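
The two ways of deleting behave slightly differently, as the following sketch with made-up data tries to show: `pop()` returns the removed value and accepts an optional default, while `del` raises a `KeyError` for a missing key.

```python
ron = {"hair": "red", "pet": "Scabbers"}  # hypothetical sample data

# pop() removes the item and returns its value.
pet = ron.pop("pet")
print(pet)  # Scabbers

# With a second argument, pop() returns that default for a missing key.
wand = ron.pop("wand", "no wand")
print(wand)  # no wand

# del raises a KeyError if the key does not exist.
try:
    del ron["wand"]
except KeyError:
    print("There is no 'wand' key to delete.")

print(ron)  # {'hair': 'red'}
```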

### Sets
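Another type of data structure is a _set_. A set is an _unordered_ collection of _unique_ elements. Sets themselves are mutable, but their elements must be immutable (hashable) values such as strings, numbers or tuples.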

In [3]:
songs = {
    "Dark Noise",
    "Elephant Shunned",
    "Elephant Shunned",
    "More",
    "Packard",
    "The Space in Between",
}
print(songs)  # Note how the duplicate element vanished!

{'Elephant Shunned', 'Dark Noise', 'Packard', 'The Space in Between', 'More'}


Because the elements have no fixed position (unordered), you cannot access single items via an index. But you can get the length of a set, combine two sets with each other (union) and even loop through a set, although the order of iteration is not defined.

In [2]:
more_songs = {"404 Not Found", "The Six Degrees Theory", "Our Broken Mind Embassy"}

complete_set = songs | more_songs  # union

print(f"Playing {len(complete_set)} tracks in total.")

for track in complete_set:
    print(f"Now playing: {track}")

Playing 8 tracks in total.
Now playing: The Six Degrees Theory
Now playing: 404 Not Found
Now playing: The Space in Between
Now playing: Our Broken Mind Embassy
Now playing: Packard
Now playing: Dark Noise
Now playing: More
Now playing: Elephant Shunned
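
Besides the union, sets support further operations. The sketch below uses two made-up sets to show intersection and difference, and it also demonstrates that elements can be added to and removed from a set.

```python
# Hypothetical sample data.
rock = {"More", "Packard", "Dark Noise"}
favorites = {"Packard", "404 Not Found"}

print(rock & favorites)  # intersection: {'Packard'}
print(rock - favorites)  # difference: in rock, but not in favorites

# Elements can be added and removed after the set was created.
favorites.add("More")
favorites.discard("404 Not Found")  # no error if the element is missing
print(favorites == {"Packard", "More"})  # True

# Membership tests are a typical use of a set.
print("More" in rock)  # True
```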


### Dictionary Comprehension
Yes, dictionary comprehensions exist and they are very useful, too. The syntax is a little bit more complicated, as there are two values you can change in one line:

```python
new_dict = {key: value for (key, value) in old_dict.items() if condition}
```
Unlike a list comprehension, a dictionary comprehension lets you manipulate both a key and a value, and the expression is framed with curly braces. The `key: value` part denotes the items (key-value-pairs) which will end up in the new dict.

At first sight, this might look a little bit complicated, but the following example should clarify things. It builds a new dictionary (`corrected_playlist = ...`) based on an existing one (`for (nr, name) in playlist.items()`). While doing so, it changes both the key (`nr.replace("Song", "Track")`) and the value (`f"{name} by Jan Blomqvist"`). Finally, it also filters certain elements (`if name.startswith("T")`).

In [5]:
playlist = {
    "Song 1": "Elephant Shunned",
    "Song 2": "The Space in Between",
    "Song 3": "The Six Degrees Theory",
    "Song 4": "Our Broken Mind Embassy",
}

corrected_playlist = {
    nr.replace("Song", "Track"): f"{name} by Jan Blomqvist"
    for (nr, name) in playlist.items()
    if name.startswith("T")
}

print(corrected_playlist)

{'Track 2': 'The Space in Between by Jan Blomqvist', 'Track 3': 'The Six Degrees Theory by Jan Blomqvist'}
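
A few more variations may help to get used to the syntax. The data in this sketch is made up; it shows swapping keys and values, building a dictionary from keys only, and combining a filter with a calculation.

```python
# Hypothetical sample data: track lengths in seconds.
durations = {"Packard": 412, "More": 365, "Dark Noise": 298}

# Swap keys and values (only safe if the values are unique).
by_duration = {seconds: title for title, seconds in durations.items()}

# Iterating over a dictionary yields its keys: title -> number of characters.
title_lengths = {title: len(title) for title in durations}

# Keep only tracks longer than six minutes and convert to minutes.
long_tracks = {
    title: round(seconds / 60, 1)
    for title, seconds in durations.items()
    if seconds > 360
}
print(long_tracks)  # {'Packard': 6.9, 'More': 6.1}
```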


## Concluding Remarks: Loop Like A Native

In Python, there are many ways to loop over data structures such as dictionaries or lists. Especially if you have learned other programming languages first, you might be tempted to use the constructs you already know. Those might work in Python, but they are not always "Pythonic" - and Python offers several elegant ways to write loops that are easy to read. You have learned them over the last couple of labs, but we want to conclude the topic on data structures with this link: [Loop Like a Native by Ned Batchelder](https://nedbatchelder.com/text/iter.html).

It discusses the specifics of looping over data in Python and how to do so elegantly. Please be aware: The article is written for Python 2.x, which (among other things) has different `print` statements than Python 3.x. The content is good to know nevertheless.
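
As a brief reminder in Python 3 syntax, the following sketch collects a few of the loop idioms from the previous labs. It is not taken from the article, and the sample data is made up.

```python
tracks = ["Packard", "More", "Dark Noise"]  # hypothetical sample data
minutes = [7, 6, 5]

# Loop over the elements directly instead of over indexes.
for track in tracks:
    print(track)

# enumerate() provides a counter if you need the position as well.
for number, track in enumerate(tracks, start=1):
    print(f"{number}. {track}")

# zip() walks through several sequences in parallel.
for track, length in zip(tracks, minutes):
    print(f"{track}: {length} min")

# items() yields key and value together when looping over a dictionary.
playlist = dict(zip(tracks, minutes))
for track, length in playlist.items():
    print(f"{track} -> {length}")
```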

## Exercises
### Exercise 1: Drink-Generator with Dicts
Rewrite the drink-generator, but this time, use a dictionary instead of two lists. The requirements are only slightly different:

* New: Define the dictionary `drinks` (see below). It should contain the name of a drink as key, and the required ingredients as the corresponding value. Refer to the previous exercise to get the ingredients and the name of the drinks. 
* `find_drink` accepts three ingredients as separate strings and the drinks dictionary. 
* The function returns the name of the drink possible to make using these ingredients. 
* If no matching drink can be found, the function returns `None`. 

Hint for a more elegant solution, not required: Sets are a really nice feature to check if some items are a part of a larger group of items - namely, if they are a subset. [Check out the Real Python tutorial](https://realpython.com/python-sets/) to learn more about sets and set operations in detail.
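
To illustrate the subset idea from the hint, here is a minimal sketch with made-up data that is unrelated to the drinks in this exercise.

```python
# Hypothetical sample data.
pantry = {"flour", "eggs", "milk", "butter", "sugar"}
pancakes = {"flour", "eggs", "milk"}
omelette = {"eggs", "cheese"}

# issubset() is True if every element is also in the other set.
print(pancakes.issubset(pantry))  # True
print(omelette <= pantry)         # False, "cheese" is missing

# Lists can be converted with set() before comparing.
print(set(["eggs", "milk"]) <= pantry)  # True
```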

Expected output:
```Python
>>> find_drink("cachaca", "sugar", "lime", drinks)
"caipirinha"

>>> find_drink("ice", "olives", "vodka", drinks)
"vodka martini"
>>> find_drink("ice", "olives", "vermouth", drinks)
"vodka martini"

>>> find_drink("white rum", "gin", "ice", drinks)
None
```

In [1]:
def find_drink(ingredient1, ingredient2, ingredient3, drinks):
    # Return the first drink whose recipe contains all three ingredients.
    for drink, ingredients in drinks.items():
        if (
            ingredient1 in ingredients
            and ingredient2 in ingredients
            and ingredient3 in ingredients
        ):
            return drink
    return None


drinks = {
    "caipirinha": ["cachaca", "sugar", "lime"],
    "mojito": ["white rum", "sugar cane juice", "lime juice", "soda water", "mint"],
    "gin tonic": ["gin", "tonic water", "ice"],
    "vodka martini": ["vodka", "vermouth", "ice", "olives"],
}


# Your code should work with the example below, but you're free to change it.
print(find_drink("cachaca", "sugar", "lime", drinks))
print(find_drink("ice", "olives", "vodka", drinks))
print(find_drink("ice", "olives", "vermouth", drinks))
print(find_drink("white rum", "gin", "ice", drinks))


caipirinha
vodka martini
vodka martini
None


### Exercise 2: Get and Set

Implement your own versions of the `get()` and `setdefault()` methods as functions. `get()` tries to retrieve a value based on a key; if the key is not found, it returns a default value. `setdefault()` adds a key-value pair to a dictionary if the key does not exist yet, and returns the value stored under the key, also if the key already existed. Do not use the built-in dictionary methods.

Expected output:

```python
>>> harry = {"hair": "black", "eyes": "green", "feature": "scar"}
>>> get(harry, "eyes", "blue")
'green'
>>> get(harry, "size", 0)
0
>>> setdefault(harry, "size", 178)
178
>>> get(harry, "size", 0)
178
```

In [2]:
def get(dictionary, key, default):
    # Return the stored value unchanged, or the default if the key is missing.
    if key in dictionary:
        return dictionary[key]
    return default


def setdefault(dictionary, key, value):
    # Only insert the value if the key does not exist yet.
    if key in dictionary:
        return dictionary[key]
    dictionary[key] = value
    return value


# Your code should work with the example below, but you're free to change it.
harry = {"hair": "black", "eyes": "green", "feature": "scar"}
print(get(harry, "eyes", "blue"))
print(get(harry, "size", 0))
print(setdefault(harry, "size", 178))
print(get(harry, "size", 0))

green
0
178
178


### Exercise 3: Cliques

This exercise aims to make you familiar with set operations, since they are a very useful feature in daily coding. Consider the crew members, their ships and roles below and answer the following questions **using set operations**:

* Who is male?
* Who is engineer AND first officer?
* Who is security on the NCC-1701-D?
* Who are the female pilots?
* Who is male security on the Rocinante?

Note: In order to answer some questions, you may need to create additional set(s).

In [None]:
# roles (don't change!)
engineers = {"Naomi Nagata", "Geordi La Forge", "Kaylee Frye"}
security = {"Amos Burton", "Bobbie Draper", "Worf", "Tasha Yar", "Jayne Cobb"}
captains = {"Jim Holden", "Jean-Luc Picard", "Mal Reynolds"}
firstofficers = {"Naomi Nagata", "Zoë Washburne", "William T. Riker"}
pilots = {"Alex Kamal", "Data", "Hoban Washburne"}
female = {"Naomi Nagata", "Kaylee Frye", "Bobbie Draper", "Tasha Yar", "Zoë Washburne"}

# ships (don't change!)
firefly = {"Kaylee Frye", "Jayne Cobb", "Mal Reynolds", "Hoban Washburne", "Zoë Washburne"}
ncc1701d = {"Worf", "Tasha Yar", "Geordi La Forge", "William T. Riker", "Data", "Jean-Luc Picard"}
rocinante = {"Naomi Nagata", "Amos Burton", "Bobbie Draper", "Jim Holden", "Alex Kamal"}

# More operations if needed

male = engineers.union(security, captains, firstofficers, pilots).difference(female)
engineer_and_firstofficer = engineers.intersection(firstofficers)
security_ncc1701d = security.intersection(ncc1701d)
pilot_female = female.intersection(pilots)
male_security_rocinante = male.intersection(security).intersection(rocinante)

# ---- no changes below! ----
print(male)
print(engineer_and_firstofficer)
print(security_ncc1701d)
print(pilot_female)
print(male_security_rocinante)

### Exercise 4: Battleships

In this exercise, you will implement a Battleships ("Schiffliversenkis") game. You'll start with defining the ship's positions, then implement a first simplified version, and finally implement a more complete version of the game as an advanced exercise.

#### a) Defining the playing field

* The field consists of four-by-four cells (A1 to D4). Refer to the table below for the initial position of the ships. 
* Represent the playing field as a dictionary called `SHIPS`, where the key is a position as string, and the value a boolean whether there is a ship at this spot (`True`) or not (`False`). 

Initial position of battleships: 

|      | A    | B    | C    | D    |
| ---- | ---- | ---- | ---- | ---- |
| 1    | X    | X    |      | X    |
| 2    |      |      |      | X    |
| 3    |      |      |      | X    |
| 4    |      | X    |      |      |

Example: `SHIPS["A1"]` evaluates to `True` since there's a ship on it.

In [1]:
# You can run this cell to define `SHIPS`, then re-use the data in the exercises below.

SHIPS = {
    "A1": True,
    "A2": False,
    "A3": False,
    "A4": False,
    "B1": True,
    "B2": False,
    "B3": False,
    "B4": True,
    "C1": False,
    "C2": False,
    "C3": False,
    "C4": False,
    "D1": True,
    "D2": True,
    "D3": True,
    "D4": False,
}

print(SHIPS["A1"])

True
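
Typing out all sixteen cells works, but the same field could also be generated. The sketch below is one possible alternative and not part of the exercise: it assumes a helper set `ship_cells` (a name introduced here) that lists the occupied cells from the table, and derives the full dictionary with a comprehension.

```python
# Hypothetical alternative: list only the occupied cells ...
ship_cells = {"A1", "B1", "D1", "D2", "D3", "B4"}

# ... and derive the full four-by-four field from them.
generated = {
    f"{column}{row}": f"{column}{row}" in ship_cells
    for column in "ABCD"
    for row in range(1, 5)
}

print(generated["A1"])  # True
print(generated["C3"])  # False
print(len(generated))   # 16
```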


#### b) Simplified implementation
* Write a loop which asks the user for input where to place a bomb on the field.
  - Make sure to **use the argument to `input(...)` to display the prompt**, don't print it manually via `print`. Use the given `OUTPUT_PROMPT` constant.
  - If the user inputs something which isn't a valid ship position, ignore the input, and ask them for their next move.
* If a ship is hit, the player wins, the game prints `You won.` (use the given `OUTPUT_WIN` constant), and the game ends.
* Note: We play a simplified version of Battleships! All ships only have the size of one square. So A1 is a single ship, B1 is another single ship, which is incidentally adjacent to A1.

Expected output (the part after the `?` is what the user inputs, not part of what gets printed):

```
On which cell do you want to set off the bomb? A1
You won.

On which cell do you want to set off the bomb? A2
On which cell do you want to set off the bomb? C1
On which cell do you want to set off the bomb? D3
You won.
```

In [None]:
# Use these constants for printing messages, don't change them.
OUTPUT_PROMPT = "On which cell do you want to set off the bomb? "
OUTPUT_WIN = "You won."
# Use SHIPS from above, don't redefine it here.


def battleships():
    while True:
        userinput = input(OUTPUT_PROMPT)
        # Ignore anything that is not a valid cell, then check for a ship.
        if userinput in SHIPS and SHIPS[userinput]:
            print(OUTPUT_WIN)
            return


# Invocation to try your implementation, don't change this
battleships()

#### c) Battleships Enhanced

**This is an advanced exercise.** Consider finishing the other labs first, then coming back here.

Enhance the game a little further. The position of the battleships stays the same. Requirements:

* Make sure that the player only wins when all ships are hit, which means that the value of a field in `ships` changes when a ship is hit. 
* Use the given `ships` dictionary, which is a copy of the original `SHIPS`. The upper-case version is constant, so it shouldn't be changed.
* When a ship is hit, print `You hit a ship!` (use the given `OUTPUT_HIT` constant).
* When the player fails to hit a ship, print `You missed!` (use the given `OUTPUT_MISSED` constant).
* When the player misses three times in a row, they lose the game, and the game prints `You lost.` after the last `You missed!` message (use the given `OUTPUT_LOSE` constant).
* If a ship is hit, the player has three attempts again to hit a ship.
* As above, if the user inputs something which isn't a valid ship position, ignore the input, and ask them for their next move. This does *not* count as a hit/miss.

Expected output (win):

```
On which cell do you want to set off the bomb? A2
You missed!
On which cell do you want to set off the bomb? A3
You missed!
On which cell do you want to set off the bomb? A1
You hit a ship!
On which cell do you want to set off the bomb? B1
You hit a ship!
On which cell do you want to set off the bomb? B4
You hit a ship!
On which cell do you want to set off the bomb? D1
You hit a ship!
On which cell do you want to set off the bomb? D2
You hit a ship!
On which cell do you want to set off the bomb? D3
You hit a ship!
You won.
```

Expected output (lose):

```
On which cell do you want to set off the bomb? A2
You missed!
On which cell do you want to set off the bomb? A3
You missed!
On which cell do you want to set off the bomb? A4
You missed!
You lost.
```

In [None]:
# Use these constants for printing messages, don't change them.
OUTPUT_PROMPT = "On which cell do you want to set off the bomb? "
OUTPUT_WIN = "You won."
OUTPUT_LOSE = "You lost."
OUTPUT_HIT = "You hit a ship!"
OUTPUT_MISSED = "You missed!"
# Use SHIPS from above, don't redefine it here.


def battleships_enhanced():
    # Use this copy, don't touch SHIPS in your code.
    ships = SHIPS.copy()
    tries = 3
    # Keep playing as long as at least one ship is left.
    while any(ships.values()):
        if tries == 0:
            print(OUTPUT_LOSE)
            return
        userinput = input(OUTPUT_PROMPT)
        if userinput in ships:
            if ships[userinput]:
                # Hit: remove the ship and reset the remaining attempts.
                ships[userinput] = False
                tries = 3
                print(OUTPUT_HIT)
            else:
                print(OUTPUT_MISSED)
                tries -= 1
    print(OUTPUT_WIN)


# Invocation to try your implementation, don't change this
battleships_enhanced()
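
Because the game reads from `input()`, it is awkward to try out many scenarios by hand. One possible approach, sketched below, is a variant that replays a prepared list of moves and returns the result instead of printing it. The function `play`, its parameters and the small field are assumptions made for this illustration and are not part of the exercise.

```python
def play(moves, field, max_misses=3):
    """Replay a sequence of moves and return "won", "lost" or "unfinished"."""
    ships = field.copy()  # leave the original field untouched
    misses_left = max_misses
    for move in moves:
        if move not in ships:
            continue  # invalid input is ignored and does not count
        if ships[move]:
            ships[move] = False
            misses_left = max_misses
            if not any(ships.values()):
                return "won"
        else:
            misses_left -= 1
            if misses_left == 0:
                return "lost"
    return "unfinished"


# Small hypothetical field with two ships.
field = {"A1": True, "A2": False, "B1": True, "B2": False}
print(play(["A2", "A1", "B1"], field))  # won
print(play(["A2", "B2", "A2"], field))  # lost
print(play(["Z9", "A1"], field))        # unfinished
```

As in the implementation above, bombing a cell whose ship was already hit counts as a miss.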