## Motivation for dictionaries
To see why dictionaries are useful, have a look at the two lists defined in the script. **countries** contains the names of some European countries. **capitals** lists the names of their capitals, in the same order.

### Instructions

- Use the **index()** method on **countries** to find the index of **'germany'**. Store this index as **ind_ger**.
- Use **ind_ger** to access the capital of Germany from the capitals list. Print it out.



In [2]:
# Definition of countries and capitals
countries = ['spain', 'france', 'germany', 'norway']
capitals = ['madrid', 'paris', 'berlin', 'oslo']

# Get the index of 'germany', then use it to look up the capital
ind_ger = countries.index('germany')
print(capitals[ind_ger])

berlin
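
The two-list approach works only as long as both lists stay in the same order, and **index()** raises a **ValueError** when the country is missing. As a minimal sketch, the lookup is wrapped below in a hypothetical helper named **capital_of** (not part of the exercise) to make both points visible:

```python
countries = ['spain', 'france', 'germany', 'norway']
capitals = ['madrid', 'paris', 'berlin', 'oslo']

def capital_of(country):
    """Look up a capital via the position of the country in the parallel list."""
    try:
        position = countries.index(country)  # linear scan through the list
    except ValueError:
        return None  # the country is not in the list
    return capitals[position]

print(capital_of('germany'))
print(capital_of('italy'))
```

Every lookup needs two steps and an explicit check for missing entries, which is the inconvenience dictionaries remove.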


## Create dictionary
The **countries** and **capitals** lists are again available in the script. It's your job to convert this data to a dictionary where the country names are the keys and the capitals are the corresponding values. As a refresher, here is a recipe for creating a dictionary:

**my_dict = { <br>
   "key1":"value1", <br>
   "key2":"value2", <br>
} <br>**

In this recipe, both the keys and the values are strings. This will also be the case for this exercise.

### Instructions

- With the strings in **countries** and **capitals**, create a dictionary called **europe** with 4 key:value pairs. Beware of capitalization! Make sure you use lowercase characters everywhere.
- Print out **europe** to see if the result is what you expected.

In [23]:
# Build the dictionary by pairing each country with its capital
europe = {}

for country, capital in zip(countries, capitals):
    europe[country] = capital

print(europe)


{'spain': 'madrid', 'france': 'paris', 'germany': 'berlin', 'norway': 'oslo'}
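
Because **zip()** already yields (country, capital) pairs, the same dictionary could also be built without an explicit loop. The sketch below shows two equivalent alternatives; the names **europe_from_zip** and **europe_from_comprehension** are made up for this illustration.

```python
countries = ['spain', 'france', 'germany', 'norway']
capitals = ['madrid', 'paris', 'berlin', 'oslo']

# dict() accepts an iterable of (key, value) pairs
europe_from_zip = dict(zip(countries, capitals))

# A dictionary comprehension builds the same mapping
europe_from_comprehension = {
    country: capital for country, capital in zip(countries, capitals)
}

print(europe_from_zip)
print(europe_from_zip == europe_from_comprehension)
```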


## Access dictionary
If the keys of a dictionary are chosen wisely, accessing the values in a dictionary is easy and intuitive. For example, to get the capital for France from **europe** you can use:

**europe['france']**

Here, **'france'** is the key, and the corresponding value **'paris'** is returned.

### Instructions

- Check out which keys are in **europe** by calling the **keys()** method on **europe**. Print out the result.
- Print out the value that belongs to the key **'norway'**.

In [15]:
print(europe.keys())

print(europe['norway'])

dict_keys(['spain', 'france', 'germany', 'norway'])
oslo
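
Square brackets raise a **KeyError** when the key does not exist. As a small sketch of how to handle that case, the example below uses the **get()** method, which returns **None** (or a fallback you supply) instead of raising an error:

```python
europe = {'spain': 'madrid', 'france': 'paris', 'germany': 'berlin', 'norway': 'oslo'}

# get() returns None for a missing key instead of raising KeyError
print(europe.get('italy'))

# A second argument sets the fallback value
print(europe.get('italy', 'unknown'))

# Square brackets raise KeyError for a missing key
try:
    europe['italy']
except KeyError as error:
    print('missing key:', error)
```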


## Dictionary Manipulation (1)
If you know how to access a dictionary, you can also assign a new value to it. To add a new key-value pair to **europe** you can use something like this:

**europe['iceland'] = 'reykjavik'**

### Instructions

- Add the key **'italy'** with the value **'rome'** to **europe**.
- To assert that **'italy'** is now a key in **europe**, print out **'italy' in europe**.
- Add another key:value pair to europe: **'poland'** is the key, **'warsaw'** is the corresponding value.
- Print out **europe**.

In [18]:
europe['italy'] = 'rome'

print('italy' in europe)

europe['poland'] = 'warsaw'

print(europe)

True
{'spain': 'madrid', 'france': 'paris', 'germany': 'berlin', 'norway': 'oslo', 'italy': 'rome', 'poland': 'warsaw'}
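
Note that the **in** operator checks the keys of a dictionary, not its values. The sketch below illustrates the difference on a shortened version of **europe**; the replacement value **'roma'** is made up purely for illustration.

```python
europe = {'spain': 'madrid', 'italy': 'rome'}

print('italy' in europe)          # membership test against the keys
print('rome' in europe)           # 'rome' is a value, so this is not found
print('rome' in europe.values())  # test against the values explicitly

# Assigning to an existing key overwrites its value; no new pair is added
europe['italy'] = 'roma'
print(len(europe))
```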


## Dictionary Manipulation (2)
Somebody thought it would be funny to mess with your accurately generated dictionary. An adapted version of the **europe** dictionary is available in the script.

Can you clean up? Do not do this by adapting the definition of **europe**, but by adding Python commands to the script to update and remove key:value pairs.

### Instructions

- The capital of Germany is not **'bonn'**; it's **'berlin'**. Update its value.
- Australia is not in Europe, Austria is! Remove the key **'australia'** from **europe**.
- Print out **europe** to see if your cleaning work paid off.

In [20]:
# Deliberately introduce the errors described in the exercise
europe['germany'] = 'bonn'
europe['australia'] = 'vienna'
print(europe)

# Update the capital of Germany
europe['germany'] = 'berlin'

# Remove the 'australia' entry
del europe['australia']

print(europe)

{'spain': 'madrid', 'france': 'paris', 'germany': 'bonn', 'norway': 'oslo', 'italy': 'rome', 'poland': 'warsaw', 'australia': 'vienna'}
{'spain': 'madrid', 'france': 'paris', 'germany': 'berlin', 'norway': 'oslo', 'italy': 'rome', 'poland': 'warsaw'}
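
Besides **del**, the **pop()** method also removes a key, and it returns the removed value. The sketch below, on a shortened dictionary, shows both and how each behaves when the key is already gone:

```python
europe = {'germany': 'berlin', 'australia': 'vienna'}

# pop() removes the key and returns its value
removed = europe.pop('australia')
print(removed)

# With a default argument, pop() does not raise if the key is missing
print(europe.pop('australia', None))

# del raises KeyError for a missing key
try:
    del europe['australia']
except KeyError:
    print("'australia' was already removed")
```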


## Dictionariception
Remember lists? They could contain anything, even other lists. Well, for dictionaries the same holds. Dictionaries can contain key:value pairs where the values are again dictionaries.

As an example, have a look at the script where another version of **europe** - the dictionary you've been working with all along - is coded. The keys are still the country names, but the values are dictionaries that contain more information than just the capital.

It's perfectly possible to chain square brackets to select elements. To fetch the population for Spain from **europe**, for example, you need:

**europe['spain']['population']**

### Instructions

- Use chained square brackets to select and print out the capital of France.
- Create a dictionary, named **data**, with the keys **'capital'** and **'population'**. Set them to **'rome'** and **59.83**, respectively.
- Add a new key-value pair to **europe**; the key is **'italy'** and the value is **data**, the dictionary you just built.

In [22]:
# Dictionary of dictionaries
europe = { 'spain': { 'capital':'madrid', 'population':46.77 },
           'france': { 'capital':'paris', 'population':66.03 },
           'germany': { 'capital':'berlin', 'population':80.62 },
           'norway': { 'capital':'oslo', 'population':5.084 } }

print(europe['france']['capital'])

data = {'capital':'rome', 'population':59.83}

europe['italy'] = data

print(europe)

paris
{'spain': {'capital': 'madrid', 'population': 46.77}, 'france': {'capital': 'paris', 'population': 66.03}, 'germany': {'capital': 'berlin', 'population': 80.62}, 'norway': {'capital': 'oslo', 'population': 5.084}, 'italy': {'capital': 'rome', 'population': 59.83}}
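
One detail worth knowing: **europe['italy'] = data** stores a reference to **data**, not a copy, so later changes to **data** show up inside **europe** as well. The sketch below demonstrates this on a shortened dictionary; the population value **60.0** is made up only to make the effect visible.

```python
europe = {'spain': {'capital': 'madrid', 'population': 46.77}}
data = {'capital': 'rome', 'population': 59.83}

europe['italy'] = data       # stores a reference to data, not a copy
data['population'] = 60.0    # made-up value, only to show the effect
print(europe['italy']['population'])

europe['italy'] = dict(data)  # shallow copy: later edits to data no longer show up
data['population'] = 59.83
print(europe['italy']['population'])

# Loop over the outer dictionary and read from each inner dictionary
for country, info in europe.items():
    print(country, info['capital'])
```

If the inner dictionary should stay independent, copying it with **dict()** before storing it is one simple option.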


## Dictionary to DataFrame (1)
Pandas is an open-source library that provides high-performance, easy-to-use data structures and data analysis tools for Python. Sounds promising!

The DataFrame is one of Pandas' most important data structures. It's basically a way to store tabular data where you can label the rows and the columns. One way to build a DataFrame is from a dictionary.

In the exercises that follow you will be working with vehicle data from different countries. Each observation corresponds to a country and the columns give information about the number of vehicles per capita, whether people drive left or right, and so on.

Three lists are defined in the script:

- **names**, containing the country names for which data is available.
- **dr**, a list of booleans that tells whether people drive on the right (**True**) or on the left (**False**) in the corresponding country.
- **cpc**, the number of motor vehicles per 1000 people in the corresponding country.

Each dictionary key is a column label and each value is a list which contains the column elements.

### Instructions

- Import **pandas** as **pd**.
- Use the pre-defined lists to create a dictionary called **my_dict**. There should be three key:value pairs:
  - key **'country'** and value **names**.
  - key **'drives_right'** and value **dr**.
  - key **'cars_per_cap'** and value **cpc**.
- Use **pd.DataFrame()** to turn your dict into a DataFrame called **cars**.
- Print out **cars** and see how beautiful it is.

In [25]:
# Pre-defined lists
names = ['United States', 'Australia', 'Japan', 'India', 'Russia', 'Morocco', 'Egypt']
dr =  [True, False, False, False, True, True, True]
cpc = [809, 731, 588, 18, 200, 70, 45]

# Import pandas as pd
import pandas as pd

# Create dictionary my_dict with three key:value pairs
my_dict = {'country': names, 'drives_right': dr, 'cars_per_cap': cpc}

# Build a DataFrame cars from my_dict
cars = pd.DataFrame(my_dict)
print(cars)

         country  drives_right  cars_per_cap
0  United States          True           809
1      Australia         False           731
2          Japan         False           588
3          India         False            18
4         Russia          True           200
5        Morocco          True            70
6          Egypt          True            45
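
As a quick check on how the dictionary maps onto the table, the sketch below builds a shortened version of **cars** (three countries only) and inspects its shape and column labels. It also shows, as an assumption-free illustration of a common mistake, that lists of different lengths are rejected with a **ValueError**.

```python
import pandas as pd

my_dict = {'country': ['United States', 'Australia', 'Japan'],
           'drives_right': [True, False, False],
           'cars_per_cap': [809, 731, 588]}

cars = pd.DataFrame(my_dict)
print(cars.shape)          # (number of rows, number of columns)
print(list(cars.columns))  # column labels come from the dictionary keys

# Lists of different lengths cannot form a table
try:
    pd.DataFrame({'country': ['Japan', 'India'], 'cars_per_cap': [588]})
except ValueError as error:
    print('could not build DataFrame:', error)
```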


## Dictionary to DataFrame (2)
The Python code that solves the previous exercise is included in the script. Have you noticed that the row labels (i.e. the labels for the different observations) were automatically set to integers from 0 up to 6?

To solve this, a list **row_labels** has been created. You can use it to specify the row labels of the **cars** DataFrame. You do this by setting the **index** attribute of **cars**, which you can access as **cars.index**.

### Instructions

- Hit Run Code to see that, indeed, the row labels are not correctly set.
- Specify the row labels by setting **cars.index** equal to **row_labels**.
- Print out **cars** again and check if the row labels are correct this time.

In [26]:
# Definition of row_labels
row_labels = ['US', 'AUS', 'JPN', 'IN', 'RU', 'MOR', 'EG']

cars.index = row_labels

print(cars)

           country  drives_right  cars_per_cap
US   United States          True           809
AUS      Australia         False           731
JPN          Japan         False           588
IN           India         False            18
RU          Russia          True           200
MOR        Morocco          True            70
EG           Egypt          True            45
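
The row labels can also be supplied when the DataFrame is created, through the **index** argument of **pd.DataFrame()**. The sketch below uses a shortened version of the data (three countries only) and then selects a single value by row and column label with **loc**:

```python
import pandas as pd

my_dict = {'country': ['United States', 'Australia', 'Japan'],
           'drives_right': [True, False, False],
           'cars_per_cap': [809, 731, 588]}
row_labels = ['US', 'AUS', 'JPN']

# Pass the row labels directly to the constructor
cars = pd.DataFrame(my_dict, index=row_labels)
print(cars)

# Select one value by row label and column label
print(cars.loc['JPN', 'country'])
```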


## CSV to DataFrame (1)
Putting data in a dictionary and then building a DataFrame works, but it's not very efficient. What if you're dealing with millions of observations? In those cases, the data is typically available as files with a regular structure. One of those file types is the CSV file, which is short for "comma-separated values".

To import CSV data into Python as a Pandas DataFrame you can use **read_csv()**.

Let's explore this function with the same cars data from the previous exercises. This time, however, the data is available in a CSV file, named **cars.csv**. It is available in your current working directory, so the path to the file is simply **'cars.csv'**.

### Instructions

- To import CSV files you still need the **pandas** package: import it as **pd**.
- Use **pd.read_csv()** to import the **cars.csv** data as a DataFrame. Store this DataFrame as **cars**.
- Print out **cars**. Does everything look OK?

In [27]:
import pandas as pd

# Import the cars.csv data: cars
cars = pd.read_csv('cars.csv')
print(cars)

  Unnamed: 0  cars_per_cap        country  drives_right
0         US           809  United States          True
1        AUS           731      Australia         False
2        JAP           588          Japan         False
3         IN            18          India         False
4         RU           200         Russia          True
5        MOR            70        Morocco          True
6         EG            45          Egypt          True


## CSV to DataFrame (2)
Your **read_csv()** call to import the CSV data didn't generate an error, but the output is not entirely what we wanted. The row labels were imported as another column without a name.

Remember **index_col**, an argument of **read_csv()**, that you can use to specify which column in the CSV file should be used as a row label? Well, that's exactly what you need here!

Python code that solves the previous exercise is already included; can you make the appropriate changes to fix the data import?

### Instructions

- Run the code with *Run Code* and confirm that the first column should actually be used as row labels.
- Specify the **index_col** argument inside **pd.read_csv()**: set it to **0**, so that the first column is used as row labels.
- Has the printout of **cars** improved now?

In [28]:
# Use the first column of the CSV file as row labels
cars = pd.read_csv('cars.csv', index_col=0)
print(cars)

     cars_per_cap        country  drives_right
US            809  United States          True
AUS           731      Australia         False
JAP           588          Japan         False
IN             18          India         False
RU            200         Russia          True
MOR            70        Morocco          True
EG             45          Egypt          True
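
To try the effect of **index_col** without the actual file, the sketch below reads a small in-memory stand-in for **cars.csv**. The text in **csv_text** is an assumption reconstructed from the printed output above (three rows only, with an empty first header field); the real file may differ in detail.

```python
import io
import pandas as pd

# Assumed stand-in for cars.csv: the first header field is empty
csv_text = """,cars_per_cap,country,drives_right
US,809,United States,True
AUS,731,Australia,False
JAP,588,Japan,False
"""

# Without index_col, the unnamed first column becomes a regular column
without_index = pd.read_csv(io.StringIO(csv_text))
print(list(without_index.columns))

# With index_col=0, the first column is used as row labels
with_index = pd.read_csv(io.StringIO(csv_text), index_col=0)
print(list(with_index.index))
print(with_index.loc['JAP', 'country'])
```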
