<a name="top"></a>Overview: Dictionaries & I/O
===

* [Dictionaries](#dictionaries)
  * [Indexing](#dindex)
  * [Iteration](#diteration)
  * [Modifying](#dmodify)

* [Input/Output](#inputoutput)
  * [Reading files](#reading)
  * [Writing files](#writing)
  * [Reading and writing .csv files](#Pandas)
  * [User input](#userinteraction)

* [Exercise 07: Dictionaries & I/O](#exercise07)

**Learning goals:** By the end of this unit
* you will know how to store variables in dictionaries and retrieve them
* you will be able to read and write files
* you will be able to request and process input from the user

# <a name="dictionaries"></a>Dictionaries

Besides lists, dictionaries are another data type for a collection of several elements.

What they have in common with lists:
* a collection of variables that is itself stored in a variable
* any kind of variable can be stored

Differences:
* elements are not addressed by their position (index) but by their _key_
* curly braces `{}` are used in the declaration instead of square brackets `[]`
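
The two differences listed above can be seen side by side in a minimal sketch. The variable names and the fruit data are made up for illustration.

```python
# A list is declared with square brackets and indexed by position.
fruits_list = ['apple', 'banana', 'cherry']
second_fruit = fruits_list[1]

# A dictionary is declared with curly braces and indexed by key.
fruit_colors = {'apple': 'red', 'banana': 'yellow', 'cherry': 'dark red'}
banana_color = fruit_colors['banana']

print(second_fruit)
print(banana_color)
```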

Other names for this kind of data type include _map_, _associative array_, and _hash table/hash map_.

Dictionaries map _keys_ to _values_, e.g.:

* word $\rightarrow$ meaning or translation (dictionary)
* name $\rightarrow$ phone number (phone book)
* setting $\rightarrow$ value (program configuration)

##### Example

In [None]:
# a dictionary that uses strings as keys
data_types = {'integer': 'whole number',
              'float': 'decimal number',
              'string': 'sequence of characters',
              'list': 'collection of elements with an index'}

[top](#top)

## <a name="dindex"></a>Indexing

While lists are indexed with numbers, dictionaries use _keys_. Most of the time, integers or strings are used as keys.

In [None]:
# access an element
int_type = data_types['integer']
print(int_type)
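
The access above uses a string key. As a small sketch of the integer case, the following uses a made-up `month_names` dictionary. Note that an integer key is a label, not a position in the dictionary.

```python
# Hypothetical example data: integer keys mapped to string values.
month_names = {1: 'January', 2: 'February', 12: 'December'}

# 12 is a key here, not a position: the dictionary has only three elements.
last_month = month_names[12]

print(last_month)
print(len(month_names))
```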

To determine the number of elements in a dictionary, we can again use the `len()` function:

In [None]:
# number of stored data types
number_of_types = len(data_types)

print('The dictionary data_types contains {} elements.'
      .format(number_of_types))

Accessing an element that does not exist raises an error (**KeyError**):

In [None]:
data_types['bicycle']

With the `in` operator we can check whether a dictionary has a value stored for a given key:

In [None]:
test_types = ['integer', 'bicycle']

for test in test_types:
    if test in data_types:
        print('{} is in data_types'.format(test))
    else:
        print('{} is not in data_types'.format(test))
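
Besides the `in` check above, Python dictionaries also offer the `get()` method, which is not covered elsewhere in this unit. The sketch below assumes a shortened copy of `data_types` and a made-up fallback string `'unknown'`.

```python
# Shortened copy of the example dictionary, so this sketch runs on its own.
data_types = {'integer': 'whole number', 'float': 'decimal number'}

# get() returns the stored value if the key exists ...
known = data_types.get('integer', 'unknown')

# ... and the given fallback, instead of raising a KeyError, if it does not.
missing = data_types.get('bicycle', 'unknown')

print(known)
print(missing)
```

Which of the two to use is a matter of taste: `in` fits when the two cases need different handling, `get()` when a default value is enough.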

[top](#top)

## <a name="diteration"></a>Iterating over dictionaries

Like lists, dictionaries can be iterated over, but there are several ways to do so:
* iterating over the keys (the default)
* iterating over pairs of key and value

##### Iterating over keys

We have already used `for` loops to access the elements of a list. We now do the same with the keys of the dictionary. In this case the loop runs four times:

In [None]:
# iterate over the keys
for key in data_types:
    print("{} is a data type in Python.".format(key))

##### Iterating over key-value pairs

To iterate over the pairs of key and value, we can use the `dict.items()` method:

In [None]:
# iterate over key-value pairs
for key, value in data_types.items():
    print("{} is a {}.".format(key, value))

Of course we can also iterate over the keys and then index the dictionary with each key. However, this is slower and somewhat harder to read than the previous variant.

In [None]:
# iterates over the keys and looks up the values manually;
# this is slower than iterating over the pairs!
for key in data_types:
    print("{} is a {}.".format(key, data_types[key]))
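
If only the values are of interest, the `dict.values()` method can be used in the same way. It is not discussed elsewhere in this unit. A small sketch with a made-up `prices` dictionary:

```python
# Hypothetical example data: item name -> price.
prices = {'coffee': 2.5, 'tea': 2.0, 'cake': 3.5}

# Iterate over the values only and add them up.
total = 0
for price in prices.values():
    total = total + price

print(total)
```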

## <a name="dmodify"></a>Modifying dictionaries

Variables in dictionaries can be added, changed, and deleted.

##### Adding or changing elements

To add a value to the dictionary or to change a value, we index the dictionary with the corresponding key and assign a value:

In [None]:
data_types['dictionary'] = 'colection of key-valeu pairs'

for key, value in data_types.items():
    print("{} is a {}.".format(key, value))

That was misspelled, so we should correct it:

In [None]:
print('A dictionary is a {}.'.format(data_types['dictionary']))
data_types['dictionary'] = 'collection of key-value pairs'
print('A dictionary is a {}.'.format(data_types['dictionary']))
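
The same assignment syntax both adds and changes elements. Which of the two happens depends only on whether the key already exists. A minimal sketch with a made-up `settings` dictionary:

```python
# Hypothetical example data: setting name -> value.
settings = {'volume': 5}

# 'brightness' is a new key, so this adds an element.
settings['brightness'] = 7
count_after_add = len(settings)

# 'volume' already exists, so this overwrites its value.
settings['volume'] = 9
count_after_change = len(settings)

print(settings)
print(count_after_add, count_after_change)
```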

##### Removing elements

To delete an element from a dictionary, we again use the `del` statement:

In [None]:
del data_types['float']

for key, value in data_types.items():
    print("{} is a {}.".format(key, value))
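
Like indexing, `del` raises a **KeyError** if the key does not exist. One way to avoid this, sketched here with a made-up `phone_book`, is to check with `in` first:

```python
# Hypothetical example data: name -> phone number.
phone_book = {'Anna': '555-0101', 'Ben': '555-0102'}

for name in ['Ben', 'Clara']:
    if name in phone_book:
        # Safe: the key is known to exist.
        del phone_book[name]
        print('{} was removed.'.format(name))
    else:
        print('{} is not in the phone book.'.format(name))
```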

[top](#top)

# <span id="inputoutput"/>Input & Output

## <span id="reading"/>Reading files

In the previous lesson we already used ```open()``` to open files.  The ```open()``` function returns a **file object** that can be used to read from or write to a file.

By default a file is opened for reading text.  To read a single line of text, you can use the ```readline()``` method.  File objects are also **iterable**, which means that we can use a ```for``` loop to read from them line by line.  Both are demonstrated here:

In [None]:
# Open a file for reading text.
f = open("text_file.txt")

# Read a line, and print it.
line = f.readline()
print(line)

# Iterate it and print its contents.
for line in f:
    print(line)

# Close the file.  (Important!!!)
f.close()

Finally we close the file using the ```close()``` method.  If you don't do this, the file stays open for as long as the program runs, which can lead to hard-to-find problems such as written data that never reaches the disk, or the program running out of file handles.

Note that there are empty lines interleaved with the text in the output above.  These lines are not present in the original file.  They appear because Python reads the newline at the end of each line from the file, and then ```print()``` adds another one.  This can be solved with the ```rstrip()``` method:

In [None]:
# Open a file for reading text.
f = open("text_file.txt")

# Read a line, and print it.  Note the addition of rstrip()!
line = f.readline().rstrip()
print(line)

# Iterate it and print its contents.  Note the addition of rstrip()!
for line in f:
    line = line.rstrip()
    print(line)

# Close the file.  (Important!!!)
f.close()
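
The open/read/close pattern above can also be written with Python's `with` statement, which closes the file automatically at the end of the block. The following is a sketch only: the file name `example.txt` and its contents are made up, and the file is created in a temporary directory so that the example runs on its own.

```python
import os
import tempfile

# Create a small example file in a temporary directory
# (file name and contents are made up for this sketch).
path = os.path.join(tempfile.mkdtemp(), 'example.txt')
with open(path, 'w') as f:
    print('first line', file=f)
    print('second line', file=f)

# Read it back.  The file is closed automatically when the block ends,
# even if an error occurs inside it.
lines = []
with open(path) as f:
    for line in f:
        lines.append(line.rstrip())

print(lines)
print(f.closed)
```

With this form there is no `close()` call that could be forgotten.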

[top](#top)

## <span id="writing"/>Writing files

By default a file is opened for reading text.  If we need to write to a file, we need to tell Python to open it for writing.  We can do this by passing an extra argument, ```'w'```, to ```open()```.  Note that ```'w'``` overwrites an existing file of the same name.

Files that are opened for writing can be written to using the ```print()``` function that we're already familiar with.  We can tell it which file object to write to using the ```file=``` keyword argument.

In [None]:
# Open a file for writing text.
f = open('writing.txt', 'w')

# Write a line of text to the file.
print('This is a text file.', file=f)

# Write a few lines to it:
for number in range(1, 6):
    print('a number: {}'.format(number), file=f)

# And we close the file.
f.close()
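
The mode passed to `open()` decides what happens to content that is already in the file. The sketch below contrasts `'w'` with the append mode `'a'`, which is not used elsewhere in this unit. The file name `notes.txt` is made up, and the file lives in a temporary directory.

```python
import os
import tempfile

# Made-up file in a temporary directory, so nothing real is overwritten.
path = os.path.join(tempfile.mkdtemp(), 'notes.txt')

f = open(path, 'w')
print('first version', file=f)
f.close()

# Opening with 'w' again starts from an empty file.
f = open(path, 'w')
print('second version', file=f)
f.close()

# Opening with 'a' keeps the content and adds to the end.
f = open(path, 'a')
print('an appended line', file=f)
f.close()

# Read the whole file back to see the result.
f = open(path)
content = f.read()
f.close()

print(content)
```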

[top](#top)

## <span id="Pandas"/>Writing and reading .csv files

We can of course read and write files line by line, as shown above. However, for structured data this is tedious, since there are ready-to-use libraries for common file formats.
As an example (which you will probably need) we want to take a look at **c**omma-**s**eparated **v**alues (**csv**) files here.

First we import pandas, a data analysis library that can read and write csv files:

In [None]:
import pandas as pd

For a more detailed introduction to pandas, take a look at: http://pandas.pydata.org/pandas-docs/version/0.15/10min.html

Now we want to make our own csv file. Pandas uses a data type called *dataframe* for multi-column data, in which each column can have a different data type.
To initialize one, we first create a dictionary and then convert this dictionary to a dataframe.

In [None]:
# initialize a dictionary, which has two keys and for each key a list of five values
week_lunch_dict = {'Day': ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday'],
                   'Lunch': ['Salad', 'Schnitzel', 'Lasagna', 'Pizza', 'Fish filets']}
# convert this dictionary to a dataframe
week_lunch_dataframe = pd.DataFrame(data=week_lunch_dict)
# print our dataframe
print(week_lunch_dataframe)

Note that the keys in our dictionary were converted to the names of the columns of our dataframe.
Pandas can handle a variety of data types and is quite useful for working with data. Here we will only look at how to save and read csv files, though.

We can write our dataframe directly to a csv:

In [None]:
week_lunch_dataframe.to_csv('week_lunch.csv')
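
To see what a csv file looks like as plain text, the sketch below deliberately uses Python's built-in `csv` module instead of pandas, so that it runs without extra installs. It writes to an in-memory buffer rather than a real file, and the rows are a shortened copy of the lunch data. The file written by `to_csv()` has a similar shape, but by default pandas also writes the row index as an extra first column.

```python
import csv
import io

# Shortened copy of the lunch data: a header row followed by data rows.
rows = [['Day', 'Lunch'],
        ['Monday', 'Salad'],
        ['Tuesday', 'Schnitzel']]

# Write to an in-memory text buffer instead of a real file.
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator='\n')
writer.writerows(rows)

# The csv content is plain text: one line per row, values separated by commas.
csv_text = buffer.getvalue()
print(csv_text)

# Reading the same text back gives lists of strings again.
parsed_rows = list(csv.reader(io.StringIO(csv_text)))
print(parsed_rows)
```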

Now you will probably wonder whether loading csv files in pandas is just as easy as writing them.

Of course it is. Let us try it here:

There is a file called 'Rain_netherlands.csv' in your folder. It contains data from the UN database on rain in the Netherlands - see:
[UN data on rainfall in the Netherlands](http://data.un.org/Data.aspx?q=Rain&d=CLINO&f=ElementCode%3aBT%3bCountryCode%3aNL%3bStatisticCode%3a15&c=2,5,6,7,10,15,18,19,20,22,24,26,28,30,32,34,36,38,40,42,44,46&s=CountryName:asc,WmoStationNumber:asc,StatisticCode:asc&v=1)

We can simply load it like this:

In [None]:
rain = pd.read_csv('Rain_netherlands.csv')
rain.head()

[top](#top)

## <span id="userinteraction" />User interaction

Sometimes you might want your program to talk with the user.  If you just want to give some information to the user, you can use the ```print()``` function.  Using the ```input()``` function we can also get information from the user.

The ```input()``` function asks the user a question and waits for input.  The input is always returned as a string.

In [None]:
s = input('What is your name? ')
print('Your name is {}.'.format(s))
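
Because `input()` returns a string, numeric input has to be converted before it can be used in calculations. In this sketch a fixed string stands in for the call to `input()` so that the example runs without a user. The question and the value are made up.

```python
# Stand-in for: answer = input('How old are you? ')
answer = '27'

# '27' is a string, so it has to be converted before doing arithmetic.
age = int(answer)

print('Next year you will be {}.'.format(age + 1))
```

If the text is not a valid whole number, `int()` raises a `ValueError`.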

Here is a more elaborate example that uses the ```split()``` method to split a string into separate words.

In [None]:
s = input('Can you rhyme something for me? ')

# Split the string on every space.
words = s.split(' ')

word_num = 1
for word in words:
    print('word #{} is "{}".'.format(word_num, word))
    word_num = word_num + 1
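
Splitting on a single space, as above, produces empty strings when the user types two spaces in a row. Calling `split()` without an argument splits on any run of whitespace instead. A small sketch with a made-up string in place of user input:

```python
# Stand-in for user input; note the two spaces after 'roses'.
s = 'roses  are red'

# Splitting on every single space leaves an empty string in the list.
words_on_space = s.split(' ')

# Splitting without an argument treats runs of whitespace as one separator.
words_on_whitespace = s.split()

print(words_on_space)
print(words_on_whitespace)
```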

Here is another example that shows you how to make decisions based on input.

(Note that this is very general: you can use any string here, not just input from ```input()```!)

In [None]:
answer = input('If you\'re happy and you know it, clap your? ')
if answer == 'hands':
    print('If you\'re happy and you know it, stomp your feet!')
else:
    print('Hmm...  I always clap my hands...')
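
The comparison above only matches the exact string `'hands'`. One possible way to be more forgiving, sketched here with a fixed string in place of `input()`, is to remove surrounding whitespace with `strip()` and convert to lower case with `lower()` before comparing.

```python
# Stand-in for a call to input(); note the capital letter and the spaces.
answer = '  Hands '

# Remove surrounding whitespace and convert to lower case.
normalized = answer.strip().lower()

# The raw answer does not match, the normalized one does.
raw_matches = (answer == 'hands')
normalized_matches = (normalized == 'hands')

print(raw_matches)
print(normalized_matches)
```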

[top](#top)

# <span id="exercise07"/>Exercise 07: Dictionaries & I/O

**Dictionaries**

  1. Create a dictionary that contains the names and e-mail addresses of the tutors.  The names should be used as keys.
  2. Create a new dictionary also with names and e-mail addresses, but now use the e-mail addresses as keys.  Make the new dictionary by looping over the previous exercise's dictionary with `for`.
  3. Add the names and e-mail addresses of your neighbours to the previous exercise's dictionary.
  4. Print the new dictionary by looping over it with `for`.


**Lists & Dictionaries**

  You're doing an experiment with genetically modified mice.  The modifications are such that they get an extra brain.  Sometimes this works, but other times, they only get an extra kidney or spleen.

  1. In order to deal with your data, you decide to write a Python script.  For each mouse, you make a dictionary that holds the number of brains, kidneys, and spleens, and also the mouse's name.  Use 4 mice; you can make up some names and results.
  2. Put the dictionaries in a list.
  3. Print the number of the kidneys that the 3rd mouse has.
  4. You want to give each mouse a score.  The higher its score, the better your experiment went.  To do this, iterate over the list and print each mouse's name, followed by its score: the number of brains minus the number of kidneys and spleens.  Save the mouse's score in its dictionary.
  5. Print the name of the mouse that has the best score.


**Input/Output**

  In this exercise you'll write a program that reads a list of names and e-mail addresses from a file.  Your program should do the following:

  1. Read the names and e-mail addresses;
  2. Print what was read;
  3. Ask the user for a name, and print the e-mail address corresponding to that name.

  We've provided a file that contains e-mail addresses.  It's in a file called `email_addresses.txt`.  Each line in this file is formatted as follows:

  `[name] [e-mail-address]`

  In other words, on each line the name and the e-mail address are separated by a space.  There are only first names, which means that `name` doesn't contain spaces.

  You should store the list as a dictionary, so you can do easy lookups.  You can of course also use lists, but this is more cumbersome.

  Hint: use the `split()` function to split each line into a name, and an e-mail address.  The output of `split()` is a list: you can index it.

  Hint: string comparison is case-sensitive.  That is: 'guus' won't work, but 'Guus' will!

**Bonus: CSV files**

  Try to read one of your own csv files. Then change some of its values and save it under a different name.

[top](#top)