# -1. Disclaimer
Many of the materials are gently stolen from the following courses:
- **["A Python Course for the Humanities"](https://github.com/fbkarsdorp/python-course)** a course designed by Folgert Karsdorp and Maarten van Gompel
- and later modified by Mike Kestemont and Lars Wieneke for the course **["Programming for Linguistics and Literature"](https://github.com/mikekestemont/prog1617)**
- **["Python for text analysis"](https://github.com/cltl/python-for-text-analysis)** designed by H.D. van der Vliet and taught at the Vrije Universiteit
- **["How to Think Like a Computer Scientist"](http://www.greenteapress.com/thinkpython/thinkCSpy.pdf)** by Allen Downey, Jeffrey Elkner, Chris Meyers
- **["The Programming Historian"](https://programminghistorian.org/en/lessons)**: ["Fetch and Parse Data with OpenRefine"](https://programminghistorian.org/en/lessons/fetch-and-parse-data-with-openrefine) (by Evan P. Williamson) and ["Manipulating Strings in Python"](https://programminghistorian.org/en/lessons/manipulating-strings-in-python) by William J. Turkel and Adam Crymble

# 0. Before we kick off: Installing Jupyter Notebook

- Download Anaconda: https://www.anaconda.com/download
        Select the Python 3.6 Version
        Follow the installation instructions
- Download the Notebook and data [here](https://drive.google.com/drive/folders/1wIJ1hPu1BSg2_ocf5k-jcBqN1WZGJiSW?usp=sharing)
        Open Anaconda Navigator
        Launch Jupyter Notebook
        This should open a tab in your browser
        Go to the location where you cloned/unzipped the material downloaded from Github

# 1. Philosophy of the Course

- **We have time...** (if we don't get everything done, we just add an extra lesson)
- Coding is **not** difficult, but obtaining basic programming skills requires a **sustained effort**.
- With only a few basic skills you can go a long way (writing scripts vs. developing tools).
- We skip some of the details, but hope you get a feeling for what is possible, and why coding could be useful for your research.
- The full course is available [here](https://github.com/kasparvonbeelen/Coding-the-Humanities) 

### 1.1 The Language of Choice: Python

#### **What** is Python?

[From Wikipedia](https://en.wikipedia.org/wiki/Python_(programming_language)): Python is a widely used **high-level** programming language for **general-purpose** programming.
- **high-level programming language**: In computer science, a high-level programming language is a programming language with **strong abstraction from the details of the computer**. In comparison to low-level programming languages, it may use **natural language elements**, be easier to use, or may **automate** (or even **hide** entirely) significant areas of computing systems (e.g. memory management), making the process of developing a program simpler and more understandable relative to a lower-level language. The amount of **abstraction** provided defines how "high-level" a programming language is.


#### **Why** Python?

In general, Python is **easier to learn and to read**. The first example in this Notebook (`print('Hello, World.')`) illustrates this point. The C++ version of the "Hello World" program would look like this:

```
#include <iostream>

int main()
{
    std::cout << "Hello, world." << std::endl;
    return 0;
}
```

while the Python version is simply:

```
print("Hello, world.")
```

So, why **Python**:

- Software **Quality**: Python code is designed to be **readable**, and hence reusable and maintainable. 
- Developer **Productivity**: Python code is typically one-third to one-fifth the size of C++ or Java code. 
- **Portability**: Python code runs unchanged on all major computer platforms (Windows, Linux, MacOS). 
- **General-purpose**: data analysis, web development etc.
- **Support Libraries**: Standard, homegrown and third-party libraries.
- **Widely used by the academic and scientific community!**

# 2. Goal of Today's Lecture

Today we focus on basic Python objects:
- Strings
- Lists
- Dictionaries

Tools for manipulating these objects:
- String formatting
- Appending items to a list
- Exploring dictionaries and JSON Objects


In general, the course shows how to collect and save data from the Web. In the following courses we turn to analysis of the retrieved data. 

At the end of the day, you should be able to **understand** most of the following code. For now, just try to run it by simultaneously pressing `ctrl` and `enter`.

### 2.1. Real-world application of the elements we discuss today.

In [2]:
import requests # Import modules; here a set of tools that helps you download data
'''
Script that retrieves data from "Chronicling America" and stores the information in a list.
'''

data = """Idaho,1865\nMontana,1865\nOregon,1865\nWashington,1865""" # Variable Assignment & Strings

url = "http://chroniclingamerica.loc.gov/search/pages/results/?state={0}&date1={1}&date2={1}&dateFilterType=yearRange&sequence=1&sort=date&rows=5&format=json"
# String formatting, Getting data from APIs

all_data = [] # Empty Lists & Variable Assignment

lines = data.split('\n') # Split string (convert string to list)

print(lines) # Printing the list

for line in lines: # For loop 
    state,year = line.split(',') # Split string, multiple assignment
    response = requests.get(url.format(state,year)).json() # Calling the API & download data
    all_data.append(response) # Storing data in a variable

print('\nDownloaded {} items.\nDone!\n'.format(len(all_data)))

['Idaho,1865', 'Montana,1865', 'Oregon,1865', 'Washington,1865']

Downloaded 4 items.
Done!



In [3]:
import json # Import JSON tools
'''
Inspect the downloaded data.
'''
idaho = all_data[0] # Indexing and slicing
idaho.keys() # Inspecting JSON objects & Python dictionaries
print(json.dumps(idaho))
#json.dump(idaho,open('./idaho_example.json','w'))# Copy-Paste the print output to http://jsonviewer.stack.hu/

{"totalItems": 52, "endIndex": 5, "startIndex": 1, "itemsPerPage": 5, "items": [{"sequence": 1, "county": ["Boise"], "edition": null, "frequency": "Weekly", "id": "/lccn/sn82015407/1865-01-07/ed-1/seq-1/", "subject": ["Boise County (Idaho)--Newspapers.", "Idaho City (Idaho)--Newspapers.", "Idaho--Boise County.--fast--(OCoLC)fst01219026", "Idaho--Idaho City.--fast--(OCoLC)fst01295399"], "city": ["Idaho City"], "date": "18650107", "title": "The Idaho world.", "end_year": 1918, "note": ["Archived issues are available in digital format from the Library of Congress Chronicling America online collection.", "Semiweekly eds.: Idaho semi-weekly world (Idaho City, Idaho : 1867), May 4, 1867-Nov. 11,1868 ; Idaho semi-weekly world (Idaho City, Idaho : 1875), July 30, 1875-June 15, 1908.", "Triweekly ed.: Idaho tri-weekly world, Mar. 14-July 27, 1875."], "state": ["Idaho"], "section_label": "", "type": "page", "place_of_publication": "Idaho City, Idaho Territory", "start_year": 1864, "edition_label

In [4]:
# print the title
print(idaho['items'][0]['alt_title'])

['Idaho weekly world']


In [5]:
# print ocr text
print(idaho['items'][0]['ocr_eng'])

The Idaho World
"THE NOBLEST MOTIVE IS THE ^PUBLIC GOOD."
Voll
IDAHO CITY, BOISE COUNTY, IDAHO TERRITORY, SATURDAY, JANUARY 7, 1865.
No. 11.
dab World,
PUBLISHED EVERT SATURDAY MORNING BY
I. H. BOWMAN & CO.
H. C. STREET, Editor,
TERMS INVARIABLY in ADVANCE.
Bates of Subscription s
One year,..... - $12 00.
Six mort ha, ------- 7 00.
Three months, .------4 00.
Single copies, ----- - - 60.
Bates of Advertising s
Per square, ten lines or less, first insertion, : : : $5
« « " " " each subsequent in. 2
« « « " " three months, : : : 12
•* « " " u one year, : : 40
Agents for the Idaho "Worfd.
A. P. Turner, Carrier and General Agent, Idaho city.
Tho8. Boyce, northeast corner Montgomery and Wash
ington street, up stairs, San Francisco. Cal.
C. R. Street, Cal. Express office, Marysville, Cal.
S. J. McCormick, Portland, Oregon.
M. J. Allphin, Dalles city, Oregon.
Capt T Miller,° RAY '} Cat, y° n cit L Oregon.
A. H. Brown—W ells, Fargo k Co.'s agent, Auburn, Or.
Powell & Coe, Umatilla, Oregon.
A. A

In [6]:
# how many words does this text contain?
idaho_text = idaho['items'][0]['ocr_eng']
print(len(idaho_text.split()))

3692


In [7]:
# Store the data on your disk as a JSON file (the json module was imported above)
with open('./chrom_america.json', 'w') as outfile:
    json.dump(all_data, outfile)

The difficult thing about learning how to program is that you have to get a proper understanding of all these building blocks before you can start writing useful programs.

We have to go through these elements in isolation. This can be tedious, as things only start to make sense when you start combining different components.

# 3. Baby Python

For practising your coding skills, you can use the many 'code blocks' in this Notebook, such as the grey cell below. Place your cursor inside the cell and press ``ctrl+enter`` to "run" or execute the code. Let's begin right away: run your first little program!

In [None]:
print('Hello, World!')

You've just executed your first program!

### Exercise
- Can you describe what the programme just did?
- Can you adapt it to print your own name (with a greeting)?

Use the code block below.

In [None]:
# Insert your own code here!
# Print your own name ... or whatever you want, and press ctrl + enter

Apart from printing words to your screen, you can also use Python as a calculator. 

In [None]:
print(5+9)
print(3*8)

### Exercise
Use the code block below to calculate (and print) how many minutes there are in one week.

In [None]:
# Write your code here

# 4. Variables: Presents for Everyone

One of the most powerful features of a programming language is the ability to **store and manipulate variables**. A variable is a **name** that refers to a value. The **assignment statement** creates new variables and relates them to concrete values. Instead of passing these elements as an argument to the `print()` function, we can store them, by creating a variable that refers to the "Hello, World!" string.

In [None]:
# declare a variable
x = 'Hello World.'
# print what is in the box
print(x)

In [None]:
# declare a variable
y = 22
# print what is in the box
print(y)

If you vaguely remember your math-classes in school, this should look familiar. It is basically the same notation with the name of **the variable on the left, the value on the right**, and the = sign in the middle. 

In the code block above, two things happen. **First**, we fill `y` with a value, in our case `22`. This variable `y` behaves pretty much like a **box** on which we write a `y` with a thick, black marker to find it back later. **Second**, we print the contents of this box, using the `print()` function. ![box](./images/box.png)

You can inspect the type of the variable with the `type()` function. A string is always between quotation marks *`'`* or *`"`*, a number (integers or floats) is not.

In [None]:
text = 'Hello, Worlds!'
print(type(text))
number = 10
print(type(number))
number_string = '10'
print(type(number_string))

#### Exercise 
Create and print two values: your name (string) and date of birth (integer)

In [None]:
# write your code here

#### Exercise 
Find the variable assignments in the example code.

# 5. Strings: How Python Understands Text

Let's have a closer look at the ``'str'`` type (str stands for string)

In [None]:
name = 'Kaspar'
print(type(name))

Not only numbers but also strings can be added together. What do you think the operation below will produce?

In [None]:
first_name = "Kaspar"
last_name = "Beelen"
print(first_name+last_name)

This last operation is called string **concatenation**. We added one string to another using the `+` operator.

In [None]:
book = "The Lord of the Flies"
print(first_name + " likes " + book + "?")

#### Exercise: 
Declare two variables `first_name` and `last_name`. Print them neatly using concatenation.

In [None]:
# Write your code here

Another option is the `format()` method. Please note the `\n` sequence in the examples that follow, which denotes a hard return (newline character).

To see what `format()` does, we can simply use Python's help functionalities!

In [None]:
help(str.format)

`.format()` inserts a value (either a string or a number) at the position marked by the braces `{}`. Try it out below!

In [None]:
name = # enter a name
print('{} is great!'.format(name))

In [None]:
name = 'My first name is {0}.\nMy second name is {1}'.format(first_name,last_name)
print(name)

#### Exercise
What would the following expression return?

`'My first name is {1}.\nMy second name is {0}'.format(first_name,last_name)`

A lot is actually happening here--these lines may be confusing. Let's inspect this line a bit closer.

## 5.1 String Methods

The expression below follows the Python dot notation:

    - `'My first name is {0}.\nMy second name is {1}'.format(first_name,last_name)`

which in a more abstract form looks like

    - `object.method(arguments)`
    
In this example, we applied the format method using names as arguments.
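
The dot notation can be tried out on a small scale. The following is only a minimal sketch; the variable names (`greeting`, `filled`, `shouted`) are made up for illustration and are not part of the course material.

```python
# object.method(arguments): first the object, then a dot, then the method call
greeting = 'hello, {}!'

# .format() is a method of the string object; 'World' is its argument
filled = greeting.format('World')
print(filled)

# Calls can be chained: the result of .format() is itself a string,
# so we can immediately apply another string method to it
shouted = greeting.format('World').upper()
print(shouted)
```

Reading the chained line from left to right mirrors the order in which Python evaluates it.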

But Python comes with many useful tools for text processing. You can list and inspect them with `dir()` and `help()`.

In [None]:
book = 'Pride and Prejudice' # Let's pretend we stored a whole book in this variable

`dir()` shows all the methods you can apply to the string variable `book`. Please scroll down. You can ignore all those starting with double underscores.

In [None]:
dir(book)

All these methods allow you to do things with strings. Some of the most useful ones are listed below (strictly speaking, `len()` is a built-in function rather than a string method):
- `split()`
- `lower()`
- `len()`
- `find()`

## .split()

#### Exercise

Go back to the initial example, and figure out how the `split()` method works.

HINT: use print before and after `split()`.

#### Exercise
Find the **documentation** on `split` using the `help` function

In [None]:
# search for help here

Inspect the following examples:

In [None]:
print('Split on white space: ',book.split())
print('Split on character "e": ',book.split('e'))
print('Split on newline: ',book.split('\n'))

### Important

`split()` divides a string into words (approximately, we come back to this later).
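
To see why the result is only approximately a list of words, consider the small sketch below. The sentence and the variable names are chosen purely for illustration.

```python
sentence = 'Reader, I married him. A quiet wedding we had.'

# Split on white space: punctuation stays attached to the neighbouring word
tokens = sentence.split()
print(tokens)

# The first "word" still carries its comma
print(tokens[0])

# The number of tokens is therefore only an approximation of the word count
print(len(tokens))
```

Because `split()` only looks at white space, `'Reader,'` and `'Reader'` count as different strings.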

## .lower()

**Exercise**

1. Experiment with the `lower()` method.

In [None]:
# Experiment with lower
# Declare a string variable

variable = 

# Look for documentation on `lower`

help

# Apply lower to the variable AND assign the lowercased string to a new variable

var_lower = 

# print the variables before and after applying the lower method
print()
print()


## .find()

#### **Exercise**

Find the position of the first 'e' in the title "Naturkatastrophenkonzert".

In [None]:
title = 'Naturkatastrophenkonzert'
# use the find() method here

First, download the text of Shakespeare's Romeo and Juliet from Project Gutenberg:

In [None]:
randj = requests.get('http://www.gutenberg.org/cache/epub/1777/pg1777.txt').text

#### Exercise

Find the position of the word `love` in Shakespeare's Romeo and Juliet.

HINT: Do not forget to first lowercase all words!

In [None]:
first_love = randj.lower().find('love')
print(first_love)

You can print the context around `first_love` using the [slice](https://www.oreilly.com/learning/how-do-i-use-the-slice-notation-in-python) notation. (Please follow the link for more information.)

In [None]:
context_size = 50 # the number of characters around the word
start_at = first_love-context_size # indicate the starting position
stop_at = first_love+context_size+len('love') # indicate where to stop
print('Start printing at character with position=',start_at)
print('Stop printing at character with position=',stop_at)
print('\n')
print(randj[start_at:stop_at])
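
Two details of `find()` and slicing are easy to miss. The sketch below uses a short stand-in sentence instead of the downloaded play, and the names `text`, `position` and `missing` are assumptions made for this illustration only.

```python
text = 'Love is a smoke made with the fume of sighs.'  # short stand-in text

# find() returns the position of the first match ...
position = text.lower().find('smoke')
print(position)

# ... and -1 when the search term does not occur at all
missing = text.lower().find('hate')
print(missing)

context_size = 50
# A negative start would count from the END of the string,
# so we make sure the start never drops below 0
start_at = max(0, position - context_size)
# A stop position beyond the end of the string is harmless: the slice simply ends there
stop_at = position + len('smoke') + context_size
print(text[start_at:stop_at])
```

Checking for `-1` before slicing avoids printing a misleading context when the word is absent.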

## len()

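`len()` counts the number of elements that the argument contains. If you pass a string as an argument, it counts the number of characters.

Note: the syntax is slightly different here: `len()` is a built-in function that takes the string as its argument, rather than a method called with the dot notation (the reasons fall outside the scope of this course).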

In [None]:
word = 'supercalifragilisticexpialidocious'
print(len(word))
#print(word.__len__())

In [None]:
# How many characters does your full name contain?

Final Question: How many words does Romeo and Juliet contain (approximately)? Use `split()` and `len()` in combination.

**Exercise**

Are there other useful string methods?

In [None]:
# if yes, play with them here

# Intermezzo: Returning to the main example

Let's inspect more closely some lines in the leading example.

#### `\n` indicates the end of a line

In [None]:
data = """Idaho,1865\nMontana,1865\nOregon,1865\nWashington,1865"""
print(data)

#### Split by newline returns the rows as a list (see below).

In [None]:
data.split('\n')

#### We can save this by declaring a new variable `lines`.

In [None]:
lines = data.split('\n')
print(lines)

#### The .format() method manipulates the URL by inserting substrings at specific locations marked by braces '{}'.

In [None]:
url = "http://chroniclingamerica.loc.gov/search/pages/results/?state={0}&date1={1}&date2={1}&dateFilterType=yearRange&sequence=1&sort=date&rows=5&format=json"
query = url.format('Idaho',1865)
print(query)

Please follow the link produced by the `print` operation.

This may look very complicated, but actually we are doing nothing more than generating a query that we use to retrieve data from "Chronicling America". Let's have a closer look at what we are actually doing.

The basic components of this URL are:
- the base URL, http://chroniclingamerica.loc.gov/
- the search service location for individual newspaper pages, search/pages/results
- a query string, starting with `?` and made up of **value pairs** (fieldname=value) separated by `&`.
    - e.g. value pairs are: state=Idaho; date1=1865;
    - only the front pages (sequence=1)
    - sorting by date (sort=date)
    - returning a maximum of five (rows=5)
    - in JSON (format=json)
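
The components listed above can also be assembled from a dictionary. This is only a sketch of an alternative to `.format()`, using `urlencode` from Python's standard library; the variable names `base`, `params` and `query` are chosen for illustration, and no data is downloaded here.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Base URL plus the location of the search service
base = 'http://chroniclingamerica.loc.gov/search/pages/results/'

# The fieldname=value pairs of the query string, written as a dictionary
params = {'state': 'Idaho', 'date1': 1865, 'date2': 1865,
          'dateFilterType': 'yearRange', 'sequence': 1,
          'sort': 'date', 'rows': 5, 'format': 'json'}

# urlencode() joins the pairs with '&'; we add the '?' ourselves
query = base + '?' + urlencode(params)
print(query)

# Going the other way: take a URL apart again to inspect its pairs
pairs = parse_qs(urlsplit(query).query)
print(pairs['state'])
```

For this course the `.format()` approach is perfectly fine; the dictionary version mainly makes the structure of the query string visible.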

Now imagine we would like to retrieve data for multiple years for the state of Idaho. In Python this is very simple, but we have to extend our syntax to properly understand how.

In [None]:
queries = [] # define a variable where you will store all your queries
for year in [1865,1885,1905]:  # loop over these years
    query = url.format('Idaho',year) # formulate query
    queries.append(query) # store it in the queries variable using .append()
print(queries) # done! print! copy paste one of the elements to see if this worked...

So let's turn to list objects!

# 6. Lists

Lists resemble strings: both are a **sequence** of values. But whereas a string was a sequence of characters, a list can contain values of any type. These values we call **elements** or **items**.

In [None]:
this_is_a_string = 'Hello Newman'
this_is_a_list = ['Hello','Jerry',42,3.1415]

Consider the first sentence (represented as a string) from Franz Kafka's book 'The Trial'. Imagine for a moment that we had assigned the whole book to the `trial` variable.

In [None]:
trial = "Someone must have slandered Josef K., for one morning, without having done anything truly wrong, he was arrested. "

**A string is a sequence of characters.**

How can we select specific words from this book? For the sentence above, it might seem more natural for humans to describe it as a series of words, rather than as a series of characters. Say, we want to access the first word in our sentence. If we enter:

In [None]:
first_word = trial[0]
print(first_word)

**`split()` converts this string to a list of words.**

Python only prints the first character of our sentence. (Think about this if you do not understand why.) We can, however, transform our sentence into a list of words (represented by strings) using the `split()` method as follows:

In [None]:
words = trial.split()
print(words)

The variable `words` now holds the first line of Kafka's Trial as a list. Each element in this list is now (approximately) a word. Run the code below to see the difference.

In [None]:
first_word = words[0]
print(first_word)

## Creating a list: the basic rules 

To store an empty list in variable `x`, simply assign `x` to ``[]`` (square brackets).

In [None]:
# create an empty list
x = []

We can also create lists with some content: enclose the individual items within square brackets, separated by a comma.

In [None]:
my_grades = [8,9,6,7]
print(my_grades)
my_garbage = ['Potatoe',[1,2,3],9.03434,'frogs']
print(my_garbage)

### General rules:
* Lists are surrounded by square brackets and the elements in the list are separated by commas
* A list element can be **any Python object**, even another list (e.g. a list can be a collection of numbers, strings, floats, or a combination thereof)
* A list can store values with different types
* A list can be empty

#### Exercise

Create a list manually, select your three favorite artists/composers, whatever, and put them in one list.

In [None]:
# put your code here

## Adding Items to a List: Concatenation and the `.append()` Method

Similar to strings, Python comes with specific operations (``*`` and ``+``) that you can apply to a list.

The ``+`` operator **concatenates** lists. 

Can you guess what the variable `c` will look like?

In [None]:
a = [1, 2, 3]
b = [4, 5, 6]
c = a + b
print(c)
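
The `*` operator mentioned above repeats a list, just as it repeats a string. A brief sketch with made-up values:

```python
notes = ['do', 're']

# Multiplying a list by an integer repeats its elements in order
repeated = notes * 3
print(repeated)

# The original list is left unchanged; * builds a new list
print(notes)

# The same operator works on strings
print('ha' * 3)
```

Like `+`, the `*` operator creates a new list rather than modifying the existing one.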

Most of the crucial list functionalities are provided by the inbuilt list **methods**: functions attached to the list object. For an overview of the available methods run the code below (scroll down, for this course you can ignore the methods starting and ending with double underscores.)

In [None]:
a_list = []
print(type(a_list))

We learn, unsurprisingly, that the variable `a_list` is of type `list`. Let's inspect the functionalities Python provides for working with lists.

In [None]:
help(list)

**``append()`` adds other values to the list**

The first method we encounter is ``append``. To see what this method does use the same `help` function as before

In [None]:
help(list.append)

`append` is a method that **adds new items** to the end of a list. It has one positional argument and **returns `None`** (we come back to this a few blocks below).

In [None]:
composer_list = ['J.S. Bach', 'W.A. Mozart', 'F. Mendelssohn']
print(composer_list)
composer_list.append('L. van Beethoven')
print(composer_list)

#### Exercise
Add another composer to the `composer_list`.

In [None]:
# add your code here

Functions in Python are generally divided into **fruitful** and **void** functions. `append` is a **void** function: similar to `print`, it performs an operation (adds one element to the list) but **returns nothing** (strictly speaking, it returns the special value `None`). Understanding this distinction may help you trace bugs in future code.

In [None]:
a = composer_list.append('J. des Prez')
print(composer_list)
print(a)
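
A typical bug that follows from this distinction is assigning the result of `append()` back to a variable. The sketch below contrasts a void method with a fruitful function; the list of painters is made up for illustration.

```python
painters = ['Vermeer', 'Rembrandt']

# Void: append() changes the list in place and gives back None
result = painters.append('Hals')
print(result)
print(painters)

# Fruitful: sorted() leaves the list alone and returns a NEW sorted list
ordered = sorted(painters)
print(ordered)
print(painters)
```

Writing `painters = painters.append('Hals')` would therefore replace the whole list with `None`, which is a common source of confusing error messages later in a script.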

The `append()` method is especially powerful in a `for` loop.

# 7. For-loops

The code below shows a context in which the `append()` method is often applied. For example, we have structured data that lists song titles since the interwar years. Imagine we want to study all songs about "love".

But let's start with a simple example and return to Kafka.

In [None]:
trial = "Someone must have slandered Josef K., for one morning, without having done anything truly wrong, he was arrested. "

In [None]:
# split the string by white spaces
words = trial.split()
print(words)

### Membership operators

An easy way to check if a word appears in a sentence is the membership operator `in`.

In [None]:
print('"must" in words? ','must' in words)
print('"lalalala" in words? ','lalalala' in words)

Ok, now we have a list with the individual words. Now we can iterate over this list with a `for` loop. Let's loop over the words and print each of them individually in upper case and with exclamation marks!!!!

In [None]:
for word in words:
    print(word.upper()+'!!!')

#### Exercise:

... or print the length of the words.

In [None]:
for  in :
   print(...)

For sure, we could have done this manually and obtained the same result. But you have to agree that the loop above is more elegant and concise. Also, applying the manual approach below to a list of 100,000 items or more would be very time-consuming.

In [None]:
print(len(words[0]))
print(len(words[1]))
print(len(words[2]))
print(len(words[3]))
print('...')
print('etc.  till the end.')
print('...')
print(len(words[-4]))
print(len(words[-3]))
print(len(words[-2]))
print(len(words[-1]))

What is the benefit of having a fast computer if you have to enter everything manually?

Python provides the so-called `for`-statements that allow us to **iterate** through any iterable object and perform actions on each element. The basic syntax of a `for`-statement is: 

    for X in iterable:

That reads almost like English: for each element `X` in the iterable, the indented block that follows is executed once. Further down we use this to collect the lengths of all the words in the Kafka sentence.

The `for` loop might be confusing at first. Let's have a closer look at a simple example: 

In [None]:
names = ['John', 'Anna', 'Bert']
for name in names:
    print(name)

The `name` variable is not explicitly assigned in advance. It acts somewhat as a **placeholder**, and is assigned to each element in the list in turn (as the `print()` statement suggests). 

You are **free to choose the name** of this variable, but it has to be consistent in the indented block below.

In [None]:
names = ['John', 'Anna', 'Bert']
for LALALALALA in names:
    print(LALALALALA)

... this works just fine but is less readable.
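
What the loop does behind the scenes can be spelled out by hand. This is only an illustrative sketch; the lists `collected` and `by_hand` are names made up for the comparison.

```python
names = ['John', 'Anna', 'Bert']

# The loop: name is assigned to each element in turn
collected = []
for name in names:
    collected.append(name)

# The same steps written out manually, one assignment per element
by_hand = []
name = names[0]
by_hand.append(name)
name = names[1]
by_hand.append(name)
name = names[2]
by_hand.append(name)

# Both approaches produce the same list
print(collected == by_hand)

# After the loop, the placeholder still refers to the last element
print(name)
```

The loop variable is an ordinary variable: it simply gets a new value at the start of every round.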

We can, now, make a simple program that stores the word length of each word in `words`.

In [None]:
# Initialize an empty list, in which we will store all word lengths
word_lengths = []
# now we iterate over the iterable (i.e. list) called words
for word in words:
    # get the length of the word
    var = len(word)
    # append it to the list
    word_lengths.append(var)

print(word_lengths)

We could make the previous code a bit more concise:

In [None]:
# Initialize an empty list, in which we will store all word lengths
word_lengths = []
# now we iterate over the iterable (i.e. list) called words
for word in words:
    word_lengths.append(len(word))

print(word_lengths)

Now we can put everything together and make a simple programme that collects all songs about 'love'. Step by step.

I left out some code. Please complete where necessary.

**A.** Retrieve the data with `requests`

In [None]:
import requests
url = 'https://labrosa.ee.columbia.edu/millionsong/sites/default/files/AdditionalFiles/tracks_per_year.txt'
#small data set for those with a slower laptop/computer
#url = 'https://raw.githubusercontent.com/kasparvonbeelen/Coding-the-Humanities/master/lecture2/subsample.txt'
data = requests.get(url).text.strip() # download the song titles

**B.** Create an empty list and define your query.

In [None]:
search = #
love_song = #

**C.** Split the data by row. There should be 515576 rows.

In [None]:
rows = data.split('\n')
len(rows) == 515576

In [None]:
for row in rows:
    cells = # split the row into a list called cells, split on the <SEP> sequence
    title = cells[-1] 
    title_lower = # convert capitals in the string to lowercase characters
    words =  # split the title string into words
    if search in words: # store the title if it contains the search term
        love_song.append(title)
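
Why split the title into words before testing for membership? The sketch below shows the difference between searching in a string and searching in a list of words. The song title is hypothetical and only serves as an illustration.

```python
title = 'Glove Compartment Blues'  # hypothetical song title
title_lower = title.lower()
words = title_lower.split()

# In a string, `in` looks for a substring: 'love' hides inside 'glove'
print('love' in title_lower)

# In a list, `in` looks for a complete element: no word equals 'love'
print('love' in words)
```

Testing against the list of words avoids counting titles that merely contain the letters of the search term.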

There should be 13844 titles in the `love_song` variable.

In [None]:
print(len(love_song)==13844)

Let's print the first 100.

In [None]:
print(love_song[:100])

#### Exercise

- Put all the code together in one code block. 
- Can you find all the songs with "hate" in their title in the song title database?
- Is "love" more popular a topic than "hate"? 

# 8. Dictionaries

Dictionaries provide you with a data structure that makes **looking up values by keys** exceptionally easy.

For example, if we look at the dictionary `telephone_numbers` below, what is Susan's phone number?

In Python you can easily look up a key (the element before the `":"`) in a dictionary:

In [None]:
telephone_numbers = {'Frank': 4334030, 'Susan': 400230, 'Guido': 487239}
print(telephone_numbers)

... and now print Susan's telephone number:

In [None]:
print(telephone_numbers['Susan'])

#### Exercise

Print Guido's number.

## Creating a dictionary

* a dictionary is surrounded by **curly brackets** 

* a dictionary consists of one or more **key:value pairs**, the key is the 'identifier' or "name" that is used to describe the value.
* the **keys** in a dictionary are **unique**
* the syntax for a key/value pair is: `key : value`
* and the **key/value** pairs (i.e. **items**) are separated by **commas**.
* the keys (e.g. 'Frank') in a dictionary have to be **immutable** (for example strings, numbers or tuples, but not lists)
* the values (e.g. 4334030) in a dictionary can be **any Python object**
* a dictionary can be empty
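
Two of these rules, unique keys and immutable keys, can be checked with a small experiment. This is a sketch with made-up names and values; the `try`/`except` construction is not covered in this lecture and is only used here to show the error without stopping the cell.

```python
ages = {'Frank': 31, 'Susan': 28}  # made-up values

# Keys are unique: assigning to an existing key overwrites its value
ages['Frank'] = 32
print(ages)
print(len(ages))

# A tuple is immutable, so it can serve as a key
full_names = {('Frank', 'Smith'): 32}
print(full_names[('Frank', 'Smith')])

# A list is mutable, so Python refuses to use it as a key
try:
    broken = {['Frank', 'Smith']: 32}
except TypeError as error:
    print('TypeError:', error)
```

The values, by contrast, can be anything, including lists or other dictionaries.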


An empty dictionary:

In [None]:
x = {}

A mapping between English and German words:

In [None]:
english2deutsch = {'ambulance':'Krankenwagen',
                  'clever':'klug',
                  'concrete':'Beton'}

#### Exercise

Make a dictionary that maps three cities to the size of their population. Call it `city2population`.

In [None]:
city2population = #add your code here

## Adding items to a dictionary

There is a very simple way to add a **key:value** pair to a dictionary. Please look at the following code snippet:

In [None]:
english2deutsch = dict()
#or try english2deutsch = {}
print(english2deutsch)

In [None]:
english2deutsch['one'] = 'eins'
english2deutsch['two'] = 'zwei'
english2deutsch['three'] = 'drei'
print(english2deutsch)

#### Exercise

Add two more cities to `city2population`

## Iterating over dictionaries

Since dictionaries are iterable objects, we can iterate through our good reads collection as well. This will iterate over the *keys* of a dictionary:

In [None]:
good_reads = {"The Magic Mountain":9,
             "The Idiot":7,
             "Don Quixote": 9.5}

for book in good_reads:
    print(book)

To iterate over the key-value pairs use the `.items()` method.

In [None]:
for book,score in good_reads.items():
    print(book,score)
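
Iterating over the items combines nicely with the `append()` pattern from the section on lists. The sketch below collects all highly rated books; the threshold of 9 and the list name `favourites` are assumptions made for this example.

```python
good_reads = {"The Magic Mountain": 9,
              "The Idiot": 7,
              "Don Quixote": 9.5}

# Start with an empty list, as in the for-loop examples
favourites = []
for book, score in good_reads.items():
    # keep the key (the title) if the value (the score) is high enough
    if score >= 9:
        favourites.append(book)

print(favourites)
```

The same pattern (empty list, loop, condition, `append`) was used for the song titles.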

#### Exercise

Print the city and population by iterating over the items of the city2population dictionary.

## JSON

The data retrieved from the Chronicling America API is in [JSON](https://en.wikipedia.org/wiki/JSON) format; each item in our list contains a few newspaper pages from a different state. Copy-paste the printout below into the [JSON viewer](http://jsonviewer.stack.hu/) to inspect the document.

As you'll see, the JSON object combines Python lists and dictionaries. As it is a very common data type, Python has some libraries to process and read JSON data.
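
How lists and dictionaries nest inside a JSON object can be seen in a miniature example. The string below is a heavily shortened, partly made-up imitation of the API response (the second item is invented); `json.loads` is the standard-library function that converts a JSON string into Python objects.

```python
import json

# A tiny JSON document: a dictionary that contains a list of dictionaries
raw = ('{"totalItems": 2, "items": ['
       '{"title": "The Idaho world.", "city": ["Idaho City"]}, '
       '{"title": "A second paper", "city": ["Boise"]}]}')

record = json.loads(raw)

# The outer layer is a dictionary, the value under "items" is a list
print(type(record))
print(type(record['items']))

# Combine key lookup (dictionary) and indexing (list) to reach a value
print(record['items'][0]['title'])
print(record['items'][1]['city'][0])
```

Each pair of square brackets in such an expression moves one level deeper into the structure.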

In [None]:
idaho = json.load(open('./idaho_example.json'))

In [None]:
print(json.dumps(idaho))

#### Exercise

Explore the JSON file.

#### Exercise

Can you print the first title?

In [None]:
# Add your code here

#### Exercise

Can you print the number of words (approximately) in the fourth article (hidden under key 'ocr_eng')?

In [None]:
# Add your code here

## We are DONE for today. Congratulations!