<b><center>UNCLASSIFIED</center></b>

<img src="./GRAPHICS/Banner.png" width="110%">

# Lesson 7: Getting Non-Tabular Data

***

## Table of Contents

* [7.1. Objectives](#Objectives)<br>
* [7.2. Overview](#Overview)<br>
* [7.3. Review](#Review)<br>
   * [7.3.1. Reading a CSV File](#CSV)<br>
   * [7.3.2. Using .DictReader() to Read a CSV File](#DictReader)<br>
* [7.4. Lesson: Getting Non-Tabular Data](#Lesson)<br>
   * [7.4.1. Dictionaries](#Dictionaries)<br>
   * [7.4.2. JavaScript Object Notation (JSON)](#JSON)<br>
   * [7.4.3. GeoJSON](#GeoJSON)<br>
* [7.5. Guided Exercise: JSON and Dictionaries](#Guided-Exercise)<br>
* [7.6. Practical Exercises](#Practical-Exercises)<br>
   * [7.6.1. Practical Exercise 1: Read a JSON File](#PE1)<br>
   * [7.6.2. Practical Exercise 2: How Many Earthquakes?](#PE2)<br>
   * [7.6.3. Practical Exercise 3: Magnitude of the Strongest Earthquake](#PE3)<br>
   * [7.6.4. Practical Exercise 4: Strongest Earthquake](#PE4)<br>
   * [7.6.5. Practical Exercise 5: Most Recent Earthquake](#PE5)<br>
   * [7.6.6. Practical Exercise 6: Write Earthquake Dictionary](#PE6)<br>
* [7.7. Administrative Notes](#Administrative)<br>

***
<a id='Objectives'></a>
## 7.1. Objectives
Using conditionals, loops, Python dictionaries, local data, the CSV Library, and the Glob Library, students will be able to:

* Examine the implications of using computation to solve a problem
    * Discuss best practices for using computation to solve a problem
    * Suggest types of problems that can be solved through computation
    * Show how computation can solve a problem
<br><br>    
* Recognize key computer science concepts
    * Recognize how queries operate
<br><br>    
* Demonstrate the ability to build basic scripts using Python scripting language
    * Use various data types and structures in Python scripting
    * Collect data using Python scripting
    * Extract data using Python scripting
    * Develop advanced data structures using Python scripting

*** 
<a id='Overview'></a>
## 7.2. Overview
The following lesson is divided into three main parts:
* Lesson: Getting Non-Tabular Data
* Guided Exercise: JSON and Dictionaries
* Practical Exercises

<b><font color="brick">Instructor Guidance</font>: Refer back to Lesson 1 and relate the four steps of problem-solving using Computational Thinking (Decomposition, Pattern Recognition, Abstraction, & Algorithm Design) to lessons, exercises, examples, student questions/comments, etc., as appropriate throughout this lesson.</b>

<a id='Review'></a>
## 7.3. Review
<img src="./GRAPHICS/csvReader.png" width="35%">

<hr> <a id='CSV'></a>
### 7.3.1. Reading a Basic CSV File

`csv.reader()` returns an object that can be iterated through to obtain the data in each row of the file. To retain the data, it must be appended to a list. Run the code cells below to import data from a local CSV file. First, we need to import the CSV Library.

In [1]:
import csv

Then, we can read in the data.

In [2]:
with open('data/animal/animal_info_land.csv') as csvfile:
    
    animal = csv.reader(csvfile)
    
    land_animals = []
    
    for row in animal:
        land_animals.append(row)
        
land_animals

[['Name', 'Weight', 'Diet', 'Average Lifespan'],
 ['Giraffe', '2600', 'Shrubbery', '25'],
 ['Moose', '1300', 'Shrubbery', '20'],
 ['Giant Panda', '220', 'Bamboo', '20'],
 ['Red Panda', '14', 'Bamboo', '14'],
 ['Hippopotamus', '3800', 'Grass', '45'],
 ['Lion', '400', 'Gazelle', '12'],
 ['Tortoise', '100', 'Grass', '100']]

1. What is the data type of each row?
2. What are all of the data types of the items in each row?
3. Does this structure make sense as a way to store information from a file? Why or why not?

<hr>
#### 7.3.1.1. Writing to a CSV File
Writer objects can take data in your script, convert it into a string, and write it to a given file. The `.writer()` method creates a writer object. Using that object, we can call `.writerows()`, which requires a list of lists as an argument.

In [3]:
with open('data/animal/animal_output.csv', mode='w', newline='') as csvfile:
    
    writer = csv.writer(csvfile)
    
    writer.writerows(land_animals)

Now let's read the file we just wrote to make sure it worked.

In [4]:
animals_list = []

with open('data/animal/animal_output.csv') as csvfile:
    
    animal = csv.reader(csvfile)
    
    for row in animal:
        animals_list.append(row)

animals_list

[['Name', 'Weight', 'Diet', 'Average Lifespan'],
 ['Giraffe', '2600', 'Shrubbery', '25'],
 ['Moose', '1300', 'Shrubbery', '20'],
 ['Giant Panda', '220', 'Bamboo', '20'],
 ['Red Panda', '14', 'Bamboo', '14'],
 ['Hippopotamus', '3800', 'Grass', '45'],
 ['Lion', '400', 'Gazelle', '12'],
 ['Tortoise', '100', 'Grass', '100']]

<hr> <a id='DictReader'></a>
### 7.3.2. Using `.DictReader()` to Read a CSV File

Another way to interpret CSV data is as a list of dictionaries.

`.DictReader()` is similar to `.reader()` in function, but it stores the data in a different way. It formats each row as a dictionary, using the first row of data as the keys. To retain the data outside of the `csv.DictReader` object, append each dictionary to a list or other data structure that you can manipulate later in the code. Below, we read in the animals dataset as dictionaries.

In [5]:
with open('data/animal/animal_info_land.csv',mode='r') as csvfile:
    animal = csv.DictReader(csvfile)
    for row in animal:
        print(row)
print('\n',animal)

{'Diet': 'Shrubbery', 'Average Lifespan': '25', 'Weight': '2600', 'Name': 'Giraffe'}
{'Diet': 'Shrubbery', 'Average Lifespan': '20', 'Weight': '1300', 'Name': 'Moose'}
{'Diet': 'Bamboo', 'Average Lifespan': '20', 'Weight': '220', 'Name': 'Giant Panda'}
{'Diet': 'Bamboo', 'Average Lifespan': '14', 'Weight': '14', 'Name': 'Red Panda'}
{'Diet': 'Grass', 'Average Lifespan': '45', 'Weight': '3800', 'Name': 'Hippopotamus'}
{'Diet': 'Gazelle', 'Average Lifespan': '12', 'Weight': '400', 'Name': 'Lion'}
{'Diet': 'Grass', 'Average Lifespan': '100', 'Weight': '100', 'Name': 'Tortoise'}

 <csv.DictReader object at 0x000002E4447AFDD8>


The code above prints each dictionary. How should the code be modified to add each row (dictionary) to a list?

In [6]:
dict_land_animals = []

with open('data/animal/animal_info_land.csv') as csvfile:
    
    animal = csv.DictReader(csvfile)
    
    for row in animal:
        dict_land_animals.append(row)

dict_land_animals

[{'Average Lifespan': '25',
  'Diet': 'Shrubbery',
  'Name': 'Giraffe',
  'Weight': '2600'},
 {'Average Lifespan': '20',
  'Diet': 'Shrubbery',
  'Name': 'Moose',
  'Weight': '1300'},
 {'Average Lifespan': '20',
  'Diet': 'Bamboo',
  'Name': 'Giant Panda',
  'Weight': '220'},
 {'Average Lifespan': '14',
  'Diet': 'Bamboo',
  'Name': 'Red Panda',
  'Weight': '14'},
 {'Average Lifespan': '45',
  'Diet': 'Grass',
  'Name': 'Hippopotamus',
  'Weight': '3800'},
 {'Average Lifespan': '12',
  'Diet': 'Gazelle',
  'Name': 'Lion',
  'Weight': '400'},
 {'Average Lifespan': '100',
  'Diet': 'Grass',
  'Name': 'Tortoise',
  'Weight': '100'}]

Now we can access the information by label instead of index position. Below, we iterate through the list of dictionaries we just created and print out the values in the `Name` and `Weight` columns.

In [7]:
for animal in dict_land_animals:
    
    print(animal['Name'], animal['Weight'])

Giraffe 2600
Moose 1300
Giant Panda 220
Red Panda 14
Hippopotamus 3800
Lion 400
Tortoise 100
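
Because the CSV Library reads every value in as a string, numeric columns usually need to be converted before they can be compared or summed. The sketch below illustrates one way to do that. It uses a small in-memory sample (`sample_csv`, a hypothetical stand-in for the animal file) so that it runs without any files on disk.

```python
import csv
import io

# Hypothetical in-memory sample standing in for animal_info_land.csv
sample_csv = io.StringIO(
    "Name,Weight,Diet,Average Lifespan\n"
    "Giraffe,2600,Shrubbery,25\n"
    "Red Panda,14,Bamboo,14\n"
    "Hippopotamus,3800,Grass,45\n"
)

sample_animals = list(csv.DictReader(sample_csv))

# Values arrive as strings, so convert 'Weight' before comparing numerically
for animal in sample_animals:
    animal['Weight'] = int(animal['Weight'])

# max() with a key function returns the whole dictionary with the largest weight
heaviest = max(sample_animals, key=lambda animal: animal['Weight'])

print(heaviest['Name'], heaviest['Weight'])
```

Accessing the column by its label (`'Weight'`) keeps the code readable even if the column order in the file changes.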


<hr>
#### 7.3.2.1. Using `.DictWriter()` to Write to a CSV File

Writer objects can take data, convert it into a string, and write it to a given file. `csv.DictWriter()` is similar in behavior to `csv.writer()`, but takes a list of dictionaries as input rather than a list of lists. You must also pass the `fieldnames` argument, a list of strings matching the column headers of your table.

```python
csv.DictWriter(csvfile, fieldnames)
```

Let's add a new row of data to `dict_land_animals`.

In [8]:
tiger_data = {'Name': 'Tiger', 'Diet': 'Deer, Gazelle', 'Average Lifespan': 10, 'Weight': 450}

dict_land_animals.append(tiger_data)
dict_land_animals

[{'Average Lifespan': '25',
  'Diet': 'Shrubbery',
  'Name': 'Giraffe',
  'Weight': '2600'},
 {'Average Lifespan': '20',
  'Diet': 'Shrubbery',
  'Name': 'Moose',
  'Weight': '1300'},
 {'Average Lifespan': '20',
  'Diet': 'Bamboo',
  'Name': 'Giant Panda',
  'Weight': '220'},
 {'Average Lifespan': '14',
  'Diet': 'Bamboo',
  'Name': 'Red Panda',
  'Weight': '14'},
 {'Average Lifespan': '45',
  'Diet': 'Grass',
  'Name': 'Hippopotamus',
  'Weight': '3800'},
 {'Average Lifespan': '12',
  'Diet': 'Gazelle',
  'Name': 'Lion',
  'Weight': '400'},
 {'Average Lifespan': '100',
  'Diet': 'Grass',
  'Name': 'Tortoise',
  'Weight': '100'},
 {'Average Lifespan': 10,
  'Diet': 'Deer, Gazelle',
  'Name': 'Tiger',
  'Weight': 450}]

In [9]:
with open('data/animal/animal_dict_output.csv', mode='w', newline='') as csvfile:
    
    writer = csv.DictWriter(csvfile, fieldnames=dict_land_animals[0].keys())
    
    writer.writeheader()
    
    writer.writerows(dict_land_animals)

Read the data back in to make sure it worked.

In [10]:
animals = []

with open('data/animal/animal_dict_output.csv') as csvfile:
    
    animal = csv.DictReader(csvfile)
    
    for row in animal:
        animals.append(row)

animals

[{'Average Lifespan': '25',
  'Diet': 'Shrubbery',
  'Name': 'Giraffe',
  'Weight': '2600'},
 {'Average Lifespan': '20',
  'Diet': 'Shrubbery',
  'Name': 'Moose',
  'Weight': '1300'},
 {'Average Lifespan': '20',
  'Diet': 'Bamboo',
  'Name': 'Giant Panda',
  'Weight': '220'},
 {'Average Lifespan': '14',
  'Diet': 'Bamboo',
  'Name': 'Red Panda',
  'Weight': '14'},
 {'Average Lifespan': '45',
  'Diet': 'Grass',
  'Name': 'Hippopotamus',
  'Weight': '3800'},
 {'Average Lifespan': '12',
  'Diet': 'Gazelle',
  'Name': 'Lion',
  'Weight': '400'},
 {'Average Lifespan': '100',
  'Diet': 'Grass',
  'Name': 'Tortoise',
  'Weight': '100'},
 {'Average Lifespan': '10',
  'Diet': 'Deer, Gazelle',
  'Name': 'Tiger',
  'Weight': '450'}]

<hr>
<a id='Lesson'></a>
## 7.4. Lesson: Getting Non-Tabular Data 

<hr> <a id='Dictionaries'></a>
### 7.4.1. Dictionaries
A driver's license is a great example of a common object in our daily lives that uses key-value pairs. Each data point is linked to a key that describes what it is. 

<img src="./GRAPHICS/DriversLicense.png" width="35%"><br>

For practice, we are going to capture data from the Driver's License as a dictionary. To create a dictionary from scratch, wrap a set of key-value pairs in curly brackets (`{}`).

In [11]:
mclovin = {'DOB': '06/03/1981',
           'EXP': '06/03/2008',
           'HT': '5-10',
           'WT': 150,
           'HAIR': 'BRO',
           'EYES': 'BRO',
           'SEX': 'M',
           'CTY': 'O',
           'ISSUE DATE': '06/18/1998',
           'CLASS': 3}

mclovin

{'CLASS': 3,
 'CTY': 'O',
 'DOB': '06/03/1981',
 'EXP': '06/03/2008',
 'EYES': 'BRO',
 'HAIR': 'BRO',
 'HT': '5-10',
 'ISSUE DATE': '06/18/1998',
 'SEX': 'M',
 'WT': 150}

If you are ever unsure of an object's data type, use the `type()` function.

In [12]:
type(mclovin)

dict

Python dictionaries have several special methods. How would we find the keys in `mclovin`?

In [13]:
mclovin.keys()

dict_keys(['HT', 'EYES', 'DOB', 'HAIR', 'WT', 'SEX', 'EXP', 'CLASS', 'ISSUE DATE', 'CTY'])

We access data inside dictionaries using bracket notation. Inside the brackets we write the key of the item we want to retrieve. But if we try to access a key the dictionary doesn't have, we'll get an error.

In [14]:
mclovin['DONOR']

KeyError: 'DONOR'

To prevent errors from halting your code, you can use an `if` block to only access a key if you can first find it in the dictionary.

In [15]:
if 'DONOR' in mclovin.keys():
    print(mclovin['DONOR'])
    
else:
    print("McLovin's donor status is not on his license. Moving on...")

McLovin's donor status is not on his license. Moving on...
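
Another option for handling a missing key is the dictionary's `.get()` method, which returns `None` (or a fallback value of your choosing) instead of raising a `KeyError`. The sketch below uses `license_data`, a trimmed, hypothetical copy of the license dictionary, so that it can run on its own.

```python
# Trimmed, hypothetical copy of the license dictionary
license_data = {'HAIR': 'BRO', 'EYES': 'BRO', 'SEX': 'M'}

# .get() returns None when the key is missing instead of raising KeyError
donor = license_data.get('DONOR')

# A second argument supplies a fallback value of our choosing
donor_or_default = license_data.get('DONOR', 'not listed')

# When the key exists, .get() returns its value as usual
hair = license_data.get('HAIR', 'not listed')

print(donor, donor_or_default, hair)
```

The `if` block is the better choice when the two cases need different handling; `.get()` is convenient when a default value is all you need.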


`.keys()` is also useful for getting a better understanding of your data. When we receive a new, unfamiliar dataset, we can use this method to get a preview of what our dictionary holds.

In [16]:
mclovin.keys()

dict_keys(['HT', 'EYES', 'DOB', 'HAIR', 'WT', 'SEX', 'EXP', 'CLASS', 'ISSUE DATE', 'CTY'])

If we did not know ahead of time that we were dealing with a driver's license, we could look at this list of keys and get a clue as to what the data is about. We will spend more time diving into data later today and throughout the remainder of this course.

You can use a similar method for retrieving the values in a dictionary. Like `.keys()`, `.values()` returns a list-like object, which we can perform membership tests on and iterate over using `for` loops.

In [17]:
mclovin.values()

dict_values(['5-10', 'BRO', '06/03/1981', 'BRO', 150, 'M', '06/03/2008', 3, '06/18/1998', 'O'])

Although the `.values()` method is used less frequently than the `.keys()` method, it is still useful: when a dictionary contains data of a single type, `.values()` lets us quickly summarize that data. A good example of this would be a dictionary that contains only numerical data.

In [18]:
apples = {'A': 14,
          'B': 18,
          'C': 2,
          'D': 17,
          'E': 12,
          'F': 18,
          'G': 16,
          'H': 9,
          'I': 7,
          'J': 2,
          'K': 18,
          'L': 20,
          'M': 13,
          'N': 15,
          'O': 21}

apples.values()

dict_values([18, 18, 2, 20, 14, 17, 13, 18, 21, 15, 2, 9, 7, 12, 16])

Isolating the values allows us to perform operations that we would not be able to do on the dictionary itself. In this case, we can find the maximum or minimum of our dictionary's values.

In [19]:
max(apples.values())

21

In [20]:
min(apples.values())

2
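
`max()` and `min()` on `.values()` tell us the extreme values but not which keys they belong to. One common way to recover the key is to pass the dictionary's `.get` method as the `key` argument. This is a minimal sketch using `basket`, a hypothetical subset of the apples data.

```python
# Hypothetical subset of the apples dictionary
basket = {'A': 14, 'B': 18, 'C': 2, 'L': 20, 'O': 21}

# Iterating over a dictionary yields its keys; key=basket.get ranks
# those keys by their associated values
largest_key = max(basket, key=basket.get)
smallest_key = min(basket, key=basket.get)

print(largest_key, basket[largest_key])
print(smallest_key, basket[smallest_key])
```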

We can also use the `.items()` method to retrieve both the keys and values from a dictionary as a list-like object of `(key, value)` tuples.

In [21]:
mclovin.items()

dict_items([('HT', '5-10'), ('EYES', 'BRO'), ('DOB', '06/03/1981'), ('HAIR', 'BRO'), ('WT', 150), ('SEX', 'M'), ('EXP', '06/03/2008'), ('CLASS', 3), ('ISSUE DATE', '06/18/1998'), ('CTY', 'O')])

`.items()` is typically used in conjunction with <font color=green><b>for</b></font> loops to iterate over dictionaries and perform some action.

In [22]:
for key, value in mclovin.items():
    
    print(key, '==>', value)

HT ==> 5-10
EYES ==> BRO
DOB ==> 06/03/1981
HAIR ==> BRO
WT ==> 150
SEX ==> M
EXP ==> 06/03/2008
CLASS ==> 3
ISSUE DATE ==> 06/18/1998
CTY ==> O


The for loop above uses multiple assignment to assign the key and value to separate variables.
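
As a small illustration of multiple assignment on its own, the sketch below unpacks a single two-item tuple and then repeats the same pattern inside a loop. The names `pair` and `collected` are made up for this example.

```python
pair = ('WT', 150)

# Unpack the two-item tuple into two names in a single statement
key, value = pair

# The for loop does the same thing for every tuple produced by .items()
collected = []
for k, v in {'WT': 150, 'SEX': 'M'}.items():
    collected.append(k + '=' + str(v))

print(key, value)
print(collected)
```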

#### 7.4.1.1. Nested Dictionaries

So far, all our dictionaries' values have been individual pieces of data, like strings or integers. However, values can be any data type or data structure. 

Here's McLovin's driver's license structured a little differently:

In [23]:
mclovin_nested = {'ADDRESS': {'CITY': 'HONOLULU', 'STREET': '892 MOMONA ST', 'ZIP': '96820'},
                  'DRIVER': {'DOB': '06/03/1981',
                             'EYES': 'BRO',
                             'HAIR': 'BRO',
                             'HT': '5-10',
                             'NAME': 'McLOVIN',
                             'SEX': 'M',
                             'WT': 150},
                  'LICENSE': [['ISSUE DATE', '06/18/1998'],
                              ['EXP', '06/03/2008'],
                              ['CLASS', 3],
                              ['RESTR', ''],
                              ['ENDORSE', ''],
                              ['ISSUING STATE', 'Hawaii'],
                              ['NUMBER', '01-47-87441'],
                              ['CTY', 'O']]
                  }

How many key-value pairs does the above dictionary have? We can find out using the `len()` function.

In [24]:
len(mclovin_nested)

3

What is an advantage of using nested data structures? While they are slightly more complicated to build, nested dictionaries can organize our information better. Nested dictionaries are great at capturing hierarchical data, where there are multiple tiers of classification. Above, we pooled the driver's license data into three subgroups. Thus we have three top-level keys in our dictionary. Each key, however, holds another structure rather than a single value.
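
To reach a value inside a nested structure, chain one set of brackets per level: a key for each dictionary and an index for each list. The sketch below uses `nested`, a shortened, hypothetical copy of the nested license data.

```python
# Shortened, hypothetical copy of the nested license dictionary
nested = {
    'ADDRESS': {'CITY': 'HONOLULU', 'ZIP': '96820'},
    'DRIVER': {'NAME': 'McLOVIN', 'WT': 150},
    'LICENSE': [['ISSUE DATE', '06/18/1998'], ['CLASS', 3]],
}

# Dictionary inside a dictionary: key, then key
city = nested['ADDRESS']['CITY']

# List of lists inside a dictionary: key, then index, then index
first_license_pair = nested['LICENSE'][0]
issue_date = nested['LICENSE'][0][1]

# A list of two-item lists can be converted to a dictionary for lookup by label
license_lookup = dict(nested['LICENSE'])

print(city, issue_date, license_lookup['CLASS'])
```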

Large dictionaries are not very readable when printed to the screen.

In [25]:
print(mclovin_nested)

{'LICENSE': [['ISSUE DATE', '06/18/1998'], ['EXP', '06/03/2008'], ['CLASS', 3], ['RESTR', ''], ['ENDORSE', ''], ['ISSUING STATE', 'Hawaii'], ['NUMBER', '01-47-87441'], ['CTY', 'O']], 'ADDRESS': {'STREET': '892 MOMONA ST', 'CITY': 'HONOLULU', 'ZIP': '96820'}, 'DRIVER': {'HT': '5-10', 'EYES': 'BRO', 'DOB': '06/03/1981', 'NAME': 'McLOVIN', 'WT': 150, 'HAIR': 'BRO', 'SEX': 'M'}}


A common workaround for this is to use the Pretty Print Library. This library's `pprint` function indents our data so that it is easier to read.

In [26]:
import pprint

`pprint` is the name of both the library and the function, so we need to call it as we do below. This is similar to `glob.glob()`.

In [27]:
pprint.pprint(mclovin_nested)

{'ADDRESS': {'CITY': 'HONOLULU', 'STREET': '892 MOMONA ST', 'ZIP': '96820'},
 'DRIVER': {'DOB': '06/03/1981',
            'EYES': 'BRO',
            'HAIR': 'BRO',
            'HT': '5-10',
            'NAME': 'McLOVIN',
            'SEX': 'M',
            'WT': 150},
 'LICENSE': [['ISSUE DATE', '06/18/1998'],
             ['EXP', '06/03/2008'],
             ['CLASS', 3],
             ['RESTR', ''],
             ['ENDORSE', ''],
             ['ISSUING STATE', 'Hawaii'],
             ['NUMBER', '01-47-87441'],
             ['CTY', 'O']]}


<hr> <a id='JSON'></a>
### 7.4.2. JavaScript Object Notation (JSON)
<hr>
<a id='JSONReferences'></a>
#### References:
* [Python 3.4: JSON Library](https://docs.python.org/3.4/library/json.html)
* [JSON.org: Introducing JSON](http://json.org/)
* [The Hitchhiker's Guide to JSON](http://docs.python-guide.org/en/latest/scenarios/json/)

<hr>

#### 7.4.2.1. Introducing JSON

JSON is a lightweight, language-independent data format that is:
* Easy for humans to read and write
* Easy for machines to parse and generate

JSON is easy for people to read and write because it requires less complicated syntax than other formats. Yet it is structured in a particular way, allowing computers to parse it reliably.

Given its roots in JavaScript, JSON has slightly different data types than Python. However, Python dictionaries and lists closely resemble JSON objects and arrays, so the two share a very similar syntax.

***
#### 7.4.2.2. JSON Basic Data Types

| Data Type | Description                                                                                 |
|:---------:|---------------------------------------------------------------------------------------------|
|Number| A signed decimal number that may contain a fractional part and may use exponential E notation, <br>but cannot include non-numbers like `NaN`. The format makes no distinction between integer and floating-point.  |
|String| A sequence of zero or more Unicode characters. Strings are delimited with double quotes and <br>support a backslash escaping syntax. |
|Boolean| Either of the values `true` or `false` |
|Array| An ordered list of zero or more values, each of which may be of any type. Arrays use square bracket <br>notation with elements being comma-separated. |
|Object| An unordered collection of name/value pairs where the names (also called keys) are strings. Objects <br>are delimited with curly brackets and use commas to separate each pair, while within each pair the colon `:` <br>character separates the key or name from its value. |
|null| An empty value, using the word `null`. |

***
#### 7.4.2.3. Comparing JSON Strings and Python Dictionaries 
In order to grasp these differences, let's compare a JSON string with a Python dictionary equivalent of the given JSON string.

JSON String Example:
```python
"""
{
    "FirstName": "John",
    "LastName": "Brown",
    "StudentNumber": "123456",
    "Registered": true,
    "CoursesTaken": ["Computer Science", "Networks", "Python"],
    "Personal": {"Height": "6ft", "Weight": "180lb", "HairColor": "black"},
    "Others": null
}
"""
```

Python Dictionary Example:
```python
{
    'FirstName':'John', 
    'LastName':'Brown', 
    'StudentNumber':'123456', 
    'Registered': True, 
    'CoursesTaken':['Computer Science', 'Networks','Python'],
    'Personal':{'Height': '6ft', 'Weight': '180lb', 'HairColor':'black'}, 
    'Others': None
}

```
While these data types are similar, they are not quite identical. JSON strings and Python dictionaries share a number of similarities and differences:

| Characteristic |JSON   | Python Dictionaries    |
| :----| :-----:|:----------------:|
| Uses key-value pairs to store data| ✔️ | ✔️|
| Can be complex in nature (nested) | ✔️ | ✔️|
| Order matters when comparing      | ✔️ | ✖️|
| Stored in memory as               |String| Dictionary|
| Strings wrapped in                |Double Quotes | Single or Double Quotes|
| Keyword for "nothingness"         |`null`|`None`|


Because Python and JSON share a similar structure, we can easily convert data back and forth between Python data structures and JSON-formatted strings. Let's quickly go over the JSON Library.

***
#### 7.4.2.4. Converting a JSON-Formatted String or File into a Dictionary
To convert a JSON-formatted string or file into a Python data structure use the following:
```python    
json.loads(json_formatted_string)    ## Converts a JSON string to a Python dictionary ##

json.load(file_obj_with_json)        ## Converts a JSON file to a Python dictionary ##
```
To convert a Python data structure into a JSON-formatted string or file use the following:
```python
json.dumps(my_dictionary)            ## Converts a Python dictionary to a JSON string ##

json.dump(my_dictionary, file_obj)   ## Converts a Python dictionary to a JSON file ##
```

NOTE: `json.loads()` and `json.dumps()` are inverse functions, as are `json.load()` and `json.dump()`. Using `json.load()` and `json.dump()` turns a three-step process (when reading: 1. Open file; 2. Read string; 3. Convert string to dictionary) into a two-step process (1. Open file; 2. Load dictionary). Of course, you're free to use whichever method you prefer in your own scripts.
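
The two reading approaches described in the note can be compared side by side. This is a minimal sketch: `record` and the temporary file `record.json` are made up for the illustration, and the file is written to a temporary directory so that nothing is left behind.

```python
import json
import os
import tempfile

record = {'name': 'Cassia', 'tenure': 1}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'record.json')

    # Write the dictionary to a file as a JSON-formatted string
    with open(path, 'w') as f:
        json.dump(record, f)

    # Three steps: open the file, read the string, convert the string
    with open(path) as f:
        text = f.read()
    from_string = json.loads(text)

    # Two steps: open the file, load the dictionary
    with open(path) as f:
        from_file = json.load(f)

print(from_string == from_file)
```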

<img src="./GRAPHICS/JSON.png" width="45%">

For additional details about the JSON Library, check out some of these [references](#JSONReferences). In the code cell below, let's output `mclovin` to remind ourselves what it looks like.

In [28]:
mclovin

{'CLASS': 3,
 'CTY': 'O',
 'DOB': '06/03/1981',
 'EXP': '06/03/2008',
 'EYES': 'BRO',
 'HAIR': 'BRO',
 'HT': '5-10',
 'ISSUE DATE': '06/18/1998',
 'SEX': 'M',
 'WT': 150}

Now let's convert `mclovin` into a JSON-formatted string and print the output. But first we need to import the JSON Library.

In [29]:
import json

We use `.dumps()` to convert the `mclovin` dictionary into a string. Can you see the differences and similarities between the two outputs?

In [30]:
mclovin_string = json.dumps(mclovin)
mclovin_string

'{"HT": "5-10", "EYES": "BRO", "DOB": "06/03/1981", "HAIR": "BRO", "WT": 150, "SEX": "M", "EXP": "06/03/2008", "CLASS": 3, "ISSUE DATE": "06/18/1998", "CTY": "O"}'

We just converted a Python dictionary to a JSON-formatted string.  How would we do the opposite? Use the `.loads()` method.

In [31]:
json_final_dict = json.loads(mclovin_string)
json_final_dict

{'CLASS': 3,
 'CTY': 'O',
 'DOB': '06/03/1981',
 'EXP': '06/03/2008',
 'EYES': 'BRO',
 'HAIR': 'BRO',
 'HT': '5-10',
 'ISSUE DATE': '06/18/1998',
 'SEX': 'M',
 'WT': 150}
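
The same conversion also translates JSON's `true`, `false`, and `null` into Python's `True`, `False`, and `None`, and JSON arrays into Python lists. The sketch below shows this with `student_json`, a short string invented for the example.

```python
import json

student_json = '{"Registered": true, "Others": null, "CoursesTaken": ["Python"], "Credits": 3.5}'

student = json.loads(student_json)

# JSON true -> Python True, JSON null -> Python None
registered = student['Registered']
others = student['Others']

# JSON array -> Python list, JSON number -> Python int or float
courses = student['CoursesTaken']
credits = student['Credits']

# Converting back produces JSON keywords again
back_to_json = json.dumps(student)

print(registered, others, courses, credits)
print(back_to_json)
```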

To convert a Python dictionary into a JSON-formatted file (that is, a text file that contains a JSON-formatted string), use `json.dump()`. It takes two arguments, so a call will look something like this:

```python
json.dump(my_dict, file_obj)
```
Arguments:
* `my_dict` can be any Python dictionary
* `file_obj` can be any file open in `'w'` mode

Most commonly, we'll use the `with open()` syntax to create a file object to write to. Let's create a new dictionary to write to a file as a JSON string.

In [32]:
employees = {'0001': {'name': 'Ludovicus', 'tenure': 7, 'title': 'manager'},
             '0002': {'name': 'Emeric', 'tenure': 2, 'title': 'associate'},
             '0003': {'name': 'Anacletus', 'tenure': 10, 'title': 'senior manager'},
             '0004': {'name': 'Omondi', 'tenure': 5, 'title': 'senior associate'},
             '0005': {'name': 'Augustina', 'tenure': 8, 'title': 'manager'},
             '0006': {'name': 'Cassia', 'tenure': 1, 'title': 'intern'},
             '0007': {'name': 'Murna', 'tenure': 14, 'title': 'vice president'}
            }

With this data stored in memory as a dictionary, we can write it to a file with `.dump()`.

In [33]:
with open('employee_file.json', 'w') as f:
    json.dump(employees, f)

To do the opposite and convert a JSON-formatted file into a Python dictionary, use `.load()`. This is also most commonly used in a `with open()` block. Below, we load the data we just wrote to a file back into memory, converting it on the fly from a JSON string to a Python dictionary.

In [34]:
with open('employee_file.json') as f:
    new_dict = json.load(f)

Let's inspect our data structure to make sure it worked.

In [35]:
new_dict

{'0001': {'name': 'Ludovicus', 'tenure': 7, 'title': 'manager'},
 '0002': {'name': 'Emeric', 'tenure': 2, 'title': 'associate'},
 '0003': {'name': 'Anacletus', 'tenure': 10, 'title': 'senior manager'},
 '0004': {'name': 'Omondi', 'tenure': 5, 'title': 'senior associate'},
 '0005': {'name': 'Augustina', 'tenure': 8, 'title': 'manager'},
 '0006': {'name': 'Cassia', 'tenure': 1, 'title': 'intern'},
 '0007': {'name': 'Murna', 'tenure': 14, 'title': 'vice president'}}

With the four methods we just introduced, `.dumps()`, `.loads()`, `.dump()`, and `.load()`, you have the ability to read and write JSON-formatted data from and to files. This is a great way to share data among and across teams, as many technologies and coding languages have built-in functionality for interpreting JSON.
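
When a JSON string is meant to be read by people as well as programs, `json.dumps()` accepts the optional `indent` and `sort_keys` arguments to produce a more readable layout. The sketch below uses `staff`, a hypothetical two-entry version of the employee data.

```python
import json

# Hypothetical two-entry version of the employees dictionary
staff = {'0002': {'name': 'Emeric', 'tenure': 2},
         '0001': {'name': 'Ludovicus', 'tenure': 7}}

# indent=2 puts each key on its own line; sort_keys=True orders keys alphabetically
pretty = json.dumps(staff, indent=2, sort_keys=True)

print(pretty)
```

The same arguments can be passed to `json.dump()` when writing to a file. The extra whitespace does not change the data: loading the string back gives an equal dictionary.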

<hr> <a id='GeoJSON'></a>
### 7.4.3. GeoJSON
<hr>
#### References:
* [Wikipedia: GeoJSON](https://en.wikipedia.org/wiki/GeoJSON)
* [USGS Earthquake Hazards Program: All Earthquakes in the Last Hour](https://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/all_hour.geojson)

<hr>
#### 7.4.3.1. Introducing GeoJSON
A specific application of the JSON format that may be of some interest for this class is GeoJSON, which is an open standard format designed for representing simple geographical features, along with their non-spatial attributes.

GeoJSON is a subset of JSON formatting, which means that while all GeoJSON strings follow JSON formatting, not all JSON strings follow GeoJSON formatting (for example, none of the JSON strings that we have shown above fit GeoJSON formatting).

<img src="./GRAPHICS/subset.png" width="30%">

When we create a new GeoJSON object, it must always build from the following basic template:

In [36]:
earthquakes = {
  'type': 'FeatureCollection',
  'features': []
}

The above GeoJSON does not contain any meaningful data; it's just the scaffold that all valid GeoJSON must start from. To spruce up our data structure, we can add an element to the features list. For our purposes, this element will be another JSON object (think Python dictionary) with keys `'type'`, `'geometry'`, and `'properties'`.

In [37]:
earthquakes = {
  'type': 'FeatureCollection',
  'features': [
              { 
              'type': 'Feature',
              'geometry': {
                           'type': 'Point',
                           'coordinates': [125.6, 10.1]
                          },
              'properties': {
                             'name': 'Dinagat Islands'
                            }
              }
              ]
}    

Let's deconstruct the feature object we added to our GeoJSON.

* `'type'` will always be the string `'Feature'`.
* `'geometry'` will always be another object.
    * `'type'` will be a string corresponding to one of the geometry types listed below
    * `'coordinates'` will be a list of numerical coordinates defining the shape of the geometry<br><br>
* `'properties'` will be another object with user defined keys and values. This is where you can add in some interesting custom data.

Most key-value pairs in GeoJSON have an assumed value or list of valid values. Additional complexity and customization comes from adding more features to our features list and adding more content to the `properties` key.
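
Adding more features can be sketched as follows. `make_point_feature` is a hypothetical helper written for this illustration, not part of any library, and the Honolulu coordinates are approximate. Note that GeoJSON lists longitude before latitude.

```python
import json


def make_point_feature(name, longitude, latitude):
    """Hypothetical helper: build one Point feature as a Python dictionary."""
    return {
        'type': 'Feature',
        'geometry': {'type': 'Point', 'coordinates': [longitude, latitude]},
        'properties': {'name': name},
    }


# Start from the empty template, then append one feature per location
collection = {'type': 'FeatureCollection', 'features': []}
collection['features'].append(make_point_feature('Dinagat Islands', 125.6, 10.1))
collection['features'].append(make_point_feature('Honolulu', -157.86, 21.31))  # approximate

# Convert the finished structure to a JSON-formatted string
geojson_text = json.dumps(collection)

print(len(collection['features']))
print(geojson_text)
```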

***
#### 7.4.3.2. Types of Geometries Using GeoJSON 
Let's take a look at the types of geometries we can define using GeoJSON. GeoJSON supports the following geometry types:

| Type of Geometry | Example | | Type of Geometry | Example |
|:-----------------:|:----------------:|:-:|:-----------------:|:----------------:|
|Point |<img src="./GRAPHICS/51px_SFA_Point.png" width="100%">| | LineString|<img src="./GRAPHICS/51px_SFA_LineString.png" width="100%">|
|Polygon |<img src="./GRAPHICS/SFA_Polygon_Svg.png" width="100%">| | MultiPoint |<img src="./GRAPHICS/51px_SFA_MultiPoint.png" width="100%">|
| MultiLineString |<img src="./GRAPHICS/51px_SFA_MultiLine.png" width="100%">| |MultiPolygon |<img src="./GRAPHICS/SFA_MultiPolygon.png" width="100%">|

***
#### 7.4.3.3. Playing with GeoJSON
With GeoJSON, we can capture individual points of interest, borders, 2D areas, and much more. To see an example, complete the steps below.

Step 1. Load a sample of USGS data from a local file.

In [38]:
with open('data/USGS_Earthquake_Data_Single.geojson') as f:
     quake_data = f.read()

Step 2. Highlight and copy the `quake_data` text.  

In [39]:
print(quake_data)

{"type":"FeatureCollection","metadata":{"generated":1516122464000,"url":"https://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/all_hour.geojson","title":"USGS All Earthquakes, Past Hour","status":200,"api":"1.5.8","count":1},"features":[{"type":"Feature","properties":{"mag":1.5,"place":"9km WSW of North Nenana, Alaska","time":1516119891807,"updated":1516120348239,"tz":-540,"url":"https://earthquake.usgs.gov/earthquakes/eventpage/ak18070078","detail":"https://earthquake.usgs.gov/earthquakes/feed/v1.0/detail/ak18070078.geojson","felt":null,"cdi":null,"mmi":null,"alert":null,"status":"automatic","tsunami":0,"sig":35,"net":"ak","code":"18070078","ids":",ak18070078,","sources":",ak,","types":",geoserve,origin,","nst":null,"dmin":null,"rms":0.65,"gap":null,"magType":"ml","type":"earthquake","title":"M 1.5 - 9km WSW of North Nenana, Alaska"},"geometry":{"type":"Point","coordinates":[-149.3078,64.5602,8.6]},"id":"ak18070078"}],"bbox":[-155.4211731,19.2338333,2.85,-89.6829987,64.5602,64.22]

Step 3. Go to [GeoJSON.io](http://geojson.io).

Step 4. Paste the copied text into the white pane on the right side of the screen. 

<a id='Guided-Exercise'></a>
<hr>
## 7.5. Guided Exercise: JSON and Dictionaries

<b><font color = "brick">Instructor Guidance</font>: Refer back to Lesson 1 and relate the four steps of problem-solving using Computational Thinking (Decomposition, Pattern Recognition, Abstraction, & Algorithm Design) as appropriate throughout these exercises.</b>

Now that we've learned more about JSON and dictionaries, let's cover how to incorporate them into your analytic workflows. We'll retrieve USGS data from the files at the following locations:

```python        
    'data/USGS_Earthquake_Data_Single.geojson'
    'data/USGS_Earthquake_Data_Multiple.geojson'
```

We first open `USGS_Earthquake_Data_Single.geojson`, which has data about a single earthquake, and convert its JSON string into a dictionary. Then we'll explore the dictionary's keys and values.

In [40]:
with open('data/USGS_Earthquake_Data_Single.geojson') as f:
    USGS_dict_single = json.load(f)

USGS_dict_single

{'bbox': [-155.4211731, 19.2338333, 2.85, -89.6829987, 64.5602, 64.22],
 'features': [{'geometry': {'coordinates': [-149.3078, 64.5602, 8.6],
    'type': 'Point'},
   'id': 'ak18070078',
   'properties': {'alert': None,
    'cdi': None,
    'code': '18070078',
    'detail': 'https://earthquake.usgs.gov/earthquakes/feed/v1.0/detail/ak18070078.geojson',
    'dmin': None,
    'felt': None,
    'gap': None,
    'ids': ',ak18070078,',
    'mag': 1.5,
    'magType': 'ml',
    'mmi': None,
    'net': 'ak',
    'nst': None,
    'place': '9km WSW of North Nenana, Alaska',
    'rms': 0.65,
    'sig': 35,
    'sources': ',ak,',
    'status': 'automatic',
    'time': 1516119891807,
    'title': 'M 1.5 - 9km WSW of North Nenana, Alaska',
    'tsunami': 0,
    'type': 'earthquake',
    'types': ',geoserve,origin,',
    'tz': -540,
    'updated': 1516120348239,
    'url': 'https://earthquake.usgs.gov/earthquakes/eventpage/ak18070078'},
   'type': 'Feature'}],
 'metadata': {'api': '1.5.8',
  'count': 

Remember that you can use `pprint.pprint()` if you find it helpful for exploring your data.

In [41]:
import pprint

pprint.pprint(USGS_dict_single)

{'bbox': [-155.4211731, 19.2338333, 2.85, -89.6829987, 64.5602, 64.22],
 'features': [{'geometry': {'coordinates': [-149.3078, 64.5602, 8.6],
                            'type': 'Point'},
               'id': 'ak18070078',
               'properties': {'alert': None,
                              'cdi': None,
                              'code': '18070078',
                              'detail': 'https://earthquake.usgs.gov/earthquakes/feed/v1.0/detail/ak18070078.geojson',
                              'dmin': None,
                              'felt': None,
                              'gap': None,
                              'ids': ',ak18070078,',
                              'mag': 1.5,
                              'magType': 'ml',
                              'mmi': None,
                              'net': 'ak',
                              'nst': None,
                              'place': '9km WSW of North Nenana, Alaska',
                              'rms': 0.65,
 

With large data structures, it's often more helpful to print the keys than the entire structure at once.

In [42]:
USGS_dict_single.keys()

dict_keys(['type', 'features', 'metadata', 'bbox'])
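When the key names alone don't say enough, one option is to print each top-level key next to the type and size of its value. The sketch below is self-contained: `sample_dict` is a hand-trimmed stand-in for `USGS_dict_single` (an assumption for illustration, not the full file), and `describe()` is a hypothetical helper name.

```python
# Hand-trimmed stand-in for USGS_dict_single (illustration only).
sample_dict = {
    'type': 'FeatureCollection',
    'metadata': {'count': 1, 'title': 'USGS All Earthquakes, Past Hour'},
    'features': [{'type': 'Feature', 'id': 'ak18070078'}],
    'bbox': [-155.4211731, 19.2338333, 2.85, -89.6829987, 64.5602, 64.22],
}


def describe(value):
    """Return the type name, plus the number of items for dicts and lists."""
    if isinstance(value, (dict, list)):
        return f'{type(value).__name__} with {len(value)} item(s)'
    return type(value).__name__


for key, value in sample_dict.items():
    print(key, '==>', describe(value))
```

Seeing that `features` is a list and `metadata` is a dictionary tells you whether the next step is an index (`[0]`) or another key.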

To pull out a value from the dictionary, put the key name inside square brackets after the dictionary's name. Let's explore the `features` key, where most of the interesting data lives.

In [43]:
USGS_dict_single['features']

[{'geometry': {'coordinates': [-149.3078, 64.5602, 8.6], 'type': 'Point'},
  'id': 'ak18070078',
  'properties': {'alert': None,
   'cdi': None,
   'code': '18070078',
   'detail': 'https://earthquake.usgs.gov/earthquakes/feed/v1.0/detail/ak18070078.geojson',
   'dmin': None,
   'felt': None,
   'gap': None,
   'ids': ',ak18070078,',
   'mag': 1.5,
   'magType': 'ml',
   'mmi': None,
   'net': 'ak',
   'nst': None,
   'place': '9km WSW of North Nenana, Alaska',
   'rms': 0.65,
   'sig': 35,
   'sources': ',ak,',
   'status': 'automatic',
   'time': 1516119891807,
   'title': 'M 1.5 - 9km WSW of North Nenana, Alaska',
   'tsunami': 0,
   'type': 'earthquake',
   'types': ',geoserve,origin,',
   'tz': -540,
   'updated': 1516120348239,
   'url': 'https://earthquake.usgs.gov/earthquakes/eventpage/ak18070078'},
  'type': 'Feature'}]
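Because `features` is a list of dictionaries, and each of those holds further dictionaries, reaching a single value means chaining a list index with one or more keys. The sketch below uses `sample_dict`, a hand-trimmed stand-in for `USGS_dict_single` built from the values shown above (an assumption for illustration, not the full file).

```python
# Hand-trimmed stand-in for USGS_dict_single (illustration only).
sample_dict = {
    'features': [
        {
            'id': 'ak18070078',
            'geometry': {'type': 'Point',
                         'coordinates': [-149.3078, 64.5602, 8.6]},
            'properties': {'mag': 1.5,
                           'place': '9km WSW of North Nenana, Alaska'},
        }
    ]
}

first_quake = sample_dict['features'][0]       # list index -> one feature dictionary
magnitude = first_quake['properties']['mag']   # key, then key
place = first_quake['properties']['place']

# USGS lists coordinates as longitude, latitude, depth.
longitude, latitude, depth = first_quake['geometry']['coordinates']

print(place, magnitude)
print(longitude, latitude, depth)
```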

Let's also explore the `metadata` key.

In [44]:
USGS_dict_single['metadata']

{'api': '1.5.8',
 'count': 1,
 'generated': 1516122464000,
 'status': 200,
 'title': 'USGS All Earthquakes, Past Hour',
 'url': 'https://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/all_hour.geojson'}

Next, we open `USGS_Earthquake_Data_Multiple.geojson`, which contains data on multiple earthquakes. The earthquake data is stored in `features` as a list of dictionaries, one per earthquake. We will need to loop through the `features` list to access each earthquake's details.

In [45]:
with open('data/USGS_Earthquake_Data_Multiple.geojson') as f:
    USGS_dict_multiple = json.load(f)

USGS_dict_multiple

{'bbox': [-155.4211731, 19.2338333, 2.85, -89.6829987, 64.5602, 64.22],
 'features': [{'geometry': {'coordinates': [-155.4211731, 19.2338333, 32.01],
    'type': 'Point'},
   'id': 'hv70015412',
   'properties': {'alert': None,
    'cdi': None,
    'code': '70015412',
    'detail': 'https://earthquake.usgs.gov/earthquakes/feed/v1.0/detail/hv70015412.geojson',
    'dmin': 0.02266,
    'felt': None,
    'gap': 143,
    'ids': ',hv70015412,',
    'mag': 1.85,
    'magType': 'md',
    'mmi': None,
    'net': 'hv',
    'nst': 21,
    'place': '6km ENE of Pahala, Hawaii',
    'rms': 0.15,
    'sig': 53,
    'sources': ',hv,',
    'status': 'automatic',
    'time': 1516121936210,
    'title': 'M 1.9 - 6km ENE of Pahala, Hawaii',
    'tsunami': 0,
    'type': 'earthquake',
    'types': ',geoserve,origin,phase-data,',
    'tz': -600,
    'updated': 1516122134300,
    'url': 'https://earthquake.usgs.gov/earthquakes/eventpage/hv70015412'},
   'type': 'Feature'},
  {'geometry': {'coordinates': [-8

There is a lot of data here, so rather than read it all at once, it's smarter to dissect the larger structure, understand how it is organized, and then dive further into the data as necessary.

In [46]:
USGS_dict_multiple.keys()

dict_keys(['type', 'features', 'metadata', 'bbox'])

Seeing the top-level keys is helpful, but sometimes we want to see the values too. We can use creative `print()` statements to accomplish this in a readable way.

In [47]:
for key, value in USGS_dict_multiple.items():    
    print(key, '==>', value)
    print('-' * 30)

type ==> FeatureCollection
------------------------------
features ==> [{'geometry': {'type': 'Point', 'coordinates': [-155.4211731, 19.2338333, 32.01]}, 'type': 'Feature', 'id': 'hv70015412', 'properties': {'code': '70015412', 'tz': -600, 'tsunami': 0, 'sig': 53, 'magType': 'md', 'felt': None, 'place': '6km ENE of Pahala, Hawaii', 'mag': 1.85, 'time': 1516121936210, 'alert': None, 'cdi': None, 'sources': ',hv,', 'updated': 1516122134300, 'net': 'hv', 'rms': 0.15, 'title': 'M 1.9 - 6km ENE of Pahala, Hawaii', 'types': ',geoserve,origin,phase-data,', 'gap': 143, 'nst': 21, 'dmin': 0.02266, 'detail': 'https://earthquake.usgs.gov/earthquakes/feed/v1.0/detail/hv70015412.geojson', 'type': 'earthquake', 'ids': ',hv70015412,', 'status': 'automatic', 'mmi': None, 'url': 'https://earthquake.usgs.gov/earthquakes/eventpage/hv70015412'}}, {'geometry': {'type': 'Point', 'coordinates': [-89.6829987, 36.1686668, 12.52]}, 'type': 'Feature', 'id': 'nm60186242', 'properties': {'code': '60186242', 'tz': 

The `features` value is still hard to read because of its large amount of data. However, we know that the value here is always a list (or, in JSON parlance, an array). This means we can loop through this value and print each individual piece. Let's try that.

In [48]:
for feature in USGS_dict_multiple['features']:
    print(feature)
    print('-' * 30)

{'geometry': {'type': 'Point', 'coordinates': [-155.4211731, 19.2338333, 32.01]}, 'type': 'Feature', 'id': 'hv70015412', 'properties': {'code': '70015412', 'tz': -600, 'tsunami': 0, 'sig': 53, 'magType': 'md', 'felt': None, 'place': '6km ENE of Pahala, Hawaii', 'mag': 1.85, 'time': 1516121936210, 'alert': None, 'cdi': None, 'sources': ',hv,', 'updated': 1516122134300, 'net': 'hv', 'rms': 0.15, 'title': 'M 1.9 - 6km ENE of Pahala, Hawaii', 'types': ',geoserve,origin,phase-data,', 'gap': 143, 'nst': 21, 'dmin': 0.02266, 'detail': 'https://earthquake.usgs.gov/earthquakes/feed/v1.0/detail/hv70015412.geojson', 'type': 'earthquake', 'ids': ',hv70015412,', 'status': 'automatic', 'mmi': None, 'url': 'https://earthquake.usgs.gov/earthquakes/eventpage/hv70015412'}}
------------------------------
{'geometry': {'type': 'Point', 'coordinates': [-89.6829987, 36.1686668, 12.52]}, 'type': 'Feature', 'id': 'nm60186242', 'properties': {'code': '60186242', 'tz': -360, 'tsunami': 0, 'sig': 196, 'magType':

There is still a lot here to digest. Each feature is itself a nested data structure, so the next step is to reach inside each one and pull out only the values we need.
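One way to make each feature readable is to pull out only a few fields per earthquake instead of printing the whole dictionary. The sketch below is self-contained: `sample_dict` is a hand-built stand-in that reuses two earthquakes shown earlier in this lesson (an assumption for illustration; it is not the contents of the multiple-earthquake file), and `summaries` is a hypothetical variable name.

```python
# Hand-built stand-in with two earthquakes from earlier outputs (illustration only).
sample_dict = {
    'features': [
        {'id': 'hv70015412',
         'properties': {'mag': 1.85,
                        'place': '6km ENE of Pahala, Hawaii',
                        'time': 1516121936210}},
        {'id': 'ak18070078',
         'properties': {'mag': 1.5,
                        'place': '9km WSW of North Nenana, Alaska',
                        'time': 1516119891807}},
    ]
}

summaries = []
for feature in sample_dict['features']:
    properties = feature['properties']   # the nested dictionary with the details
    summaries.append((feature['id'], properties['mag'], properties['place']))

for quake_id, mag, place in summaries:
    print(f'{quake_id}: M {mag} at {place}')
```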

<a id='Practical-Exercises'></a>
<hr>
## 7.6. Practical Exercises

<b><font color="brick">Instructor Guidance</font>: Refer back to Lesson 1 and relate the four steps of problem-solving using Computational Thinking (Decomposition, Pattern Recognition, Abstraction, & Algorithm Design) as appropriate throughout these exercises.

<font color="brick">Instructor Guidance</font>: If time is limited, have students complete Practical Exercises 1, 2, 3, and 5 first; these carry the most training value because of their content or because later exercises build on their results. Ensure you go over the exercise solutions and (as necessary) the processes to arrive at the solutions with the students.

<font color="brick">Instructor Guidance</font>: Follow-up questions are designed to be asked by the facilitators individually as each student completes the task and has it looked at by a facilitator.</b>

<hr>
#### References:
* [USGS Real-time Notifications, Feeds, and Web Services](https://earthquake.usgs.gov/earthquakes/feed/)
* [USGS GeoJSON Detail Format](https://earthquake.usgs.gov/earthquakes/feed/v1.0/geojson_detail.php)
<hr>

***
<a id='PE1'></a>
### 7.6.1. Practical Exercise 1: Read a JSON File

Read in the file at the location below and store it into memory. The data object you create should be a dictionary.

```python
'data/all_hour.geojson'
```

HINT: You've learned two different ways to turn JSON into dictionaries in this lesson.

In [49]:
## YOUR CODE GOES HERE ##

In [50]:
## INSTRUCTION SOLUTION(S) ##
with open('data/all_hour.geojson') as f:
    USGS_dict = json.load(f)
    
USGS_dict

{'bbox': [-151.3175, 38.7966652, 2.75, -122.6373367, 62.3983, 88.3],
 'features': [{'geometry': {'coordinates': [-146.6432, 61.4872, 31.7],
    'type': 'Point'},
   'id': 'ak18070801',
   'properties': {'alert': None,
    'cdi': None,
    'code': '18070801',
    'detail': 'https://earthquake.usgs.gov/earthquakes/feed/v1.0/detail/ak18070801.geojson',
    'dmin': None,
    'felt': None,
    'gap': None,
    'ids': ',ak18070801,',
    'mag': 1.9,
    'magType': 'ml',
    'mmi': None,
    'net': 'ak',
    'nst': None,
    'place': '42km NNW of Valdez, Alaska',
    'rms': 0.28,
    'sig': 56,
    'sources': ',ak,',
    'status': 'automatic',
    'time': 1516125572570,
    'title': 'M 1.9 - 42km NNW of Valdez, Alaska',
    'tsunami': 0,
    'type': 'earthquake',
    'types': ',geoserve,origin,',
    'tz': -540,
    'updated': 1516125733258,
    'url': 'https://earthquake.usgs.gov/earthquakes/eventpage/ak18070801'},
   'type': 'Feature'},
  {'geometry': {'coordinates': [-122.7825012, 38.79666

Use the dataset (above) to answer the remaining practical exercises.

***
<a id='PE2'></a>
### 7.6.2. Practical Exercise 2: How Many Earthquakes?
How many earthquakes are in the dataset?

HINT: Look at the value of the `'metadata'` key, or count the number of items in the list stored under the `'features'` key.

In [51]:
## YOUR CODE GOES HERE ##

In [52]:
## INSTRUCTION SOLUTION(S) ##

USGS_dict['metadata']['count'] 

5
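The hint also mentions counting the features directly, which works as a cross-check on the reported count. A minimal sketch, assuming a hand-built stand-in named `sample_dict` rather than the real `all_hour.geojson` data:

```python
# Hand-built stand-in (illustration only); the real file reports its own count.
sample_dict = {
    'metadata': {'count': 2},
    'features': [{'id': 'hv70015412'}, {'id': 'ak18070078'}],
}

reported = sample_dict['metadata']['count']   # what the feed says it contains
counted = len(sample_dict['features'])        # what is actually in the list

print(reported, counted, reported == counted)
```

If the two numbers ever disagree, the length of the `features` list is the one that matches what your loops will actually process.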

<hr> <a id='PE3'></a>
### 7.6.3. Practical Exercise 3: Magnitude of the Strongest Earthquake

What is the magnitude of the strongest earthquake? 

HINT: Within the `'features'` key, look at `'properties'`.

In [53]:
## YOUR CODE GOES HERE ##

In [54]:
## INSTRUCTION SOLUTION(S) ##

magnitudes = []

for feature in USGS_dict['features']:
    
    magnitudes.append(feature['properties']['mag'])

max(magnitudes)

1.9

<hr> <a id='PE4'></a>
### 7.6.4. Practical Exercise 4: Strongest Earthquake
What is the location and timestamp of the strongest earthquake?

In [55]:
## YOUR CODE GOES HERE ##

In [56]:
## INSTRUCTION SOLUTION(S) ##
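# Uses the `magnitudes` list built in Practical Exercise 3.
strongest_mag = max(magnitudes)  # compute once, not on every pass through the loop

for feature in USGS_dict['features']:
    
    if feature['properties']['mag'] == strongest_mag:
        
        print(feature['properties']['place'])
        print(feature['properties']['time'])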

42km NNW of Valdez, Alaska
1516125572570
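An alternative to building a separate list is to hand `max()` a `key` function, so that it returns the whole feature with the largest magnitude. This is a sketch on a hand-built stand-in named `sample_dict` (an assumption for illustration, not the `all_hour.geojson` data), so its output differs from the solution above.

```python
# Hand-built stand-in with two earthquakes from earlier outputs (illustration only).
sample_dict = {
    'features': [
        {'id': 'ak18070078',
         'properties': {'mag': 1.5,
                        'place': '9km WSW of North Nenana, Alaska',
                        'time': 1516119891807}},
        {'id': 'hv70015412',
         'properties': {'mag': 1.85,
                        'place': '6km ENE of Pahala, Hawaii',
                        'time': 1516121936210}},
    ]
}

# key= tells max() which value to compare; max() returns the whole feature.
strongest = max(sample_dict['features'],
                key=lambda feature: feature['properties']['mag'])

print(strongest['properties']['place'])
print(strongest['properties']['time'])
```

Note the trade-off: if two earthquakes tie for the largest magnitude, `max()` returns only the first one, whereas the loop-based solution prints every match.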


<hr> <a id='PE5'></a>
### 7.6.5. Practical Exercise 5: Most Recent Earthquake

Where was the most recent earthquake? When did it happen? 

HINT: The most recent time will have the largest `'time'` value.

In [57]:
## YOUR CODE GOES HERE ##

In [58]:
## INSTRUCTION SOLUTION(S) ##
times = []

for feature in USGS_dict['features']:
    
    times.append(feature['properties']['time'])

for feature in USGS_dict['features']:
    
    if feature['properties']['time'] == max(times):
        
        print(feature['properties']['place'])
        print(feature['properties']['time'])

42km NNW of Valdez, Alaska
1516125572570


In [59]:
## INSTRUCTION SOLUTION(S) (ALTERNATE) ##

from datetime import datetime

newest = int(max(times)) / 1000

dt = datetime.fromtimestamp(newest).strftime('%Y-%m-%d %H:%M:%S')
dt

'2018-01-16 12:59:32'
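Called without a time zone, `datetime.fromtimestamp()` converts to the local time zone of the computer running the notebook, so the string above can differ from machine to machine. A minimal sketch of a time-zone-aware conversion to UTC, using the `time` value printed in the previous solution (`newest_ms` and `newest_utc` are hypothetical variable names):

```python
from datetime import datetime, timezone

newest_ms = 1516125572570   # 'time' value in milliseconds since the Unix epoch

# Passing tz=timezone.utc makes the result independent of the local machine.
newest_utc = datetime.fromtimestamp(newest_ms / 1000, tz=timezone.utc)

print(newest_utc.strftime('%Y-%m-%d %H:%M:%S %Z'))
```

This prints `2018-01-16 17:59:32 UTC`, which is five hours ahead of the local-time result shown above.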

<hr> <a id='PE6'></a>
### 7.6.6. Practical Exercise 6: Write Earthquake Dictionary

Write out our earthquake dictionary as a JSON string to a local file. If you don't get any errors, use the File Explorer on your computer to ensure the file is where you expect.

In [60]:
## YOUR CODE GOES HERE ##

In [61]:
## INSTRUCTION SOLUTION(S) ##
with open('data/all_hour_out.geojson', 'w') as f:
    json.dump(USGS_dict, f)  # USGS_dict is the dictionary loaded in Practical Exercise 1
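Besides checking the File Explorer, you can confirm the write from Python by reading the file back and comparing it with the original dictionary. The sketch below is self-contained and makes several assumptions for illustration: `sample_dict` is a hand-trimmed stand-in, the file goes to a temporary folder rather than `data/`, and `indent=2` is an optional argument that makes the saved file easier to read.

```python
import json
import os
import tempfile

# Hand-trimmed stand-in for the earthquake dictionary (illustration only).
sample_dict = {
    'type': 'FeatureCollection',
    'metadata': {'count': 1},
    'features': [{'id': 'ak18070078',
                  'properties': {'mag': 1.5, 'alert': None}}],
}

# Write to a temporary folder so the sketch does not touch the lesson's data files.
out_path = os.path.join(tempfile.mkdtemp(), 'sample_out.geojson')

with open(out_path, 'w') as f:
    json.dump(sample_dict, f, indent=2)   # indent=2 pretty-prints the JSON text

with open(out_path) as f:
    reloaded = json.load(f)

print(os.path.exists(out_path))
print(reloaded == sample_dict)
```

Both lines print `True` when the file was written and reads back unchanged; Python's `None` is stored as JSON `null` and comes back as `None`.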

<b>Follow-up Questions:</b>
1. What data structures did you create and use to answer the questions? Why?
2. What built-in functions or methods did you use to answer the questions? 
3. Which data structures gave you the most difficulty? Why?

<a id='Administrative'></a>
<hr>
## 7.7. Administrative Notes
* Save your Lesson 7 .ipynb file to your H drive

<b><center>UNCLASSIFIED</center></b>