### JSON

- A JSON object is a collection of keys and values
- We read and write JSON in Python using the built-in `json` module
- JSON stands for "JavaScript Object Notation" because the format comes from the syntax used to represent JavaScript objects
- In this class we will stick to Python (no JavaScript, though it's handy to learn at some point)
- One good way to learn about JSON is just to get some practice with it

In [1]:
import json

# You can write json in Python like this 
some_json = {"name": "Sir John", "species": "dog"}

serialized = json.dumps(some_json)  # turn it into a string like this, dumps = "dump string"

restored_from_string = json.loads(serialized)  # parse the string back again, loads = "load string"

In [2]:
some_json['name'] # you access fields like this

'Sir John'
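
As a small sketch of what `json.dumps` produces, the cell below serializes a dictionary that holds a few other Python types. The `example` record and its fields are made up for illustration; they are not part of the pets data.

```python
import json

# A hypothetical record (not from the lesson data) with several value types
example = {"name": "Sir John", "age": 3, "good_boy": True, "owner": None}

text = json.dumps(example)
print(text)

# Parsing the text gives back an equal Python object
print(json.loads(text) == example)
```

Notice that Python's `True` and `None` are written as `true` and `null` in the JSON text.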

### Questions

1. What are the types of `some_json`, `serialized` and `restored_from_string`? Why does that make sense? 

2. What happens if you print out `some_json`, `serialized` and `restored_from_string` in the terminal? What do you observe?

3. Try changing the value of the "species" field in `restored_from_string`. Would you expect the value in `some_json` to change? What do you observe?

4. What does Sir John look like? Find a picture on the Internet and fill the cell below by changing the URL.

![title](https://www.thelabradorsite.com/wp-content/uploads/2017/04/black-lab-enthusiasm.jpg)

In [3]:
### Loop over the keys in JSON like this
for key in some_json:  # 3.7+ Python will iterate based on insertion order 
    print("**")        # https://mail.python.org/pipermail/python-dev/2017-December/151283.html
    print("key =", key)
    print("value =", some_json[key])

**
key = name
value = Sir John
**
key = species
value = dog


In [4]:
### Loop over the keys and values in JSON with .items()
for key, value in some_json.items():
    print(key, "is", value)

name is Sir John
species is dog


In [5]:
### JSON format allows lists of JSON objects

pets = [{"name": "Sir John", "species": "dog"}, 
        {"name": "Blender", "species": "cat"},
        {"name": "harry", "species": "fish"}]

## TODO:
# loop over all the pets in the list and print out their name and species

### Serialization

- When you have a list of JSON objects, you can store them in a file with one object per line
- This format is called jsonl (JSON Lines). It is a common format for storing information
- Turning data into a form that can be written to disk (or sent over a network) is called "serialization"
- You already know and love the csv serialization format
- Think of jsonl as a serialization format, just like csv

In [16]:
pets = [{"name": "Sir John", "species": "dog"}, 
        {"name": "George", "species": "cat"},
        {"name": "Harry", "species": "fish"}]

with open("/tmp/pets.jsonl", "w") as of:
    for pet in pets:
        print(json.dumps(pet)) # dumps = dump string
        of.write(json.dumps(pet) + "\n") # write the pet on a new line in our output file

{"name": "Sir John", "species": "dog"}
{"name": "George", "species": "cat"}
{"name": "Harry", "species": "fish"}


In [17]:
! cat /tmp/pets.jsonl   # This prints the contents of the file we saved at /tmp/pets.jsonl.
                        # If you are on a PC this file location might not work for you. You might have to change it.

{"name": "Sir John", "species": "dog"}
{"name": "George", "species": "cat"}
{"name": "Harry", "species": "fish"}
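
Reading a jsonl file back is the reverse of writing it: parse each line with `json.loads`. The sketch below writes to a temporary location (the path comes from `tempfile`, an assumption made so the example runs on any operating system) and then reads the pets back into a list.

```python
import json
import os
import tempfile

pets = [{"name": "Sir John", "species": "dog"},
        {"name": "George", "species": "cat"},
        {"name": "Harry", "species": "fish"}]

# Hypothetical location: a fresh temporary directory instead of /tmp
path = os.path.join(tempfile.mkdtemp(), "pets.jsonl")

with open(path, "w") as of:
    for pet in pets:
        of.write(json.dumps(pet) + "\n")

# Read the file back: one JSON object per non-empty line
with open(path) as f:
    restored_pets = [json.loads(line) for line in f if line.strip()]

print(restored_pets == pets)
```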


### JSON allows for nesting 

- JSON values can themselves be JSON objects
- This is sometimes called "nesting" because one JSON object is nested inside a key of another

In [8]:
bowl = {"size": "3 gallons", "material": "glass", "name":"bowl"}
bed = {"size": "6 feet", "material": "soft polyester", "name":"bed"}
palace = {"size": "50000 feet", "material": "gold", "name":"palace"}

In [9]:
print(pets[0])
pets[0]["habitat"] = bed
print(pets[0])
pets[0]["habitat"] = palace
print(pets[0])

{'name': 'Sir John', 'species': 'dog'}
{'name': 'Sir John', 'species': 'dog', 'habitat': {'size': '6 feet', 'material': 'soft polyester', 'name': 'bed'}}
{'name': 'Sir John', 'species': 'dog', 'habitat': {'size': '50000 feet', 'material': 'gold', 'name': 'palace'}}


#### Question 

- What happened in the cell above? Why does the habitat field change? 

In [20]:
pets[0]['habitat'] = palace
pets[2]["habitat"] = bowl
pets[1]['habitat'] = bed
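
Nested JSON can be hard to read on one line. As a sketch, `json.dumps` accepts an `indent` argument that prints each level of nesting on its own lines. The `pet` variable here is a standalone copy made for illustration.

```python
import json

palace = {"size": "50000 feet", "material": "gold", "name": "palace"}
pet = {"name": "Sir John", "species": "dog", "habitat": palace}

# indent=2 puts every key on its own line, indented by nesting depth
pretty = json.dumps(pet, indent=2)
print(pretty)

# Reach into a nested object by chaining keys
print(pet["habitat"]["material"])
```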

### Pandas for hackers

- Python dictionaries (i.e. deserialized jsonl) are surprisingly handy as a data analysis format.
- The Pandas API is often helpful, but it also does introduce dependencies.
- Sometimes you really don't need it, and it is clearer to just use native Python.
- This is the "everything as Python dictionaries" school of data analysis. It has some merits, and also clear downsides.
- You should be comfortable with different ways of doing things, and understand the tradeoffs.

In [18]:
pets

[{'name': 'Sir John', 'species': 'dog'},
 {'name': 'George', 'species': 'cat'},
 {'name': 'Harry', 'species': 'fish'}]

In [21]:
[pet for pet in pets if pet["habitat"]["name"] == "bed"]

[{'name': 'George',
  'species': 'cat',
  'habitat': {'size': '6 feet', 'material': 'soft polyester', 'name': 'bed'}}]

In [22]:
sum(1 for pet in pets if pet["habitat"]["name"] == "bed")

1
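
Counting by category also works in native Python. A minimal sketch using `collections.Counter` from the standard library to count pets per species. The fourth pet, "Rex", is a made-up addition so that one count is larger than 1.

```python
from collections import Counter

pets = [{"name": "Sir John", "species": "dog"},
        {"name": "George", "species": "cat"},
        {"name": "Harry", "species": "fish"},
        {"name": "Rex", "species": "dog"}]  # hypothetical extra pet

# Tally how many times each species appears
species_counts = Counter(pet["species"] for pet in pets)

for species, count in species_counts.items():
    print(species, "appears", count, "time(s)")
```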

### Question 

Write Python code to determine what fraction of pets live in a gold palace.

[your code here]

### Pandas

Of course, you can also do all this with Pandas, which has its own advantages. For instance, you are less likely to make mistakes when you call a built-in grouping or averaging method than when you write out the steps by hand.

In [23]:
pets = [{"name": "Sir John", "species": "dog"}, 
        {"name": "George", "species": "cat"},
        {"name": "Harry", "species": "fish"}]

import pandas as pd

df = pd.DataFrame(pets)

df

|   | name | species |
|---|------|---------|
| 0 | Sir John | dog |
| 1 | George | cat |
| 2 | Harry | fish |


In [24]:
df2 = pd.read_json("/tmp/pets.jsonl", lines=True)  # another way to do this

df2

|   | name | species |
|---|------|---------|
| 0 | Sir John | dog |
| 1 | George | cat |
| 2 | Harry | fish |
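
Pandas can also handle nested JSON. As a sketch, `pd.json_normalize` flattens nested objects into columns whose names join the keys with a dot (for example `habitat.name`). The pets below repeat the habitats from the nesting section, trimmed to two fields to keep the example short.

```python
import pandas as pd

pets = [
    {"name": "Sir John", "species": "dog",
     "habitat": {"material": "gold", "name": "palace"}},
    {"name": "George", "species": "cat",
     "habitat": {"material": "soft polyester", "name": "bed"}},
    {"name": "Harry", "species": "fish",
     "habitat": {"material": "glass", "name": "bowl"}},
]

# Flatten nested objects into columns such as "habitat.name"
flat = pd.json_normalize(pets)
print(list(flat.columns))

# The same "who lives in a bed?" filter, written with Pandas
in_bed = flat[flat["habitat.name"] == "bed"]
print(in_bed["name"].tolist())
```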


## Calling APIs

Interfaces
- In computing, an interface is something that allows two different systems to interact
- A graphical user interface allows a person to interact with a computer

APIs
- API stands for application programming interface
- It allows people or machines to talk to a computer system
- We have already seen databases, which offer programmers a SQL API
- In this unit, we will learn to call APIs across the web

Web APIs
- When you go to a website, you are already calling a sort of API
- You make a request to the server and the server returns HTML, JavaScript and CSS
- Your browser renders that code into a page that you can see
- Other kinds of API return information that is really meant for computers instead of people

### Our first web API call

- Web APIs return information for machines
- But you can still read them as a person
- I find that this is an important and useful first step when using web APIs

- Let's take a look
    - https://api.pushshift.io/reddit/search/submission/?subreddit=cuboulder

- Behold, json
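
The URL above has two parts: the endpoint, and a query string of parameters after the `?`. As a sketch, the standard library can build that URL from a dictionary of parameters. The `sample_response` string is invented for illustration and only assumes that the API sends back a JSON object; it is not real output from the service, and no network request is made here.

```python
import json
from urllib.parse import urlencode

base_url = "https://api.pushshift.io/reddit/search/submission/"
params = {"subreddit": "cuboulder"}

# urlencode turns the dictionary into "key=value" pairs joined by "&"
url = base_url + "?" + urlencode(params)
print(url)

# Invented stand-in for the text a web API sends back (not real data)
sample_response = '{"data": [{"title": "hypothetical post", "score": 1}]}'
parsed = json.loads(sample_response)
print(parsed["data"][0]["title"])
```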

## Question 

- Copy and paste the json from the site to a file on your machine and read the file into your notebook. 
    - Soon we will learn better ways of doing this 
    
- What are the fields in the JSON? 
- Sort posts by upvote ratio
- What post has the highest upvote ratio?

### More APIs

https://pushshift.io/api-parameters/