## JSON: JavaScript Object Notation

![img](img/Working-With-JSON-Data-in-Python_Watermarked.webp)

(Image belongs to Real Python, not us. Source [here](https://realpython.com/python-json/).)

## Learning Goals:

You will be able to:
* Explore an unknown JSON schema together
* Access and manipulate data inside a JSON file
* Transform it into another data structure
* Practice on a new JSON file

## Why do we care about JSON?

### Great interest in JSON
![json](img/stackoverflowtrends.svg)

### JSON and Python are popular

![trends](img/google_trends.png)

## The OG reference on [JSON](https://www.json.org/)

[Here](https://www.digitalocean.com/community/tutorials/an-introduction-to-json) is another helpful reference.

### Import the `json` library

In [None]:
import json

`json` library [documentation](https://docs.python.org/3/library/json.html)

### What function do we use to load a JSON file?
**Task**: The file is called `google-maps-geocoding-results.json`.<br>
Load it and assign the result to `data`.

In [None]:
with open('google-maps-geocoding-results.json') as f:  # the with block closes the file for us
    data = json.load(f)
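
The `json` library has two similarly named parsers, and it is easy to mix them up. Here is a minimal sketch of the difference, using a made-up JSON string (`raw_text` is hypothetical, not the contents of the geocoding file) and `io.StringIO` to stand in for an open file:

```python
import io
import json

# Hypothetical JSON text standing in for a file's contents
raw_text = '{"status": "OK", "results": [{"place": "demo"}]}'

# json.loads ("load string") parses text that is already in memory
from_string = json.loads(raw_text)

# json.load reads from a file-like object; StringIO mimics an open file
from_file = json.load(io.StringIO(raw_text))

# Both routes produce the same Python object
print(from_string == from_file)
```

Use `json.load` when you have a file handle and `json.loads` when you already hold the text, for example the body of an API response.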

### Let's look at the data

In [None]:
print(data)

In [None]:
data

### Hm, that second bit looks familiar....

#### How does python classify the json object?

In [None]:
type(data)

Because we know it's a `dict`, we can now reference its `keys`, `values`, and `items`.
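
As a quick refresher, here is a small sketch of those three dictionary views. The `sample` dictionary is made up for illustration and is much smaller than the real `data` object:

```python
# Hypothetical dictionary shaped like a tiny JSON response
sample = {"status": "OK", "results": ["first", "second"]}

# keys() and values() are views; wrap them in list() to see the contents
print(list(sample.keys()))
print(list(sample.values()))

# items() yields (key, value) pairs, handy for checking each value's type
for key, value in sample.items():
    print(key, type(value).__name__)
```

Printing the type of each value is a cheap way to decide where to dig next in a nested structure.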

#### Look at the data and try to navigate it!

In [None]:
data.keys()

#### What type of item is `results`?

In [None]:
results = data['results']
results

In [None]:
type(results)

But what `type` is the first element in `results`?

In [None]:
results[0]

#### So what are the keys?
Is this where we finally find the relevant information?

In [None]:
data['results'][0].keys()

**Okay**... Check the type of `address_components`: is it a dictionary or a list?<br>
Let's target a specific value like 'Main street'

In [None]:
data['results'][0]['address_components'][1].keys()

#### Okay, so let's use a for loop to get the entire address information

In [None]:
for result in data['results']:  # each result is a dict describing one match
    for component in result['address_components']:
        print(component['long_name'])
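
The same nested loop can collect values instead of printing them. The sketch below runs on a trimmed-down, made-up stand-in for `data`; it assumes each component also carries a `types` list describing what the value is, which may not hold for every file, so check the keys in your own data first:

```python
# Hypothetical, trimmed-down stand-in for the loaded `data` object
sample = {
    "results": [
        {
            "address_components": [
                {"long_name": "100", "types": ["street_number"]},
                {"long_name": "Main Street", "types": ["route"]},
            ]
        }
    ]
}

labelled = []
for result in sample["results"]:
    for component in result["address_components"]:
        # .get avoids a KeyError if a component has no "types" key
        kinds = component.get("types", [])
        label = kinds[0] if kinds else None
        labelled.append((label, component["long_name"]))

print(labelled)
```

Pairing each value with a label makes it easier to see which part of the address you are looking at.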

#### Task:
- In what type of structure is the lat and long located?
- How would you access the lat and long of this location?
- Write the code to access the lat and long

### Analysis: 
We cannot run machine learning algorithms directly on a JSON object; the data needs to be in tabular format first.

What if we wanted to store that information in a DataFrame?

In [None]:
import pandas as pd

dfcols = ['number', 'street', 'district', 'city', 'county', 'state', 'country', 'zip']

# DataFrame.append was removed in pandas 2.0, so collect the rows first
rows = []
for result in data['results']:
    # build a fresh list per result so values do not pile up across results
    address = [component['long_name'] for component in result['address_components']]
    rows.append(address)

# assumes every result has exactly the eight components named in dfcols
df = pd.DataFrame(rows, columns=dfcols)
df.head()

#### Task:
How would you enhance the code above to add the primary latitude and longitude for this location to the data frame?

### Why are we using this difficult process? Isn't there an easier way?

There are `pd.DataFrame.from_dict` and `pd.read_json`, but if you try them, what happens?

There is also `pd.json_normalize` (in pandas versions before 1.0, `from pandas.io.json import json_normalize`), which you can run to _flatten_ nested data, but you should *always* look at your data first before transforming it.
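
To see what flattening means, here is a minimal sketch of `pd.json_normalize` on a small made-up list of nested records (the `records` data and its keys are hypothetical). It assumes pandas 1.0 or later, where the function is available at the top level:

```python
import pandas as pd

# Hypothetical nested records, not taken from the geocoding file
records = [
    {"name": "A", "info": {"size": {"width": 1.5, "height": 2.5}}},
    {"name": "B", "info": {"size": {"width": 3.0, "height": 4.0}}},
]

# Nested dictionaries become columns whose names join the keys with "."
flat = pd.json_normalize(records)
print(flat.columns.tolist())
print(flat)
```

Each level of nesting ends up in the column name, so deeply nested data can produce long column names. Lists inside the records are not expanded by default, which is one reason to look at the structure before flattening.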

## Integration:

Use the JSON file `brewerydb.json` to build a table of relevant beer information for analysis.


In [None]:
with open('brewerydb.json') as beer:  # the with block closes the file for us
    beer_data = json.load(beer)

### Assessment:

True/False: JSON files and dictionaries are the same

True/False: The `json` library loads JSON files into dictionary format in Python

APIs will most often provide data in what format:

- csv
- xml
- dataframe
- yaml
- json

**Further Reading**

[article 1](https://towardsdatascience.com/the-easy-way-to-work-with-csv-json-and-xml-in-python-5056f9325ca9)<br>
[article 2](https://towardsdatascience.com/my-love-affair-with-json-edaca39e8320)<br>
[article 3](https://medium.com/@martindrapeau/the-state-of-csv-and-json-d97d1486333)

### Some neat tools for JSON

[The JSON Validator](https://jsonlint.com)<br>
JSONLint is the free online validator and reformatter tool for JSON, a lightweight data-interchange format.

Andy has used it when he:
- created his own JSON and wanted to validate the format
- copied a JSON and wanted to make sure he copied it right
- had an ugly JSON and wanted to see it in a prettier way, like without its glasses and with its hair down
- worked with AWS, where everything is done through JSON objects: permissions, provisioning servers, and anything else
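
If you would rather not paste data into a website, the `json` library can do a rough version of the same validation and reformatting locally. This is only a sketch: `check_json` is a hypothetical helper name, not part of the library, and the error messages will differ from JSONLint's.

```python
import json


def check_json(text):
    """Return pretty-printed JSON, or an error message if text is invalid."""
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError as err:
        # lineno and colno point at where the parser gave up
        return f"Invalid JSON: {err.msg} (line {err.lineno}, column {err.colno})"
    # indent=2 adds line breaks and indentation; sort_keys orders the keys
    return json.dumps(parsed, indent=2, sort_keys=True)


print(check_json('{"b": 1, "a": [1, 2]}'))

# Single quotes are valid in a Python dict literal but not in JSON
print(check_json("{'single': 'quotes'}"))
```

The second call fails because JSON requires double quotes around strings, a common mistake when copying a Python dictionary by hand.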


[JSONView](https://chrome.google.com/webstore/detail/jsonview/gmegofmjomhknnokphhckolhcffdaihd?hl=en) - a Chrome plug-in that lets you view JSON nicely
