# Working with Known JSON Schemas - Lab

## Introduction
In this lab, you'll practice working with JSON files whose schema you know beforehand.

## Objectives
You will be able to:
* Use the `json` module to load and parse JSON documents
* Write data to predefined JSON schemas
* Convert JSON to a pandas DataFrame

## Reading a JSON Schema

Here's the JSON schema provided for a section of the NY Times API:
<img src="images/nytimes_movie_schema.png" width=500>

or a fully expanded view:

<img src="images/nytimes_movie_schema_detailed.png" width=500>

You can read more in the documentation [here](https://developer.nytimes.com/docs/movie-reviews-api/1/routes/reviews/%7Btype%7D.json/get).

You can see that the top-level structure is a dictionary with the keys 'status', 'copyright', 'has_more', 'num_results', and 'results'. The 'results' key holds a list of dictionaries, one per review, and some of their values, such as 'link' and 'multimedia', are themselves dictionaries.
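One way to compare a schema against an actual document is to walk the parsed object and list the type found at each key. The following is a minimal sketch: `summarize` is a hypothetical helper, and `sample` is a made-up document used only for illustration, not the real API response.

```python
def summarize(obj, indent=0):
    """Return lines listing each key and the type of its value, recursing into containers."""
    lines = []
    pad = "  " * indent
    if isinstance(obj, dict):
        for key, value in obj.items():
            lines.append(f"{pad}{key}: {type(value).__name__}")
            lines.extend(summarize(value, indent + 1))
    elif isinstance(obj, list) and obj:
        # Only the first element is inspected, assuming the list is homogeneous
        lines.append(f"{pad}[0]: {type(obj[0]).__name__}")
        lines.extend(summarize(obj[0], indent + 1))
    return lines


# Made-up document for illustration only
sample = {
    "status": "OK",
    "results": [
        {"title": "Example", "link": {"type": "article", "url": "http://example.com"}}
    ],
}

outline = summarize(sample)
print("\n".join(outline))
```

Running the same helper on a parsed API response gives a quick text outline to check against the schema diagram.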

## Loading the Data File

Start by loading the JSON file. The sample response from the API is stored in the file **ny_times_movies.json**.

In [16]:
#Your code here
import json
import pandas as pd

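# Use a context manager so the file is closed after reading;
# JSON files are normally UTF-8, so state the encoding explicitly
with open('ny_times_movies.json', encoding='utf-8') as f:
    data = json.load(f)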
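When loading a JSON file, it can help to pass the encoding explicitly. If the platform default is not UTF-8, characters such as curly quotes may come out garbled. The sketch below writes a small made-up JSON file to a temporary directory and reads it back twice, once as UTF-8 and once as Windows-1252, to show the difference. The file name and contents are assumptions for illustration.

```python
import json
import os
import tempfile

# Made-up record containing curly quotes
sample = {"headline": "Review: ‘Impulso’ Goes Backstage"}

path = os.path.join(tempfile.mkdtemp(), "sample.json")
with open(path, "w", encoding="utf-8") as f:
    json.dump(sample, f, ensure_ascii=False)  # keep the curly quotes as UTF-8 bytes

# Reading with the matching encoding round-trips the text
with open(path, encoding="utf-8") as f:
    loaded = json.load(f)

# Reading the same bytes as Windows-1252 produces garbled quotes
with open(path, encoding="cp1252") as f:
    garbled = json.load(f)

print(loaded["headline"])
print(garbled["headline"])
```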

In [17]:
print(data.keys())
print(data['results'][0].keys())

dict_keys(['status', 'copyright', 'has_more', 'num_results', 'results'])
dict_keys(['display_title', 'mpaa_rating', 'critics_pick', 'byline', 'headline', 'summary_short', 'publication_date', 'opening_date', 'date_updated', 'link', 'multimedia'])


## Loading Specific Data

Create a DataFrame from the main data container in the JSON file, listed under the 'results' key in the schema above.

In [18]:
#Your code here
df = pd.DataFrame(data['results'])
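To see what `pd.DataFrame` does with a list of dictionaries, here is a small sketch using made-up records (the titles, bylines, and URLs are placeholders). Each dictionary becomes a row and each top-level key becomes a column; nested dictionaries are kept as-is inside a single column.

```python
import pandas as pd

# Made-up records shaped loosely like the entries under 'results'
records = [
    {"display_title": "Film A", "byline": "CRITIC ONE",
     "link": {"type": "article", "url": "http://example.com/a"}},
    {"display_title": "Film B", "byline": "CRITIC TWO",
     "link": {"type": "article", "url": "http://example.com/b"}},
]

sample_df = pd.DataFrame(records)

print(sample_df.shape)                    # rows x columns
print(type(sample_df["link"].iloc[0]))    # nested values stay as dict objects
```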

In [26]:
print(df['headline'][4])
df.head(3)

Review: ‘Impulso’ Goes Backstage With a Flamenco Innovator


Unnamed: 0,byline,critics_pick,date_updated,display_title,headline,link,mpaa_rating,multimedia,opening_date,publication_date,summary_short,link_type,link_url,link_suggested_link_text,multimedia_type,multimedia_src,multimedia_width,multimedia_height
0,A.O. SCOTT,1,2018-10-17 02:44:23,Can You Ever Forgive Me,Review: Melissa McCarthy Is Criminally Good in...,"{'type': 'article', 'url': 'http://www.nytimes...",R,"{'type': 'mediumThreeByTwo210', 'src': 'https:...",2018-10-19,2018-10-16,Marielle Heller directs a true story of litera...,article,http://www.nytimes.com/2018/10/16/movies/can-y...,Read the New York Times Review of Can You Ever...,mediumThreeByTwo210,https://static01.nyt.com/images/2018/10/19/art...,210,140
1,BEN KENIGSBERG,1,2018-10-16 11:04:03,Charm City,Review: ‘Charm City’ Vividly Captures the ...,"{'type': 'article', 'url': 'http://www.nytimes...",,"{'type': 'mediumThreeByTwo210', 'src': 'https:...",2018-04-22,2018-10-16,Marilyn Ness’s documentary is dedicated to t...,article,http://www.nytimes.com/2018/10/16/movies/charm...,Read the New York Times Review of Charm City,mediumThreeByTwo210,https://static01.nyt.com/images/2018/10/17/art...,210,140
2,GLENN KENNY,1,2018-10-16 11:04:04,Horn from the Heart: The Paul Butterfield Story,Review: Paul Butterfield’s Story Is Told in ...,"{'type': 'article', 'url': 'http://www.nytimes...",,"{'type': 'mediumThreeByTwo210', 'src': 'https:...",2018-10-19,2018-10-16,A documentary explores the life of the blues m...,article,http://www.nytimes.com/2018/10/16/movies/horn-...,Read the New York Times Review of Horn from th...,mediumThreeByTwo210,https://static01.nyt.com/images/2018/10/17/art...,210,140


In [22]:
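def break_out_nested_data(dataframe, col_name):
    """Add one column per key of the dictionaries stored in col_name.

    Modifies dataframe in place and also returns it.
    """
    # Take the keys from the first row; every row is assumed to share them
    keys = dataframe[col_name].iloc[0].keys()

    for key in keys:
        new_col_name = f'{col_name}_{key}'
        dataframe[new_col_name] = dataframe[col_name].map(lambda x: x[key])

    return dataframe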

df = break_out_nested_data(df, 'link')
df = break_out_nested_data(df, 'multimedia')
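As an alternative to a hand-written helper, pandas provides `pd.json_normalize`, which flattens nested dictionaries into separate columns. The sketch below uses made-up records; passing `sep="_"` joins the parent and child keys with an underscore. Note that, unlike the helper above, the flattened result does not keep the original nested column.

```python
import pandas as pd

# Made-up records with one level of nesting
records = [
    {"display_title": "Film A", "link": {"type": "article", "url": "http://example.com/a"}},
    {"display_title": "Film B", "link": {"type": "article", "url": "http://example.com/b"}},
]

# Flatten nested keys into columns named parent_child
flat_df = pd.json_normalize(records, sep="_")

print(flat_df.columns.tolist())
print(flat_df["link_url"].tolist())
```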

## How many unique critics are there?

In [30]:
df.byline.nunique()

7
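`nunique()` returns only the number of distinct values. To see which values occur and how often, `value_counts()` is a natural companion. A small sketch with a made-up `byline` column (the rows are invented for illustration):

```python
import pandas as pd

# Made-up bylines; one critic appears twice
critics = pd.DataFrame(
    {"byline": ["A.O. SCOTT", "GLENN KENNY", "A.O. SCOTT", "BEN KENIGSBERG"]}
)

n_unique = critics["byline"].nunique()      # number of distinct critics
counts = critics["byline"].value_counts()   # reviews per critic, most frequent first

print(n_unique)
print(counts)
```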

## Create a new column for the review's url. Title the column 'review_url'

In [None]:
#Your code here
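# Pull the 'url' value out of each nested 'link' dictionary
df['review_url'] = df['link'].map(lambda link: link['url'])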

## How many results are in the file?

In [32]:
#Your code here
data['num_results']

20
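The file reports its own count under 'num_results', so a quick sanity check is to compare that value with the length of the 'results' list. The sketch below uses a made-up payload with the same key names; reading 'has_more' as a sign that further pages exist is an assumption based on the key name.

```python
# Made-up payload that mirrors the top-level key names of the sample file
payload = {
    "has_more": True,
    "num_results": 2,
    "results": [{"display_title": "Film A"}, {"display_title": "Film B"}],
}

reported = payload["num_results"]   # count stated by the document
actual = len(payload["results"])    # count of records actually present
counts_match = reported == actual

print(reported, actual, counts_match)
```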

## Summary
Well done! In this lab, you practiced extracting data from JSON files and transforming it into pandas DataFrames.