# Extracting non-tabular data

### Ingesting JSON data with pandas
When developing a data pipeline, you may have to work with non-tabular data and data sources, such as APIs or JSON files. In this exercise, we'll practice extracting data from a JSON file using pandas.

pandas has been imported as pd, and the JSON file you'll ingest is stored at the path "testing_scores.json".

In [None]:
import pandas as pd


def extract(file_path):
    # Read the JSON file into a DataFrame
    return pd.read_json(file_path, orient="records")

# Call the extract function with the appropriate path, assign to raw_testing_scores
raw_testing_scores = extract("testing_scores.json")

# Output the head of the DataFrame
print(raw_testing_scores.head())
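
To make the `orient="records"` argument more concrete, here is a minimal sketch that reads a small in-memory JSON string instead of a file. The sample records and their field names (`school`, `math_score`) are invented for illustration and are not the contents of "testing_scores.json".

```python
from io import StringIO

import pandas as pd

# Hypothetical sample data: a JSON array of objects, one object per row.
sample_json = (
    '[{"school": "A", "math_score": 650},'
    ' {"school": "B", "math_score": 610}]'
)

# pd.read_json accepts a path or a file-like object; StringIO stands in for a file here.
sample_df = pd.read_json(StringIO(sample_json), orient="records")

# Each object in the array becomes one row, and each key becomes a column.
print(sample_df.shape)
print(list(sample_df.columns))
```

With the "records" orientation, pandas expects a list of row-like objects, which is why flat JSON of this shape tends to load without any extra handling.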


### Reading JSON data into memory
When data is stored in JSON format, it's not always easy to load into a DataFrame. This is the case for the "nested_scores.json" file. Here, the data will have to be manually manipulated before it can be stored in a DataFrame.

To help get you started, pandas has been loaded into the workspace as pd.

In [None]:
def extract(file_path):
    # Read the JSON file into a DataFrame, orient by index
    return pd.read_json(file_path, orient="index")

# Call the extract function, pass in the desired file_path
raw_testing_scores = extract("nested_scores.json")
print(raw_testing_scores.head())
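
The sketch below illustrates why `orient="index"` alone may not be enough for nested JSON. The structure shown (top-level keys per school, each with a nested `scores` object) is an assumption made for illustration; the actual layout of "nested_scores.json" is not shown in this exercise.

```python
from io import StringIO

import pandas as pd

# Hypothetical nested data: each top-level key maps to an object
# that itself contains another object under "scores".
nested_json = (
    '{"school_a": {"city": "Springfield", "scores": {"math": 650, "reading": 600}},'
    ' "school_b": {"city": "Shelbyville", "scores": {"math": 610, "reading": 640}}}'
)

# With orient="index", the top-level keys become the row labels.
nested_df = pd.read_json(StringIO(nested_json), orient="index")

# The nested object is not expanded: the "scores" column holds whole dictionaries.
first_scores = nested_df["scores"].iloc[0]
print(list(nested_df.columns))
print(type(first_scores))
```

In this sketch the DataFrame loads, but the `scores` column still contains dictionaries rather than one column per score, so further manipulation would be needed.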


In [None]:
# Import the json library
import json

def extract(file_path):
    with open(file_path, "r") as json_file:
        # Load the data from the JSON file
        raw_data = json.load(json_file)
    return raw_data

raw_testing_scores = extract("nested_scores.json")

# Print the raw_testing_scores
print(raw_testing_scores)
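
Once the JSON has been loaded into a dictionary, one possible way to manipulate it manually is to loop over the entries and build flat rows. This is only a sketch: the `raw_data` dictionary, its keys, and the output column names are hypothetical stand-ins, since the real structure of "nested_scores.json" is not shown here.

```python
import pandas as pd

# Hypothetical nested dictionary, standing in for what json.load might return.
raw_data = {
    "school_a": {"city": "Springfield", "scores": {"math": 650, "reading": 600}},
    "school_b": {"city": "Shelbyville", "scores": {"math": 610, "reading": 640}},
}

rows = []
for school_id, details in raw_data.items():
    # Use .get() so a missing key yields None instead of raising a KeyError.
    scores = details.get("scores", {})
    rows.append({
        "school_id": school_id,
        "city": details.get("city"),
        "math_score": scores.get("math"),
        "reading_score": scores.get("reading"),
    })

# Build the DataFrame from the list of flat rows, indexed by school ID.
flattened = pd.DataFrame(rows).set_index("school_id")
print(flattened)
```

Using `.get()` rather than direct indexing is a design choice that tolerates incomplete entries; whether missing values should instead raise an error depends on the pipeline's requirements.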