# Taming JSON


JavaScript-based web pages are different from most HTML/CSS pages you are familiar with. JavaScript sites are dynamic because their content is generated, loaded, or modified in the browser using JavaScript, rather than having the server send all the content with the page.

The data that is served is more often than not JSON, or JavaScript Object Notation. It is a lightweight data format that is now one of the most prevalent ways that information is held in databases and web applications, and that data is often critical to our reporting. At the same time, JSON can be difficult to parse because it is so customizable that we often find it with a deeply nested (and at times, confusing) structure.

To really know how to work with JSON files, we have to understand their structure.

## Types of JSON objects

#### Large but Easy (and in a file)

<a href="https://drive.google.com/file/d/14LwQ7a9nDw8ytZHxXukJ0EO_IexI1Yr4/view?usp=share_link">Download</a>: ```guns.json```

<a href="https://drive.google.com/file/d/186JTxLB_TLVN1RGss7EyzWtFdzomJD-0/view?usp=share_link">Download</a>: ```gun-data-in-json-file.json```

<a href="https://drive.google.com/file/d/19zUdrkT08OSKD0sWFSeRG68P56tjI1yp/view?usp=share_link">Download</a>: ```countries.json```

What's the difference between these files?

#### Small but Complex (in your notebook):

In [None]:
## run this cell
[{'publication': 'Bloomberg news',
  'location': 'New York',
  'reach': 'Global',
  'info': {'editor': 'John Micklethwait',
   'contacts': {'email': {'tips': 'tips@bloomberg.net',
     'general': 'info@bloomberg.net'},
    'tel': '123456789'}}},
 {'publication': 'The New York Times',
  'location': 'New York',
  'reach': 'Global',
  'info': {'editor': 'Joseph Kahn',
   'contacts': {'email': {'tips': 'tips@nytimes.com',
     'general': 'info@nytimes.com'},
    'tel': '987654321'}}}]

#### Massive and Nested (and on a server)

<a href="https://epicovcharts.bii.a-star.edu.sg/variants-dashboard/data/variants_countries_count.json">Global COVID data</a>

#### Deeply nested sample elections data from an API:

```json
{
  "apiVersion": "3.0",
  "apiBuild": "3.0.585",
  "electionDate": "2022-06-21",
  "timestamp": "2023-04-25T13:05:18.342Z",
  "races": [
    {
      "eventID": "VA-20220621_Testday",
      "stateID": 47,
      "test": true,
      "resultsType": "test",
      "raceID": "47434",
      "raceType": "Primary",
      "raceTypeID": "R",
      "officeID": "H",
      "officeName": "U.S. House",
      "party": "GOP",
      "eevp": 0.0,
      "national": true,
      "uncontested": true,
      "seatName": "District 1",
      "seatNum": "1",
      "reportingUnits": [
        {
          "statePostal": "VA",
          "stateName": "Virginia",
          "reportingunitID": 47,
          "reportingunitLevel": 1,
          "pollClosingTime": "2022-06-21T23:00:00.000Z",
          "level": "state",
          "lastUpdated": "2022-06-21T16:43:40.962Z",
          "precinctsReporting": 0,
          "eevp": 0.0,
          "precinctsTotal": 230,
          "precinctsReportingPct": 0.0,
          "candidates": [
            {
              "first": "Rob",
              "last": "Wittman",
              "party": "GOP",
              "incumbent": true,
              "candidateID": "51892",
              "polID": "57798",
              "ballotOrder": 1,
              "polNum": "49141",
              "voteCount": 0
            }
          ]
        }
      ]
    },
    {
      "eventID": "VA-20220621_Testday",
      "stateID": 47,
      "test": true,
      "resultsType": "test",
      "raceID": "47609",
      "raceType": "Primary",
      "raceTypeID": "D",
      "officeID": "H",
      "officeName": "U.S. House",
      "party": "Dem",
      "eevp": 0.0,
      "national": true,
      "uncontested": true,
      "seatName": "District 1",
      "seatNum": "1",
      "reportingUnits": [
        {
          "statePostal": "VA",
          "stateName": "Virginia",
          "reportingunitID": 47,
          "reportingunitLevel": 1,
          "pollClosingTime": "2022-06-21T23:00:00.000Z",
          "level": "state",
          "lastUpdated": "2022-06-21T16:43:40.978Z",
          "precinctsReporting": 0,
          "eevp": 0.0,
          "precinctsTotal": 230,
          "precinctsReportingPct": 0.0,
          "candidates": [
            {
              "first": "Herb",
              "last": "Jones",
              "party": "Dem",
              "candidateID": "51903",
              "polID": "70894",
              "ballotOrder": 1,
              "polNum": "49634",
              "voteCount": 0
            }
          ]
        }
      ]
    },
    {
      "eventID": "VA-20220621_Testday",
      "stateID": 47,
      "test": true,
      "resultsType": "test",
      "raceID": "47780",
      "raceType": "Primary",
      "raceTypeID": "D",
      "officeID": "H",
      "officeName": "U.S. House",
      "party": "Dem",
      "eevp": 0.0,
      "national": true,
      "uncontested": true,
      "seatName": "District 2",
      "seatNum": "2",
      "reportingUnits": [
        {
          "statePostal": "VA",
          "stateName": "Virginia",
          "reportingunitID": 47,
          "reportingunitLevel": 1,
          "pollClosingTime": "2022-06-21T23:00:00.000Z",
          "level": "state",
          "lastUpdated": "2022-06-21T16:43:40.991Z",
          "precinctsReporting": 0,
          "eevp": 0.0,
          "precinctsTotal": 236,
          "precinctsReportingPct": 0.0,
          "candidates": [
            {
              "first": "Elaine",
              "last": "Luria",
              "party": "Dem",
              "incumbent": true,
              "candidateID": "51893",
              "polID": "66930",
              "ballotOrder": 1,
              "polNum": "49514",
              "voteCount": 0
            }
          ]
        }
      ]
    },
    {
      "eventID": "VA-20220621_Testday",
      "stateID": 47,
      "test": true,
      "resultsType": "test",
      "raceID": "47781",
      "raceType": "Primary",
      "raceTypeID": "D",
      "officeID": "H",
      "officeName": "U.S. House",
      "party": "Dem",
      "eevp": 0.0,
      "national": true,
      "uncontested": true,
      "seatName": "District 3",
      "seatNum": "3",
      "reportingUnits": [
        {
          "statePostal": "VA",
          "stateName": "Virginia",
          "reportingunitID": 47,
          "reportingunitLevel": 1,
          "pollClosingTime": "2022-06-21T23:00:00.000Z",
          "level": "state",
          "lastUpdated": "2022-06-21T16:43:41.002Z",
          "precinctsReporting": 0,
          "eevp": 0.0,
          "precinctsTotal": 201,
          "precinctsReportingPct": 0.0,
          "candidates": [
            {
              "first": "Bobby",
              "last": "Scott",
              "party": "Dem",
              "incumbent": true,
              "candidateID": "51894",
              "polID": "1451",
              "ballotOrder": 1,
              "polNum": "48182",
              "voteCount": 0
            }
          ]
        }
      ]
    }
  ]
}
```
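A response shaped like this one can often be flattened to one row per candidate with `pd.json_normalize`. The sketch below is only an illustration: `results` is a hand-trimmed copy of two races from the sample above, and the variable names are made up for this example. `record_path` walks down to the innermost list, while `meta` carries race-level fields along with every candidate row.

```python
import pandas as pd

# Hand-trimmed copy of two races from the sample above (most fields omitted)
results = {
    "electionDate": "2022-06-21",
    "races": [
        {
            "raceID": "47434",
            "seatName": "District 1",
            "reportingUnits": [
                {
                    "statePostal": "VA",
                    "candidates": [
                        {"first": "Rob", "last": "Wittman", "party": "GOP", "voteCount": 0}
                    ],
                }
            ],
        },
        {
            "raceID": "47609",
            "seatName": "District 1",
            "reportingUnits": [
                {
                    "statePostal": "VA",
                    "candidates": [
                        {"first": "Herb", "last": "Jones", "party": "Dem", "voteCount": 0}
                    ],
                }
            ],
        },
    ],
}

# record_path: the chain of keys leading to the list we want as rows
# meta: race-level fields to repeat on every candidate row
candidates_df = pd.json_normalize(
    results["races"],
    record_path=["reportingUnits", "candidates"],
    meta=["raceID", "seatName"],
)
print(candidates_df)
```

Race-level `party` is left out of `meta` on purpose: each candidate record already has a `party` key, and pandas raises an error when a meta name collides with a record field unless a prefix is supplied.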

## Reading JSON objects

In [None]:
## import libraries
import pandas as pd
import requests

### Read ```JSON``` file

#### Sometimes, we are in luck with a cleanly packaged ```JSON``` file that is not nested and plays nice.

All we need is ```pd.read_json("file_path")```
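As a minimal sketch of that call, the cell below writes a tiny flat file to a temporary folder and reads it back. The records and the file name `sample_flat.json` are invented for illustration; they only stand in for a flat file such as ```guns.json```, whose actual columns may differ.

```python
import json
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical records standing in for a flat JSON file
records = [
    {"occur_year": 2006, "boro": "The Bronx", "precinct": 41},
    {"occur_year": 2006, "boro": "Manhattan", "precinct": 34},
]

# Write the records to a temporary file so the example is self-contained
sample_path = Path(tempfile.mkdtemp()) / "sample_flat.json"
sample_path.write_text(json.dumps(records), encoding="utf-8")

# A flat list of records reads straight into a DataFrame
sample_df = pd.read_json(sample_path)
print(sample_df)
```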

In [None]:
## these files can be read right into a df


#### ... and easily exported as a csv file:

In [None]:
## export as csv file
df.to_csv("guns_nyc.csv", encoding = "UTF-8", index = False)

### What about this file:

```gun-data-in-json-file.json```

In [None]:
## READ into df


## ```json.load()``` v. ```json.loads()```

The ```json``` package has two similarly named functions that each do something quite different:

- ```.load()``` reads from an open **```json``` file** and returns a Python object (a dictionary or a list, depending on the top level of the JSON).
- ```.loads()``` parses a **```json``` string** and returns the same kind of Python object.
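The difference can be seen side by side in a short sketch. The JSON text and the temporary file below are made up for this illustration; only `json.load` and `json.loads` come from the standard library.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical JSON text used for both calls
json_text = '{"boro": "Manhattan", "precinct": 34, "solved": true}'

# json.loads() parses a string that is already in memory
from_string = json.loads(json_text)

# json.load() parses an open file object
tmp_path = Path(tempfile.mkdtemp()) / "sample.json"
tmp_path.write_text(json_text, encoding="utf-8")
with open(tmp_path, encoding="utf-8") as f:
    from_file = json.load(f)

# Both calls produce the same Python dictionary
print(type(from_string))
print(from_string == from_file)
```

A handy way to remember it: the `s` in `loads` stands for "string".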

In [None]:
## import python's json package


#### Import ```json``` file

In [None]:
## open and load json file


In [None]:
# What type of object? 


In [None]:
# What type of object?


In [None]:
# Call it again. 


In [None]:
# Turn into DataFrame 


#### Import ```json``` file

In [None]:
## open and load json file


In [None]:
## type of data


In [None]:
## call the keys


In [None]:
## get data from the key above:


In [None]:
## turn into a dataframe


In [None]:
## flatten via dictionary get
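
The steps in the cells above could look roughly like the sketch below. It assumes a file whose top level is a dictionary with a single wrapping key, here a hypothetical `"gun-data"` key borrowed from the JSON string used later in this notebook; the real file may be organized differently, so check its keys first.

```python
import json
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical content: a single "gun-data" key wrapping a list of records
wrapped = {
    "gun-data": [
        {"occur_year": "2006", "boro": "The Bronx", "precinct": 41},
        {"occur_year": "2006", "boro": "Manhattan", "precinct": 34},
    ]
}

# Save it to a temporary file so the example runs on its own
wrapped_path = Path(tempfile.mkdtemp()) / "wrapped_sample.json"
wrapped_path.write_text(json.dumps(wrapped), encoding="utf-8")

# Open the file and parse it into a Python object
with open(wrapped_path, encoding="utf-8") as f:
    wrapped_data = json.load(f)

print(type(wrapped_data))         # the top level is a dictionary
print(list(wrapped_data.keys()))  # the wrapping key(s)

# .get() returns the list stored under the key, or the default if it is missing
wrapped_records = wrapped_data.get("gun-data", [])

# A list of flat dictionaries converts directly to a DataFrame
wrapped_df = pd.DataFrame(wrapped_records)
print(wrapped_df)
```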


## Import ```json``` string

Most of the time, your scrapes will return a JSON string, not a JSON file.

In [None]:
## run this cell that holds a json string
my_json = '''
{
	"gun-data":

		[{
				"occur_year": "2006",
				"boro": "The Bronx",
				"precinct": 41,
				"statistical_murder_flag": "No",
				"vic_age_group": "25-44",
				"vic_sex": "Male",
				"vic_race": "White Hispanic",
				"solved": true
			},

			{

				"occur_year": "2006",
				"boro": "Manhattan",
				"precinct": 34,
				"statistical_murder_flag": "No",
				"vic_age_group": "18-24",
				"vic_sex": "Male",
				"vic_race": "White Hispanic",
				"solved": true

			},
			{
				"occur_year": "2006",
				"boro": "Manhattan",
				"precinct": 34,
				"statistical_murder_flag": "No",
				"vic_age_group": "18-24",
				"vic_sex": "Male",
				"vic_race": "White Hispanic",
				"solved": false
			},
			{
				"occur_year": "2006",
				"boro": "The Bronx",
				"precinct": 44,
				"statistical_murder_flag": "No",
				"vic_age_group": "18-24",
				"vic_sex": "Female",
				"vic_race": "White Hispanic",
				"solved": false
			}
		]

}

'''
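
One possible way to go from a string like this to a DataFrame is sketched below. To stay self-contained it uses a shortened, hypothetical string named `sample_json` with the same `"gun-data"` wrapper rather than the full `my_json` above.

```python
import json

import pandas as pd

# Shortened, hypothetical string with the same shape as my_json
sample_json = '''
{
    "gun-data": [
        {"occur_year": "2006", "boro": "The Bronx", "precinct": 41, "solved": true},
        {"occur_year": "2006", "boro": "Manhattan", "precinct": 34, "solved": false}
    ]
}
'''

# Before parsing, it is just text
print(type(sample_json))

# json.loads() turns the text into a Python dictionary
parsed = json.loads(sample_json)
print(type(parsed))
print(list(parsed.keys()))

# The list under the wrapping key becomes the rows of the DataFrame
string_df = pd.DataFrame(parsed["gun-data"])
print(string_df)
```

Note that JSON's lowercase `true` and `false` arrive in Python as `True` and `False`.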

In [None]:
## call my_json



In [None]:
## print my_json



In [None]:
## what type of data?


In [None]:
## load json string


In [None]:
## type of data


In [None]:
## get keys


In [None]:
## turn into a dataframe


## Writing JSON

There are not many reasons **for us** to write a list of dictionaries as a JSON object to an external file, unless we are building web interactives.

But we might want to write JSON out so that we can inspect it more easily.
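
For inspection, indenting the output usually helps. The sketch below uses made-up records and a temporary file name; `json.dumps` returns a string, while `json.dump` writes to an open file.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical list of dictionaries to write out
records = [
    {"boro": "Manhattan", "precinct": 34, "solved": True},
    {"boro": "The Bronx", "precinct": 44, "solved": False},
]

# json.dumps() returns a string; indent=2 makes the nesting easy to read
pretty = json.dumps(records, indent=2)
print(pretty)

# json.dump() writes the same thing to an open file
out_path = Path(tempfile.mkdtemp()) / "records.json"
with open(out_path, "w", encoding="utf-8") as f:
    json.dump(records, f, indent=2)

# Reading the file back gives us the original list of dictionaries
with open(out_path, encoding="utf-8") as f:
    round_trip = json.load(f)
print(round_trip == records)
```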

## Exploring nested data

In [None]:
## what type of data


In [None]:
## go in one level


In [None]:
## what are the keys



In [None]:
## pull out deeper key


In [None]:
## get value from deeper key
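
A rough sketch of this kind of level-by-level exploration, using a trimmed copy of the "Small but Complex" list from the top of this notebook. The variable names are invented for the example, and the `"fax"` key is deliberately one that does not exist in the data.

```python
# Trimmed copy of the "Small but Complex" list from earlier in this notebook
publications = [
    {
        "publication": "Bloomberg news",
        "info": {
            "editor": "John Micklethwait",
            "contacts": {
                "email": {"tips": "tips@bloomberg.net", "general": "info@bloomberg.net"},
                "tel": "123456789",
            },
        },
    },
    {
        "publication": "The New York Times",
        "info": {
            "editor": "Joseph Kahn",
            "contacts": {
                "email": {"tips": "tips@nytimes.com", "general": "info@nytimes.com"},
                "tel": "987654321",
            },
        },
    },
]

# The outermost container is a list
print(type(publications))

# Go in one level: each item in the list is a dictionary
first = publications[0]
print(list(first.keys()))

# Chain keys to reach a deeper value
tips_email = first["info"]["contacts"]["email"]["tips"]
print(tips_email)

# .get() returns a default instead of raising KeyError for a missing key
fax = first["info"]["contacts"].get("fax", "not listed")
print(fax)

# Pull the same nested value out of every item in the list
all_tips = [p["info"]["contacts"]["email"]["tips"] for p in publications]
print(all_tips)
```

Checking `type()` and `.keys()` at each level before going deeper tells us whether the next step needs an index (for a list) or a key (for a dictionary).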
