# Intro to JSON and Working With It in Python

* **JSON:** JavaScript Object Notation
* A data format built on key-value pairs
    * similar to dictionaries in Python
* key (aka 'name') is always a string
* value is a number, string, boolean, null, array, or another (nested) JSON object
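As a quick illustration of the value types listed above, here is a minimal sketch of how the standard `json` module maps each JSON type to a Python type. The `sample` string and its keys are made up for this example.

```python
import json

# Hypothetical JSON text containing one value of each JSON type
sample = (
    '{"name": "sensor", "count": 3, "ratio": 0.5, "ok": true, '
    '"missing": null, "tags": ["a", "b"], "nested": {"k": 1}}'
)

parsed = json.loads(sample)

# Record the Python type that each JSON value was converted to
python_types = {key: type(value).__name__ for key, value in parsed.items()}

for key, type_name in python_types.items():
    print(key, '->', type_name)
```

Note that JSON `true`/`false` become `True`/`False` and JSON `null` becomes `None`.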

## Import Statements

In [1]:
import json
import pandas as pd

## Example Code

In [4]:
json_arr_str = """
[
    {
      "TimestampUTC": "2020-03-24T00:27:00Z",
      "TimestampSubjectTZ": "2020-03-23T20:27:00",
      "Calories": 0.0234859050963356,
      "HR": 0.0,
      "Lux": null,
      "Steps": 0.0,
      "Wear": true,
      "x": 0,
      "y": 35,
      "z": 0,
      "AxisXCounts": 0,
      "AxisYCounts": 35,
      "AxisZCounts": 0
    },
    {
      "TimestampUTC": "2020-03-24T00:28:00Z",
      "TimestampSubjectTZ": "2020-03-23T20:28:00",
      "Calories": 0.042274629173404,
      "HR": 0.0,
      "Lux": null,
      "Steps": 0.0,
      "Wear": true,
      "x": 44,
      "y": 63,
      "z": 55,
      "AxisXCounts": 44,
      "AxisYCounts": 63,
      "AxisZCounts": 55
    },
    {
      "TimestampUTC": "2020-03-24T00:29:00Z",
      "TimestampSubjectTZ": "2020-03-23T20:29:00",
      "Calories": 0.0,
      "HR": 0.0,
      "Lux": null,
      "Steps": 0.0,
      "Wear": true,
      "x": 0,
      "y": 0,
      "z": 0,
      "AxisXCounts": 0,
      "AxisYCounts": 0,
      "AxisZCounts": 0
    },
    {
      "TimestampUTC": "2020-03-24T00:30:00Z",
      "TimestampSubjectTZ": "2020-03-23T20:30:00",
      "Calories": 0.224122637205031,
      "HR": 0.0,
      "Lux": null,
      "Steps": 0.0,
      "Wear": true,
      "x": 193,
      "y": 334,
      "z": 71,
      "AxisXCounts": 193,
      "AxisYCounts": 334,
      "AxisZCounts": 71
    },
    {
      "TimestampUTC": "2020-03-24T00:31:00Z",
      "TimestampSubjectTZ": "2020-03-23T20:31:00",
      "Calories": 0.0154335947775919,
      "HR": 0.0,
      "Lux": null,
      "Steps": 0.0,
      "Wear": true,
      "x": 30,
      "y": 23,
      "z": 0,
      "AxisXCounts": 30,
      "AxisYCounts": 23,
      "AxisZCounts": 0
    }
  ]
"""

Now we will parse the JSON string. Because the top-level JSON value is an array, `json.loads` returns a Python list (of dictionaries):

In [5]:
json_arr = json.loads(json_arr_str)
print(json_arr)

[{'TimestampUTC': '2020-03-24T00:27:00Z', 'TimestampSubjectTZ': '2020-03-23T20:27:00', 'Calories': 0.0234859050963356, 'HR': 0.0, 'Lux': None, 'Steps': 0.0, 'Wear': True, 'x': 0, 'y': 35, 'z': 0, 'AxisXCounts': 0, 'AxisYCounts': 35, 'AxisZCounts': 0}, {'TimestampUTC': '2020-03-24T00:28:00Z', 'TimestampSubjectTZ': '2020-03-23T20:28:00', 'Calories': 0.042274629173404, 'HR': 0.0, 'Lux': None, 'Steps': 0.0, 'Wear': True, 'x': 44, 'y': 63, 'z': 55, 'AxisXCounts': 44, 'AxisYCounts': 63, 'AxisZCounts': 55}, {'TimestampUTC': '2020-03-24T00:29:00Z', 'TimestampSubjectTZ': '2020-03-23T20:29:00', 'Calories': 0.0, 'HR': 0.0, 'Lux': None, 'Steps': 0.0, 'Wear': True, 'x': 0, 'y': 0, 'z': 0, 'AxisXCounts': 0, 'AxisYCounts': 0, 'AxisZCounts': 0}, {'TimestampUTC': '2020-03-24T00:30:00Z', 'TimestampSubjectTZ': '2020-03-23T20:30:00', 'Calories': 0.224122637205031, 'HR': 0.0, 'Lux': None, 'Steps': 0.0, 'Wear': True, 'x': 193, 'y': 334, 'z': 71, 'AxisXCounts': 193, 'AxisYCounts': 334, 'AxisZCounts': 71}, {'

Let's walk through each object in the array:

In [14]:
for arr_obj in json_arr:
    print(arr_obj)
    print('*'*135)

{'TimestampUTC': '2020-03-24T00:27:00Z', 'TimestampSubjectTZ': '2020-03-23T20:27:00', 'Calories': 0.0234859050963356, 'HR': 0.0, 'Lux': None, 'Steps': 0.0, 'Wear': True, 'x': 0, 'y': 35, 'z': 0, 'AxisXCounts': 0, 'AxisYCounts': 35, 'AxisZCounts': 0}
***************************************************************************************************************************************
{'TimestampUTC': '2020-03-24T00:28:00Z', 'TimestampSubjectTZ': '2020-03-23T20:28:00', 'Calories': 0.042274629173404, 'HR': 0.0, 'Lux': None, 'Steps': 0.0, 'Wear': True, 'x': 44, 'y': 63, 'z': 55, 'AxisXCounts': 44, 'AxisYCounts': 63, 'AxisZCounts': 55}
***************************************************************************************************************************************
{'TimestampUTC': '2020-03-24T00:29:00Z', 'TimestampSubjectTZ': '2020-03-23T20:29:00', 'Calories': 0.0, 'HR': 0.0, 'Lux': None, 'Steps': 0.0, 'Wear': True, 'x': 0, 'y': 0, 'z': 0, 'AxisXCounts': 0, 'AxisYCounts': 0, 'AxisZCoun

Let's grab the timestamp in the subject's timezone, along with the calories, from each object:

In [17]:
for arr_obj in json_arr:
    print(arr_obj['TimestampSubjectTZ'], ':', arr_obj['Calories'])
    print('*'*135)

2020-03-23T20:27:00 : 0.0234859050963356
***************************************************************************************************************************************
2020-03-23T20:28:00 : 0.042274629173404
***************************************************************************************************************************************
2020-03-23T20:29:00 : 0.0
***************************************************************************************************************************************
2020-03-23T20:30:00 : 0.224122637205031
***************************************************************************************************************************************
2020-03-23T20:31:00 : 0.0154335947775919
***************************************************************************************************************************************
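Since each object is a plain dictionary, the same key lookups also work inside comprehensions. The sketch below collects calories by timestamp and sums them. The `records` list is a small stand-in with the same key names as the data above, but its calorie values are made up to keep the arithmetic readable.

```python
# Stand-in records: same keys as the parsed data, made-up calorie values
records = [
    {"TimestampSubjectTZ": "2020-03-23T20:27:00", "Calories": 0.5},
    {"TimestampSubjectTZ": "2020-03-23T20:28:00", "Calories": 0.25},
]

# Map each local timestamp to its calorie value
calories_by_time = {rec["TimestampSubjectTZ"]: rec["Calories"] for rec in records}

# Add up the calories across all records
total_calories = sum(rec["Calories"] for rec in records)

print(calories_by_time)
print(total_calories)
```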


What are the types of the array and of an object inside it?

In [20]:
print(type(json_arr))
print(type(arr_obj))

<class 'list'>
<class 'dict'>


## Example Reading Data From File

Let's try loading some JSON data from a file:

In [22]:
# a with block closes the file once loading is done
with open('actigraph_data.json', 'r') as infile:
    json_arr = json.load(infile)
print(json_arr)

[{'TimestampUTC': '2020-03-24T00:27:00Z', 'TimestampSubjectTZ': '2020-03-23T20:27:00', 'Calories': 0.0234859050963356, 'HR': 0.0, 'Lux': None, 'Steps': 0.0, 'Wear': True, 'x': 0, 'y': 35, 'z': 0, 'AxisXCounts': 0, 'AxisYCounts': 35, 'AxisZCounts': 0}, {'TimestampUTC': '2020-03-24T00:28:00Z', 'TimestampSubjectTZ': '2020-03-23T20:28:00', 'Calories': 0.042274629173404, 'HR': 0.0, 'Lux': None, 'Steps': 0.0, 'Wear': True, 'x': 44, 'y': 63, 'z': 55, 'AxisXCounts': 44, 'AxisYCounts': 63, 'AxisZCounts': 55}, {'TimestampUTC': '2020-03-24T00:29:00Z', 'TimestampSubjectTZ': '2020-03-23T20:29:00', 'Calories': 0.0, 'HR': 0.0, 'Lux': None, 'Steps': 0.0, 'Wear': True, 'x': 0, 'y': 0, 'z': 0, 'AxisXCounts': 0, 'AxisYCounts': 0, 'AxisZCounts': 0}, {'TimestampUTC': '2020-03-24T00:30:00Z', 'TimestampSubjectTZ': '2020-03-23T20:30:00', 'Calories': 0.224122637205031, 'HR': 0.0, 'Lux': None, 'Steps': 0.0, 'Wear': True, 'x': 193, 'y': 334, 'z': 71, 'AxisXCounts': 193, 'AxisYCounts': 334, 'AxisZCounts': 71}, {'
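For reference, `json.loads` parses a string, while `json.load` reads from an open file object. The sketch below shows a full round trip with `json.dump` and `json.load`. It writes to a temporary directory so it can run anywhere; the file name `sample.json` and the single record are assumptions made for this example.

```python
import json
import os
import tempfile

# Hypothetical record using a few of the field names from the data above
records = [{"Steps": 0.0, "Wear": True, "Lux": None}]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "sample.json")

    # json.dump writes Python objects to a file as JSON text
    with open(path, "w") as outfile:
        json.dump(records, outfile)

    # json.load reads JSON text from a file back into Python objects
    with open(path, "r") as infile:
        loaded = json.load(infile)

print(loaded == records)
```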

## Time For Pandas...

Of course pandas can help us with all of this! Check this out:

In [24]:
json_df = pd.read_json('actigraph_data.json')
print(json_df)

               TimestampUTC  TimestampSubjectTZ  Calories  HR  Lux  Steps  \
0 2020-03-24 00:27:00+00:00 2020-03-23 20:27:00  0.023486   0  NaN      0   
1 2020-03-24 00:28:00+00:00 2020-03-23 20:28:00  0.042275   0  NaN      0   
2 2020-03-24 00:29:00+00:00 2020-03-23 20:29:00  0.000000   0  NaN      0   
3 2020-03-24 00:30:00+00:00 2020-03-23 20:30:00  0.224123   0  NaN      0   
4 2020-03-24 00:31:00+00:00 2020-03-23 20:31:00  0.015434   0  NaN      0   

   Wear    x    y   z  AxisXCounts  AxisYCounts  AxisZCounts  
0  True    0   35   0            0           35            0  
1  True   44   63  55           44           63           55  
2  True    0    0   0            0            0            0  
3  True  193  334  71          193          334           71  
4  True   30   23   0           30           23            0  
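To inspect what pandas did with each column without needing the file, here is a small sketch that reads a JSON string through `StringIO`. It uses a trimmed copy of the first two records (only four of the keys), so the column set is an assumption of this example rather than the full dataset.

```python
from io import StringIO

import pandas as pd

# Trimmed copy of the first two records, keeping four of the keys
json_str = """
[
  {"TimestampUTC": "2020-03-24T00:27:00Z", "Lux": null, "Wear": true, "y": 35},
  {"TimestampUTC": "2020-03-24T00:28:00Z", "Lux": null, "Wear": true, "y": 63}
]
"""

# read_json accepts a file path or a file-like object such as StringIO
df = pd.read_json(StringIO(json_str))

# Show the dtype chosen for each column
print(df.dtypes)

# JSON null values show up as missing values in the DataFrame
print(df["Lux"].isna().all())
```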


## A Big Example

In [25]:
thor_df = pd.read_json('thor_itunes_search.json')
print(thor_df)

   resultCount                                            results
0            6  {'wrapperType': 'track', 'kind': 'feature-movi...
1            6  {'wrapperType': 'track', 'kind': 'feature-movi...
2            6  {'wrapperType': 'track', 'kind': 'feature-movi...
3            6  {'wrapperType': 'track', 'kind': 'feature-movi...
4            6  {'wrapperType': 'track', 'kind': 'feature-movi...
5            6  {'wrapperType': 'track', 'kind': 'feature-movi...


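Pandas doesn't always parse the columns the way we want. Here the top-level JSON value is an object with `resultCount` and `results` keys, so each result ends up as a whole dictionary in a single `results` column. The `json` library lets us reach into that structure ourselves: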

In [27]:
with open('thor_itunes_search.json', 'r') as infile:
    json_obj = json.load(infile)
print(json_obj)

{'resultCount': 6, 'results': [{'wrapperType': 'track', 'kind': 'feature-movie', 'collectionId': 1349600660, 'trackId': 689209608, 'artistName': 'Kenneth Branagh', 'collectionName': 'THOR Triple Bundle', 'trackName': 'Thor', 'collectionCensoredName': 'THOR Triple Bundle', 'trackCensoredName': 'Thor', 'collectionArtistId': 410641764, 'collectionArtistViewUrl': 'https://itunes.apple.com/us/artist/buena-vista-home-entertainment-inc/410641764?uo=4', 'collectionViewUrl': 'https://itunes.apple.com/us/movie/thor/id689209608?uo=4', 'trackViewUrl': 'https://itunes.apple.com/us/movie/thor/id689209608?uo=4', 'previewUrl': 'https://video-ssl.itunes.apple.com/itunes-assets/Video127/v4/b7/b9/73/b7b97307-7783-e8af-cd41-a203f2a7de49/mzvf_4347362604600795574.640x360.h264lc.U.p.m4v', 'artworkUrl30': 'https://is5-ssl.mzstatic.com/image/thumb/Video128/v4/f2/0b/fa/f20bfa07-96dc-e39b-d8db-1c2f8aae8482/source/30x30bb.jpg', 'artworkUrl60': 'https://is5-ssl.mzstatic.com/image/thumb/Video128/v4/f2/0b/fa/f20bfa0
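Once the parsed object is available, one option is to hand just the nested `results` list to `pd.json_normalize`, which builds one column per key. This is a sketch, not part of the original notebook: the `json_obj` below is a hypothetical stand-in with made-up track names and durations, shaped like the search response above.

```python
import pandas as pd

# Hypothetical stand-in shaped like the parsed search response
json_obj = {
    "resultCount": 2,
    "results": [
        {"trackName": "Example A", "trackTimeMillis": 6000000},
        {"trackName": "Example B", "trackTimeMillis": 3000000},
    ],
}

# Flatten the list of result dictionaries into one row per result
results_df = pd.json_normalize(json_obj["results"])

# Convert milliseconds to minutes as a new column
results_df["trackTimeMins"] = results_df["trackTimeMillis"] / 1000 / 60

print(results_df)
```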

In [31]:
results_arr = json_obj['results']

for result_obj in results_arr:
    track_name = result_obj['trackName']
    # task: grab the milliseconds and convert to minutes
    track_time_millis = result_obj['trackTimeMillis']
    track_time_mins = track_time_millis/1000/60
    print(track_name, ':', track_time_mins)
    print('*'*137)

Thor : 115.71488333333333
*****************************************************************************************************************************************
Thor: Ragnarok : 130.98638333333332
*****************************************************************************************************************************************
Thor: The Dark World : 112.69241666666667
*****************************************************************************************************************************************
Thor: Tales of Asgard : 76.92800000000001
*****************************************************************************************************************************************
I Am Thor : 82.63746666666665
*****************************************************************************************************************************************
Valhalla: The Legend of Thor (Dubbed) : 105.50666666666666
***************************************************************************
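Indexing with `result_obj['trackTimeMillis']` raises a `KeyError` if a result lacks that key. Whether the search results ever omit it is an assumption here, but a defensive version of the loop could use `dict.get`. The `results_arr` below and the `millis_to_minutes` helper are hypothetical, introduced only for this sketch.

```python
def millis_to_minutes(millis):
    """Convert a duration in milliseconds to minutes."""
    return millis / 1000 / 60


# Hypothetical results; the second one has no duration
results_arr = [
    {"trackName": "Example A", "trackTimeMillis": 6_000_000},
    {"trackName": "Example B"},
]

lines = []
for result_obj in results_arr:
    track_name = result_obj["trackName"]
    # .get returns None instead of raising KeyError when the key is absent
    track_time_millis = result_obj.get("trackTimeMillis")
    if track_time_millis is None:
        lines.append(f"{track_name} : unknown")
    else:
        lines.append(f"{track_name} : {millis_to_minutes(track_time_millis):.1f} min")

for line in lines:
    print(line)
```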