# [CPSC 222](https://github.com/GonzagaCPSC222) Intro to Data Science
[Gonzaga University](https://www.gonzaga.edu/)

[Gina Sprint](http://cs.gonzaga.edu/faculty/sprint/)

# JSON
What are our learning objectives for this lesson?
* Learn about JSON
* Parse data from a JSON object

Content used in this lesson is based upon information in the following sources:
* None to report

## JSON
* JSON (JavaScript Object Notation): a lightweight data-interchange format commonly used for passing data around on the web, with XML (eXtensible Markup Language) as a common alternative.
    * Not specific to JavaScript
    * Easy for humans to read and write
    * Easy for machines to parse and generate
* A JSON object is a collection of name/value pairs (much like a Python dictionary)
    * Names are strings
    * Values can be any of the following types
        * String
        * Number
        * Object (JSON object) 
            * Curly braces hold objects
        * Array
            * Square brackets hold arrays
        * Boolean
        * Null
* Example JSON object from https://www.w3schools.com/js/js_json_xml.asp:

```json
{"employees":[
  { "firstName":"John", "lastName":"Doe" },
  { "firstName":"Anna", "lastName":"Smith" },
  { "firstName":"Peter", "lastName":"Jones" }
]}
```

* Compare this with the same information stored in XML:

```xml
<employees>
  <employee>
    <firstName>John</firstName> <lastName>Doe</lastName>
  </employee>
  <employee>
    <firstName>Anna</firstName> <lastName>Smith</lastName>
  </employee>
  <employee>
    <firstName>Peter</firstName> <lastName>Jones</lastName>
  </employee>
</employees>
```

* Learn more about JSON with the [W3Schools JSON tutorial](https://www.w3schools.com/js/js_json_intro.asp)
    * W3Schools is also the source of the above JSON vs XML example
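
To make the list of value types above concrete, here is a minimal sketch that parses the employees example with Python's standard `json` module. The variable names (`employees_json`, `data`, `sample`) are illustrative, and the `sample` string is made up for this sketch to cover the value types the employees example does not use.

```python
import json

# The employees example from above, stored as a JSON-formatted string
employees_json = """
{"employees":[
  { "firstName":"John", "lastName":"Doe" },
  { "firstName":"Anna", "lastName":"Smith" },
  { "firstName":"Peter", "lastName":"Jones" }
]}
"""

# json.loads() parses a string (json.load() parses an open file)
data = json.loads(employees_json)
print(type(data))               # JSON object -> Python dict
print(type(data["employees"]))  # JSON array -> Python list

# each element of the array is itself a JSON object (a dict)
first_names = [employee["firstName"] for employee in data["employees"]]
print(first_names)

# hypothetical sample covering the remaining JSON value types
sample = json.loads(
    '{"name": "Anna", "age": 30, "gpa": 3.5, "enrolled": true, "advisor": null}'
)
type_names = {key: type(value).__name__ for key, value in sample.items()}
print(type_names)
```

JSON strings, numbers, booleans, and null come back as Python `str`, `int` or `float`, `bool`, and `None` values, respectively.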

## Working with JSON in Python
Here is another example JSON object:

```json
{
    "TimestampUTC": "2020-03-24T00:27:00Z",
    "TimestampSubjectTZ": "2020-03-23T20:27:00",
    "Calories": 0.0234859050963356,
    "HR": 0.0,
    "Lux": null,
    "Steps": 0.0,
    "Wear": true,
    "x": 0,
    "y": 35,
    "z": 0,
    "AxisXCounts": 0,
    "AxisYCounts": 35,
    "AxisZCounts": 0
  }
```

The above JSON object represents one minute of wearable data collected from a device called an Actigraph. The [actigraph_data.json](https://raw.githubusercontent.com/GonzagaCPSC222/U5-JSON-APIs/master/files/actigraph_data.json) file contains an array of five such JSON objects, one per minute of data.

We can open this file and load its data using the `json` module. `json.load()` returns a Python list or dictionary, depending on whether the top-level JSON value is an array or an object:

```python
import json

# the with statement closes the file automatically when the block ends
with open("files/actigraph_data.json", "r") as infile:
    json_list = json.load(infile) # returns a list in this case (see file contents)
print(json_list)
print(type(json_list))
print()

# index once into the list to get a dict
first_minute_of_data = json_list[0]
print(first_minute_of_data)
print(type(first_minute_of_data))
print()

# use a key to get a value from the JSON object dictionary
calories = first_minute_of_data["Calories"]
print(calories)
print(type(calories))
```

```
[{'TimestampUTC': '2020-03-24T00:27:00Z', 'TimestampSubjectTZ': '2020-03-23T20:27:00', 'Calories': 0.0234859050963356, 'HR': 0.0, 'Lux': None, 'Steps': 0.0, 'Wear': True, 'x': 0, 'y': 35, 'z': 0, 'AxisXCounts': 0, 'AxisYCounts': 35, 'AxisZCounts': 0}, {'TimestampUTC': '2020-03-24T00:28:00Z', 'TimestampSubjectTZ': '2020-03-23T20:28:00', 'Calories': 0.042274629173404, 'HR': 0.0, 'Lux': None, 'Steps': 0.0, 'Wear': True, 'x': 44, 'y': 63, 'z': 55, 'AxisXCounts': 44, 'AxisYCounts': 63, 'AxisZCounts': 55}, {'TimestampUTC': '2020-03-24T00:29:00Z', 'TimestampSubjectTZ': '2020-03-23T20:29:00', 'Calories': 0.0, 'HR': 0.0, 'Lux': None, 'Steps': 0.0, 'Wear': True, 'x': 0, 'y': 0, 'z': 0, 'AxisXCounts': 0, 'AxisYCounts': 0, 'AxisZCounts': 0}, {'TimestampUTC': '2020-03-24T00:30:00Z', 'TimestampSubjectTZ': '2020-03-23T20:30:00', 'Calories': 0.224122637205031, 'HR': 0.0, 'Lux': None, 'Steps': 0.0, 'Wear': True, 'x': 193, 'y': 334, 'z': 71, 'AxisXCounts': 193, 'AxisYCounts': 334, 'AxisZCounts': 71}, {'
```
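
Once the JSON array is loaded as a list of dictionaries, ordinary list and dictionary operations apply. The following is a minimal sketch, assuming the same record layout as above. So that it runs without the file, it parses two records copied from the output above (keeping only a few of their keys) from a string; the names `json_text`, `total_calories`, and `lux_values` are illustrative.

```python
import json

# two abbreviated records copied from the output above (only a few keys kept)
json_text = """
[
  {"TimestampUTC": "2020-03-24T00:27:00Z", "Calories": 0.0234859050963356, "Lux": null, "Wear": true, "y": 35},
  {"TimestampUTC": "2020-03-24T00:28:00Z", "Calories": 0.042274629173404, "Lux": null, "Wear": true, "y": 63}
]
"""
json_list = json.loads(json_text)

# sum one field across all of the records
total_calories = sum(minute["Calories"] for minute in json_list)
print(total_calories)

# JSON null becomes None, so filter it out before doing arithmetic
lux_values = [minute["Lux"] for minute in json_list if minute["Lux"] is not None]
print(len(lux_values))

# json.dumps() goes the other way: Python object -> JSON-formatted string
print(json.dumps(json_list[0], indent=4))
```

Note that `json.dumps()` writes `None` and `True` back out as `null` and `true`, so the round trip preserves the original JSON values.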

We can also open this file and load its data using the pandas `read_json()` function. This gives us a `DataFrame`:

```python
import pandas as pd

df = pd.read_json("files/actigraph_data.json")
print(df)
```

```
               TimestampUTC  TimestampSubjectTZ  Calories  HR  Lux  Steps  \
0 2020-03-24 00:27:00+00:00 2020-03-23 20:27:00  0.023486   0  NaN      0   
1 2020-03-24 00:28:00+00:00 2020-03-23 20:28:00  0.042275   0  NaN      0   
2 2020-03-24 00:29:00+00:00 2020-03-23 20:29:00  0.000000   0  NaN      0   
3 2020-03-24 00:30:00+00:00 2020-03-23 20:30:00  0.224123   0  NaN      0   
4 2020-03-24 00:31:00+00:00 2020-03-23 20:31:00  0.015434   0  NaN      0   

   Wear    x    y   z  AxisXCounts  AxisYCounts  AxisZCounts  
0  True    0   35   0            0           35            0  
1  True   44   63  55           44           63           55  
2  True    0    0   0            0            0            0  
3  True  193  334  71          193          334           71  
4  True   30   23   0           30           23            0  
```
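
The two approaches are closely related: a list of dictionaries like the one `json.load()` returns can be handed to the `DataFrame` constructor directly. The following is a minimal sketch, assuming pandas is installed; the `records` list holds two abbreviated records copied from the output above (only a few keys kept), and the variable names are illustrative.

```python
import pandas as pd

# two abbreviated records copied from the output above (only a few keys kept)
records = [
    {"Calories": 0.0234859050963356, "Lux": None, "Wear": True, "y": 35},
    {"Calories": 0.042274629173404, "Lux": None, "Wear": True, "y": 63},
]

# a list of dicts becomes one row per dict and one column per key
df = pd.DataFrame(records)
print(df.shape)

# select a column (a Series) and summarize it
total_calories = df["Calories"].sum()
print(total_calories)

# count the missing values in each column
missing_counts = df.isnull().sum()
print(missing_counts["Lux"])
```

With the data in a `DataFrame`, column-wise summaries like these replace the explicit loops needed when working with the raw list of dictionaries.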
