![jsonpng.png](attachment:jsonpng.png)

### Objectives

- JSON background
- Why use JSON
- Example of a JSON schema
- Example of how to work with JSON data

# JSON Data in Python


- JSON (JavaScript Object Notation) is a lightweight data-interchange format that is easy for humans to read and write.



- Encoding data as JSON text is referred to as serialization.



- Decoding JSON text back into data is the reverse process, deserialization.



- **Primarily used to transmit data between a server and web applications.**
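
The two terms above can be sketched as a round trip with Python's standard `json` module. The `record` dictionary is a made-up example, not data from this notebook.

```python
import json

# Hypothetical record used only for illustration.
record = {"name": "Ada", "languages": ["Python", "SQL"], "active": True}

# Serialization: Python object -> JSON text (a str).
encoded = json.dumps(record)

# Deserialization: JSON text -> Python object.
decoded = json.loads(encoded)

print(type(encoded).__name__)  # str
print(decoded == record)       # True
```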

JSON is built on two structures:

- A collection of name/value pairs. This is realized as an object, record, dictionary, hash table, keyed list, or associative array.
- An ordered list of values. This is realized as an array, vector, list, or sequence.
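
As a rough illustration of these two structures, the sketch below parses a small, hypothetical JSON string. In Python the name/value collection comes back as a `dict` and the ordered list as a `list`.

```python
import json

# Hypothetical JSON text: an object (name/value pairs) holding an array (ordered values).
text = '{"artist": "Pharrell Williams", "markets": ["AD", "AR", "AT"]}'

parsed = json.loads(text)

print(type(parsed).__name__)             # dict
print(type(parsed["markets"]).__name__)  # list
print(parsed["markets"][0])              # AD
```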

### Advantages of JSON
- Very popular data format for APIs (e.g. results from an Internet search)
- Human readable
- Each record (or document, as it is called) is self-contained: the equivalent of the column names and column values appears in every record.
- Documents do not all have to have the same structure within the same file
- Document structures can be complex and nested
- It's easy for computers to parse and generate
- Text format that is language independent


### Disadvantages of JSON
- It is more verbose than the equivalent data in CSV format
- It can be more difficult to process and display than CSV-formatted data

#### JSON keys 
- Are found to the left of the colon.
- Must be wrapped in double quotation marks, as in `"key"`, and can be any valid string.
- Should be unique within each object. Python's `json` module does not raise an error on a duplicate key; it keeps the last value.
- Can include whitespace, as in `"first name"`, but it's best to use underscores, as in `"first_name"`.
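
The key rules above can be checked with a few small, made-up strings. This is a sketch of how Python's built-in decoder behaves; other parsers may treat duplicate keys differently.

```python
import json

# Keys must use double quotes; single-quoted keys are not valid JSON.
try:
    json.loads("{'first_name': 'Ada'}")
    single_quotes_ok = True
except json.JSONDecodeError:
    single_quotes_ok = False

# A key containing a space is legal, but access is clumsier than with an underscore.
spaced = json.loads('{"first name": "Ada"}')

# Python's decoder does not reject duplicate keys; the last value wins.
duplicated = json.loads('{"id": 1, "id": 2}')

print(single_quotes_ok)      # False
print(spaced["first name"])  # Ada
print(duplicated)            # {'id': 2}
```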

#### JSON values 
- Are found to the right of the colon.

- Must be one of six data types:

 - strings
 - numbers
 - objects
 - arrays
 - Booleans (true or false)
 - null

### Example


Run the cells below, which print a JSON file and then load it into Python. Investigate the resulting `data` variable and learn all you can about the object.

In [47]:
!cat output.json; 


{"albums": {"href": "https://api.spotify.com/v1/browse/new-releases?country=SE&offset=0&limit=20", "items": [{"album_type": "single", "artists": [{"external_urls": {"spotify": "https://open.spotify.com/artist/2RdwBSPQiwcmiDo9kixcl8"}, "href": "https://api.spotify.com/v1/artists/2RdwBSPQiwcmiDo9kixcl8", "id": "2RdwBSPQiwcmiDo9kixcl8", "name": "Pharrell Williams", "type": "artist", "uri": "spotify:artist:2RdwBSPQiwcmiDo9kixcl8"}], "available_markets": ["AD", "AR", "AT", "AU", "BE", "BG", "BO", "BR", "CA", "CH", "CL", "CO", "CR", "CY", "CZ", "DE", "DK", "DO", "EC", "EE", "ES", "FI", "FR", "GB", "GR", "GT", "HK", "HN", "HU", "ID", "IE", "IS", "IT", "JP", "LI", "LT", "LU", "LV", "MC", "MT", "MX", "MY", "NI", "NL", "NO", "NZ", "PA", "PE", "PH", "PL", "PT", "PY", "SE", "SG", "SK", "SV", "TR", "TW", "US", "UY"], "external_urls": {"spotify": "https://open.spotify.com/album/5ZX4m5aVSmWQ5iHAPQpT71"}, "href": "https://api.spotify.com/v1/albums/5ZX4m5aVSmWQ5iHAPQpT71", "id": "5ZX4m5aVSmWQ5iHAPQpT71

### How do you read a JSON file in Python?
- Reading JSON data from a file is straightforward.
- `json.load()` reads text from an open file object and parses it as JSON.
- It returns the parsed data as a Python object: a dictionary when the top level of the file is a JSON object, as it is here.

In [48]:
import json

# Open the file in a context manager so it is closed after parsing.
with open('output.json') as f:
    data = json.load(f)
data


{'albums': {'href': 'https://api.spotify.com/v1/browse/new-releases?country=SE&offset=0&limit=20',
  'items': [{'album_type': 'single',
    'artists': [{'external_urls': {'spotify': 'https://open.spotify.com/artist/2RdwBSPQiwcmiDo9kixcl8'},
      'href': 'https://api.spotify.com/v1/artists/2RdwBSPQiwcmiDo9kixcl8',
      'id': '2RdwBSPQiwcmiDo9kixcl8',
      'name': 'Pharrell Williams',
      'type': 'artist',
      'uri': 'spotify:artist:2RdwBSPQiwcmiDo9kixcl8'}],
    'available_markets': ['AD',
     'AR',
     'AT',
     'AU',
     'BE',
     'BG',
     'BO',
     'BR',
     'CA',
     'CH',
     'CL',
     'CO',
     'CR',
     'CY',
     'CZ',
     'DE',
     'DK',
     'DO',
     'EC',
     'EE',
     'ES',
     'FI',
     'FR',
     'GB',
     'GR',
     'GT',
     'HK',
     'HN',
     'HU',
     'ID',
     'IE',
     'IS',
     'IT',
     'JP',
     'LI',
     'LT',
     'LU',
     'LV',
     'MC',
     'MT',
     'MX',
     'MY',
     'NI',
     'NL',
     'NO',
     'NZ',
    

In [49]:
# What type is data?


By default, the `json` module performs the following translations when decoding:


![Screen%20Shot%202019-10-08%20at%208.52.29%20AM.png](attachment:Screen%20Shot%202019-10-08%20at%208.52.29%20AM.png)
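
The default translations can also be checked directly. The sketch below decodes a hypothetical document that holds one value of each JSON type and prints the Python type each one becomes.

```python
import json

# Hypothetical document containing one value of each JSON type.
text = '''
{
  "object": {"k": "v"},
  "array": [1, 2],
  "string": "hi",
  "integer": 7,
  "real": 7.5,
  "true": true,
  "false": false,
  "null": null
}
'''

decoded = json.loads(text)

# Print each key alongside the Python type its value was decoded to.
for name, value in decoded.items():
    print(f"{name:8} -> {type(value).__name__}")
```

JSON has a single number type; the decoder returns an `int` when the number has no fractional part or exponent and a `float` otherwise.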

## Exploring JSON Schemas  

Recall that JSON files have a nested structure. The most granular level of raw data will be individual numbers (float/int) and strings. These in turn are stored in the equivalent of Python lists and dictionaries. Because lists and dictionaries can be nested inside one another, we'll start exploring by checking the type of our root object, and then map out the hierarchy of the JSON file.

In [50]:
# What are the keys


In [51]:
# What are the keys for this key
# Note we have a nested structure


At this point, things should look something like this: 

![json_diagram1.JPG](attachment:json_diagram1.JPG)

Checking the remaining data types one at a time would take a while. To simplify this, let's use a for loop:

In [52]:
for key, value in data['albums'].items():
    print(key, type(value))

href <class 'str'>
items <class 'list'>
limit <class 'int'>
next <class 'str'>
offset <class 'int'>
previous <class 'NoneType'>
total <class 'int'>


Adding this to our diagram, we now have something like this:

![json_diagram2.JPG](attachment:json_diagram2.JPG)
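
The same idea can be extended to every level of the hierarchy with a small recursive helper. This is only a sketch: `describe` and `sample` are hypothetical names, and `sample` is a trimmed-down stand-in for `data` so that the block runs on its own.

```python
def describe(node, indent=0, max_items=1):
    """Return lines outlining the types found in a decoded JSON value."""
    pad = "  " * indent
    lines = []
    if isinstance(node, dict):
        for key, value in node.items():
            lines.append(f"{pad}{key}: {type(value).__name__}")
            lines.extend(describe(value, indent + 1, max_items))
    elif isinstance(node, list):
        # Inspect only the first few elements; this assumes the list items
        # share a structure, which is common in API responses but not guaranteed.
        for item in node[:max_items]:
            lines.append(f"{pad}[item]: {type(item).__name__}")
            lines.extend(describe(item, indent + 1, max_items))
    return lines


# Hypothetical, trimmed-down stand-in for the `data` object.
sample = {
    "albums": {
        "href": "https://example.com",
        "items": [{"name": "Runnin'", "artists": [{"name": "Pharrell Williams"}]}],
        "limit": 20,
        "previous": None,
    }
}

outline = describe(sample)
print("\n".join(outline))
```

Calling `describe(data)` on the real object would print one indented line per key, which can make it easier to fill in the diagram.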

### Back to Pandas

In [53]:
import pandas as pd

In [71]:
# Each element of the 'items' list becomes one row of the DataFrame
df = pd.DataFrame(data['albums']['items'])
df.head()

| | album_type | artists | available_markets | external_urls | href | id | images | name | type | uri |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | single | [{'external_urls': {'spotify': 'https://open.s... | [AD, AR, AT, AU, BE, BG, BO, BR, CA, CH, CL, C... | {'spotify': 'https://open.spotify.com/album/5Z... | https://api.spotify.com/v1/albums/5ZX4m5aVSmWQ... | 5ZX4m5aVSmWQ5iHAPQpT71 | [{'height': 640, 'url': 'https://i.scdn.co/ima... | Runnin' | album | spotify:album:5ZX4m5aVSmWQ5iHAPQpT71 |
| 1 | single | [{'external_urls': {'spotify': 'https://open.s... | [AD, AR, AT, AU, BE, BG, BO, BR, CH, CL, CO, C... | {'spotify': 'https://open.spotify.com/album/0g... | https://api.spotify.com/v1/albums/0geTzdk2Inlq... | 0geTzdk2InlqIoB16fW9Nd | [{'height': 640, 'url': 'https://i.scdn.co/ima... | Sneakin’ | album | spotify:album:0geTzdk2InlqIoB16fW9Nd |


## Creating JSON files

In [75]:
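import json
ds_atlanta = {"course":"python", "topic":"Python JSON"}
# json.dumps returns the JSON text as a Python string
ds_atlanta_json = json.dumps(ds_atlanta)
# json.dump writes to a file; pass the dictionary itself, not the
# already-encoded string, or the output will be encoded twice
with open('new_json.json', 'w') as f:
    json.dump(ds_atlanta, f, indent=2)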
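
A quick way to check a file you have written is to read it back. The sketch below writes to a temporary directory so it leaves nothing behind; the `course.json` file name is hypothetical. It also shows a common pitfall: passing a string that `json.dumps` already produced to `json.dump` encodes the data a second time.

```python
import json
import os
import tempfile

course = {"course": "python", "topic": "Python JSON"}

# Write to a temporary directory so the example leaves no files behind.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "course.json")  # hypothetical file name

    # Pass the Python object itself to json.dump.
    with open(path, "w") as f:
        json.dump(course, f, indent=2)
    with open(path) as f:
        good = json.load(f)

    # Pitfall: dumping an already-serialized string encodes it a second time.
    with open(path, "w") as f:
        json.dump(json.dumps(course), f)
    with open(path) as f:
        double_encoded = json.load(f)

print(type(good).__name__)            # dict
print(type(double_encoded).__name__)  # str
```

In the second case the file holds one quoted JSON string rather than an object, so reading it back yields a `str` that has to be decoded again with `json.loads`.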

# Up Next: APIs!