## Common Use Cases for JSON

Here are some common use cases for JSON.
* Read data from JSON files.
* Write data to JSON files.
* We can use either the `json` or the `pandas` module to read data from JSON files or write data to JSON files. It helps to be familiar with both approaches, as each has different use cases and capabilities.
* Read JSON-based response payloads from REST API calls. We use the `requests` module to process REST API response payloads.
* Once the payload is returned, we can use appropriate modules to process the data further.
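
As a rough illustration of the REST use case, here is a minimal sketch. The helper names `fetch_payload` and `video_ids` and the sample payload are assumptions made for this example; the payload is trimmed to the fields used, and no real endpoint is called.

```python
def fetch_payload(url):
    """Call a REST endpoint and return the decoded JSON payload."""
    # Imported here so the rest of the sketch runs without the third-party package
    import requests

    response = requests.get(url, timeout=10)
    response.raise_for_status()  # fail early on 4xx/5xx responses
    return response.json()       # decode the JSON body into Python objects


def video_ids(payload):
    """Pull the video ids out of a playlist-items style payload."""
    return [item["contentDetails"]["videoId"] for item in payload.get("items", [])]


# Hypothetical payload, trimmed to the fields used here
sample_payload = {
    "items": [
        {"contentDetails": {"videoId": "ETZJln4jtAo"}},
        {"contentDetails": {"videoId": "1OVHjHTkP3M"}},
    ]
}

ids = video_ids(sample_payload)
```

With a real endpoint, the two helpers would be chained as `video_ids(fetch_payload(url))`.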

### Read Data from JSON files

Here are the steps involved in reading data from JSON files.
* Using the `json` module
  * Create a file object using `open` in read-only mode.
  * Pass the file object to `json.load`.
  * `json.load` returns a `dict` when the top-level JSON value is an object, as it is here. We can process the data further using appropriate modules.
* Using the `pandas` module
  * Invoke `read_json` with the path of the file.
  * A Pandas Data Frame will be created.
  * We can process the data further using the rich APIs available in the `pandas` module.

In [1]:
# Using json module
import json

# The with block closes the file once the data is loaded
with open('youtube_playlist_items.json') as yt_file:
    yt_items = json.load(yt_file)

type(yt_items)

dict

In [2]:
yt_items.keys()

dict_keys(['kind', 'etag', 'nextPageToken', 'items', 'pageInfo'])

In [3]:
yt_items['items']

[{'kind': 'youtube#playlistItem',
  'etag': 'SGHDydc4dLsY2RjfXTPneb_zc_s',
  'id': 'UExmMHN3VEZoVEk4cmtINHlJZm95VEFoZUVHaldJUnRQRy5EQkE3RTJCQTJEQkFBQTcz',
  'contentDetails': {'videoId': 'ETZJln4jtAo',
   'videoPublishedAt': '2020-11-28T16:29:47Z'},
  'status': {'privacyStatus': 'public'}},
 {'kind': 'youtube#playlistItem',
  'etag': '5EFUNhJBvcwXPxO416VYQsXGzMo',
  'id': 'UExmMHN3VEZoVEk4cmtINHlJZm95VEFoZUVHaldJUnRQRy4yQzk4QTA5QjkzMTFFOEI1',
  'contentDetails': {'videoId': '1OVHjHTkP3M',
   'videoPublishedAt': '2020-11-28T16:30:12Z'},
  'status': {'privacyStatus': 'public'}},
 {'kind': 'youtube#playlistItem',
  'etag': 'TiKqB2aeYxJjMGKQ0yLMJY0vpQE',
  'id': 'UExmMHN3VEZoVEk4cmtINHlJZm95VEFoZUVHaldJUnRQRy45NDlDQUFFOThDMTAxQjUw',
  'contentDetails': {'videoId': 'qfUbPLsLQcQ',
   'videoPublishedAt': '2020-11-28T16:30:33Z'},
  'status': {'privacyStatus': 'public'}},
 {'kind': 'youtube#playlistItem',
  'etag': 'vQrJOpYdXmGJuV32kjj2xqvSByc',
  'id': 'UExmMHN3VEZoVEk4cmtINHlJZm95VEFoZUVHaldJ

In [4]:
# Further data processing (get video id and published time)
list(map(lambda rec: rec['contentDetails'], yt_items['items']))

[{'videoId': 'ETZJln4jtAo', 'videoPublishedAt': '2020-11-28T16:29:47Z'},
 {'videoId': '1OVHjHTkP3M', 'videoPublishedAt': '2020-11-28T16:30:12Z'},
 {'videoId': 'qfUbPLsLQcQ', 'videoPublishedAt': '2020-11-28T16:30:33Z'},
 {'videoId': 'rLTbhSaXhSM', 'videoPublishedAt': '2020-11-28T16:30:52Z'},
 {'videoId': 'wP7BhXrJKR8', 'videoPublishedAt': '2020-11-28T16:31:14Z'}]

In [9]:
# Using pandas module
# The YouTube items are nested inside the JSON document, so we use both json and pandas

import json
import pandas as pd

# The with block closes the file once the data is loaded
with open('youtube_playlist_items.json') as yt_file:
    yt_items = json.load(yt_file)

In [10]:
yt_items.keys()

dict_keys(['kind', 'etag', 'nextPageToken', 'items', 'pageInfo'])

In [11]:
yt_items

{'kind': 'youtube#playlistItemListResponse',
 'etag': 'lfs_qWNaczIydJ2Dlp1gmX9UTAc',
 'nextPageToken': 'CAUQAA',
 'items': [{'kind': 'youtube#playlistItem',
   'etag': 'SGHDydc4dLsY2RjfXTPneb_zc_s',
   'id': 'UExmMHN3VEZoVEk4cmtINHlJZm95VEFoZUVHaldJUnRQRy5EQkE3RTJCQTJEQkFBQTcz',
   'contentDetails': {'videoId': 'ETZJln4jtAo',
    'videoPublishedAt': '2020-11-28T16:29:47Z'},
   'status': {'privacyStatus': 'public'}},
  {'kind': 'youtube#playlistItem',
   'etag': '5EFUNhJBvcwXPxO416VYQsXGzMo',
   'id': 'UExmMHN3VEZoVEk4cmtINHlJZm95VEFoZUVHaldJUnRQRy4yQzk4QTA5QjkzMTFFOEI1',
   'contentDetails': {'videoId': '1OVHjHTkP3M',
    'videoPublishedAt': '2020-11-28T16:30:12Z'},
   'status': {'privacyStatus': 'public'}},
  {'kind': 'youtube#playlistItem',
   'etag': 'TiKqB2aeYxJjMGKQ0yLMJY0vpQE',
   'id': 'UExmMHN3VEZoVEk4cmtINHlJZm95VEFoZUVHaldJUnRQRy45NDlDQUFFOThDMTAxQjUw',
   'contentDetails': {'videoId': 'qfUbPLsLQcQ',
    'videoPublishedAt': '2020-11-28T16:30:33Z'},
   'status': {'privacyStatu

In [None]:
pd.json_normalize?

In [13]:
yt_df = pd.json_normalize(yt_items, 'items')

In [14]:
yt_df

| | kind | etag | id | contentDetails.videoId | contentDetails.videoPublishedAt | status.privacyStatus |
|---|---|---|---|---|---|---|
| 0 | youtube#playlistItem | SGHDydc4dLsY2RjfXTPneb_zc_s | UExmMHN3VEZoVEk4cmtINHlJZm95VEFoZUVHaldJUnRQRy... | ETZJln4jtAo | 2020-11-28T16:29:47Z | public |
| 1 | youtube#playlistItem | 5EFUNhJBvcwXPxO416VYQsXGzMo | UExmMHN3VEZoVEk4cmtINHlJZm95VEFoZUVHaldJUnRQRy... | 1OVHjHTkP3M | 2020-11-28T16:30:12Z | public |
| 2 | youtube#playlistItem | TiKqB2aeYxJjMGKQ0yLMJY0vpQE | UExmMHN3VEZoVEk4cmtINHlJZm95VEFoZUVHaldJUnRQRy... | qfUbPLsLQcQ | 2020-11-28T16:30:33Z | public |
| 3 | youtube#playlistItem | vQrJOpYdXmGJuV32kjj2xqvSByc | UExmMHN3VEZoVEk4cmtINHlJZm95VEFoZUVHaldJUnRQRy... | rLTbhSaXhSM | 2020-11-28T16:30:52Z | public |
| 4 | youtube#playlistItem | 2CzGUToIgqywXAr4wuPswj9MuFg | UExmMHN3VEZoVEk4cmtINHlJZm95VEFoZUVHaldJUnRQRy... | wP7BhXrJKR8 | 2020-11-28T16:31:14Z | public |


In [15]:
yt_df[['contentDetails.videoId', 'contentDetails.videoPublishedAt']]

| | contentDetails.videoId | contentDetails.videoPublishedAt |
|---|---|---|
| 0 | ETZJln4jtAo | 2020-11-28T16:29:47Z |
| 1 | 1OVHjHTkP3M | 2020-11-28T16:30:12Z |
| 2 | qfUbPLsLQcQ | 2020-11-28T16:30:33Z |
| 3 | rLTbhSaXhSM | 2020-11-28T16:30:52Z |
| 4 | wP7BhXrJKR8 | 2020-11-28T16:31:14Z |


In [17]:
# Using json to process customers data
# We have one customer per line
# We need to read the data as string then use json.loads to convert each string to dict.

import json

# The with block closes the file once the lines are read
with open('customers.json') as customers_file:
    customers_list = customers_file.read().splitlines()

# Converting the records in the file into list of dicts
# We are processing each element in customers_list
customers = list(map(json.loads, customers_list))

In [18]:
customers

[{'id': 1,
  'first_name': 'Frasco',
  'last_name': 'Necolds',
  'email': 'fnecolds0@vk.com',
  'gender': 'Male',
  'ip_address': '243.67.63.34'},
 {'id': 2,
  'first_name': 'Dulce',
  'last_name': 'Santos',
  'email': 'dsantos1@mashable.com',
  'gender': 'Female',
  'ip_address': '60.30.246.227'},
 {'id': 3,
  'first_name': 'Prissie',
  'last_name': 'Tebbett',
  'email': 'ptebbett2@infoseek.co.jp',
  'gender': 'Genderfluid',
  'ip_address': '22.21.162.56'},
 {'id': 4,
  'first_name': 'Schuyler',
  'last_name': 'Coppledike',
  'email': 'scoppledike3@gnu.org',
  'gender': 'Agender',
  'ip_address': '120.35.186.161'},
 {'id': 5,
  'first_name': 'Leopold',
  'last_name': 'Jarred',
  'email': 'ljarred4@wp.com',
  'gender': 'Agender',
  'ip_address': '30.119.34.4'},
 {'id': 6,
  'first_name': 'Joanna',
  'last_name': 'Teager',
  'email': 'jteager5@apache.org',
  'gender': 'Bigender',
  'ip_address': '245.221.176.34'},
 {'id': 7,
  'first_name': 'Lion',
  'last_name': 'Beere',
  'email': 'lb

In [19]:
# Using Pandas only to process customers data
# For customers where we have one json per line, we can use Pandas directly

import pandas as pd

customers = pd.read_json('customers.json', lines=True)

In [20]:
customers

| | id | first_name | last_name | email | gender | ip_address |
|---|---|---|---|---|---|---|
| 0 | 1 | Frasco | Necolds | fnecolds0@vk.com | Male | 243.67.63.34 |
| 1 | 2 | Dulce | Santos | dsantos1@mashable.com | Female | 60.30.246.227 |
| 2 | 3 | Prissie | Tebbett | ptebbett2@infoseek.co.jp | Genderfluid | 22.21.162.56 |
| 3 | 4 | Schuyler | Coppledike | scoppledike3@gnu.org | Agender | 120.35.186.161 |
| 4 | 5 | Leopold | Jarred | ljarred4@wp.com | Agender | 30.119.34.4 |
| 5 | 6 | Joanna | Teager | jteager5@apache.org | Bigender | 245.221.176.34 |
| 6 | 7 | Lion | Beere | lbeere6@bloomberg.com | Polygender | 105.54.139.46 |
| 7 | 8 | Marabel | Wornum | mwornum7@posterous.com | Polygender | 247.229.14.25 |
| 8 | 9 | Helenka | Mullender | hmullender8@cloudflare.com | Non-binary | 133.216.118.88 |
| 9 | 10 | Christine | Swane | cswane9@shop-pro.jp | Polygender | 86.16.210.164 |


### Write Data to JSON files

Here are the steps involved in writing data to JSON files.
* Using the `json` module
  * Make sure the `dict` object holds the processed data as required before writing to the file.
  * Create a file object using `open` in write mode.
  * Pass the `dict` and the file object to `json.dump`.
  * The `dict` will be written to the file as JSON.
* Using the `pandas` module
  * Make sure the Data Frame holds the processed data as required before writing to the file.
  * Invoke `to_json` on the Data Frame that holds the processed data, passing the path of the file.
  * The Pandas Data Frame will be written to the file as JSON.
  * We can leverage additional keyword arguments to control the behavior. For example, `orient='records'` together with `lines=True` writes the Data Frame as one JSON document per line.
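
The `json` steps above can be sketched as follows. This is a minimal illustration, not a definitive recipe. The output file names and the temporary directory are assumptions made for the example, and the records are a trimmed copy of the first two customers. It writes the data once as a single JSON document and once as one JSON document per line, then reads both files back.

```python
import json
import os
import tempfile

# Sample records, trimmed from the customers data used earlier
customers = [
    {"id": 1, "first_name": "Frasco", "last_name": "Necolds"},
    {"id": 2, "first_name": "Dulce", "last_name": "Santos"},
]

# Hypothetical output locations inside a temporary directory
out_dir = tempfile.mkdtemp()
doc_path = os.path.join(out_dir, "customers_doc.json")
lines_path = os.path.join(out_dir, "customers_lines.json")

# One JSON document holding the whole list
with open(doc_path, "w") as doc_file:
    json.dump(customers, doc_file, indent=2)

# One JSON document per line
with open(lines_path, "w") as lines_file:
    for customer in customers:
        lines_file.write(json.dumps(customer) + "\n")

# Read both files back to confirm the round trip
with open(doc_path) as doc_file:
    from_doc = json.load(doc_file)
with open(lines_path) as lines_file:
    from_lines = [json.loads(line) for line in lines_file]
```

With `pandas`, calling `to_json(path, orient='records', lines=True)` on a Data Frame produces the same one-record-per-line layout, which `pd.read_json(path, lines=True)` can read back.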