## JSON - Introduction

* JSON stands for JavaScript Object Notation
* JSON is a text format for storing and transporting data
* JSON is "self-describing" and easy to understand

**Why Use JSON?**

* You can receive pure text from a server and use it as a JavaScript object.
* You can send a JavaScript object to a server in pure text format.
* You can work with data as JavaScript objects, with no complicated parsing and translations.

**Storing Data**

* Stored data must be in some format, and wherever you choose to store it, text is always a legal format.
* JSON makes it possible to store JavaScript objects as text.

**JSON Datatypes**

In JSON, values must be one of the following data types:

* a string
* a number
* an object (JSON object)
* an array
* a boolean
* null
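
As a rough illustration of how these types look once parsed in Python, the sketch below decodes one value of each JSON type with the standard-library `json` module and reports the resulting Python type. The `sample` string and its keys are made up for this illustration.

```python
import json

# One value of each JSON type (the keys are arbitrary, chosen for illustration).
sample = '{"s": "text", "i": 3, "f": 2.5, "o": {"k": "v"}, "a": [1, 2], "b": true, "n": null}'
parsed = json.loads(sample)

# Map each key to the name of the Python type it was decoded into.
type_names = {key: type(value).__name__ for key, value in parsed.items()}
print(type_names)
# {'s': 'str', 'i': 'int', 'f': 'float', 'o': 'dict', 'a': 'list', 'b': 'bool', 'n': 'NoneType'}
```

JSON has a single number type; Python's decoder splits it into `int` and `float` depending on how the number is written.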

JSON values cannot be one of the following data types:

* a function
* a date
* undefined
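
The same restriction shows up in Python: a `date` object has no JSON equivalent, so `json.dumps` raises `TypeError`. The sketch below shows one possible workaround, the `default` parameter, which is called for objects the encoder cannot serialize. The `record` dictionary and the choice of `str` as the converter are assumptions made for this example.

```python
import json
from datetime import date

# Hypothetical record containing a value that is not a JSON type.
record = {"name": "John Smith", "joined": date(2021, 2, 2)}

# The default encoder does not know how to serialize a date.
try:
    json.dumps(record)
    failed = False
except TypeError:
    failed = True

# One workaround: let the encoder fall back to str() for unknown objects.
encoded = json.dumps(record, default=str)

print(failed)   # True
print(encoded)  # {"name": "John Smith", "joined": "2021-02-02"}
```

Note that the date comes back as a plain string when the text is parsed again; converting it back to a `date` is up to the reader of the data.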

**Resources:**

1. https://www.w3schools.com/js/js_json_intro.asp
2. https://www.youtube.com/watch?v=9N6a-VLBa2I

### Example 01

In [1]:
# A Python string that happens to be valid JSON
people_string = '''
{
    "people":[
        {
            "name" : "John Smith",
            "phone": "615-555-7164",
            "email": ["johnsmith@bogusemail.com", "john.smith@work_place.com"],
            "has_license": false
        },
        {
            "name" : "John Doe",
            "phone": "560-555-5153",
            "email": null,
            "has_license": true
        }    
    ]
}

'''

In [2]:
# Convert the string to a Python object (the "s" in "loads" stands for "string")
import json
data = json.loads(people_string)
# https://docs.python.org/3/library/json.html
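
If the string is not valid JSON, `json.loads` raises `json.JSONDecodeError`. A minimal sketch of catching it follows; `bad_string` is a made-up input that uses single quotes, which JSON does not allow.

```python
import json

# Hypothetical malformed input: JSON requires double quotes around strings.
bad_string = "{'name': 'John Smith'}"

try:
    json.loads(bad_string)
    error_message = None
except json.JSONDecodeError as exc:
    # exc.lineno and exc.colno locate the problem in the input text.
    error_message = f"Invalid JSON at line {exc.lineno}, column {exc.colno}"

print(error_message)
```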

In [3]:
data

{'people': [{'name': 'John Smith',
   'phone': '615-555-7164',
   'email': ['johnsmith@bogusemail.com', 'john.smith@work_place.com'],
   'has_license': False},
  {'name': 'John Doe',
   'phone': '560-555-5153',
   'email': None,
   'has_license': True}]}

In [4]:
type(data)

dict

In [5]:
data['people']

[{'name': 'John Smith',
  'phone': '615-555-7164',
  'email': ['johnsmith@bogusemail.com', 'john.smith@work_place.com'],
  'has_license': False},
 {'name': 'John Doe',
  'phone': '560-555-5153',
  'email': None,
  'has_license': True}]

In [6]:
# Loop through people
for person in data['people']:
    print(person)

{'name': 'John Smith', 'phone': '615-555-7164', 'email': ['johnsmith@bogusemail.com', 'john.smith@work_place.com'], 'has_license': False}
{'name': 'John Doe', 'phone': '560-555-5153', 'email': None, 'has_license': True}


In [7]:
# Access the name of each person
for person in data['people']:
    print(person['name'])

John Smith
John Doe


In [8]:
# Delete the phone numbers from the data dictionary
for person in data['people']:
    del person['phone']

In [9]:
data

{'people': [{'name': 'John Smith',
   'email': ['johnsmith@bogusemail.com', 'john.smith@work_place.com'],
   'has_license': False},
  {'name': 'John Doe', 'email': None, 'has_license': True}]}
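
Note that `del person['phone']` raises `KeyError` when the key is missing, for example if the deletion cell is run a second time. One possible alternative is `dict.pop` with a default, sketched here on a small stand-in dictionary (the second record, which has no phone, is hypothetical).

```python
people = {
    "people": [
        {"name": "John Smith", "phone": "615-555-7164"},
        {"name": "John Doe"},  # hypothetical record with no "phone" key
    ]
}

for person in people["people"]:
    # pop() with a default removes the key if present and does nothing otherwise.
    person.pop("phone", None)

print(people)
# {'people': [{'name': 'John Smith'}, {'name': 'John Doe'}]}
```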

In [10]:
# Dump the data dictionary back to string
new_string = json.dumps(data)

In [11]:
# Note that new_string no longer contains the phone numbers
print(new_string)

{"people": [{"name": "John Smith", "email": ["johnsmith@bogusemail.com", "john.smith@work_place.com"], "has_license": false}, {"name": "John Doe", "email": null, "has_license": true}]}


In [12]:
# Dump the data dictionary back to string using indentation option
new_string = json.dumps(data, indent = 2)
print(new_string)

{
  "people": [
    {
      "name": "John Smith",
      "email": [
        "johnsmith@bogusemail.com",
        "john.smith@work_place.com"
      ],
      "has_license": false
    },
    {
      "name": "John Doe",
      "email": null,
      "has_license": true
    }
  ]
}


In [13]:
# Dump the data dictionary back to string using indentation option and sort key option
new_string = json.dumps(data, indent = 2, sort_keys = True)
print(new_string)

{
  "people": [
    {
      "email": [
        "johnsmith@bogusemail.com",
        "john.smith@work_place.com"
      ],
      "has_license": false,
      "name": "John Smith"
    },
    {
      "email": null,
      "has_license": true,
      "name": "John Doe"
    }
  ]
}
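
Where `indent` makes the output more readable, the `separators` parameter of `json.dumps` can go the other way and make it more compact. The sketch below compares the default output with a compact one; the small `data` dictionary is a stand-in for the one used above.

```python
import json

data = {"name": "John Doe", "email": None, "has_license": True}

default_text = json.dumps(data)
# separators=(item_separator, key_separator) drops the spaces after "," and ":".
compact_text = json.dumps(data, separators=(",", ":"))

print(default_text)  # {"name": "John Doe", "email": null, "has_license": true}
print(compact_text)  # {"name":"John Doe","email":null,"has_license":true}
```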


### Example 02

In [14]:
## Open the states.json file
with open('states.json') as f:
    data = json.load(f)

In [15]:
print(data)

{'states': [{'name': 'Alabama', 'abbreviation': 'AL', 'area_codes': ['205', '251', '256', '334', '938']}, {'name': 'Alaska', 'abbreviation': 'AK', 'area_codes': ['907']}, {'name': 'Arizona', 'abbreviation': 'AZ', 'area_codes': ['480', '520', '602', '623', '928']}, {'name': 'Arkansas', 'abbreviation': 'AR', 'area_codes': ['479', '501', '870']}, {'name': 'California', 'abbreviation': 'CA', 'area_codes': ['209', '213', '310', '323', '408', '415', '424', '442', '510', '530', '559', '562', '619', '626', '628', '650', '657', '661', '669', '707', '714', '747', '760', '805', '818', '831', '858', '909', '916', '925', '949', '951']}, {'name': 'Colorado', 'abbreviation': 'CO', 'area_codes': ['303', '719', '720', '970']}, {'name': 'Connecticut', 'abbreviation': 'CT', 'area_codes': ['203', '475', '860', '959']}, {'name': 'Delaware', 'abbreviation': 'DE', 'area_codes': ['302']}, {'name': 'Florida', 'abbreviation': 'FL', 'area_codes': ['239', '305', '321', '352', '386', '407', '561', '727', '754', '7

In [16]:
## Access the state dictionaries of the first five states
for state in data['states'][:5]:
    print(state)

{'name': 'Alabama', 'abbreviation': 'AL', 'area_codes': ['205', '251', '256', '334', '938']}
{'name': 'Alaska', 'abbreviation': 'AK', 'area_codes': ['907']}
{'name': 'Arizona', 'abbreviation': 'AZ', 'area_codes': ['480', '520', '602', '623', '928']}
{'name': 'Arkansas', 'abbreviation': 'AR', 'area_codes': ['479', '501', '870']}
{'name': 'California', 'abbreviation': 'CA', 'area_codes': ['209', '213', '310', '323', '408', '415', '424', '442', '510', '530', '559', '562', '619', '626', '628', '650', '657', '661', '669', '707', '714', '747', '760', '805', '818', '831', '858', '909', '916', '925', '949', '951']}


In [17]:
## Access the state name and state abbreviation of the first five states
for state in data['states'][:5]:
    print(state['name'], state['abbreviation'])

Alabama AL
Alaska AK
Arizona AZ
Arkansas AR
California CA
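
When the goal is to look up one state rather than loop over all of them, the list of dictionaries can be turned into a dictionary keyed by abbreviation. A minimal sketch, using a two-state stand-in for the parsed file (`name_by_abbreviation` is a name introduced here):

```python
# A small stand-in for the parsed states.json data (only two states shown).
data = {
    "states": [
        {"name": "Alabama", "abbreviation": "AL", "area_codes": ["205", "251"]},
        {"name": "Alaska", "abbreviation": "AK", "area_codes": ["907"]},
    ]
}

# Build a dictionary keyed by abbreviation for direct lookups.
name_by_abbreviation = {s["abbreviation"]: s["name"] for s in data["states"]}

print(name_by_abbreviation["AK"])  # Alaska
```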


In [18]:
## remove the area codes from the data dictionary
for state in data['states']:
    del state['area_codes']

In [19]:
for state in data['states'][:5]:
    print(state)

{'name': 'Alabama', 'abbreviation': 'AL'}
{'name': 'Alaska', 'abbreviation': 'AK'}
{'name': 'Arizona', 'abbreviation': 'AZ'}
{'name': 'Arkansas', 'abbreviation': 'AR'}
{'name': 'California', 'abbreviation': 'CA'}


In [20]:
## dump the modified data dictionary to a new json file
with open('new_states.json', 'w') as f:
    json.dump(data, f)

In [21]:
## use indent
with open('new_states.json', 'w') as f:
    json.dump(data, f, indent = 4)
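
A quick way to check that a file was written correctly is to read it back and compare it with the original object. The sketch below does this in a temporary directory so that no existing file is touched; the one-state `data` dictionary is a stand-in.

```python
import json
import os
import tempfile

data = {"states": [{"name": "Alabama", "abbreviation": "AL"}]}

# Write into a temporary directory so no real file is overwritten.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "new_states.json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f, indent=4)
    with open(path, encoding="utf-8") as f:
        reloaded = json.load(f)

# Indentation only affects the text on disk, not the parsed result.
print(reloaded == data)  # True
```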

### Example 03

It is common for web APIs to return their data as JSON, which makes the responses easy to parse.

In [22]:
import json
from urllib.request import urlopen

with urlopen("https://data.covid19india.org/v4/min/data.min.json") as response:
    source = response.read()

data = json.loads(source)
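
As an alternative to calling `read()` and then `json.loads`, `json.load` can read directly from any object that has a `read()` method, which includes the response returned by `urlopen`. The sketch below uses `io.BytesIO` as a stand-in for the response so that it runs without network access; the payload is a one-entry excerpt based on the figures shown in this notebook.

```python
import io
import json

# io.BytesIO stands in for the HTTP response here, so the sketch runs offline.
fake_response = io.BytesIO(b'{"AN": {"total": {"tested": 598033}}}')

# json.load calls .read() on the object; bytes input is accepted and
# decoded as UTF-8, UTF-16 or UTF-32.
data = json.load(fake_response)

print(data["AN"]["total"]["tested"])  # 598033
```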

In [23]:
## for readability
print(json.dumps(data, indent = 4))

{
    "AN": {
        "delta": {
            "tested": 1376,
            "vaccinated1": 3,
            "vaccinated2": 13
        },
        "delta21_14": {
            "confirmed": 9
        },
        "delta7": {
            "confirmed": 3,
            "recovered": 5,
            "tested": 8936,
            "vaccinated1": 884,
            "vaccinated2": 10640
        },
        "districts": {
            "Nicobars": {
                "delta7": {
                    "vaccinated1": 62,
                    "vaccinated2": 811
                },
                "meta": {
                    "population": 36842
                },
                "total": {
                    "vaccinated1": 25394,
                    "vaccinated2": 20313
                }
            },
            "North and Middle Andaman": {
                "delta": {
                    "vaccinated2": 8
                },
                "delta7": {
                    "vaccinated1": 90,
                    "vaccinated2

In [24]:
## print states
for state in data:
    print(state)

AN
AP
AR
AS
BR
CH
CT
DL
DN
GA
GJ
HP
HR
JH
JK
KA
KL
LA
LD
MH
ML
MN
MP
MZ
NL
OR
PB
PY
RJ
SK
TG
TN
TR
TT
UP
UT
WB


In [25]:
## Access each state and its total number of Covid tests
for state in data:
    print(f"State: {state}, Total Tested: {data[state]['total']['tested']}")

State: AN, Total Tested: 598033
State: AP, Total Tested: 29518787
State: AR, Total Tested: 1185436
State: AS, Total Tested: 24712042
State: BR, Total Tested: 50531824
State: CH, Total Tested: 792851
State: CT, Total Tested: 13709510
State: DL, Total Tested: 29427753
State: DN, Total Tested: 72410
State: GA, Total Tested: 1468399
State: GJ, Total Tested: 30928063
State: HP, Total Tested: 3685011
State: HR, Total Tested: 13032504
State: JH, Total Tested: 15985878
State: JK, Total Tested: 16202346
State: KA, Total Tested: 50873103
State: KL, Total Tested: 37886378
State: LA, Total Tested: 555568
State: LD, Total Tested: 263541
State: MH, Total Tested: 62667211
State: ML, Total Tested: 1151665
State: MN, Total Tested: 1367673
State: MP, Total Tested: 20294225
State: MZ, Total Tested: 1298444
State: NL, Total Tested: 395416
State: OR, Total Tested: 21994343
State: PB, Total Tested: 15429415
State: PY, Total Tested: 1919060
State: RJ, Total Tested: 14807752
State: SK, Total Tested: 261343
St
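
Chained indexing such as `data[state]['total']['tested']` raises `KeyError` as soon as one level is missing. A more forgiving sketch uses `dict.get` with fallbacks; the stand-in data below includes a hypothetical entry `XX` that lacks a `tested` figure.

```python
# Stand-in data: "XX" is a hypothetical state code with no "tested" figure.
data = {
    "AN": {"total": {"tested": 598033, "vaccinated2": 200157}},
    "XX": {"total": {"vaccinated2": 1}},
}

report = []
for state, details in data.items():
    # .get() returns the fallback instead of raising KeyError for a missing key.
    tested = details.get("total", {}).get("tested", "n/a")
    report.append(f"State: {state}, Total Tested: {tested}")

print("\n".join(report))
# State: AN, Total Tested: 598033
# State: XX, Total Tested: n/a
```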

In [26]:
## Access each state and the total number of people who received two vaccine doses
for state in data:
    print(f"State: {state}, Total Vaccinated Twice: {data[state]['total']['vaccinated2']}")

State: AN, Total Vaccinated Twice: 200157
State: AP, Total Vaccinated Twice: 20375181
State: AR, Total Vaccinated Twice: 534486
State: AS, Total Vaccinated Twice: 8068795
State: BR, Total Vaccinated Twice: 18346781
State: CH, Total Vaccinated Twice: 546981
State: CT, Total Vaccinated Twice: 7343273
State: DL, Total Vaccinated Twice: 7425404
State: DN, Total Vaccinated Twice: 370255
State: GA, Total Vaccinated Twice: 911114
State: GJ, Total Vaccinated Twice: 25972387
State: HP, Total Vaccinated Twice: 3443823
State: HR, Total Vaccinated Twice: 8115463
State: JH, Total Vaccinated Twice: 5585648
State: JK, Total Vaccinated Twice: 5149471
State: KA, Total Vaccinated Twice: 22858384
State: KL, Total Vaccinated Twice: 13658343
State: LA, Total Vaccinated Twice: 152280
State: LD, Total Vaccinated Twice: 45951
State: MH, Total Vaccinated Twice: 30975692
State: ML, Total Vaccinated Twice: 641819
State: MN, Total Vaccinated Twice: 719413
State: MP, Total Vaccinated Twice: 20838045
State: MZ, Tot

In [27]:
## Access the data for Surat as of 2021-02-02
print(f" Surat Population: {data['GJ']['districts']['Surat']['meta']['population']}")
print(f" Surat Total Confirmed Cases: {data['GJ']['districts']['Surat']['total']['confirmed']}")
print(f" Surat Total Recovered Cases: {data['GJ']['districts']['Surat']['total']['recovered']}")
print(f" Surat Total Vaccinated Twice: {data['GJ']['districts']['Surat']['total']['vaccinated2']}")

 Surat Population: 4996391
 Surat Total Confirmed Cases: 143874
 Surat Total Recovered Cases: 141885
 Surat Total Vaccinated Twice: 2529712


### Example 04

In [28]:
with open("employees.json") as datafile:
    data = json.load(datafile)

In [43]:
## Display each employee's first name, last name and email alias
for employee in data:
    print(f"{employee['FIRST_NAME']:15} {employee['LAST_NAME']:15} {employee['EMAIL']}")

Donald          OConnell        DOCONNEL
Douglas         Grant           DGRANT
Jennifer        Whalen          JWHALEN
Michael         Hartstein       MHARTSTE
Pat             Fay             PFAY
Susan           Mavris          SMAVRIS
Hermann         Baer            HBAER
Shelley         Higgins         SHIGGINS
William         Gietz           WGIETZ
Steven          King            SKING
Neena           Kochhar         NKOCHHAR
Lex             De Haan         LDEHAAN
Alexander       Hunold          AHUNOLD
Bruce           Ernst           BERNST
David           Austin          DAUSTIN
Valli           Pataballa       VPATABAL
Diana           Lorentz         DLORENTZ
Nancy           Greenberg       NGREENBE
Daniel          Faviet          DFAVIET
John            Chen            JCHEN
Ismael          Sciarra         ISCIARRA
Jose Manuel     Urman           JMURMAN
Luis            Popp            LPOPP
Den             Raphaely        DRAPHEAL
Alexander       Khoo            AKHOO
Shelli 
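
The `:15` in the f-string above is a format specifier that pads the value to a width of 15 characters, which is what lines the columns up. The sketch below shows explicit left and right alignment on two sample records that reuse the same keys; the `|` characters are added only to make the padding visible.

```python
# Two sample records using the same keys as the loop above.
employees = [
    {"FIRST_NAME": "Pat", "LAST_NAME": "Fay", "EMAIL": "PFAY"},
    {"FIRST_NAME": "Jose Manuel", "LAST_NAME": "Urman", "EMAIL": "JMURMAN"},
]

rows = []
for e in employees:
    # :<15 pads on the right (left-aligned, the default for strings);
    # :>10 pads on the left (right-aligned).
    rows.append(f"{e['FIRST_NAME']:<15}|{e['EMAIL']:>10}|")

print("\n".join(rows))
```

A value longer than the given width is not truncated; it simply pushes the following columns to the right.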

### Example 05

In [51]:
with open("questionnaire.json") as datafile:
    data = json.load(datafile)

In [52]:
data

{'quiz': {'sport': {'q1': {'question': 'Which one is correct team name in NBA?',
    'options': ['New York Bulls',
     'Los Angeles Kings',
     'Golden State Warriros',
     'Huston Rocket'],
    'answer': 'Huston Rocket'}},
  'maths': {'q1': {'question': '5 + 7 = ?',
    'options': ['10', '11', '12', '13'],
    'answer': '12'},
   'q2': {'question': '12 - 8 = ?',
    'options': ['1', '2', '3', '4'],
    'answer': '4'}}}}

In [53]:
data['quiz']

{'sport': {'q1': {'question': 'Which one is correct team name in NBA?',
   'options': ['New York Bulls',
    'Los Angeles Kings',
    'Golden State Warriros',
    'Huston Rocket'],
   'answer': 'Huston Rocket'}},
 'maths': {'q1': {'question': '5 + 7 = ?',
   'options': ['10', '11', '12', '13'],
   'answer': '12'},
  'q2': {'question': '12 - 8 = ?',
   'options': ['1', '2', '3', '4'],
   'answer': '4'}}}

In [57]:
data['quiz']['sport']

{'q1': {'question': 'Which one is correct team name in NBA?',
  'options': ['New York Bulls',
   'Los Angeles Kings',
   'Golden State Warriros',
   'Huston Rocket'],
  'answer': 'Huston Rocket'}}

In [59]:
data['quiz']['sport']['q1']

{'question': 'Which one is correct team name in NBA?',
 'options': ['New York Bulls',
  'Los Angeles Kings',
  'Golden State Warriros',
  'Huston Rocket'],
 'answer': 'Huston Rocket'}

In [60]:
data['quiz']['maths']['q1']

{'question': '5 + 7 = ?', 'options': ['10', '11', '12', '13'], 'answer': '12'}

In [61]:
data['quiz']['maths']['q2']

{'question': '12 - 8 = ?', 'options': ['1', '2', '3', '4'], 'answer': '4'}

In [96]:
for k1, v1 in data['quiz'].items():
    for k2, v2 in v1.items():
        print(k1, k2, v2)

sport q1 {'question': 'Which one is correct team name in NBA?', 'options': ['New York Bulls', 'Los Angeles Kings', 'Golden State Warriros', 'Huston Rocket'], 'answer': 'Huston Rocket'}
maths q1 {'question': '5 + 7 = ?', 'options': ['10', '11', '12', '13'], 'answer': '12'}
maths q2 {'question': '12 - 8 = ?', 'options': ['1', '2', '3', '4'], 'answer': '4'}


In [99]:
for k1, v1 in data['quiz'].items():
    for k2, v2 in v1.items():
        print(k1, k2, v2['question'], v2['options'][0],  v2['options'][1],  v2['options'][2],  v2['options'][3], v2['answer'])

sport q1 Which one is correct team name in NBA? New York Bulls Los Angeles Kings Golden State Warriros Huston Rocket Huston Rocket
maths q1 5 + 7 = ? 10 11 12 13 12
maths q2 12 - 8 = ? 1 2 3 4 4


In [105]:
import pandas as pd

# Flatten each question into one row: discipline, number, question, options, answer
my_data = []
for k1, v1 in data['quiz'].items():
    for k2, v2 in v1.items():
        my_data.append((k1, k2, v2['question'], v2['options'][0], v2['options'][1], v2['options'][2], v2['options'][3], v2['answer']))

cols = ['Discipline', 'Question Number', 'Question', 'Option 1', 'Option 2', 'Option 3', 'Option 4', 'Answer']
df = pd.DataFrame(my_data, columns=cols)
df

|   | Discipline | Question Number | Question | Option 1 | Option 2 | Option 3 | Option 4 | Answer |
|---|---|---|---|---|---|---|---|---|
| 0 | sport | q1 | Which one is correct team name in NBA? | New York Bulls | Los Angeles Kings | Golden State Warriros | Huston Rocket | Huston Rocket |
| 1 | maths | q1 | 5 + 7 = ? | 10 | 11 | 12 | 13 | 12 |
| 2 | maths | q2 | 12 - 8 = ? | 1 | 2 | 3 | 4 | 4 |
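
If pandas is not available, the same flattening idea can be sketched with the standard-library `csv` module. The example below writes a reduced set of columns to an in-memory buffer; the `quiz` dictionary is a cut-down copy of the structure above, and the column selection is an illustrative choice.

```python
import csv
import io

# A reduced copy of the quiz structure shown above.
quiz = {
    "maths": {
        "q1": {"question": "5 + 7 = ?", "options": ["10", "11", "12", "13"], "answer": "12"},
        "q2": {"question": "12 - 8 = ?", "options": ["1", "2", "3", "4"], "answer": "4"},
    }
}

buffer = io.StringIO()  # in-memory text file; a real file opened with newline="" also works
writer = csv.writer(buffer)
writer.writerow(["Discipline", "Question Number", "Question", "Answer"])
for discipline, questions in quiz.items():
    for number, item in questions.items():
        writer.writerow([discipline, number, item["question"], item["answer"]])

csv_text = buffer.getvalue()
print(csv_text)
```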
