## Process JSON String

Let us understand how to process JSON strings using Python as the programming language. Later we will see different ways of storing JSON data in files.

We will see the following examples of processing JSON strings.
* Single JSON document.
* Multiple JSON documents, with one JSON document per line.
* Multiple JSON documents as an array under one attribute. Most REST APIs that return multiple elements follow this approach.

Here are a few points to keep in mind.
* We can process JSON strings using either the `json` module or `pandas`.
* When developing the backend for web or mobile applications, we use `json` or a high-level wrapper around it. For bulk data processing, we typically fall back on modules such as `pandas`.
* You should be familiar with both. For now, we will focus on `json`.
* We should first import the `json` module to process JSON strings with it.
* The `json.loads` function takes a JSON string and returns the equivalent Python object: a `dict` for a JSON object and a `list` for a JSON array.
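
The return type of `json.loads` depends on the top-level value in the string. Here is a minimal sketch of that behaviour; the sample strings and variable names are made up for illustration.

```python
import json

# Illustrative sample strings; the names are made up for this sketch.
object_json = '{"id": 1, "first_name": "Frasco"}'
array_json = '[{"id": 1}, {"id": 2}]'

parsed_object = json.loads(object_json)  # JSON object becomes a dict
parsed_array = json.loads(array_json)    # JSON array becomes a list

print(type(parsed_object).__name__)
print(type(parsed_array).__name__)
```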

### Single JSON document

Let us go through the details of processing a single JSON document.
* Import the `json` module.
* Create a JSON string.
* Pass the string to `json.loads`. It returns a `dict` because the string holds a JSON object.
* Assign the result to a variable and use it further.

In [None]:
import json

In [None]:
person = '{"id":1,"first_name":"Frasco","last_name":"Necolds","email":"fnecolds0@vk.com","gender":"Male","ip_address":"243.67.63.34"}'

In [None]:
type(person)

In [None]:
json.loads?

In [None]:
person_dict = json.loads(person)

In [None]:
type(person_dict)

In [None]:
print(person_dict)

In [None]:
person_dict['id']

In [None]:
person_dict['first_name']

In [None]:
person_dict.keys()

In [None]:
person_dict.items()

* Here is an example of a single JSON document in a string that spans multiple lines.

In [None]:
import json

In [None]:
person = '''{
    "id":1,
    "first_name":"Frasco",
    "last_name":"Necolds",
    "email":"fnecolds0@vk.com",
    "gender":"Male",
    "ip_address":"243.67.63.34"
}'''

In [None]:
type(person)

In [None]:
person_dict = json.loads(person)

In [None]:
type(person_dict)

In [None]:
print(person_dict)
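
Whitespace between JSON tokens does not affect the parsed result, so the single-line and multi-line forms should produce equal dicts. A small sketch to check this, using a shortened record and made-up variable names:

```python
import json

# Shortened, illustrative versions of the person record.
compact_person = '{"id":1,"first_name":"Frasco","last_name":"Necolds"}'
pretty_person = '''{
    "id": 1,
    "first_name": "Frasco",
    "last_name": "Necolds"
}'''

compact_dict = json.loads(compact_person)
pretty_dict = json.loads(pretty_person)

# Both strings describe the same document, so the dicts compare equal.
print(compact_dict == pretty_dict)
```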

### Multiple JSON Documents - One per line

Let us go through the steps involved in processing a string which contains one JSON document per line.
* We should convert the string into a list of JSON strings and then use `json.loads` to process each one.
* Import the `json` module.
* Split the string into multiple strings using the newline character (`\n`) as the delimiter. Strings have a method called `splitlines` that we can use for this.
* Use a `for` loop or the `map` function to convert the list of JSON strings into a list of dicts, calling `json.loads` on each JSON string.

In [None]:
persons = '''{"id":1,"first_name":"Frasco","last_name":"Necolds","email":"fnecolds0@vk.com","gender":"Male","ip_address":"243.67.63.34"}
{"id":2,"first_name":"Dulce","last_name":"Santos","email":"dsantos1@mashable.com","gender":"Female","ip_address":"60.30.246.227"}
{"id":3,"first_name":"Prissie","last_name":"Tebbett","email":"ptebbett2@infoseek.co.jp","gender":"Genderfluid","ip_address":"22.21.162.56"}
{"id":4,"first_name":"Schuyler","last_name":"Coppledike","email":"scoppledike3@gnu.org","gender":"Agender","ip_address":"120.35.186.161"}
{"id":5,"first_name":"Leopold","last_name":"Jarred","email":"ljarred4@wp.com","gender":"Agender","ip_address":"30.119.34.4"}
{"id":6,"first_name":"Joanna","last_name":"Teager","email":"jteager5@apache.org","gender":"Bigender","ip_address":"245.221.176.34"}
{"id":7,"first_name":"Lion","last_name":"Beere","email":"lbeere6@bloomberg.com","gender":"Polygender","ip_address":"105.54.139.46"}
{"id":8,"first_name":"Marabel","last_name":"Wornum","email":"mwornum7@posterous.com","gender":"Polygender","ip_address":"247.229.14.25"}
{"id":9,"first_name":"Helenka","last_name":"Mullender","email":"hmullender8@cloudflare.com","gender":"Non-binary","ip_address":"133.216.118.88"}
{"id":10,"first_name":"Christine","last_name":"Swane","email":"cswane9@shop-pro.jp","gender":"Polygender","ip_address":"86.16.210.164"}'''

In [None]:
type(persons)

In [None]:
persons.splitlines?

In [None]:
# Import json before parsing each line
import json

In [None]:
persons_list = persons.splitlines()

In [None]:
type(persons_list)

In [None]:
type(persons_list[0])

In [None]:
persons_list[1]

In [None]:
json.loads(persons_list[0])

* Converting the list of strings to a list of dicts using a conventional loop.

In [None]:
persons_dict_list = []

for person in persons_list:
    persons_dict_list.append(json.loads(person))

In [None]:
type(persons_dict_list)

In [None]:
type(persons_dict_list[0])

In [None]:
persons_dict_list[0]

In [None]:
persons_dict_list[0]['first_name']

* Converting the list of strings to a list of dicts using a list comprehension.

In [None]:
persons_dict_list = [json.loads(person) for person in persons_list]

In [None]:
type(persons_dict_list)

In [None]:
type(persons_dict_list[0])

In [None]:
persons_dict_list[0]

* Converting the list of strings to a list of dicts using the `map` function.

In [None]:
persons_dict_list = list(map(json.loads, persons_list))

In [None]:
type(persons_dict_list)

In [None]:
type(persons_dict_list[0])

In [None]:
persons_dict_list[0]

In [None]:
list(map(lambda person: person['first_name'], persons_dict_list))

In [None]:
list(filter(lambda person: person['gender'] == 'Female', persons_dict_list))
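
Data in the one-document-per-line layout may contain blank lines, and `json.loads` raises `json.JSONDecodeError` on an empty string. The following is a minimal sketch, assuming that blank lines should simply be skipped; the sample string and variable names are made up for illustration.

```python
import json

# Illustrative input with a blank line in the middle and a trailing newline.
lines_with_blank = '''{"id":1,"first_name":"Frasco"}

{"id":2,"first_name":"Dulce"}
'''

# Keep only lines that still have content after stripping whitespace,
# then parse each remaining line as its own JSON document.
parsed_persons = [
    json.loads(line)
    for line in lines_with_blank.splitlines()
    if line.strip()
]

print(len(parsed_persons))
print(parsed_persons[1]['first_name'])
```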

### Multiple JSON Documents - Array

Let us go through the details of processing multiple JSON documents as an array.
* We can use `json.loads` directly. For the string below, it returns a Python list.
* The steps are the same as for processing a single JSON document.
  * Import the `json` module.
  * Use `json.loads` to convert the string to a Python list.
  * Process the list using Python capabilities.

In [None]:
persons = '''[{"id":1,"first_name":"Frasco","last_name":"Necolds","email":"fnecolds0@vk.com","gender":"Male","ip_address":"243.67.63.34"},
{"id":2,"first_name":"Dulce","last_name":"Santos","email":"dsantos1@mashable.com","gender":"Female","ip_address":"60.30.246.227"},
{"id":3,"first_name":"Prissie","last_name":"Tebbett","email":"ptebbett2@infoseek.co.jp","gender":"Genderfluid","ip_address":"22.21.162.56"},
{"id":4,"first_name":"Schuyler","last_name":"Coppledike","email":"scoppledike3@gnu.org","gender":"Agender","ip_address":"120.35.186.161"},
{"id":5,"first_name":"Leopold","last_name":"Jarred","email":"ljarred4@wp.com","gender":"Agender","ip_address":"30.119.34.4"},
{"id":6,"first_name":"Joanna","last_name":"Teager","email":"jteager5@apache.org","gender":"Bigender","ip_address":"245.221.176.34"},
{"id":7,"first_name":"Lion","last_name":"Beere","email":"lbeere6@bloomberg.com","gender":"Polygender","ip_address":"105.54.139.46"},
{"id":8,"first_name":"Marabel","last_name":"Wornum","email":"mwornum7@posterous.com","gender":"Polygender","ip_address":"247.229.14.25"},
{"id":9,"first_name":"Helenka","last_name":"Mullender","email":"hmullender8@cloudflare.com","gender":"Non-binary","ip_address":"133.216.118.88"},
{"id":10,"first_name":"Christine","last_name":"Swane","email":"cswane9@shop-pro.jp","gender":"Polygender","ip_address":"86.16.210.164"}]'''

In [None]:
persons_dict_list = json.loads(persons)

In [None]:
type(persons_dict_list)

In [None]:
type(persons_dict_list[0])

In [None]:
persons_dict_list[0]
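
Once the array is parsed, the result is an ordinary Python list of dicts, so the usual list tools apply. A short sketch of the processing step on a shortened, illustrative array (the variable names are made up):

```python
import json

# Shortened, illustrative JSON array with three records.
sample_array = '''[{"id":1,"first_name":"Frasco","gender":"Male"},
{"id":2,"first_name":"Dulce","gender":"Female"},
{"id":3,"first_name":"Prissie","gender":"Genderfluid"}]'''

sample_persons = json.loads(sample_array)

# Project one field from every record.
first_names = [p['first_name'] for p in sample_persons]

# Keep only the records that match a condition.
female_persons = [p for p in sample_persons if p['gender'] == 'Female']

print(first_names)
print(female_persons)
```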

* When we use `json.loads` on the string below, it creates a dict whose `results` key holds a list of dicts.

In [None]:
persons = '''{
    "results": [
        {
            "id": 1,
            "first_name": "Frasco",
            "last_name": "Necolds",
            "email": "fnecolds0@vk.com",
            "gender": "Male",
            "ip_address": "243.67.63.34"
        },
        {
            "id": 2,
            "first_name": "Dulce",
            "last_name": "Santos",
            "email": "dsantos1@mashable.com",
            "gender": "Female",
            "ip_address": "60.30.246.227"
        },
        {
            "id": 3,
            "first_name": "Prissie",
            "last_name": "Tebbett",
            "email": "ptebbett2@infoseek.co.jp",
            "gender": "Genderfluid",
            "ip_address": "22.21.162.56"
        },
        {
            "id": 4,
            "first_name": "Schuyler",
            "last_name": "Coppledike",
            "email": "scoppledike3@gnu.org",
            "gender": "Agender",
            "ip_address": "120.35.186.161"
        },
        {
            "id": 5,
            "first_name": "Leopold",
            "last_name": "Jarred",
            "email": "ljarred4@wp.com",
            "gender": "Agender",
            "ip_address": "30.119.34.4"
        },
        {
            "id": 6,
            "first_name": "Joanna",
            "last_name": "Teager",
            "email": "jteager5@apache.org",
            "gender": "Bigender",
            "ip_address": "245.221.176.34"
        },
        {
            "id": 7,
            "first_name": "Lion",
            "last_name": "Beere",
            "email": "lbeere6@bloomberg.com",
            "gender": "Polygender",
            "ip_address": "105.54.139.46"
        },
        {
            "id": 8,
            "first_name": "Marabel",
            "last_name": "Wornum",
            "email": "mwornum7@posterous.com",
            "gender": "Polygender",
            "ip_address": "247.229.14.25"
        },
        {
            "id": 9,
            "first_name": "Helenka",
            "last_name": "Mullender",
            "email": "hmullender8@cloudflare.com",
            "gender": "Non-binary",
            "ip_address": "133.216.118.88"
        },
        {
            "id": 10,
            "first_name": "Christine",
            "last_name": "Swane",
            "email": "cswane9@shop-pro.jp",
            "gender": "Polygender",
            "ip_address": "86.16.210.164"
        }
    ]
}'''

In [None]:
import json

In [None]:
person_results = json.loads(persons)

In [None]:
type(person_results)

In [None]:
person_results.keys()

In [None]:
type(person_results['results'])

In [None]:
type(person_results['results'][0])

In [None]:
person_results['results'][0]

In [None]:
results = person_results['results']

In [None]:
type(results)

In [None]:
results[0]
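
When the records are wrapped under an attribute, the attribute may be missing from some payloads. The following is a minimal sketch, assuming a missing `results` key should be treated as an empty list; the sample strings and variable names are made up for illustration.

```python
import json

# Illustrative payloads: one with records, one without the "results" key.
sample_response = '''{
    "results": [
        {"id": 1, "first_name": "Frasco", "last_name": "Necolds"},
        {"id": 2, "first_name": "Dulce", "last_name": "Santos"}
    ]
}'''
empty_response = '{}'

# dict.get returns the default (an empty list) when the key is absent.
sample_results = json.loads(sample_response).get('results', [])
empty_results = json.loads(empty_response).get('results', [])

# Build a full name for each record using map.
full_names = list(
    map(lambda p: f"{p['first_name']} {p['last_name']}", sample_results)
)

print(full_names)
print(empty_results)
```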

### Exercise on processing collections

Take `person_results` and get a list of dicts where each dict contains `id`, `first_name` and `email`. We would like to email an offer to every person.
* You should use the `map` function for this.
* Do not use loops.