## Multiple JSON Documents in Files

At times, a text file contains multiple JSON documents, typically one valid JSON document per line. Let us understand how to read such a file.
* If you use `pandas`, it is straightforward. However, we will talk about using `pandas` later.
* We cannot pass such a file directly to `json.load`, because it expects the file to contain a single JSON document. Here are the steps to process the file using the `json` module.
  * Create a file object by passing the path to `open`.
  * Use `read` to read the content of the file into a string.
  * Once the string is created, use `splitlines` to convert it into a list of strings. Each element is a string containing one JSON document.
  * Iterate through the elements and convert each JSON string to a dict using `json.loads`.
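
The steps above can be sketched end to end as follows. This is a minimal sketch rather than the notebook's own code: the helper name `read_json_lines` and the temporary two-record sample file are assumptions, introduced so the example runs without `customers.json`.

```python
import json
import os
import tempfile


def read_json_lines(path):
    """Return a list of dicts from a file with one JSON document per line."""
    # Steps 1 and 2: open the file and read its content into one string.
    with open(path) as fp:
        content = fp.read()
    # Step 3: split the string into a list of lines.
    lines = content.splitlines()
    # Step 4: parse each line on its own with json.loads.
    return [json.loads(line) for line in lines]


# Hypothetical two-record sample written to a temporary file for illustration.
sample = '{"id": 1, "first_name": "Frasco"}\n{"id": 2, "first_name": "Dulce"}\n'
with tempfile.NamedTemporaryFile('w', suffix='.json', delete=False) as tmp:
    tmp.write(sample)

customers = read_json_lines(tmp.name)
os.remove(tmp.name)
print(customers)
```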

In [1]:
import json

In [2]:
json.load?

Signature:
json.load(
    fp,
    *,
    cls=None,
    object_hook=None,
    parse_float=None,
    parse_int=None,
    parse_constant=None,
    object_pairs_hook=None,
    **kw,
)
Docstring:
Deserialize ``fp`` (a ``.read()``-supporting file-like object containing
a JSON document) to a Python object.

``object_hook`` is an optional function that will be called with

In [3]:
json.load(open('customers.json'))

JSONDecodeError: Extra data: line 2 column 1 (char 124)
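
The error arises because `json.load` expects the whole file to be a single JSON document. After parsing the first line, it finds a second document and reports it as extra data. A small sketch that reproduces this with `json.loads` on an in-memory string (the two-record `text` value is made up for illustration):

```python
import json

# Two JSON documents separated by a newline, as in a one-document-per-line file.
text = '{"id": 1}\n{"id": 2}'

try:
    json.loads(text)
    error = None
except json.JSONDecodeError as exc:
    # The parser finishes the first document and then finds more input.
    error = exc

print(error.msg, error.lineno, error.colno)
```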

* Create a file object by passing the path to `open`.

In [4]:
type(open('customers.json'))

_io.TextIOWrapper

* Use `read` to read the content of the file into a string.

In [5]:
type(open('customers.json').read())

str

In [6]:
open('customers.json').read()

'{"id":1,"first_name":"Frasco","last_name":"Necolds","email":"fnecolds0@vk.com","gender":"Male","ip_address":"243.67.63.34"}\n{"id":2,"first_name":"Dulce","last_name":"Santos","email":"dsantos1@mashable.com","gender":"Female","ip_address":"60.30.246.227"}\n{"id":3,"first_name":"Prissie","last_name":"Tebbett","email":"ptebbett2@infoseek.co.jp","gender":"Genderfluid","ip_address":"22.21.162.56"}\n{"id":4,"first_name":"Schuyler","last_name":"Coppledike","email":"scoppledike3@gnu.org","gender":"Agender","ip_address":"120.35.186.161"}\n{"id":5,"first_name":"Leopold","last_name":"Jarred","email":"ljarred4@wp.com","gender":"Agender","ip_address":"30.119.34.4"}\n{"id":6,"first_name":"Joanna","last_name":"Teager","email":"jteager5@apache.org","gender":"Bigender","ip_address":"245.221.176.34"}\n{"id":7,"first_name":"Lion","last_name":"Beere","email":"lbeere6@bloomberg.com","gender":"Polygender","ip_address":"105.54.139.46"}\n{"id":8,"first_name":"Marabel","last_name":"Wornum","email":"mwornum7@p

* Once the string is created, we can use `splitlines` to convert it into a list of strings. Each element is a string containing one JSON document.

In [7]:
# Use a context manager so that the file is closed after reading.
with open('customers.json') as fp:
    customers_str_list = fp.read().splitlines()

In [8]:
type(customers_str_list)

list

* Each element in the list is of type string.

In [9]:
type(customers_str_list[0])

str

In [10]:
len(customers_str_list)

10
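
One reason `splitlines` suits this step, shown here with a hypothetical two-line string rather than the real file: it does not produce an empty trailing element when the content ends with a newline, whereas `split('\n')` does, and an empty string would make `json.loads` fail.

```python
content = '{"id": 1}\n{"id": 2}\n'  # made-up content ending with a newline

with_splitlines = content.splitlines()
with_split = content.split('\n')

print(with_splitlines)  # two elements
print(with_split)       # three elements, the last one is an empty string
```

Note that `splitlines` also treats a few other characters, such as `\r` and `\u2028`, as line boundaries.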

In [11]:
customers_str_list[0]

'{"id":1,"first_name":"Frasco","last_name":"Necolds","email":"fnecolds0@vk.com","gender":"Male","ip_address":"243.67.63.34"}'

In [12]:
json.loads(customers_str_list[0])

{'id': 1,
 'first_name': 'Frasco',
 'last_name': 'Necolds',
 'email': 'fnecolds0@vk.com',
 'gender': 'Male',
 'ip_address': '243.67.63.34'}

In [13]:
customers_str_list[:3]

['{"id":1,"first_name":"Frasco","last_name":"Necolds","email":"fnecolds0@vk.com","gender":"Male","ip_address":"243.67.63.34"}',
 '{"id":2,"first_name":"Dulce","last_name":"Santos","email":"dsantos1@mashable.com","gender":"Female","ip_address":"60.30.246.227"}',
 '{"id":3,"first_name":"Prissie","last_name":"Tebbett","email":"ptebbett2@infoseek.co.jp","gender":"Genderfluid","ip_address":"22.21.162.56"}']

* Now we can iterate through the elements and convert each JSON string to a dict using `json.loads`.

In [14]:
customers_dict_list = [json.loads(customer) for customer in customers_str_list]

In [15]:
type(customers_dict_list)

list

In [16]:
type(customers_dict_list[0])

dict

In [17]:
customers_dict_list[0]

{'id': 1,
 'first_name': 'Frasco',
 'last_name': 'Necolds',
 'email': 'fnecolds0@vk.com',
 'gender': 'Male',
 'ip_address': '243.67.63.34'}
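
For larger files, an alternative to reading everything into one string is to iterate over the file object, which yields one line at a time. The sketch below is an assumption about how one might do that, not part of the walkthrough above: the helper name `iter_json_lines` is hypothetical, skipping blank lines is an added convenience, and `io.StringIO` stands in for an open file.

```python
import io
import json


def iter_json_lines(fp):
    """Yield one parsed object per non-blank line of a file-like object."""
    for line in fp:
        line = line.strip()
        if not line:
            # Skip blank lines instead of passing '' to json.loads.
            continue
        yield json.loads(line)


# io.StringIO stands in for open('customers.json') in this illustration.
fake_file = io.StringIO('{"id": 1}\n\n{"id": 2}\n')
records = list(iter_json_lines(fake_file))
print(records)
```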

* Here is the logic to convert the list of strings to a list of dicts using the `map` function.

In [18]:
customers_dict_list = list(map(json.loads, customers_str_list))

In [19]:
type(customers_dict_list)

list

In [20]:
type(customers_dict_list[0])

dict

In [21]:
customers_dict_list[0]

{'id': 1,
 'first_name': 'Frasco',
 'last_name': 'Necolds',
 'email': 'fnecolds0@vk.com',
 'gender': 'Male',
 'ip_address': '243.67.63.34'}
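
Both forms give the same result. One difference worth knowing is that `map` returns a lazy iterator in Python 3, which is why the call is wrapped in `list`. A short sketch with made-up input strings:

```python
import json

lines = ['{"id": 1}', '{"id": 2}']  # made-up input for illustration

lazy = map(json.loads, lines)   # nothing is parsed yet
from_map = list(lazy)           # parsing happens here
from_comprehension = [json.loads(line) for line in lines]

print(from_map == from_comprehension)
```

Once consumed, the `map` object is exhausted, so calling `list` on it a second time returns an empty list.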