## Write to JSON files using json module

Let us understand how to write to JSON files using the `json` module.
* We can use `json.dump` to write JSON data to a file. We can also generate a JSON string from a `dict` (or any other JSON-serializable object) using `json.dumps`.
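
As a quick sketch of the difference, `json.dumps` returns a `str`, while `json.dump` writes the same text to a file-like object. The example below uses `io.StringIO` as an in-memory stand-in for a real file, and the `sample` dict is made up purely for illustration.

```python
import io
import json

# Hypothetical sample record, used only for illustration
sample = {'course_name': 'Programming using Python', 'course_published_dt': None}

# dumps returns the JSON document as a Python string
as_string = json.dumps(sample)

# dump writes the same JSON text to any object that has a write method
buffer = io.StringIO()
json.dump(sample, buffer)
written = buffer.getvalue()

print(type(as_string))
print(as_string == written)
```

Note that Python's `None` is serialized as JSON `null` in both cases.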

The following example dumps a dict to a string. A more advanced example, writing a list of dicts to a JSON file, follows shortly.

In [None]:
course = {'course_name': 'Programming using Python',
  'course_author': 'Bob Dillon',
  'course_status': 'published',
  'course_published_dt': '2020-09-30'}

In [None]:
type(course)

In [None]:
import json

In [None]:
json.dumps(course)

In [None]:
type(json.dumps(course))

In [None]:
courses = [{'course_name': 'Programming using Python',
  'course_author': 'Bob Dillon',
  'course_status': 'published',
  'course_published_dt': '2020-09-30'},
 {'course_name': 'Data Engineering using Python',
  'course_author': 'Bob Dillon',
  'course_status': 'published',
  'course_published_dt': '2020-07-15'},
 {'course_name': 'Data Engineering using Scala',
  'course_author': 'Elvis Presley',
  'course_status': 'draft',
  'course_published_dt': None},
 {'course_name': 'Programming using Scala',
  'course_author': 'Elvis Presley',
  'course_status': 'published',
  'course_published_dt': '2020-05-12'},
 {'course_name': 'Programming using Java',
  'course_author': 'Mike Jack',
  'course_status': 'inactive',
  'course_published_dt': '2020-08-10'},
 {'course_name': 'Web Applications - Python Flask',
  'course_author': 'Bob Dillon',
  'course_status': 'inactive',
  'course_published_dt': '2020-07-20'},
 {'course_name': 'Web Applications - Java Spring',
  'course_author': 'Mike Jack',
  'course_status': 'draft',
  'course_published_dt': None},
 {'course_name': 'Pipeline Orchestration - Python',
  'course_author': 'Bob Dillon',
  'course_status': 'draft',
  'course_published_dt': None},
 {'course_name': 'Streaming Pipelines - Python',
  'course_author': 'Bob Dillon',
  'course_status': 'published',
  'course_published_dt': '2020-10-05'},
 {'course_name': 'Web Applications - Scala Play',
  'course_author': 'Elvis Presley',
  'course_status': 'inactive',
  'course_published_dt': '2020-09-30'},
 {'course_name': 'Web Applications - Python Django',
  'course_author': 'Bob Dillon',
  'course_status': 'published',
  'course_published_dt': '2020-06-23'},
 {'course_name': 'Server Automation - Ansible',
  'course_author': 'Uncle Sam',
  'course_status': 'published',
  'course_published_dt': '2020-07-05'}]

In [None]:
json.dumps(courses) # Returns the JSON as a string object (nothing is written to a file)

Here are the steps involved in writing data as a single JSON document to a file using the `json` module.
* Make sure you have a dict or any other object that can be converted to a single valid JSON document.
* Open the file in write mode.
* Write the data into the file using `json.dump`.
* Close the file.
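
The steps above can also be sketched with a `with` block, which closes the file automatically even if `json.dump` raises an error. This is only a minimal sketch: the two-record `sample_courses` list and the temporary directory are assumptions, chosen so that the example does not overwrite `data/courses/courses.json`.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical two-record sample; a longer list of dicts works the same way
sample_courses = [
    {'course_name': 'Programming using Python', 'course_status': 'published'},
    {'course_name': 'Data Engineering using Scala', 'course_status': 'draft'},
]

# Temporary directory used as a stand-in for data/courses
target_file = Path(tempfile.mkdtemp()) / 'courses.json'

# The with block opens the file in write mode and closes it on exit
with open(target_file, 'w') as courses_file:
    json.dump(sample_courses, courses_file)

# Read the file back to confirm it holds a single JSON array
with open(target_file) as courses_file:
    loaded = json.load(courses_file)

print(loaded == sample_courses)
```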

In [None]:
json.dump?

In [None]:
open?

In [None]:
!mkdir -p data/courses

In [None]:
# Opening the file in write mode
courses_file = open('data/courses/courses.json', 'w')

In [None]:
# Dumping the list of dicts as a JSON array into the file
json.dump(courses, courses_file)

In [None]:
# Closing the file
courses_file.close()

In [None]:
!ls -ltr data/courses/courses.json

In [None]:
!cat data/courses/courses.json

Here are the steps involved in writing a list of dicts to the file as one JSON document per line (a layout often called JSON Lines), rather than as a single JSON array.
* Make sure you have a list of dicts.
* Open the file in write mode.
* Write one dict at a time using `json.dump`, followed by a newline character, until all the dicts are written to the file.
* Close the file.
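
As a rough sketch of these steps, the example below writes one JSON document per line and then reads the file back with `json.loads`, one line at a time. The `sample_courses` list and the temporary directory are assumptions made for illustration, so the sketch does not touch `data/courses/courses.json`.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical two-record sample; a longer list of dicts works the same way
sample_courses = [
    {'course_name': 'Programming using Python', 'course_published_dt': '2020-09-30'},
    {'course_name': 'Data Engineering using Scala', 'course_published_dt': None},
]

# Temporary directory used as a stand-in for data/courses
target_file = Path(tempfile.mkdtemp()) / 'courses.json'

# Write each dict as its own JSON document, terminated by a newline
with open(target_file, 'w') as courses_file:
    for course in sample_courses:
        courses_file.write(json.dumps(course) + '\n')

# Each line is a complete JSON document, so parse the lines one by one
with open(target_file) as courses_file:
    loaded = [json.loads(line) for line in courses_file]

print(len(loaded))
```

A file in this layout is not a single valid JSON document, so `json.load` on the whole file would fail; it has to be parsed line by line.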

In [None]:
!rm data/courses/courses.json

In [None]:
# Opening the file in write mode
courses_file = open('data/courses/courses.json', 'w')

In [None]:
len(courses)

In [None]:
# Writing each dict as a JSON document on its own line in the file
for course in courses:
    json.dump(course, courses_file) # Writing one JSON document at a time.
    courses_file.write('\n') # Add a newline character after each JSON document in the file.

In [None]:
# Closing the file
courses_file.close()

In [None]:
!ls -ltr data/courses/courses.json

In [None]:
!cat data/courses/courses.json

In [None]:
import pandas as pd

In [None]:
pd.read_json('data/courses/courses.json', lines=True)