## WORKING WITH CSV AND JSON FILES
CSV (Comma-Separated Values) and JSON (JavaScript Object Notation) files are plain text files, so they can be viewed in any text editor. CSV files are simplified spreadsheets, and the JSON format is used by many web applications.

In [1]:
import csv

## THE CSV MODULE
The csv module is used for reading and writing CSV files. Each line in a CSV file represents a row of a spreadsheet (such as an Excel sheet), and the cell values in a row are separated by commas.

## READING FROM CSV FILES
To read from a CSV file, we use the reader() function, which takes a file object as a parameter. It also takes an optional parameter called delimiter. A delimiter is the character used to separate the values; by default it is a comma. If the values in the CSV file are separated by a different character, you have to pass that character as the delimiter argument to reader().
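As a minimal sketch of the delimiter parameter, the snippet below reads semicolon-separated values. The sample data is made up for illustration, and io.StringIO is used only as a stand-in for a file object returned by open().

```python
import csv
import io

# Hypothetical sample data: values separated by semicolons instead of commas.
sample = "Region;Country;Units Sold\nEurope;Russia;1779\nAsia;Japan;250\n"

# io.StringIO behaves like an open text file, so reader() accepts it.
reader = csv.reader(io.StringIO(sample), delimiter=';')
rows = list(reader)

for row in rows:
    print(row)
```

Each row comes back as a list of strings, exactly as it would with a comma-separated file.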

In [4]:
file_obj = open('100 Sales Records.csv')
reader_obj = csv.reader(file_obj)
reader_obj

<_csv.reader at 0x16c30d22c88>

You can iterate through the reader object to access the contents of the CSV file.

In [3]:
for line in reader_obj:
    print(line)

['Region', 'Country', 'Item Type', 'Sales Channel', 'Order Priority', 'Order Date', 'Order ID', 'Ship Date', 'Units Sold', 'Unit Price', 'Unit Cost', 'Total Revenue', 'Total Cost', 'Total Profit']
['Australia and Oceania', 'Tuvalu', 'Baby Food', 'Offline', 'H', '5/28/2010', '669165933', '6/27/2010', '9925', '255.28', '159.42', '2533654.00', '1582243.50', '951410.50']
['Central America and the Caribbean', 'Grenada', 'Cereal', 'Online', 'C', '8/22/2012', '963881480', '9/15/2012', '2804', '205.70', '117.11', '576782.80', '328376.44', '248406.36']
['Europe', 'Russia', 'Office Supplies', 'Offline', 'L', '5/2/2014', '341417157', '5/8/2014', '1779', '651.21', '524.96', '1158502.59', '933903.84', '224598.75']
['Sub-Saharan Africa', 'Sao Tome and Principe', 'Fruits', 'Online', 'C', '6/20/2014', '514321792', '7/5/2014', '8102', '9.33', '6.92', '75591.66', '56065.84', '19525.82']
['Sub-Saharan Africa', 'Rwanda', 'Office Supplies', 'Offline', 'L', '2/1/2013', '115456712', '2/6/2013', '5062', '651.

From the above code, you can see that the header was also printed out. If you do not want it printed, you can skip it by calling the built-in next() function on the reader once before the loop. You can also choose which values to print by indexing into each row.

In [5]:
with open('100 Sales Records.csv', 'r') as csv_file:
    csv_reader = csv.reader(csv_file)
    next(csv_reader)  # skip the header row
    for line in csv_reader:
        # index 2 is 'Item Type', 9 is 'Unit Price', 10 is 'Unit Cost'
        print(f'Item:{line[2]} Unit Price:{line[9]} Unit Cost:{line[10]}')

Item:Baby Food Unit Price:255.28 Unit Cost:159.42
Item:Cereal Unit Price:205.70 Unit Cost:117.11
Item:Office Supplies Unit Price:651.21 Unit Cost:524.96
Item:Fruits Unit Price:9.33 Unit Cost:6.92
Item:Office Supplies Unit Price:651.21 Unit Cost:524.96
Item:Baby Food Unit Price:255.28 Unit Cost:159.42
Item:Household Unit Price:668.27 Unit Cost:502.54
Item:Vegetables Unit Price:154.06 Unit Cost:90.93
Item:Personal Care Unit Price:81.73 Unit Cost:56.67
Item:Cereal Unit Price:205.70 Unit Cost:117.11
Item:Vegetables Unit Price:154.06 Unit Cost:90.93
Item:Clothes Unit Price:109.28 Unit Cost:35.84
Item:Clothes Unit Price:109.28 Unit Cost:35.84
Item:Household Unit Price:668.27 Unit Cost:502.54
Item:Personal Care Unit Price:81.73 Unit Cost:56.67
Item:Clothes Unit Price:109.28 Unit Cost:35.84
Item:Cosmetics Unit Price:437.20 Unit Cost:263.33
Item:Beverages Unit Price:47.45 Unit Cost:31.79
Item:Household Unit Price:668.27 Unit Cost:502.54
Item:Meat Unit Price:421.89 Unit Cost:

## THE DICTREADER() METHOD
The csv module has a DictReader class, which reads a CSV file and returns each row as a dictionary. In Python 3.6 and 3.7 each row is an OrderedDict, a subclass of dict that preserves the order in which the keys were inserted (this is what the output below shows); from Python 3.8 onwards each row is a regular dict, which also preserves insertion order.
DictReader is useful for CSV files that have a header row. Unlike reader(), which returns the header row as part of the contents of the file, DictReader uses the header row as the keys of each dictionary.

In [8]:
with open('100 Sales Records.csv', 'r') as f:
    csv_reader = csv.DictReader(f)
    for line in csv_reader:
        print(line)

OrderedDict([('Region', 'Australia and Oceania'), ('Country', 'Tuvalu'), ('Item Type', 'Baby Food'), ('Sales Channel', 'Offline'), ('Order Priority', 'H'), ('Order Date', '5/28/2010'), ('Order ID', '669165933'), ('Ship Date', '6/27/2010'), ('Units Sold', '9925'), ('Unit Price', '255.28'), ('Unit Cost', '159.42'), ('Total Revenue', '2533654.00'), ('Total Cost', '1582243.50'), ('Total Profit', '951410.50')])
OrderedDict([('Region', 'Central America and the Caribbean'), ('Country', 'Grenada'), ('Item Type', 'Cereal'), ('Sales Channel', 'Online'), ('Order Priority', 'C'), ('Order Date', '8/22/2012'), ('Order ID', '963881480'), ('Ship Date', '9/15/2012'), ('Units Sold', '2804'), ('Unit Price', '205.70'), ('Unit Cost', '117.11'), ('Total Revenue', '576782.80'), ('Total Cost', '328376.44'), ('Total Profit', '248406.36')])
OrderedDict([('Region', 'Europe'), ('Country', 'Russia'), ('Item Type', 'Office Supplies'), ('Sales Channel', 'Offline'), ('Order Priority', 'L'), ('Order Date', '5/2/2014')

If the CSV file has no header row, DictReader() would still use the first row of data as the dictionary keys. To avoid this, we can pass a list of headers to DictReader() through its fieldnames parameter.
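A minimal sketch of passing headers explicitly is shown below. The data and the column names are hypothetical, and io.StringIO again stands in for a real file object.

```python
import csv
import io

# Hypothetical headerless data; the column names below are assumptions.
sample = "Angela,Female,20\nBetty,Female,18\n"
field_names = ['Name', 'Gender', 'Age']

# With fieldnames given, the first line is treated as data, not as a header.
reader = csv.DictReader(io.StringIO(sample), fieldnames=field_names)
records = [dict(row) for row in reader]

for record in records:
    print(record['Name'], record['Age'])
```

Because the headers are supplied by hand, the first data row is no longer consumed as keys.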

## WRITING TO CSV FILES
To write data to a CSV file, we use the writer() function of the csv module. It also takes a file object as a parameter, and the delimiter parameter is again optional.


In [None]:
f = open('100SalesRecords_copy.csv', 'w')
writer_obj = csv.writer(f)

The writer() function returns a writer object. This object has a writerow() method that takes a list of values as a parameter. In the code below, we copy three columns from an existing CSV file into a new, tab-delimited file.

In [None]:
with open('100 Sales Records.csv', 'r') as file_reader:
    csv_reader = csv.DictReader(file_reader)
    # newline='' stops the csv module's line endings from being translated again
    with open('new_sales_records.csv', 'w', newline='') as file_writer:
        csv_writer = csv.writer(file_writer, delimiter='\t')
        csv_writer.writerow(['Item Type', 'Units Sold', 'Unit Price'])
        for line in csv_reader:
            item_type = line['Item Type']
            units_sold = line['Units Sold']
            unit_price = line['Unit Price']
            csv_writer.writerow([item_type, units_sold, unit_price])

## THE DICTWRITER() METHOD
The DictWriter class is used to write dictionary objects to a CSV file. Besides a file object, it takes a list of headers (fieldnames) and, optionally, a delimiter.

To write the headers to the CSV file, we use the writeheader() method of the DictWriter object. Its writerow() method takes a dictionary as a parameter; the keys of the dictionary are the headers of the CSV file.

In [None]:
fields = ['Name', 'Gender', 'Age']
dict_list = [{'Name': 'Angela', 'Gender': 'Female', 'Age': 20},
             {'Name': 'Betty', 'Gender': 'Female', 'Age': 18},
             {'Name': 'Charles', 'Gender': 'Male', 'Age': 19},
             {'Name': 'Desmond', 'Gender': 'Male', 'Age': 22},
             {'Name': 'Esther', 'Gender': 'Female', 'Age': 20}]

In [None]:
# newline='' stops the csv module's line endings from being translated again
with open('Students_list.csv', 'w', newline='') as f:
    csv_writer = csv.DictWriter(f, fieldnames=fields, delimiter='\t')
    csv_writer.writeheader()
    for dict_item in dict_list:
        csv_writer.writerow(dict_item)

## JSON AND API
JSON stands for JavaScript Object Notation. It is a popular way to format data as a single human-readable string. Many websites provide their content as JSON data for programs to interact with, through what is known as an API (Application Programming Interface). Accessing an API is similar to accessing a web page via a URL. To make an API call, check the API documentation to find the URLs your program needs to request in order to get the data you want, as well as the general format of the JSON data structures that are returned.
![image.png](attachment:image.png)

The picture above is an example of JSON data. JSON supports strings, numbers, booleans and null, as well as arrays and objects, which can be nested.

## THE JSON MODULE
The json module allows us to translate JSON data to Python objects and vice versa. By default, it can convert dictionaries, lists, tuples, strings, numbers, booleans and None to JSON; other types need custom handling. Below is the conversion chart for encoding/serialising Python objects.

![image.png](attachment:image.png)

Below is the conversion chart for decoding/deserialising json data.

 ![image.png](attachment:image.png)
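The conversions in the charts can also be checked directly in code. The sketch below uses a made-up dictionary (all key names are illustrative) to show how common Python types are encoded and what comes back after decoding. It also shows that a type outside the default conversions, such as a set, raises a TypeError.

```python
import json

python_obj = {
    'text': 'hello',      # str   -> JSON string
    'whole': 3,           # int   -> JSON number
    'fraction': 2.5,      # float -> JSON number
    'flag': True,         # True  -> true
    'missing': None,      # None  -> null
    'items': [1, 2],      # list  -> JSON array
    'pair': (3, 4),       # tuple -> JSON array
}

encoded = json.dumps(python_obj)   # Python object -> JSON string
decoded = json.loads(encoded)      # JSON string -> Python object
print(encoded)
print(decoded)

# A set is not covered by the default conversions.
try:
    json.dumps({1, 2})
except TypeError as err:
    print('Not serialisable:', err)
```

Note that the tuple comes back as a list after decoding, because JSON has only one array type.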

The json module has several functions for parsing JSON data. The loads() and dumps() functions work with strings of JSON data. The loads() function, which stands for `load string`, translates (decodes/deserialises) a string of JSON data to a Python object.

In [9]:
import json
json_str = '{"name": "Amanda", "age": 20, "can_drive": true, "IQ": null}'
data = json.loads(json_str)
data

{'name': 'Amanda', 'age': 20, 'can_drive': True, 'IQ': None}

The dumps() function, which stands for `dump string`, translates (encodes/serialises) a Python object to a string of JSON data.

In [10]:
data = {'name': 'Amanda', 'age': 20, 'can_drive': True, 'IQ': None}
json_str = json.dumps(data)
json_str

'{"name": "Amanda", "age": 20, "can_drive": true, "IQ": null}'

The load() and dump() functions work with files containing JSON data. Below is an example of how we can use the load() function to deserialise JSON data from a file.

In [11]:
with open('example_2.json', 'r') as f:
    data = json.load(f)

In [12]:
print(data)

{'quiz': {'sport': {'q1': {'question': 'Which one is correct team name in NBA?', 'options': ['New York Bulls', 'Los Angeles Kings', 'Golden State Warriros', 'Huston Rocket'], 'answer': 'Huston Rocket'}}, 'maths': {'q1': {'question': '5 + 7 = ?', 'options': ['10', '11', '12', '13'], 'answer': '12'}, 'q2': {'question': '12 - 8 = ?', 'options': ['1', '2', '3', '4'], 'answer': '4'}}}}


In [13]:
data['quiz']['sport']

{'q1': {'question': 'Which one is correct team name in NBA?',
  'options': ['New York Bulls',
   'Los Angeles Kings',
   'Golden State Warriros',
   'Huston Rocket'],
  'answer': 'Huston Rocket'}}

Below is an example of how to serialise a Python object to JSON using the dump() function. It takes two required parameters: the Python object to be serialised and the file object to which the JSON data will be written.

In [14]:
data = {
  "colors": [
    {
      "color": "black",
      "category": "hue",
      "type": "primary",
      "code": {
        "rgba": [0,0,0,1],
        "hex": "#000"
      }
    },
    {
      "color": "white",
      "category": "value",
      "code": {
        "rgba": [255,255,255,1],
        "hex": "#FFF"
      }
    },
    {
      "color": "red",
      "category": "hue",
      "type": "primary",
      "code": {
        "rgba": [255,0,0,1],
        "hex": "#F00"
      }
    },
    {
      "color": "blue",
      "category": "hue",
      "type": "primary",
      "code": {
        "rgba": [0,0,255,1],
        "hex": "#00F"
      }
    }
  ]
}

In [15]:
with open('colors.json', 'w') as f:
    json.dump(data, f)
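To check that a dumped file can be read back, a round trip such as the one sketched below can help. The record and the file name are made up for illustration, a temporary directory is used so that no existing file is touched, and the optional indent argument of json.dump() is added only to make the written file easier to read.

```python
import json
import os
import tempfile

# Hypothetical record, similar in shape to one entry of the colors data.
record = {'color': 'blue', 'code': {'rgba': [0, 0, 255, 1], 'hex': '#00F'}}

# Write into a temporary directory so no existing file is overwritten.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'color.json')
    with open(path, 'w') as f:
        json.dump(record, f, indent=2)   # indent=2 pretty-prints the file
    with open(path, 'r') as f:
        restored = json.load(f)

# The restored object should compare equal to the original.
print(restored == record)
```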

## EXERCISE
Load the todo JSON data from the URL https://jsonplaceholder.typicode.com/todos and write all todos that have been completed to a new CSV file.

In [1]:
import requests
import json
import csv

In [5]:
data = requests.get('https://jsonplaceholder.typicode.com/todos')
json_data = json.loads(data.text)


In [79]:
# The with block closes the file automatically, even if an error occurs.
with open('newcompleted.csv', 'w', newline='') as j:
    csv_writer = csv.writer(j)
    csv_writer.writerow(['userId', 'id', 'title', 'completed'])
    for el in json_data:
        if el['completed']:
            csv_writer.writerow([el['userId'], el['id'], el['title'], el['completed']])
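An alternative sketch of the same exercise uses DictWriter, since each todo is already a dictionary. To keep it runnable without a network connection, it works on a small made-up list with the same keys as the todo records and writes to an in-memory buffer (io.StringIO) instead of a file on disk.

```python
import csv
import io

# Hypothetical sample in the same shape as the todo records.
todos = [
    {'userId': 1, 'id': 1, 'title': 'first task', 'completed': False},
    {'userId': 1, 'id': 2, 'title': 'second task', 'completed': True},
    {'userId': 2, 'id': 3, 'title': 'third task', 'completed': True},
]

# Keep only the todos that have been completed.
completed = [todo for todo in todos if todo['completed']]

# io.StringIO stands in for a file opened with open(..., 'w', newline='').
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=['userId', 'id', 'title', 'completed'])
writer.writeheader()
writer.writerows(completed)   # writes one row per dictionary

csv_text = buffer.getvalue()
print(csv_text)
```

With the real data, the list returned by the API would take the place of the sample list, and the buffer would be replaced by a file object.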