# Chapter 16: WORKING WITH CSV FILES AND JSON DATA

## The csv Module

CSV stands for “comma-separated values,” and CSV files are simplified spreadsheets stored as plaintext files. Python’s `csv` module makes it easy to parse CSV files.

### reader Objects

To read data from a CSV file with the `csv` module, you need to create a `reader` object. A `reader` object lets you iterate over lines in the CSV file.

In [20]:
import csv

example_file = open("automate-online-materials/example.csv")
example_reader = csv.reader(example_file)
example_data = list(example_reader)
example_data

[['04/05/2014 13:34', 'Apples', '73'],
 ['04/05/2014 03:41', 'Cherries', '85'],
 ['04/06/2014 12:46', 'Pears', '14'],
 ['04/08/2014 08:59', 'Oranges', '52'],
 ['04/10/2014 02:07', 'Apples', '152'],
 ['04/10/2014 18:10', 'Bananas', '23'],
 ['04/10/2014 02:40', 'Strawberries', '98']]

In [26]:
example_data[0]  # first row

['04/05/2014 13:34', 'Apples', '73']

In [31]:
example_data[0][1]  # second element of the first row

'Apples'
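
The cells above leave the file object open. A minimal sketch of the same read using `with` blocks, which close the file automatically, is shown below. It assumes a throwaway file in a temporary folder (not the book's *example.csv*) so that it runs anywhere; the two rows are copied from the output above.

```python
import csv
import os
import tempfile

# Work in a temporary folder so the sketch does not depend on any existing file.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'example.csv')

    # Create a small CSV file with two of the rows shown above.
    with open(path, 'w', newline='') as f:
        f.write('04/05/2014 13:34,Apples,73\n04/05/2014 03:41,Cherries,85\n')

    # The file is closed as soon as the with block ends.
    with open(path, newline='') as example_file:
        example_rows = list(csv.reader(example_file))

print(example_rows[0][1])  # second element of the first row
```

Every value comes back as a string, so convert quantities with `int()` before doing arithmetic on them.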

### Reading Data from reader Objects in a for Loop

In [44]:
import csv

reader = csv.reader(open("automate-online-materials/example.csv"))
for row in reader:
    print(f"Row #{reader.line_num} {row}")

Row #1 ['04/05/2014 13:34', 'Apples', '73']
Row #2 ['04/05/2014 03:41', 'Cherries', '85']
Row #3 ['04/06/2014 12:46', 'Pears', '14']
Row #4 ['04/08/2014 08:59', 'Oranges', '52']
Row #5 ['04/10/2014 02:07', 'Apples', '152']
Row #6 ['04/10/2014 18:10', 'Bananas', '23']
Row #7 ['04/10/2014 02:40', 'Strawberries', '98']


### writer Objects

A `writer` object lets you write data to a CSV file. To create a `writer` object, you use the `csv.writer()` function.

In [1]:
import csv

output_file = open('output.csv', 'w', newline='')  # newline='' avoids blank rows on Windows
output_writer = csv.writer(output_file)

output_writer.writerow(['spam', 'eggs', 'bacon', 'ham'])
output_writer.writerow(['Hello, world!', 'eggs', 'bacon', 'ham'])
output_writer.writerow([1, 2, 3.141592, 4])

output_file.close()

In [6]:
!cat output.csv

spam,eggs,bacon,ham
"Hello, world!",eggs,bacon,ham
1,2,3.141592,4
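
Notice that the `writer` wrapped `Hello, world!` in double quotes because the value contains a comma. The sketch below illustrates that a `reader` undoes this quoting. It uses an in-memory `io.StringIO` buffer in place of a real file, which is an assumption made only to keep the example self-contained.

```python
import csv
import io

# An in-memory text buffer stands in for output.csv.
buffer = io.StringIO(newline='')
writer = csv.writer(buffer)
writer.writerow(['Hello, world!', 'eggs', 'bacon', 'ham'])

written_text = buffer.getvalue()
print(repr(written_text))  # the first field is quoted in the raw text

# Reading the text back yields the original four fields, comma included.
rows = list(csv.reader(io.StringIO(written_text, newline='')))
print(rows)
```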


### The delimiter and lineterminator Keyword Arguments

In [1]:
import csv

csv_file = open('example.tsv', 'w', newline='')
csv_writer = csv.writer(csv_file, delimiter='\t', lineterminator='\n\n')
csv_writer.writerow(['apples', 'oranges', 'grapes'])
csv_writer.writerow(['eggs', 'bacon', 'ham'])
csv_writer.writerow(['spam', 'spam', 'spam', 'spam', 'spam', 'spam'])

csv_file.close()

In [2]:
!cat example.tsv

apples	oranges	grapes

eggs	bacon	ham

spam	spam	spam	spam	spam	spam
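
To read such a file back, pass the same `delimiter` to `csv.reader()`. The sketch below assumes text shaped like the first two rows of *example.tsv* and holds it in an in-memory buffer rather than a file. Because the rows were written with `lineterminator='\n\n'`, the reader sees a blank line after each row and returns it as an empty list, so the example filters those out.

```python
import csv
import io

# Tab-separated, double-spaced text, like the first two rows written above.
tsv_text = 'apples\toranges\tgrapes\n\neggs\tbacon\tham\n\n'

reader = csv.reader(io.StringIO(tsv_text, newline=''), delimiter='\t')

# Blank lines come back as empty lists; keep only rows that have fields.
tsv_rows = [row for row in reader if row]
print(tsv_rows)
```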



### DictReader and DictWriter CSV Objects

In [10]:
import csv

csv_file = open("automate-online-materials/exampleWithHeader.csv")
dict_reader = csv.DictReader(csv_file)
for row in dict_reader:
    print(row['Timestamp'], row['Fruit'], row['Quantity'])

4/5/2014 13:34 Apples 73
4/5/2014 3:41 Cherries 85
4/6/2014 12:46 Pears 14
4/8/2014 8:59 Oranges 52
4/10/2014 2:07 Apples 152
4/10/2014 18:10 Bananas 23
4/10/2014 2:40 Strawberries 98


If you tried to use a `DictReader` object with *example.csv*, which doesn’t have column headers in the first row, the `DictReader` object would use `'04/05/2014 13:34'`, `'Apples'`, and `'73'` as the dictionary keys, and that first row would no longer be returned as data. To avoid this, you can supply the `DictReader()` function with a second argument containing made-up header names:

In [11]:
file = open("automate-online-materials/example.csv")
dict_reader = csv.DictReader(file, ['time', 'name', 'amount'])
for row in dict_reader:
    print(row['time'], row['name'], row['amount'])

04/05/2014 13:34 Apples 73
04/05/2014 03:41 Cherries 85
04/06/2014 12:46 Pears 14
04/08/2014 08:59 Oranges 52
04/10/2014 02:07 Apples 152
04/10/2014 18:10 Bananas 23
04/10/2014 02:40 Strawberries 98


`DictWriter` objects use dictionaries to create CSV files.

In [12]:
output_file = open('output.csv', 'w', newline='')
outputDictWriter = csv.DictWriter(output_file, ['Name', 'Pet', 'Phone'])
outputDictWriter.writeheader()
outputDictWriter.writerow({'Name':'Alice', 'Pet':'cat', 'Phone':'555-1234'})
outputDictWriter.writerow({'Name':'Bob', 'Phone':'555-9999'})
outputDictWriter.writerow({'Phone':'555-5555', 'Name':'Carol', 'Pet':'dog'})

output_file.close()

In [16]:
!cat output.csv

Name,Pet,Phone
Alice,cat,555-1234
Bob,,555-9999
Carol,dog,555-5555
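
Bob's row shows that a missing key is written as an empty field. `DictWriter` has two keyword arguments that control these edge cases: `restval` sets the filler for missing keys, and `extrasaction='ignore'` drops keys that are not in the field names instead of raising `ValueError`. The sketch below is only an illustration: the `'n/a'` filler and the extra `'Age'` key are made up, and an in-memory buffer replaces the file.

```python
import csv
import io

buffer = io.StringIO(newline='')
writer = csv.DictWriter(
    buffer,
    ['Name', 'Pet', 'Phone'],
    restval='n/a',            # written when a row lacks one of the field names
    extrasaction='ignore',    # silently skip keys that are not field names
    lineterminator='\n',
)
writer.writeheader()
writer.writerow({'Name': 'Bob', 'Phone': '555-9999'})  # no 'Pet' key
writer.writerow({'Name': 'Carol', 'Pet': 'dog', 'Phone': '555-5555', 'Age': 7})  # extra key

dict_output = buffer.getvalue()
print(dict_output)
```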


## Project: Removing the Header from CSV Files

In [10]:
# removeCsvHeader.py - Removes the header from all CSV files in the current
# working directory

import os
import csv

os.makedirs('headerRemoved', exist_ok=True)

# Loop through every file in the current working directory.
for csvFilename in os.listdir('.'):
    if not csvFilename.endswith('.csv'):
        continue  # skip non-csv files
        
    print(f"Removing header from {csvFilename} ...")
    
    # Read the CSV file in (skipping first row)
    csvRows = []
    csvFileObj = open(csvFilename)
    readerObj = csv.reader(csvFileObj)
    for row in readerObj:
        if readerObj.line_num == 1:
            continue  # skip first row
        csvRows.append(row)
    csvFileObj.close()
    
    # Write out the CSV file
    csvFileObj = open(os.path.join('headerRemoved', csvFilename), 'w', newline='')
    csvWriter = csv.writer(csvFileObj)
    for row in csvRows:
        csvWriter.writerow(row)
    csvFileObj.close()

Removing header from example.csv ...
Removing header from exampleWithHeader.csv ...
Removing header from output.csv ...
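
An alternative to checking `line_num` on every iteration is to consume the header once with `next()` before looping. The helper below, `rows_without_header()`, is a hypothetical name used for illustration, and the sample text reuses the header and first row of *exampleWithHeader.csv* shown earlier.

```python
import csv
import io


def rows_without_header(file_obj):
    """Return every row of an open CSV file object except the first."""
    reader = csv.reader(file_obj)
    # Discard the header row; the None default avoids StopIteration on an empty file.
    next(reader, None)
    return list(reader)


sample = io.StringIO('Timestamp,Fruit,Quantity\n4/5/2014 13:34,Apples,73\n', newline='')
body_rows = rows_without_header(sample)
print(body_rows)
```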


### Ideas for Similar Programs

- Compare data between different rows in a CSV file or between multiple CSV files.
- Copy specific data from a CSV file to an Excel file, or vice versa.
- Check for invalid data or formatting mistakes in CSV files and alert the user to these errors.
- Read data from a CSV file as input for your Python programs.

---

## JSON and APIs

JavaScript Object Notation is a popular way to format data as a single human-readable string. JSON is the native way that JavaScript programs write their data structures.

**JSON (JavaScript Object Notation)** is a lightweight data-interchange format. It is easy for humans to read and write. It is easy for machines to parse and generate. © [json.org](https://json.org/)

### The json Module

Python’s `json` module handles all the details of translating between a string of JSON data and Python values through the `json.loads()` and `json.dumps()` functions. JSON can’t store *every* kind of Python value. It can contain values of only the following data types: strings, integers, floats, Booleans, lists, dictionaries, and `NoneType`. JSON cannot represent Python-specific objects, such as `File` objects, CSV `reader` or `writer` objects, `Regex` objects, or Selenium `WebElement` objects.
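
As a quick illustration of that limit, the sketch below passes a set (a type not in the list above) to `json.dumps()`, which raises `TypeError`. Converting the value to a supported type first is one common workaround. The `'tags'` key and its contents are made up for the example.

```python
import json

try:
    json.dumps({'tags': {'cat', 'pet'}})  # a set has no JSON equivalent
except TypeError as error:
    error_message = str(error)
    print(error_message)

# Workaround: convert the set to a list (sorted here for a stable order).
as_json = json.dumps({'tags': sorted({'cat', 'pet'})})
print(as_json)
```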

#### Reading JSON with the `loads()` Function

In [14]:
stringOfJsonData = '{"name":"Zophie", "isCat":true, "miceCaught":0, "felineIQ":null}'
import json

jsonDataAsPythonValue = json.loads(stringOfJsonData)
jsonDataAsPythonValue

{'name': 'Zophie', 'isCat': True, 'miceCaught': 0, 'felineIQ': None}

#### Writing JSON with the `dumps()` Function

In [15]:
pythonValue = {'isCat':True, 'miceCaught':0, 'name':'Zophie', 'felineIQ':None}
import json

stringOfJsonData = json.dumps(pythonValue)
stringOfJsonData

'{"isCat": true, "miceCaught": 0, "name": "Zophie", "felineIQ": null}'
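
`json.dumps()` also accepts optional `indent` and `sort_keys` arguments, which can make the output easier to read. The sketch below uses the same cat data, pretty-prints it, and checks that `json.loads()` recovers an equal dictionary.

```python
import json

python_value = {'isCat': True, 'miceCaught': 0, 'name': 'Zophie', 'felineIQ': None}

# indent adds line breaks and indentation; sort_keys orders the keys alphabetically.
pretty = json.dumps(python_value, indent=2, sort_keys=True)
print(pretty)

# Loading the string back gives a dictionary equal to the original.
round_trip = json.loads(pretty)
print(round_trip == python_value)
```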

### Project: Fetching Current Weather Data

https://openweathermap.org/api
- https://openweathermap.org/current
- https://openweathermap.org/forecast5

In [16]:
# getOpenWeather.py - Prints the weather for a location from the command line.

import sys
import json
import requests

# Compute location from command line arguments.
if len(sys.argv) < 2:
    print("Usage: getOpenWeather.py city_name, 2-letter_country_code")
    sys.exit()

# variables to request
# location = ",".join(sys.argv[1:])
location = "Samarkand, Uzbekistan"
API_KEY = 'YOUR_API_KEY_HERE'

# Download the JSON data from OpenWeatherMap.org's API
url = f"https://api.openweathermap.org/data/2.5/weather?q={location}&appid={API_KEY}"
response = requests.get(url)
response.raise_for_status()
#print(response.text)

# Load JSON data into a Python variable
w = json.loads(response.text)  # weather data
# Print weather description
print(f"Current weather in {location}:")
print(w['weather'][0]['main'], '-', w['weather'][0]['description'])

Current weather in Samarkand, Uzbekistan:
Clouds - overcast clouds
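
The location above contains a comma and a space, so it can help to encode the query string explicitly and to keep the parsing step separate from the download. The sketch below makes no network request. `build_weather_url()` and `describe_weather()` are hypothetical helper names, and `sample_payload` is a hand-written stand-in that contains only the two fields the program above reads, not a real API response.

```python
from urllib.parse import urlencode


def build_weather_url(location, api_key):
    """Build the current-weather URL with the query string percent-encoded."""
    query = urlencode({'q': location, 'appid': api_key})
    return f"https://api.openweathermap.org/data/2.5/weather?{query}"


def describe_weather(weather_data):
    """Format the first entry of the 'weather' list as 'main - description'."""
    first = weather_data['weather'][0]
    return f"{first['main']} - {first['description']}"


# Hypothetical payload with only the fields used above.
sample_payload = {'weather': [{'main': 'Clouds', 'description': 'overcast clouds'}]}

weather_url = build_weather_url('Samarkand, Uzbekistan', 'YOUR_API_KEY_HERE')
print(weather_url)
print(describe_weather(sample_payload))
```

Keeping the parsing in its own function means it can be tested against saved data without spending API calls.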


#### Ideas for Similar Programs

- Collect weather forecasts for several campsites or hiking trails to see which one will have the best weather.
- Schedule a program to regularly check the weather and send you a frost alert if you need to move your plants indoors. (Chapter 17 covers scheduling, and Chapter 18 explains how to send email.)
- Pull weather data from multiple sites to show all at once, or calculate and show the average of the multiple weather predictions.

## Practice Project

### Excel-to-CSV Converter

**LINK ➡** https://automatetheboringstuff.com/2e/chapter16/#calibre_link-372:~:text=Excel%2Dto%2DCSV%20Converter

In [None]:
import csv
import os

import openpyxl

for excelFile in os.listdir('.'):
    # Skip non-xlsx files, load the workbook object.
    if not excelFile.endswith('.xlsx'):
        continue
    wb = openpyxl.load_workbook(excelFile)

    for sheetName in wb.sheetnames:
        # Loop through every sheet in the workbook.
        sheet = wb[sheetName]

        # Create the CSV filename from the Excel filename and sheet title.
        csvFilename = f"{os.path.splitext(excelFile)[0]}_{sheet.title}.csv"
        # Create the csv.writer object for this CSV file.
        csvFile = open(csvFilename, 'w', newline='')
        csvWriter = csv.writer(csvFile)

        # Loop through every row in the sheet.
        for rowNum in range(1, sheet.max_row + 1):
            rowData = []    # append each cell to this list
            # Loop through each cell in the row.
            for colNum in range(1, sheet.max_column + 1):
                # Append each cell's data to rowData.
                rowData.append(sheet.cell(row=rowNum, column=colNum).value)

            # Write the rowData list to the CSV file.
            csvWriter.writerow(rowData)

        csvFile.close()