# <p style="text-align: Center;">Working with CSV and JSON</p>
## <p style="text-align: Center;">University of Wyoming COSC 1010</p>
### <p style="text-align: Center;">Adapted from: *Automate the Boring Stuff with Python* By Al Sweigart </p>

## Working with CSV files and JSON Data
---
* CSV and JSON files are plain text, meaning you don't necessarily need a special module to work with them
* In addition, you can open them in a normal text editor like Notepad
* But, Python has special *csv* and *json* modules to make your life easier when working with them 
* Each module provides functionality to work with the specific file formats 

## Working with CSV files and JSON Data
---
* **CSV** stands for "comma-separated values"
* CSV files are simplified spreadsheets, stored as plaintext files 
* Python's *csv* module makes parsing them easy 
* **JSON** is a format that stores information as JavaScript source code in plaintext files
    * JSON is short for *JavaScript Object Notation* 
* You don't need to know JavaScript to use JSON

## The CSV Module 
---
* Each line in a CSV file represents a row in a spreadsheet
* The commas separate the cells in the row
* The advantage of CSV files is that they are simple
* CSV is widely supported by many types of programs, and can be almost universally viewed without a special program 
* They are a straightforward way to represent spreadsheet data 
* It is exactly as advertised, a text file with comma-separated values 

## The CSV Module 
---
* CSV files are simple, and lack many of the features of an Excel spreadsheet, like:
    * They don't have types, everything is a string 
    * No settings for fonts, size, or color
    * Do not have multiple worksheets
    * Cell height and width can't be changed 
    * No merged cells
    * No images or charts

## The CSV Module 
---
* Since CSV files are just text you can read them in as a string
* As each cell is separated by a comma you could split the cells on the commas
* The issue is not every comma may represent a boundary between cells
* CSV has escape characters that allow commas to be part of values, which `split()` would not properly handle
* As a result using the `csv` module is the better choice 
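
To make the difference concrete, here is a minimal sketch comparing `split()` with `csv.reader()`. The one-line CSV string is made up for illustration, and `io.StringIO` is used only so that no file is needed on disk.

```python
import csv
import io

# Hypothetical one-line CSV: the first cell contains a comma, so it is quoted
line = '"Hello, class!",eggs,bacon\n'

# Naive approach: split on every comma, including the one inside the quotes
naive_cells = line.strip().split(",")
print(naive_cells)   # 4 pieces, with the quote characters left in

# csv.reader understands the quoting rules and keeps the cell together
parsed_cells = next(csv.reader(io.StringIO(line)))
print(parsed_cells)  # 3 cells, quotes removed
```

The naive version breaks one cell into two, which is the failure the `csv` module is designed to avoid.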

## Reader Objects
--- 
* To read data from a CSV file with the `csv` module, a `Reader` object is needed
* A `Reader` object allows you to iterate over the lines in a CSV file
* The `csv` module comes with Python, so it doesn't need to be installed through pip 
* To work with a CSV file, the `csv` module needs to be imported 

## Reader Objects
--- 
* To read a CSV file first it needs to be opened with `open()`
* Then the file object `open()` returns will be passed to `csv.reader()`
* `csv.reader()` will return a `Reader` object for you to utilize 

In [1]:
import csv 

exampleCSV = open("example.csv")
exampleReader = csv.reader(exampleCSV)
exampleData = list(exampleReader)
print(exampleData)

[['4/5/2014 13:34', 'Apples', '73'], ['4/5/2014 3:41', 'Cherries', '85'], ['4/6/2014 12:46', 'Pears', '14'], ['4/8/2014 8:59', 'Oranges', '52'], ['4/10/2014 2:07', 'Apples', '152'], ['4/10/2014 18:10', 'Bananas', '23'], ['4/10/2014 2:40', 'Strawberries', '98']]


## Reader Objects
--- 
* The most direct way to access values in a Reader object is to pass it to `list()`
* Using `list()` will return a list of lists 
* This list of lists can then be stored in a variable, like `exampleData`
* The data can be printed to show it is a list of lists

## Reader Objects
--- 
* Once the CSV file has been converted to a list of lists, the individual items can be accessed
* You can utilize `exampleData[row][col]` where:
    * `row` is the index of one of the lists
    * `col` is the index of the item you want to work with in that list

In [2]:
print(exampleData[0][0])
print(exampleData[0][1])
print(exampleData[0][2])

4/5/2014 13:34
Apples
73


In [4]:
print(exampleData[1][1])
print(exampleData[6][1])

Cherries
Strawberries
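
Because CSV files have no types, every value read this way is a string, even the ones that look like numbers. The sketch below assumes a few rows copied from the example output above, held in memory with `io.StringIO` rather than read from `example.csv`, and converts the third column with `int()` before adding it up.

```python
import csv
import io

# A few rows copied from the example data, held in memory instead of a file
sample = (
    "4/5/2014 13:34,Apples,73\n"
    "4/5/2014 3:41,Cherries,85\n"
    "4/6/2014 12:46,Pears,14\n"
)
rows = list(csv.reader(io.StringIO(sample)))

# Every cell is a string, even the ones that look like numbers
print(type(rows[0][2]))

# Convert the third column to int before doing arithmetic with it
total = sum(int(row[2]) for row in rows)
print(total)
```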


## Reading Data from Reader Objects in a for loop
--- 
* For larger CSV files, it may be best to use a `Reader` object in a `for` loop
* This avoids loading the entire file into memory at once 
* This can be done once you have the reader object 
* You can loop through the rows in the reader object, much like you would a list
* The reader object can be looped over only once
    * To re-read the CSV file you must call `csv.reader()` to create a new `Reader`
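
The one-pass behavior can be seen in a small sketch. The two-row data is hypothetical, and an in-memory `io.StringIO` object stands in for an opened file. Rewinding with `seek(0)` before creating the new reader is one option. Reopening the file is another.

```python
import csv
import io

# Hypothetical two-row CSV held in memory; it behaves like an opened file
sample = io.StringIO("Apples,73\nCherries,85\n")
reader = csv.reader(sample)

first_pass = [row for row in reader]
second_pass = [row for row in reader]   # the reader is already exhausted
print(len(first_pass), len(second_pass))

# Rewind the underlying file object, then create a new reader to read again
sample.seek(0)
fresh_reader = csv.reader(sample)
third_pass = list(fresh_reader)
print(third_pass)
```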

In [6]:
exampleCSV = open("example.csv")
exampleReader = csv.reader(exampleCSV)
for row in exampleReader:
    print('Row #' + str(exampleReader.line_num) + ' ' + str(row))

Row #1 ['4/5/2014 13:34', 'Apples', '73']
Row #2 ['4/5/2014 3:41', 'Cherries', '85']
Row #3 ['4/6/2014 12:46', 'Pears', '14']
Row #4 ['4/8/2014 8:59', 'Oranges', '52']
Row #5 ['4/10/2014 2:07', 'Apples', '152']
Row #6 ['4/10/2014 18:10', 'Bananas', '23']
Row #7 ['4/10/2014 2:40', 'Strawberries', '98']


## Writer Objects
--- 
* A `Writer` object lets you write data to a CSV file
* To create a `Writer` object the `csv.writer()` function is called
* Again `open()` needs to be called first, but it needs an additional argument 
    * You need to pass `'w'` to indicate the file will be opened for writing 
* Then the `writerow()` method for `Writer` objects can be used
    * It takes a list of values as its argument 
    * Each value in the list is placed in its own cell
* The writer will automatically escape any commas in the values 

In [8]:
# newline="" keeps the csv module from writing extra blank lines on Windows
outputFile = open("output.csv", "w", newline="")
outputWriter = csv.writer(outputFile)
outputWriter.writerow(['spam','eggs','bacon','ham'])
outputWriter.writerow(['Hello, class!','eggs','bacon','ham'])
outputWriter.writerow([1,2,3,4])

outputFile.close()
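
To see what the writer actually produces, the same rows can be written to an in-memory buffer instead of a file. This is only a sketch for inspection: `io.StringIO` stands in for `output.csv`, and the variable names are illustrative.

```python
import csv
import io

# Write the same three rows to an in-memory buffer so the text can be inspected
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(['spam', 'eggs', 'bacon', 'ham'])
writer.writerow(['Hello, class!', 'eggs', 'bacon', 'ham'])
writer.writerow([1, 2, 3, 4])

written = buffer.getvalue()
print(written)
```

The second row comes out as `"Hello, class!",eggs,bacon,ham`: the writer wrapped the cell in double quotes so its comma is not mistaken for a cell boundary.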

## Delimiters and Line Terminators  
---
* Suppose you want to separate values with something like a tab character rather than commas
* Or, you want the rows to be double-spaced 
* You can change the delimiter and line terminator characters in your file
    * The *delimiter* is the character that appears between cells on a row, by default a comma
    * The *line terminator* is the character that comes at the end of a row, by default a newline

In [9]:
tsvFile = open('example.tsv', 'w', newline='')
tsvWriter = csv.writer(tsvFile, delimiter="\t", lineterminator="\n\n")

tsvWriter.writerow(['spam','eggs','bacon','ham'])
tsvWriter.writerow(['Hello, class!','eggs','bacon','ham'])
tsvWriter.writerow([1,2,3,4])

tsvFile.close()

## Delimiters and Line Terminators  
---
* Passing `delimiter='\t'` changes the character between cells to be a tab character 
* Passing `lineterminator='\n\n'` changes the character between rows to be two newline characters 
* The file being written to ends with `.tsv` to denote it is a *tab separated values* file 
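
The same `delimiter` keyword also works when reading. The sketch below writes two rows with a tab delimiter and a double-newline terminator into an in-memory buffer (a stand-in for `example.tsv`), then reads them back. The blank lines between rows come back as empty lists, so they are filtered out here.

```python
import csv
import io

# Write tab-separated, double-spaced rows to an in-memory buffer
buffer = io.StringIO()
tsv_writer = csv.writer(buffer, delimiter="\t", lineterminator="\n\n")
tsv_writer.writerow(['spam', 'eggs', 'bacon', 'ham'])
tsv_writer.writerow(['Hello, class!', 'eggs', 'bacon', 'ham'])
written = buffer.getvalue()

# Read it back with the same delimiter; blank lines become empty lists
buffer.seek(0)
tsv_reader = csv.reader(buffer, delimiter="\t")
rows = [row for row in tsv_reader if row]
print(rows)
```

Since the delimiter is now a tab, the comma in `Hello, class!` no longer needs to be quoted.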

## JSON and APIs
---
* JavaScript Object Notation is a popular way to format data as a single human-readable string 
* JSON is the native way that JavaScript programs write their data structures
* You don't need to know JavaScript
* JSON data is typically organized as `key:value` pairs

## JSON and APIs
---
* Here is an example of JSON:
```
    {
        "universityName":"University of Wyoming",
        "city":"Laramie",
        "state":"Wyoming",
        "elevation":7220
    }
```

## JSON and APIs
---
* JSON is useful to know as many websites offer JSON content as a way for programmers to interact with the website
* This is known as providing an *application programming interface* (API) 
* Accessing an API is the same as accessing any other webpage, it is done through a URL 
* The difference is that rather than returning HTML, the data returned is formatted for machines, like JSON

## JSON and APIs
---
* Many websites make data available through JSON
    * Some require registration 
    * Others are free 
    * Each site will have its own documentation 
* Reading documentation is important as it says what URLs your program should use to request data
* And what the data will look like when it is returned

## JSON and APIs
---
* Using APIs you could write a program that:
    * Scrapes raw data from websites
    * Automatically downloads new posts from a social network
    * Creates a *movie encyclopedia* for your personal library by scraping data from IMDb, Wikipedia, etc.

## The `json` module
--- 
* Python's `json` module handles all the details of translating between a string with JSON data and Python values 
* `json.loads()` translates a string of JSON data into a Python value
* `json.dumps()` translates a Python value into a string of JSON data
* JSON cannot represent Python-specific objects, such as file objects or CSV `Reader` and `Writer` objects

## The `json` module
--- 
* JSON can't store *every* kind of Python value
* It can contain only:
    * strings
    * integers
    * floats
    * Booleans
    * lists
    * dictionaries
    * the `None` value (`NoneType`)
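
As a rough illustration of these limits, the sketch below encodes a dictionary built only from the supported types, then tries a `set`, which has no JSON equivalent. All of the keys and values are made up for the example.

```python
import json

# A dictionary that uses only the supported types (values are made up)
supported = {
    "name": "Laramie",       # string
    "elevation": 7220,       # integer
    "ratio": 0.5,            # float
    "isCity": True,          # Boolean
    "tags": ["a", "b"],      # list
    "mayor": None,           # None
}
encoded = json.dumps(supported)
print(encoded)

# A set is a Python-specific type, so json.dumps() raises a TypeError
set_error = None
try:
    json.dumps({"letters": {"a", "b"}})
except TypeError as error:
    set_error = str(error)
    print("Could not encode:", set_error)
```

In the encoded string, `True` becomes `true` and `None` becomes `null`, which are the JSON spellings of those values.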

## Reading JSON with the `loads()` Function
---
* To translate a string containing JSON into a Python value, the string needs to be passed to `json.loads()`
    * The name means *load string*
* First though, the `json` module must be imported 
* Then `json.loads()` can be called
* JSON strings always use double quotes 
* When the JSON string holds an object, it returns the data as a Python dictionary 
* Dictionaries preserve insertion order in Python 3.7 and later, so the key-value pairs appear in the same order as in the JSON string
* Much like we could have dictionaries in dictionaries, you can have JSON objects nested in others

In [11]:
import json

jsonString = """{
        "universityName":"University of Wyoming",
        "city":"Laramie",
        "state":"Wyoming",
        "elevation":7220
    }"""

jsonAsPy =  json.loads(jsonString)
print(jsonAsPy)

{'universityName': 'University of Wyoming', 'city': 'Laramie', 'state': 'Wyoming', 'elevation': 7220}
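
Nested JSON is read the same way. The sketch below uses a made-up variation of the example with an object and a list inside the outer object. The `location` and `colors` keys are illustrative and not part of the original data.

```python
import json

# Hypothetical nested JSON: an object and a list inside the outer object
nested_json = """{
    "universityName": "University of Wyoming",
    "location": {"city": "Laramie", "state": "Wyoming"},
    "colors": ["brown", "gold"]
}"""

university = json.loads(nested_json)

# Inner objects become dictionaries and arrays become lists,
# so keys and indexes can be chained
city = university["location"]["city"]
first_color = university["colors"][0]
print(city, first_color)
```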


## Writing JSON with `dumps()`
---
* `dumps()` stands for *dump string* 
* It translates a Python value into  a string of JSON formatted data
* The value can only be one of the following types:
    * dictionaries
    * lists
    * integers
    * floats
    * strings
    * Booleans
    * None

In [12]:
randDict = {"keyOne":"hi", "keyTwo":7220}
stringOfJSON = json.dumps(randDict)
print(stringOfJSON)

{"keyOne": "hi", "keyTwo": 7220}
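
`json.dumps()` also accepts an `indent` argument, which can make larger structures easier to read. The sketch below reuses the same dictionary and checks that loading the string back gives an equal dictionary.

```python
import json

randDict = {"keyOne": "hi", "keyTwo": 7220}

# indent=4 spreads the output over several lines with 4-space indentation
pretty = json.dumps(randDict, indent=4)
print(pretty)

# Loading the string back gives a dictionary equal to the original
roundTrip = json.loads(pretty)
print(roundTrip == randDict)
```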


## Putting it Together
---
* We can put it together to write a program to give us a weather forecast
* We can:
    * Make an API call to retrieve a JSON string 
    * Parse that string into a Python dictionary
    * Give a forecast for Laramie

## Putting it Together
--- 
* To begin we will need an additional module to make the API request
    * API requests are very similar to when you visit a website
* We will be using the `requests` module to accomplish this
* We will also need the URL `https://api.weather.gov/gridpoints/CYS/84,23/forecast`

In [17]:
# import the requests module so we can reach out to the external website
import requests

# first we will store the URL in a variable
url = "https://api.weather.gov/gridpoints/CYS/84,23/forecast"
# we then need to make our request
# what we are really doing is sending an HTTP GET request to a website,
# but getting JSON back rather than HTML
response = requests.get(url)
# that is the overall response though, we need to pull our data from it
response_data = response.text
# Print it out, it will be truncated
# visiting the URL is a great way to see what it looks like!
print(response_data)


{
    "@context": [
        "https://geojson.org/geojson-ld/geojson-context.jsonld",
        {
            "@version": "1.1",
            "wx": "https://api.weather.gov/ontology#",
            "geo": "http://www.opengis.net/ont/geosparql#",
            "unit": "http://codes.wmo.int/common/unit/",
            "@vocab": "https://api.weather.gov/ontology#"
        }
    ],
    "type": "Feature",
    "geometry": {
        "type": "Polygon",
        "coordinates": [
            [
                [
                    -105.5968706,
                    41.312215100000003
                ],
                [
                    -105.59459770000001,
                    41.290412000000003
                ],
                [
                    -105.5655653,
                    41.292117800000007
                ],
                [
                    -105.56783200000001,
                    41.313921100000009
                ],
                [
                    -105.5968706,
              

In [19]:
#Just to be safe, we will save that JSON string to a file
with open("weather.json","w") as file:
    file.write(response_data)
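
A saved JSON file can be loaded later without calling the API again. The sketch below shows the file-based counterparts `json.dump()` and `json.load()`. It assumes a tiny stand-in dictionary and a temporary file named `sample_weather.json` (both hypothetical) so that it runs without the real `weather.json`.

```python
import json
import os
import tempfile

# Tiny stand-in for the real response (hypothetical values)
sample = {"properties": {"periods": [{"name": "Today", "temperature": 58}]}}

# Write to a temporary location so nothing in the working directory is touched
path = os.path.join(tempfile.mkdtemp(), "sample_weather.json")
with open(path, "w") as file:
    json.dump(sample, file)       # dump() writes JSON straight to a file

with open(path) as file:
    restored = json.load(file)    # load() parses JSON straight from a file

print(restored == sample)
```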

## Putting it Together
---
* OK, we have the JSON data, both in memory and stored locally
* The response is large; the data we really want is a list stored inside a sub-dictionary
* We can read in the overall data, and then using the keys find the data we really want

In [21]:
#Move all the data into a Python dictionary
data = json.loads(response_data)
#This is the first sub dictionary, you can figure it out by looking at the json file
props = data["properties"]
#Periods is the data we actually want
periods = props["periods"]
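
If the response is missing the expected keys, for example when the service returns an error message instead of a forecast, indexing with `["properties"]` raises a `KeyError`. One possible way to guard against that is sketched below with `dict.get()`. The `get_periods` helper and both sample strings are hypothetical, not part of the weather API.

```python
import json

# Hypothetical responses: one shaped like the forecast, one like an error
sample_response = '{"properties": {"periods": [{"name": "Today", "temperature": 58}]}}'
error_response = '{"status": 503, "detail": "Service unavailable"}'

def get_periods(json_text):
    """Return the list of forecast periods, or an empty list if it is missing."""
    data = json.loads(json_text)
    # get() returns the fallback instead of raising KeyError
    return data.get("properties", {}).get("periods", [])

print(get_periods(sample_response))
print(get_periods(error_response))
```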

## Putting it Together
---
* OK! We now have the actual data we want
* Now we can parse through that data and give a nice weather forecast for Laramie
* Again, looking at your data ahead of time really helps to know what you have access to

In [31]:
print("Weather forecast for Laramie:")

for period in periods:
    #Periods is a list
    #With dictionaries as its elements
    name = "{:18}".format(period["name"])
    print(f"{name}{period['temperature']}F")
    print("---------------------")

Weather forecast for Laramie:
Today             58F
---------------------
Tonight           39F
---------------------
Thursday          50F
---------------------
Thursday Night    29F
---------------------
Friday            52F
---------------------
Friday Night      27F
---------------------
Saturday          53F
---------------------
Saturday Night    29F
---------------------
Sunday            46F
---------------------
Sunday Night      24F
---------------------
Monday            37F
---------------------
Monday Night      18F
---------------------
Tuesday           46F
---------------------
Tuesday Night     26F
---------------------
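
The padding can also be done inside a single f-string, since a format specification such as `:18` works there as well. The `format_period` helper below is a hypothetical name, and the two dictionaries are stand-ins with values taken from the output above.

```python
def format_period(period):
    """Pad the period name to 18 characters, then append the temperature."""
    return f"{period['name']:18}{period['temperature']}F"

# Stand-in data copied from the forecast output
sample_periods = [
    {"name": "Today", "temperature": 58},
    {"name": "Thursday Night", "temperature": 29},
]

for period in sample_periods:
    print(format_period(period))
```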
