<center> <img src="res/ds3000.png"> </center>

<center> <h1> Week 4 - Day 1 </h1> </center>

<center> <h2> Part 2: Working with CSV Files </h2></center>

## Outline
1. <a href='#1'>Reading CSV Files into Pandas `DataFrames`</a>
2. <a href='#2'>Writing a DataFrame to a CSV File</a>
3. <a href='#3'>Reading Excel Spreadsheets</a>
4. <a href='#4'>Writing to an Excel File</a>
5. <a href='#5'>Reading CSV Files Using the Built-in CSV Module</a>
6. <a href='#6'>Writing Data to CSV Files</a>

## Comma-Separated Values (CSV) Files
* A CSV file is a delimited text file that uses commas to separate values
* Python's built-in **`csv` module** provides functions for working with CSV files

<a id="1"></a>

## 1. Reading CSV Files into Pandas `DataFrames` 
* Load a CSV dataset into a `DataFrame` with the pandas function **`read_csv`**
* `names` argument specifies the `DataFrame`’s column names
    * Without this argument, `read_csv` assumes that the CSV file’s first row is a comma-delimited list of column names

In [None]:
import pandas as pd

In [None]:
df = pd.read_csv('res/grade_book.csv', names=['Name', 'Potion1', 'Potion2', 'Potion3'])

In [None]:
df

In [None]:
df = pd.read_csv('res/grade_book.csv')
df

* Can instruct pandas to not use the first row as the header line
* Set **header=None**

In [None]:
df = pd.read_csv('res/grade_book.csv', header=None)
df
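The three ways of handling the first row can be compared side by side. The sketch below is only illustrative: it assumes a small headerless file and stands in for it with an in-memory string (the `raw` data and the variable names are made up for this example, not taken from `res/grade_book.csv`).

```python
import io

import pandas as pd

# Hypothetical in-memory stand-in for a headerless grade file.
raw = "Harry,85,90,78\nHermione,99,98,100\n"

# Default: the first data row is consumed as the column names.
default_df = pd.read_csv(io.StringIO(raw))

# header=None: every row is data; columns are numbered 0, 1, 2, ...
no_header_df = pd.read_csv(io.StringIO(raw), header=None)

# names=[...]: every row is data; columns get the supplied labels.
named_df = pd.read_csv(
    io.StringIO(raw), names=["Name", "Potion1", "Potion2", "Potion3"]
)

print(default_df.shape)        # one row fewer than the file has lines
print(no_header_df.shape)
print(list(named_df.columns))
```

Note that with the default settings the first student silently disappears into the header, which is why `names` or `header=None` matters for headerless files.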

* If column names are already available in the first line of the CSV file, they become the `DataFrame`'s column names:

In [None]:
df = pd.read_csv('res/grade_book_header.csv')
df

* Can specify the index column for the names of the rows:

In [None]:
df = pd.read_csv('res/grade_book_header.csv', index_col = "Name")
df

In [None]:
df.loc["Harry"]

In [None]:
df = pd.read_csv('res/grade_book_header.csv')
df

<a id="2"></a>

## 2. Writing a DataFrame to a CSV File
* To save a `DataFrame` to a file using CSV format, call `DataFrame` method **`to_csv`**
* `index=False` indicates that the row names (`0`–`5` at the left of the `DataFrame`’s output above) are not written to the file
* Resulting file contains the column names as the first row
* If you don't want to write column names, pass `header=False` in the method call

In [None]:
df.to_csv('res/grades_from_dataframe.csv', index=False)

In [None]:
!more res\grades_from_dataframe.csv

### 2.1. Appending to a CSV
* Set **mode = "a"**
* Set **header = False** if you do not want to write the column names again

In [None]:
df.to_csv('res/grades_from_dataframe.csv', index=False, header=False, mode = "a")

In [None]:
!more res\grades_from_dataframe.csv
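To see the effect of writing and then appending without relying on the `more` command, here is a minimal sketch that works in a temporary directory. The `df_demo` frame and the file name `grades_demo.csv` are hypothetical and exist only for this example.

```python
import os
import tempfile

import pandas as pd

# Hypothetical two-row DataFrame used only for this demonstration.
df_demo = pd.DataFrame({"Name": ["Harry", "Ron"], "Potion1": [85, 70]})

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "grades_demo.csv")

    # First write: column names followed by the two rows.
    df_demo.to_csv(path, index=False)

    # Append: the same two rows again, without repeating the column names.
    df_demo.to_csv(path, index=False, header=False, mode="a")

    # Read the file back as plain text to inspect what was written.
    with open(path, newline="") as f:
        lines = f.read().splitlines()

print(lines)
```

Leaving out `header=False` on the second call would write the column names again in the middle of the file.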

### 2.2. More File Parsing Functions in pandas
| Function | Description
| ------ | :------
| **`read_csv`** | Load delimited data from a file or URL; uses a comma as the default delimiter
| **`read_fwf`** | Read data in fixed-width column format (i.e., no delimiters)
| **`read_clipboard`** | Version of read_csv that reads data from the clipboard; useful for converting tables from web pages
| **`read_html`** | Read all tables contained in the given HTML document.
| **`read_json`** | Read data from a JSON (JavaScript Object Notation) string representation
| **`read_excel`** | Read data from Excel files
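As one illustration of the functions in the table, the sketch below reads fixed-width text with `read_fwf`. The sample text and the column boundaries passed to `colspecs` are assumptions made up for this example; each field occupies a fixed range of character positions instead of being separated by a delimiter.

```python
import io

import pandas as pd

# Hypothetical fixed-width data: characters 0-9 hold the name,
# characters 10-16 hold the grade (right-aligned).
fixed_width = (
    "Name      Potion1\n"
    "Harry          85\n"
    "Hermione       99\n"
)

# colspecs lists the half-open [start, end) character range of each column.
fwf_df = pd.read_fwf(io.StringIO(fixed_width), colspecs=[(0, 10), (10, 17)])

print(fwf_df)
```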

<a id="3"></a>

  
## 3. Reading Excel Spreadsheets

In [None]:
df = pd.read_excel('res/grade_book_header.xlsx')
df

### 3.1. Reading a Specific Sheet from an Excel File
* Can specify the sheet to read from an Excel file
* Provide the sheet name with the `sheet_name` argument (the second argument of `read_excel`)

In [None]:
df = pd.read_excel('res/grade_book_header.xlsx', sheet_name='Sheet2')
df

<a id="4"></a>

## 4. Writing to an Excel File

In [None]:
df

In [None]:
# The context manager saves and closes the file on exit
# (writer.save() was removed in pandas 2.0)
with pd.ExcelWriter('res/charms.xlsx') as writer:
    df.to_excel(writer, sheet_name='Sheet1') # can specify the name of the sheet

<a id="5"></a>

## 5. Reading CSV Files Using the Built-in CSV Module
* `csv` module’s **`reader` function** returns an object that reads CSV-format data from the specified file object
* Can iterate through the `reader` object one record (or row) of comma-delimited values at a time
* `csv` module’s documentation recommends opening CSV files with the additional keyword argument `newline=''` to ensure that newlines are processed properly 

In [None]:
import csv

In [None]:
!more res\grade_book.csv

In [None]:
with open('res/grade_book.csv', 'r', newline='') as grades:
    reader = csv.reader(grades)
    
    for row in reader:
        name, potion1, potion2, potion3 = row
        print("Grades for", name, "are", potion1, potion2, potion3)
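One detail worth noting is that `csv.reader` yields every field as a string, so numeric columns need an explicit conversion before any arithmetic. A minimal sketch, assuming rows shaped like the ones above and using an in-memory string (hypothetical sample data) in place of the file:

```python
import csv
import io

# Hypothetical in-memory stand-in for a grade file with one name and three grades per row.
sample = io.StringIO("Harry,85,90,78\nHermione,99,98,100\n", newline="")

averages = {}
for row in csv.reader(sample):
    name, *scores = row                    # every field arrives as a string
    grades = [int(s) for s in scores]      # convert before doing arithmetic
    averages[name] = sum(grades) / len(grades)

print(averages)
```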

<a id="6"></a>

## 6. Writing Data to CSV Files
* **`writer` function** returns an object that writes CSV data to the specified file object
* `writer`’s **`writerow` method** receives an iterable to store in the file
* By default, `writerow` delimits values with commas, but you can specify a custom delimiter with the `delimiter` keyword argument of `csv.writer`

In [None]:
with open('res/grade_book2.csv', mode='a', newline='') as grades:
    writer = csv.writer(grades)
    writer.writerow(['Ginny', 75, 95, 90])
    writer.writerow(['Neville', 70, 64, 63])
    writer.writerow(['Luna', 79, 85, 91])

In [None]:
!more res\grade_book2.csv
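A custom delimiter is set when the writer is created. The sketch below writes tab-separated values to an in-memory buffer instead of a file, so it leaves `res/grade_book2.csv` untouched; the names `buffer`, `tab_writer`, and `tab_text` are illustrative only.

```python
import csv
import io

# In-memory text buffer standing in for a file opened with newline=''.
buffer = io.StringIO(newline="")

# delimiter="\t" makes the writer separate values with tabs instead of commas.
tab_writer = csv.writer(buffer, delimiter="\t")
tab_writer.writerow(["Ginny", 75, 95, 90])
tab_writer.writerow(["Neville", 70, 64, 63])

tab_text = buffer.getvalue()
print(tab_text)
```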

* `writerow` calls above can be replaced with one **`writerows`** call that receives an iterable of rows (e.g., a list of lists), each of which is written as one comma-separated record
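A minimal sketch of the `writerows` version, again writing to an in-memory buffer as a stand-in for the file (the `records`, `out`, and `written` names are chosen only for this example):

```python
import csv
import io

# Each inner list is one record.
records = [
    ["Ginny", 75, 95, 90],
    ["Neville", 70, 64, 63],
    ["Luna", 79, 85, 91],
]

out = io.StringIO(newline="")
csv.writer(out).writerows(records)   # one call writes all three rows

written = out.getvalue().splitlines()
print(written)
```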