# Comma Separated Values (CSV) Module in Python

Although these files are called comma separated, other separators such as *semicolon* and *tab* are also common.

Either *pandas* or the standard-library *csv* module can be used to work with them.

---

## Reading data from a CSV file

Using the `csv` library we can *read* and *write* data. Reading is done with the `reader(csvfile, delimiter=',')` function.

In [2]:
# Standard Library 
import csv 

# Opening the csv file and inputting that object to the reader function
with open('contacts.csv', newline='') as csvfile:
    reader = csv.reader(csvfile, delimiter=',')
    for row in reader:
        print(row)
        print(','.join(row))

['Name', 'Phone']
Name,Phone
['mother', '222-555-101']
mother,222-555-101
['father', '222-555-102']
father,222-555-102
['wife', '222-555-103']
wife,222-555-103
['mother-in-law', '222-555-104']
mother-in-law,222-555-104


Using the `csv` library, we open the CSV file as a file object and pass it to `csv.reader(csv_obj, delimiter=',')`. The `delimiter` argument is given explicitly because files can use different separators.

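When we open the file with `with open('csv_file.csv', newline='') as csvfile`, the `newline=''` argument turns off Python's own newline translation so that the `csv` module handles line endings itself. Without it, line breaks embedded inside quoted fields may be interpreted incorrectly.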
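
To make the role of `newline=''` concrete, here is a minimal sketch that writes a field containing a line break and reads it back. The file name `notes.csv`, the temporary directory, and the sample values are assumptions made up for illustration; they are not part of the contacts file used in these notes.

```python
import csv
import os
import tempfile

# Hypothetical sample data: the second field contains an embedded line break
rows = [['Name', 'Note'], ['mother', 'call\r\nafter 6pm']]

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'notes.csv')  # hypothetical file name

    # newline='' leaves line endings to the csv module when writing
    with open(path, 'w', newline='') as csvfile:
        csv.writer(csvfile).writerows(rows)

    # newline='' again, so the embedded line break is not translated on reading
    with open(path, newline='') as csvfile:
        read_back = list(csv.reader(csvfile))

# The field with the line break should come back as one unchanged value
print(read_back == rows)
```

The writer quotes the field that contains the line break, and the reader puts it back together as a single value.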

After that we loop over the reader object, which yields each row of data as a list of strings.
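
As a small sketch of the same loop with a different separator, the example below reads semicolon-separated text. The sample text is an assumption made up for illustration, and `io.StringIO` stands in for a real file so the snippet runs on its own.

```python
import csv
import io

# Hypothetical semicolon-separated data standing in for a real file
sample = "Name;Phone\nmother;222-555-101\nfather;222-555-102\n"

reader = csv.reader(io.StringIO(sample), delimiter=';')

header = next(reader)               # take the first row off as the header
contacts = [row for row in reader]  # the remaining rows are lists of strings

print(header)    # ['Name', 'Phone']
print(contacts)  # [['mother', '222-555-101'], ['father', '222-555-102']]
```

Calling `next(reader)` before the loop is one common way to keep the header separate from the data rows.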

---

## We could read CSV files as dictionaries

Each line (row) is mapped to a dictionary using the `DictReader` class. On Python 3.8 and later each row is a regular `dict`; on Python 3.6 and 3.7 it is an `OrderedDict`.

In [3]:
import csv 

with open('contacts.csv', newline='') as csvfile:
    reader = csv.DictReader(csvfile)
    for row in reader:
        print(row['Name'], ':', row['Phone'])
    

mother : 222-555-101
father : 222-555-102
wife : 222-555-103
mother-in-law : 222-555-104


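`DictReader` treats the first line of the file as a header by default. If the file has no header row, we **MUST** supply the column names ourselves through `fieldnames`. When `fieldnames` is given, the first line is read as ordinary data, which is why `Name : Phone` appears in the output below for a file that does have a header.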


In [4]:
with open('contacts.csv', newline='') as csvfile:
    fieldnames = ['Name', 'Phone']
    reader = csv.DictReader(csvfile, fieldnames=fieldnames)
    for row in reader:
        print(row['Name'], ':', row['Phone'])

Name : Phone
mother : 222-555-101
father : 222-555-102
wife : 222-555-103
mother-in-law : 222-555-104
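
The output above includes the header row as data because `fieldnames` was passed for a file that already has a header. One possible way around this, sketched below, is to discard the first row before looping. The in-memory sample text is an assumption standing in for `contacts.csv`.

```python
import csv
import io

# Hypothetical stand-in for a contacts file that already has a header row
sample = "Name,Phone\nmother,222-555-101\nfather,222-555-102\n"

reader = csv.DictReader(io.StringIO(sample), fieldnames=['Name', 'Phone'])

# With explicit fieldnames the header line is returned as data, so skip it
next(reader)

phones = {row['Name']: row['Phone'] for row in reader}
print(phones)  # {'mother': '222-555-101', 'father': '222-555-102'}
```

If the column names in the file are already the ones we want, simply leaving out `fieldnames` has the same effect.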


## Saving data to a CSV file 

This is done with the `writer(csvfile, delimiter)` function.

Instead of opening our file without a *mode*, we pass the *"w"* mode for writing.

We use `writer = csv.writer(csvfile, delimiter=',')` to create the writer object,

then we add data using `writer.writerow([])`.

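We passed the `quotechar` and `quoting` options explicitly, although these values match the defaults: by *default* the writer uses `csv.QUOTE_MINIMAL` with `"` as the quote character, so values are **not** quoted unless they need to be.
- using `quoting=csv.QUOTE_MINIMAL` means that only values with **special characters** like the *separator*, the `quotechar`, or a line break will be quoted

We can see this with `"grandmother, grandfather"`: it contains a `,`, which is our delimiter, so the writer quotes the whole value instead of splitting it into two fields.
- `quoting` has several other options:
    - `csv.QUOTE_ALL` will quote all values
    - `csv.QUOTE_NONNUMERIC` will quote all non-numeric values
    - `csv.QUOTE_NONE` doesn't quote any value; special characters then have to be escaped with an `escapechar`, and the writer raises an error if none is set (usually a bad idea)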
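
To compare the quoting options side by side, here is a minimal sketch that writes the same row with each one. The helper `render()` and the sample row are assumptions made up for illustration; `io.StringIO` stands in for a file so the raw text can be inspected.

```python
import csv
import io

def render(row, **options):
    """Hypothetical helper: write one row in memory and return the raw CSV text."""
    buffer = io.StringIO()
    csv.writer(buffer, **options).writerow(row)
    return buffer.getvalue()

# Sample row: one value with a comma, one number, one plain string
row = ['grandmother, grandfather', 42, 'plain']

minimal = render(row, quoting=csv.QUOTE_MINIMAL)
quote_all = render(row, quoting=csv.QUOTE_ALL)
nonnumeric = render(row, quoting=csv.QUOTE_NONNUMERIC)
# QUOTE_NONE only works for this row because an escapechar is supplied
none_escaped = render(row, quoting=csv.QUOTE_NONE, escapechar='\\')

# Without an escapechar, QUOTE_NONE cannot write the comma and raises csv.Error
try:
    render(row, quoting=csv.QUOTE_NONE)
    none_error = None
except csv.Error as exc:
    none_error = str(exc)

print(repr(minimal))       # '"grandmother, grandfather",42,plain\r\n'
print(repr(quote_all))     # '"grandmother, grandfather","42","plain"\r\n'
print(repr(nonnumeric))    # '"grandmother, grandfather",42,"plain"\r\n'
print(repr(none_escaped))  # 'grandmother\\, grandfather,42,plain\r\n'
print(none_error)
```

Only the value containing the delimiter is quoted under `QUOTE_MINIMAL`, which is why it is usually a sensible default.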

In [7]:
import csv 

with open("exported_csv_file.csv", "w", newline="") as csvfile:
    writer = csv.writer(csvfile, delimiter=',', quotechar='"', quoting=csv.QUOTE_MINIMAL)
    
    writer.writerow(['Name', 'Phone'])
    writer.writerow(['mother', '222-555-101'])
    writer.writerow(['father', '222-555-102'])
    writer.writerow(['wife', '222-555-103'])
    writer.writerow(['mother-in-law', '222-555-104'])
    writer.writerow(['grandmother, grandfather', '222-555-105'])

## We know about DictReader, but what about DictWriter?

We set it up similar to the `writer` object but this time we have to worry about a few more items:
- fieldnames
- writing headers 
- a different way to enter row information while using the same `writerow()` function

Along with a *file object*, we need a list of column names as *fieldnames*. We pass both of these to `csv.DictWriter()`.

Before saving values we call `writer.writeheader()`, otherwise the file will not start with a header row.

As usual... we use `writer.writerow()` to put the data into the CSV file; however, *this time* we pass a **dictionary** of the form `{fieldname1: val1, fieldname2: val2}` to say which column each value belongs to, instead of a **list**.
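
As a minimal sketch of these steps, the example below writes a few dictionaries and reads them straight back. It uses `writerows()`, which takes a list of dictionaries and is not used elsewhere in these notes, and an in-memory `io.StringIO` buffer in place of a real file; the sample contacts are assumptions made up for illustration.

```python
import csv
import io

fieldnames = ['Name', 'Phone']

# Hypothetical sample contacts, one dictionary per row
contacts = [
    {'Name': 'mother', 'Phone': '222-555-101'},
    {'Name': 'grandmother, grandfather', 'Phone': '222-555-105'},
]

buffer = io.StringIO()  # stands in for a file opened with "w" and newline=''
writer = csv.DictWriter(buffer, fieldnames=fieldnames)
writer.writeheader()        # first line: Name,Phone
writer.writerows(contacts)  # one CSV line per dictionary

# Rewind and read the same text back with DictReader
buffer.seek(0)
read_back = list(csv.DictReader(buffer))

print(read_back == contacts)
```

Because the header was written first, `DictReader` can pick up the column names on its own when reading the data back.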


In [8]:
import csv 

with open('dict_exported_contacts.csv', 'w', newline='') as csvfile:
    fieldnames = ['Name', 'Phone']
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
    
    writer.writeheader()
    writer.writerow({'Name': 'mother', 'Phone': '222-555-101'})
    writer.writerow({'Name': 'father', 'Phone': '222-555-102'})
    writer.writerow({'Name': 'wife', 'Phone': '222-555-103'})
    writer.writerow({'Name': 'mother-in-law', 'Phone': '222-555-104'})
    writer.writerow({'Name': 'grandmother, grandfather and auntie', 'Phone': '222-555-105'})

## Summary and Recap

**Comma Separated Values (CSV)** is a simple text format for moving data between different platforms. We use the `csv` library here, but `pandas` can also work with CSV files. The separators don't *necessarily* have to be commas; they could be semicolons, spaces or tabs.

**How do we read CSV files?** 
- We import the `csv` library 
    - `import csv` 
- Open the CSV file as a **file object** and remember that `newline=''` helps prevent misinterpretation of newline characters
    - `with open('csv_file_name.csv', newline='') as csvfile` 
- Use the `csv.reader()` function to create a reader object, passing the csvfile object and the delimiter used (since several separators are possible)
    - `reader = csv.reader(csvfile, delimiter=',')` 
- Loop through the **reader object** to see each row of data
    -  ```python
       for row in reader:
            print(','.join(row))
       ```

**Instead of getting each row as a list** we could work with dictionaries (`OrderedDict` on Python 3.6 and 3.7)
- Do the same thing with importing and opening a file object 
- This time we create the reader object with `reader = csv.DictReader(csvfile)`, which takes the column names from the first line
- However... if the file has no header row we **must** provide the columns using `fieldnames`
    - set the field names (columns) as a list `fieldnames=['col1', 'col2']`
    - add this fieldnames to the reader object `reader = csv.DictReader(csvfile, fieldnames=fieldnames)`  
- This means we could access the data using **keys**
    - ```python
      for row in reader:
            print(row['key1'], row['key2'])
      ```  

**Saving data** to a CSV file can be done in two ways, mirroring the two ways of reading
- We import our csv library 
    - `import csv` 
- Open the file as an object **BUT** we must open it in *write* mode
    - `with open('file_name.csv', "w", newline='') as csvfile` 
- Create the writer object 
    - `writer = csv.writer(csvfile, delimiter=',', quotechar='"', quoting=csv.QUOTE_MINIMAL)`
- Insert rows
    - `writer.writerow(['data1', 'data2'])` 
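
The steps above can be sketched end to end as follows. The temporary directory and the sample rows are assumptions made up for illustration, so the snippet does not leave a file behind; the file name follows the placeholder used in the list.

```python
import csv
import os
import tempfile

# Hypothetical sample rows; the last name contains the delimiter
rows = [
    ['Name', 'Phone'],
    ['mother', '222-555-101'],
    ['grandmother, grandfather', '222-555-105'],
]

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'file_name.csv')

    # Write mode plus newline='' so the csv module controls line endings
    with open(path, 'w', newline='') as csvfile:
        writer = csv.writer(csvfile, delimiter=',', quotechar='"',
                            quoting=csv.QUOTE_MINIMAL)
        for row in rows:
            writer.writerow(row)

    # Look at the raw text that ended up on disk
    with open(path, newline='') as csvfile:
        raw_lines = csvfile.read().splitlines()

    # Parse the same file again with csv.reader
    with open(path, newline='') as csvfile:
        read_back = list(csv.reader(csvfile, delimiter=','))

print(raw_lines[2])       # "grandmother, grandfather",222-555-105
print(read_back == rows)  # True
```

The quoted value is stored with quotes on disk, and the reader strips them again so the data comes back as it went in.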

**Before we move onto** the second way we must look at **quote characters**. By default values are **not** quoted; the default `quoting=csv.QUOTE_MINIMAL` means that only values containing **special characters** (the delimiter, the quote character, or a line break) are wrapped in the `quotechar='"'` provided.
- There are more options like:
    - `csv.QUOTE_ALL` will quote all values 
    - `csv.QUOTE_NONNUMERIC` will quote all non-numeric values 
    - `csv.QUOTE_NONE` doesn't quote any value and needs an `escapechar` for special characters (usually a bad idea)
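
The quoting option also affects reading. As a small sketch, a reader created with `csv.QUOTE_NONNUMERIC` converts every unquoted field to a `float`. The sample text below is an assumption made up for illustration.

```python
import csv
import io

# Hypothetical data: names are quoted, numbers are not
sample = '"mother",42\r\n"father",3.5\r\n'

reader = csv.reader(io.StringIO(sample), quoting=csv.QUOTE_NONNUMERIC)
rows = list(reader)

print(rows)  # [['mother', 42.0], ['father', 3.5]]
```

This only works when every unquoted field really is a number, so it would not suit the phone numbers in the contacts file, which are written without quotes.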

**Saving data with dictionaries** is another way to store data
- This follows the same pattern: import csv and open a file object in *write* mode
- But it also includes `fieldnames=['col1','col2']` as we need the fieldnames for columns
- After that, we create the writer object using `DictWriter()`
    - `writer = csv.DictWriter(csvfile, fieldnames=fieldnames)`
- Now we have to **actually** write the header before inputting data, otherwise the file will have no header row
    - `writer.writeheader()`  
- Instead of passing a list to the `writerow()` function
    - We pass a dictionary with the fieldnames as keys and the data as values
        -  `writer.writerow({'Name': 'mother', 'Phone': '222-555-101'})`
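
Since the dictionary keys have to line up with `fieldnames`, it helps to know what happens when they don't. The sketch below relies on the default `DictWriter` settings: a missing key is written as an empty value, and a key that is not in `fieldnames` raises a `ValueError`. The `Email` key and its value are assumptions made up for illustration.

```python
import csv
import io

buffer = io.StringIO()  # stands in for a file opened in write mode
writer = csv.DictWriter(buffer, fieldnames=['Name', 'Phone'])
writer.writeheader()

# Missing 'Phone' key: the column is filled with an empty string by default
writer.writerow({'Name': 'mother'})

# Extra key that is not in fieldnames: rejected by default, nothing is written
try:
    writer.writerow({'Name': 'father', 'Phone': '222-555-102',
                     'Email': 'father@example.com'})  # hypothetical extra field
    extra_key_error = None
except ValueError as exc:
    extra_key_error = str(exc)

output = buffer.getvalue()
print(repr(output))  # 'Name,Phone\r\nmother,\r\n'
print(extra_key_error)
```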

## Let's dumb it down for Justin (me)

**Comma Separated Values (CSV)** files use delimiters (separators) that could be commas, semicolons, tabs, spaces etc...

Using the `csv` library we could *read* and *write* data in two ways: with the **reader/writer** functions, which treat rows as *lists*, or with the **DictReader/DictWriter** classes, which treat rows as *dictionaries*

It all starts with **opening the csv file as an object**
- `with open('csv_file_name.csv', newline='') as csvfile`

Depending on our need (read/write) we add "w" to that `open()` call to write data. **`newline=''`** is important!! It stops Python from translating line endings, so the `csv` module can handle newline characters correctly.

We create the reader/writer object with one of these four calls:
- `reader = csv.reader(csvfile, delimiter=',')`
- `reader = csv.DictReader(csvfile, fieldnames=['col1','col2'])`
    - Fieldnames give the dictionaries their **keys** (columns); they can be left out when the first line of the file is a header
- `writer = csv.writer(csvfile, delimiter=',', quotechar='"', quoting=csv.QUOTE_MINIMAL)` 
    - Quoting is important because it handles *special characters* in our inputted data
- `writer = csv.DictWriter(csvfile, fieldnames=['col1', 'col2'])`

If we are **reading** we just loop through the reader object:
- with the *normal* reader, each row is a list (access by position)
- with the *dictionary* reader, each row is a dictionary (access with keys)
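
A quick sketch of the difference between the two row types, using made-up sample text in an `io.StringIO` buffer instead of a real file:

```python
import csv
import io

# Hypothetical sample with a header row and one contact
sample = "Name,Phone\nmother,222-555-101\n"

list_rows = list(csv.reader(io.StringIO(sample)))
dict_rows = list(csv.DictReader(io.StringIO(sample)))

# csv.reader keeps the header as the first row; values are found by position
print(list_rows[1][0])       # mother

# csv.DictReader uses the header for keys; values are found by column name
print(dict_rows[0]['Name'])  # mother
```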

If we are **writing** we use `writer.writerow()`.
However, depending on the writer type we pass either a list `['d1','d2']` or a dictionary keyed by the **fieldnames** `{'fieldname1':'val', 'fieldname2':'val2'}`

**BUT** if we are writing using `DictWriter()` and want the file to start with a header row, we **MUST** call `writer.writeheader()` before we insert data