# Storing Data in CSV Files

Although printing to the terminal is a lot of fun, it’s not incredibly useful when it comes to data aggregation and analysis. To make the majority of web scrapers remotely useful, you need to be able to save the information that they scrape.

Chapter 7 of the textbook covers three main methods of data management for web scrapers: storing media files, storing data in CSV files, and storing data in a MySQL database. In this section, we learn how to write data to a CSV file using Python's built-in `csv` module. The `pandas` library provides an alternative for this task.

## What Is a CSV File?

The **CSV (Comma Separated Values) format** is the most common import and export format for spreadsheets and databases. A CSV file allows data to be saved in a tabular structure with a .csv extension. 

The CSV format was used for many years before any attempt was made to describe it in a standardized way. The lack of a well-defined standard means that subtle differences often exist in the data produced and consumed by different applications. These differences can make it annoying to process CSV files from multiple sources.

While the **delimiters** and **quoting** characters vary in the data produced and consumed by different applications, the overall format is similar enough that it is possible to write a single module which can efficiently manipulate such data, hiding the details of reading and writing the data from the programmer.

## Python's `csv` module

The `csv` module implements classes to read and write tabular data in CSV format. It allows programmers to say, “write this data in the format preferred by Excel,” or “read data from this file which was generated by Excel,” without knowing the precise details of the CSV format used by Excel.
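
As a minimal sketch of what a dialect changes, the snippet below writes the same row with two of the module's built-in dialects, `'excel'` (comma-delimited) and `'excel-tab'` (tab-delimited). The sample row and the `render` helper are hypothetical and exist only for this illustration; the output goes to an in-memory buffer rather than a file.

```python
import csv
import io

# Hypothetical sample row; the comma inside the second field forces quoting
# when the delimiter is also a comma.
row = ["Alex", "Brian, Jr.", "A"]


def render(row, dialect):
    """Write one row to an in-memory buffer and return the resulting text."""
    buffer = io.StringIO()
    csv.writer(buffer, dialect=dialect).writerow(row)
    return buffer.getvalue()


excel_text = render(row, "excel")      # fields separated by commas
tab_text = render(row, "excel-tab")    # fields separated by tabs

print(repr(excel_text))
print(repr(tab_text))
```

Only the comma-delimited version needs quotes around the second field, because a comma is an ordinary character once the delimiter is a tab.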

The `csv` module provides several functions and classes for reading and writing CSV files, including the `csv.reader` and `csv.writer` functions.

### `csv.reader`:

The `csv.reader` function takes the following parameters:

                  csv.reader(csvfile, dialect='excel', **fmtparams)
                                        
- `csvfile`: Any object that supports the iterator protocol and returns a string each time its `__next__()` method is called. File objects and lists of strings are both suitable.

- `dialect='excel'`: An optional parameter naming the CSV dialect, that is, a predefined set of formatting parameters. The default is `'excel'`.

- `**fmtparams`: Optional keyword arguments that override individual formatting parameters of the chosen dialect, such as `delimiter` or `quotechar`.

Here is an example of how to use the `csv.reader`:

#### <font color=blue> Example 1: Reading a CSV file. </font>

In [None]:
import csv
 
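# newline='' lets the csv module handle line endings itself
with open('example1.csv', newline='') as csv_file:
    reader = csv.reader(csv_file)  # each row is returned as a list of strings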
    for row in reader:
        print(row)
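
The formatting parameters described above can be sketched without a file on disk, since `csv.reader` accepts any iterable of strings. The example below is a minimal illustration, assuming a hypothetical source that separates fields with semicolons and quotes them with single quotes; the sample lines are made up for this purpose.

```python
import csv

# Hypothetical input: semicolon-delimited fields, single quotes for quoting.
lines = [
    "first_name;last_name;grade",
    "Alex;Brian;A",
    "Tom;'Smith; Jr.';B",
]

# Keyword arguments override the matching settings of the default 'excel' dialect.
reader = csv.reader(lines, delimiter=";", quotechar="'")
rows = list(reader)

for row in rows:
    print(row)
```

The semicolon inside the quoted field is kept as data instead of being treated as a separator.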

### `csv.writer`:

This function is similar to `csv.reader` and is used to write data to a CSV file. It takes the same three parameters:

             csv.writer(csvfile, dialect='excel', **fmtparams)

- `csvfile`: This can be any object with a `write()` method.
- `dialect='excel'`: An optional parameter naming the CSV dialect, that is, a predefined set of formatting parameters.
- `**fmtparams`: Optional keyword arguments that override individual formatting parameters of the chosen dialect.

The `writer()` function creates an object suitable for writing. To write the data, call the object's **`writerow()`** method for a single row or its **`writerows()`** method for a sequence of rows.

**<font color=red> Note:</font>** If `csvfile` is a file object, it should be opened with `newline=''`.
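
The difference between `writerow()` and `writerows()` can be sketched with an in-memory buffer, which works because `io.StringIO` has the `write()` method that `csv.writer` requires. The buffer and the sample rows are assumptions made for this illustration only.

```python
import csv
import io

buffer = io.StringIO()          # stands in for a file opened for writing
writer = csv.writer(buffer)

# writerow() takes one row: a single list of field values.
writer.writerow(["first_name", "last_name", "grade"])

# writerows() takes many rows: a list of lists.
writer.writerows([
    ["Alex", "Brian", "A"],
    ["Tom", "Smith", "B"],
])

text = buffer.getvalue()
print(text)
```

Each row ends with `\r\n`, the row terminator of the default `'excel'` dialect.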

#### <font color=blue> Example 2: Writing a CSV file. </font>

In [None]:
import csv

with open('example2.csv', 'w') as Example2:
    writer = csv.writer(Example2)
    writer.writerow(["first_name", "last_name", "Grade"])
     
print("Writing complete")

In [None]:
import csv

myData = [["first_name", "last_name", "Grade"],
          ['Alex', 'Brian', 'A'],
          ['Tom', 'Smith', 'B']]
 
with open('example3.csv', 'w') as Example3:
    writer = csv.writer(Example3)
    writer.writerows(myData)  # note that it's writerows, not writerow
     
print("Writing complete")


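Open the file that was created and notice the extra blank line between rows. This typically appears on Windows, where text mode turns the `\r\n` written by the `csv` module into `\r\r\n`. To avoid this, as noted above, open the file with the parameter `newline=''`.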

In [None]:
import csv

myData = [["first_name", "last_name", "Grade"],
          ['Alex', 'Brian', 'A'],
          ['Tom', 'Smith', 'B']]
 
# 'w' overwrites the earlier file; newline='' prevents the extra blank lines
with open('example3.csv', 'w', newline='') as Example3:
    writer = csv.writer(Example3)
    writer.writerows(myData)  # note that it's writerows, not writerow
     
print("Writing complete")
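
One way to confirm the effect of `newline=''` is to read the file back in binary mode and look at the raw bytes. The sketch below assumes a throwaway file in a temporary directory (the name `newline_demo.csv` is hypothetical) so that it does not touch the example files above.

```python
import csv
import os
import tempfile

rows = [["first_name", "last_name", "grade"],
        ["Alex", "Brian", "A"]]

# Hypothetical scratch file in a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "newline_demo.csv")

# newline='' passes the writer's '\r\n' row terminator through unchanged.
with open(path, "w", newline="") as csv_file:
    csv.writer(csv_file).writerows(rows)

# Binary mode shows exactly which bytes ended up on disk.
with open(path, "rb") as raw_file:
    raw = raw_file.read()

print(raw)
```

Every row ends in a single `\r\n`; a file with the blank-line problem would show `\r\r\n` instead.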

### Open Mode:

The first argument of the `open()` function is a string containing the file name. The second argument is another string of a few characters describing how the file will be used. The *open mode* can be:


- `'r'` opens the file for reading only.
- `'w'` opens the file for writing only (an existing file with the same name will be erased).
- `'a'` opens the file for appending; any data written to the file is automatically added to the end.
- `'r+'` opens the file for both reading and writing.

The mode argument is optional; `'r'` will be assumed if it’s omitted.
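
The modes listed above can be compared in a short sketch. It assumes a scratch text file in a temporary directory (the name `modes_demo.txt` is hypothetical) and writes to it three times before reading it back.

```python
import os
import tempfile

# Hypothetical scratch file in a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "modes_demo.txt")

with open(path, "w") as f:      # 'w' creates the file
    f.write("first\n")

with open(path, "w") as f:      # 'w' again erases the previous content
    f.write("second\n")

with open(path, "a") as f:      # 'a' adds to the end instead of erasing
    f.write("third\n")

with open(path) as f:           # mode omitted, so 'r' is assumed
    content = f.read()

print(content)
```

Only the lines written by the second and third calls survive, because the second `'w'` discarded the first line.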


#### <font color=blue> Example 4: Appending new rows to an existing file. </font>

In [None]:
import csv

newData = ['Amir', 'Gandomi', 'B']
 
with open('example3.csv', 'a', newline='') as Example3:
    writer = csv.writer(Example3)
    writer.writerow(newData)  # writerow (singular), because newData is a single row
     
print("Appending complete")