# CSV Files

Python can work with more than plain text files. Comma-separated values (CSV) files are a commonly used format for storing data. A CSV file resembles a spreadsheet in that data is stored in rows and columns: each row is on its own line in the file, and commas separate the columns. Here is an example of a CSV file.

![Monty Python CSV](https://apollo-media.codio.com/media%2F1%2Fc473eb6beb4634bfe7a411fb3b37daf0-ececdc69-7163-4d1c-b8d4-6b77565aa9f9.webp)

To read a CSV file, import Python's `csv` module. By convention, import statements go at the top of your code. The CSV file is opened much like a text file, but the file object is then passed to a CSV reader, which returns each row as a list of strings.

```python
import csv

with open("student_folder/csv/monty_python_movies.csv", "r") as input_file:
    reader = csv.reader(input_file)
    for row in reader:
        print(row)
```

```text
['Movie Title', 'Rating']
['Monty Python and the Holy Grail', '5']
["Monty Python's Life of Brian", '4']
['Monty Python Live at the Hollywood Bowl', '4']
["Monty Python's The Meaning of Life", '5']
```
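
Every value the reader returns is a string, including the ratings. The sketch below illustrates this with an in-memory copy of the first rows shown above. Using `io.StringIO` in place of the opened file is an assumption made so the example runs without the course's file. It converts a rating to an integer before doing arithmetic.

```python
import csv
import io

# In-memory stand-in for the CSV file (assumption: same first rows as above).
sample = io.StringIO(
    "Movie Title,Rating\n"
    "Monty Python and the Holy Grail,5\n"
    "Monty Python's Life of Brian,4\n"
)

rows = list(csv.reader(sample))
title, rating = rows[1]  # first data row, after the header

print(type(rating))      # the rating is read as a string
print(int(rating) + 1)   # convert before doing arithmetic
```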


## What happens if you:

* Remove the `import csv` line of code?
* Pass the CSV filename, instead of the file object, as the argument to `csv.reader`:

```python
import csv

with open("student_folder/csv/monty_python_movies.csv", "r") as input_file:
    reader = csv.reader("monty_python_movies.csv")
    for row in reader:
        for data in row:
            print(data)
```

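With the `import csv` line removed, the code still runs here only because an earlier notebook cell already imported `csv`; in a fresh session it raises `NameError: name 'csv' is not defined`.

```python
with open("student_folder/csv/monty_python_movies.csv", "r") as input_file:
    reader = csv.reader(input_file)
    for row in reader:
        print(row)
```

```text
['Movie Title', 'Rating']
['Monty Python and the Holy Grail', '5']
["Monty Python's Life of Brian", '4']
['Monty Python Live at the Hollywood Bowl', '4']
["Monty Python's The Meaning of Life", '5']
```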


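Passing the filename string to `csv.reader` instead of the file object prints the filename one character per line:

```python
import csv

with open("student_folder/csv/monty_python_movies.csv", "r") as input_file:
    reader = csv.reader("monty_python_movies.csv")
    for row in reader:
        for data in row:
            print(data)
```

```text
m
o
n
t
y
_
p
y
t
h
o
n
_
m
o
v
i
e
s
.
c
s
v
```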
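
The single-character output comes from how `csv.reader` treats its argument: it loops over whatever it is given and parses each item as one line. Looping over a string yields one character at a time, so each character becomes its own one-item row, and the opened file is never read. A minimal sketch, using a shortened hypothetical filename and made-up lines to keep the output small:

```python
import csv

# A string is iterated character by character, so each character
# is parsed as its own "line" of CSV text.
rows_from_string = list(csv.reader("a.csv"))
print(rows_from_string)

# A list of lines (or an open file) is iterated line by line instead.
rows_from_lines = list(csv.reader(["Movie Title,Rating", "Holy Grail,5"]))
print(rows_from_lines)
```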


## Next
The first row of a CSV file is helpful because the header values provide context for the data. However, the header row gets in the way if you want to count the rows of data, calculate an average value, and so on. The built-in `next` function advances the reader by one row, which lets Python skip the header before looping through the rest of the file.

```python
import csv

with open("student_folder/csv/home_runs.csv", "r") as input_file:
    reader = csv.reader(input_file)
    next(reader)  # skip the header row
    for row in reader:
        print(row)
```

```text
['Barry Bonds', '762', 'No']
['Hank Aaron', '755', 'No']
['Babe Ruth', '714', 'No']
['Alex Rodriguez', '696', 'No']
['Willie Mays', '660', 'No']
['Albert Pujols', '656', 'Yes']
['Ken Griffey Jr.', '630', 'No']
['Jim Thome', '612', 'No']
['Sammy Sosa', '609', 'No']
['Frank Robinson', '586', 'No']
```
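
Skipping the header matters most when the rows feed a calculation. The sketch below counts the data rows and averages the home runs. It assumes an in-memory copy of the header and the first three data rows (via `io.StringIO`) rather than the real file, and it keeps the value returned by `next` in a `header` variable, a name chosen here for illustration.

```python
import csv
import io

# In-memory stand-in for home_runs.csv (assumption: first three data rows only).
sample = io.StringIO(
    "Player,Home Runs,Active Player\n"
    "Barry Bonds,762,No\n"
    "Hank Aaron,755,No\n"
    "Babe Ruth,714,No\n"
)

reader = csv.reader(sample)
header = next(reader)  # next() returns the row it skips, so it can be kept

count = 0
total = 0
for row in reader:
    count += 1
    total += int(row[1])  # column 1 holds the home run count as a string

print(header)
print(count, total / count)
```

Without the `next(reader)` call, `int(row[1])` would be applied to the header value `'Home Runs'` and raise a `ValueError`.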


## What happens if you:

* Remove the line `next(reader)`?
* Have two lines of `next(reader)`?

Without `next(reader)`, the header row is printed along with the data:

```python
import csv

with open("student_folder/csv/home_runs.csv", "r") as input_file:
    reader = csv.reader(input_file)
    for row in reader:
        print(row)
```

```text
['Player', 'Home Runs', 'Active Player']
['Barry Bonds', '762', 'No']
['Hank Aaron', '755', 'No']
['Babe Ruth', '714', 'No']
['Alex Rodriguez', '696', 'No']
['Willie Mays', '660', 'No']
['Albert Pujols', '656', 'Yes']
['Ken Griffey Jr.', '630', 'No']
['Jim Thome', '612', 'No']
['Sammy Sosa', '609', 'No']
['Frank Robinson', '586', 'No']
```


With two `next(reader)` calls, both the header row and the first data row (Barry Bonds) are skipped:

```python
import csv

with open("student_folder/csv/home_runs.csv", "r") as input_file:
    reader = csv.reader(input_file)
    next(reader)  # skip the header row
    next(reader)  # skip the first data row
    for row in reader:
        print(row)
```

```text
['Hank Aaron', '755', 'No']
['Babe Ruth', '714', 'No']
['Alex Rodriguez', '696', 'No']
['Willie Mays', '660', 'No']
['Albert Pujols', '656', 'Yes']
['Ken Griffey Jr.', '630', 'No']
['Jim Thome', '612', 'No']
['Sammy Sosa', '609', 'No']
['Frank Robinson', '586', 'No']
```
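
Each call to `next` consumes one row, so calling it more times than there are rows raises `StopIteration`. Passing a default value as the second argument avoids the error. A minimal sketch, assuming a hypothetical file that holds only a header row (kept in memory with `io.StringIO`):

```python
import csv
import io

# Hypothetical file that contains only a header row.
sample = io.StringIO("Player,Home Runs,Active Player\n")
reader = csv.reader(sample)

header = next(reader)           # consumes the only row
first_row = next(reader, None)  # nothing left, so the default is returned

print(header)
print(first_row)
```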


## Reading Question

Why do you need the `import csv` statement in the examples above?

- **Because `csv.reader` would generate an error if the import statement did not exist.**
- You do not need it; it is a typo.
- You cannot open a file with the `.csv` file extension without importing `csv`.
- Because importing `csv` is required for every Python program.

Calling `csv.reader` raises a `NameError` if `csv` has not been imported. It is still possible to read a `.csv` file without the `csv` module by treating it like any other text file. Notice that `csv.reader` is **not used** in the code snippet below.

```python
with open("student_folder/csv/monty_python_movies.csv", "r") as input_file:
    for line in input_file:
        print(line, end="")
```
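
Reading the file as plain text gives each line as one string, so splitting it into columns is left to you. Splitting on commas works for simple data but breaks when a value itself contains a comma, which `csv.reader` handles through quoting. The sketch below uses a made-up line (not taken from the course files) to show the difference:

```python
import csv

# Hypothetical line in which the first value contains a comma and is quoted.
line = '"Monty Python, Live",4'

naive = line.split(",")            # naive split cuts the title in two
parsed = next(csv.reader([line]))  # csv.reader respects the quotes

print(naive)
print(parsed)
```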