In this lesson, we will work with CSV files. We will start with a text file containing a list of colors and then turn it into a CSV file.

CSV stands for "comma-separated values." Each line of the file holds a list of values, with a comma separating each one from the next.

We don't have a csv file in our directory right now. We're going to make several here.

Example:
`1, red`
`2, yellow`
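As a quick preview of what those lines mean to Python, here is a minimal sketch that parses the example text from a string instead of a file. The name `sample_text` is made up for this illustration. Because the example has a space after each comma, the sketch passes `skipinitialspace=True` so the space is not kept as part of the value.

```python
import csv
import io

# The two example lines, written as one string (hypothetical name: sample_text).
sample_text = "1, red\n2, yellow\n"

# io.StringIO lets us treat a string like an open file.
# skipinitialspace=True drops the space that follows each comma.
sample_rows = list(csv.reader(io.StringIO(sample_text), skipinitialspace=True))

print(sample_rows)
```

Each line becomes a list of strings, one string per value.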

Let's start by importing the `csv` module.

In [1]:
import csv

Now let's open the `colors_for_csv.txt` file.

In [2]:
with open('colors_for_csv.txt') as in_file:
    colors = [line.strip() for line in in_file] #list comprehension

In [3]:
#let's verify that we have the colors down correctly
print(colors)

['Red', 'Orange', 'Blue', 'Yellow', 'Green', 'Purple']


Now, we are going to make a very simple csv file from these colors. First, let's place them in alphabetical order with the `.sort()` method. 

In [4]:
colors.sort()
print(colors)

['Blue', 'Green', 'Orange', 'Purple', 'Red', 'Yellow']
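One thing to watch for: `.sort()` changes the list in place and returns `None`, while the built-in `sorted()` returns a new list and leaves the original alone. A small sketch of the difference, using made-up names (`colors_unsorted`, `colors_sorted`, `result`) so it does not touch our `colors` list:

```python
colors_unsorted = ['Red', 'Orange', 'Blue', 'Yellow', 'Green', 'Purple']

# sorted() gives back a new, sorted list; colors_unsorted is not changed yet.
colors_sorted = sorted(colors_unsorted)

# .sort() sorts the list itself and returns None.
result = colors_unsorted.sort()

print(colors_sorted)
print(result)
```

So writing `colors = colors.sort()` would replace the list with `None`, which is probably not what we want.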


CSV files have a comma separating each value from the one after it. To start, we are going to write the whole list of colors to a csv file as a single row. Later on, we will number each color (using `enumerate()`) and give each one its own row.

In [11]:
#first, we create the output file, opening it just like any other file
out_file1 = open('simple.csv', 'w', newline='')
#then we create a Writer object.
colorWriter = csv.writer(out_file1)


In [12]:
#Let's write the colors list to the file
colorWriter.writerow(colors)
out_file1.close()
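The open, write, and close steps above can also be written with a `with` block, which closes the file for us even if something goes wrong in between. This is only a sketch: it writes to a throwaway file in a temporary folder, and the names `demo_path`, `demo_file`, and `demo_text` are made up for this example.

```python
import csv
import os
import tempfile

colors = ['Blue', 'Green', 'Orange', 'Purple', 'Red', 'Yellow']

# Assumption: a throwaway file in a temporary folder, so simple.csv is left alone.
demo_path = os.path.join(tempfile.mkdtemp(), 'simple_demo.csv')

# The file is closed automatically when the with block ends.
with open(demo_path, 'w', newline='') as demo_file:
    csv.writer(demo_file).writerow(colors)

# Read the raw text back to see exactly what was written.
with open(demo_path, newline='') as demo_file:
    demo_text = demo_file.read()

print(repr(demo_text))
```

The `\r\n` at the end of the text is the line ending that `csv.writer` adds by default, which is why we open the file with `newline=''`.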

That's it. We have just written to our first csv file. It should be in the same directory as this file. Open it and take a look. We should see something like this:

Blue,Green,Orange,Purple,Red,Yellow 

All of these colors are on the same line. Let's read the colors back now. We wrote with a `csv.writer` object, so we must read with a `csv.reader` object, right?

In [20]:
in_file = open('simple.csv', newline='') #don't add a 'w' or it will overwrite the file
colorReader = csv.reader(in_file)

In [21]:
#Let's print out the words in the file:
for row in colorReader:
    print('Row #', colorReader.line_num, row)
in_file.close() #close the file once we are done reading

Row # 1 ['Blue', 'Green', 'Orange', 'Purple', 'Red', 'Yellow']


We got it. Every line we've written to the file comes back to us as a single row.
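If we want to keep the rows around instead of just printing them, we can collect them into a list. Here is a minimal sketch, again using a throwaway file in a temporary folder; the names `demo_path`, `demo_file`, and `rows_read` are made up for this example.

```python
import csv
import os
import tempfile

# Assumption: a throwaway file in a temporary folder, so simple.csv is left alone.
demo_path = os.path.join(tempfile.mkdtemp(), 'read_demo.csv')

# Write one row, the same way we did above.
with open(demo_path, 'w', newline='') as demo_file:
    csv.writer(demo_file).writerow(['Blue', 'Green', 'Orange'])

# Read it back. list() collects every row the reader produces.
with open(demo_path, newline='') as demo_file:
    rows_read = list(csv.reader(demo_file))

print(rows_read)
```

We get a list of rows, and each row is itself a list of strings. With one line in the file, the outer list has one item.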

In [36]:
#So let's make more than one row
out_file = open('multiple_rows.csv', 'w', newline='')
colorWriter = csv.writer(out_file)

In [37]:
#start with writing the rows by hand
colorWriter.writerow([1,'Red'])

7

In [38]:
#add another row
colorWriter.writerow([2,'Blue'])

#close the file
out_file.close()

In [40]:
#Let's open the file and take a look
#We should see something like this:
#1,Red
#2,Blue

#A reader lets you read the contents of the csv file, as we did above. Let's open a csv reader and 
#read this file with a for loop.

In [42]:
in_file = open('multiple_rows.csv', 'r', newline='')
colorReader = csv.reader(in_file)
for row in colorReader:
    print(row)
in_file.close() #close the file once we are done reading

['1', 'Red']
['2', 'Blue']
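Notice the quotes around `'1'` and `'2'` in the output. We wrote numbers, but a `csv.reader` hands every value back as a string. If we need numbers again, we have to convert them ourselves. A small sketch, reading the same two rows from a string (the names `demo_text` and `numbered_colors` are made up for this example):

```python
import csv
import io

# The same two rows as in multiple_rows.csv, held in a string for this sketch.
demo_text = "1,Red\n2,Blue\n"

# Convert the first value of each row to an int; leave the color as a string.
numbered_colors = [
    (int(number), color)
    for number, color in csv.reader(io.StringIO(demo_text))
]

print(numbered_colors)
```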


We've written to a csv file! Now let's overwrite this file and write each line using a for loop. The `enumerate()` function pairs each item in a list with its position (its index).

In [60]:
colors = ['Blue', 'Green', 'Orange', 'Purple', 'Red', 'Yellow']

out_file = open('multiple_rows.csv', 'w', newline='')
colorWriter = csv.writer(out_file)
for index, row in enumerate(colors):
    colorWriter.writerow([index, row]) #notice how we pass the arguments as a list
out_file.close()

In [63]:
#notice the index starts with zero. If we want to start with 1, we can do this:
colors = ['Blue', 'Green', 'Orange', 'Purple', 'Red', 'Yellow']

out_file = open('multiple_rows.csv', 'w', newline='')
colorWriter = csv.writer(out_file)
for index, row in enumerate(colors):
    colorWriter.writerow([index+1, row]) #add 1 so the numbering starts at 1, not 0
out_file.close()
#Much better! The index starts at 1
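Another way to get the same result is to tell `enumerate()` where to start counting with its `start` argument. The sketch below writes into an in-memory buffer instead of a file so we can look at the result right away; the names `buffer`, `numbered_writer`, and `lines_written` are made up for this example.

```python
import csv
import io

colors = ['Blue', 'Green', 'Orange', 'Purple', 'Red', 'Yellow']

# An in-memory stand-in for a file opened with newline=''.
buffer = io.StringIO(newline='')
numbered_writer = csv.writer(buffer)

# start=1 makes enumerate() count from 1, so no index+1 is needed.
for number, color in enumerate(colors, start=1):
    numbered_writer.writerow([number, color])

lines_written = buffer.getvalue().splitlines()
print(lines_written)
```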

In [67]:
#We can also separate the elements with tabs
#Let's use the suffix .tsv because it's a tab-separated file
out_file_with_tabs = open('rows_with_tabs.tsv', 'w', newline='')
tabWriter = csv.writer(out_file_with_tabs, delimiter='\t')
for index, row in enumerate(colors):
    tabWriter.writerow([index+1, row]) #the 'delimiter' argument lets us do this
out_file_with_tabs.close()
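To read a tab-separated file back, the reader needs the same `delimiter` argument. Otherwise it would look for commas and return each line as one long value. A minimal sketch, reading two tab-separated lines from a string (the names `tsv_text` and `tsv_rows` are made up for this example):

```python
import csv
import io

# Two tab-separated lines, held in a string for this sketch.
tsv_text = "1\tBlue\n2\tGreen\n"

# delimiter='\t' tells the reader to split on tabs instead of commas.
tsv_rows = list(csv.reader(io.StringIO(tsv_text), delimiter='\t'))

print(tsv_rows)
```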