beginnings of tutorial on using csv module
Serdar Tumgoren committed Mar 12, 2011
1 parent dd701f9 commit 96bb3a9
Showing 2 changed files with 88 additions and 14 deletions.
10 changes: 10 additions & 0 deletions tutorials/textfiles101/data/banklist_sample.csv
@@ -0,0 +1,10 @@
Bank Name,City,State,CERT #,Acquiring Institution,Closing Date,Updated Date
"San Luis Trust Bank, FSB ",San Luis Obispo,CA,34783,First California Bank,18-Feb-11,18-Feb-11
Charter Oak Bank,Napa,CA,57855,Bank of Marin,18-Feb-11,18-Feb-11
Citizens Bank of Effingham,Springfield,GA,34601,Heritage Bank of the South,18-Feb-11,18-Feb-11
Habersham Bank,Clarkesville,GA,151,SCBT National Association,18-Feb-11,18-Feb-11
Canyon National Bank,Palm Springs,CA,34692,Pacific Premier Bank,11-Feb-11,18-Feb-11
Badger State Bank,Cassville,WI,13272,Royal Bank,11-Feb-11,18-Feb-11
Peoples State Bank,Hamtramck,MI,14939,First Michigan Bank,11-Feb-11,18-Feb-11
Sunshine State Community Bank,Port Orange,FL,35478,"Premier American Bank, N.A.",11-Feb-11,18-Feb-11
Community First Bank Chicago,Chicago,IL,57948,Northbrook Bank & Trust,4-Feb-11,10-Feb-11
92 changes: 78 additions & 14 deletions tutorials/textfiles101/read_data_from_CSV_2.py
@@ -1,22 +1,86 @@
#!/usr/bin/env python
"""
-Below is a bare-bones example showing how to read data from a CSV.
This file shows how to read data from a file using Python's
built-in csv module.
http://docs.python.org/library/csv.html
The csv module correctly handles quoted fields that contain commas
or other characters that would otherwise be treated as delimiters.
For this tutorial, we're using a subset of the FDIC failed banks list:
http://www.fdic.gov/bank/individual/failed/banklist.html
-We combine a "for" loop with the "open" function to read each line
-and add it to a list.
-The "open" function accepts a number of extra options, but
-in its most basic form can simply be called with the path to a file.
"""
-# A list to store our data points
-data_store = []
import csv


"""
Why the CSV module?
The manual approach to splitting CSV records into columns
is often tricky and error-prone.
-# Loop through lines, do some basic clean up, and
-# add data to our data_store
-for line in open('data/banklist.csv'):
In the example below, we see that splitting on a comma
does not work for the first record in our bank data.
"""

print "\n\nExample 1: Split lines manually\n"

for line in open('data/banklist_sample.csv'):
    clean_line = line.strip()
    data_points = clean_line.split(',')
-    data_store.append(data_points)
    print data_points

"""
Splitting on a comma caused "San Luis Trust Bank, FSB "
to become two fields, '"San Luis Trust Bank' and ' FSB "',
each still carrying a stray quote mark.
In a case like this, it's much easier to let Python's
built-in csv module handle the field parsing for you.
Introducing the CSV module
-for line in data_store:
-    print line
We already imported the csv module at the top of this script.
Now we create a csv "reader" object, capable of stepping through
each line of the file and smartly parsing it out for us.
We create the reader object by passing an open file to
the csv module's reader function.
"""

print "\n\nExample 2: Read file with the CSV module\n"
bank_file = csv.reader(open('data/banklist_sample.csv', 'rb'))

for record in bank_file:
    print record
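The difference between Example 1 and Example 2 can also be checked without the sample file, by parsing the first record from an in-memory string. This is only a minimal sketch and it assumes Python 3, whereas the script in this commit targets Python 2. The `sample` string copies the header and first data row of banklist_sample.csv, and the variable names are illustrative.

```python
import csv
import io

# Header and first record copied from data/banklist_sample.csv
sample = (
    'Bank Name,City,State,CERT #,Acquiring Institution,Closing Date,Updated Date\n'
    '"San Luis Trust Bank, FSB ",San Luis Obispo,CA,34783,'
    'First California Bank,18-Feb-11,18-Feb-11\n'
)

header_line, record_line = sample.splitlines()

# Naive approach: the comma inside the quoted bank name splits the field in two
manual_fields = record_line.split(',')

# csv.reader accepts any iterable of lines, including an in-memory file object
csv_rows = list(csv.reader(io.StringIO(sample)))
csv_fields = csv_rows[1]

print(len(manual_fields), manual_fields[:2])
print(len(csv_fields), csv_fields[0])
```

The manual split yields one field more than the header has columns, while csv.reader returns the bank name as a single field with the surrounding quotes removed.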

"""
Notice that in the above example, csv is smart enough to handle
the comma inside the first bank name. So instead of two fields,
it gives us "San Luis Trust Bank, FSB" as a single field.
Customizing the delimiters
By default, the csv reader assumes the file is comma-delimited.
You can customize the delimiter and the quote character by passing
extra options, such as delimiter and quotechar, when you create the reader object.
"""
#TODO: Create new sample .tsv file with pipes as quote character
#print "\n\nExample 2: Read file with the CSV module\n"
#bank_file = csv.reader(open('data/banklist_sample.csv', 'rb'))
#
#for record in bank_file:
#    print record
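Until the sample .tsv file from the TODO exists, the idea can be sketched against an in-memory string. The data below is hypothetical (tab-delimited, with pipes as the quote character) and is not part of this commit. The sketch assumes Python 3, and `delimiter` and `quotechar` are the standard csv.reader options.

```python
import csv
import io

# Hypothetical tab-delimited data with pipes as the quote character;
# it stands in for the sample .tsv file the TODO describes.
tsv_sample = (
    'Bank Name\tCity\tState\n'
    '|San Luis Trust Bank\tFSB |\tSan Luis Obispo\tCA\n'
    'Charter Oak Bank\tNapa\tCA\n'
)

# Tell the reader which characters separate and quote the fields
reader = csv.reader(io.StringIO(tsv_sample), delimiter='\t', quotechar='|')
rows = list(reader)

for row in rows:
    print(row)
```

Because the first bank name is wrapped in pipes, the tab inside it is kept as part of the field instead of starting a new column.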

"""
Working with Column Headers
- demo manual approach by first reading in all lines and extracting the
first line. Show alternative for large files using "next" method to
extract first line and then iterating over the remaining lines
- Even easier: the DictReader approach
"""
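The three header-handling approaches outlined in the notes above might look roughly like the following. This is a sketch under stated assumptions, not the tutorial's eventual code: it assumes Python 3, reads from an in-memory copy of the first rows of banklist_sample.csv rather than the file itself, and uses illustrative variable names.

```python
import csv
import io

# Header and first two records copied from data/banklist_sample.csv
sample = (
    'Bank Name,City,State,CERT #,Acquiring Institution,Closing Date,Updated Date\n'
    '"San Luis Trust Bank, FSB ",San Luis Obispo,CA,34783,'
    'First California Bank,18-Feb-11,18-Feb-11\n'
    'Charter Oak Bank,Napa,CA,57855,Bank of Marin,18-Feb-11,18-Feb-11\n'
)

# Approach 1: read every row into a list, then split off the first one.
all_rows = list(csv.reader(io.StringIO(sample)))
header, records = all_rows[0], all_rows[1:]

# Approach 2: pull the header off with next(), then iterate over the rest
# lazily, so a large file never has to be held in memory all at once.
reader = csv.reader(io.StringIO(sample))
header_via_next = next(reader)
cities = [row[1] for row in reader]

# Approach 3: DictReader uses the first row as the keys for every record.
dict_rows = list(csv.DictReader(io.StringIO(sample)))
for row in dict_rows:
    print(row['Bank Name'], '|', row['State'])
```

With DictReader, fields are looked up by column name rather than by position, which tends to keep the code readable if the column order ever changes.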
