Commit
beginnings of tutorial on using csv module
Serdar Tumgoren committed Mar 12, 2011
1 parent dd701f9, commit 96bb3a9
Showing 2 changed files with 88 additions and 14 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,10 @@ | ||
Bank Name,City,State,CERT #,Acquiring Institution,Closing Date,Updated Date | ||
"San Luis Trust Bank, FSB ",San Luis Obispo,CA,34783,First California Bank,18-Feb-11,18-Feb-11 | ||
Charter Oak Bank,Napa,CA,57855,Bank of Marin,18-Feb-11,18-Feb-11 | ||
Citizens Bank of Effingham,Springfield,GA,34601,Heritage Bank of the South,18-Feb-11,18-Feb-11 | ||
Habersham Bank,Clarkesville,GA,151,SCBT National Association,18-Feb-11,18-Feb-11 | ||
Canyon National Bank,Palm Springs,CA,34692,Pacific Premier Bank,11-Feb-11,18-Feb-11 | ||
Badger State Bank,Cassville,WI,13272,Royal Bank,11-Feb-11,18-Feb-11 | ||
Peoples State Bank,Hamtramck,MI,14939,First Michigan Bank,11-Feb-11,18-Feb-11 | ||
Sunshine State Community Bank,Port Orange,FL,35478,"Premier American Bank, N.A.",11-Feb-11,18-Feb-11 | ||
Community First Bank Chicago,Chicago,IL,57948,Northbrook Bank & Trust,4-Feb-11,10-Feb-11 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,22 +1,86 @@ | ||
#!/usr/bin/env python | ||
""" | ||
Below is a bare-bones example showing how to read data from a CSV. | ||
This file shows how to read data from a file using Python's | ||
built-in csv module. | ||
http://docs.python.org/library/csv.html | ||
The csv module is smart enough to handle fields that contain quotes, | ||
commas and other special characters. | ||
For this tutorial, we're using a subset of the FDIC failed banks list: | ||
http://www.fdic.gov/bank/individual/failed/banklist.html | ||
We combine a "for" loop with the "open" function to read each line | ||
and add it to a list. | ||
The "open" function accepts a number of extra options, but | ||
in its most basic form can simply be called with the path to a file. | ||
""" | ||
# A list to store our data points | ||
data_store = [] | ||
import csv | ||
|
||
|
||
""" | ||
Why the CSV module? | ||
The manual approach to splitting CSV records into columns | ||
is often tricky and error-prone. | ||
# Loop through lines, do some basic clean up, and | ||
# add data to our data_store | ||
for line in open('data/banklist.csv'): | ||
In the example below, we see that splitting on a comma | ||
does not work for the first record in our bank data. | ||
""" | ||
|
||
print "\n\nExample 1: Split lines manually\n" | ||
|
||
for line in open('data/banklist_sample.csv'): | ||
clean_line = line.strip() | ||
data_points = clean_line.split(',') | ||
data_store.append(data_points) | ||
print data_points | ||
|
||
""" | ||
Splitting on a comma caused "San Luis Trust Bank, FSB " | ||
to become two fields: "San Luis Trust Bank" and "FSB". | ||
In a case like this, it's much easier to let Python's | ||
built-in csv module handle the field parsing for you. | ||
Introducing the CSV module | ||
for line in data_store: | ||
print line | ||
We already imported the csv module at the top of this script. | ||
Now we create a csv "reader" object, capable of stepping through | ||
each line of the file and smartly parsing it out for us. | ||
We create the reader object by passing an open file to | ||
the csv module's reader function. | ||
""" | ||
|
||
print "\n\nExample 2: Read file with the CSV module\n" | ||
bank_file = csv.reader(open('data/banklist_sample.csv', 'rb')) | ||
|
||
for record in bank_file: | ||
print record | ||
|
||
""" | ||
Notice that in the above example, csv is smart enough to handle | ||
the comma inside the first bank name. So instead of two fields, | ||
it gives us "San Luis Trust Bank, FSB" as a single field. | ||
Customizing the delimiters | ||
By default, the csv reader assumes the file is comma-delimited. | ||
You can customize the delimiters and field quote characters by using | ||
extra options when you create the reader object. | ||
""" | ||
#TODO: Create new sample .tsv file with pipes as quote character | ||
#print "\n\nExample 2: Read file with the CSV module\n" | ||
#bank_file = csv.reader(open('data/banklist_sample.csv', 'rb')) | ||
# | ||
#for record in bank_file: | ||
# print record | ||
|
||
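The delimiter example above is still a TODO, so here is a minimal sketch of what it might look like. It assumes Python 3 (the script in this commit targets Python 2) and uses an in-memory, tab-delimited sample invented for illustration, since the commit does not yet include a .tsv file. The helper name `read_custom` is hypothetical; `delimiter` and `quotechar` are the standard options accepted by `csv.reader`.

```python
import csv
import io

# Hypothetical tab-delimited sample that uses pipes as the quote character.
# Invented for illustration; the commit does not include such a file yet.
sample = (
    "Bank Name\tCity\tState\n"
    "|San Luis Trust Bank, FSB |\tSan Luis Obispo\tCA\n"
    "Charter Oak Bank\tNapa\tCA\n"
)


def read_custom(text, delimiter="\t", quotechar="|"):
    """Parse delimited text using a custom delimiter and quote character."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter, quotechar=quotechar)
    return list(reader)


rows = read_custom(sample)
for record in rows:
    print(record)
```

With a real file, the same two keyword arguments would be passed to `csv.reader` along with the open file object.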
""" | ||
Working with Column Headers | ||
- demo manual approach by first reading in all lines and extracting the | ||
first line. Show alternative for large files using "next" method to | ||
extract first line and then iterating over the remaining lines | ||
- Even easier: the DictReader approach | ||
""" |
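The header section above is only an outline, so the following is a rough sketch of the two approaches it mentions: pulling off the first row with `next` and then iterating over the rest, and using `csv.DictReader`. It assumes Python 3 (the script in this commit targets Python 2) and reads from an in-memory string holding a few columns of the sample data rather than from `data/banklist_sample.csv`. The function names `split_header` and `read_as_dicts` are hypothetical.

```python
import csv
import io

# A few columns and rows taken from the sample bank list in this commit.
sample = (
    "Bank Name,City,State,CERT #\n"
    '"San Luis Trust Bank, FSB ",San Luis Obispo,CA,34783\n'
    "Charter Oak Bank,Napa,CA,57855\n"
)


def split_header(text):
    """Return (header, rows), using next() to consume the first line."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)  # the first row holds the column names
    # For a large file, loop over `reader` here instead of building a list.
    rows = list(reader)
    return header, rows


def read_as_dicts(text):
    """Return one dict per record, keyed by the column headers."""
    return list(csv.DictReader(io.StringIO(text)))


header, rows = split_header(sample)
print(header)

records = read_as_dicts(sample)
for record in records:
    print(record["Bank Name"], record["City"])
```

`DictReader` reads the header row itself, so each record can be addressed by column name instead of by position.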