beginnings of tutorial on using csv module

commit 96bb3a978360c0ab4c454479bd38974d8c5d8334 (1 parent: dd701f9), committed by Serdar Tumgoren
tutorials/textfiles101/data/banklist_sample.csv (10 additions)
@@ -0,0 +1,10 @@
+Bank Name,City,State,CERT #,Acquiring Institution,Closing Date,Updated Date
+"San Luis Trust Bank, FSB ",San Luis Obispo,CA,34783,First California Bank,18-Feb-11,18-Feb-11
+Charter Oak Bank,Napa,CA,57855,Bank of Marin,18-Feb-11,18-Feb-11
+Citizens Bank of Effingham,Springfield,GA,34601,Heritage Bank of the South,18-Feb-11,18-Feb-11
+Habersham Bank,Clarkesville,GA,151,SCBT National Association,18-Feb-11,18-Feb-11
+Canyon National Bank,Palm Springs,CA,34692,Pacific Premier Bank,11-Feb-11,18-Feb-11
+Badger State Bank,Cassville,WI,13272,Royal Bank,11-Feb-11,18-Feb-11
+Peoples State Bank,Hamtramck,MI,14939,First Michigan Bank,11-Feb-11,18-Feb-11
+Sunshine State Community Bank,Port Orange,FL,35478,"Premier American Bank, N.A.",11-Feb-11,18-Feb-11
+Community First Bank Chicago,Chicago,IL,57948,Northbrook Bank & Trust,4-Feb-11,10-Feb-11
tutorials/textfiles101/ (92 changes)
@@ -1,22 +1,86 @@
#!/usr/bin/env python
-Below is a bare-bones example showing how to read data from a CSV.
+This file shows how to read data from a CSV file using Python's
+built-in csv module.
+The csv module is smart enough to handle fields that contain
+commas, quote characters and other tricky punctuation.
+For this tutorial, we're using a subset of the FDIC failed banks list.
-We combine a "for" loop with the "open" function to read each line
-and add it to a list.
-The "open" function accepts a number of extra options, but in
-in its most basic form can simply be called with the path to a file.
-# A list to store our data points
-data_store = []
+import csv
+ Why the CSV module?
+The manual approach to splitting CSV records into columns
+is often tricky and error-prone.
-# Loop through lines, do some basic clean up, and
-# add data to our data_store
-for line in open('data/banklist.csv'):
+In the example below, splitting on a comma
+does not work for the first record in our bank data.
+print "\n\nExample 1: Split lines manually\n"
+for line in open('data/banklist_sample.csv'):
     clean_line = line.strip()
     data_points = clean_line.split(',')
-    data_store.append(data_points)
+    print data_points
+Splitting on a comma wrongly broke "San Luis Trust Bank, FSB "
+into two fields, '"San Luis Trust Bank' and ' FSB "', stray
+quote marks and all.
+In a case like this, it's much easier to let Python's
+built-in csv module handle the field parsing for you.
+ Introducing the CSV module
-for line in data_store:
-    print line
+We already imported the csv module at the top of this script.
+Now we create a csv "reader" object, capable of stepping through
+each line of the file and smartly parsing it out for us.
+We create the reader object by passing an open file to
+the csv module's reader function.
+print "\n\nExample 2: Read file with the CSV module\n"
+bank_file = csv.reader(open('data/banklist_sample.csv', 'rb'))
+for record in bank_file:
+    print record
+Notice that in the above example, csv is smart enough to handle
+the comma inside the first bank name. So instead of two fields,
+it gives us "San Luis Trust Bank, FSB" as a single field.
+ Customizing the delimiters
+By default, the csv reader assumes the file is comma-delimited.
+You can customize the delimiter and quote characters by passing
+extra options when you create the reader object.
+#TODO: Create new sample .tsv file with pipes as quote character
+#print "\n\nExample 3: Read a file with custom delimiters\n"
+#bank_file = csv.reader(open('data/banklist_sample.tsv', 'rb'),
+#                       delimiter='\t', quotechar='|')
+#for record in bank_file:
+#    print record
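
Until that sample .tsv exists, here is a minimal runnable sketch of the same idea. The two tab-separated records below are made up for illustration, since csv.reader accepts any iterable of strings, not just an open file:

    import csv

    # Hypothetical stand-in for the missing .tsv sample: two tab-separated
    # records that use pipes, not double quotes, as the quote character.
    sample_lines = [
        'Bank Name\tCity\tState',
        '|San Luis Trust Bank, FSB |\tSan Luis Obispo\tCA',
    ]

    reader = csv.reader(sample_lines, delimiter='\t', quotechar='|')
    for record in reader:
        print record

Here the reader strips the pipe quotes and returns 'San Luis Trust Bank, FSB ' as a single field, just as it did with double quotes in Example 2.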
+ Working with Column Headers
+- demo manual approach by first reading in all lines and extracting the
+  first line. Show alternative for large files using the "next" method to
+  extract the first line and then iterate over the remaining lines
+- Even easier: the DictReader approach
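
Those last notes are still TODOs in this commit, but a rough sketch of what they describe might look like the following; this code is an assumption about where the tutorial is headed, not part of the commit. First the manual approach, then the reader's next() method, which avoids loading the whole file into memory:

    import csv

    # Manual approach: read every row into memory, then slice off the header.
    all_rows = list(csv.reader(open('data/banklist_sample.csv', 'rb')))
    headers = all_rows[0]
    records = all_rows[1:]
    print headers

    # Alternative for large files: pop the header row with next(),
    # then keep looping; the reader picks up at the second row.
    bank_file = csv.reader(open('data/banklist_sample.csv', 'rb'))
    headers = bank_file.next()
    for record in bank_file:
        print record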
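
And a sketch of the DictReader approach the second note mentions: DictReader reads the header row itself and returns each record as a dictionary keyed by column name ('Bank Name' and 'Closing Date' are actual headers from banklist_sample.csv):

    import csv

    # DictReader uses the first row as field names by default,
    # so each record comes back as a dictionary keyed by column header.
    bank_file = csv.DictReader(open('data/banklist_sample.csv', 'rb'))
    for record in bank_file:
        print record['Bank Name'], record['Closing Date']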
