We'll work with the most common file format you will encounter in the wild world of data: CSV. It stands for Comma Separated Values, which explains almost all the formatting there is. (The file may also begin with a header row, but its values are comma separated as well.)

Python has a module called csv that supports reading and writing CSV files in various dialects. Dialects matter because there is no single CSV standard, and different applications implement CSV in slightly different ways. A file's dialect can almost always be recognized at first glance at its contents.
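To make the idea of a dialect concrete, here is a minimal sketch that reads a semicolon-delimited variant of the same kind of data. The sample text and its column names are made up for illustration and are not part of the chapter's data file; `csv.list_dialects()`, `csv.Sniffer` and the `delimiter` argument are standard parts of the csv module.

```python
import csv
import io

# Hypothetical sample: same kind of data, but semicolon-delimited
sample = "day;amount\n2013-01-24;323\n2013-01-25;233\n"

# Names of the dialects the csv module knows out of the box
print(csv.list_dialects())

# Option 1: state the difference explicitly
rows_explicit = list(csv.reader(io.StringIO(sample), delimiter=";"))

# Option 2: let Sniffer guess the dialect from the sample text,
# restricting the guess to the two delimiters we consider plausible
dialect = csv.Sniffer().sniff(sample, delimiters=";,")
rows_sniffed = list(csv.reader(io.StringIO(sample), dialect))

print(rows_explicit)
print(rows_sniffed)
```

Sniffing is a heuristic, so when you already know the delimiter from a first look at the file, passing it explicitly is the more predictable choice.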

###### Import data from a CSV file

Open the ch02-data.csv file for reading.

Read the header first.

Read the rest of the rows.

If an error occurs while parsing, catch the exception, report it, and exit.

After reading everything, print the header and the rest of the rows.

In [3]:
import csv
import sys

filename = r'C:\Users\piush\Desktop\Dataset\PythonDataVisualizationCookbook\Chapter 02\ch02-data.csv'

data = []
try:
    # newline='' lets the csv module handle line endings itself
    with open(filename, newline='') as f:
        reader = csv.reader(f)
        # next(reader) replaces the Python 2 reader.next() call;
        # the first row is the header
        header = next(reader)
        # the remaining rows are the data
        data = [row for row in reader]
except csv.Error as e:
    print("Error reading CSV file at line %s: %s" % (reader.line_num, e))
    sys.exit(-1)

if header:
    print(header)
    print('==================')

for datarow in data:
    print(datarow)

['day', 'ammount']
==================
['2013-01-24', '323']
['2013-01-25', '233']
['2013-01-26', '433']
['2013-01-27', '555']
['2013-01-28', '123']
['2013-01-29', '0']
['2013-01-30', '221']


For larger files, it is often better to use a well-known library such as NumPy, whose loadtxt() function copes better with large CSV files.

The basic usage is simple, as shown in the following code snippet:

import numpy

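# dtype=str loads every field as text; the older 'string' alias
# may not be recognized by current NumPy on Python 3
data = numpy.loadtxt('ch02-data.csv', dtype=str, delimiter=',')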
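As a self-contained variation on the snippet above, the following sketch loads from an in-memory string instead of a file, so it can be run without the data file. The sample text is a hypothetical stand-in for ch02-data.csv; `skiprows` and `usecols` are existing `numpy.loadtxt()` parameters, used here to skip the header and pick out the numeric column.

```python
import io
import numpy as np

# Hypothetical in-memory stand-in for ch02-data.csv
csv_text = "day,amount\n2013-01-24,323\n2013-01-25,233\n2013-01-26,433\n"

# Load everything as strings, header row included
table = np.loadtxt(io.StringIO(csv_text), dtype=str, delimiter=",")
print(table.shape)

# Skip the header row and load only the second column as floats
amounts = np.loadtxt(io.StringIO(csv_text), delimiter=",", skiprows=1, usecols=1)
print(amounts)
```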

Note that we need to specify a delimiter so that NumPy splits the fields correctly. The function numpy.loadtxt() is somewhat faster than the similar function numpy.genfromtxt(), but the latter copes better with missing data and lets you supply functions that define how particular columns of the loaded file are processed.
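The missing-data behaviour of `numpy.genfromtxt()` can be sketched as follows. The sample text, with an empty amount in its second data row, is invented for illustration; `skip_header`, `usecols` and `filling_values` are existing `genfromtxt()` parameters. This sketch covers only the missing-data part, not per-column processing functions.

```python
import io
import numpy as np

# Hypothetical data: the amount is missing in the second data row
csv_text = "day,amount\n2013-01-24,323\n2013-01-25,\n2013-01-26,433\n"

# By default, a missing float field becomes nan
with_nan = np.genfromtxt(io.StringIO(csv_text), delimiter=",",
                         skip_header=1, usecols=1)
print(with_nan)

# filling_values substitutes a value of our choice for missing fields
filled = np.genfromtxt(io.StringIO(csv_text), delimiter=",",
                       skip_header=1, usecols=1, filling_values=0)
print(filled)
```

Whether nan or a substitute such as 0 is the better fill depends on the analysis: nan keeps the gap visible, while 0 silently changes sums and averages.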