# Python Tutorial: Working with CSV files - Read csv files using csv module

CSV stands for Comma-Separated Values, a popular plain-text file format. A CSV file stores data as rows, and each row contains multiple values separated by a comma. Despite the name, the delimiter does not have to be a comma: another character, such as | or a tab, can be used instead.

In this tutorial, we will look at these details and at how to work with this type of file using Python's csv module.

## csv module

To work with csv files, we are going to use the csv module that ships with Python.
We will combine it with the built-in open() function, which can be used to open any file.
Another package, pandas, can also work with csv files; we will look into that in another tutorial.

In [4]:
import csv

## Reading csv file using csv module

To read a file using the csv module, we use the __reader__ function as follows. Suppose we have the file Sample.csv in the current directory.

In [2]:
with open(r'Sample.csv', 'r') as file:
    read = csv.reader(file)
    for row in read:
        print(row)

['Name', 'Age', 'Sex']
['Raga', '33', 'Male']
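
Note that every value comes back as a string, including the age '33'. The snippet below is a minimal sketch of separating the header row from the data rows and converting the age to an integer. It uses io.StringIO with made-up contents as a stand-in for Sample.csv so that it runs without the file; the names `sample`, `header` and `rows` are illustrative only. When opening a real file, the csv documentation recommends passing newline='' to open().

```python
import csv
import io

# Hypothetical stand-in for Sample.csv so the snippet runs without the file.
sample = io.StringIO("Name,Age,Sex\nRaga,33,Male\n")

reader = csv.reader(sample)
header = next(reader)  # the first row holds the column names

# Convert the second column to int; the other values stay as strings.
rows = [[name, int(age), sex] for name, age, sex in reader]

print(header)
print(rows)
```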


In the above example, the csv file used a comma as the delimiter. Suppose the file uses a tab as the delimiter instead. How would we read such a file using the csv module? We pass the delimiter parameter:

In [3]:
with open(r'Sample_Tab.csv', 'r') as file:
    read = csv.reader(file, delimiter = '\t')
    for row in read:
        print(row)

['Name', 'Age', 'Sex']
['Raga', '33', 'Male']
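
The same parameter works for any single-character delimiter. As a small sketch, here is the | delimiter mentioned at the start of this tutorial, applied to made-up in-memory data (io.StringIO stands in for a file on disk, and `pipe_data` and `pipe_rows` are illustrative names).

```python
import csv
import io

# Hypothetical pipe-delimited data standing in for a file on disk.
pipe_data = io.StringIO("Name|Age|Sex\nRaga|33|Male\n")

pipe_rows = list(csv.reader(pipe_data, delimiter='|'))
for row in pipe_rows:
    print(row)
```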


Suppose each value in the csv file is preceded by a space after the delimiter. Let's see what we get if we read it using the above method:

In [5]:
with open(r'Sample_InitialSpace.csv', 'r') as file:
    read = csv.reader(file)
    for row in read:
        print(row)

['Name', ' Age', ' Sex']
['Raga', ' 33', ' Male']


As we can see, Python has read the values including the initial spaces. To exclude the initial spaces, we pass skipinitialspace=True:

In [6]:
with open(r'Sample_InitialSpace.csv', 'r') as file:
    read = csv.reader(file, skipinitialspace=True)
    for row in read:
        print(row)

['Name', 'Age', 'Sex']
['Raga', '33', 'Male']
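
One detail to keep in mind: skipinitialspace only drops spaces that come right after a delimiter, so trailing spaces remain part of the value. The sketch below uses made-up data with spaces on both sides of each comma and then strips the leftovers with str.strip(); the variable names are illustrative.

```python
import csv
import io

# Hypothetical data with spaces on both sides of each comma.
spaced = io.StringIO("Name , Age\nRaga , 33\n")

spaced_rows = list(csv.reader(spaced, skipinitialspace=True))
print(spaced_rows)  # trailing spaces are still attached to the first column

# Strip any whitespace that is left around each value.
cleaned_rows = [[value.strip() for value in row] for row in spaced_rows]
print(cleaned_rows)
```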


Similarly, what happens if the values are enclosed in quotes? How would Python read them? Let's see...

In [10]:
with open(r'Sample_Quotes.csv', 'r') as file:
    read = csv.reader(file, skipinitialspace=True)
    for row in read:
        print(row)

['Name', 'Age', 'Sex']
['Raga', '33', 'Male']


As we saw, the reader treated the quotes as markers enclosing each value and dropped them while reading. But if the quotes are part of the values and you want to keep them, you can pass quoting=csv.QUOTE_NONE:

In [12]:
with open(r'Sample_Quotes.csv', 'r') as file:
    read = csv.reader(file, skipinitialspace=True, quoting=csv.QUOTE_NONE)
    for row in read:
        print(row)

['"Name"', '"Age"', '"Sex"']
['"Raga"', '"33"', '"Male"']
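
Quotes matter most when a value itself contains the delimiter. The following sketch, using a made-up address column, compares the default behaviour with csv.QUOTE_NONE: by default the quoted comma stays inside one value, while with QUOTE_NONE the reader splits on every comma and keeps the quote characters.

```python
import csv
import io

# Hypothetical data: the address contains a comma, so it is quoted.
text = 'Name,Address\n"Raga","12 Main St, Springfield"\n'

default_rows = list(csv.reader(io.StringIO(text)))
raw_rows = list(csv.reader(io.StringIO(text), quoting=csv.QUOTE_NONE))

print(default_rows[1])  # the address stays in a single value
print(raw_rows[1])      # the address is split at the embedded comma
```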


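The csv module pre-defines constants that we can pass to the quoting parameter. The four most commonly used ones are listed below. They mainly describe which entries have quotes around them in the CSV file; when reading, only QUOTE_NONNUMERIC and QUOTE_NONE of these four change what the reader does.

- __csv.QUOTE_MINIMAL__ = CSV file has quotes only around those entries which contain special characters such as the delimiter character, the quote character or a line break. This is the default if you specify nothing.
- __csv.QUOTE_NONNUMERIC__ = CSV file has quotes around the non-numeric entries. When reading, every unquoted entry is converted to a float.
- __csv.QUOTE_NONE__ = CSV file has no quotes around any of the entries. When reading, quote characters get no special treatment and stay in the values.
- __csv.QUOTE_ALL__ = CSV file has all values inside quotation marks.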
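
As a small sketch of how one of these constants affects reading, the snippet below passes csv.QUOTE_NONNUMERIC to the reader. With this setting the reader converts every unquoted value to a float, so the made-up data quotes the text values and leaves the age unquoted. The data and variable names are illustrative only.

```python
import csv
import io

# Hypothetical data: text values are quoted, the numeric value is not.
text = '"Name","Age"\n"Raga",33\n'

numeric_rows = list(csv.reader(io.StringIO(text), quoting=csv.QUOTE_NONNUMERIC))
print(numeric_rows)  # the unquoted 33 is read as a float
```

An unquoted value that cannot be converted to a float raises a ValueError with this setting, so it only suits files that quote all of their text values.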

In the last example, we passed two formatting parameters to the reader function. Repeating such parameters makes the code repetitive and harder to read as we read more and more files in a program. This repetition can be avoided by using a concept called __Dialects__.

A dialect groups together many specific formatting parameters under a single dialect name. This dialect name can then be passed as a parameter to multiple reader functions (and writer functions, as we will see in a later chapter).

In [15]:
csv.register_dialect('SampleDialect', delimiter='\t', skipinitialspace=True, quoting=csv.QUOTE_NONE)

with open(r'Sample_All.csv', 'r') as file:
    read = csv.reader(file, dialect='SampleDialect')
    for row in read:
        print(row)

['"Name"', '"Age"', '"Sex"']
['"Raga"', '"33"', '"Male"']
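
To illustrate the reuse, here is a minimal sketch that registers a dialect once and applies it to two different inputs. The dialect name 'PipeDialect' and the in-memory data are made up for this example; csv.list_dialects() returns the names of all currently registered dialects.

```python
import csv
import io

# Hypothetical dialect: pipe-delimited values with a space after each pipe.
csv.register_dialect('PipeDialect', delimiter='|', skipinitialspace=True)

# Two made-up inputs that share the same formatting.
first = io.StringIO("Name| Age\nRaga| 33\n")
second = io.StringIO("City| Country\nChennai| India\n")

first_rows = list(csv.reader(first, dialect='PipeDialect'))
second_rows = list(csv.reader(second, dialect='PipeDialect'))

print(first_rows)
print(second_rows)
print('PipeDialect' in csv.list_dialects())
```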


The contents of a csv file can also be read as dictionaries. Briefly, a dictionary stores data as key-value pairs, for example {'Name':'Raga', 'Age':'33', 'Sex':'Male'}. A csv file can be read as such key-value pairs using the DictReader() __class__, which by default takes the keys from the first row of the file. Please note that DictReader() is a class and not a function.

In [16]:
with open(r'Sample.csv', 'r') as file:
    read = csv.DictReader(file)
    for row in read:
        print(row)

{'Name': 'Raga', 'Age': '33', 'Sex': 'Male'}
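
Because each row is a dictionary, values can be looked up by column name instead of by position. The sketch below again uses io.StringIO with made-up contents in place of Sample.csv and also reads the column names from the reader's fieldnames attribute.

```python
import csv
import io

# Hypothetical stand-in for Sample.csv so the snippet runs without the file.
sample = io.StringIO("Name,Age,Sex\nRaga,33,Male\n")

dict_reader = csv.DictReader(sample)
records = list(dict_reader)

columns = dict_reader.fieldnames               # keys taken from the first row
names = [record['Name'] for record in records]  # look up a value by column name

print(columns)
print(names)
```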


We will see the DictReader() class in more detail in a later section. There are many other functions and classes available in the csv module that we can use for various purposes.

For now, we have introduced only the very basics of reading a simple csv file using the basic functions and classes available in the csv module. We will look at the other functions and classes of the csv module as and when needed.