In [1]:
import csv

In [4]:
! type data_file\record.csv 

Region,Country,Capital
Asia,Kyrgyzstan,Bishkek
Central America,Honduras,Tegucigalpa
Europe,Bulgaria,Sofia
Sub-Saharan Africa,Cameroon,Yaounde


In [9]:
mypath1 = 'data_file/record.csv'
file = open(mypath1, 'r')  # Create a file object in read mode
with file:
    read = csv.reader(file)  # Initialize a csv.reader object from the file object
    
    for row in read:  # Each row is returned as a list of strings
        print(row)

['Region', 'Country', 'Capital']
['Asia', 'Kyrgyzstan', 'Bishkek']
['Central America', 'Honduras', 'Tegucigalpa']
['Europe', 'Bulgaria', 'Sofia']
['Sub-Saharan Africa', 'Cameroon', 'Yaounde']


In [10]:
! type data_file\record_pipe.csv 

Region|Country|Capital
Asia|Kyrgyzstan|Bishkek
Central America|Honduras|Tegucigalpa
Europe|Bulgaria|Sofia
Sub-Saharan Africa|Cameroon|Yaounde


In [11]:
mypath2 = 'data_file/record_pipe.csv'
file = open(mypath2, 'r')  # Create a file object in read mode
with file:
    read = csv.reader(file)  # No delimiter given, so the reader assumes commas
    
    for row in read:  # Each row comes back as a list holding ONE string
        print(row)

['Region|Country|Capital']
['Asia|Kyrgyzstan|Bishkek']
['Central America|Honduras|Tegucigalpa']
['Europe|Bulgaria|Sofia']
['Sub-Saharan Africa|Cameroon|Yaounde']


## The delimiter argument

### The pipe delimiter

**The values in each row are not separated into elements within the list; instead, the entire contents of the row, including the pipes, are treated as a single string.**

**How do we tell the reader which delimiter to use?**

In [12]:
file = open(mypath2, 'r')  # Create a file object in read mode
with file:
    read = csv.reader(file, delimiter="|")  # Pass the delimiter when initializing the csv.reader object
    
    for row in read:  # Each row is returned as a list of strings
        print(row)

['Region', 'Country', 'Capital']
['Asia', 'Kyrgyzstan', 'Bishkek']
['Central America', 'Honduras', 'Tegucigalpa']
['Europe', 'Bulgaria', 'Sofia']
['Sub-Saharan Africa', 'Cameroon', 'Yaounde']


### The tab delimiter

In [13]:
! type data_file\record_tab.csv 

Region	Country	Capital
Asia	Kyrgyzstan	Bishkek
Central America		Honduras	Tegucigalpa
Europe	Bulgaria	Sofia
Sub-Saharan Africa	Cameroon	Yaounde


In [15]:
mypath3 = 'data_file/record_tab.csv'
file = open(mypath3, 'r')  # Create a file object in read mode
with file:
    read = csv.reader(file, delimiter="\t")  # Tell the reader that fields are separated by tabs
    
    for row in read:  # Each row is returned as a list of strings
        print(row)

['Region', 'Country', 'Capital']
['Asia', 'Kyrgyzstan', 'Bishkek']
['Central America', '', 'Honduras', 'Tegucigalpa']
['Europe', 'Bulgaria', 'Sofia']
['Sub-Saharan Africa', 'Cameroon', 'Yaounde']
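
The `Central America` row has four elements because the file contains two consecutive tabs on that line; `csv.reader` reports the gap as an empty string rather than collapsing the delimiters. A minimal sketch that reproduces this with an in-memory string (the sample text is made up for illustration and is not read from `record_tab.csv`):

```python
import csv
import io

# Made-up sample: the second line has two tabs in a row.
sample = "Region\tCountry\nCentral America\t\tHonduras\n"

rows = list(csv.reader(io.StringIO(sample), delimiter="\t"))

for row in rows:
    print(len(row), row)  # the second row has one extra, empty field
```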


## Situation: imagine dozens of columns in your CSV data

It may be difficult to remember which index represents each column. In that case, **use a DictReader**.

When using a DictReader, the first row is assumed to contain the column headers, and the following rows are treated as data rows.

In [19]:
file = open(mypath1, 'r')  # Create a file object in read mode
with file:
    read = csv.DictReader(file)
    
    for row in read:  # In this for loop, we cast each row to a plain DICTIONARY
        print(dict(row))

{'Region': 'Asia', 'Country': 'Kyrgyzstan', 'Capital': 'Bishkek'}
{'Region': 'Central America', 'Country': 'Honduras', 'Capital': 'Tegucigalpa'}
{'Region': 'Europe', 'Country': 'Bulgaria', 'Capital': 'Sofia'}
{'Region': 'Sub-Saharan Africa', 'Country': 'Cameroon', 'Capital': 'Yaounde'}
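
The practical benefit is that a column can be looked up by its header instead of its position. A small sketch, using an in-memory copy of a few rows from `record.csv` so that it runs without the data file (the `capitals` name is introduced here for illustration only):

```python
import csv
import io

# In-memory copy of part of record.csv, so the sketch runs anywhere.
sample = "Region,Country,Capital\nAsia,Kyrgyzstan,Bishkek\nEurope,Bulgaria,Sofia\n"

reader = csv.DictReader(io.StringIO(sample))

# Look up values by column name rather than by index.
capitals = {row["Country"]: row["Capital"] for row in reader}
print(capitals)
```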


- **What if we print the raw contents of each row?**

Each row is an OrderedDict. Order does not usually matter in a dictionary, but to retain the sequence of columns in the CSV file, the keys need to stay in order, which is why csv.DictReader uses an ordered dictionary here. (This applies to Python 3.6 and 3.7; from Python 3.8 onward, DictReader returns regular dicts, which also preserve insertion order.)

In [20]:
file = open(mypath1, 'r')  # Create a file object in read mode
with file:
    read = csv.DictReader(file)
    
    for row in read:  # Print each row exactly as returned, without casting it to dict
        print(row)

OrderedDict([('Region', 'Asia'), ('Country', 'Kyrgyzstan'), ('Capital', 'Bishkek')])
OrderedDict([('Region', 'Central America'), ('Country', 'Honduras'), ('Capital', 'Tegucigalpa')])
OrderedDict([('Region', 'Europe'), ('Country', 'Bulgaria'), ('Capital', 'Sofia')])
OrderedDict([('Region', 'Sub-Saharan Africa'), ('Country', 'Cameroon'), ('Capital', 'Yaounde')])
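
To check what a given interpreter returns, the row type and key order can be inspected directly. This is a sketch on an in-memory sample; the printed class name depends on the Python version, so no output is shown for it.

```python
import csv
import io

sample = "Region,Country,Capital\nAsia,Kyrgyzstan,Bishkek\n"

reader = csv.DictReader(io.StringIO(sample))
first_row = next(reader)

# The class name depends on the Python version in use.
print(type(first_row).__name__)

# Either way, the keys come back in the same order as the header row.
print(list(first_row.keys()) == reader.fieldnames)
```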


## Convert Python objects to CSV files

### Writerow function

In [31]:
names = [['FirstName', 'LastName'],
        ['Sofia', 'Reyes'],
        ['Jerome', 'Jackson'],
        ['Jia', 'Zhong']]

In [32]:
mypath2 = 'data_file/names.csv'

file = open(mypath2, 'w') 

with file:
    file_writer = csv.writer(file)
    
    for row in names:
        file_writer.writerow(row)

In [33]:
! type data_file\names.csv 

FirstName,LastName

Sofia,Reyes

Jerome,Jackson

Jia,Zhong
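
The blank lines between rows in this output are a common symptom on Windows when the file is opened without `newline=''`: the writer ends each row with `\r\n`, and text mode then translates the `\n` a second time. Below is a sketch of the same write with `newline=''`, as the `csv` module documentation recommends. It writes to a temporary directory, an assumption made here so the example does not overwrite `data_file/names.csv`.

```python
import csv
import os
import tempfile

names = [['FirstName', 'LastName'],
         ['Sofia', 'Reyes'],
         ['Jerome', 'Jackson'],
         ['Jia', 'Zhong']]

# Temporary location chosen for this sketch only.
out_path = os.path.join(tempfile.mkdtemp(), 'names.csv')

# newline='' stops text mode from translating the writer's own line endings.
with open(out_path, 'w', newline='') as file:
    file_writer = csv.writer(file)
    for row in names:
        file_writer.writerow(row)

# Read the raw bytes back to see the exact line endings that were written.
with open(out_path, 'rb') as file:
    raw = file.read()
print(raw)
```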



### Writerows function

- Rather than iterating over the individual rows and writing each one with `writerow`, we can call `writerows` (with an "s") to write **the entire contents of the two-dimensional list to the CSV file** in a single call.


In [35]:
nums =  [[10,20,30],
        [40,50,60],
        [70,80,90]]
mypath3 = 'data_file/number.csv'
file = open(mypath3, 'w')

with file:
    write = csv.writer(file)
    write.writerows(nums)  # Write every row of the two-dimensional list in one call


In [37]:
! type data_file\number.csv 

10,20,30

40,50,60

70,80,90



### We have seen that the DictReader object can parse CSV data into Python dictionaries. Is there something available to perform the reverse operation?

Yes, the **DictWriter object**
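
A minimal sketch of how `csv.DictWriter` might be used, assuming rows shaped like the DictReader output shown earlier. It writes to an in-memory buffer instead of a file under `data_file/`, and the `records` and `buffer` names are introduced here for illustration only.

```python
import csv
import io

# Rows shaped like the DictReader output shown earlier.
records = [
    {'Region': 'Asia', 'Country': 'Kyrgyzstan', 'Capital': 'Bishkek'},
    {'Region': 'Europe', 'Country': 'Bulgaria', 'Capital': 'Sofia'},
]

buffer = io.StringIO()  # in-memory stand-in for an open file
writer = csv.DictWriter(buffer, fieldnames=['Region', 'Country', 'Capital'])

writer.writeheader()       # write the column names as the first row
writer.writerows(records)  # write one line per dictionary

csv_text = buffer.getvalue()
print(csv_text)
```

The `fieldnames` argument is required: it decides both which keys are written and the order of the columns.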

In [None]:
! type data_file\record.csv 


## CSV Dialects

In [39]:
csv.register_dialect('tab', delimiter='\t')  # Register a new dialect called 'tab'

In [40]:
with open('data_file/record_tab.csv', 'r') as file:
    reader = csv.reader(file, dialect='tab')
    
    for row in reader:
        print(row)

['Region', 'Country', 'Capital']
['Asia', 'Kyrgyzstan', 'Bishkek']
['Central America', '', 'Honduras', 'Tegucigalpa']
['Europe', 'Bulgaria', 'Sofia']
['Sub-Saharan Africa', 'Cameroon', 'Yaounde']
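
Once registered, a dialect can be looked up by name from anywhere in the session. The sketch below repeats the `'tab'` registration so that it runs on its own, then inspects it with `csv.list_dialects` and `csv.get_dialect`.

```python
import csv

csv.register_dialect('tab', delimiter='\t')  # same registration as above

# Registered names sit alongside the built-in dialects such as 'excel'.
print('tab' in csv.list_dialects())

# get_dialect returns the dialect object so its settings can be inspected.
tab_dialect = csv.get_dialect('tab')
print(repr(tab_dialect.delimiter))
```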


In [48]:
csv.register_dialect('plus', 
                     delimiter = '+', 
                     lineterminator = '\n\n\r')

In [49]:
names = [['FirstName', 'LastName'],
        ['Sofia', 'Reyes'],
        ['Jerome', 'Jackson'],
        ['Jia', 'Zhong']]

In [50]:
file = open('data_file/names_dialect.csv', 'w')

with file:
    file_writer = csv.writer(file, dialect = 'plus')
    for row in names:
        file_writer.writerow(row)

In [51]:
! type data_file\names_dialect.csv 

FirstName+LastName


Sofia+Reyes


Jerome+Jackson


Jia+Zhong


