#### Working with a CSV file

* A CSV file is a simple text file with comma-separated values


##### Aside: creating a file directly from within Jupyter

In [41]:
%%file books.csv 
title,author,price,rating
The Accursed God,Vivek Dutta Mishra,299,4.5
Brethren, John Grishma, 499,4.1
The Sons of Fortune, Jeffrey Archer, 410,4.3
The Count of Monte Cristo, Alexandre Dumas, 400,4.9
Manas, Vivek Dutta Mishra,199,4.5

Overwriting books.csv


#### We can also read a CSV file with a plain open() call

In [42]:
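with open('books.csv') as f:
    # iterating over the file object yields one line at a time
    for line in f:
        # each line keeps its trailing newline, so print() leaves a blank line after it
        print(line)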

title,author,price,rating

The Accursed God,Vivek Dutta Mishra,299,4.5

Brethren, John Grishma, 499,4.1

The Sons of Fortune, Jeffrey Archer, 410,4.3

The Count of Monte Cristo, Alexandre Dumas, 400,4.9

Manas, Vivek Dutta Mishra,199,4.5
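
The blank line after each row in the output above appears because every line read from the file already ends with a newline, and `print()` adds a second one. A minimal sketch of one way to avoid that, assuming an in-memory `sample` string as a stand-in for the file (the names `sample` and `cleaned` are illustrative only):

```python
import io

# In-memory stand-in for books.csv (an assumption, so the example runs anywhere)
sample = "title,author,price,rating\nManas, Vivek Dutta Mishra,199,4.5\n"

cleaned = []
for line in io.StringIO(sample):
    # rstrip("\n") drops the newline that the line already carries
    cleaned.append(line.rstrip("\n"))
    print(cleaned[-1])
```

Calling `print(line, end="")` would have the same visible effect without modifying the line.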



### How to extract information from each line?

* Remember that the first line is the header line

In [43]:
def read_csv(file):
    records=[]
    with open(file) as f:
        lines=f.readlines()
        header=lines.pop(0) # the first line holds the column names
        
        keys=header.strip().split(',')
        for line in lines:
            record=dict()
            info=line.strip().split(',') # naive split: every comma is a separator
            for i in range(len(info)):
                record[keys[i]]=info[i] # pair each value with its column name
            records.append(record)
    return records


In [44]:
books=read_csv("books.csv")

In [45]:
for book in books:
    for k,v in book.items():
        print(f"{k}\t{v}")
    print()

title	The Accursed God
author	Vivek Dutta Mishra
price	299
rating	4.5

title	Brethren
author	 John Grishma
price	 499
rating	4.1

title	The Sons of Fortune
author	 Jeffrey Archer
price	 410
rating	4.3

title	The Count of Monte Cristo
author	 Alexandre Dumas
price	 400
rating	4.9

title	Manas
author	 Vivek Dutta Mishra
price	199
rating	4.5
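
The output above shows that every value is still a string, and that some values keep the leading space that followed the comma. A minimal sketch of a cleanup step, assuming the four columns used in this notebook (`clean_record` is a hypothetical helper, not part of the original code):

```python
def clean_record(record):
    """Return a copy of a raw record with whitespace stripped and numbers converted."""
    return {
        "title": record["title"].strip(),
        "author": record["author"].strip(),
        "price": int(record["price"]),      # int() ignores surrounding spaces
        "rating": float(record["rating"]),  # float() ignores them as well
    }

# One raw record, shaped like the dictionaries that read_csv returns
raw = {"title": "Brethren", "author": " John Grishma", "price": " 499", "rating": "4.1"}
book = clean_record(raw)
print(book)
```

Keeping the conversion in a separate function leaves the parsing code unchanged and makes the column types explicit in one place.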



### Making life easier with the csv module

* The csv module provides an automatic and robust way to read CSV data

In [46]:
import csv

#help(csv.reader)

In [47]:
import csv

with open('books.csv') as f:
    reader = csv.reader(f)
    header=next(reader) # read the header line, but do not print it
    for value in reader:
        print(value)


['The Accursed God', 'Vivek Dutta Mishra', '299', '4.5']
['Brethren', ' John Grishma', ' 499', '4.1']
['The Sons of Fortune', ' Jeffrey Archer', ' 410', '4.3']
['The Count of Monte Cristo', ' Alexandre Dumas', ' 400', '4.9']
['Manas', ' Vivek Dutta Mishra', '199', '4.5']
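
The rows above still carry the space that follows each comma. `csv.reader` accepts a `skipinitialspace` option that drops whitespace immediately after the delimiter. A small sketch, using an in-memory `sample` string instead of the file (an assumption made so the example is self-contained):

```python
import csv
import io

# In-memory stand-in for two lines of books.csv
sample = "title,author,price,rating\nBrethren, John Grishma, 499,4.1\n"

# skipinitialspace=True ignores whitespace right after each delimiter
reader = csv.reader(io.StringIO(sample), skipinitialspace=True)
header = next(reader)
rows = list(reader)
print(rows)
```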


#### Commas may also appear inside the data

* Sometimes a comma is part of the data itself
* Fields that contain a comma can be wrapped in double quotes
* The csv module knows how to read such files

In [48]:
%%file books2.csv 
title,author,price,rating
"The Accursed God, The Lost Epic Book1" ,Vivek Dutta Mishra,299,4.5
Brethren, John Grishma, 499,4.1
The Sons of Fortune, Jeffrey Archer, 410,4.3
The Count of Monte Cristo, Alexandre Dumas, 400,4.9
Manas,Vivek Dutta Mishra,199,4.5

Overwriting books2.csv


In [49]:
books=read_csv('books2.csv')
for book in books:
    print(book['title'])

IndexError: list index out of range

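#### Problem

* My function **read_csv** treats every comma as a separator
* It does not work when a comma is part of the data: the quoted title is split into two pieces, the line yields five values for four keys, and `keys[i]` raises the IndexError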

In [50]:
import csv

with open('books2.csv') as f:
    reader = csv.reader(f)
    header=next(reader) # read the header line, but do not print it
    for value in reader:
        print(value)

['The Accursed God, The Lost Epic Book1 ', 'Vivek Dutta Mishra', '299', '4.5']
['Brethren', ' John Grishma', ' 499', '4.1']
['The Sons of Fortune', ' Jeffrey Archer', ' 410', '4.3']
['The Count of Monte Cristo', ' Alexandre Dumas', ' 400', '4.9']
['Manas', 'Vivek Dutta Mishra', '199', '4.5']
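
Quoting also works in the other direction: with its default settings, `csv.writer` adds double quotes around any field that contains the delimiter. A minimal round-trip sketch, written to an in-memory buffer rather than a file (the variable names are illustrative):

```python
import csv
import io

row = ["The Accursed God, The Lost Epic Book1", "Vivek Dutta Mishra", "299", "4.5"]

# Write the row; the title contains a comma, so the writer quotes it
buffer = io.StringIO()
csv.writer(buffer).writerow(row)
text = buffer.getvalue()
print(text)

# Read the text back; the quoted field comes back as a single value
parsed = next(csv.reader(io.StringIO(text)))
print(parsed)
```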


#### A different delimiter

* We may choose a different delimiter when reading or writing

In [51]:
import csv
def copy_csv(source,target,source_delimeter=',',target_delimeter=','):
    # newline='' lets the csv module control line endings, as its documentation recommends
    with open(source,newline='') as src:
        with open(target,"w",newline='') as trgt:
            reader= csv.reader(src,delimiter=source_delimeter)
            writer= csv.writer(trgt,delimiter=target_delimeter)
            for line in reader:
                writer.writerow(line)


In [52]:
copy_csv('books.csv','books3.csv')

In [53]:
copy_csv('books2.csv','books4.csv',target_delimeter="|")
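
To see what such a conversion produces without opening the target file, the same idea can be sketched with in-memory buffers. This is an illustration under assumptions, not the notebook's own function: the two-column sample and the `lineterminator="\n"` setting are chosen here to keep the output short.

```python
import csv
import io

# Comma-delimited source with one quoted field that contains a comma
source = io.StringIO(
    'title,author\n"The Accursed God, The Lost Epic Book1",Vivek Dutta Mishra\n'
)
target = io.StringIO()

# Re-write every row with "|" as the delimiter
writer = csv.writer(target, delimiter="|", lineterminator="\n")
for row in csv.reader(source):
    writer.writerow(row)

converted = target.getvalue()
print(converted)

# Reading back requires the same delimiter
back = list(csv.reader(io.StringIO(converted), delimiter="|"))
```

Because the comma is no longer the delimiter, the writer has no need to quote the title in the converted text.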

### Reading/writing data as dictionaries

In [55]:
with open('books.csv') as f:
    books= csv.DictReader(f)
    for book in books:
        print(book['title'])

The Accursed God
Brethren
The Sons of Fortune
The Count of Monte Cristo
Manas
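
For the writing side, the csv module offers `csv.DictWriter`, which takes the column order from a `fieldnames` list. A minimal sketch that writes to an in-memory buffer (the single sample record and the `lineterminator="\n"` setting are assumptions made for a compact example):

```python
import csv
import io

books = [
    {"title": "Manas", "author": "Vivek Dutta Mishra", "price": 199, "rating": 4.5},
]

buffer = io.StringIO()
writer = csv.DictWriter(
    buffer,
    fieldnames=["title", "author", "price", "rating"],  # column order in the output
    lineterminator="\n",
)
writer.writeheader()     # writes the header row from fieldnames
writer.writerows(books)  # writes one line per dictionary
output = buffer.getvalue()
print(output)
```

When writing to a real file instead, the csv documentation recommends opening it with `newline=''`.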
