<h1 style="color : red;">Working with CSV files</h1>

## What is a CSV file?

A CSV (comma-separated values) file is a plain text file that stores tabular data: each line is one row, and the fields within a row are separated by commas. Because it is plain text, it can contain only text data, that is, printable ASCII or Unicode characters.
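
To make that structure concrete, the sketch below parses a small piece of CSV text held in memory. The sample text and the use of `io.StringIO` in place of a real file are assumptions made for illustration only.

```python
import csv
import io

# Hypothetical sample text: one record per line, fields separated by commas.
csv_text = "item,quantity\napple,3\npear,5\n"

# io.StringIO lets the csv module treat the string like an open file.
rows = list(csv.reader(io.StringIO(csv_text)))

for row in rows:
    print(row)  # every field is returned as a string, including the numbers
```

Each line becomes a list of strings, so any numeric fields have to be converted explicitly after reading.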

## Create a CSV file using the write method, without the csv module

In [17]:
with open("sample_files/state_info.csv", "w") as file:
    file.write("Florida,Tallahassee,FL\n")
    file.write("Georgia,Atlanta,GA\n")
    file.write("South Carolina,Columbia,SC\n")

### Read the file for verification

In [18]:
with open("sample_files/state_info.csv") as file:
    print(file.read())

Florida,Tallahassee,FL
Georgia,Atlanta,GA
South Carolina,Columbia,SC



## Using the csv module

In [34]:
import csv

In [60]:
# The csv documentation recommends newline="" for files handed to the csv module.
with open("sample_files/state_info.csv", newline="") as f:
    csv_f = csv.reader(f)  # reader object that yields each line of the file as a list of strings
    for row in csv_f:  # iterate over the reader object
        state, capital, abbreviation = row  # unpack the three-item list
        print("State: {}, Capital: {}".format(state, capital))
        print("The abbreviation of {} is {}".format(state, abbreviation))

print("*"*50)

with open("sample_files/state_info.csv", newline="") as file:
    csv_file = csv.reader(file)
    for row in csv_file:
        print(row)

State: Florida, Capital: Tallahassee
The abbreviation of Florida is FL
State: Georgia, Capital: Atlanta
The abbreviation of Georgia is GA
State: South Carolina, Capital: Columbia
The abbreviation of South Carolina is SC
**************************************************
['Florida', 'Tallahassee', 'FL']
['Georgia', 'Atlanta', 'GA']
['South Carolina', 'Columbia', 'SC']
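
One reason to prefer the csv module over lines written by hand: a field that itself contains a comma must be quoted, or it is read back as two fields. The following is a minimal sketch that uses an in-memory buffer and a made-up row rather than the files above.

```python
import csv
import io

# Hypothetical row whose second field contains a comma.
row = ["District of Columbia", "Washington, D.C.", "DC"]

# Joining by hand yields a line that parses into four fields instead of three.
naive_line = ",".join(row)
naive_fields = next(csv.reader([naive_line]))

# csv.writer quotes the field that needs it, so the row survives a round trip.
buffer = io.StringIO()
csv.writer(buffer).writerow(row)
written_line = buffer.getvalue()
round_trip = next(csv.reader([written_line]))

print(naive_fields)
print(repr(written_line))
print(round_trip)
```

The hand-joined line splits at the comma inside the second field, while the line produced by `csv.writer` reads back as the original three fields.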


## Create a CSV file from a list that has no headers

In [35]:
country_list = [["USA", "Washington D.C."], ["Canada", "Ottawa"], ["Mexico", "Mexico City"]]

In [36]:
with open("sample_files/countries.csv", "w", newline="") as file:
    # csv.writer returns a writer object that converts each row into a
    # delimited string and writes it to the given file-like object.
    writer = csv.writer(file)
    writer.writerows(country_list)
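
By default the csv writer ends every row with `"\r\n"`, which is why the csv documentation recommends opening output files with `newline=""` so that Python does not translate the line endings a second time. The sketch below shows the raw text the writer produces, using an in-memory buffer and illustrative rows instead of a file on disk.

```python
import csv
import io

sample_rows = [["USA", "Washington D.C."], ["Canada", "Ottawa"]]

buffer = io.StringIO()  # in-memory stand-in for a file; performs no newline translation
writer = csv.writer(buffer)
writer.writerow(sample_rows[0])    # writes a single row
writer.writerows(sample_rows[1:])  # writes every row of an iterable

raw_text = buffer.getvalue()
print(repr(raw_text))
```

`writerow` takes one row and `writerows` takes an iterable of rows; both use the same line terminator.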

### Read the file

In [37]:
f = open("sample_files/countries.csv")
csv_f = csv.reader(f)
for row in csv_f:
    country, capital = row
    print("The capital of {} is {}".format(country, capital))

The capital of USA is Washington D.C.
The capital of Canada is Ottawa
The capital of Mexico is Mexico City


In [38]:
f.close()

## Using DictReader to read a CSV file that has headers

### Create a CSV file from a list that includes a header row

The first member of the list is the header.

In [53]:
# this is the same procedure as the one above, except this list has a header row
river_list = [['River', 'Continent'], ['Nile', 'Africa'], 
              ['Tiber', 'Europe'], ['Mississippi', 'North America'], 
              ['Tigris', 'Asia'], ['Amazon', 'South America']]
with open("sample_files/rivers.csv", "w", newline="") as file:
    writer = csv.writer(file)
    writer.writerows(river_list)

### Read a CSV file that has headers

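When no fieldnames argument is given, DictReader uses the first row of the file as the header.

DictReader creates an object that operates like a regular reader but maps each row to a dict whose keys come from the optional fieldnames parameter or, when it is omitted, from the first row. On Python 3.6 and 3.7 each row is an OrderedDict, as in the output below; from Python 3.8 onward it is a plain dict.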
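
When a file has no header row, the fieldnames parameter supplies the keys instead, and the first line is then treated as data. This is a minimal sketch with in-memory text; the column names `country` and `capital` are chosen here purely for illustration.

```python
import csv
import io

# Headerless text, similar in shape to a two-column file without a header row.
headerless_text = "USA,Washington D.C.\nCanada,Ottawa\n"

reader = csv.DictReader(io.StringIO(headerless_text), fieldnames=["country", "capital"])
records = [dict(row) for row in reader]  # dict() gives plain dicts on any Python 3 version

for record in records:
    print("The capital of {} is {}".format(record["country"], record["capital"]))
```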

In [50]:
with open('sample_files/rivers.csv', newline="") as file:
    reader = csv.DictReader(file)
    print(reader.fieldnames)
    print('*'*50)
    for row in reader:
        print("The river {} is located in {}".format(row['River'], row['Continent']))
    # The loop above consumed every row, so the reader is now exhausted.
    # The file is still open (it closes when the with block ends),
    # but the loop below prints nothing.
    for row in reader:
        print(row)
print('*'*50)
with open('sample_files/rivers.csv', newline="") as file:
    reader = csv.DictReader(file)
    for row in reader:
        print(row)

['River', 'Continent']
**************************************************
The river Nile is located in Africa
The river Tiber is located in Europe
The river Mississippi is located in North America
The river Tigris is located in Asia
The river Amazon is located in South America
**************************************************
OrderedDict([('River', 'Nile'), ('Continent', 'Africa')])
OrderedDict([('River', 'Tiber'), ('Continent', 'Europe')])
OrderedDict([('River', 'Mississippi'), ('Continent', 'North America')])
OrderedDict([('River', 'Tigris'), ('Continent', 'Asia')])
OrderedDict([('River', 'Amazon'), ('Continent', 'South America')])
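
A reader is a one-pass iterator over the underlying file, so a second loop over the same reader yields nothing. Two common workarounds are to collect the rows into a list once, or to rewind the file and build a new reader. The following is a minimal sketch that uses in-memory text in place of rivers.csv.

```python
import csv
import io

text = "River,Continent\nNile,Africa\nTiber,Europe\n"
file = io.StringIO(text)  # stand-in for an open file

reader = csv.DictReader(file)
first_pass = [dict(row) for row in reader]   # consumes every row
second_pass = [dict(row) for row in reader]  # reader is exhausted, so this is empty

# Rewind the file and create a fresh reader so the header line is parsed again.
file.seek(0)
third_pass = [dict(row) for row in csv.DictReader(file)]

print(len(first_pass), len(second_pass), len(third_pass))
```

Keeping the rows in a list such as `first_pass` is usually the simpler choice when the file is small enough to fit in memory.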


## Use DictWriter to create a CSV file that has a header, from a list of keys and a list of dictionaries

### Create the keys and the list of dictionaries

In [29]:
keys = ["author", "title", "genre"]

data = [
    {
        "author" : "Isaac Asimov",
        "title" : "Foundation",
        "genre" : "science fiction"
    },
    {
        "author" : "Jane Austen",
        "title" : "Pride and Prejudice",
        "genre" : "fiction"
    },
    {
        "author" : "Napoleon Hill",
        "title" : "Think and Grow Rich",
        "genre" : "self-development"
    }
]

### Create the file and write the data

In [30]:
with open("sample_files/authors.csv", "w", newline="") as authors:
    writer = csv.DictWriter(authors, fieldnames=keys)
    writer.writeheader()    # writes the header row taken from fieldnames
    writer.writerows(data)  # writes one row per dictionary

### Verify that the data was written to the file

In [57]:
with open("sample_files/authors.csv", newline="") as authors:
    reader = csv.DictReader(authors)
    print(reader.fieldnames)
    print("*"*50)
    for row in reader:
        print("{} wrote {}. The genre of the book is {}".format(row["author"], row["title"], row["genre"]))

print("*"*50)

with open("sample_files/authors.csv", newline="") as authors:
    reader = csv.DictReader(authors)
    for row in reader:
        print(row)

['author', 'title', 'genre']
**************************************************
Isaac Asimov wrote Foundation. The genre of the book is science fiction
Jane Austen wrote Pride and Prejudice. The genre of the book is fiction
Napoleon Hill wrote Think and Grow Rich. The genre of the book is self-development
**************************************************
OrderedDict([('author', 'Isaac Asimov'), ('title', 'Foundation'), ('genre', 'science fiction')])
OrderedDict([('author', 'Jane Austen'), ('title', 'Pride and Prejudice'), ('genre', 'fiction')])
OrderedDict([('author', 'Napoleon Hill'), ('title', 'Think and Grow Rich'), ('genre', 'self-development')])
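
The dictionaries above all have exactly the keys listed in fieldnames. When that is not guaranteed, DictWriter's `restval` and `extrasaction` parameters control what happens. The sketch below writes to an in-memory buffer; the `"unknown"` placeholder and the extra `shelf` key are hypothetical values chosen for illustration.

```python
import csv
import io

fieldnames = ["author", "title", "genre"]
buffer = io.StringIO()  # in-memory stand-in for a file

# restval fills in keys that a row is missing; extrasaction="ignore" drops keys
# that are not in fieldnames instead of raising ValueError (the default, "raise").
writer = csv.DictWriter(buffer, fieldnames=fieldnames,
                        restval="unknown", extrasaction="ignore")
writer.writeheader()
writer.writerow({"author": "Jane Austen", "title": "Emma"})  # no "genre" key
writer.writerow({"author": "Jane Austen", "title": "Persuasion",
                 "genre": "fiction", "shelf": "A3"})        # extra "shelf" key

lines = buffer.getvalue().splitlines()
for line in lines:
    print(line)
```

With the default `extrasaction="raise"`, the second `writerow` call would raise `ValueError` because `shelf` is not one of the field names.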
