# File Handling (Part 03) - CSV files

## What is CSV? 
CSV (Comma-Separated Values) is a simple text format for storing tabular data, such as the contents of a spreadsheet or a database table. Each record sits on its own line, and the fields within a record are separated by commas. It is the most common import and export format for spreadsheets and databases. You can create a CSV file in any text editor by saving it with the `.csv` extension instead of `.txt`. For example, data for a CSV file can look like:

If you open a CSV file in a text editor such as Notepad, you will see the raw data as in the cell above. You can also open a CSV file in MS Excel, which displays it as a properly formatted table.

### Working with CSVs
We can read and write CSV files using the plain file handling modes and techniques we have learned so far, but Python's built-in `csv` module makes the job easier. We will look at the manual approach first and then use that module.

In [3]:
# Writing using simple write mode
data = [
    ['Name', 'Age', 'City'],
    ['Alice', '25', 'New York'],
    ['Bob', '30', 'San Francisco']
]

with open('21_candidates.csv', 'w') as file:
    for row in data:
        file.write(','.join(row) + '\n')

In [4]:
# Reading using simple read mode
with open('21_candidates.csv', 'r') as file:
    lines = file.readlines()
    for line in lines:
        print(line.strip().split(','))

['Name', 'Age', 'City']
['Alice', '25', 'New York']
['Bob', '30', 'San Francisco']
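
The manual approach above works only while no field contains a comma. The following is a minimal sketch of where it breaks, using an in-memory string in place of a file; the sample row is made up for illustration.

```python
import csv
import io

# Hypothetical sample: the city field contains a comma, so it is quoted.
text = 'Name,Age,City\nCarol,41,"Portland, Oregon"\n'

# Naive splitting cuts the quoted field in two.
naive_rows = [line.split(',') for line in text.splitlines()]

# csv.reader understands the quotes and keeps the field together.
parsed_rows = list(csv.reader(io.StringIO(text)))

print(naive_rows[1])
print(parsed_rows[1])
```

The naive version returns four pieces for the data row, while `csv.reader` returns the expected three fields.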


### Using CSV Module
1. **Reading**  
   `reader`: This function takes a file object and returns a reader object that yields each line of the file as a list of strings.

In [5]:
import csv

In [6]:
with open("21_size.csv", "r") as file:
    reader = csv.reader(file)
    for row in reader:
        print(row)

['Size', 'Abbreviated Size']
['Extra Small', 'XS']
['Small', 'S']
['Medium', 'M']
['Large', 'L']
['Extra Large', 'XL']


`DictReader`: This class takes a file object and returns a DictReader object that iterates over the lines of the file as dictionaries, where each key is a header and each value is the corresponding data.

In [7]:
with open("21_size.csv", "r") as file:
    reader = csv.DictReader(file)
    for row in reader:
        print(row)

{'Size': 'Extra Small', 'Abbreviated Size': 'XS'}
{'Size': 'Small', 'Abbreviated Size': 'S'}
{'Size': 'Medium', 'Abbreviated Size': 'M'}
{'Size': 'Large', 'Abbreviated Size': 'L'}
{'Size': 'Extra Large', 'Abbreviated Size': 'XL'}
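
Note that both `reader` and `DictReader` return every value as a string. Here is a small sketch of converting values before doing arithmetic, assuming a made-up in-memory sample rather than a file on disk:

```python
import csv
import io

# Hypothetical in-memory data standing in for a CSV file on disk.
text = "Product,Price,Quantity\nMouse,19.99,20\n"

reader = csv.DictReader(io.StringIO(text))
rows = list(reader)

# The header row becomes the dictionary keys; it is also
# available as reader.fieldnames.
print(reader.fieldnames)

# Every value arrives as a string, so convert before calculating.
first = rows[0]
line_total = float(first["Price"]) * int(first["Quantity"])
print(type(first["Price"]).__name__, round(line_total, 2))
```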


2. **Writing**  
   `writer`: Like the simple write example, this function creates a writer object that allows writing rows to the file.

In [8]:
data = [
    ['Product', 'Price', 'Quantity'],
    ['Laptop', 1200.50, 5],
    ['Mouse', 19.99, 20],
    ['Monitor', 299.99, 3],
    ['Headphones', 89.95, 10]
]
with open('21_inventory.csv', 'w') as file:     # without newline='', blank lines can appear between rows on Windows
    csv_writer = csv.writer(file)
    # Write header row
    csv_writer.writerow(data[0])
    # Write data rows
    csv_writer.writerows(data[1:])

In [9]:
with open('21_inventory.csv', 'w', newline='') as file:    # newline='' disables newline translation, so each row ends with the csv module's own '\r\n' on all platforms
    csv_writer = csv.writer(file)
    csv_writer.writerow(data[0])
    # Write data rows
    csv_writer.writerows(data[1:])
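
To see what the writer actually emits, the sketch below writes to an `io.StringIO` buffer, used here as a stand-in for a file opened with `newline=''`. The product name containing a comma is a made-up example.

```python
import csv
import io

# In-memory buffer: like a file opened with newline='',
# it does not translate line endings on write.
buffer = io.StringIO()
writer = csv.writer(buffer)

writer.writerow(["Product", "Price"])
# Hypothetical product name that contains a comma.
writer.writerow(["Laptop, 15 inch", 1200.5])

written = buffer.getvalue()
print(repr(written))
```

The writer ends each row with `\r\n` by default and puts quotes around any field that contains a comma. If the file object also translated `\n` to `\r\n`, as text files do by default on Windows, each row would end with `\r\r\n`, which shows up as an extra blank line.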

`DictWriter`: Similar to DictReader, this class creates a writer object that writes dictionaries as rows, using the keys listed in `fieldnames` to decide the column order.

In [10]:
# Writing data to a CSV file using DictWriter
data = [
    {'Product': 'Laptop', 'Price': 1200.50, 'Quantity': 5},
    {'Product': 'Mouse', 'Price': 19.99, 'Quantity': 20},
    {'Product': 'Monitor', 'Price': 299.99, 'Quantity': 3},
    {'Product': 'Headphones', 'Price': 89.95, 'Quantity': 10}
]

inventory_columns = ['Product', 'Price', 'Quantity']

with open('21_inventory.csv', 'w', newline='') as file:
    csv_writer = csv.DictWriter(file, fieldnames=inventory_columns)
    
    # Write header row
    csv_writer.writeheader()
    
    # Write data rows
    csv_writer.writerows(data)
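
Real data does not always match `fieldnames` exactly. As a hedged sketch, the two optional `DictWriter` parameters `restval` and `extrasaction` control what happens with missing and unexpected keys. The rows below are invented for illustration.

```python
import csv
import io

fieldnames = ["Product", "Price", "Quantity"]

# Hypothetical rows: the first lacks 'Quantity', the second has an extra key.
rows = [
    {"Product": "Keyboard", "Price": 49.0},
    {"Product": "Webcam", "Price": 59.5, "Quantity": 4, "Color": "black"},
]

buffer = io.StringIO()  # in-memory stand-in for a file
writer = csv.DictWriter(
    buffer,
    fieldnames=fieldnames,
    restval="0",            # value written when a key is missing
    extrasaction="ignore",  # drop keys that are not in fieldnames
)
writer.writeheader()
writer.writerows(rows)

output_lines = buffer.getvalue().splitlines()
print(output_lines)
```

With the default `extrasaction="raise"`, the second row would raise a `ValueError` instead of being written.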


### Accessing specific rows and columns

In [11]:
# 1. Accessing the price and quantity of monitor
with open("21_inventory.csv", "r") as file:
    # Create a reader object
    reader = csv.DictReader(file)

    # Find the row for the desired product
    for row in reader:
        if row["Product"] == "Monitor":
            price = row["Price"]
            quantity = row['Quantity']
            break

    print(f"Price of Monitor: ${price}")
    print(f"Quantity of Monitor: {quantity}")

Price of Monitor: $299.99
Quantity of Monitor: 3
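
In the lookup above, `price` and `quantity` are never assigned if the product is not in the file, so the `print` calls would raise a `NameError`. One possible way to handle that is sketched below. The helper name `find_product` is hypothetical, and an in-memory sample replaces the file.

```python
import csv
import io

def find_product(file, name):
    """Return the first row whose Product column equals name, or None."""
    for row in csv.DictReader(file):
        if row["Product"] == name:
            return row
    return None

# Hypothetical sample standing in for the inventory file.
sample = io.StringIO("Product,Price,Quantity\nMonitor,299.99,3\n")

monitor = find_product(sample, "Monitor")
sample.seek(0)  # rewind so the second search starts from the top
missing = find_product(sample, "Tablet")

print(monitor)
print(missing)
```

Returning `None` lets the caller decide what to do when nothing matches.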


In [12]:
# 2. Accessing total quantity of all items:
with open("21_inventory.csv", "r") as file:
    # Create a reader object
    reader = csv.reader(file)

    # Skip the header row
    next(reader)

    # Total quantity
    total_quantity = 0
    for row in reader:
        quantity = int(row[2])
        total_quantity += quantity

    # Print the total quantity
    print(f"Total Quantity: {total_quantity}")

Total Quantity: 38


In [13]:
# 3. Accessing the average price per unit (weighted by quantity):
with open("21_inventory.csv", "r") as file:
    # Create a reader object
    reader = csv.reader(file)

    # Skip the header row
    next(reader)

    # Total price and quantity
    total_price = 0
    total_quantity = 0
    for row in reader:
        price = float(row[1])
        quantity = int(row[2])
        total_price += price * quantity
        total_quantity += quantity

    # Calculate the average price
    average_price = total_price / total_quantity

    # Print the average price
    print(f"Average Price: ${average_price}")

Average Price: $215.83605263157895
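
The figure above is an average weighted by quantity, which differs from the plain mean of the four listed prices. The sketch below computes both from an in-memory copy of the inventory data so the difference is visible.

```python
import csv
import io

# In-memory copy of the inventory data used in this section.
text = """Product,Price,Quantity
Laptop,1200.5,5
Mouse,19.99,20
Monitor,299.99,3
Headphones,89.95,10
"""

rows = list(csv.DictReader(io.StringIO(text)))
prices = [float(r["Price"]) for r in rows]
quantities = [int(r["Quantity"]) for r in rows]

# Plain mean: every product counts once.
simple_mean = sum(prices) / len(prices)

# Weighted mean: every unit in stock counts once.
weighted_mean = sum(p * q for p, q in zip(prices, quantities)) / sum(quantities)

print(f"Simple mean:   {simple_mean}")
print(f"Weighted mean: {weighted_mean}")
```

The weighted mean is pulled down by the 20 inexpensive mice, while the simple mean treats the laptop and the mouse equally.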


# File Handling using Pandas
Pandas is a widely used library for data analysis and data science. Here we look only at its file handling features: reading and writing CSV files, and accessing specific rows and columns. Install pandas with:  
`pip install pandas`

In [14]:
# Importing the library
import pandas as pd



## Reading and Writing using Pandas
To read data from a CSV file, use the pandas function `read_csv`, which returns a `DataFrame` object. To write a `DataFrame` back to a CSV file, use its `to_csv` method.

In [15]:
data_frame = pd.read_csv("21_inventory.csv")
data_frame

      Product    Price  Quantity
0      Laptop  1200.50         5
1       Mouse    19.99        20
2     Monitor   299.99         3
3  Headphones    89.95        10


In [16]:
data_frame['Product']

0        Laptop
1         Mouse
2       Monitor
3    Headphones
Name: Product, dtype: object

In [17]:
# Accessing the price and quantity of monitor
monitor = data_frame[data_frame['Product'] == 'Monitor']
monitor

   Product   Price  Quantity
2  Monitor  299.99         3


In [18]:
monitor['Price']

2    299.99
Name: Price, dtype: float64
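
The selections above return a `DataFrame` or a `Series`, not plain numbers. As a sketch, `.loc` can pick rows and columns in one step, and `.iloc[0]` can then pull out a single value. The data is an in-memory sample, not the file from this section.

```python
import io
import pandas as pd

# Hypothetical in-memory sample standing in for the inventory file.
text = "Product,Price,Quantity\nLaptop,1200.5,5\nMonitor,299.99,3\n"
data_frame = pd.read_csv(io.StringIO(text))

# The boolean mask selects rows; the list selects columns.
monitor = data_frame.loc[data_frame["Product"] == "Monitor", ["Price", "Quantity"]]

# .iloc[0] takes the first matching row of each column.
price = float(monitor["Price"].iloc[0])
quantity = int(monitor["Quantity"].iloc[0])

print(price, quantity)
```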

In [19]:
# Accessing average price of all items
total = (data_frame['Price']*data_frame['Quantity']).sum()
average_price = total / data_frame['Quantity'].sum()
average_price

215.83605263157895
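
The `to_csv` method mentioned earlier is not demonstrated above, so here is a minimal sketch. It builds a small `DataFrame` by hand (the values are copied from the inventory example) and writes it to a string instead of a file.

```python
import io
import pandas as pd

data_frame = pd.DataFrame({
    "Product": ["Laptop", "Mouse"],
    "Price": [1200.5, 19.99],
    "Quantity": [5, 20],
})

# With no path argument, to_csv returns the CSV text instead of writing a file.
# index=False leaves out the row index (0, 1, ...).
csv_text = data_frame.to_csv(index=False)
print(csv_text)

# Reading the text back gives the same columns and values.
round_trip = pd.read_csv(io.StringIO(csv_text))
print(round_trip)
```

To write to disk instead, pass a path as the first argument, for example `data_frame.to_csv("inventory_copy.csv", index=False)`, where the file name is only an illustration. Without `index=False`, the index is saved as an unnamed first column, which shows up as `Unnamed: 0` when the file is read again.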