# <span style="color:red;">Step 1: Understanding the CSV Module</span>

Python's `csv` module provides functionality to read and write **CSV (Comma-Separated Values)** files.   
These files typically contain tabular data where each row represents a record, and each value within a row is separated by a comma (,).      

The csv module provides:   
- csv.reader to read CSV files.
- csv.writer to write data to CSV files.
- Options to handle different delimiters, quoting, and more.
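
As a minimal sketch of the delimiter and quoting options listed above, the following reads semicolon-separated text and writes it back with every field quoted. The in-memory sample and its field names are hypothetical and are used only so the snippet runs without a file on disk.

```python
import csv
import io

# Hypothetical in-memory data that uses ';' as the delimiter.
# The second field contains the delimiter itself, so it is quoted.
raw = 'id;comment\n1;"fast; reliable"\n'

# Read with a custom delimiter and quote character
reader = csv.reader(io.StringIO(raw), delimiter=';', quotechar='"')
rows = list(reader)
print(rows)

# Write the same rows back out, quoting every field
buffer = io.StringIO()
writer = csv.writer(buffer, delimiter=';', quoting=csv.QUOTE_ALL)
writer.writerows(rows)
print(buffer.getvalue())
```

The quoted field survives the round trip as a single value even though it contains the delimiter.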

# <span style="color:red;">Step 2: Creating a Sample CSV File with Numeric Data</span>

Create a CSV file named **data.csv** (the name used in the code below) containing the following numeric data:    
```
ID,Name,Maths,Science,English
1,Alice,85,90,88
2,Bob,78,80,79
3,Charlie,92,89,94
4,David,85,92,87
5,Eva,91,95,93
```
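
If you prefer to generate the file from Python instead of typing it by hand, one possible sketch uses `csv.writer`. It assumes the file should be written to the current working directory as `data.csv`, and it overwrites any existing file with that name.

```python
import csv

# Header followed by one row per student, matching the sample above
rows = [
    ["ID", "Name", "Maths", "Science", "English"],
    [1, "Alice", 85, 90, 88],
    [2, "Bob", 78, 80, 79],
    [3, "Charlie", 92, 89, 94],
    [4, "David", 85, 92, 87],
    [5, "Eva", 91, 95, 93],
]

# newline='' lets the csv module control line endings itself
with open("data.csv", mode="w", newline="") as file:
    writer = csv.writer(file)
    writer.writerows(rows)
```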

# <span style="color:red;">Step 3: Reading the CSV File</span>       
We will now learn how to read this CSV file using Python's **csv.reader**.

## <span style="color:blue;">i. Opening the CSV File</span>    
We need to **open the CSV file** using the **open()** function and then **pass** the **file object** to **csv.reader**.

In [2]:
import csv

# Open the CSV file (the csv docs recommend newline='' for file objects)
with open('data.csv', mode='r', newline='') as file:
    # Create a csv reader object
    csv_reader = csv.reader(file)
    
    # Reading each row in the csv
    for row in csv_reader:
        print(row)

['ID', 'Name', 'Maths', 'Science', 'English']
['1', 'Alice', '85', '90', '88']
['2', 'Bob', '78', '80', '79']
['3', 'Charlie', '92', '89', '94']
['4', 'David', '85', '92', '87']
['5', 'Eva', '91', '95', '93']


**Note:**    
- The **first row** contains the **headers (ID, Name, Maths, Science, English)**.
- Each **subsequent row** holds one student's data. **csv.reader** returns every field as a **string**, so a score arrives as `'85'`, not `85`.

In [3]:
# line_num is the number of lines the reader has consumed so far (header included)
csv_reader.line_num

6

## <span style="color:blue;">ii. Skipping the Header</span>    
Since the **first row contains the header (column names)**, we usually want to **skip it** when performing calculations.   
Here’s how to skip the header:

In [5]:
import csv

# Open the CSV file
with open('data.csv', mode='r', newline='') as file:
    # Create a csv reader object
    csv_reader = csv.reader(file)
    
    # Skip the header
    next(csv_reader)
    
    # Reading each row after the header
    for row in csv_reader:
        print(row)

['1', 'Alice', '85', '90', '88']
['2', 'Bob', '78', '80', '79']
['3', 'Charlie', '92', '89', '94']
['4', 'David', '85', '92', '87']
['5', 'Eva', '91', '95', '93']
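
As an alternative to skipping the header by hand, the `csv` module also offers `csv.DictReader`, which consumes the first row and uses it as the keys for every following row. The sketch below is illustrative rather than part of the original walkthrough, and it reads a shortened in-memory copy of the sample data so it runs on its own.

```python
import csv
import io

# Shortened in-memory copy of the sample file
sample = """ID,Name,Maths,Science,English
1,Alice,85,90,88
2,Bob,78,80,79
"""

# DictReader reads the header row and maps each data row to a dict
reader = csv.DictReader(io.StringIO(sample))
records = list(reader)

print(reader.fieldnames)
for record in records:
    # Values are still strings, exactly as with csv.reader
    print(record["Name"], record["Maths"])
```

Looking fields up by name avoids hard-coded column positions such as `row[2]`, at the cost of building one dictionary per row.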


# <span style="color:red;">Step 4: Processing the Numeric Data</span>      
Now that we have the data, we’ll convert the relevant columns (Maths, Science, English) from strings to integers to perform mathematical operations.

## <span style="color:blue;">i. Converting Strings to Integers</span>    
Here’s how we can convert the strings representing numeric values to integers:

In [7]:
import csv

# Open the CSV file
with open('data.csv', mode='r', newline='') as file:
    # Create a csv reader object
    csv_reader = csv.reader(file)
    
    # Skip the header
    next(csv_reader)
    
    # Process each row
    for row in csv_reader:
        # Convert Maths, Science, and English scores to integers
        maths = int(row[2])
        science = int(row[3])
        english = int(row[4])
        
        print(f"Student: {row[1]}, Maths: {maths}, Science: {science}, English: {english}")

Student: Alice, Maths: 85, Science: 90, English: 88
Student: Bob, Maths: 78, Science: 80, English: 79
Student: Charlie, Maths: 92, Science: 89, English: 94
Student: David, Maths: 85, Science: 92, English: 87
Student: Eva, Maths: 91, Science: 95, English: 93
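
The conversion above assumes every score is a valid integer. `int()` raises `ValueError` on anything else, so real files may need a guard. The following is a sketch under that assumption; the `absent` value in the second row is a hypothetical bad entry added for illustration and is not part of the sample file.

```python
import csv
import io

# In-memory sample with one hypothetical non-numeric score
sample = """ID,Name,Maths,Science,English
1,Alice,85,90,88
2,Bob,absent,80,79
"""

reader = csv.reader(io.StringIO(sample))
next(reader)  # skip the header

valid_rows = []
skipped_lines = []
for row in reader:
    try:
        # Convert the three score columns in one step
        scores = [int(value) for value in row[2:5]]
    except ValueError:
        # Remember which source line could not be converted
        skipped_lines.append(reader.line_num)
        continue
    valid_rows.append([row[0], row[1], *scores])

print(valid_rows)
print(skipped_lines)
```

Whether to skip, default, or abort on a bad row depends on the data; skipping is only one option.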


**Creating a List Containing the Processed Data:**

In [18]:
import csv

# Open the CSV file
with open('data.csv', mode='r', newline='') as file:
    # Create a csv reader object
    csv_reader = csv.reader(file)
    
    # Skip the header
    header = next(csv_reader)

    # Store the Processed data in a List
    processed_data = []
        
    # Process each row
    for row in csv_reader:
        # Convert Maths, Science, and English scores to integers
        maths = int(row[2])
        science = int(row[3])
        english = int(row[4])
        
        # Append the processed row back into the list
        processed_data.append([row[0], row[1], maths, science, english])

    # Printing the updated rows after processing
    print(header)
    for row in processed_data:
        print(row)

['ID', 'Name', 'Maths', 'Science', 'English']
['1', 'Alice', 85, 90, 88]
['2', 'Bob', 78, 80, 79]
['3', 'Charlie', 92, 89, 94]
['4', 'David', 85, 92, 87]
['5', 'Eva', 91, 95, 93]


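**Important:**    
Python's **with** statement does not create a new scope, so **processed_data** stays available after the block ends; only the file is closed. Creating the list before the **with block** is optional, but it makes clear that the data is meant to outlive the file.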
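
A short sketch of that behaviour, using an in-memory file object in place of `data.csv` so the snippet runs on its own:

```python
import csv
import io

sample = """ID,Name,Maths,Science,English
1,Alice,85,90,88
2,Bob,78,80,79
"""

with io.StringIO(sample) as file:
    reader = csv.reader(file)
    header = next(reader)
    # Built inside the with block
    processed_data = [
        [row[0], row[1], int(row[2]), int(row[3]), int(row[4])]
        for row in reader
    ]

# The file object is closed here, but the list is still usable
print(file.closed)
print(processed_data)
```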

## <span style="color:blue;">ii. Performing Data Crunching</span>    
Let’s perform some basic data crunching operations, such as **calculating** the **total** and **average score** for **each student**.

**Example:** Calculating Total and Average for Each Student.

In [25]:
import csv

# Open the CSV file
with open('data.csv', mode='r', newline='') as file:
    # Create a csv reader object
    csv_reader = csv.reader(file)
    
    # Skip the header
    header = next(csv_reader)

    # Store the Processed data in a List
    processed_data = []
        
    # Process each row
    for row in csv_reader:
        # Convert Maths, Science, and English scores to integers
        maths = int(row[2])
        science = int(row[3])
        english = int(row[4])
        
        # Append the processed row back into the list
        processed_data.append([row[0], row[1], maths, science, english])
    
    # Printing the updated rows after processing
    for row in processed_data:
        print(row)

    # Adding a Line Break
    print("\n")
    
    # Calculate total and average - Add these Columns to the 'processed_data' object
    header.append("Total")
    header.append("Average")
    for row in processed_data:
        total = row[2] + row[3] + row[4]
        average = round(total / 3, 2)

        # row is the same list object stored in processed_data, so append in place
        row.append(total)
        row.append(average)

    # Printing the updated rows after processing
    print(header)
    for row in processed_data:
        print(row)    

['1', 'Alice', 85, 90, 88]
['2', 'Bob', 78, 80, 79]
['3', 'Charlie', 92, 89, 94]
['4', 'David', 85, 92, 87]
['5', 'Eva', 91, 95, 93]


['ID', 'Name', 'Maths', 'Science', 'English', 'Total', 'Average']
['1', 'Alice', 85, 90, 88, 263, 87.67]
['2', 'Bob', 78, 80, 79, 237, 79.0]
['3', 'Charlie', 92, 89, 94, 275, 91.67]
['4', 'David', 85, 92, 87, 264, 88.0]
['5', 'Eva', 91, 95, 93, 279, 93.0]
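
Once the totals and averages are computed, `csv.writer` can save them to a new file. This is a sketch only: the output name `results.csv` is a hypothetical choice, and just two of the rows are repeated here to keep the snippet self-contained.

```python
import csv

header = ["ID", "Name", "Maths", "Science", "English", "Total", "Average"]
processed_data = [
    ["1", "Alice", 85, 90, 88, 263, 87.67],
    ["2", "Bob", 78, 80, 79, 237, 79.0],
]

# 'results.csv' is a hypothetical output file name
with open("results.csv", mode="w", newline="") as file:
    writer = csv.writer(file)
    writer.writerow(header)           # one row
    writer.writerows(processed_data)  # many rows at once
```

Numbers are converted to text on the way out, so reading `results.csv` back yields strings again.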


# <span style="color:red;">Step 5: Additional Data Crunching Operations</span>     
Now, let’s implement more data crunching functionalities like:
- Finding the highest scorer in each subject.
- Finding the overall highest and lowest average scores.
- Summing up all scores in a subject (e.g., total of all students' Maths scores).

## <span style="color:blue;">1. Finding the Highest Scorer in Each Subject</span>

In [26]:
import csv

highest_maths = {"name": None, "score": 0}
highest_science = {"name": None, "score": 0}
highest_english = {"name": None, "score": 0}

# Open the CSV file
with open('data.csv', mode='r', newline='') as file:
    csv_reader = csv.reader(file)
    
    # Skip the header
    next(csv_reader)
    
    for row in csv_reader:
        maths = int(row[2])
        science = int(row[3])
        english = int(row[4])
        
        # Check for highest Maths score
        if maths > highest_maths["score"]:
            highest_maths["name"] = row[1]
            highest_maths["score"] = maths
            
        # Check for highest Science score
        if science > highest_science["score"]:
            highest_science["name"] = row[1]
            highest_science["score"] = science
        
        # Check for highest English score
        if english > highest_english["score"]:
            highest_english["name"] = row[1]
            highest_english["score"] = english

print(f"Highest in Maths: {highest_maths['name']} ({highest_maths['score']})")
print(f"Highest in Science: {highest_science['name']} ({highest_science['score']})")
print(f"Highest in English: {highest_english['name']} ({highest_english['score']})")

Highest in Maths: Charlie (92)
Highest in Science: Eva (95)
Highest in English: Charlie (94)
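
When the converted rows are already in a list, the same result can be obtained with the built-in `max()` and a key function. The sketch below assumes rows shaped like the `processed_data` list from Step 4; the `subjects` mapping of names to column indexes is introduced here for illustration.

```python
from operator import itemgetter

# Rows in the same shape as processed_data from Step 4
processed_data = [
    ["1", "Alice", 85, 90, 88],
    ["2", "Bob", 78, 80, 79],
    ["3", "Charlie", 92, 89, 94],
    ["4", "David", 85, 92, 87],
    ["5", "Eva", 91, 95, 93],
]

# Illustrative mapping from subject name to column index
subjects = {"Maths": 2, "Science": 3, "English": 4}

top_scorers = {}
for subject, column in subjects.items():
    # max() returns the first row with the largest value in that column
    top = max(processed_data, key=itemgetter(column))
    top_scorers[subject] = (top[1], top[column])
    print(f"Highest in {subject}: {top[1]} ({top[column]})")
```

Unlike the loop above, this version needs no starting score of 0, so it would also work if scores could be negative.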


## <span style="color:blue;">2. Finding Overall Highest and Lowest Average Scores</span>

In [27]:
import csv

highest_avg = {"name": None, "average": 0}
lowest_avg = {"name": None, "average": float('inf')}

# Open the CSV file
with open('data.csv', mode='r', newline='') as file:
    csv_reader = csv.reader(file)
    
    # Skip the header
    next(csv_reader)
    
    for row in csv_reader:
        maths = int(row[2])
        science = int(row[3])
        english = int(row[4])
        
        total = maths + science + english
        average = total / 3
        
        # Check for highest average
        if average > highest_avg["average"]:
            highest_avg["name"] = row[1]
            highest_avg["average"] = average
        
        # Check for lowest average
        if average < lowest_avg["average"]:
            lowest_avg["name"] = row[1]
            lowest_avg["average"] = average

print(f"Highest Average: {highest_avg['name']} ({highest_avg['average']:.2f})")
print(f"Lowest Average: {lowest_avg['name']} ({lowest_avg['average']:.2f})")

Highest Average: Eva (93.00)
Lowest Average: Bob (79.00)
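
A more compact variant, sketched here with the standard-library `statistics.mean` and the built-in `max()` and `min()`. The `scores` dictionary is an illustrative restructuring of the sample data, not something the cell above produces.

```python
from statistics import mean

# Sample scores keyed by student name (Maths, Science, English)
scores = {
    "Alice": [85, 90, 88],
    "Bob": [78, 80, 79],
    "Charlie": [92, 89, 94],
    "David": [85, 92, 87],
    "Eva": [91, 95, 93],
}

# Average per student
averages = {name: mean(marks) for name, marks in scores.items()}

# Pick the names whose averages are largest and smallest
best = max(averages, key=averages.get)
worst = min(averages, key=averages.get)

print(f"Highest Average: {best} ({averages[best]:.2f})")
print(f"Lowest Average: {worst} ({averages[worst]:.2f})")
```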


## <span style="color:blue;">3. Summing All Scores in a Subject</span>

In [28]:
# Let's sum the scores of all students in Maths
import csv

total_maths = 0

# Open the CSV file
with open('data.csv', mode='r', newline='') as file:
    csv_reader = csv.reader(file)
    
    # Skip the header
    next(csv_reader)
    
    for row in csv_reader:
        maths = int(row[2])
        total_maths += maths

print(f"Total Maths Score: {total_maths}")


Total Maths Score: 431
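
To total every subject in one pass instead of only Maths, one option is to transpose the score columns with `zip()` and apply `sum()` to each. The sketch assumes rows shaped like the `processed_data` list from Step 4.

```python
# Rows in the same shape as processed_data from Step 4
processed_data = [
    ["1", "Alice", 85, 90, 88],
    ["2", "Bob", 78, 80, 79],
    ["3", "Charlie", 92, 89, 94],
    ["4", "David", 85, 92, 87],
    ["5", "Eva", 91, 95, 93],
]

subject_names = ["Maths", "Science", "English"]

# zip(*rows) turns a list of rows into a sequence of columns
columns = zip(*(row[2:5] for row in processed_data))
totals = {name: sum(column) for name, column in zip(subject_names, columns)}

for name, total in totals.items():
    print(f"Total {name} Score: {total}")
```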


In [29]:
import csv

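# Open the CSV file
with open('data.csv', mode='r', newline='') as file:
    # Create a csv reader object, passing the delimiter explicitly
    # (',' is the default, so the output matches the earlier cells)
    csv_reader = csv.reader(file, delimiter=",")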
    
    # Reading each row in the csv
    for row in csv_reader:
        print(row)

['ID', 'Name', 'Maths', 'Science', 'English']
['1', 'Alice', '85', '90', '88']
['2', 'Bob', '78', '80', '79']
['3', 'Charlie', '92', '89', '94']
['4', 'David', '85', '92', '87']
['5', 'Eva', '91', '95', '93']
