# **Code-Along: Handling Encoding and Mixed Delimiters in CSV Files**

In this notebook, you will learn how to:
1. Read a CSV file with a specific encoding.
2. Handle mixed delimiters in a CSV file.
3. Use Python's built-in tools (the `open()` function and the `csv` module) to process the file.

---

## **Scenario**
You are working with a CSV file that contains data about public transportation usage. The file:
- Uses **UTF-8 encoding**.
- Contains **mixed delimiters** (commas and semicolons).
- Includes a header row.
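
To see why the encoding matters, here is a minimal sketch of what happens when UTF-8 bytes are decoded with the wrong codec. The station name is a hypothetical example and is not taken from `public_transport.csv`.

```python
# Hypothetical station name containing a non-ASCII character
raw = "Zürich Hauptbahnhof".encode("utf-8")

# Decoding with the matching codec restores the original text
correct = raw.decode("utf-8")

# Decoding with a different codec (Latin-1 here) silently garbles it
garbled = raw.decode("latin-1")

print(correct)
print(garbled)
```

Passing `encoding='utf-8'` to `open()` avoids depending on the platform's default encoding.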

Your task is to:
1. Read the file correctly.
2. Extract specific information.
3. Perform basic analysis on the data.

---

## **Dataset: public_transport.csv**

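**File Contents (columns):**

- Station (the station name)
- Daily Ridership (the number of riders per day)
- City

A comma separates the station from the daily ridership, and a semicolon separates the daily ridership from the city.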

## **Part 1: Reading the CSV File**

### Task:
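1. Open the file `data/public_transport.csv` using the `open()` function inside a `with` statement.
2. Use the `csv.reader()` function to read the file.
3. Handle the mixed delimiters manually: `csv.reader()` splits each row on commas by default, so the second field still holds the ridership and city joined by a semicolon. Split that field with `str.split(';')`.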
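
Before touching the real file, the steps above can be sketched on an in-memory string. The sample rows below are hypothetical and only assume the layout described in this notebook; the actual file may contain different values.

```python
import csv
import io

# Hypothetical sample data that mimics the mixed-delimiter layout
sample = "Station,Daily_Ridership;City\nCentral,12000;Springfield\n"

# csv.reader splits on commas only, so each row has two fields
rows = list(csv.reader(io.StringIO(sample)))
header, first = rows
print(first)

# The semicolon is still inside the second field, so split it by hand
ridership, city = first[1].split(';')
print(ridership, city)
```

Note that `ridership` is still a string at this point; it needs `int()` before any arithmetic.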


In [None]:
# Import the csv module from the standard library

import csv

In [None]:
# Open the CSV file with UTF-8 encoding
with open('data/public_transport.csv', 'r', encoding='utf-8') as file:
    reader = csv.reader(file)

    # Iterate through the rows (the header row is printed too, since it is not skipped here)
    for row in reader:
        # The first column is the station name
        station = row[0]

        # The second column contains "Daily_Ridership;City", so we split it on the semicolon
        ridership, city = row[1].split(';')

        # Print the extracted data
        print(f"Station: {station}, Daily Ridership: {ridership}, City: {city}")

## **Part 2: Extracting Specific Information**

### Task:
- Skip the header row.
- Print only the station names and their corresponding cities.
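
One way to keep the splitting logic in a single place is a small helper that turns a raw row into named fields. This is only a sketch: `StationRecord` and `parse_row` are hypothetical names introduced for illustration, and the sample row is made up.

```python
from typing import NamedTuple


class StationRecord(NamedTuple):
    """Hypothetical container for one parsed row."""
    station: str
    ridership: int
    city: str


def parse_row(row):
    """Convert a two-field row such as ['Central', '12000;Springfield']."""
    station = row[0].strip()
    # Split only on the first semicolon, in case a city name contains one
    ridership_text, city = row[1].split(';', 1)
    return StationRecord(station, int(ridership_text), city.strip())


# Hypothetical row, shaped like the output of csv.reader for this file
record = parse_row(['Central', '12000;Springfield'])
print(f"Station: {record.station}, City: {record.city}")
```

The helper must only be called on data rows, because `int()` would raise a `ValueError` on the header text.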


In [None]:
# Open the CSV file with UTF-8 encoding
with open('data/public_transport.csv', 'r', encoding='utf-8') as file:
    reader = csv.reader(file)

    # Skip the header row
    next(reader)

    # Print station names and cities
    for row in reader:
        station = row[0]  # The first column is the station name
        city = row[1].split(';')[1]  # Split the second column and take the city
        print(f"Station: {station}, City: {city}")

## **Part 3: Basic Analysis**

### Task:
- Count the total number of stations.
- Calculate the total daily ridership across all stations.
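
As a sanity check for the counting logic, the same calculation can be sketched on a few in-memory rows using a list comprehension with `len()` and `sum()`. The stations and numbers below are hypothetical and are not taken from the real dataset.

```python
import csv
import io

# Hypothetical sample data in the same mixed-delimiter layout
sample = (
    "Station,Daily_Ridership;City\n"
    "Central,12000;Springfield\n"
    "Harbor,8500;Springfield\n"
    "Airport,4300;Shelbyville\n"
)

reader = csv.reader(io.StringIO(sample))
next(reader)  # Skip the header row

# Take the part before the semicolon and convert it to an integer
riderships = [int(row[1].split(';')[0]) for row in reader]

station_count = len(riderships)
total_ridership = sum(riderships)

print("Total Stations:", station_count)
print("Total Daily Ridership:", total_ridership)
```

Collecting the values in a list first makes it easy to add other statistics later, such as `max(riderships)`.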


In [None]:
# Initialize variables
station_count = 0
total_ridership = 0

# Open the CSV file with UTF-8 encoding
with open('data/public_transport.csv', 'r', encoding='utf-8') as file:
    reader = csv.reader(file)

    # Skip the header row
    next(reader)

    # Iterate through the rows
    for row in reader:
        station_count += 1
        # The ridership is the part of the second column before the semicolon
        ridership = int(row[1].split(';')[0])  # Convert ridership to an integer
        total_ridership += ridership

# Print the results
print("Total Stations:", station_count)
print("Total Daily Ridership:", total_ridership)