# Introduction to CSV and Pandas
This notebook shows how to read a CSV file in two ways:
1. **Using Python's built-in `csv` module**
2. **Using the `pandas` library**
We'll also see how to examine the data once it's loaded.

In [None]:
# Reading data using Python's built-in csv module
import csv

# newline='' lets the csv module handle line endings itself, as its documentation recommends
with open('students.csv', 'r', newline='') as f:
    reader = csv.DictReader(f)  # iterator that yields one dict per row, keyed by the header names
    data_csv = list(reader)

print('First record using csv module:', data_csv[0])
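
One detail worth keeping in mind with `csv.DictReader` is that every value comes back as a string. The sketch below illustrates this with a small in-memory sample; the column names `Name`, `Age`, and `Grade` are assumptions made for illustration and may not match the real `students.csv`.

```python
import csv
import io

# Hypothetical sample standing in for students.csv; the columns are assumed.
sample = io.StringIO("Name,Age,Grade\nAlice,20,88.5\nBob,22,92.0\n")

rows = list(csv.DictReader(sample))

# Every value is a string, even the ones that look numeric.
print(rows[0])

# Numeric fields have to be converted explicitly before doing arithmetic.
typed_rows = [
    {"Name": r["Name"], "Age": int(r["Age"]), "Grade": float(r["Grade"])}
    for r in rows
]
average_grade = sum(r["Grade"] for r in typed_rows) / len(typed_rows)
print(average_grade)
```

This manual conversion step is one of the things pandas handles automatically when it infers column types.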

## Reading Data with Pandas
`pandas` makes it easier to work with tabular data and offers many helpful methods.

In [None]:
# Reading data using pandas
import pandas as pd

df = pd.read_csv("students.csv")
print('First 5 records using pandas:')
df.head()  # head() returns the first 5 rows by default
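
Beyond `head()`, a few other calls are commonly used to examine a freshly loaded DataFrame. This is a minimal sketch on an in-memory sample; the columns `Name`, `Age`, and `Grade` are hypothetical and stand in for whatever `students.csv` actually contains.

```python
import io

import pandas as pd

# Hypothetical sample standing in for students.csv; the columns are assumed.
sample = io.StringIO("Name,Age,Grade\nAlice,20,88.5\nBob,22,92.0\nCara,21,79.0\n")
students = pd.read_csv(sample)

print(students.shape)       # (number of rows, number of columns)
print(students.dtypes)      # the type pandas inferred for each column
print(students.describe())  # summary statistics for the numeric columns
```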

In [None]:
# Reading a specific column (returns a pandas Series)
df['Name']

In [None]:
# Accessing a specific row by integer position (0-based, so this is the third row)
df.iloc[2]

In [None]:
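# Accessing a row by index label with .loc
# With the default 0, 1, 2, ... index, label 2 is also the third row
df.loc[2]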
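
With the default index, `iloc[2]` and `loc[2]` return the same row, which can hide the difference between them. The sketch below uses a small hypothetical DataFrame with a shuffled index (the names and grades are invented for illustration) to show where the two diverge.

```python
import pandas as pd

# Hypothetical data with a non-default index, so positions and labels differ.
scores = pd.DataFrame(
    {"Name": ["Alice", "Bob", "Cara"], "Grade": [88.5, 92.0, 79.0]},
    index=[2, 0, 1],
)

by_position = scores.iloc[2]  # third row in storage order
by_label = scores.loc[2]      # row whose index label is 2

print(by_position["Name"])
print(by_label["Name"])
```

Here `iloc[2]` selects Cara's row (the third one stored), while `loc[2]` selects Alice's row (the one labeled 2).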

## Handling Inconsistent Lines in the Kaggle Olympics Dataset

Here is an example of inconsistent lines in the dataset:

- **Normal Lines**:
    ```
    M,110M Hurdles Men,Rio,2016,S,Orlando ORTEGA,ESP,13.17
    M,110M Hurdles Men,Rio,2016,B,Dimitri BASCOU,FRA,13.24
    ```

- **Inconsistent Lines**:
    ```
    M,110M Hurdles Men,Beijing,2008,G,Dayron ROBLES,CUB,12.93,+0.1
    M,110M Hurdles Men,Beijing,2008,S,David PAYNE,USA,13.17,+0.1
    M,110M Hurdles Men,Beijing,2008,B,David OLIVER,USA,13.18,+0.1
    ```

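The inconsistent lines have an extra ninth field at the end (e.g., `+0.1`), so they carry more fields than the eight-field format of the other rows. By default, `pd.read_csv` raises a `ParserError` when it meets a line with too many fields; passing `on_bad_lines="skip"` tells pandas to drop those lines instead.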

In [None]:
# Skip lines with too many fields instead of raising an error (on_bad_lines requires pandas 1.3+)
olympics_df = pd.read_csv("results.csv", on_bad_lines="skip")
olympics_df.head()
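
Skipping bad lines silently discards data, so it can help to see exactly what is dropped and what the alternative looks like. The sketch below rebuilds a tiny version of the file in memory from the example lines above; the header names are assumptions, since the real header of `results.csv` is not shown here. It also tries passing a callable to `on_bad_lines`, which pandas supports from version 1.4 and only with `engine="python"`. The helper `keep_first_eight` is a hypothetical name.

```python
import io

import pandas as pd

# Header names are assumed for illustration; the data lines are the examples above.
raw = (
    "Gender,Event,Location,Year,Medal,Name,Nationality,Result\n"
    "M,110M Hurdles Men,Rio,2016,S,Orlando ORTEGA,ESP,13.17\n"
    "M,110M Hurdles Men,Rio,2016,B,Dimitri BASCOU,FRA,13.24\n"
    "M,110M Hurdles Men,Beijing,2008,G,Dayron ROBLES,CUB,12.93,+0.1\n"
)

# Option 1: drop any line that has more fields than the header.
skipped = pd.read_csv(io.StringIO(raw), on_bad_lines="skip")
print(len(skipped))


def keep_first_eight(fields):
    """Hypothetical handler: keep the first eight fields of an over-long line."""
    return fields[:8]


# Option 2: repair over-long lines instead of dropping them.
# A callable for on_bad_lines requires the Python parsing engine.
trimmed = pd.read_csv(
    io.StringIO(raw), on_bad_lines=keep_first_eight, engine="python"
)
print(len(trimmed))
```

With `"skip"` the Beijing row is lost, while the callable keeps it and discards only the trailing `+0.1` field. Which behavior is appropriate depends on whether the extra field matters for the analysis.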

As these examples show, pandas is concise and powerful: we can access columns and rows easily and apply many transformations to the data.

_End of notebook_