## 1. Warm-Up
**Discussion:**
- What is data handling?
- Why do we need Pandas?
- How is Pandas different from NumPy?

**Quick Demo:**
We start by importing the pandas library and checking its version.

In [2]:
import pandas as pd
print(pd.__version__)  # Check Pandas version

2.2.1


## 2. Introduction to Pandas
### What is Pandas?
Pandas is a Python library for data manipulation and analysis, built on top of NumPy. It provides two primary data structures:
- **Series**: A one-dimensional labeled array.
- **DataFrame**: A two-dimensional labeled data structure, like a table or spreadsheet.

Let's start with a **Series** example.

In [None]:
data = [10, 20, 30, 40]
series = pd.Series(data)
print(series)
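
The "labeled" part of a Series is easier to see with explicit labels. The following is a minimal sketch, assuming hypothetical labels `a` to `d` (the example above uses the default integer index 0 to 3):

```python
import pandas as pd

# Hypothetical labels chosen for illustration only.
scores = pd.Series([10, 20, 30, 40], index=["a", "b", "c", "d"])

print(scores["b"])            # look up a value by its label
print(scores.iloc[0])         # look up a value by its position
print(scores.index.tolist())  # the labels themselves
```

Label-based access (`scores["b"]`) and position-based access (`scores.iloc[0]`) are separate, which is one way a Series differs from a plain list.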

### Creating a DataFrame
Imagine you're creating a simple table with information about people.

In [3]:
data = {'Name': ['Alice', 'Bob', 'Charlie'],
        'Age': [25, 30, 35],
        'City': ['NY', 'LA', 'Chicago']}
df = pd.DataFrame(data)
print(df)

      Name  Age     City
0    Alice   25       NY
1      Bob   30       LA
2  Charlie   35  Chicago
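
Once a DataFrame exists, a few attributes help confirm its structure. This sketch rebuilds the same table under the assumed name `people` and inspects it:

```python
import pandas as pd

people = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "City": ["NY", "LA", "Chicago"],
})

print(people.shape)             # (number of rows, number of columns)
print(people.columns.tolist())  # column labels
print(people["Age"])            # a single column comes back as a Series
print(people.loc[1, "City"])    # one cell, by row label and column name
```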


### Activity:
Create a Pandas Series and a DataFrame with student names and marks.

In [None]:
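# Your code here
student_data = {'Name': ['John', 'Emma', 'Ryan'],
                'Marks': [85, 90, 78]}
# Series: marks labeled by student name
marks = pd.Series(student_data['Marks'], index=student_data['Name'])
print(marks)
# DataFrame: one row per student
students_df = pd.DataFrame(student_data)
print(students_df)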

## 3. Reading and Writing CSV & Excel Files
Pandas makes it easy to read and write data files like CSV and Excel.

### Reading a CSV File
Make sure `data.csv` exists in your working directory. If it does not, `pd.read_csv` raises the `FileNotFoundError` shown in the output below.

In [4]:
df_csv = pd.read_csv('data.csv')
print(df_csv.head())

FileNotFoundError: [Errno 2] No such file or directory: 'data.csv'
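
To try `read_csv` without a file on disk, the CSV text can be passed through an in-memory buffer. This is a minimal sketch; the CSV content is hypothetical and only stands in for `data.csv`:

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for data.csv.
csv_text = """Name,Age,City
Alice,25,NY
Bob,30,LA
Charlie,35,Chicago
"""

# read_csv accepts a file path or any file-like object.
df_from_text = pd.read_csv(io.StringIO(csv_text))
print(df_from_text.head())
```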

### Reading an Excel File
Make sure `data.xlsx` exists in your working directory. Reading `.xlsx` files also requires the `openpyxl` package to be installed.

In [None]:
df_excel = pd.read_excel('data.xlsx', sheet_name='Sheet1')
print(df_excel.head())

### Writing to CSV and Excel Files

In [None]:
# index=False leaves the row index out of the saved file
df_csv.to_csv('output.csv', index=False)
df_excel.to_excel('output.xlsx', index=False, sheet_name='Results')
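
A quick way to check a write is to read the file back. The sketch below does a CSV round trip in a temporary directory so it leaves nothing behind; the `results` table and its values are hypothetical:

```python
import os
import tempfile

import pandas as pd

# Hypothetical data for illustration.
results = pd.DataFrame({"Name": ["Alice", "Bob"], "Age": [25, 30]})

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "output.csv")
    results.to_csv(path, index=False)  # write without the row index
    reloaded = pd.read_csv(path)       # read the same file back

print(reloaded)
```

Without `index=False`, the row index would be written as an extra unnamed column and would reappear as a column when the file is read back.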

### Activity:
Read a dataset from a CSV file and display its first few rows.

In [None]:
# Your code here
data = pd.read_csv('sample.csv')
print(data.head())

## 4. Data Cleaning and Preprocessing
### Handling Missing Values

In [None]:
# Pick one strategy: after dropna() there is nothing left for fillna() to fill.
df_dropped = df.dropna()  # Option 1: remove rows with missing values
df_filled = df.fillna(0)  # Option 2: fill missing values with 0
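
The two strategies give different results on the same data. This is a minimal sketch, assuming a hypothetical table `raw` with one missing age:

```python
import numpy as np
import pandas as pd

# Hypothetical data: Bob's age is missing.
raw = pd.DataFrame({"Name": ["Alice", "Bob", "Charlie"],
                    "Age": [25, np.nan, 35]})

print(raw.isna().sum())          # count missing values per column

dropped = raw.dropna()           # new frame without Bob's row
filled = raw.fillna({"Age": 0})  # new frame with the missing Age set to 0

print(dropped)
print(filled)
```

Both calls return a new DataFrame and leave `raw` unchanged, which makes it easy to compare the two before committing to one.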

### Removing Duplicate Data

In [None]:
df.drop_duplicates(inplace=True)
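
By default a row counts as a duplicate only if every column matches an earlier row. The sketch below uses a hypothetical `visits` table to show that, along with the `subset` parameter for comparing selected columns only:

```python
import pandas as pd

# Hypothetical data: the third row repeats the first exactly.
visits = pd.DataFrame({"Name": ["Alice", "Bob", "Alice", "Alice"],
                       "City": ["NY", "LA", "NY", "Chicago"]})

print(visits.duplicated())  # True for rows that repeat an earlier row

unique_rows = visits.drop_duplicates()                               # compare all columns
unique_names = visits.drop_duplicates(subset="Name", keep="first")  # compare Name only

print(unique_rows)
print(unique_names)
```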

### Renaming Columns

In [None]:
df.rename(columns={'OldName': 'NewName'}, inplace=True)
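
A small runnable version of the rename, using a hypothetical table that actually contains an `OldName` column:

```python
import pandas as pd

# Hypothetical table for illustration.
table = pd.DataFrame({"OldName": [1, 2], "Other": [3, 4]})

renamed = table.rename(columns={"OldName": "NewName"})

# Keys that match no column are ignored by default, so a typo fails silently.
unchanged = table.rename(columns={"Missing": "X"})

print(renamed.columns.tolist())
print(unchanged.columns.tolist())
```

Because unmatched keys are ignored, it is worth printing the column names after a rename to confirm it took effect.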

### Changing Column Data Types

In [None]:
df['Age'] = df['Age'].astype(int)  # Raises an error if 'Age' still contains missing values
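
A typical case is a numeric column that was loaded as text. The sketch below converts one, and also shows pandas' nullable `Int64` type as one possible option when the column has gaps, since a plain `astype(int)` fails on missing values. All data here is hypothetical:

```python
import pandas as pd

# Hypothetical column stored as strings.
ages = pd.DataFrame({"Age": ["25", "30", "35"]})
ages["Age"] = ages["Age"].astype(int)
print(ages.dtypes)

# A column with a gap is stored as float64; "Int64" keeps integers and marks the gap as <NA>.
with_gap = pd.Series([25, None, 35])
nullable = with_gap.astype("Int64")
print(nullable)
```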

### Activity:
Clean a dataset by removing duplicates and handling missing values.

In [None]:
# Your code here
df = pd.read_csv('unclean_data.csv')
df.drop_duplicates(inplace=True)
df.fillna(0, inplace=True)
print(df.head())

## 5. Data Analysis with Pandas
### Filtering Data

In [None]:
adults = df[df['Age'] >= 18]  # Keep rows where Age is at least 18
print(adults)
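
Filtering works by building a True/False mask and using it to select rows. This sketch uses a hypothetical `people` table and treats "adult" as age 18 or older, which is an assumption; it also combines two conditions:

```python
import pandas as pd

# Hypothetical data for illustration.
people = pd.DataFrame({"Name": ["Alice", "Bob", "Dana"],
                       "Age": [25, 17, 40],
                       "City": ["NY", "LA", "LA"]})

mask = people["Age"] >= 18  # one True/False value per row
adults = people[mask]       # keep only the rows where the mask is True

# Combine conditions with & (and) or | (or); each condition needs its own parentheses.
ny_adults = people[(people["Age"] >= 18) & (people["City"] == "NY")]

print(adults)
print(ny_adults)
```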

### Sorting Data

In [None]:
df.sort_values(by='Age', ascending=False, inplace=True)
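
Sorting reorders the rows but keeps each row's original index label. The sketch below, on hypothetical data, shows the sorted result and one way to renumber it:

```python
import pandas as pd

# Hypothetical data for illustration.
people = pd.DataFrame({"Name": ["Alice", "Bob", "Charlie"],
                       "Age": [30, 25, 35]})

oldest_first = people.sort_values(by="Age", ascending=False)
print(oldest_first)  # the index labels follow their rows

renumbered = oldest_first.reset_index(drop=True)  # renumber rows from 0
print(renumbered)
```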

### Aggregation Functions

In [None]:
print(df['Age'].mean())  # Average age
print(df['City'].value_counts())  # Count per city
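
The same two aggregations on a small hypothetical table, so the results can be checked by hand:

```python
import pandas as pd

# Hypothetical data for illustration.
people = pd.DataFrame({"Age": [25, 30, 35, 30],
                       "City": ["NY", "LA", "NY", "Chicago"]})

average_age = people["Age"].mean()            # (25 + 30 + 35 + 30) / 4
city_counts = people["City"].value_counts()   # rows per city, most frequent first

print(average_age)
print(city_counts)
```

`mean()` returns a single number, while `value_counts()` returns a Series whose index holds the distinct cities.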

### Activity:
Find the average age and count the number of people from each city in a dataset.

In [None]:
# Your code here
print(df['Age'].mean())
print(df['City'].value_counts())

## 6. Wrap-Up and Homework
### Recap:
- Basics of Pandas
- Reading and writing files
- Data cleaning and preprocessing
- Data analysis

### Homework:
1. Load a CSV file and display the first 10 rows.
2. Remove duplicate rows and handle missing values in a dataset.
3. Sort a dataset by age and find the total number of people from each city.

In [None]:
# Homework sample code
df_homework = pd.read_csv('homework.csv')
print(df_homework.head(10))
df_homework.drop_duplicates(inplace=True)
df_homework.fillna(0, inplace=True)
df_homework.sort_values(by='Age', inplace=True)
print(df_homework['City'].value_counts())