# 🐼 Pandas DataFrame Basics with Examples

This notebook will cover:
- What is a DataFrame?
- Creating DataFrames from different sources
- Indexing columns and rows
- Slicing rows and columns
- Filtering data based on conditions
- Multiple examples with clear comments
- Practice tasks for students
- Simple MCQs with answers

---

Let's start!

In [2]:
# Import pandas library
import pandas as pd

## 1️⃣ What is a DataFrame?

- A DataFrame is a two-dimensional, table-like data structure.
- It has rows and columns just like an Excel spreadsheet.
- Useful for storing and manipulating structured data.

Example: Think of a list of students with their names, ages, and cities.
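
As a rough sketch of that idea, the cell below builds a tiny table and inspects its rows and columns. The variable name `students` and the two sample records are assumptions made for illustration only.

```python
import pandas as pd

# Hypothetical student records: one dictionary key per column
students = pd.DataFrame({
    'Name': ['Alice', 'Bob'],
    'Age': [24, 27],
    'City': ['Delhi', 'Mumbai'],
})

print(students.shape)          # (number of rows, number of columns)
print(list(students.columns))  # column labels
print(students.dtypes)         # data type stored in each column
```

Each column holds one kind of data (text or numbers here), which is what makes the table easy to filter and compute on later.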

## 2️⃣ Creating DataFrames

**Example 1: Creating from a Python dictionary**

In [3]:
# Create DataFrame from dictionary
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [24, 27, 22, 32],
    'City': ['Delhi', 'Mumbai', 'Chennai', 'Kolkata']
}
df = pd.DataFrame(data)
print("DataFrame created from dictionary:")
print(df)

DataFrame created from dictionary:
      Name  Age     City
0    Alice   24    Delhi
1      Bob   27   Mumbai
2  Charlie   22  Chennai
3    David   32  Kolkata


**Example 2: Creating from a list of lists with column names**

In [4]:
# Data as list of lists
data_list = [
    ['Alice', 24, 'Delhi'],
    ['Bob', 27, 'Mumbai'],
    ['Charlie', 22, 'Chennai'],
    ['David', 32, 'Kolkata']
]
# Creating DataFrame with column names
df2 = pd.DataFrame(data_list, columns=['Name', 'Age', 'City'])
print("DataFrame created from list of lists:")
print(df2)

DataFrame created from list of lists:
      Name  Age     City
0    Alice   24    Delhi
1      Bob   27   Mumbai
2  Charlie   22  Chennai
3    David   32  Kolkata
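
Another common source is a list of dictionaries, where each dictionary is one row. The cell below is a minimal sketch of that approach. The names `records` and `df3` are hypothetical and only two rows are used to keep it short.

```python
import pandas as pd

# Each dictionary is one row; the keys become the column names
records = [
    {'Name': 'Alice', 'Age': 24, 'City': 'Delhi'},
    {'Name': 'Bob', 'Age': 27, 'City': 'Mumbai'},
]
df3 = pd.DataFrame(records)
print("DataFrame created from list of dictionaries:")
print(df3)
```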


## 3️⃣ Indexing Columns and Rows

### Accessing columns:
- Use `df['column_name']`, or `df.column_name` when the name is a valid Python identifier and does not clash with a DataFrame attribute or method
- Either form returns a pandas Series (a single column)

In [None]:
# Access 'Name' column
names = df['Name']  # returns Series
print("Names column:")
print(names)

# Alternative syntax (works only if the column name is a valid Python
# identifier, e.g. has no spaces, and does not clash with a DataFrame method)
print("Names using dot notation:")
print(df.Name)
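
To see where dot notation stops working, here is a small sketch. The DataFrame `df_demo` and its column names are made up for this illustration: one name contains a space and another matches the existing `DataFrame.count` method.

```python
import pandas as pd

# Hypothetical columns chosen to show the limits of dot notation
df_demo = pd.DataFrame({
    'Name': ['Alice', 'Bob'],
    'Age Group': ['Young', 'Young'],  # contains a space
    'count': [3, 5],                  # same name as the DataFrame.count method
})

# Bracket notation works for every column name
print(df_demo['Age Group'])
print(df_demo['count'])

# Dot notation returns the method here, not the column
print(callable(df_demo.count))  # True: this is the count() method
```

When in doubt, bracket notation is the safer choice because it always refers to the column.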

### Access multiple columns using a list of column names

In [None]:
# Select 'Name' and 'City' columns
name_city = df[['Name', 'City']]
print("Name and City columns:")
print(name_city)

### Accessing rows:
- Use `.iloc[]` for position-based indexing (0-based, slice end excluded, like lists)
- Use `.loc[]` for label-based indexing (uses index labels; both ends of a slice are included)

In [None]:
# Get first row by position
first_row = df.iloc[0]
print("First row using iloc:")
print(first_row)

# Get rows by label index (here default index 0,1,...)
rows_1_to_2 = df.loc[1:2]  # includes row 2
print("Rows with label 1 to 2 using loc:")
print(rows_1_to_2)
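
With the default index, positions and labels are the same numbers, so the difference between `.iloc` and `.loc` is easy to miss. The sketch below uses a made-up index of roll numbers (`101`, `102`, `103`) to separate the two. The name `df_labeled` and the labels are assumptions for illustration.

```python
import pandas as pd

# Hypothetical roll numbers used as index labels
df_labeled = pd.DataFrame(
    {'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [24, 27, 22]},
    index=[101, 102, 103],
)

print(df_labeled.iloc[0])       # first row, selected by position
print(df_labeled.loc[101])      # same row, selected by its label
print(df_labeled.loc[101:102])  # label slice: both ends included (2 rows)
print(df_labeled.iloc[0:2])     # position slice: end excluded (also 2 rows)

# Label 0 does not exist in this index, so .loc raises KeyError
try:
    df_labeled.loc[0]
except KeyError:
    print("No row has the label 0")
```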

## 4️⃣ Slicing DataFrames

- Like Python lists, you can slice rows by position with `df[start:stop]` (`stop` is excluded)
- To slice columns, use `.loc` with column names or `.iloc` with column positions

In [None]:
# First 3 rows
print("First 3 rows:")
print(df[:3])

# Last 2 rows
print("Last 2 rows:")
print(df[-2:])

### Slice specific rows and columns using `.loc` and `.iloc`

In [None]:
# Rows 1 to 3 (inclusive), columns 'Name' and 'Age'
print("Using loc for slicing rows and columns:")
print(df.loc[1:3, ['Name', 'Age']])

# Using iloc (position based): rows 0 to 2, columns 0 and 1
print("Using iloc for slicing rows and columns:")
print(df.iloc[0:3, 0:2])
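
Columns can also be sliced as a range of names instead of a list. The cell below is a small sketch of that, rebuilt on its own copy of the sample data (named `df_slice` here only so the cell runs by itself).

```python
import pandas as pd

df_slice = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [24, 27, 22, 32],
    'City': ['Delhi', 'Mumbai', 'Chennai', 'Kolkata'],
})

# Label slice on columns: 'Name' through 'Age', both included
name_to_age = df_slice.loc[:, 'Name':'Age']
print(name_to_age)

# Step slicing by position: every second row, all columns
every_other = df_slice.iloc[::2]
print(every_other)
```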

## 5️⃣ Filtering Data

- Extract rows where a condition is true.
- Put a boolean condition inside `df[...]`; pandas keeps the rows where the condition is `True`.

In [None]:
# Filter rows where Age > 25
age_above_25 = df[df['Age'] > 25]
print("Rows where Age > 25:")
print(age_above_25)

# Filter rows where City is 'Mumbai'
city_mumbai = df[df['City'] == 'Mumbai']
print("Rows where City is Mumbai:")
print(city_mumbai)
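
It can help to look at the condition on its own before using it as a filter. In this sketch the names `df_f` and `mask` are hypothetical; the data is the same sample table used above.

```python
import pandas as pd

df_f = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [24, 27, 22, 32],
    'City': ['Delhi', 'Mumbai', 'Chennai', 'Kolkata'],
})

# The comparison produces a boolean Series, one True/False per row
mask = df_f['Age'] > 25
print(mask)

# Passing the mask to [] keeps only the rows marked True
filtered = df_f[mask]
print(filtered)
```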

### Filtering with multiple conditions (AND / OR)

In [None]:
# AND condition: Age > 25 AND City is Mumbai
# (use & instead of `and`, and wrap each condition in parentheses)
age_and_city = df[(df['Age'] > 25) & (df['City'] == 'Mumbai')]
print("Rows where Age > 25 AND City is Mumbai:")
print(age_and_city)

# OR condition: Age < 23 OR City is Kolkata (use | instead of `or`)
age_or_city = df[(df['Age'] < 23) | (df['City'] == 'Kolkata')]
print("Rows where Age < 23 OR City is Kolkata:")
print(age_or_city)
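
Two more building blocks often appear alongside `&` and `|`: the `~` operator to negate a condition, and `Series.isin` to match any value from a list. The sketch below shows both. The names `df_c`, `not_mumbai`, and `in_cities` are made up for this example.

```python
import pandas as pd

df_c = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [24, 27, 22, 32],
    'City': ['Delhi', 'Mumbai', 'Chennai', 'Kolkata'],
})

# NOT condition: every row whose City is not Mumbai
not_mumbai = df_c[~(df_c['City'] == 'Mumbai')]
print(not_mumbai)

# Match any value from a list instead of chaining several | conditions
in_cities = df_c[df_c['City'].isin(['Delhi', 'Chennai'])]
print(in_cities)
```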

## 6️⃣ Bonus: Adding a new column

You can add new columns based on existing data.

In [None]:
# Adding a new column 'Age Group' based on Age
df['Age Group'] = ['Young' if age < 30 else 'Old' for age in df['Age']]
print("DataFrame with new 'Age Group' column:")
print(df)
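
The list comprehension above works well for small tables. As one possible alternative, the same column can be built without an explicit Python loop by using `numpy.where`. This is a sketch, not the notebook's original method; `df_g` is a hypothetical copy of the sample data.

```python
import numpy as np
import pandas as pd

df_g = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [24, 27, 22, 32],
})

# np.where(condition, value_if_true, value_if_false) works on the whole column
df_g['Age Group'] = np.where(df_g['Age'] < 30, 'Young', 'Old')
print(df_g)
```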

---
## 🎯 Student Practice Tasks
Try to solve these problems yourself:
1. Create a DataFrame with columns: `Product`, `Price`, `Quantity` for 4 products.
2. Select only the `Price` column.
3. Slice the DataFrame to show the first 2 rows.
4. Filter products where Price > 150.
5. Add a new column `Total Cost` = Price * Quantity.
6. Filter products with `Total Cost` greater than 300.

---

## ❓ MCQs (Multiple Choice Questions)

**Q1: How do you select the column 'Age' from the DataFrame?**
a) df.Age
b) df['Age']
c) Both a and b
d) None of the above

**Answer:** c) Both a and b

**Q2: Which indexer is used for position-based row selection?**
a) .loc
b) .iloc
c) []
d) .select

**Answer:** b) .iloc

**Q3: How do you filter rows where Age is greater than 30?**
a) df[df.Age > 30]
b) df[df['Age'] > 30]
c) Both a and b
d) None of the above

**Answer:** c) Both a and b
