# Day 16 - Data Selection in Pandas

---

## Objectives
- Learn different ways to select data in Pandas
- Understand indexing, slicing, and filtering
- Use `.loc[]`, `.iloc[]`, and boolean masks
- Work with conditional selection and multiple conditions
- Explore `.isin()` and `.between()` for filtering

---

## 1. Selecting Columns


In [None]:
import pandas as pd

# Load dataset
df = pd.read_csv('datasets/employee_data.csv')

# Single column (Series)
ages = df['Age']
print("Single Column (Age):")
print(ages.head())

# Multiple columns (DataFrame)
subset = df[['Name', 'City', 'Salary']]
print("\nMultiple Columns (Name, City, Salary):")
print(subset.head())
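
The bracket style decides the return type: single brackets give a Series, while a list of column names gives a DataFrame, even when the list has one name. A minimal sketch of that difference, using a small made-up DataFrame (the rows are hypothetical, not taken from `employee_data.csv`):

```python
import pandas as pd

# Hypothetical sample rows standing in for employee_data.csv
df = pd.DataFrame({
    "Name": ["Ana", "Ben", "Cara"],
    "Age": [28, 35, 41],
    "City": ["NY", "LA", "Chicago"],
    "Salary": [48000, 52000, 61000],
})

age_series = df["Age"]    # single brackets: one-dimensional Series
age_frame = df[["Age"]]   # list of names: two-dimensional DataFrame

print(type(age_series).__name__, age_series.shape)
print(type(age_frame).__name__, age_frame.shape)
```

The one-column DataFrame form is handy when a later step expects two-dimensional input.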


## 2. Selecting Rows by Position or Label


In [None]:
# Using .iloc[] (integer position)

# First row
print("First row:")
print(df.iloc[0])

# First 5 rows (positions 0 through 4; the end of an .iloc slice is excluded)
print("\nFirst 5 rows:")
print(df.iloc[0:5])

# Last 3 rows
print("\nLast 3 rows:")
print(df.iloc[-3:])


In [None]:
# Using .loc[] (label-based)

# Row whose index label is 2
print("Row with index 2:")
print(df.loc[2])

# Rows labeled 0 through 4 (a .loc slice includes both endpoints) and two columns
print("\nRows 0 to 4, columns Name and Salary:")
print(df.loc[0:4, ['Name', 'Salary']])
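
With the default integer index, `.iloc[]` and `.loc[]` can look interchangeable. The difference shows up once the index labels stop matching the row positions. The sketch below assumes a hypothetical DataFrame with labels 10, 20, 30, 40 to make that visible:

```python
import pandas as pd

# Hypothetical data with a non-default index so labels differ from positions
df = pd.DataFrame(
    {"Name": ["Ana", "Ben", "Cara", "Dev"],
     "Salary": [48000, 52000, 61000, 45000]},
    index=[10, 20, 30, 40],
)

by_position = df.iloc[0:2]   # positions 0 and 1; the end position is excluded
by_label = df.loc[10:30]     # labels 10, 20 and 30; the end label is included

print(len(by_position))
print(len(by_label))

# The same row, reached by position and by label
print(df.iloc[0]["Name"], df.loc[10, "Name"])
```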


## 3. Boolean Indexing (Conditional Selection)


In [None]:
# Select employees with Age > 30
older_employees = df[df['Age'] > 30]
print("Employees Age > 30:")
print(older_employees.head())

# Multiple conditions: Age > 30 and Salary > 50000
# Combine masks with & and wrap each condition in parentheses
filtered = df[(df['Age'] > 30) & (df['Salary'] > 50000)]
print("\nEmployees Age > 30 and Salary > 50000:")
print(filtered.head())
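
Boolean masks can also be combined with `|` (or) and inverted with `~` (not). Python's `and`, `or`, and `not` keywords do not work element-wise on a Series, so the operators are needed here. A short sketch on hypothetical rows:

```python
import pandas as pd

# Hypothetical sample rows
df = pd.DataFrame({
    "Name": ["Ana", "Ben", "Cara", "Dev"],
    "Age": [28, 35, 41, 25],
    "Salary": [48000, 52000, 45000, 55000],
})

# OR: rows where at least one condition holds
either = df[(df["Age"] > 30) | (df["Salary"] > 50000)]

# NOT: rows where the condition does not hold (here, Age <= 30)
not_older = df[~(df["Age"] > 30)]

print(either["Name"].tolist())
print(not_older["Name"].tolist())
```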


## 4. Using .isin() and .between()


In [None]:
# Employees in specific cities
city_employees = df[df['City'].isin(['NY', 'LA'])]
print("Employees in NY or LA:")
print(city_employees.head())

# Employees with Salary between 40000 and 60000 (both bounds included by default)
salary_range = df[df['Salary'].between(40000, 60000)]
print("\nEmployees with Salary between 40000 and 60000:")
print(salary_range.head())
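
Two variations are often useful: negating `.isin()` to exclude values, and changing which bounds `.between()` includes. The sketch below uses hypothetical rows and assumes pandas 1.3 or later, where `inclusive` accepts the strings `"both"`, `"neither"`, `"left"`, and `"right"`:

```python
import pandas as pd

# Hypothetical sample rows
df = pd.DataFrame({
    "Name": ["Ana", "Ben", "Cara", "Dev"],
    "City": ["NY", "LA", "Chicago", "NY"],
    "Salary": [40000, 50000, 60000, 70000],
})

# Default: both 40000 and 60000 count as inside the range
closed = df[df["Salary"].between(40000, 60000)]

# Strictly between the bounds (string values assume pandas >= 1.3)
strict = df[df["Salary"].between(40000, 60000, inclusive="neither")]

# Everyone who is NOT in NY or LA
elsewhere = df[~df["City"].isin(["NY", "LA"])]

print(closed["Name"].tolist())
print(strict["Name"].tolist())
print(elsewhere["Name"].tolist())
```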


## 5. Selecting Specific Rows & Columns


In [None]:
# Filter rows with a boolean mask and pick columns in a single .loc call
result = df.loc[df['Age'] > 30, ['Name', 'City', 'Salary']]
print("Name, City, Salary for employees Age > 30:")
print(result.head())
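
The same selection can be written as two chained steps, `df[mask][columns]`. For reading data both forms return the same rows; the single `.loc[]` call is usually preferred because it does the row and column selection in one operation. A small comparison on hypothetical rows:

```python
import pandas as pd

# Hypothetical sample rows
df = pd.DataFrame({
    "Name": ["Ana", "Ben", "Cara"],
    "Age": [28, 35, 41],
    "Salary": [48000, 52000, 61000],
})

mask = df["Age"] > 30

one_step = df.loc[mask, ["Name", "Salary"]]   # rows and columns together
two_step = df[mask][["Name", "Salary"]]       # filter rows, then pick columns

print(one_step.equals(two_step))
print(one_step["Name"].tolist())
```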


## 6. Resetting and Setting Index


In [None]:
# Set 'EmployeeID' as index (inplace=True modifies df directly and returns None)
df.set_index('EmployeeID', inplace=True)
print("Set EmployeeID as index:")
print(df.head())

# Reset index to default integer index (EmployeeID becomes a regular column again)
df.reset_index(inplace=True)
print("\nReset index to default:")
print(df.head())
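
Without `inplace=True`, `set_index()` and `reset_index()` return a new DataFrame and leave the original untouched, which makes cells safer to re-run. The sketch below, on hypothetical rows, also shows two common follow-ups: looking up a row by its `EmployeeID` label, and renumbering rows after a filter with `reset_index(drop=True)`:

```python
import pandas as pd

# Hypothetical sample rows
df = pd.DataFrame({
    "EmployeeID": [101, 102, 103],
    "Name": ["Ana", "Ben", "Cara"],
    "Age": [28, 35, 41],
})

# Returns a new DataFrame; df still has EmployeeID as a column
indexed = df.set_index("EmployeeID")
print(indexed.loc[102, "Name"])

# Filtering keeps the old row labels (1 and 2 here);
# drop=True discards them instead of adding them as a column
older = df[df["Age"] > 30].reset_index(drop=True)
print(older.index.tolist())
```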


## 7. Practice Exercises
1. Select all employees with Age < 35.  
2. Select employees from 'Chicago' or 'LA'.  
3. Filter employees with Salary > 50000 and Age < 40.  
4. Select only 'Name' and 'Salary' columns for employees Age > 30.  
5. Reset index after setting 'EmployeeID' as index.  

---

## End of Day 16 notebook
