# Pandas Indexing and Slicing Exercises

## Indexing Exercises

**Exercise 1 (Easy):** Create a DataFrame with columns 'Name', 'Age', and 'Score'. Use `.loc[]` to select a single row by index label. Return all of Bob's information.

In [1]:
import pandas as pd

# Exercise 1: Basic .loc[] indexing
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'Score': [88, 92, 78]
})

In [5]:
# Your code here...
df.loc[1]

Name     Bob
Age       30
Score     92
Name: 1, dtype: object
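With the default integer index, the label `1` and the position `1` happen to point at the same row, which can hide the difference between `.loc[]` and `.iloc[]`. A minimal sketch of where they diverge, assuming a hypothetical copy `df_named` (not part of the exercise) that uses the names as index labels:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'Score': [88, 92, 78]
})

# Hypothetical variant for illustration: use the names as index labels
df_named = df.set_index('Name')

by_label = df_named.loc['Bob']    # label-based lookup
by_position = df_named.iloc[1]    # position-based lookup of the same row

print(by_label.equals(by_position))
```

Here `.loc['Bob']` keeps working if the rows are reordered, while `.iloc[1]` always returns whatever row sits in second place.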

**Exercise 2 (Easy):** Using `.iloc[]`, select the element at row 0, column 2 from the DataFrame above. What score does Alice have?

In [11]:
# your code here...
alice_score = df.iloc[0, 2]  # row 0, column 2 ('Score')
print(f"Alice's score is {alice_score}.")

Alice's score is 88.
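For a single scalar like this, pandas also provides the `.iat[]` (position) and `.at[]` (label) accessors. The sketch below rebuilds the Exercise 1 DataFrame and reads the same cell both ways; the variable names are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'Score': [88, 92, 78]
})

# Scalar access by integer position: row 0, column 2
score_by_position = df.iat[0, 2]

# Scalar access by label: index label 0, column 'Score'
score_by_label = df.at[0, 'Score']

print(score_by_position, score_by_label)
```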


**Exercise 3 (Medium):** Create a DataFrame with a custom string index (e.g., 'row_a', 'row_b', 'row_c') and columns 'X', 'Y', 'Z'. Use `.loc[]` to select multiple rows by index labels. Return rows 'row_a' and 'row_c'.

In [18]:
# Exercise 3: Multiple row selection with .loc[]
df3 = pd.DataFrame({
    'X': [10, 20, 30],
    'Y': [100, 200, 300],
    'Z': [1000, 2000, 3000]
}, index=['row_a', 'row_b', 'row_c'])


In [19]:
# your code here...
df3.loc[['row_a', 'row_c']]

        X    Y     Z
row_a  10  100  1000
row_c  30  300  3000


**Exercise 4 (Medium):** Given a DataFrame with numerical data, use boolean indexing to select all rows where the 'Age' column value is greater than 28. How many rows match this condition?

In [20]:
# Exercise 4: Boolean indexing
df4 = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'Diana'],
    'Age': [25, 30, 35, 28],
    'Score': [88, 92, 78, 95]
})

In [22]:
# your code here...
older = df4[df4['Age'] > 28]
print(f"There are {len(older)} rows with Age > 28.")

There are 2 rows with Age > 28.
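The count can also be read straight from the boolean mask, since `True` sums as 1. The sketch below rebuilds the Exercise 4 DataFrame, counts the matches, and shows how a second condition could be combined with `&`. The extra `Score > 90` condition is an illustrative assumption, not part of the exercise.

```python
import pandas as pd

df4 = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'Diana'],
    'Age': [25, 30, 35, 28],
    'Score': [88, 92, 78, 95]
})

mask = df4['Age'] > 28          # boolean Series, one entry per row
n_matches = int(mask.sum())     # True counts as 1, False as 0

# Each condition needs its own parentheses when combined with & or |
both = df4[(df4['Age'] > 28) & (df4['Score'] > 90)]

print(n_matches)
print(both['Name'].tolist())
```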


**Exercise 5 (Challenging):** Create a multi-indexed DataFrame (hierarchical index with 'Department' and 'Employee' levels). Use `.loc[]` with multiple index levels to select all rows from a specific department, then filter for scores above 85. Return the result.

In [24]:
# Exercise 5: Multi-level indexing with .loc[]
index = pd.MultiIndex.from_tuples([
    ('Sales', 'Alice'),
    ('Sales', 'Bob'),
    ('HR', 'Charlie'),
    ('HR', 'Diana'),
    ('IT', 'Eve')
], names=['Department', 'Employee'])

df5 = pd.DataFrame({
    'Score': [88, 92, 78, 95, 89]
}, index=index)


In [None]:
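# your code here...
idx = pd.IndexSlice
# Select every HR row, then build the mask from the sliced frame itself
# so the boolean filter shares its index with the rows being filtered.
df5.loc[idx['HR', :], :].loc[lambda d: d['Score'] > 85]
# I used ChatGPT for this one; it kept giving me a boolean alignment error.
# Defining idx as pd.IndexSlice fixed it.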

                     Score
Department Employee
HR         Diana        95
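Two other ways to reach the same rows are sketched below: `DataFrame.xs` with `drop_level=False` to keep both index levels, and `DataFrame.query`, which can refer to named index levels. This is one possible approach, not the only correct answer; the names `hr_rows`, `via_xs`, and `via_query` are illustrative.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples([
    ('Sales', 'Alice'),
    ('Sales', 'Bob'),
    ('HR', 'Charlie'),
    ('HR', 'Diana'),
    ('IT', 'Eve')
], names=['Department', 'Employee'])

df5 = pd.DataFrame({'Score': [88, 92, 78, 95, 89]}, index=index)

# Cross-section on the 'Department' level, keeping that level in the result
hr_rows = df5.xs('HR', level='Department', drop_level=False)
via_xs = hr_rows[hr_rows['Score'] > 85]

# query() can reference named index levels alongside columns
via_query = df5.query("Department == 'HR' and Score > 85")

print(via_xs.index.tolist())
```

Filtering `hr_rows` with a mask built from `hr_rows` itself means the mask and the data always share the same index.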


## Slicing Exercises

**Exercise 1 (Easy):** Create a DataFrame with 10 rows. Use `.iloc[]` to slice and return rows 2 through 5 (inclusive). What are the first and last row indices returned?

In [28]:
# Exercise 1: Basic row slicing with .iloc[]
df_slice1 = pd.DataFrame({
    'A': range(10, 20),
    'B': range(100, 110),
    'C': range(1000, 1010)
})


In [29]:
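# your code here...
# The .iloc end position is exclusive, so 2:6 returns rows 2 through 5.
# The first row index returned is 2 and the last is 5.
df_slice1.iloc[2:6]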

    A    B     C
2  12  102  1002
3  13  103  1003
4  14  104  1004
5  15  105  1005
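The stop value differs between the two accessors: `.iloc[]` slices exclude the end position, while `.loc[]` slices include the end label. A small sketch on a one-column version of the DataFrame above (reduced for brevity) shows both forms returning the same four rows:

```python
import pandas as pd

df_slice1 = pd.DataFrame({'A': range(10, 20)})

by_position = df_slice1.iloc[2:6]   # positions 2, 3, 4, 5 (6 is excluded)
by_label = df_slice1.loc[2:5]       # labels 2 through 5 (5 is included)

print(by_position.index.tolist())
print(by_position.equals(by_label))
```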


**Exercise 2 (Easy):** Using the DataFrame from Exercise 1, slice to select columns 'A' and 'C' for all rows. Use `.iloc[]` with column indices.

In [32]:
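# your code here...
# Columns 'A' and 'C' sit at positions 0 and 2.
# (A step slice, .iloc[:, ::2], gives the same result here.)
df_slice1.iloc[:, [0, 2]]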

    A     C
0  10  1000
1  11  1001
2  12  1002
3  13  1003
4  14  1004
5  15  1005
6  16  1006
7  17  1007
8  18  1008
9  19  1009
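When the column positions are not known in advance, they can be looked up from the names with `Index.get_indexer` and then passed to `.iloc[]`. A minimal sketch, assuming the same `df_slice1` as above (`cols` and `selected` are illustrative names):

```python
import pandas as pd

df_slice1 = pd.DataFrame({
    'A': range(10, 20),
    'B': range(100, 110),
    'C': range(1000, 1010)
})

# Translate column names into integer positions
cols = df_slice1.columns.get_indexer(['A', 'C'])

# Select all rows and only those column positions
selected = df_slice1.iloc[:, cols]

print(list(cols))
print(selected.columns.tolist())
```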


**Exercise 3 (Medium):** Create a DataFrame with a DatetimeIndex. Use `.loc[]` to slice data between two date strings (e.g., '2024-01-10' to '2024-01-20'). Return all rows in that date range.

In [33]:
# Exercise 3: Date range slicing with .loc[]
date_index = pd.date_range('2024-01-01', periods=31, freq='D')
df_slice3 = pd.DataFrame({
    'Temperature': range(20, 51),
    'Humidity': range(30, 61)
}, index=date_index)


In [34]:
# your code here...
# Both endpoints are included when slicing a DatetimeIndex by label.
df_slice3.loc['2024-01-10':'2024-01-20']

            Temperature  Humidity
2024-01-10           29        39
2024-01-11           30        40
2024-01-12           31        41
2024-01-13           32        42
2024-01-14           33        43
2024-01-15           34        44
2024-01-16           35        45
2024-01-17           36        46
2024-01-18           37        47
2024-01-19           38        48
2024-01-20           39        49
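Date-string slices with `.loc[]` include both endpoints, and a partial date string such as `'2024-01'` selects every row in that period. The sketch below rebuilds the DataFrame from this exercise to check the row counts; `window` and `whole_month` are illustrative names.

```python
import pandas as pd

date_index = pd.date_range('2024-01-01', periods=31, freq='D')
df_slice3 = pd.DataFrame({
    'Temperature': range(20, 51),
    'Humidity': range(30, 61)
}, index=date_index)

# Label slice on a DatetimeIndex: both the start and end dates are included
window = df_slice3.loc['2024-01-10':'2024-01-20']

# Partial string indexing: every row whose timestamp falls in January 2024
whole_month = df_slice3.loc['2024-01']

print(len(window), len(whole_month))
```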


**Exercise 4 (Medium):** Create a DataFrame and use `.iloc[]` to select a rectangular region: rows 3 to 7 and columns 1 to 3. Return the shape and values of this sub-DataFrame.

In [35]:
# Exercise 4: 2D rectangular slicing with .iloc[]
df_slice4 = pd.DataFrame({
    'Col0': range(100, 110),
    'Col1': range(200, 210),
    'Col2': range(300, 310),
    'Col3': range(400, 410),
    'Col4': range(500, 510)
})


In [36]:
# your code here...
sub = df_slice4.iloc[3:8, 1:4]  # rows 3-7, columns 1-3 (end positions are exclusive)
print(sub.shape)
sub

(5, 3)
   Col1  Col2  Col3
3   203   303   403
4   204   304   404
5   205   305   405
6   206   306   406
7   207   307   407


**Exercise 5 (Challenging):** Create a DataFrame with a string index. Use `.loc[]` to slice between two index labels (e.g., from 'item_3' to 'item_8'), then apply a column filter to keep only numeric values above a threshold. Return the filtered result.

In [37]:
# Exercise 5: Complex slicing with label ranges and filtering
df_slice5 = pd.DataFrame({
    'Price': [15, 25, 35, 45, 55, 65, 75, 85, 95, 105],
    'Quantity': [10, 20, 30, 40, 50, 60, 70, 80, 90, 100],
    'Status': ['In Stock', 'Out', 'In Stock', 'Out', 'In Stock', 'Out', 'In Stock', 'Out', 'In Stock', 'Out']
}, index=[f'item_{i}' for i in range(1, 11)])


In [None]:
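# your code here...
# The lambda builds the mask from the sliced rows, so the mask and the data share one index.
df_slice5.loc['item_2':'item_9'].loc[lambda d: d['Price'] > 50]
# Chaining a second .loc (a trick I learned from ChatGPT) keeps this to one line.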

        Price  Quantity    Status
item_5     55        50  In Stock
item_6     65        60       Out
item_7     75        70  In Stock
item_8     85        80       Out
item_9     95        90  In Stock
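A two-step version of the same idea, using the `'item_3'` to `'item_8'` range suggested in the exercise, is sketched below. The labels here are unique, so the `.loc[]` slice follows the row order of the index rather than alphabetical order, and the end label is included. The threshold of 50 is an assumption carried over from the answer above.

```python
import pandas as pd

df_slice5 = pd.DataFrame({
    'Price': [15, 25, 35, 45, 55, 65, 75, 85, 95, 105],
    'Quantity': [10, 20, 30, 40, 50, 60, 70, 80, 90, 100],
    'Status': ['In Stock', 'Out'] * 5
}, index=[f'item_{i}' for i in range(1, 11)])

# Step 1: label slice, inclusive of both 'item_3' and 'item_8'
subset = df_slice5.loc['item_3':'item_8']

# Step 2: filter using a mask built from the subset itself
filtered = subset[subset['Price'] > 50]

print(filtered.index.tolist())
```

Building the mask from `subset` rather than from `df_slice5` avoids relying on pandas to align a longer mask against the shorter slice.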
