# Session 14: Practice - Pandas Basics

Time to put your Pandas knowledge to the test! This practice session will help reinforce what you learned about creating DataFrames, exploring data, and working with files.

## Instructions

- Complete each exercise in the provided code cells
- Run your code to verify it works
- Some exercises have hints - try without them first!
- Expected outputs are shown to help you verify your solutions

In [None]:
# Setup: Run this cell first!
import pandas as pd
import json

---
## Exercise 1: Create a Series

Create a Pandas Series called `temperatures` containing the following daily high temperatures (in Celsius): 22, 25, 19, 28, 24, 21, 23.

Use the days of the week (Monday through Sunday) as the index.
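
**Hint:** a minimal sketch of the same pattern on different data. The `rainfall` name and its values are made up for illustration and are not part of the exercise.

```python
import pandas as pd

# Hypothetical data (not the exercise data): rainfall in millimetres
rainfall = pd.Series(
    [80, 65, 50],                            # the values
    index=["January", "February", "March"],  # one label per value
)
print(rainfall)
```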

In [None]:
# Your code here


**Expected output:**
```
Monday       22
Tuesday      25
Wednesday    19
Thursday     28
Friday       24
Saturday     21
Sunday       23
dtype: int64
```

---
## Exercise 2: Series Operations

Using the `temperatures` Series from Exercise 1:
1. Find the average temperature
2. Find which days had temperatures above 23 degrees
3. Get the temperature on Friday
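
**Hint:** the three operations can be sketched on a different Series. The `scores` data below is hypothetical; only the pattern carries over to `temperatures`.

```python
import pandas as pd

# Hypothetical data (not the exercise data)
scores = pd.Series([70, 85, 90], index=["Ana", "Luis", "Marta"])

average = scores.mean()      # arithmetic mean of all values
high = scores[scores > 80]   # a boolean mask keeps only the matching entries
luis_score = scores["Luis"]  # label-based lookup on the index

print(average)
print(high)
print(luis_score)
```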

In [None]:
# Your code here
# 1. Average temperature


In [None]:
# 2. Days above 23 degrees


In [None]:
# 3. Friday's temperature


---
## Exercise 3: Create a DataFrame from a Dictionary

Create a DataFrame called `movies` with the following data:

| title | year | rating | genre |
|-------|------|--------|-------|
| The Matrix | 1999 | 8.7 | Sci-Fi |
| Inception | 2010 | 8.8 | Sci-Fi |
| The Godfather | 1972 | 9.2 | Crime |
| Pulp Fiction | 1994 | 8.9 | Crime |
| Forrest Gump | 1994 | 8.8 | Drama |
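
**Hint:** when building a DataFrame from a dictionary, each key becomes a column and each list holds that column's values. A small sketch with a made-up `books` table:

```python
import pandas as pd

# Hypothetical data (not the exercise data): one key per column
books = pd.DataFrame({
    "title": ["Book A", "Book B"],
    "pages": [120, 340],
})
print(books)
```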

In [None]:
# Your code here


---
## Exercise 4: Create a DataFrame from a List of Dictionaries

Create a DataFrame called `restaurants` using a list of dictionaries with the following data:

| name | cuisine | rating | price_range |
|------|---------|--------|-------------|
| La Bodega | Spanish | 4.5 | $$ |
| Tokyo Ramen | Japanese | 4.2 | $ |
| Pasta Palace | Italian | 4.0 | $$ |
| Burger Joint | American | 3.8 | $ |
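
**Hint:** with a list of dictionaries, each dictionary describes one row and its keys become the column names. The `shops` data below is invented for illustration:

```python
import pandas as pd

# Hypothetical data (not the exercise data): one dictionary per row
shops = pd.DataFrame([
    {"name": "Shop A", "floor": 1},
    {"name": "Shop B", "floor": 3},
])
print(shops)
```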

In [None]:
# Your code here


---
## Exercise 5: Explore a DataFrame

Using the `movies` DataFrame from Exercise 3, answer these questions:
1. How many rows and columns does it have?
2. What are the column names?
3. What are the data types of each column?
4. Display the first 3 rows
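
**Hint:** the four questions map onto two attributes and two methods. Here is a sketch on a throwaway DataFrame (the `df` name and its contents are assumptions, not the `movies` data):

```python
import pandas as pd

# Hypothetical data (not the exercise data)
df = pd.DataFrame({"item": ["a", "b", "c"], "count": [1, 2, 3]})

print(df.shape)             # (rows, columns) as a tuple
print(df.columns.tolist())  # column names as a plain list
print(df.dtypes)            # one data type per column
print(df.head(2))           # first 2 rows
```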

In [None]:
# Your code here
# 1. Shape


In [None]:
# 2. Column names


In [None]:
# 3. Data types


In [None]:
# 4. First 3 rows


---
## Exercise 6: DataFrame Info and Describe

Create a DataFrame called `sales` with the following data, then:
1. Use `info()` to see the DataFrame summary
2. Use `describe()` to see statistics

| salesperson | region | q1_sales | q2_sales | q3_sales | q4_sales |
|-------------|--------|----------|----------|----------|----------|
| Maria | North | 15000 | 18000 | 22000 | 25000 |
| Carlos | South | 12000 | 14000 | 16000 | 19000 |
| Ana | East | 20000 | 22000 | 19000 | 24000 |
| Pedro | West | 18000 | 21000 | 23000 | 27000 |
| Laura | North | 16000 | 17000 | 20000 | 22000 |

In [None]:
# Create the sales DataFrame


In [None]:
# 1. info()


In [None]:
# 2. describe()


---
## Exercise 7: Access Columns

Using the `sales` DataFrame:
1. Get just the `salesperson` column
2. Get the `salesperson` and `region` columns together
3. Create a new column `total_sales` that sums all quarterly sales
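
**Hint:** single brackets return one column as a Series, double brackets return a DataFrame, and assigning to a new column name creates that column. A sketch with an invented `orders` table:

```python
import pandas as pd

# Hypothetical data (not the exercise data)
orders = pd.DataFrame({
    "customer": ["Ana", "Luis"],
    "jan": [10, 20],
    "feb": [5, 15],
})

one_column = orders["customer"]                  # a Series
two_columns = orders[["customer", "jan"]]        # a DataFrame
orders["total"] = orders["jan"] + orders["feb"]  # element-wise sum, row by row

print(orders)
```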

In [None]:
# 1. Single column


In [None]:
# 2. Multiple columns


In [None]:
# 3. Create total_sales column


---
## Exercise 8: Save to CSV

Save the `movies` DataFrame to a file called `movies.csv`. Make sure NOT to include the index.
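
**Hint:** to see what `index=False` changes, it can help to compare the CSV text directly. When called without a path, `to_csv` returns the text instead of writing a file. The `df` data here is made up:

```python
import pandas as pd

# Hypothetical data (not the exercise data)
df = pd.DataFrame({"item": ["pen", "cup"], "price": [1.5, 4.0]})

with_index = df.to_csv()                # first column holds the row labels 0, 1
without_index = df.to_csv(index=False)  # only the data columns are written

print(with_index)
print(without_index)
```

In the exercise you would pass a file name as the first argument so the text is written to disk.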

In [None]:
# Your code here


---
## Exercise 9: Read from CSV

Read the `movies.csv` file you just created into a new DataFrame called `movies_loaded`. Verify it loaded correctly by displaying it.
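
**Hint:** a write-then-read round trip can be sketched as below. The file name `items.csv` and the temporary folder are assumptions chosen so the sketch leaves no files behind; in the exercise you read `movies.csv` from the working directory.

```python
import os
import tempfile

import pandas as pd

# Hypothetical data (not the exercise data)
df = pd.DataFrame({"item": ["pen", "cup"], "price": [1.5, 4.0]})

# The temporary folder is deleted automatically when the block ends
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "items.csv")
    df.to_csv(path, index=False)
    loaded = pd.read_csv(path)

print(loaded)
```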

In [None]:
# Your code here


---
## Exercise 10: Work with JSON

1. Save the `restaurants` DataFrame to a JSON file called `restaurants.json` using the 'records' orientation
2. Read it back into a new DataFrame called `restaurants_loaded`
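
**Hint:** the 'records' orientation stores one JSON object per row. The sketch below works on text in memory rather than a file, and its `df` data is invented; in the exercise you would pass the file name to `to_json` and `read_json` instead.

```python
import io
import json

import pandas as pd

# Hypothetical data (not the exercise data)
df = pd.DataFrame({"item": ["pen", "cup"], "price": [1.5, 4.0]})

# With no path, to_json returns the JSON text
text = df.to_json(orient="records")
records = json.loads(text)  # a plain Python list with one dict per row
print(records)

# read_json accepts a file path or a file-like object
loaded = pd.read_json(io.StringIO(text), orient="records")
print(loaded)
```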

In [None]:
# 1. Save to JSON


In [None]:
# 2. Read from JSON


---
## Exercise 11: Handle Problematic CSV

The code below creates a problematic CSV file. Read it correctly, handling:
- Semicolon separator
- Missing values marked as 'N/A' and '-'
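
**Hint:** `read_csv` takes a `sep` argument for the separator and an `na_values` argument listing extra strings to treat as missing. A sketch on different, made-up contents (a `|` separator and `?` as the missing marker):

```python
import io

import pandas as pd

# Hypothetical file contents (not the exercise data)
raw = "name|age\nAna|31\nLuis|?\n"

people = pd.read_csv(io.StringIO(raw), sep="|", na_values=["?"])

print(people)
print(people.isna())  # True where a value is missing
```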

In [None]:
# Run this to create the problematic file
problematic_data = """product;quantity;price
Widget A;100;25.99
Widget B;N/A;15.50
Widget C;75;-
Widget D;50;35.00"""

with open('inventory.csv', 'w') as f:
    f.write(problematic_data)
print("File created!")

In [None]:
# Read the file correctly


In [None]:
# Verify missing values are recognized (should show True where values are missing)


---
## Exercise 12: Read Specific Columns

Save the `sales` DataFrame (including the `total_sales` column from Exercise 7) to a file called `sales.csv` without the index. Then read the file back, loading only the `salesperson` and `total_sales` columns.
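
**Hint:** `read_csv` has a `usecols` argument that limits which columns are loaded. A sketch on in-memory text with invented contents:

```python
import io

import pandas as pd

# Hypothetical file contents (not the exercise data)
raw = "name,age,city\nAna,31,Lima\nLuis,45,Quito\n"

# Only the listed columns are loaded; 'age' is skipped
subset = pd.read_csv(io.StringIO(raw), usecols=["name", "city"])
print(subset)
```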

In [None]:
# Save sales to CSV first


In [None]:
# Read only specific columns


---
## Exercise 13: Sample and Random State

From the `movies` DataFrame:
1. Get 2 random movies (the result may differ each time you run the cell)
2. Get 2 random movies with `random_state=42` (result will be reproducible)
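
**Hint:** passing the same `random_state` to `sample()` gives the same rows on every call. The sketch uses a throwaway DataFrame and an arbitrary seed of 0, both of which are assumptions for illustration:

```python
import pandas as pd

# Hypothetical data (not the exercise data)
df = pd.DataFrame({"n": range(10)})

unseeded = df.sample(3)               # rows can change from run to run
first = df.sample(3, random_state=0)  # fixed seed
second = df.sample(3, random_state=0) # same seed, same rows

print(first.equals(second))
```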

In [None]:
# 1. Random sample


In [None]:
# 2. Reproducible sample


---
## Exercise 14: Create DataFrame with Custom Index

Create a DataFrame called `quarterly_targets` with:
- Index: ['Q1', 'Q2', 'Q3', 'Q4']
- Column `target`: [100000, 120000, 115000, 150000]
- Column `achieved`: [95000, 125000, 118000, 145000]
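
**Hint:** the `index` argument of `pd.DataFrame` sets the row labels. A sketch with a made-up `stock` table:

```python
import pandas as pd

# Hypothetical data (not the exercise data)
stock = pd.DataFrame(
    {"in_stock": [4, 0], "on_order": [10, 25]},
    index=["pens", "cups"],  # one label per row
)

print(stock)
print(stock.loc["cups"])  # rows can now be looked up by label
```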

In [None]:
# Your code here


**Expected output:**
```
    target  achieved
Q1  100000     95000
Q2  120000    125000
Q3  115000    118000
Q4  150000    145000
```

---
## Exercise 15: Comprehensive Challenge

Create a DataFrame called `employees` with the following information about 6 employees:
- Names: Alice, Bob, Charlie, Diana, Eve, Frank
- Departments: IT, HR, IT, Sales, HR, Sales
- Salaries: 55000, 48000, 62000, 51000, 47000, 58000
- Years at company: 3, 5, 2, 7, 4, 1

Then:
1. Display the DataFrame info
2. Show basic statistics
3. Add a column `bonus` that is 10% of salary
4. Save to CSV and JSON (without index)
5. Read the CSV back and display the first 3 rows

In [None]:
# Create the employees DataFrame


In [None]:
# 1. Display info


In [None]:
# 2. Show statistics


In [None]:
# 3. Add bonus column


In [None]:
# 4. Save to CSV and JSON


In [None]:
# 5. Read CSV back and display first 3 rows


---
## Cleanup

Run this cell to remove all the files created during this practice session.

In [None]:
import os

files_to_remove = [
    'movies.csv', 'restaurants.json', 'inventory.csv', 
    'sales.csv', 'employees.csv', 'employees.json'
]

for file in files_to_remove:
    if os.path.exists(file):
        os.remove(file)
        print(f"Removed {file}")

print("\nCleanup complete!")

---
## Summary

Great job completing these exercises! You practiced:

- Creating Series with custom indices
- Creating DataFrames from dictionaries and lists of dictionaries
- Exploring DataFrames with `shape`, `columns`, `dtypes`, `info()`, `describe()`
- Viewing data with `head()` and `sample()`
- Accessing and creating columns
- Reading and writing CSV and JSON files
- Handling problematic file formats

### Next Session

We'll learn about filtering, selecting, and aggregating data in Pandas!