# Pandas Notebook 1: DataFrames Made Easy
"Like Excel, but with superpowers!"

## What You'll Learn  
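1. **Pandas Series**: Smart lists (one-dimensional arrays with labels)  
2. **DataFrames**: Fancy tables (labeled rows and columns)  
3. **Basic Operations**: Filtering rows & doing math on columns  
4. **Real Use**: Analyzing simulated COVID case data  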

In [1]:
import pandas as pd

# Create a Series (like a NumPy array with labels)
fruits = pd.Series([5, 2, 7], index=["🍎", "🍌", "🍊"], name="Fruit Count")
print(fruits)

🍎    5
🍌    2
🍊    7
Name: Fruit Count, dtype: int64
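
Because a Series carries labels, its values can be read by label as well as by position. The cell below is a minimal sketch of that idea using the same `fruits` Series; the variable names (`apples`, `last_item`, `doubled`, `total`, `most_common`) are only illustrative and are not part of the original notebook.

```python
import pandas as pd

fruits = pd.Series([5, 2, 7], index=["🍎", "🍌", "🍊"], name="Fruit Count")

# Label-based lookup with .loc, position-based lookup with .iloc
apples = fruits.loc["🍎"]
last_item = fruits.iloc[-1]

# Arithmetic is applied element-wise and the labels are kept
doubled = fruits * 2

# Aggregations reduce the Series to a single value
total = fruits.sum()
most_common = fruits.idxmax()  # label of the largest value

print(apples, last_item, total, most_common)
print(doubled)
```

Using `.loc` and `.iloc` explicitly avoids ambiguity about whether a key is meant as a label or as a position.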


In [2]:
# Create a DataFrame from a dictionary
data = {
    "City": ["Paris", "Tokyo", "NYC"],
    "Population (M)": [2.1, 9.3, 8.8],
    "Language": ["French", "Japanese", "English"]
}

df = pd.DataFrame(data)
print("\nCity Data:")
print(df)


City Data:
    City  Population (M)  Language
0  Paris             2.1    French
1  Tokyo             9.3  Japanese
2    NYC             8.8   English
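
Once a DataFrame exists, a few attributes and selectors are usually enough to get oriented. The following sketch rebuilds the same city table and shows one possible way to inspect it; the helper variable names are assumptions made for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "City": ["Paris", "Tokyo", "NYC"],
    "Population (M)": [2.1, 9.3, 8.8],
    "Language": ["French", "Japanese", "English"],
})

# shape is a (rows, columns) tuple
n_rows, n_cols = df.shape
column_names = list(df.columns)

# A single column is a Series; a list of columns gives a smaller DataFrame
populations = df["Population (M)"]
city_and_language = df[["City", "Language"]]

# .loc selects by row label and column name
tokyo_language = df.loc[1, "Language"]

print(df.shape)
print(df.dtypes)
print(tokyo_language)
```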


In [3]:
# Math operations
print("\nDouble population:\n", df["Population (M)"] * 2)

# Filtering
big_cities = df[df["Population (M)"] > 3]
print("\nBig cities:\n", big_cities)


Double population:
 0     4.2
1    18.6
2    17.6
Name: Population (M), dtype: float64

Big cities:
     City  Population (M)  Language
1  Tokyo             9.3  Japanese
2    NYC             8.8   English
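
The filter above works because the comparison produces a boolean Series (a "mask") that picks which rows to keep. As a hedged sketch, masks can also be combined; the mask names below (`is_big`, `speaks_english`) are illustrative and not from the original notebook.

```python
import pandas as pd

df = pd.DataFrame({
    "City": ["Paris", "Tokyo", "NYC"],
    "Population (M)": [2.1, 9.3, 8.8],
    "Language": ["French", "Japanese", "English"],
})

# Each comparison yields a boolean Series, one True/False per row
is_big = df["Population (M)"] > 3
speaks_english = df["Language"] == "English"

# Combine masks with & (and), | (or), ~ (not).
# Python's `and` / `or` do not work element-wise on a Series.
big_and_english = df[is_big & speaks_english]
big_or_english = df[is_big | speaks_english]
not_big = df[~is_big]

# .loc filters rows and picks a column in one step
big_city_names = df.loc[is_big, "City"]

print(big_and_english)
print(big_city_names)
```

When the comparisons are written inline rather than stored in variables, each one needs its own parentheses, for example `df[(df["Population (M)"] > 3) & (df["Language"] == "English")]`.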


In [4]:
# Simulated COVID data
covid_data = pd.DataFrame({
    "Date": ["2023-01-01", "2023-01-02", "2023-01-03"],
    "Cases": [120, 135, 158],
    "Deaths": [2, 3, 1]
})

# Calculate death rate: deaths as a percentage of reported cases
covid_data["Death Rate %"] = (covid_data["Deaths"] / covid_data["Cases"]) * 100
print("\nCOVID Analysis:\n", covid_data)


COVID Analysis:
          Date  Cases  Deaths  Death Rate %
0  2023-01-01    120       2      1.666667
1  2023-01-02    135       3      2.222222
2  2023-01-03    158       1      0.632911
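
The same simulated table supports a few more column-wise calculations. The sketch below is one possible extension, not part of the original analysis: the column names `"Change in Cases"` and `"Cumulative Cases"` and the variable `overall_rate` are assumptions chosen for illustration.

```python
import pandas as pd

covid_data = pd.DataFrame({
    "Date": ["2023-01-01", "2023-01-02", "2023-01-03"],
    "Cases": [120, 135, 158],
    "Deaths": [2, 3, 1],
})

# Parse the date strings so pandas treats them as timestamps
covid_data["Date"] = pd.to_datetime(covid_data["Date"])

# Day-over-day change in cases (the first row has no previous day, so it is NaN)
covid_data["Change in Cases"] = covid_data["Cases"].diff()

# Running total of cases
covid_data["Cumulative Cases"] = covid_data["Cases"].cumsum()

# Death rate over the whole period, as a percentage rounded to 2 decimals
overall_rate = round(covid_data["Deaths"].sum() / covid_data["Cases"].sum() * 100, 2)

print(covid_data)
print("Overall death rate %:", overall_rate)
```

Note that the overall rate (total deaths over total cases) is generally not the same as the average of the daily rates, since days with more cases carry more weight.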


## Practice Time!  
1. Create a Series of temperatures: [22, 25, 19] with cities as index  
2. Make a DataFrame for 3 movies with columns: Title, Year, Rating  
3. Filter movies with Rating > 8.5  
4. Add a "Decade" column (e.g., 2023 → 2020s)  

*(Solutions in next cell)*  

In [5]:
# 1. Series of temperatures with cities as the index
temps = pd.Series([22, 25, 19], index=["London", "Paris", "Rome"])

# 2. DataFrame of three movies
movies = pd.DataFrame({
    "Title": ["Inception", "Interstellar", "Joker"],
    "Year": [2010, 2014, 2019],
    "Rating": [8.8, 8.6, 8.4]
})

# 3. Keep only the rows whose rating passes the threshold
top_movies = movies[movies["Rating"] > 8.5]

# 4. Round the year down to its decade and label it, e.g. 2014 -> "2010s"
movies["Decade"] = ((movies["Year"] // 10) * 10).astype(str) + "s"
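
The solution cell assigns its results without printing anything, so a quick look at them can help confirm they behave as expected. This is only a sketch of such a check; `warmest_city` and `top_titles` are hypothetical names introduced here.

```python
import pandas as pd

temps = pd.Series([22, 25, 19], index=["London", "Paris", "Rome"])

movies = pd.DataFrame({
    "Title": ["Inception", "Interstellar", "Joker"],
    "Year": [2010, 2014, 2019],
    "Rating": [8.8, 8.6, 8.4],
})
top_movies = movies[movies["Rating"] > 8.5]

# Label of the highest temperature
warmest_city = temps.idxmax()

# Titles that passed the filter, as a plain Python list
top_titles = top_movies["Title"].tolist()

print(temps)
print(top_movies)
print(warmest_city, top_titles)
```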