# 📈 Pandas Practice 4: Visualization & Reshaping Drills

## Focus Areas
Based on your questions from Practice 3, this notebook drills deeper into:
1.  **Melt Mechanics**: More practice on identifying `id_vars` vs `value_vars`.
2.  **Type Conversion**: Converting strings to numbers after melting.
3.  **Dual-Axis Plotting**: Setting up the `make_subplots` structure correctly.
4.  **JSON Navigation**: Accessing nested data (lists vs dictionaries).

**Instructions:**
Try to write the code from scratch. Refer to the "Syntax Reminder" sections if you get stuck.


## Setup
Run this cell to load the new datasets:

In [1]:
import pandas as pd
import numpy as np
import plotly.express as px
import plotly.graph_objects as go
from plotly.subplots import make_subplots

# 1. Wide Data (Temperature by Month)
temp_wide = pd.DataFrame({
    'City': ['NYC', 'LA', 'Chicago'],
    'Region': ['East', 'West', 'Midwest'],
    'Jan': [32, 58, 25],
    'Feb': [35, 60, 28],
    'Mar': [45, 65, 38],
    'Apr': [55, 68, 50]
})

# 2. Long Data (Rainfall)
rainfall = pd.DataFrame({
    'City': ['NYC', 'NYC', 'NYC', 'NYC', 'LA', 'LA', 'LA', 'LA'],
    'Month': ['Jan', 'Feb', 'Mar', 'Apr', 'Jan', 'Feb', 'Mar', 'Apr'],
    'Rain': [3.5, 3.0, 4.0, 4.5, 3.0, 3.2, 2.0, 0.5]
})

# 3. JSON Data (Mock API Response)
weather_api = {
    'status': 'success',
    'data': {
        'forecast': [
            {'day': 'Mon', 'temp': 70},
            {'day': 'Tue', 'temp': 72},
            {'day': 'Wed', 'temp': 68}
        ],
        'location': {'city': 'Austin', 'state': 'TX'}
    }
}

print("Data Loaded!")
temp_wide.head()

Data Loaded!


|   | City | Region | Jan | Feb | Mar | Apr |
|---|---|---|---|---|---|---|
| 0 | NYC | East | 32 | 35 | 45 | 55 |
| 1 | LA | West | 58 | 60 | 65 | 68 |
| 2 | Chicago | Midwest | 25 | 28 | 38 | 50 |


---
## Topic 1: Melt Drills

### ⚠️ Syntax Reminder
*   **`id_vars`**: The columns you want to KEEP as identifiers (e.g., City, Region).
*   **`var_name`**: What to call the new column created from the headers (e.g., "Month").
*   **`value_name`**: What to call the new column containing the numbers (e.g., "Temperature").
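
As a minimal sketch of how these arguments fit together, the example below melts a small hypothetical `sales_wide` frame (the frame and its column names are made up for illustration and are not one of the Setup datasets). It also passes `value_vars` explicitly. When `value_vars` is omitted, `melt` unpivots every column that is not listed in `id_vars`.

```python
import pandas as pd

# Hypothetical wide table: one row per store, one column per quarter
sales_wide = pd.DataFrame({
    'Store': ['North', 'South'],
    'Q1': [100, 80],
    'Q2': [120, 90],
})

sales_long = sales_wide.melt(
    id_vars=['Store'],         # kept as identifier columns
    value_vars=['Q1', 'Q2'],   # headers to unpivot into rows
    var_name='Quarter',        # name for the column holding the old headers
    value_name='Sales',        # name for the column holding the numbers
)

print(sales_long)
```

Each store now appears once per quarter, so the two-row wide table becomes a four-row long table.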


**Q1.1:** Melt `temp_wide` so that 'Jan', 'Feb', etc. become a 'Month' column, and the values become 'Temp'. Keep 'City' and 'Region' as IDs.

In [31]:
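# Your answer here
# id_vars takes a list so the column order is preserved (a set has no guaranteed order)
temp_long = temp_wide.melt(
    id_vars=["City", "Region"],
    var_name="Month",
    value_name="Temp",
)

temp_long

|   | City | Region | Month | Temp |
|---|---|---|---|---|
| 0 | NYC | East | Jan | 32 |
| 1 | LA | West | Jan | 58 |
| 2 | Chicago | Midwest | Jan | 25 |
| 3 | NYC | East | Feb | 35 |
| 4 | LA | West | Feb | 60 |
| 5 | Chicago | Midwest | Feb | 28 |
| 6 | NYC | East | Mar | 45 |
| 7 | LA | West | Mar | 65 |
| 8 | Chicago | Midwest | Mar | 38 |
| 9 | NYC | East | Apr | 55 |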


**Q1.2:** (Tricky) Melt `temp_wide` again, but this time ONLY keep 'City' as the ID. What happens to 'Region'? (Try it and see).

In [4]:
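# Your answer here
# Stored under a new name so the Q1.1 result in temp_long stays available for Q3.1
temp_city_only = temp_wide.melt(
    id_vars=["City"],
    var_name="Month",
    value_name="Temp",
)

temp_city_only

|   | City | Month | Temp |
|---|---|---|---|
| 0 | NYC | Region | East |
| 1 | LA | Region | West |
| 2 | Chicago | Region | Midwest |
| 3 | NYC | Jan | 32 |
| 4 | LA | Jan | 58 |
| 5 | Chicago | Jan | 25 |
| 6 | NYC | Feb | 35 |
| 7 | LA | Feb | 60 |
| 8 | Chicago | Feb | 28 |
| 9 | NYC | Mar | 45 |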


**Q1.3:** Suppose you have a dataframe `df` with columns `['A', 'B', '2000', '2001']`. Write the code to melt it into `['A', 'B', 'Year', 'Value']`.

In [None]:
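# Your answer here
# Hypothetical example frame (values are made up) so the cell runs on its own
df = pd.DataFrame({"A": [1, 2], "B": ["x", "y"], "2000": [10, 20], "2001": [30, 40]})

# id_vars is a list so 'A' and 'B' keep their order
df_melted = df.melt(
    id_vars=["A", "B"],
    var_name="Year",
    value_name="Value",
)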

---
## Topic 2: Type Conversion

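### ⚠️ Syntax Reminder
*   `df['Col'] = df['Col'].astype(int)` -> Use when every value is a clean whole number; raises `ValueError` if any value cannot be parsed.
*   `df['Col'] = pd.to_numeric(df['Col'], errors='coerce')` -> Safer: unparseable values become `NaN` (the column then becomes float).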
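
The difference between the two approaches can be sketched on a small hypothetical Series (`raw` and its values are assumptions for illustration only): `astype(int)` stops at the first value it cannot parse, while `pd.to_numeric` with `errors='coerce'` keeps going and marks the bad value as missing.

```python
import pandas as pd

# Hypothetical column of strings, one of which is not a number
raw = pd.Series(['10', '20', 'n/a'])

# astype(int) fails as soon as it meets a value it cannot parse
try:
    raw.astype(int)
    astype_failed = False
except ValueError:
    astype_failed = True

# to_numeric with errors='coerce' replaces unparseable values with NaN
cleaned = pd.to_numeric(raw, errors='coerce')

print(astype_failed)
print(cleaned)
```

After coercing, `cleaned.isna()` is a quick way to find which rows held bad values.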


**Q2.1:** Create a dataframe `bad_data` with a column 'Price' containing `['100', '200', '300']` (strings). Convert it to integers.

In [17]:
# Your answer here
bad_data = pd.DataFrame()
bad_data["Price"] = ["100", "200", "300"]  # strings, as specified in the question
bad_data["Price"] = bad_data["Price"].astype(int)
bad_data.info()


<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 1 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   Price   3 non-null      int64
dtypes: int64(1)
memory usage: 156.0 bytes


**Q2.2:** Create a dataframe `messy` with a column 'Year' containing `['2010', '2011', 'oops']`. Use `pd.to_numeric` with `errors='coerce'` to fix it.

In [26]:
# Your answer here
messy = pd.DataFrame()
messy["Year"] = ["2010", "2011", "oops"]
messy["Year"] = pd.to_numeric(messy["Year"], errors = "coerce")
messy


|   | Year |
|---|---|
| 0 | 2010.0 |
| 1 | 2011.0 |
| 2 | NaN |


---
## Topic 3: Dual-Axis Plotting

### ⚠️ Syntax Reminder
1.  `fig = make_subplots(specs=[[{"secondary_y": True}]])`
2.  `fig.add_trace(go.Scatter(x=..., y=...), secondary_y=False)`
3.  `fig.add_trace(go.Scatter(x=..., y=...), secondary_y=True)`


**Q3.1:** Using the melted temperature data (from Q1.1) and the `rainfall` dataframe: Filter both for 'NYC'. Create a dual-axis plot with 'Temp' on the left and 'Rain' on the right. (X-axis is 'Month').

In [34]:
# Your answer here
# Uses temp_long from Q1.1 (columns: City, Region, Month, Temp)
temp_nyc = temp_long[temp_long["City"] == "NYC"]
rain_nyc = rainfall[rainfall["City"] == "NYC"]

fig = make_subplots(specs=[[{"secondary_y": True}]])

# Temperature on the left (primary) axis
fig.add_trace(go.Scatter(
    x=temp_nyc["Month"],
    y=temp_nyc["Temp"],
    name="Temp",
), secondary_y=False)

# Rainfall on the right (secondary) axis
fig.add_trace(go.Scatter(
    x=rain_nyc["Month"],
    y=rain_nyc["Rain"],
    name="Rain",
), secondary_y=True)

fig.show()

---
## Topic 4: JSON Navigation

### ⚠️ Syntax Reminder
*   **Dictionary `{}`**: Use `['key']` (e.g., `data['status']`).
*   **List `[]`**: Use `[index]` (e.g., `data['items'][0]`).
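
To illustrate how the two rules chain together, here is a minimal sketch on a hypothetical `response` dictionary (its keys and values are invented for this example and are unrelated to `weather_api`). At each step, check whether the current value is a dict or a list, then use a key or an index accordingly.

```python
# Hypothetical response, shaped like a typical API payload
response = {
    'ok': True,
    'items': [
        {'name': 'pen', 'tags': ['blue', 'gel']},
        {'name': 'notebook', 'tags': ['lined']},
    ],
}

ok = response['ok']                          # dict -> key
first_item = response['items'][0]            # dict -> key, then list -> index
first_name = response['items'][0]['name']    # ... then dict -> key again
last_tag = response['items'][0]['tags'][-1]  # negative index counts from the end

print(ok, first_name, last_tag)
```

If you are unsure what a step returns, `type(response['items'])` tells you whether the next accessor should be a key or an index.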


**Q4.1:** From the `weather_api` variable loaded in Setup: Retrieve the city name ('Austin').

In [None]:
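# Your answer here
# weather_api is a plain dict, not a DataFrame, so DataFrame methods such as .head() raise AttributeError.
# Step through the keys instead: 'data' -> 'location' -> 'city'
weather_api['data']['location']['city']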

**Q4.2:** From `weather_api`: Retrieve the temperature for 'Tue' (72). (Hint: It's inside a list!)

In [None]:
# Your answer here


---
---
# 📝 Answer Key



**Q1.1:**
```python
temp_long = temp_wide.melt(id_vars=['City', 'Region'], var_name='Month', value_name='Temp')
```

**Q1.2:**
```python
temp_wide.melt(id_vars=['City'], var_name='Month', value_name='Temp')
# Result: 'Region' is treated as just another column to unpivot, so 'Month' contains the
# label 'Region' and 'Temp' mixes region names with numbers, which is usually bad!
```
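
One way to avoid the mixed column, sketched below under the assumption that you simply want to drop 'Region', is to list the month columns in `value_vars`. The frame is the same `temp_wide` from Setup, repeated so the block runs on its own. The names `mixed` and `months_only` are illustrative.

```python
import pandas as pd

temp_wide = pd.DataFrame({
    'City': ['NYC', 'LA', 'Chicago'],
    'Region': ['East', 'West', 'Midwest'],
    'Jan': [32, 58, 25],
    'Feb': [35, 60, 28],
    'Mar': [45, 65, 38],
    'Apr': [55, 68, 50],
})

# Without value_vars, every non-ID column is melted, including 'Region'
mixed = temp_wide.melt(id_vars=['City'], var_name='Month', value_name='Temp')
print(mixed['Month'].unique())

# With value_vars, only the listed columns are melted and 'Region' is left out
months_only = temp_wide.melt(
    id_vars=['City'],
    value_vars=['Jan', 'Feb', 'Mar', 'Apr'],
    var_name='Month',
    value_name='Temp',
)
print(months_only['Month'].unique())
```

If 'Region' should be kept rather than dropped, adding it to `id_vars` (as in Q1.1) is the other option.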

**Q1.3:**
```python
df.melt(id_vars=['A', 'B'], var_name='Year', value_name='Value')
```

**Q2.1:**
```python
bad_data = pd.DataFrame({'Price': ['100', '200', '300']})
bad_data['Price'] = bad_data['Price'].astype(int)
```

**Q2.2:**
```python
messy = pd.DataFrame({'Year': ['2010', '2011', 'oops']})
messy['Year'] = pd.to_numeric(messy['Year'], errors='coerce')
```

**Q3.1:**
```python
nyc_temp = temp_long[temp_long['City'] == 'NYC']
nyc_rain = rainfall[rainfall['City'] == 'NYC']

fig = make_subplots(specs=[[{"secondary_y": True}]])
fig.add_trace(go.Scatter(x=nyc_temp['Month'], y=nyc_temp['Temp'], name="Temp"), secondary_y=False)
fig.add_trace(go.Scatter(x=nyc_rain['Month'], y=nyc_rain['Rain'], name="Rain"), secondary_y=True)
fig.show()
```

**Q4.1:** `weather_api['data']['location']['city']`

**Q4.2:** `weather_api['data']['forecast'][1]['temp']`
