# 📊 Pandas Practice 3: Reshaping & Visualization

## Focus Areas
Based on Lectures 21 and 22, this notebook covers:
1.  **Melt (Reshaping)**: Converting "Wide" data (years as columns) to "Long" data (year as a single column).
2.  **Plotly Express**: Creating line charts and maps.
3.  **Dual-Axis Plots**: Using `plotly.graph_objects` for two Y-axes.
4.  **APIs & JSON**: Loading data from URLs and handling JSON structures.

**Instructions:**
Try to solve the questions using the patterns provided in the notes sections.


## Setup
Run this cell to load the data:

In [1]:
import pandas as pd
import numpy as np
import plotly.express as px
import plotly.graph_objects as go
from plotly.subplots import make_subplots

# Sample Data: GDP per Capita (Wide Format)
gdp_wide = pd.DataFrame({
    'Country': ['USA', 'China', 'Japan', 'Germany'],
    'Code': ['USA', 'CHN', 'JPN', 'DEU'],
    '2010': [48000, 4500, 40000, 42000],
    '2011': [49000, 5600, 41000, 44000],
    '2012': [51000, 6300, 42000, 45000],
    '2013': [53000, 7000, 43000, 46000]
})

# Sample Data: Life Expectancy (Long Format)
life_exp = pd.DataFrame({
    'Country': ['USA', 'USA', 'USA', 'China', 'China', 'China'],
    'Year': [2010, 2011, 2012, 2010, 2011, 2012],
    'LifeExp': [78.5, 78.6, 78.7, 74.0, 74.5, 75.0]
})

print("Data Loaded!")
gdp_wide.head()

Data Loaded!


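|   | Country | Code | 2010 | 2011 | 2012 | 2013 |
|---|---------|------|------|------|------|------|
| 0 | USA     | USA  | 48000 | 49000 | 51000 | 53000 |
| 1 | China   | CHN  | 4500  | 5600  | 6300  | 7000  |
| 2 | Japan   | JPN  | 40000 | 41000 | 42000 | 43000 |
| 3 | Germany | DEU  | 42000 | 44000 | 45000 | 46000 |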


---
## Topic 1: Melt (Wide to Long)

### Concept
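When each year is its own column (2010, 2011, ...), the data is hard to plot over time, because plotting functions generally expect one column per variable. "Melting" stacks those year columns into a single 'Year' column, with the matching values in a second column.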

**Syntax:**
```python
df_long = df.melt(
    id_vars=['Country', 'Code'],      # Columns to KEEP fixed (Identifiers)
    var_name='Year',                  # Name for the new "Variable" column
    value_name='GDP'                  # Name for the new "Value" column
)
```
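As a small illustration of the pattern above, here is a sketch on a two-country subset of the sample data. The `value_vars` argument is optional and not used in the syntax above; it is shown here as one way to melt only some of the year columns. The names `wide` and `long_df` are made up for this example.

```python
import pandas as pd

# Two-country subset of the sample data, in wide format (one column per year)
wide = pd.DataFrame({
    "Country": ["USA", "China"],
    "Code": ["USA", "CHN"],
    "2010": [48000, 4500],
    "2011": [49000, 5600],
    "2012": [51000, 6300],
})

# value_vars limits the melt to the listed columns, so "2012" is left out
long_df = wide.melt(
    id_vars=["Country", "Code"],
    value_vars=["2010", "2011"],
    var_name="Year",
    value_name="GDP",
)

print(long_df)
print(long_df.shape)  # 2 countries x 2 years = 4 rows, 4 columns
```

If `value_vars` is omitted, every column not listed in `id_vars` is melted, which is what Q1.1 relies on.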


**Q1.1:** Melt the `gdp_wide` dataframe so that '2010', '2011', etc. become a 'Year' column and the values become 'GDP'. Keep 'Country' and 'Code' as identifiers.

In [8]:
# Your answer here
gdp_long = gdp_wide.melt(
    id_vars=["Country", "Code"],
    var_name="Year",
    value_name="GDP",
)

gdp_long

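|   | Country | Code | Year | GDP |
|---|---------|------|------|-----|
| 0 | USA     | USA  | 2010 | 48000 |
| 1 | China   | CHN  | 2010 | 4500  |
| 2 | Japan   | JPN  | 2010 | 40000 |
| 3 | Germany | DEU  | 2010 | 42000 |
| 4 | USA     | USA  | 2011 | 49000 |
| 5 | China   | CHN  | 2011 | 5600  |
| 6 | Japan   | JPN  | 2011 | 41000 |
| 7 | Germany | DEU  | 2011 | 44000 |
| 8 | USA     | USA  | 2012 | 51000 |
| 9 | China   | CHN  | 2012 | 6300  |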


**Q1.2:** After melting, the 'Year' column holds strings ('2010'), because the original column names were strings. Convert the 'Year' column to integers.

In [10]:
# Your answer here
gdp_long["Year"] = gdp_long["Year"].astype(int)
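To see what the conversion changes, the sketch below checks the dtype before and after on a standalone Series. The Series `years` is a stand-in for the melted 'Year' column, not part of the notebook data.

```python
import pandas as pd

# Stand-in for a melted 'Year' column: the values are strings
years = pd.Series(["2010", "2011", "2012"], name="Year")
print(years.dtype)       # a string/object dtype, not a number

# astype(int) parses each string into an integer
years_int = years.astype(int)
print(years_int.dtype)   # an integer dtype

# Arithmetic now works on the values
span = years_int.max() - years_int.min()
print(span)
```

This also matters later: `life_exp` stores 'Year' as integers, so the two tables only line up on 'Year' once both use the same type.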

**Q1.3:** Filter the melted dataframe to show only data for 'China'.

In [13]:
# Your answer here
gdp_long[gdp_long["Code"] == "CHN"]

|    | Country | Code | Year | GDP  |
|----|---------|------|------|------|
| 1  | China   | CHN  | 2010 | 4500 |
| 5  | China   | CHN  | 2011 | 5600 |
| 9  | China   | CHN  | 2012 | 6300 |
| 13 | China   | CHN  | 2013 | 7000 |
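The filter in Q1.3 is a boolean mask. The sketch below shows the same idea plus two common variations, `isin()` and combined conditions, on a small hand-built slice of the melted data. The variable names are chosen for this example only.

```python
import pandas as pd

# Small slice of the melted GDP data (three countries, two years)
gdp_long = pd.DataFrame({
    "Country": ["USA", "China", "Japan", "USA", "China", "Japan"],
    "Code": ["USA", "CHN", "JPN", "USA", "CHN", "JPN"],
    "Year": [2010, 2010, 2010, 2011, 2011, 2011],
    "GDP": [48000, 4500, 40000, 49000, 5600, 41000],
})

# A comparison returns a boolean Series; indexing with it keeps the True rows
mask = gdp_long["Country"] == "China"
china = gdp_long[mask]

# isin() matches several values at once
asia = gdp_long[gdp_long["Country"].isin(["China", "Japan"])]

# Combine conditions with & (and) or | (or); each condition needs parentheses
china_2011 = gdp_long[(gdp_long["Country"] == "China") & (gdp_long["Year"] == 2011)]

print(china)
print(asia)
print(china_2011)
```

Filtering keeps the original index labels, which is why the Q1.3 output shows rows 1, 5, 9, and 13.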


---
## Topic 2: Plotly Express (Line Charts)

### Concept
Create interactive plots easily.

**Syntax:**
```python
fig = px.line(
    data_frame,
    x="Year",
    y="Value",
    color="Country",  # Different line for each country
    title="My Plot Title"
)
fig.show()
```


**Q2.1:** Using the melted GDP dataframe from Q1, create a line chart showing GDP over time. Use 'Country' for color.

In [16]:
# Your answer here

fig = px.line(
    gdp_long,
    x="Year",
    y="GDP",
    title="GDP over time",
    color="Country",
)

fig.show()

**Q2.2:** Create a line chart for `life_exp` showing Life Expectancy over time, colored by Country.

In [19]:
# Your answer here
fig2 = px.line(
    life_exp,
    x="Year",
    y="LifeExp",
    color="Country",
    title="Life Expectancy By Country Over Time",
)

fig2.show()

---
## Topic 3: Dual-Axis Plots (Graph Objects)

### Concept
When you want to plot two different units (e.g., GDP in $ and Life Expectancy in Years) on the same chart, you need two Y-axes.

**Syntax:**
```python
# 1. Create Subplot with secondary Y
fig = make_subplots(specs=[[{"secondary_y": True}]])

# 2. Add First Trace (Left Axis)
fig.add_trace(
    go.Scatter(x=df['Year'], y=df['GDP'], name="GDP"),
    secondary_y=False
)

# 3. Add Second Trace (Right Axis)
fig.add_trace(
    go.Scatter(x=df['Year'], y=df['LifeExp'], name="Life Exp"),
    secondary_y=True
)

# 4. Show
fig.show()
```


**Q3.1:** Create a dual-axis plot for 'USA' only. Plot 'GDP' on the left axis and 'LifeExp' on the right axis. (Hint: Filter both dataframes for USA first!)

In [None]:
# Your answer here
fig = make_subplots(specs=[[{"secondary_y": True}]])

---
## Topic 4: APIs & JSON

### Concept
Many web APIs return their data as JSON. The `requests` library can fetch a URL and parse the JSON response into ordinary Python objects.

**Syntax:**
```python
import requests
response = requests.get("URL_HERE")
data = response.json()  # Converts JSON to Python Dictionary/List
```
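To see what that conversion produces without making a network call, the sketch below parses a JSON string with the standard-library `json` module, which does roughly what `response.json()` does with the response body. The string `raw_text` is a made-up response body, not the output of a real API.

```python
import json

# Hypothetical response body; a real API would send text like this over HTTP
raw_text = '{"status": "ok", "count": 2, "items": [{"id": 1}, {"id": 2}]}'

# Parse the JSON text into Python objects
data = json.loads(raw_text)

print(type(data))            # JSON objects become dicts
print(type(data["items"]))   # JSON arrays become lists
print(data["count"] + 1)     # JSON numbers become int or float
```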

**Navigating JSON:**
If `data` is `{'results': [{'name': 'John'}]}`, then:
`data['results'][0]['name']` gets 'John'.
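The same navigation, written out step by step, along with two patterns that often follow it: collecting one field from every record, and loading the records into a DataFrame. The second record and the 'age' field are invented for this example.

```python
import pandas as pd

# Same shape as the example above, with one extra (hypothetical) record
data = {"results": [{"name": "John", "age": 30}, {"name": "Maria", "age": 25}]}

# Step by step: dict key -> list index -> dict key
results = data["results"]      # a list of dicts
first = results[0]             # the first dict
first_name = first["name"]     # the value stored under 'name'

# Collect one field from every record
names = [person["name"] for person in data["results"]]

# .get() returns a default instead of raising KeyError for a missing key
missing = data.get("error", "no error key")

# A list of flat dicts converts directly to a DataFrame
df = pd.DataFrame(data["results"])

print(first_name)
print(names)
print(missing)
print(df)
```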


**Q4.1:** (Mock Question) Suppose you have this variable: `data = {'status': 'ok', 'users': [{'id': 1, 'name': 'Alice'}, {'id': 2, 'name': 'Bob'}]}`. Write code to retrieve the name 'Bob'.

In [None]:
# Your answer here
import requests 

---
# 📝 Answer Key



**Q1.1:**
```python
gdp_long = gdp_wide.melt(id_vars=['Country', 'Code'], var_name='Year', value_name='GDP')
```

**Q1.2:**
```python
gdp_long['Year'] = gdp_long['Year'].astype(int)
```

**Q1.3:**
```python
gdp_long[gdp_long['Country'] == 'China']
```

**Q2.1:**
```python
px.line(gdp_long, x='Year', y='GDP', color='Country', title='GDP Over Time')
```

**Q3.1:**
```python
usa_gdp = gdp_long[gdp_long['Country'] == 'USA']
usa_life = life_exp[life_exp['Country'] == 'USA']

fig = make_subplots(specs=[[{"secondary_y": True}]])
fig.add_trace(go.Scatter(x=usa_gdp['Year'], y=usa_gdp['GDP'], name="GDP"), secondary_y=False)
fig.add_trace(go.Scatter(x=usa_life['Year'], y=usa_life['LifeExp'], name="Life Exp"), secondary_y=True)
fig.show()
```

**Q4.1:**
```python
data['users'][1]['name']
```
