# Data Challenge: Creating Interactive Plotly Visuals

## Targeted KSBs (Knowledge, Skills, and Behaviors)

- **S6** – Demonstrates mastery in creating dynamic visualizations using Python (Plotly)
- **K10** – Applies chart selection principles based on data types, variables, and audience needs
- **S12** – Performs comprehensive data exploration to uncover patterns and relationships

---

## Dataset Description:

This dataset contains information about various Indian dishes, including their ingredients, preparation time, cook time, flavor profile, and region. You can read more about the data [here](https://www.kaggle.com/datasets/nehaprabhavalkar/indian-food-101?select=indian_food.csv).

---

## Task 1: Plot a Scatter Plot of Prep Time vs Cook Time
### Objective:
Create a scatter plot showing the relationship between **prep time** and **cook time** for each food item. Use Plotly to visualize these two variables and identify any patterns.

### Instructions:
1. **Load the dataset** using `pandas`.
2. Use **Plotly Express** to create a **scatterplot**.
3. Set **prep_time** on the x-axis and **cook_time** on the y-axis.
4. **Label** the axes appropriately and add a **title** for clarity.

In [20]:
# Run this cell without changes
import pandas as pd
import plotly.express as px

In [21]:
# Read in the data (data/indian_food.csv). Hint: use pandas to read the CSV

df = pd.read_csv("../data/indian_food.csv")

In [22]:
df.head()

| | name | ingredients | diet | prep_time | cook_time | flavor_profile | course | state | region |
|---|---|---|---|---|---|---|---|---|---|
| 0 | Balu shahi | Maida flour, yogurt, oil, sugar | vegetarian | 45 | 25 | sweet | dessert | West Bengal | East |
| 1 | Boondi | Gram flour, ghee, sugar | vegetarian | 80 | 30 | sweet | dessert | Rajasthan | West |
| 2 | Gajar ka halwa | Carrots, milk, sugar, ghee, cashews, raisins | vegetarian | 15 | 60 | sweet | dessert | Punjab | North |
| 3 | Ghevar | Flour, ghee, kewra, milk, clarified butter, su... | vegetarian | 15 | 30 | sweet | dessert | Rajasthan | West |
| 4 | Gulab jamun | Milk powder, plain flour, baking powder, ghee,... | vegetarian | 15 | 40 | sweet | dessert | West Bengal | East |


In [23]:
# Create a scatterplot of prep_time on X-axis & cook_time on Y-axis 

fig = px.scatter(
    df, x="prep_time", y="cook_time",
    title="Relationship between Preparation Time and Cooking Time",
    labels={"prep_time": "Prep time (min)", "cook_time": "Cook time (min)"},
)

# Show the plot
fig.show()

### What insights did you get from Task 1? (Double-click to type answer)

- Most points cluster in the bottom-left corner of the plot, so the majority of dishes have short prep and cook times.
- For most dishes, prep time is shorter than cook time.
- There are a few outliers, such as a dish with a 720-minute cook time, which is far longer than anything else in the data.
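
One way to check the outlier observation is to filter on a cutoff and compare the result with the full data. The following is a minimal sketch on a small made-up frame, not the real dataset; the helper name `drop_long_cooks` and the 240-minute cutoff are assumptions chosen only for illustration.

```python
import pandas as pd

# Toy stand-in for the real data; the values are made up for illustration.
toy = pd.DataFrame({
    "name": ["A", "B", "C", "D"],
    "prep_time": [10, 15, 20, 30],
    "cook_time": [20, 30, 45, 720],
})


def drop_long_cooks(frame: pd.DataFrame, max_minutes: int = 240) -> pd.DataFrame:
    """Keep rows whose cook_time is at most max_minutes (hypothetical cutoff)."""
    return frame[frame["cook_time"] <= max_minutes]


trimmed = drop_long_cooks(toy)
print(trimmed)
```

Passing a trimmed frame instead of `df` to the same `px.scatter` call would zoom the plot in on the main cluster. Whether a 720-minute cook time is an error or a genuinely slow dish is a judgment call, so it is worth inspecting the row before dropping it.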

## Task 2: Bar Chart of Cook Time by Region

### Objective:
Create a bar chart that shows the average cook time for each region. This will help us understand how cook time varies across regions. **There is a "weird" bar in the chart. Why is that the case?**


### Instructions:
- Group the data by region. (Hint: you may need the `df.groupby()` method here!)

- Calculate the average cook time for each region.

- Create a bar chart using Plotly to show this average cook time for each region.

- Label the axes and title the chart.

In [7]:
# Task 2: Create a bar chart showing the average cook time by region
# Fill in the code to group by region and calculate the average cook time
df_region_avg = df.groupby("region")["cook_time"].mean().reset_index()

# Create the bar chart
fig = px.bar(
    df_region_avg, x="region", y="cook_time", title="Average Cook Time by Region",
    labels={"region": "Region", "cook_time": "Average cook time (min)"},
)

# Show the plot
fig.show()


### What insights did you get from Task 2? (Double-click to type answer)

In [24]:
df["region"].unique()


array(['East', 'West', 'North', '-1', 'North East', 'South', 'Central',
       nan], dtype=object)

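The Central region has the highest average cook time and the North East region has the lowest.
Calling `.unique()` on the `region` column lists every distinct value in it, including the placeholder `'-1'` and `nan`. Because `'-1'` is an ordinary string, `groupby` treats it as its own region, which explains the "weird" bar in the chart. Rows where the region is `nan` are dropped by `groupby` by default, so they do not get a bar.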
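
If the `'-1'` bar is unwanted, one option is to exclude placeholder and missing regions before grouping. This is a hedged sketch on a made-up frame rather than the real dataset; treating `'-1'` as "missing" is an assumption based on the `.unique()` output above.

```python
import pandas as pd

# Toy stand-in for the real data; the values are made up for illustration.
toy = pd.DataFrame({
    "region": ["East", "West", "-1", "East", None],
    "cook_time": [20, 30, 60, 40, 50],
})

# Keep only rows with a real region: not missing and not the "-1" placeholder.
has_region = toy["region"].notna() & toy["region"].ne("-1")
cleaned = toy[has_region]

# Same aggregation as in Task 2, now without the placeholder group.
region_avg = cleaned.groupby("region")["cook_time"].mean().reset_index()
print(region_avg)
```

The resulting frame has the same two columns as `df_region_avg`, so it could be passed to the same `px.bar` call.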


## Task 3: Pie Chart of Flavor Profile Distribution

### Objective:
Create a pie chart showing the distribution of flavor profiles (e.g., sweet, savory) across the dataset.

### Instructions:
- Use Plotly Express to create a pie chart.

- Plot the flavor_profile column, which will show the distribution of flavor types.

- Ensure the chart is labeled clearly.

In [9]:
# Task 3: Create a pie chart showing the flavor profile distribution
# Fill in the code to create the pie chart -- look up documentation if needed 
fig = px.pie(df, names="flavor_profile", title="Flavor Profile Distribution")

# Show the plot
fig.show()


### What insights did you get from Task 3? (Double-click to type answer)

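- The three largest slices are `spicy`, `sweet`, and `-1`.
- `-1` is not a real flavor. It is a placeholder for dishes with a missing flavor profile, and there are a lot of them.
- `spicy` is the most common flavor profile, accounting for over 50% of all dishes.
- `sour` has the smallest share.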
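
The shares read off the pie chart can be cross-checked numerically, and the `-1` placeholder can be given a readable label first. The sketch below uses a made-up column rather than the real dataset, and the label `"unknown"` is an arbitrary choice for illustration.

```python
import pandas as pd

# Toy stand-in for the flavor_profile column; the values are made up.
toy = pd.DataFrame(
    {"flavor_profile": ["spicy", "sweet", "-1", "spicy", "sour", "-1"]}
)

# Give the placeholder a readable name, then compute each flavor's share.
labelled = toy["flavor_profile"].replace("-1", "unknown")
shares = labelled.value_counts(normalize=True)
print(shares)
```

Assigning the relabelled column back to the frame before calling `px.pie` would make the legend show `unknown` instead of `-1`.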