# Data Challenge: Creating Interactive Plotly Visuals

## Targeted KSBs (Knowledge, Skills, and Behaviors)

- **S6** – Demonstrates mastery in creating dynamic visualizations using Python (Plotly)
- **K10** – Applies chart selection principles based on data types, variables, and audience needs
- **S12** – Performs comprehensive data exploration to uncover patterns and relationships

---

## Dataset Description:

This dataset contains information about various Indian dishes, including their ingredients, preparation time, and flavor profile. You can read more about the data [here](https://www.kaggle.com/datasets/nehaprabhavalkar/indian-food-101?select=indian_food.csv).

---

## Task 1: Plot a Scatter Plot of Prep Time vs Cook Time
### Objective:
Create a scatter plot showing the relationship between **prep time** and **cook time** for each food item. Use Plotly to visualize these two variables and identify any patterns.

### Instructions:
1. **Load the dataset** using `pandas`.
2. Use **Plotly Express** to create a **scatterplot**.
3. Set **prep_time** on the x-axis and **cook_time** on the y-axis.
4. **Label** the axes appropriately and add a **title** for clarity.

In [6]:
#Run this cell without changes 
import pandas as pd 
import plotly.express as px

In [7]:
# Read in the data (data/indian_food.csv). Hint: use pandas to read the CSV.

df = pd.read_csv('data/indian_food.csv')

In [22]:
# Create a scatterplot of prep_time on X-axis & cook_time on Y-axis 

fig = px.scatter(df, x='prep_time', y='cook_time', title='Indian Food: Prep Time vs Cook Time', 
                 labels = {'cook_time' : 'Cook Time', 'prep_time' : 'Prep Time'})

# Show the plot
fig.show()

### What insights did you get from Task 1? (Double-click to type answer)
Prep time is almost always shorter than cook time for the Indian dishes in this dataset.
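
A claim like this can be checked numerically rather than by eye. The sketch below uses a small made-up `sample` DataFrame (not the real `indian_food.csv`) and assumes that `-1` marks a missing time, so those rows are excluded before comparing the two columns.

```python
import pandas as pd

# Hypothetical rows standing in for indian_food.csv; -1 is assumed to mean "missing".
sample = pd.DataFrame({
    "name": ["dish_a", "dish_b", "dish_c", "dish_d", "dish_e"],
    "prep_time": [10, 15, 80, -1, 5],
    "cook_time": [25, 30, 20, 40, -1],
})

# Keep only rows where both times are recorded.
valid = sample[(sample["prep_time"] >= 0) & (sample["cook_time"] >= 0)]

# Share of the remaining dishes whose prep time is shorter than their cook time.
share_prep_shorter = (valid["prep_time"] < valid["cook_time"]).mean()
print(f"{len(valid)} valid rows, prep < cook in {share_prep_shorter:.0%} of them")
```

Running the same two steps on `df` would show how strong "almost always" really is for the full dataset.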

## Task 2: Bar Chart of Cook Time by Region

### Objective:
Create a bar chart that shows the average cook time for each region. This will help us understand how cook time varies across regions. **There is a "weird" bar in the chart. Why is that the case?**


### Instructions:
- Group the data by region. (Hint: you may need the `df.groupby()` method here.)

- Calculate the average cook time for each region.

- Create a bar chart using Plotly to show this average cook time for each region.

- Label the axes and title the chart.

In [23]:
# Task 2: Create a bar chart showing the average cook time by region
# Fill in the code to group by region and calculate the average cook time
df_region_avg = df.groupby('region')['cook_time'].mean().reset_index()

# Create the bar chart
fig = px.bar(df_region_avg, x='region', y='cook_time', title="Average Cook Time by Region",
             labels = {'cook_time': 'Cook Time', 'region': 'Region'})

# Show the plot
fig.show()


### What insights did you get from Task 2? (Double-click to type answer)
The average cook time for most regions is above 30 minutes. The exception is the North East, at roughly 14 minutes. One other bar is also under 30 minutes, but its region label is -1, which is the placeholder this dataset uses for a missing (NaN) value rather than a real region. That is the "weird" bar.
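
One possible way to keep the `-1` placeholder out of the averages is to convert it to a missing value before grouping. This is only a sketch on made-up data: the `sample` rows are hypothetical, and it assumes `-1` appears as the string `"-1"` in `region` and as the number `-1` in `cook_time`.

```python
import pandas as pd

# Hypothetical rows standing in for indian_food.csv.
sample = pd.DataFrame({
    "region": ["North", "North", "South", "-1", "South", "North East"],
    "cook_time": [30, 50, 40, 20, -1, 15],
})

cleaned = sample.copy()
# mask() replaces values with NaN wherever the condition is True.
cleaned["region"] = cleaned["region"].mask(cleaned["region"] == "-1")
cleaned["cook_time"] = cleaned["cook_time"].mask(cleaned["cook_time"] == -1)

# Drop rows with no region, then average; mean() skips missing cook times.
region_avg = (
    cleaned.dropna(subset=["region"])
    .groupby("region")["cook_time"]
    .mean()
    .reset_index()
)
print(region_avg)
```

Passing `region_avg` to `px.bar` in place of `df_region_avg` would then produce a chart without the `-1` bar, at the cost of silently hiding how many rows were missing.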

## Task 3: Pie Chart of Flavor Profile Distribution

### Objective:
Create a pie chart showing the distribution of flavor profiles (e.g., sweet, spicy) across the dataset.

### Instructions:
- Use Plotly Express to create a pie chart.

- Plot the flavor_profile column, which will show the distribution of flavor types.

- Ensure the chart is labeled clearly.

In [30]:
# Task 3: Create a pie chart showing the flavor profile distribution
# Fill in the code to create the pie chart -- look up documentation if needed 
# `labels` renames columns in the legend and hover text; it does not rename category values
fig = px.pie(df, names='flavor_profile', title='Indian Food: Flavor Profile',
             labels={'flavor_profile': 'Flavor Profile'})

# Show the plot
fig.show()


### What insights did you get from Task 3? (Double-click to type answer)
More than half of the dishes have a spicy flavor profile. Spicy and sweet together account for at least 86.7% of the dishes. I say "at least" because 11.4% fall in the -1 category, which marks a missing (NaN) value in that column, so some of those dishes may also be spicy or sweet. A very small percentage of dishes are sour.
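
The percentages a pie chart displays can also be computed directly, which makes it easier to see how the `-1` category changes them. The sketch below uses a made-up `sample` column (not the real data) and assumes `-1` is stored as the string `"-1"`.

```python
import pandas as pd

# Hypothetical flavor profiles standing in for the real column.
sample = pd.DataFrame({
    "flavor_profile": ["spicy", "spicy", "sweet", "-1", "spicy",
                       "sweet", "bitter", "sour", "spicy", "-1"],
})

# Shares including the -1 placeholder, matching what the pie chart shows.
shares_all = sample["flavor_profile"].value_counts(normalize=True)

# Shares among dishes with a known flavor profile only.
known = sample[sample["flavor_profile"] != "-1"]
shares_known = known["flavor_profile"].value_counts(normalize=True)

print(shares_all)
print(shares_known)
```

Comparing the two results shows the range in which the true share of each flavor could fall, depending on what the missing entries actually are.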