# **The Healthy Breakfast Challenge**
 As a nutritionist, you have been hired by local schools to improve the breakfast habits of students. Recent studies have shown that having a nutritious breakfast greatly improves students' energy levels, concentration, and overall performance in school. However, the market is flooded with a variety of cereal brands, each claiming to be the best choice. With the goal of maximizing nutritional value, you need to make informed decisions on which cereal brand to recommend to the local schools.

 **Overview of your Mission**
 - **Milestone 1: Data Cleaning** <br> First up, you will dive into the cereal dataset. Just like detectives, you'll need to understand and clean this data. This means identifying and fixing any mistakes. This ensures you will be working with reliable data. <br>
 - **Milestone 2: Data Visualization** <br> With our data cleaned, it's time to create graphs to show the relationship between different nutritional elements in cereals.
 - **Milestone 3: Data Interpretation** <br> Finally, you need to interpret your findings. This is where you will make sense of your graphs and pick the best three cereals to recommend to local schools.

### **Milestone 1: Data Cleaning**

As a nutritionist, your first task is crucial. Before you can make any recommendations, you need a solid foundation, and that means starting with clean data. Looking at the cereal dataset below, follow these instructions:

1. **Spot the clues:** First, scan through the cereal dataset for any bizarre entries. Are there any cereals with missing information? Are there numbers that just don't make sense?
2. **The Clean-up:** Once you identify the bizarre entries, you can start the clean-up process. This can include filling in missing data, correcting inaccuracies, or even removing entries that just don't fit. You may need to research the cereal brand to input the correct data.
3. **Final Inspection:** Take one last look at the cereal dataset so you don't miss anything.

![Data cleaning table should be displayed here](images/table1_cleaning.png)
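
If you would like to see what these three steps could look like in code, here is a minimal sketch using pandas. The rows in `sample_df` are made up for illustration and are not taken from the real cereal dataset. Dropping the bad rows is only one possible clean-up choice; researching the cereal and typing in the correct value is just as valid.

```python
import pandas as pd

# Made-up sample rows for illustration only; not taken from the real cereal dataset.
sample_df = pd.DataFrame(
    {
        "name": ["Cereal A", "Cereal B", "Cereal C", "Cereal D", "Cereal E"],
        "sugars": [6.0, None, -1.0, 12.0, 3.0],
        "protein": [4.0, 2.0, 3.0, None, 5.0],
    }
)
numeric_cols = ["sugars", "protein"]

# Step 1: spot the clues. Count the missing values in each column.
missing_counts = sample_df.isna().sum()
print(missing_counts)

# Numbers that don't make sense: a nutrient amount cannot be negative.
negative_rows = sample_df[(sample_df[numeric_cols] < 0).any(axis=1)]
print(negative_rows)

# Step 2: the clean-up. Keep only rows where every value is present and not negative.
is_valid = (
    sample_df[numeric_cols].notna().all(axis=1)
    & (sample_df[numeric_cols] >= 0).all(axis=1)
)
clean_df = sample_df[is_valid].reset_index(drop=True)

# Step 3: final inspection. Look at what is left.
print(clean_df)
```

In this sample, only Cereal A and Cereal E survive the clean-up, because the other three rows each contain a missing or negative value.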

### **Milestone 2: Data Visualization**
As you've seen, our dataset includes nutritional information for various cereals.

##### <u>**Task 1:**</u> Given the data table below, choose a type of graph that you think will best represent the relationship between **sugars** and **rating**.

![First 15 rows of the cereal data should be displayed here](images/table_tograph.png) <br>

##### What type of graph did you choose?  ____________________________________________________

##### <u>**Task 2:**</u> Examine the three graphs below. Choose the graph that you think best represents the relationship between the sugar content and vitamin content in cereals. 

A) Scatter Plot <br>
![Graph 1: Scatter Plot Graph should be displayed here](images/scatter.png)<br>
B) Bar Graph<br>
![Graph 2: Bar Graph should be displayed here](images/bar.png)<br>
C) Line Graph <br>
![Graph 3: Line Graph should be displayed here](images/line.png)<br>


##### Which graph best represents the relationship between sugar and vitamins? _________________



##### <u>**Task 3:**</u> Now, it's time to learn how to code your own graph. Your task is to create a graph that best represents the relationship between **sugars** and **rating**. To code your own graph, follow the steps below.

1. Run the set-up code below.

In [9]:
# Set-up Code
import plotly.express as px
import pandas as pd

# Load data
cereal_df = pd.read_csv("csv_files/cereal.csv")

2. Choose the type of graph you want to visualize. Here are your options:
    - For a <u>bar graph</u>: Write `px.bar`
    - For a <u>line graph</u>: Write `px.line`
    - For a <u>scatter plot</u>: Write `px.scatter`

Example of code: `chosen_graph1 = px.line`

In [16]:
# ** STUDENT SECTION BEGINS **

chosen_graph1 = px.line   # Change as needed.

# ** STUDENT SECTION ENDS **

3. Run the code below to see your graph. 

In [None]:
# Run this code to see your graph
fig = chosen_graph1(
    cereal_df,
    x="sugars",
    y="rating",
    hover_name="name",
    title="Relationship between Sugars and Rating",
)

# Show the plot
fig.show()

# If the figure does not show, uncomment the next line.
# fig.write_html("graph1.html") # saves the graph as an HTML file you can open in a browser

4. Hover over the graph, and note your observations. 

<br>

5. Go back to step 2 and choose a different graph type. Repeat steps 2 to 4 until you have coded all three types of graphs.

<br>


6. Using the graphs you created, answer the following questions: <br> <br>
    A. Which graph best represents the relationship between **sugars** and **rating**? ______________________________________ <br> <br>
    B. Which cereal had the highest **rating**? _______________________________________________ <br> <br>
    C. Which cereal had the least amount of **sugar**? _________________________________________<br> <br>
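
Once you have answered by hovering, you may want to double-check questions B and C with code. The sketch below shows one way to do that with pandas. The rows in `sample_df` are made up for illustration; to check your real answers you would swap in `cereal_df` from the set-up code.

```python
import pandas as pd

# Made-up sample rows for illustration only; swap in cereal_df to check real answers.
sample_df = pd.DataFrame(
    {
        "name": ["Cereal A", "Cereal B", "Cereal C"],
        "sugars": [1, 5, 12],
        "rating": [45.5, 68.2, 30.1],
    }
)

# idxmax / idxmin give the row label of the largest / smallest value in a column.
highest_rated = sample_df.loc[sample_df["rating"].idxmax(), "name"]
least_sugar = sample_df.loc[sample_df["sugars"].idxmin(), "name"]

print("Highest rating:", highest_rated)
print("Least sugar:", least_sugar)
```

If several cereals tie for the lowest sugar, `idxmin` only reports the first one it finds, so it is still worth looking at the graph.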


##### <u>**Task 4:**</u> Now that you know how to create different types of graphs, it's time to expand your knowledge. You are now going to learn how to graph relationships between different attributes. Examples of attributes are protein, fat, and sugars. Look at the cereal dataset to determine which attributes to use. This section will help you determine which cereals to recommend to the school board. To code your graph, follow the steps below.

1. Run the set-up code below.


In [6]:
# Set-up Code
import plotly.express as px
import pandas as pd

# Load data
cereal_df = pd.read_csv("csv_files/cereal.csv")

2. Choose the type of graph you want to visualize. Here are your options:
    - For a <u>bar graph</u>: Write `px.bar`
    - For a <u>line graph</u>: Write `px.line`
    - For a <u>scatter plot</u>: Write `px.scatter`

Example of code: `chosen_graph2 = px.line`

In [7]:
# ** STUDENT SECTION BEGINS **

chosen_graph2 = px.scatter  # Change as needed.

# ** STUDENT SECTION ENDS **

3. Choose two attributes to use in your graph. Remember to look at the cereal dataset to choose your attributes!
    - Which attribute is on the x-axis?
    - Which attribute is on the y-axis?

Example of code: The following will create a graph that shows the relationship between **sugars** and **vitamins** <br>
 `x_attribute = "sugars"` <br>
`y_attribute = "vitamins"`

**<span style="color: red">Note: Be careful of spelling!</span>**
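
One possible way to catch a spelling mistake before you draw the graph is sketched below. The helper `check_attribute` is hypothetical (it is not part of pandas or plotly), and `known_columns` only lists the column names mentioned in this activity, so the real CSV may contain more.

```python
import difflib

# Column names mentioned in this activity; the real CSV may contain more (assumption).
known_columns = ["name", "sugars", "rating", "vitamins", "fiber", "protein", "fat"]


def check_attribute(attribute, columns):
    """Return the matching column name, or raise an error with a spelling hint."""
    cleaned = attribute.strip().lower()
    if cleaned in columns:
        return cleaned
    # Look for the closest column name to suggest as a correction.
    suggestions = difflib.get_close_matches(cleaned, columns, n=1)
    hint = f" Did you mean '{suggestions[0]}'?" if suggestions else ""
    raise ValueError(f"'{attribute}' is not a column in the dataset.{hint}")


print(check_attribute("Sugars", known_columns))  # prints: sugars
```

With the real data loaded, you could pass `list(cereal_df.columns)` instead of `known_columns`, so the check uses the actual column names.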

In [8]:
# ** STUDENT SECTION BEGINS **

x_attribute = "sugars"      # Change as needed
y_attribute = "vitamins"    # Change as needed

# ** STUDENT SECTION ENDS **

4. Run the code below to see your graph. 

In [None]:
# Run this code to see your graph
fig = chosen_graph2(
    cereal_df,
    x=x_attribute.lower(),
    y=y_attribute.lower(),
    hover_name="name",  # Show cereal name on hover
    title=f"Relationship between {x_attribute.capitalize()} and {y_attribute.capitalize()}",
)

# Show the plot 
fig.show()

# If the figure does not show, uncomment the next line.
# fig.write_html("graph2.html") # saves the graph as an HTML file you can open in a browser

5. Use your mouse to hover over the points on the graph. Use what you see to answer the question in the next step.

6. Based on your graph, which three cereals should you recommend to local schools?
<br><br>
    1. ____________________________________________________________________ <br> <br>
    2. ____________________________________________________________________ <br> <br>
    3. ____________________________________________________________________ <br> <br>

### **Milestone 3: Data Interpretation**

##### <u>**Task 1:**</u> Given a graph, find cereals that are rich in fiber and protein but low in sugar and fat.

1. Run the code below to generate the graph.

In [None]:
# Run this code to generate graph

# Set-up Code
import plotly.express as px
import pandas as pd

# Load data
cereal_df = pd.read_csv("csv_files/cereal.csv")

# Create graph
graph = px.scatter(
    cereal_df,
    x="fiber",  # Fiber on the x-axis
    y="protein",  # Protein on the y-axis
    size="fat",  # Represent fat content with the size of the marker (0 g of fat draws no visible marker)
    color="sugars",  # Represent sugar level with color
    hover_name="name",  # Show cereal name when you hover over a point
    title="Cereals: Fiber & Protein vs. Sugar & Fat",
    labels={
        "fiber": "Fiber (g)",
        "protein": "Protein (g)",
        "sugars": "Sugar (g)",
        "fat": "Fat (g)",
    },
)

# Show graph
graph.show()

2. Analyze the graph. Use your mouse to hover over the points to see more information. Write down your observations. <br> 
    a. What observations can you make about this graph? <br>
    b. What do you notice about the placement of each circle along the x and y axes? <br>
    c. What do you notice about the size of each circle? <br>
    d. What do you notice about the colour of each circle? <br>



3. Determine which 3 cereals are high in fiber and protein but low in sugar and fat that you can recommend to local schools. <br>
    1. _______________________________________________________ <br> <br>
    2. _______________________________________________________ <br> <br>
    3. _______________________________________________________ <br> <br>
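
To compare your picks against the numbers, here is a minimal sketch of the same search done with pandas. The rows in `sample_df` are made up for illustration, and the rule that "high" means at or above the median and "low" means at or below it is an assumption; you may decide on different cut-offs.

```python
import pandas as pd

# Made-up sample rows for illustration only; swap in cereal_df to check real answers.
sample_df = pd.DataFrame(
    {
        "name": ["Cereal A", "Cereal B", "Cereal C", "Cereal D"],
        "fiber": [10, 1, 5, 0],
        "protein": [4, 1, 3, 2],
        "sugars": [5, 12, 3, 14],
        "fat": [1, 2, 0, 3],
    }
)

# Assumption: "high" means at or above the column median, "low" means at or below it.
medians = sample_df[["fiber", "protein", "sugars", "fat"]].median()

candidates = sample_df[
    (sample_df["fiber"] >= medians["fiber"])
    & (sample_df["protein"] >= medians["protein"])
    & (sample_df["sugars"] <= medians["sugars"])
    & (sample_df["fat"] <= medians["fat"])
]

# Put the most fiber-rich candidates first, then keep at most three.
top_three = candidates.sort_values("fiber", ascending=False).head(3)
print(top_three["name"].tolist())
```

In this sample only two cereals pass all four checks, so the list has two names. With the real dataset you may need to loosen or tighten the cut-offs to end up with exactly three.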
