# Data Science Thursdays: Plotly Express Tutorial

---

In this tutorial, we will apply what we just learned about Plotly Express by creating a visualization for each of the scenarios below.

Let's start by importing the Plotly Express library, along with the pandas library to read in and process our datasets. We will also use built-in datasets that ship with Plotly Express throughout this tutorial.

If at any time you need help, you can always reference the Plotly Express [documentation](https://plotly.github.io/plotly_express/plotly_express/).

In [None]:
import pandas as pd
import plotly_express as px

# Used to hide warnings
import warnings
warnings.filterwarnings('ignore')

### Question 1: Visualizing the Relationship Between Gender and BMI in a Stroke Patient Population

**Scenario:** Your manager, Chris, asked you to create a scatter graph using the `stroke.csv` dataset you got last week. 

**1st Visualization:** He said he wants the scatter graph to have `id` on the x-axis and `bmi` on the y-axis. Can you create a scatter graph with the above specifications for Chris?

In [None]:
# Read in your dataset using the pandas library
stroke_df = pd.read_csv("stroke.csv")

stroke_df

In [None]:
# Plot your graph using the plotly_express library
stroke_scatter_one_viz =

stroke_scatter_one_viz

**2nd Visualization:** Chris now wants the patient data points in the graph you just created to have different colors and symbols. Patients who had a `stroke` should be distinguished by a different symbol on the scatter plot, and the data points should be colored by `gender`. Finally, he would like the graph to have a height of `600` and a width of `800`. Can you copy your graph from above, paste it below, and add these changes?

In [None]:
stroke_scatter_two_viz = 

stroke_scatter_two_viz

---

### Question 2: Machine Learning Iris Dataset Visualizations

**Scenario:** For your first machine learning project, your manager, Veena, has asked you to display several visualizations for the iris dataset found in the plotly_express library.

**1st Visualization:** Create a scatter plot using `iris_df` with `sepal_width` on the x-axis and `sepal_length` on the y-axis, colored by the values in the `species` column.

In [None]:
# The following dataset is already assigned to the variable iris_df
iris_df = px.data.iris()

iris_df

In [None]:
# Add your first visualization here
first_viz = 

first_viz

**2nd Visualization:** Veena now wants you to take the visualization you created above and add a `histogram` subplot on the x-axis margin and a `rug` subplot on the y-axis margin.

In [None]:
# Add your visualization here
second_viz = 

second_viz

**3rd Visualization:** For your third visualization, Veena wants you to take the visualization you created above but change the types of subplots. This time, she wants a `box` subplot on the x-axis margin and a `violin` subplot on the y-axis margin. Additionally, Veena would like you to add an [Ordinary Least Squares](https://en.wikipedia.org/wiki/Ordinary_least_squares) (`ols`) regression trendline to the original plot.

In [None]:
# Add your visualization here
third_viz = 

third_viz

**4th Visualization:** Veena now wants you to create a scatter matrix plot for `iris_df` where the dimensions are "sepal_width", "sepal_length", "petal_width", and "petal_length", and the points are colored by the "species" column.

In [None]:
# Add your visualization here
fourth_viz = 

fourth_viz

**5th Visualization:** Veena would now like you to create a density heatmap for `iris_df`. The heatmap should have "sepal_width" on its x-axis and "sepal_length" on its y-axis.

In [None]:
# Add 5th visualization here

fifth_viz = 

fifth_viz

**6th Visualization:** For the last visualization, Veena would like you to create a density contour plot where "sepal_width" is on the x-axis and "sepal_length" is on the y-axis. Additionally, the color should be specified by "species".

In [None]:
# Add your visualization below
sixth_viz = 

sixth_viz

---

### Question 3: Exploratory Analysis of Tobacco Use in the United States

**Scenario:** Your managers, Ahsan and Neil, want to conduct an exploratory analysis on tobacco use using the CDC's Behavioral Risk Factor Surveillance System dataset called `tobacco_use.csv`.

**1st Visualization:** For your first visualization, Neil would like you to make a bar graph that displays the "States" on the x-axis and "Smokers" on the y-axis. This will show us the number of smokers by state. However, he only wants to see the number of smokers in the year 2010. You can select the rows for a given year with pandas like this: `tobacco_df[tobacco_df["Year"] == 2010]`.

In [None]:
# Read in your dataset using the pandas library
tobacco_df = # Read in dataset here

tobacco_df

In [None]:
# Add your visualization here
tobacco_viz_one = 

tobacco_viz_one

**2nd Visualization:** Ahsan wants you to create a visualization similar to Neil's above, but using a polar bar graph. He wants `r` to be `Non-smokers`, `theta` to be `State`, and `color` to be `Year`.

In [None]:
# Add your visualization here
tobacco_viz_two = 

tobacco_viz_two

--- 

### Question 4: Exploring ChapStick and Competitors in 3D

**Scenario:** Your manager, Eric, wants to conduct an exploratory analysis for the Pfizer Consumer Healthcare product ChapStick. He wants to visualize the number of units sold in comparison to ChapStick's competitors, Carmex Lip Balm and Burt's Bees Lip Balm, in 3 dimensions using the dataset `chapstick_and_comp.csv`.

**1st Visualization:** For your first visualization, Eric would like you to create a 3D scatter plot that displays `Burt's Beeswax` on the x-axis, `ChapStick` on the y-axis, and `Carmex Lip Balm` on the z-axis. He also wants the points to be colored by that day's winning product (`Winner`) and sized by the `Total` column.

In [None]:
# Read in your dataset using the pandas library
chapstick_and_comp_df = # Read in dataset here

chapstick_and_comp_df

In [None]:
# Add your visualization here
chapstick_viz_one =

chapstick_viz_one

**2nd Visualization:** For Eric's last visualization, he wants you to create a ternary scatter plot very similar to the 1st visualization above. He wants the `hover_name` to be `State` and `size_max` to be set to `15`.

In [None]:
chapstick_viz_two =

chapstick_viz_two