Plotly Intro

What is Plotly?
-Plotly is a free, open-source library. It supports several languages, including Python, JavaScript, R, and MATLAB.
     
What is Plotly used for?
-It is used for data visualization, web-based graphics, and interactive dashboards.

Basic plotly charts and interactions (non-technical)
-Plotly supports several types of charts: basic, financial, statistical, and more
-Charts available: 2D/3D scatter, line, bar, pie, heatmap, candlestick, funnel, and geospatial maps
-Interactions: hover tooltips, zooming, panning, click-events, and legend toggling for data exploration

Difference between Plotly graph objects (non-technical/just the basics) and Plotly Express
-Plotly Express is user-friendly and good for quick and simple charts. It is a built-in module of the plotly package (plotly.express) and is less customizable.
-Plotly Graph Objects is for visualizations that are customized or more complex. It is also a built-in module (plotly.graph_objects) and is more customizable.

In [25]:
import plotly.express as px
import plotly as pl
import plotly.io as pio
import pandas as pd
import numpy as np

JULIA

In [26]:
# First, if you haven't already, please download the data set from Canvas.
# The data set we chose looks at food delivery orders.
# We're going to bring it in and name it. We named it food because we're creative.
food = pd.read_csv('daily_food_delivery_orders.csv')
food

Unnamed: 0,order_id,order_date,customer_age,restaurant_type,order_value,delivery_distance_km,delivery_time_minutes,payment_method,delivery_partner_rating,order_status
0,1,2024-11-05,62,Indian,497.51,11.07,79,UPI,3.9,Cancelled
1,2,2024-08-20,35,Bakery,232.32,5.83,69,Wallet,2.7,Cancelled
2,3,2024-02-28,34,Italian,540.82,3.61,70,Wallet,3.4,Cancelled
3,4,2024-05-26,65,Cafe,1197.99,3.66,18,Card,4.6,Cancelled
4,5,2024-09-21,40,Indian,947.03,12.08,57,UPI,4.9,Delayed
...,...,...,...,...,...,...,...,...,...,...
2595,2596,2024-05-20,46,Cafe,738.51,9.12,31,Wallet,4.5,Cancelled
2596,2597,2024-05-15,56,Indian,421.78,8.29,66,Card,2.8,Delayed
2597,2598,2024-10-18,32,Cafe,1009.93,12.80,73,UPI,4.4,Delivered
2598,2599,2024-04-24,55,Bakery,240.97,13.56,56,Cash,4.3,Delivered


In [27]:
# First we're going to make a super basic bar chart
# To get there, we're going to clean up the data
# We're going to make a bar chart with the frequency of restaurant types
# So we're going to generate a count of the different restaurant types in a little table
# And name the columns of our table
status_counts = food['restaurant_type'].value_counts().reset_index()
status_counts.columns = ['restaurant_type', 'frequency']


#Now let's turn that table into a bar chart.
#We're also going to add a title and labels to make the chart easier to read
fig1 = px.bar(
    status_counts,
    x='restaurant_type',
    y='frequency',
    title='Frequency of Restaurant Type',
    labels={'restaurant_type': 'Restaurant Type', 'frequency': 'Count'}
   
)

fig1.show()

#You'll notice that if you hover over a bar, it displays the actual count
#But let's jazz this one up!

In [28]:
#We're going to add text on the bars, and color
#What is nice about color in Plotly is that you can just name a color like red
#If you have an exact color in mind, you can use its hex code instead.

fig1 = px.bar(
    status_counts,
    x='restaurant_type',
    y='frequency',
    title='Frequency of Restaurant Type',
    labels={'restaurant_type': 'Restaurant Type', 'frequency': 'Count'},
    text='frequency',
    color='restaurant_type',
    color_discrete_sequence=["brown", "blue", "red", "yellow", "orange", "green"]
)

fig1.show()
#Plus a fun bonus: if the legend is on, which it is here, you can click an entry to make it temporarily disappear!
  

DARLENE

In [29]:
status_counts = food['order_status'].value_counts().reset_index()
status_counts.columns = ['order_status', 'count']

# Create pie chart
fig2= px.pie(
    status_counts,
    names='order_status',
    values='count',
)

# Update traces to show the count (frequency) in black text
fig2.update_traces(
    textinfo='value',          # show the count on each slice
    textfont_color='black',    # black text
    title='<b>FREQUENCY OF ORDER STATUS</b>',  # <b> tags make the title bold
    # To italicize the title instead, use:
    # title='<i>FREQUENCY OF ORDER STATUS</i>',
)

# Update again so each slice shows its label together with the count
fig2.update_traces(
    textinfo='label+value',
    textfont_color='black'
)
fig2.show()

KAITLIN

In [30]:
# Instead of typing food.head(20) over and over, we will assign it to a variable called food20
food20 = food.head(20)
food20

Unnamed: 0,order_id,order_date,customer_age,restaurant_type,order_value,delivery_distance_km,delivery_time_minutes,payment_method,delivery_partner_rating,order_status
0,1,2024-11-05,62,Indian,497.51,11.07,79,UPI,3.9,Cancelled
1,2,2024-08-20,35,Bakery,232.32,5.83,69,Wallet,2.7,Cancelled
2,3,2024-02-28,34,Italian,540.82,3.61,70,Wallet,3.4,Cancelled
3,4,2024-05-26,65,Cafe,1197.99,3.66,18,Card,4.6,Cancelled
4,5,2024-09-21,40,Indian,947.03,12.08,57,UPI,4.9,Delayed
5,6,2024-03-16,51,Italian,835.75,3.56,85,UPI,3.8,Delayed
6,7,2024-11-20,52,Cafe,771.83,14.37,80,Card,3.7,Delivered
7,8,2024-11-24,52,Chinese,926.2,12.81,19,UPI,4.4,Delayed
8,9,2024-07-23,38,Bakery,548.11,13.54,42,Cash,4.1,Delayed
9,10,2024-07-01,24,Fast Food,177.18,8.15,29,Cash,2.9,Delivered


In [31]:
# We want to know what the keys are in this dataset.
food20.keys() 

Index(['order_id', 'order_date', 'customer_age', 'restaurant_type',
       'order_value', 'delivery_distance_km', 'delivery_time_minutes',
       'payment_method', 'delivery_partner_rating', 'order_status'],
      dtype='object')

In [32]:
# Let's look at the order date compared to the order value. The order date
# will be on the x axis and the order value will be on the y axis.
# With plotly, one thing you can do is include markers, or points, on the line.
fig = px.line(food20, x="order_date", y="order_value", markers=True)
fig.show()

In [33]:

# We will create a new column called in_order_dates so the dates can be put in order.
# pd.to_datetime converts the text in the order_date column into real dates.
# food.head(20) is a slice of food, so we take a copy first. Without the copy,
# pandas shows a SettingWithCopyWarning when we add the new column.
food20 = food20.copy()
food20['in_order_dates'] = pd.to_datetime(food20['order_date'])
food20 = food20.sort_values('in_order_dates')

In [34]:
# Now that the rows are sorted by date, we should see a nice line graph that reads in order from left to right.
fig = px.line(food20, x="order_date", y="order_value", markers=True)
fig.show()
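
The sorting step can be tried on a tiny hand-made table first. This is a sketch with made-up rows (the names `toy` and `first_two` are hypothetical), assuming the dates are stored as text like in the CSV. Taking .copy() of the slice is one common way to avoid the SettingWithCopyWarning.

```python
import pandas as pd

# Hypothetical toy orders, deliberately out of date order
toy = pd.DataFrame({
    "order_date": ["2024-11-05", "2024-02-28", "2024-08-20"],
    "order_value": [497.51, 540.82, 232.32],
})

# Work on an explicit copy of the slice so pandas does not warn
first_two = toy.head(2).copy()

# Convert the text dates to real dates, then sort the rows by them
first_two["in_order_dates"] = pd.to_datetime(first_two["order_date"])
first_two = first_two.sort_values("in_order_dates")

print(first_two)
```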

In [56]:
# Now we add an x-axis label and a y-axis label with the labels argument, and then customize the line itself.
# We will use the fig.update_traces() function to change how the traces look in the graph.
# First, the markers. marker=dict(...) passes a dictionary of marker settings.
# For the color, you can type a named color like red or blue, or use a hex code. You can also set the opacity and the size.
# The Plotly documentation lists all the marker symbols you can use.
# Next, the line works the same way: line=dict(...) passes a dictionary of line settings.
# For now, we set the width and color of the line.

fig = px.line(food20, x="order_date", y="order_value", markers=True,
    labels=dict(order_value="Order Value", order_date = "Order Date"))
 
fig.update_traces(marker=dict(color="blue", opacity=0.8, size=10, symbol="star-diamond" ),
                  line=dict(width=3, color="red"))
 
fig.show() 

BEN

In [36]:
fig1 = px.scatter(food, x = "delivery_distance_km", y = "delivery_time_minutes")
fig1.show()

#Plot the delivery distance vs the delivery time

In [37]:
fig2 = px.scatter(food, x = "delivery_distance_km", y = "delivery_time_minutes", color = "delivery_partner_rating")
fig2.show()

#Add color

In [38]:
fig3 = px.scatter(food, x = "delivery_distance_km", y = "delivery_time_minutes", color = "delivery_partner_rating", symbol = "order_status")
fig3.show()

#Add the order status

In [39]:
foodSample = food.sample(n = 50)
foodSample.head()

#Take a random sample of the data

Unnamed: 0,order_id,order_date,customer_age,restaurant_type,order_value,delivery_distance_km,delivery_time_minutes,payment_method,delivery_partner_rating,order_status
697,698,2024-07-08,28,Chinese,533.63,9.84,57,UPI,4.4,Cancelled
699,700,2024-04-04,49,Cafe,1146.08,5.67,74,UPI,3.0,Cancelled
163,164,2024-07-27,55,Indian,242.25,0.52,80,Wallet,4.9,Delayed
1303,1304,2024-02-06,32,Cafe,1009.96,11.91,75,Card,4.9,Delivered
1945,1946,2024-12-11,56,Indian,765.96,9.23,68,Card,3.8,Cancelled
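
Because sample() picks rows at random, the rows above will differ on every run. A minimal sketch of one way to make the sample repeatable, using the random_state argument of pandas' sample(); the table `toy` and the seed value 42 are arbitrary choices for illustration:

```python
import pandas as pd

# Hypothetical toy table with 100 order ids
toy = pd.DataFrame({"order_id": range(1, 101)})

# The same random_state gives the same rows every time
sample_a = toy.sample(n=5, random_state=42)
sample_b = toy.sample(n=5, random_state=42)

print(sample_a.equals(sample_b))  # True
```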


In [40]:
fig4 = px.scatter(foodSample, x = "delivery_distance_km", y = "delivery_time_minutes", color = "delivery_partner_rating", symbol = "order_status")
fig4.show()

In [41]:
fig5 = px.scatter(foodSample, x = "delivery_distance_km", y = "delivery_time_minutes", color = "delivery_partner_rating", symbol = "order_status")

fig5.update_traces(marker = dict (size = 20, opacity = 0.5, line = dict (width = 2, color = "Black")))

fig5.show()

#Increase the size and decrease the opacity of the points

In [42]:
fig6 = px.scatter(foodSample, x = "delivery_distance_km", y = "delivery_time_minutes",
                  color = "delivery_partner_rating", symbol = "order_status",
                  title = "Delivery Distance vs Delivery Time",
                  labels = {"delivery_distance_km" : "Delivery Distance (km)",
                            "delivery_time_minutes" : "Delivery Time (min)",
                            "delivery_partner_rating" : "Delivery Partner Rating"})

fig6.update_traces(marker = dict (size = 20, opacity = 0.5, line = dict (width = 2, color = "Black")))

fig6.show()

#Customize the title and axes. Use the "labels" argument to give the axes and the color bar readable names.

In [43]:
fig4 = px.scatter(foodSample, x = "delivery_distance_km", y = "delivery_time_minutes", color = "delivery_partner_rating", symbol = "order_status")
fig4.update_layout(legend = {'yanchor': "top", 'y': .98, 'xanchor': 'left', 'x': 1.13})
fig4.show()

In [44]:
fig6_dict = fig6.to_dict()
fig6_dict['layout']['legend']

{'title': {'text': 'order_status'}, 'tracegroupgap': 0}

In [45]:
fig6.update_layout(legend = {'yanchor': "top", 'y': .68, 'xanchor': 'left', 'x': 1.13})
fig6.show()

In [46]:
fig6_dict = fig6.to_dict()
fig6_dict['layout']['legend']

{'title': {'text': 'order_status'},
 'tracegroupgap': 0,
 'yanchor': 'top',
 'y': 0.68,
 'xanchor': 'left',
 'x': 1.13}

In [47]:
fig6.update_layout(legend = {'yanchor': "top", 'y': .68, 'xanchor': 'left', 'x': 1.13}, legend_title_text = "Order Status")
fig6.show()

#Change the legend's title

In [48]:
## Here we first create a new column called age_group by dividing customer ages into ranges using pd.cut(). Each range includes its upper edge, so the first group covers ages up to and including 18.
## This helps us analyze how different age groups behave. Then we create a histogram to visualize the distribution of order values.
## The x-axis shows the order value and the bars show how frequently different values occur. We also color the bars by order_status so we can compare things like delivered versus cancelled orders.
## To make comparisons easier, we split the chart into multiple smaller plots using facet_col, where each plot represents a different age group.
## Finally, we apply the template so the chart styling stays consistent with the rest of the visualizations.

PLOT_TEMPLATE = "plotly_white" 
# Order status colors 

status_colors = {
    "Delivered": "#2A9D8F",
    "Delayed": "#E9C46A",
    "Cancelled": "#E63946"
} 

# pd.cut bins are closed on the right by default, so the first bin (0, 18] includes age 18
food["age_group"] = pd.cut(
    food["customer_age"],
    bins=[0, 18, 25, 35, 45, 60, 100],
    labels=["0-18", "19-25", "26-35", "36-45", "46-60", "61+"]
)

fig = px.histogram(
    food,
    x="order_value",
    color="order_status",
    color_discrete_map=status_colors,
    facet_col="age_group",
    facet_col_wrap=3,
    nbins=30,
    title="Order Value Distribution by Age Group (Faceted) and Status",
    labels={"order_value": "Order Value"}
)

fig.update_layout(template=PLOT_TEMPLATE)
fig.show()
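
To see exactly which ages land in which group, here is a small sketch that runs pd.cut with the same bins on a few hand-picked ages sitting on the bin edges. The labels are left out so pandas prints the actual intervals; the Series `ages` is made up for illustration.

```python
import pandas as pd

# Hypothetical ages chosen to sit on or next to the bin edges
ages = pd.Series([17, 18, 19, 25, 26, 60, 61])

# Same bins as the age_group column, but without labels,
# so each value is shown with the interval it falls into
groups = pd.cut(ages, bins=[0, 18, 25, 35, 45, 60, 100])

print(pd.DataFrame({"age": ages, "interval": groups}))
```

By default each interval is open on the left and closed on the right, so 18 falls in the first group and 60 falls in the fifth. Passing right=False to pd.cut flips that behavior.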