 # Plotly 

 - Make some interactive plots.
 - [Reference](https://towardsdatascience.com/why-you-must-use-plotly-express-for-data-visualization-9d47f9182807)

In [1]:
import plotly.express as px

In [6]:
# Basic syntax (generic pattern; replace <graph_type> with scatter, histogram, box, ...):
#   px.<graph_type>(dataset, x=..., y=..., color=..., size=..., title=...)

# Creating a basic scatterplot
px.scatter(x=[1,2,3,4,5], y=[6,7,8,9,10], title='My First Graph')

In [7]:
import seaborn as sns
df = sns.load_dataset('tips')

In [8]:
df.head()

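| | total_bill | tip | sex | smoker | day | time | size |
|---|---|---|---|---|---|---|---|
| 0 | 16.99 | 1.01 | Female | No | Sun | Dinner | 2 |
| 1 | 10.34 | 1.66 | Male | No | Sun | Dinner | 3 |
| 2 | 21.01 | 3.5 | Male | No | Sun | Dinner | 3 |
| 3 | 23.68 | 3.31 | Male | No | Sun | Dinner | 2 |
| 4 | 24.59 | 3.61 | Female | No | Sun | Dinner | 4 |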


In [10]:
df.describe()

| | total_bill | tip | size |
|---|---|---|---|
| count | 244.0 | 244.0 | 244.0 |
| mean | 19.785943 | 2.998279 | 2.569672 |
| std | 8.902412 | 1.383638 | 0.9511 |
| min | 3.07 | 1.0 | 1.0 |
| 25% | 13.3475 | 2.0 | 2.0 |
| 50% | 17.795 | 2.9 | 2.0 |
| 75% | 24.1275 | 3.5625 | 3.0 |
| max | 50.81 | 10.0 | 6.0 |


In [11]:
df.shape

(244, 7)

## Univariate

In [12]:
# Histogram
fig = px.histogram(df, x='tip')
fig.update_layout(bargap=0.1)
fig.show()

Notice that in the code above the figure is assigned to a variable named `fig`. Make this a habit: when you want to customize the visualization afterwards (for example, the gap between bars set above), you refer to the figure through that variable and call its update methods, such as `fig.update_layout(...)`.

In [13]:
# Boxplot
fig = px.box(df, y='tip')
fig.show()

Creating a boxplot is as easy as any other graphic. The code above uses only the basic syntax, `px.box(df, y='tip')`, and the plot is ready. Reading the maximum, minimum, and median is just as easy: hover over the box. We can quickly see that the distribution is skewed to the right and that the tip values contain some outliers.
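To make the hover values more concrete, here is a minimal sketch of the numbers a boxplot summarizes, computed with the standard library on a made-up sample (not the `tips` dataset). It assumes the common 1.5 × IQR rule for flagging outliers and the `inclusive` quartile method; Plotly may use a slightly different quartile convention, so its hover values can differ a little.

```python
import statistics

# Illustrative tip values (hypothetical sample, not the full dataset)
sample_tips = [1.0, 1.5, 2.0, 2.0, 2.5, 3.0, 3.5, 4.0, 10.0]

# Quartiles; 'inclusive' treats the sample as the whole population
q1, median, q3 = statistics.quantiles(sample_tips, n=4, method='inclusive')
iqr = q3 - q1

# Fences at 1.5 * IQR beyond the box; points outside them count as outliers
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [t for t in sample_tips if t < lower_fence or t > upper_fence]

# A mean above the median is a quick hint of right skew
mean = statistics.mean(sample_tips)
print(q1, median, q3, outliers, mean > median)
```

In this sample the single large tip of 10.0 is the only value flagged, and it is also what pulls the mean above the median.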

You can also plot a violin plot quickly using the code below.


In [14]:
# Violin plot
px.violin(df, x='tip')


Or plot an empirical cumulative distribution function (ECDF). This plot shows, for each tip value, the share of observations at or below it. Hovering over the curve shows, for example, that about 50.4% of the tips are at or below 2.92.


In [16]:
# ECDF plot
px.ecdf(df, x='tip')
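What the ECDF curve reports at a given point can be sketched in a few lines of plain Python. The helper name `ecdf` and the sample values are assumptions for illustration only; they are not part of Plotly or of the `tips` dataset.

```python
def ecdf(values, x):
    """Return the fraction of values that are less than or equal to x."""
    return sum(v <= x for v in values) / len(values)

# Illustrative sample (hypothetical values)
sample_tips = [1, 2, 2, 3, 4, 5, 5, 6, 8, 10]

# 4 of the 10 values are <= 3, so the curve would sit at 40% there
print(ecdf(sample_tips, 3))
```

Reading the plot works the same way: pick a tip value on the x-axis and the y-axis gives the share of observations at or below it.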

---
## Multivariate

In this case, we will use two or more variables to compare or analyze relationships between them. Common examples of those are scatterplots, line plots, barplots, multiple boxplots etc.

Let’s check the relationship between `tip` and `total_bill`.

In [18]:
# Scatter plot comparing Tips vs. Total Bill
fig = px.scatter(df, x='total_bill', y='tip', size='total_bill')
fig.show()

The plot shows a positive, roughly linear relationship between `total_bill` and `tip`: as one increases, the other tends to go up as well. Notice that we also used `total_bill` as the `size` parameter, so the markers get larger as the bill amount increases.
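A visual impression of correlation can be backed by a number. The sketch below computes the Pearson correlation coefficient by hand for the five rows shown by `df.head()`; the helper name `pearson_r` is an assumption for illustration, and five rows are far too few to represent the whole dataset.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# First five rows of the dataset, as shown by df.head()
total_bill = [16.99, 10.34, 21.01, 23.68, 24.59]
tip = [1.01, 1.66, 3.50, 3.31, 3.61]

r = pearson_r(total_bill, tip)
print(round(r, 2))
```

On the full dataframe, pandas offers the same measure as `df['total_bill'].corr(df['tip'])`.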

In [19]:
# Including another variable
px.scatter(df, x='total_bill', y='tip', size='total_bill', color='sex')

We can also add the `color` parameter to see how men and women tip. In this dataset the difference does not look very large, except for a couple of outliers on the right side. A quick check of the numbers confirms that the means are not far apart (men: 3.08, women: 2.83).
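The "quick check" of the means is a group-by followed by an average. As a rough sketch of that computation in plain Python, the example below uses only the five rows shown by `df.head()`, so its results will not match the full-dataset means quoted above.

```python
from collections import defaultdict

# (sex, tip) pairs from the first five rows shown by df.head()
rows = [('Female', 1.01), ('Male', 1.66), ('Male', 3.50),
        ('Male', 3.31), ('Female', 3.61)]

# Collect the tips of each group
tips_by_sex = defaultdict(list)
for sex, tip in rows:
    tips_by_sex[sex].append(tip)

# Average each group
mean_tip_by_sex = {sex: sum(t) / len(t) for sex, t in tips_by_sex.items()}
print(mean_tip_by_sex)
```

On the full dataframe the equivalent one-liner is `df.groupby('sex')['tip'].mean()`.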


To create a bar plot of the mean tip by party size, you first need to build a grouped dataframe and then plot it. This is less practical than plotting with Pandas, where you can aggregate and plot in a single line of code, but the Pandas result is not as polished as this one.

In [20]:
# Mean tip by size
mean_tip = df.groupby('size').tip.mean().reset_index()
px.bar(mean_tip, x='size', y='tip')

In [21]:
# 3D Plot total_bill vs. tip vs. size
fig = px.scatter_3d(df, x='total_bill', y='tip', z='size', color='sex')
fig.show()

One interesting thing about 3D plots is that they can reveal insights you would probably miss in a 2D plot. Notice that, in this graphic, for a party size of 2 the reds and blues are more balanced, suggesting that there could be more couples having lunch or dinner. For the other party sizes, males appear in higher numbers.

3D graphics are very cool, but they should be used with care, since they are not as easy to read as they look. It is easy to lose focus when looking at one of these plots.

---

## Sample Dashboard

Using Plotly, it is easy to create a Dashboard. 

The difference from Plotly Express is that you build the figure with `plotly.graph_objects` (imported as `go`) instead of `plotly.express`. The code below can be used as a template to create your own visualizations.


You can see that the code itself is pretty similar to the px graphics. With a couple of adaptations, you can create a nice “dashboard” view of a dataset.

In [22]:
import plotly.graph_objects as go
from plotly.subplots import make_subplots

# Create the subplots area with 4 rows and 2 columns (8 plots)
fig = make_subplots(rows=4, cols=2,
                    # Graph types: 'domain' for the pie chart, 'xy' for the rest
                    specs=[ [{'type': 'domain'}, {'type': 'xy'}],
                            [{'type': 'xy'}, {'type': 'xy'}],
                            [{'type': 'xy'}, {'type': 'xy'}],
                            [{'type': 'xy'}, {'type': 'xy'}] ],
                    # Titles for each graphic
                    subplot_titles=("| GENDER", "| TIP by PARTY SIZE",
                                    "| TIP by GENDER", "| TIP by SMOKER/NOT",
                                    "| TIP vs TOTAL_BILL", "| TIP by DAY",
                                    "| TIP by TIME", "| TOTAL by TIME"))


# Title for the Dashboard
fig.update_layout(height=800, width=1400, title_text="DASHBOARD : RESTAURANT")
fig.update_layout(showlegend=False)


# Plot 1 | Clients by Gender
df_sex = df.groupby('sex').total_bill.count().reset_index()
fig.add_trace(go.Pie(labels=df_sex.sex, values=df_sex.total_bill),
              row=1, col=1)

# Plot 2 | Median tip by Party Size
median_tip = df.groupby('size').tip.median().reset_index()
fig.add_trace(go.Bar(x= median_tip['size'], y=median_tip.tip, name='(Size, MEDIAN)'),
              row=1, col=2)


# Plot 3 | Tips by Gender
fig.add_trace(go.Box(x=df.sex, y=df.tip),
              row=2, col=1)


# Plot 4 | TIP by SMOKER/NOT
tip_smoker = df.groupby('smoker').tip.median().reset_index()
fig.add_trace(go.Bar(x=tip_smoker['smoker'], y=tip_smoker.tip, name='(Smoker, MEDIAN)'),
              row=2, col=2)

# Plot 5 | Tips vs. Total_bill
fig.add_trace(go.Scatter(x=df.total_bill, y=df.tip, mode='markers', marker=dict(size=10, opacity=.5)),
              row=3, col=1)

# Plot 6 | Tips vs. Day
tip_day = df.groupby('day').tip.median().reset_index()
fig.add_trace(go.Bar(x= tip_day['day'], y=tip_day.tip, name='(Day, MEDIAN)'),
              row=3, col=2)

# Plot 7 | Tips vs. Time
tip_time = df.groupby('time').tip.median().reset_index()
fig.add_trace(go.Bar(x= tip_time['time'], y=tip_time.tip, name='(Time, MEDIAN)'),
              row=4, col=1)

# Plot 8 | Total vs. Time
tip_time2 = df.groupby('time').total_bill.median().reset_index()
fig.add_trace(go.Bar(x=tip_time2['time'], y=tip_time2.total_bill, name='(Time, MEDIAN)'),
              row=4, col=2)

fig.show()