Plotly library: Plotly's Python graphing library makes interactive, publication-quality graphs online. It can produce line plots, scatter plots, area charts, bar charts, error bars, box plots, histograms, heatmaps, subplots, multiple-axes, polar charts, and bubble charts.

In [1]:
import pandas as pd
import plotly.express as px

In [2]:
diamonds = pd.read_csv("diamonds.csv")

In [3]:
diamonds

,carat,cut,color,clarity,depth,table,price,x,y,z
0,0.23,Ideal,E,SI2,61.5,55.0,326,3.95,3.98,2.43
1,0.21,Premium,E,SI1,59.8,61.0,326,3.89,3.84,2.31
2,0.23,Good,E,VS1,56.9,65.0,327,4.05,4.07,2.31
3,0.29,Premium,I,VS2,62.4,58.0,334,4.20,4.23,2.63
4,0.31,Good,J,SI2,63.3,58.0,335,4.34,4.35,2.75
...,...,...,...,...,...,...,...,...,...,...
53935,0.72,Ideal,D,SI1,60.8,57.0,2757,5.75,5.76,3.50
53936,0.72,Good,D,SI1,63.1,55.0,2757,5.69,5.75,3.61
53937,0.70,Very Good,D,SI1,62.8,60.0,2757,5.66,5.68,3.56
53938,0.86,Premium,H,SI2,61.0,58.0,2757,6.15,6.12,3.74


In [4]:
diamonds.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 53940 entries, 0 to 53939
Data columns (total 10 columns):
 #   Column   Non-Null Count  Dtype  
---  ------   --------------  -----  
 0   carat    53940 non-null  float64
 1   cut      53940 non-null  object 
 2   color    53940 non-null  object 
 3   clarity  53940 non-null  object 
 4   depth    53940 non-null  float64
 5   table    53940 non-null  float64
 6   price    53940 non-null  int64  
 7   x        53940 non-null  float64
 8   y        53940 non-null  float64
 9   z        53940 non-null  float64
dtypes: float64(6), int64(1), object(3)
memory usage: 4.1+ MB


In [5]:
diamonds.describe()

,carat,depth,table,price,x,y,z
count,53940.0,53940.0,53940.0,53940.0,53940.0,53940.0,53940.0
mean,0.79794,61.749405,57.457184,3932.799722,5.731157,5.734526,3.538734
std,0.474011,1.432621,2.234491,3989.439738,1.121761,1.142135,0.705699
min,0.2,43.0,43.0,326.0,0.0,0.0,0.0
25%,0.4,61.0,56.0,950.0,4.71,4.72,2.91
50%,0.7,61.8,57.0,2401.0,5.7,5.71,3.53
75%,1.04,62.5,59.0,5324.25,6.54,6.54,4.04
max,5.01,79.0,95.0,18823.0,10.74,58.9,31.8


# Histograms in Plotly Express

A histogram is probably the very first chart people learn.

It sorts the values of a variable into bins (contiguous intervals). Each bar's height then shows how many values fall into the corresponding bin.

In [6]:
fig = px.histogram(
  diamonds,
  x="price",
  title="Histogram of diamond prices",
  width=600,
  height=400,
)

fig.show()
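
To make the binning step concrete, here is a minimal sketch in plain Python. The short price list and the 5,000-dollar bin width are assumptions chosen for illustration; px.histogram chooses its own bins unless told otherwise.

```python
from collections import Counter

# Small illustrative list of prices (an assumption, not the full dataset).
prices = [326, 334, 950, 1200, 2401, 2757, 4838, 5324, 8820, 16309]
bin_width = 5000  # assumed bin width in dollars

# Map each price to the lower edge of its bin, then count values per bin.
counts = Counter((p // bin_width) * bin_width for p in prices)

# Print one line per non-empty bin: the interval and how many prices it holds.
for lower in sorted(counts):
    print(f"[{lower}, {lower + bin_width}): {counts[lower]}")
```

Each printed count corresponds to the height of one bar. In px.histogram, the nbins argument can be used to request a different number of bins.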

# Plotly Express bar charts

Let's look at the mean price for each diamond cut category. First, we compute the means with pandas groupby.

In [7]:
mean_prices = (
    diamonds.groupby("cut")["price"].mean().reset_index()
)

mean_prices

,cut,price
0,Fair,4358.757764
1,Good,3928.864452
2,Ideal,3457.54197
3,Premium,4584.257704
4,Very Good,3981.759891
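
The groupby step can be easier to follow on a tiny table. The sketch below uses invented toy data (not rows from diamonds.csv) and the same chain of calls; the names toy and toy_means are placeholders.

```python
import pandas as pd

# Toy data, invented for illustration only.
toy = pd.DataFrame(
    {
        "cut": ["Ideal", "Good", "Ideal", "Good", "Fair"],
        "price": [300, 400, 500, 600, 1000],
    }
)

# Split the rows by "cut", average "price" within each group, then turn
# the group labels back into an ordinary column with reset_index().
toy_means = toy.groupby("cut")["price"].mean().reset_index()
print(toy_means)
```

By default groupby sorts the group keys, which is why the cut categories come out in alphabetical order rather than in order of quality.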


In [8]:
fig = px.bar(mean_prices, x="cut", y="price")

fig.show()

We didn't put any labels on the plot! Let's fix that using the update_layout method, which can modify many aspects of a figure after it has been created.

In [9]:
fig.update_layout(
  title="Average diamond prices for each cut category",
  xaxis_title="",
  yaxis_title="Mean price ($)",
)

# Scatterplots in Plotly Express

We can use a scatterplot (px.scatter), which plots every diamond in the dataset as a dot. Each dot's position is determined by that diamond's price and carat values.

In [10]:
fig = px.scatter(diamonds, x="price", y="carat")
fig.update_layout(
  title="Price vs. Carat",
  xaxis_title="Price ($)",
  yaxis_title="Carat",
)
fig.show()

Plotting only a random sample of 5,000 diamonds (~10% of the dataset):

In [11]:
fig = px.scatter(
  diamonds.sample(5000), x="price", y="carat"
)
fig.update_layout(
  title="Price vs. Carat",
  xaxis_title="Price ($)",
  yaxis_title="Carat",
)
fig.show()
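
Because sample draws rows at random, the plot changes every time the cell is rerun. A minimal sketch of a reproducible draw follows, using a stand-in DataFrame (df) in place of diamonds; the seed value 42 is arbitrary.

```python
import pandas as pd

# Stand-in frame; in the notebook this would be the diamonds DataFrame.
df = pd.DataFrame({"price": range(1000)})

# frac=0.1 draws 10% of the rows without replacement; random_state fixes
# the seed so the same rows are selected on every run.
subset_a = df.sample(frac=0.1, random_state=42)
subset_b = df.sample(frac=0.1, random_state=42)

print(len(subset_a))
print(subset_a.equals(subset_b))
```

Using frac instead of a fixed row count keeps the sample proportional if the dataset size changes.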

In [12]:
sample = diamonds.sample(3000)

In [13]:
sample

,carat,cut,color,clarity,depth,table,price,x,y,z
49709,0.57,Ideal,G,VVS2,61.9,56.0,2147,5.30,5.29,3.28
52614,0.70,Good,D,SI1,62.6,62.0,2546,5.61,5.64,3.52
37985,0.39,Ideal,G,VVS1,61.8,54.4,1008,4.69,4.73,2.92
39606,0.26,Premium,H,VS1,62.3,59.0,491,4.10,4.06,2.54
20481,1.50,Good,F,SI2,62.6,64.0,8820,7.25,7.19,4.52
...,...,...,...,...,...,...,...,...,...,...
10667,0.90,Ideal,F,VS2,62.0,56.0,4838,6.23,6.15,3.84
10813,1.01,Fair,G,VS1,66.8,61.0,4864,6.15,6.06,4.08
26589,2.03,Premium,I,VS2,60.5,62.0,16309,8.22,8.17,4.96
37027,0.30,Good,F,VVS1,63.7,59.0,965,4.24,4.17,2.68


In [14]:
fig = px.scatter(
  sample, x="price", y="carat", template="ggplot2"
)
fig.show()

In [15]:
fig = px.scatter(sample, x="price", y="carat", color="cut")
fig.show()

In [16]:
fig = px.scatter(sample, x="price", y="x", size="carat")
fig.show()

In [17]:
fig = px.scatter(
  sample, x="price", y="x", size="carat", color="cut"
)
fig.show()

# Pie charts in Plotly Express

In [20]:
# plotting the pie chart: each slice is the summed price of all diamonds with that cut
fig = px.pie(diamonds, values="price", names="cut")

# showing the plot
fig.show()