# Data Visualization with Plotly

Plotly is an open-source library for building interactive data visualizations, which makes it easier to explore and understand data.

Some of the graphs that can be made with Plotly include line plots, scatter plots, area charts, bar charts, error bars, box plots, histograms, heatmaps, subplots, multiple axes, polar charts, and bubble charts.

Installation

In [None]:
%pip install plotly

Package Structure

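- plotly.plotly: Acts as the interface between the local machine and Plotly's server. It contains the functions that require a response from that server. Since Plotly 4, this module has lived in the separate chart_studio package as chart_studio.plotly.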

- plotly.graph_objects: Contains the objects (Figure, layout, data, and the plot definitions such as scatter plots and line charts) that are responsible for creating the plots. A figure can be represented either as a dict or as an instance of plotly.graph_objects.Figure; in both cases it is serialized to JSON before it is passed to plotly.js.

- plotly.express: Creates the entire Figure at once. It uses graph_objects internally and returns a graph_objects.Figure instance.

In [20]:
import plotly.express as px 
import numpy as np
import pandas as pd

In [7]:
# Creating the Figure instance
fig = px.line(x=[1, 2, 3], y=[1, 2, 3])

# Print the Figure instance
print(fig)

Figure({
    'data': [{'hovertemplate': 'x=%{x}<br>y=%{y}<extra></extra>',
              'legendgroup': '',
              'line': {'color': '#636efa', 'dash': 'solid'},
              'marker': {'symbol': 'circle'},
              'mode': 'lines',
              'name': '',
              'orientation': 'v',
              'showlegend': False,
              'type': 'scatter',
              'x': array([1, 2, 3]),
              'xaxis': 'x',
              'y': array([1, 2, 3]),
              'yaxis': 'y'}],
    'layout': {'legend': {'tracegroupgap': 0},
               'margin': {'t': 60},
               'template': '...',
               'xaxis': {'anchor': 'y', 'domain': [0.0, 1.0], 'title': {'text': 'x'}},
               'yaxis': {'anchor': 'x', 'domain': [0.0, 1.0], 'title': {'text': 'y'}}}
})


Display the figure

In [12]:
fig.show()

Load the datasets into DataFrames

In [27]:
df = px.data.iris() 
tdf = pd.read_csv("./timesdata.csv")

In [28]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 150 entries, 0 to 149
Data columns (total 6 columns):
 #   Column        Non-Null Count  Dtype  
---  ------        --------------  -----  
 0   sepal_length  150 non-null    float64
 1   sepal_width   150 non-null    float64
 2   petal_length  150 non-null    float64
 3   petal_width   150 non-null    float64
 4   species       150 non-null    object 
 5   species_id    150 non-null    int64  
dtypes: float64(4), int64(1), object(1)
memory usage: 7.2+ KB


In [29]:
tdf.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2603 entries, 0 to 2602
Data columns (total 14 columns):
 #   Column                  Non-Null Count  Dtype  
---  ------                  --------------  -----  
 0   world_rank              2603 non-null   object 
 1   university_name         2603 non-null   object 
 2   country                 2603 non-null   object 
 3   teaching                2603 non-null   float64
 4   international           2603 non-null   object 
 5   research                2603 non-null   float64
 6   citations               2603 non-null   float64
 7   income                  2603 non-null   object 
 8   total_score             2603 non-null   object 
 9   num_students            2544 non-null   object 
 10  student_staff_ratio     2544 non-null   float64
 11  international_students  2536 non-null   object 
 12  female_male_ratio       2370 non-null   object 
 13  year                    2603 non-null   int64  
dtypes: float64(4), int64(1), object(9)
memor
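Several columns above, such as world_rank and total_score, are stored as object rather than as numbers, so Plotly will treat them as categories on an axis. If the values are numbers stored as text, one possible approach is `pd.to_numeric` with `errors="coerce"`. The sketch below uses a made-up three-row frame that only borrows the column names; the values are placeholders and do not come from timesdata.csv.

```python
import pandas as pd

# Made-up rows that mimic the column names and dtypes shown by tdf.info().
toy = pd.DataFrame({
    "world_rank": ["1", "2", "201-225"],
    "total_score": ["95.2", "94.1", "-"],
    "citations": [99.0, 98.5, 60.1],
})

# errors="coerce" turns every entry that is not a valid number into NaN.
toy["world_rank_num"] = pd.to_numeric(toy["world_rank"], errors="coerce")
toy["total_score_num"] = pd.to_numeric(toy["total_score"], errors="coerce")

# Count how many entries could not be converted in each column.
unparsed = toy[["world_rank_num", "total_score_num"]].isna().sum()
```

Whether coercing to NaN is appropriate depends on what the non-numeric entries mean in the real data, so it is worth inspecting them before discarding them.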

Line chart with Iris dataset

In [14]:

fig = px.line(df, x="species", y="petal_width") 

fig.show()

Bar Chart

In [15]:
fig = px.bar(df, x="sepal_width", y="sepal_length") 
 
fig.show()

Histogram

In [16]:
fig = px.histogram(df, x="sepal_length", y="petal_width") 

fig.show()

Scatter Plot

In [17]:
fig = px.scatter(df, x="species", y="petal_width") 

fig.show()

In [18]:
fig = px.scatter(df, x="species", y="petal_width", 
                 size="petal_length", color="species") 
fig.show()

Contour Plot

In [21]:
import plotly.graph_objects as go 
 
 
# Create the x and y values from which Z is computed
feature_x = np.arange(0, 50, 2) 
feature_y = np.arange(0, 50, 3) 
 
# Creating 2-D grid of features 
[X, Y] = np.meshgrid(feature_x, feature_y) 
 
Z = np.cos(X / 2) + np.sin(Y / 4) 
 
# plotting the figure
fig = go.Figure(data =
    go.Contour(x = feature_x, y = feature_y, z = Z)) 
 
fig.show()
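The contour trace receives one-dimensional x and y arrays but a two-dimensional z. A short shape check, sketched here with NumPy only, shows how the grid produced by `np.meshgrid` lines up: rows of Z follow the y values and columns follow the x values.

```python
import numpy as np

feature_x = np.arange(0, 50, 2)  # 25 values: 0, 2, ..., 48
feature_y = np.arange(0, 50, 3)  # 17 values: 0, 3, ..., 48

# With the default indexing, both grids have shape (len(y), len(x)).
X, Y = np.meshgrid(feature_x, feature_y)
Z = np.cos(X / 2) + np.sin(Y / 4)

shapes = (X.shape, Y.shape, Z.shape)

# Z[i, j] is the value at y = feature_y[i] and x = feature_x[j].
i, j = 3, 4
single_value = np.cos(feature_x[j] / 2) + np.sin(feature_y[i] / 4)
matches = bool(np.isclose(Z[i, j], single_value))
```

All three shapes are `(17, 25)`, which is the layout the contour trace expects when x has 25 entries and y has 17.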

3D Surface Plot

In [22]:
import plotly.graph_objects as go 
 
# Data to be plotted
x = np.outer(np.linspace(-2, 2, 30), np.ones(30)) 
y = x.copy().T 
z = np.cos(x ** 2 + y ** 2) 
 
# plotting the figure
fig = go.Figure(data=[go.Surface(x=x, y=y, z=z)]) 
 
fig.show()

Using Graph Objects

In [33]:
# import graph objects as "go" (graph_objs is the legacy alias of this module)
import plotly.graph_objects as go

# prepare data frame
df = tdf.iloc[:100,:]

# Creating traces
trace1 = go.Scatter(
                    x = df.world_rank,
                    y = df.citations,
                    mode = "lines",
                    name = "citations",
                    marker = dict(color = 'rgba(16, 112, 2, 0.8)'),
                    text= df.university_name)

trace2 = go.Scatter(
                    x = df.world_rank,
                    y = df.teaching,
                    mode = "lines+markers",
                    name = "teaching",
                    marker = dict(color = 'rgba(80, 26, 80, 0.8)'),
                    text= df.university_name)
data = [trace1, trace2]
layout = dict(title = 'Citation and Teaching vs World Rank of Top 100 Universities',
              xaxis= dict(title= 'World Rank',ticklen= 5,zeroline= False)
             )
# Wrap the traces and layout in a Figure; iplot was never imported in this notebook
fig = go.Figure(data = data, layout = layout)
fig.show()

In [38]:
# prepare data frames
df2014 = tdf[tdf.year == 2014].iloc[:3,:]

# create trace1 
trace1 = go.Bar(
                x = df2014.university_name,
                y = df2014.citations,
                name = "citations",
                marker = dict(color = 'rgba(255, 174, 255, 0.5)',
                             line=dict(color='rgb(0,0,0)',width=1.5)),
                text = df2014.country)
# create trace2 
trace2 = go.Bar(
                x = df2014.university_name,
                y = df2014.teaching,
                name = "teaching",
                marker = dict(color = 'rgba(255, 255, 128, 0.5)',
                              line=dict(color='rgb(0,0,0)',width=1.5)),
                text = df2014.country)
data = [trace1, trace2]
layout = go.Layout(barmode = "group")
fig = go.Figure(data = data, layout = layout)
fig.show()