# Interactive Visualization with Plotly

Plotly is a Python library for creating interactive visualizations (graphs and charts). Unlike Matplotlib and Seaborn, which create static images, Plotly renders an HTML document and uses JavaScript under the hood to enable interactivity. Plotly also offers a large selection of chart types.

## Creating Figures and Adding Interactive Elements

Like Matplotlib, Plotly provides several low-level functions for creating and customizing figures. While they offer fine-grained control over various aspects of a graph, they're quite verbose and can be cumbersome to use. We'll start with Plotly Express, a high-level API similar to Seaborn that lets you create and customize a chart with a single function call.

Plotly Express is usually imported under the alias `px`.

In [4]:
import plotly.express as px
import pandas as pd

In [5]:
population_csv_url = 'https://gist.githubusercontent.com/aakashns/bbd36fbd7c0be266f0c875ad2006a9fd/raw/1763ee47c8919995c4115bb063c99511ced34712/population.csv'
population_df = pd.read_csv(population_csv_url, index_col = 'Year')
population_df[:5]

| Year | Aruba | Afghanistan | Angola | Albania | Andorra | Arab World | United Arab Emirates | Argentina | Armenia | American Samoa | ... | Virgin Islands (U.S.) | Vietnam | Vanuatu | World | Samoa | Kosovo | Yemen, Rep. | South Africa | Zambia | Zimbabwe |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1960 | 54211.0 | 8996973.0 | 5454933.0 | 1608800.0 | 13411.0 | 92197753.0 | 92418.0 | 20481779.0 | 1874121.0 | 20123.0 | ... | 32500.0 | 32670039.0 | 63689.0 | 3031438000.0 | 108629.0 | 947000.0 | 5315355.0 | 17099840.0 | 3070776.0 | 3776681.0 |
| 1961 | 55438.0 | 9169410.0 | 5531472.0 | 1659800.0 | 14375.0 | 94724510.0 | 100796.0 | 20817266.0 | 1941492.0 | 20602.0 | ... | 34300.0 | 33666110.0 | 65705.0 | 3072481000.0 | 112105.0 | 966000.0 | 5393036.0 | 17524533.0 | 3164329.0 | 3905034.0 |
| 1962 | 56225.0 | 9351441.0 | 5608539.0 | 1711319.0 | 15370.0 | 97334442.0 | 112118.0 | 21153052.0 | 2009526.0 | 21253.0 | ... | 35000.0 | 34683407.0 | 67794.0 | 3125457000.0 | 115776.0 | 994000.0 | 5473671.0 | 17965725.0 | 3260650.0 | 4039201.0 |
| 1963 | 56695.0 | 9543205.0 | 5679458.0 | 1762621.0 | 16412.0 | 100034179.0 | 125130.0 | 21488912.0 | 2077578.0 | 22034.0 | ... | 39800.0 | 35721217.0 | 69946.0 | 3190564000.0 | 119559.0 | 1022000.0 | 5556766.0 | 18423161.0 | 3360104.0 | 4178726.0 |
| 1964 | 57032.0 | 9744781.0 | 5735044.0 | 1814135.0 | 17469.0 | 102832760.0 | 138039.0 | 21824425.0 | 2145001.0 | 22854.0 | ... | 40800.0 | 36779999.0 | 72115.0 | 3256065000.0 | 123342.0 | 1050000.0 | 5641597.0 | 18896307.0 | 3463213.0 | 4322861.0 |


In pandas, square brackets `[]` after a DataFrame name select rows or columns. A slice such as `[:5]` selects rows by position: it starts at the beginning (leaving out the number before the colon means "from the start") and stops just before position 5. The slice ignores the index labels, so even though the DataFrame is indexed by `Year`, `population_df[:5]` simply returns the first five rows, which here are the years 1960 to 1964.
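The slicing behavior can be checked on a small stand-in table. This is a minimal sketch; `sample_df` and its columns `A` and `B` are hypothetical and only mimic the `Year`-indexed layout of `population_df`.

```python
import pandas as pd

# Hypothetical miniature of the population table, indexed by Year
sample_df = pd.DataFrame(
    {"A": [1.0, 2.0, 3.0, 4.0], "B": [10.0, 20.0, 30.0, 40.0]},
    index=pd.Index([1960, 1961, 1962, 1963], name="Year"),
)

first_two = sample_df[:2]             # plain slice: rows at positions 0 and 1
same_rows = sample_df.iloc[:2]        # explicit position-based equivalent
by_label = sample_df.loc[1960:1961]   # label-based slice, end label included
```

All three expressions pick the rows for 1960 and 1961. Using `.iloc` or `.loc` makes the intent explicit, which can help when the index itself consists of integers.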

Let's use `px.line` to create a line chart showing the population of Hungary from 1960 to 2019.

In [8]:
?px.line

In [7]:
px.line(population_df['Hungary'], title = 'Population')

Note the following:

* `px.line` automatically uses the index of the series for the x-axis.
* You can hover over any point on the line to view the exact value.
* You can zoom in and out using the controls to take a closer look at specific areas of the chart.
* There are several other controls, e.g., pan, autoscale, and download as PNG.

For fine-grained control over various aspects of the chart, we can use the `Figure` object returned by `px.line`. Let's change the axis labels and chart colors, and ensure that the y-axis starts at 0.

In [9]:
fig = px.line(population_df['Hungary'])

In [10]:
# Set axis & legend labels
fig.update_layout(
    title = "Year-Wise Population",
    xaxis_title = "Year",
    yaxis_title = "Population",
    legend_title = "Country",
    plot_bgcolor = "#ffcc9c",
    font = dict(
        family = "Times New Roman",
        size = 14,
        color = "#cc3e0e"
    )
)

# Start the Y axis from 0
fig.update_yaxes(rangemode = 'tozero')

Here's a list of properties you can set using `update_layout`: https://plotly.com/python/reference/layout/

Plotly also has built-in support for Pandas DataFrames: pass a DataFrame to `px.line`, and each column is drawn as a separate line.

In [12]:
europe_df = population_df[['Hungary', 'Czech Republic', 'Switzerland']]
europe_df.head()

| Year | Hungary | Czech Republic | Switzerland |
|---|---|---|---|
| 1960 | 9983967.0 | 9602006.0 | 5327827.0 |
| 1961 | 10029321.0 | 9586651.0 | 5434294.0 |
| 1962 | 10061734.0 | 9624660.0 | 5573815.0 |
| 1963 | 10087947.0 | 9670685.0 | 5694247.0 |
| 1964 | 10119835.0 | 9727804.0 | 5789228.0 |


In [14]:
fig = px.line(europe_df,
              title = 'Population',
              color_discrete_sequence = ["aquamarine", "cornflowerblue", "goldenrod"])

fig.update_layout(yaxis_title = 'Population',
                  legend_title = 'Countries',
                  font_size = 14)

fig.update_yaxes(rangemode = 'tozero')

fig.show()

Note that apart from providing RGB hex codes for colors, we can also use named CSS colors: https://www.w3schools.com/cssref/css_colors.asp

Switching from a line chart to a bar chart is simply a matter of replacing `px.line` with `px.bar`.

In [15]:
px.bar(population_df[['Bangladesh', 'Pakistan']],
       title = "Population",
       barmode = "group")

## A Quick Tour of Popular Interactive Charts

Plotly Express provides more than 30 functions for creating different types of figures. Let's explore some popular interactive visualization techniques. We'll use the [built-in datasets](https://plotly.com/python-api-reference/generated/plotly.express.data.html) from `px.data` to demonstrate their usage.