# Introduction to Plotly

[Plotly](https://plot.ly/python/) is an online data visualization and analytics tool. Although most users use Plotly through its web interface, the company also offers Python, R, JavaScript, and MATLAB APIs. These APIs allow you to build interactive, production-quality visualizations with just a few lines of code.

This workbook showcases some of Plotly's basic functionality, in case it is useful to you for future projects (including the final project for this class).

## First Steps

By default, Plotly requires an API key that is tied to a registered account. To get around this, we use the `plotly.offline` package. To get Plotly to work within the Jupyter notebook, we also have to call `init_notebook_mode()`.

In [1]:
import plotly.offline as py
py.init_notebook_mode(connected=True)

## Plotly Basics

A Plotly figure consists of one or more `Trace` objects. Each `Trace` object specifies how to draw some part of the data. For example, to make a scatterplot of the values `x_data` and `y_data`, we would write `Scatter(x=x_data, y=y_data)`, which produces one kind of `Trace` object.

There are two ways to specify a Plotly figure:

1. Pass a list of `Trace` objects to `py.iplot()`.
2. Pass a `Figure` object to `py.iplot()`. A `Figure` object consists of a list of `Trace` objects and a `Layout` object that allows you to customize the layout.
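
The two options can be sketched with plain Python dictionaries, which Plotly generally accepts in place of `Trace`, `Layout`, and `Figure` objects. The values below are made up for illustration, and the rendering calls are left as comments because they need a running notebook.

```python
# Illustrative values only; they are not taken from any data set.
x_data = [1, 2, 3]
y_data = [2, 4, 8]

# A trace described as a plain dict, in the spirit of
# Scatter(x=x_data, y=y_data, mode='markers').
trace = {'type': 'scatter', 'x': x_data, 'y': y_data, 'mode': 'markers'}

# Option 1: a bare list of traces.
traces_only = [trace]

# Option 2: a figure, i.e. the traces plus a layout.
figure = {
    'data': [trace],
    'layout': {'title': 'Example figure'},
}

# In a notebook, either form could then be rendered:
# py.iplot(traces_only)
# py.iplot(figure)
```

The only difference between the two forms is the `layout` entry, which is where titles and axis labels live.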

In [2]:
# Import only the classes used in this workbook.
from plotly.graph_objs import Scatter, Layout, Figure

## First Plotly Plot

For starters, let's recreate the basic plots that we made in Matplotlib and Altair. Recall that we made a scatterplot of beginning salary (`Bsal`) vs. 1977 salary (`Sal77`) of the Harris Bank employees, where each point was colored according to the employee's gender (`Sex`).

In [3]:
# Read in the data set.
import pandas as pd
data = pd.read_csv("/data/harris.csv")
data.head()

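| Unnamed: 0 | Bsal | Sal77 | Sex | Senior | Age | Educ | Exper |
|---|---|---|---|---|---|---|---|
| 0 | 5040 | 12420 | Male | 96 | 329 | 15 | 14.0 |
| 1 | 6300 | 12060 | Male | 82 | 357 | 15 | 72.0 |
| 2 | 6000 | 15120 | Male | 67 | 315 | 15 | 35.5 |
| 3 | 6000 | 16320 | Male | 97 | 354 | 12 | 24.0 |
| 4 | 6000 | 12300 | Male | 66 | 351 | 12 | 56.0 |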


First, let's create a basic scatterplot that does not split on the `Sex` variable.

In [8]:
# A single scatter trace: beginning salary on the x-axis, 1977 salary on the y-axis.
traces = [
    Scatter({
        'x': data.Bsal,
        'y': data.Sal77,
        'mode': 'markers'
    })
]

py.iplot(traces)

This plot has no axis labels and no title. To customize these aspects of the plot, we need to create a `Layout` object to hold this metadata and then pass it to a `Figure` object.

In [10]:
layout = Layout({
        'title':'Beginning Salary vs. 1977 Salary',
        'xaxis':{'title':'Beginning Salary'},
        'yaxis':{'title':'1977 Salary'}
    })

fig = Figure(data = traces, layout = layout)

py.iplot(fig)

Now, let's split the data on `Sex`. Unfortunately, this is not the easiest thing to do in Plotly, but the graphic that Plotly produces is worth the effort!

In [14]:
split_traces = []

# Add one trace per value of Sex so that each group gets its own legend entry.
for sex in data.Sex.unique():
    subset = data[data.Sex == sex]
    split_traces.append(
        Scatter({
            'x': subset.Bsal,
            'y': subset.Sal77,
            'mode': 'markers',
            'name': sex
        }))

split_fig = Figure(data=split_traces, layout=layout)

py.iplot(split_fig)

We now have a quality graphic depicting the relationship between beginning salary and 1977 salary, with different colors indicating the employee's sex. If you click on a label in the legend, it will hide the observations with that value. This is useful, as it allows us to focus on a specific subset of observations.

## Interacting with Plots

Next we will split our data further based on education. When you did this with Altair, you ended up having several graphs side by side. To compare them, you had to scroll back and forth. This can be cumbersome. Let's implement a graph with a drop-down menu that allows us to select which graph we want to see.

In [15]:
# Create list of combinations of values that the data will be split on.
combinations = []

education_values = data.Educ.unique()
education_values.sort()
sex_values = data.Sex.unique()
sex_values.sort()

for sex in sex_values:
    for educ in education_values:
        combinations.append((educ, sex))
        
combinations

[(8, 'Female'),
 (10, 'Female'),
 (12, 'Female'),
 (15, 'Female'),
 (16, 'Female'),
 (8, 'Male'),
 (10, 'Male'),
 (12, 'Male'),
 (15, 'Male'),
 (16, 'Male')]

In [None]:
# Copy calculation of split_traces.
further_split_traces = []
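
One way this cell might be filled in, sketched under assumptions: build one trace per `(educ, sex)` pair, in the same order as `combinations`. The sketch uses plain dicts and a small made-up list of records so that it runs without the Harris data; in the notebook, the filtering would instead be done on `data`, as in the calculation of `split_traces`. The names `records` and `make_trace` are hypothetical.

```python
# Made-up records standing in for rows of the Harris data (illustrative only).
records = [
    {'Bsal': 5000, 'Sal77': 9000, 'Sex': 'Female', 'Educ': 12},
    {'Bsal': 5400, 'Sal77': 9800, 'Sex': 'Female', 'Educ': 15},
    {'Bsal': 6000, 'Sal77': 12000, 'Sex': 'Male', 'Educ': 12},
    {'Bsal': 6300, 'Sal77': 12500, 'Sex': 'Male', 'Educ': 15},
]

# Same ordering as in the cell above: sex in the outer loop, education inside.
sex_values = sorted({r['Sex'] for r in records})
education_values = sorted({r['Educ'] for r in records})
combinations = [(educ, sex) for sex in sex_values for educ in education_values]

def make_trace(educ, sex):
    """Build one scatter trace (as a plain dict) for a single (educ, sex) pair."""
    rows = [r for r in records if r['Educ'] == educ and r['Sex'] == sex]
    return {
        'type': 'scatter',
        'mode': 'markers',
        'x': [r['Bsal'] for r in rows],
        'y': [r['Sal77'] for r in rows],
        'name': '{} (Educ = {})'.format(sex, educ),
    }

further_split_traces = [make_trace(educ, sex) for educ, sex in combinations]
```

Keeping the traces in the same order as `combinations` matters later, because a drop-down button refers to traces by position.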

In [None]:
# Create buttons.

buttons = []
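
A sketch of how the buttons might be built, assuming one drop-down entry per education level that shows only the traces for that level. Each button is a plain dict that uses Plotly's `update` method with a `visible` mask, one Boolean per trace in the figure. The helper name `make_buttons` is hypothetical, and the mask relies on the traces being in the same order as `combinations`.

```python
# (educ, sex) pairs in trace order, as printed by the cell above.
combinations = [
    (8, 'Female'), (10, 'Female'), (12, 'Female'), (15, 'Female'), (16, 'Female'),
    (8, 'Male'), (10, 'Male'), (12, 'Male'), (15, 'Male'), (16, 'Male'),
]

def make_buttons(combinations):
    """Return one drop-down button (as a plain dict) per education level."""
    buttons = []
    for level in sorted({educ for educ, _ in combinations}):
        # True for traces at this education level, False for all others.
        mask = [educ == level for educ, _ in combinations]
        buttons.append({
            'label': 'Educ = {}'.format(level),
            'method': 'update',
            'args': [{'visible': mask}],
        })
    return buttons

buttons = make_buttons(combinations)
```

Selecting a button then shows two traces at a time (one per sex) and hides the other eight.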


In [None]:
# Create a layout, as earlier.
interactive_layout = Layout()
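
The buttons are attached to the figure through the layout's `updatemenus` entry. Below is a minimal sketch of such a layout as a plain dict, reusing the title and axis labels from the earlier layout. The single placeholder button and the name `example_buttons` are only for illustration; the full `buttons` list would go in its place.

```python
# One trace per (educ, sex) combination listed above.
n_traces = 10

# Placeholder button for illustration; in the notebook this would be `buttons`.
example_buttons = [
    {'label': 'All', 'method': 'update', 'args': [{'visible': [True] * n_traces}]},
]

interactive_layout = {
    'title': 'Beginning Salary vs. 1977 Salary',
    'xaxis': {'title': 'Beginning Salary'},
    'yaxis': {'title': '1977 Salary'},
    # A single drop-down menu holding the buttons.
    'updatemenus': [
        {'type': 'dropdown', 'buttons': example_buttons},
    ],
}
```

A dict like this could be wrapped as `Layout(interactive_layout)`, in the same way the earlier layout was built from a dict.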

In [None]:
# Create a new figure and plot it.
interactive_fig = Figure(data=further_split_traces, layout=interactive_layout)
py.iplot(interactive_fig)

With further editing, we could clean up the legend, but since this is only an introduction to Plotly, we won't pursue that here. Hopefully, this tutorial has convinced you of the power of Plotly!

## Additional Examples

Listed below are some more exciting examples of graphics that you can make with Plotly. If any of them catch your fancy, copy the code into the notebook and run it to produce the graphic!
- [Animations](https://plot.ly/python/animations/)
- [3D Surfaces](https://plot.ly/python/3d-surface-plots/)
- [3D Scatter Plot](https://plot.ly/python/3d-scatter-plots/)
- [3D Parametric Surfaces](https://plot.ly/python/3d-parametric-plots/)
- [Geographical Maps](https://plot.ly/python/maps/)
- [Subplots](https://plot.ly/python/mixed-subplots)