# Intro to Plotly

This is intended to be a crash course in using Plotly that will give you just enough to complete the afternoon assignment.  Much more can be done with Plotly beyond the absolute basics shown in this notebook, and I encourage you to check out some of the possibilities [here](https://plot.ly/python/).

The first thing we need to do is install Plotly using `pip`...

```bash
pip install plotly
```

Or we could simply upgrade plotly if we already have an out-of-date version installed...

```bash
pip install plotly --upgrade
```

For this demo we will create a simple scatter plot and embed it in a locally running Jupyter notebook.  The `plotly.plotly` module used below sends figures to Plotly's servers, so we also need to create a [Plotly account](https://plot.ly/feed/) to get our API keys.  Once we have our API keys, we can configure our computer to read them in automatically whenever we use Plotly, like so...

```python
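# We need to configure our system to use our API keys, which you will find in
# the API keys section of your Plotly account settings
# NOTE: This only needs to be done once; the credentials are saved to a local file
# NOTE: In plotly 4 and later this helper lives in the separate chart_studio
# package (chart_studio.tools.set_credentials_file)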
import plotly.tools as tls
tls.set_credentials_file(username='your_username', api_key='your_api_key')
```

Now let's dive in and make a really simple scatter plot...


### Further information

[Getting Started with Plotly](https://plot.ly/python/getting-started/) 

[Python Scatter Plots](https://plot.ly/python/line-and-scatter/)

```python
import numpy as np
import pandas as pd

# First let's read in some data
df = pd.read_csv('data/balance.csv')

# Get rid of the unnamed index column left over from when the CSV was saved
df.drop('Unnamed: 0', axis=1, inplace=True)
```

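```python
# Import plotly
# NOTE: In plotly 4 and later the online plotting module lives in the separate
# chart_studio package (import chart_studio.plotly as py)
import plotly.plotly as py
import plotly.graph_objs as go
```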

```python
# Let's create a scatter plot of Income vs. Limit

# First let's create a trace
trace = go.Scatter(
    x=df['Income'],
    y=df['Limit'],
    mode='markers'
)

data = [trace]

# Configure our Layout
layout = go.Layout(
    title='Income Vs. Credit Limit',
    xaxis=dict(title='Income'),
    yaxis=dict(title='Credit Limit')
)

# Create the figure
fig = go.Figure(data=data, layout=layout)

# Plot and embed in ipython notebook!
py.iplot(fig, filename='Income-vs-Limit')
```
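
A Plotly figure is essentially a JSON-like structure: a list of traces under `data` and a dictionary of settings under `layout`. As a minimal sketch of that structure, the block below builds the same scatter figure from plain Python dicts and lists. The sample numbers are made up for illustration and stand in for `df['Income']` and `df['Limit']`. Plotly's plotting functions such as `iplot` generally accept a dict of this shape in place of a `go.Figure`, but no plotting call is made here, so the block runs without Plotly installed.

```python
import json

# Hypothetical sample values standing in for df['Income'] and df['Limit']
income = [14.9, 106.0, 104.6, 148.9]
limit = [3606, 6645, 7075, 9504]

# One trace: a dict describing what to draw and how to draw it
trace = {
    "type": "scatter",
    "x": income,
    "y": limit,
    "mode": "markers",
}

# The figure: a list of traces plus the layout settings
fig_dict = {
    "data": [trace],
    "layout": {
        "title": "Income vs. Credit Limit",
        "xaxis": {"title": "Income"},
        "yaxis": {"title": "Credit Limit"},
    },
}

# Because the figure is plain data, it can be serialized to JSON
fig_json = json.dumps(fig_dict, indent=2)
print(fig_json)
```

Seeing the figure as nested dicts makes it easier to read the Plotly reference pages, where every option is documented as a key at some level of this structure.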

# Going a little further...

## Histograms

For more info see [this histogram tutorial](https://plot.ly/python/histogram-tutorial/).

For our first histogram, let's plot the fraction of points that fall in each bin (`histnorm='probability'` scales the counts so that the bar heights sum to 1):

```python
x_data = df['Limit']
num_bins = 15

# Split the range of the data into num_bins equal-width bins
tr = go.Histogram(
    x=x_data,
    histnorm='probability',
    xbins=dict(
        start=x_data.min(),
        size=(x_data.max() - x_data.min()) / num_bins,
        end=x_data.max()
    ),
    marker=dict(color='rgba(93, 164, 214, 0.5)')
)

data = [tr]

title = "Probability Histogram of Credit Limit"

layout = dict(
    title=title,
    autosize=True,
    bargap=0.015,
    height=600,
    width=700,
    hovermode='x',
    xaxis=dict(
        title='Credit Limit',
        autorange=True,
        zeroline=False
    ),
    yaxis=dict(
        title='Fraction of Data in Bin',
        autorange=True,
        showticklabels=True
    )
)

fig1 = go.Figure(data=data, layout=layout)
py.iplot(fig1)
```

We could also drop the `histnorm` argument to get the raw count of points in each bin:

```python
x_data = df['Limit']
num_bins = 15

# Same bins as before, but without histnorm the bars show raw counts
tr = go.Histogram(
    x=x_data,
    xbins=dict(
        start=x_data.min(),
        size=(x_data.max() - x_data.min()) / num_bins,
        end=x_data.max()
    ),
    marker=dict(color='rgba(93, 164, 214, 0.5)')
)

data = [tr]

title = "Histogram without specifying 'histnorm'"

layout = dict(
    title=title,
    autosize=True,
    bargap=0.015,
    height=600,
    width=700,
    hovermode='x',
    xaxis=dict(
        title='Credit Limit',
        autorange=True,
        zeroline=False
    ),
    yaxis=dict(
        title='Number of Points in Bin',
        autorange=True,
        showticklabels=True
    )
)

fig2 = go.Figure(data=data, layout=layout)
py.iplot(fig2)
```
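
To make the difference between the two histograms concrete, here is a minimal pure-Python sketch of equal-width binning with and without normalization. The helper `bin_counts` and the sample values are hypothetical, and Plotly's own handling of bin edges may differ in detail. The point is only that `histnorm='probability'` divides each bin's count by the total number of points.

```python
def bin_counts(values, num_bins):
    """Count how many values fall in each of num_bins equal-width bins.

    Assumes the values are not all identical (otherwise the bin size is zero).
    """
    lo, hi = min(values), max(values)
    size = (hi - lo) / num_bins  # same formula as the xbins 'size' argument
    counts = [0] * num_bins
    for v in values:
        # Clamp the index so the maximum value lands in the last bin
        idx = min(int((v - lo) / size), num_bins - 1)
        counts[idx] += 1
    return counts


# Hypothetical credit limits, for illustration only
limits = [1000, 1500, 2000, 2500, 4000, 4200, 6000, 9000]

# Raw counts, as in the histogram without histnorm
counts = bin_counts(limits, num_bins=4)

# Fractions, as in the histogram with histnorm='probability'
probabilities = [c / len(limits) for c in counts]

print(counts)         # [4, 2, 1, 1]
print(probabilities)  # [0.5, 0.25, 0.125, 0.125]
```

The two lists have the same shape, which is why the two histograms look identical apart from the scale of the y-axis.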

## Box Plots

For more information, see this [Box Plot tutorial](https://plot.ly/python/box-plots/).  The cell below draws one box of credit limits per ethnicity.

```python
ethnicities = df['Ethnicity'].unique()

# One Series of credit limits per ethnicity
data = [df.loc[df['Ethnicity'] == eth, 'Limit'] for eth in ethnicities]

# Some colors to use (zip stops at the shortest sequence, so extras are ignored)
colors = ['rgba(93, 164, 214, 0.5)', 'rgba(255, 144, 14, 0.5)',
          'rgba(44, 160, 101, 0.5)', 'rgba(255, 65, 54, 0.5)',
          'rgba(207, 114, 255, 0.5)', 'rgba(127, 96, 0, 0.5)']

# Build one box trace per ethnicity
trs = []
for eth, limits, c in zip(ethnicities, data, colors):
    tr = go.Box(
        y=limits,
        name=eth,
        marker=dict(color=c)
    )
    trs.append(tr)

py.iplot(trs)
```
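
Each box summarizes one group's distribution by its quartiles. As a rough sketch of the numbers behind a box, the block below groups hypothetical `(group, limit)` records and computes the quartiles with the standard library's `statistics.quantiles`. The group labels and values are made up, and quartile conventions vary between libraries, so the values Plotly draws may differ slightly.

```python
from collections import defaultdict
from statistics import quantiles

# Hypothetical (group, limit) records, for illustration only
records = [
    ("A", 3000), ("A", 5000), ("A", 7000), ("A", 9000),
    ("B", 2000), ("B", 4000), ("B", 6000), ("B", 8000),
]

# Group the limits by label, mirroring the per-ethnicity selection on df
groups = defaultdict(list)
for label, lim in records:
    groups[label].append(lim)

# Five-number summary per group; method="inclusive" treats the data
# as the whole population rather than a sample
summary = {}
for label, lims in groups.items():
    q1, median, q3 = quantiles(lims, n=4, method="inclusive")
    summary[label] = {
        "min": min(lims),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(lims),
    }

for label, stats in summary.items():
    print(label, stats)
```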

We could also plot the individual data points, with jitter, next to each box by setting `boxpoints='all'` and `jitter`...

```python
ethnicities = df['Ethnicity'].unique()

# One Series of credit limits per ethnicity
data = [df.loc[df['Ethnicity'] == eth, 'Limit'] for eth in ethnicities]

# Some colors to use (zip stops at the shortest sequence, so extras are ignored)
colors = ['rgba(93, 164, 214, 0.5)', 'rgba(255, 144, 14, 0.5)',
          'rgba(44, 160, 101, 0.5)', 'rgba(255, 65, 54, 0.5)',
          'rgba(207, 114, 255, 0.5)', 'rgba(127, 96, 0, 0.5)']

# Build one box trace per ethnicity, showing every point with jitter
trs = []
for eth, limits, c in zip(ethnicities, data, colors):
    tr = go.Box(
        y=limits,
        name=eth,
        marker=dict(color=c),
        boxpoints='all',
        jitter=0.4
    )
    trs.append(tr)

py.iplot(trs)
```
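
Jitter spreads overlapping points horizontally so that individual observations stay visible. The sketch below shows the idea with a hypothetical helper, `jittered_positions`, that adds a small random horizontal offset to each point. It illustrates the concept only and is not Plotly's actual placement algorithm.

```python
import random


def jittered_positions(center, n_points, jitter, seed=0):
    """Return n_points x-positions spread uniformly within +/- jitter/2 of center."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    return [center + rng.uniform(-jitter / 2, jitter / 2) for _ in range(n_points)]


# Hypothetical limits for one group; without jitter, repeated values
# would be drawn on top of each other at the same x-position
limits = [3000, 3100, 3100, 5200, 5200, 5200]
xs = jittered_positions(center=1.0, n_points=len(limits), jitter=0.4)

for x, y in zip(xs, limits):
    print(round(x, 3), y)
```

A larger `jitter` value spreads the points over a wider band, while a value of 0 stacks them in a single vertical line.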