### Plotly chart wrappers (Python)
    
#### Motivation
This is a set of functions that wrap Plotly's extensive customization options into single-function calls with a few parameters. They should produce good-looking charts that cover at least 50% of a data analyst's typical charting needs. The rationale behind writing the wrappers is twofold:  
  
1) To streamline analyst work and make the generation of lucid charts accessible even to analysts who are not familiar with the intricacies of Plotly's Python library  
2) To set up a consistent visual style that would be easily customizable to fit any corporate design by pre-defining colors and font styles used in all the charts
  
#### Contents

0) Style setup  
1) Bar charts (stacked, grouped, percentage)  
2) Line charts  
3) Scatter plots  
4) Box plots 
  
#### Reference  
  
1) [Plotly reference](https://plot.ly/python/)  
2) [IBM Sample Datasets](https://www.ibm.com/communities/analytics/watson-analytics-blog/guide-to-sample-datasets/)

In [384]:
# Get the sample data

import os
import urllib.request

import pandas as pd

url = "https://community.watsonanalytics.com/wp-content/uploads/2015/03/WA_Fn-UseC_-Marketing-Campaign-Eff-UseC_-FastF.csv"
download_path = "data/WA_Fn-UseC_-Marketing-Campaign-Eff-UseC_-FastF.csv"

try:
    dat = pd.read_csv(download_path)
except FileNotFoundError:
    # Not cached locally yet: create the target folder and download the file
    os.makedirs(os.path.dirname(download_path), exist_ok=True)
    urllib.request.urlretrieve(url, download_path)
    dat = pd.read_csv(download_path)

In [385]:
import pandas as pd
import numpy as np

import plotly.offline as py
from plotly.graph_objs import *
from plotly import tools
from plotly.offline import download_plotlyjs, init_notebook_mode, plot, iplot

init_notebook_mode(connected=True)
%matplotlib inline

In [386]:
dat.head()

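| | MarketID | MarketSize | LocationID | AgeOfStore | Promotion | week | SalesInThousands |
|---|---|---|---|---|---|---|---|
| 0 | 1 | Medium | 1 | 4 | 3 | 1 | 33.73 |
| 1 | 1 | Medium | 1 | 4 | 3 | 2 | 35.67 |
| 2 | 1 | Medium | 1 | 4 | 3 | 3 | 29.03 |
| 3 | 1 | Medium | 1 | 4 | 3 | 4 | 39.25 |
| 4 | 1 | Medium | 2 | 5 | 2 | 1 | 27.81 |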


In [387]:
dat.describe()

| | MarketID | LocationID | AgeOfStore | Promotion | week | SalesInThousands |
|---|---|---|---|---|---|---|
| count | 548.0 | 548.0 | 548.0 | 548.0 | 548.0 | 548.0 |
| mean | 5.715328 | 479.656934 | 8.50365 | 2.029197 | 2.5 | 53.466204 |
| std | 2.877001 | 287.973679 | 6.638345 | 0.810729 | 1.119055 | 16.755216 |
| min | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 17.34 |
| 25% | 3.0 | 216.0 | 4.0 | 1.0 | 1.75 | 42.545 |
| 50% | 6.0 | 504.0 | 7.0 | 2.0 | 2.5 | 50.2 |
| 75% | 8.0 | 708.0 | 12.0 | 3.0 | 3.25 | 60.4775 |
| max | 10.0 | 920.0 | 28.0 | 3.0 | 4.0 | 99.65 |


In [388]:
sales_per_market_and_promo = dat.groupby(["MarketID","Promotion"])["SalesInThousands"].sum().reset_index()

In [389]:
sales_per_market_and_promo.head()

| | MarketID | Promotion | SalesInThousands |
|---|---|---|---|
| 0 | 1 | 1 | 814.38 |
| 1 | 1 | 2 | 603.04 |
| 2 | 1 | 3 | 407.87 |
| 3 | 2 | 1 | 262.4 |
| 4 | 2 | 3 | 1219.87 |


### 0. Style setup: Fonts and colors

### 1. Bar charts

In [390]:
# Generate a discrete colorscale for "the rest"

def generate_discrete_scl(rgb_start, rgb_end, n_levels):
    """
    Generates an RGB colorscale of n_levels between the specified
    start and end colors, each provided as a 3-number tuple, e.g. (0,0,0).
    Returns a list of n_levels RGB color strings:
    ['rgb(229, 245, 249)',...]
    """

    # Edge cases: nothing to color, or a single level (no step to compute)
    if n_levels < 1:
        return []
    if n_levels == 1:
        return ["rgb" + str(tuple(rgb_start))]

    scl_rgb = []
    for start, end in zip(rgb_start, rgb_end):

        # Integer step per channel; negative for a descending scale
        incr = (end - start)//(n_levels-1)

        # Each level is the channel's own start value plus a multiple of the step
        scl_single = [start + item*incr for item in range(n_levels)]

        # Pin the last level to the exact end value (floor division can miss it)
        scl_single[-1] = end
        scl_rgb.append(scl_single)

    scl_out = ["rgb" + str(item) for item in zip(*scl_rgb)]
    return scl_out

In [391]:
def generate_title_string(x,y,group):
    return y + ' per ' + x + ' grouped by ' + group


Fonts:  
Options for `font_family` include "Arial", "Balto", "Courier New", "Droid Sans", "Droid Serif", "Droid Sans Mono", "Gravitas One", "Old Standard TT", "Open Sans", "Overpass", "PT Sans Narrow", "Raleway", "Times New Roman".

In [392]:
# Overall style setting
style_config = dict(
font_family = "Helvetica, Droid Sans",
font_size = 14,
accent_1_color = "rgb(230,85,13)",
accent_2_color = "rgb(49,130,189)",
output_type = "plot") # enum: "html", "plot": "html" returns code for further rendering, "plot" draws a plot inline
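
Since every wrapper reads its fonts and colors from this one dictionary, fitting a corporate design comes down to overriding a few keys. The sketch below shows one way to do that without touching the base style. `derive_style` is a hypothetical helper and the colors and fonts in `corporate_style` are made up for illustration.

```python
# Base style, copied from the cell above
style_config = dict(
    font_family="Helvetica, Droid Sans",
    font_size=14,
    accent_1_color="rgb(230,85,13)",
    accent_2_color="rgb(49,130,189)",
    output_type="plot",
)

def derive_style(base, **overrides):
    """Return a copy of `base` with selected keys replaced (hypothetical helper)."""
    unknown = set(overrides) - set(base)
    if unknown:
        # Catch typos such as accent_1_colour early
        raise KeyError("Unknown style keys: " + ", ".join(sorted(unknown)))
    return dict(base, **overrides)

# Made-up corporate palette, for illustration only
corporate_style = derive_style(
    style_config,
    font_family="Open Sans, Arial",
    accent_1_color="rgb(0,102,153)",
)
print(corporate_style["accent_1_color"])  # rgb(0,102,153)
print(style_config["accent_1_color"])     # rgb(230,85,13), the base is untouched
```

The derived dictionary can then be passed as `config` to any of the chart functions.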


In [393]:
# Chart 

def plotly_bar(config, df, x, y, group, barmode, accent_1=None, accent_2=None, 
               title = None, xaxis_title = None, yaxis_title = None):

    groups =df[group].unique()
    if (accent_1 is not None) or (accent_2 is not None):
        rest = sorted([item for item in groups if item not in [accent_1, accent_2]])[::-1]
    else:
        rest = sorted([item for item in groups])[::-1]
    scl = generate_discrete_scl((37,37,37),(200,200,200),len(rest))
    rest_colors = dict(zip(rest,scl))

    traces = []

    # All other traces
    for item in rest:
        subset = df[df[group]==item]

        traces.append(
        Bar(x = subset[x],
            y = subset[y],
            name = str(item),
            marker=dict(color=rest_colors[item])
           )
        )

    # Append traces for both accents, the order is important for rendering
    # Only these accents get text overlay and custom colors
    # Fall back to the default accent colors if the config leaves them unset
    # (on a copy, so the caller's config is not modified)
    config = dict(config)
    if config.get("accent_2_color") is None:
        config["accent_2_color"] = "rgb(49,130,189)"
    if config.get("accent_1_color") is None:
        config["accent_1_color"] = "rgb(230,85,13)"

    # Accent 2
    if accent_2 is not None:
        traces.append(
            Bar(x = df[df[group]==accent_2][x],
                y = df[df[group]==accent_2][y],
                name = str(accent_2),
                text = df[df[group]==accent_2][y].round(1),textposition = 'auto',constraintext="none",
                marker=dict(color=list(np.repeat(config["accent_2_color"],len(df[df[group]==accent_2][y]))))
               )
            )

    # Accent 1
    if accent_1 is not None:
        traces.append(
            Bar(x = df[df[group]==accent_1][x],
                y = df[df[group]==accent_1][y],
                name = str(accent_1),
                text = df[df[group]==accent_1][y].round(1),textposition = 'auto',constraintext="none",
                marker=dict(color=list(np.repeat(config["accent_1_color"],len(df[df[group]==accent_1][y]))))
               )
            )

    data = traces
    
    if xaxis_title is None: 
        xaxis_title = x
    if yaxis_title is None: 
        yaxis_title = y
    if title is None: 
        title = generate_title_string(x,y,group)

    
    layout = Layout(
        title=title,
        xaxis=dict(
            title= xaxis_title,
            titlefont=dict(
                size=config["font_size"]-1,
                color='rgb(107, 107, 107)'
            ),
            tickfont=dict(
                size=config["font_size"]-1,
                color='rgb(107, 107, 107)'
            )
        ),
        yaxis=dict(
            title=yaxis_title,
            titlefont=dict(
                size=config["font_size"]-1,
                color='rgb(107, 107, 107)'
            ),
            tickfont=dict(
                size=config["font_size"]-1,
                color='rgb(107, 107, 107)'
            )
        ),
        legend=dict(orientation="h",x=0,y=-0.22,
            bgcolor='rgba(0, 0, 0, 0)',
            bordercolor='rgba(0, 0, 0, 0)'
        ),
        margin=dict(t=100),
        barmode=barmode,bargap=0.15,bargroupgap=0.1,
        hovermode="closest", hoverlabel = dict(font=dict(family=config["font_family"], size=config["font_size"])),
        font=dict(family=config["font_family"], size=config["font_size"]))

    
    fig = Figure(data=data, layout=layout)

    if config["output_type"] == "html":
        print("HTML output not enabled yet") 
    else:
        py.iplot(fig, filename='bar')
        

In [394]:
# General chart setup
df = sales_per_market_and_promo
x = "Promotion"
y = "SalesInThousands"
group =  "MarketID"   # variable to make groups
barmode = "group"   # enum: "group", "stack", "relative" (for positive and negative values) 

# Accents (selected levels of grouping variable) 
accent_1 = 1
accent_2 = 5

plotly_bar(config=style_config, df=df, x=x, y=y,group=group,barmode=barmode,accent_1=accent_1,accent_2=accent_2)

In [395]:
plotly_bar(config=style_config, df=df, x=x, y=y,group=group,barmode="stack",accent_1=1,accent_2=5)
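
The wrapper above separates the levels of the grouping variable into the two accents and "the rest", and draws the accents last so that they render on top. That step can be isolated and checked without drawing anything. The following is a minimal sketch; `split_levels` is a hypothetical helper that is not part of the notebook.

```python
def split_levels(levels, accent_1=None, accent_2=None):
    """
    Split grouping levels into (rest, accents).
    `rest` is sorted in descending order, as in the wrappers; `accents`
    lists accent_2 before accent_1 so that accent_1 is drawn last.
    Hypothetical helper, not part of the original notebook.
    """
    accents = [a for a in (accent_2, accent_1) if a is not None]
    rest = sorted((lvl for lvl in levels if lvl not in accents), reverse=True)
    return rest, accents

# The ten MarketID levels with the accents used in the bar chart calls
rest, accents = split_levels(range(1, 11), accent_1=1, accent_2=5)
print(rest)     # [10, 9, 8, 7, 6, 4, 3, 2]
print(accents)  # [5, 1]
```

With no accents given, all levels end up in `rest`, which matches the wrapper's behavior when `accent_1` and `accent_2` are both `None`.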

### 2. Line charts

In [396]:
def plotly_line(config, df, x, y, group, accent_1=None, accent_2=None, accent_linemode=None,  
               title = None, xaxis_title = None, yaxis_title = None):

    groups = df[group].unique()
    if (accent_1 is not None) or (accent_2 is not None):
        rest = sorted([item for item in groups if item not in [accent_1, accent_2]])[::-1]
    else:
        rest = sorted([item for item in groups])[::-1]
    scl = generate_discrete_scl((37,37,37),(200,200,200),len(rest))
    rest_colors = dict(zip(rest,scl))

    traces = []

    # All other traces
    for item in rest:
        subset = df[df[group]==item]

        traces.append(
        Scatter(x = subset[x],
            y = subset[y],
            name = str(item),
            mode = 'lines',
            line=dict(
            color=rest_colors[item],shape='spline',smoothing=0.5))
        )

    # Append traces for both accents, the order is important for rendering
    # Only these accents get text overlay and custom colors
    # Fall back to the default accent colors if the config leaves them unset
    # (on a copy, so the caller's config is not modified)
    config = dict(config)
    if config.get("accent_2_color") is None:
        config["accent_2_color"] = "rgb(49,130,189)"
    if config.get("accent_1_color") is None:
        config["accent_1_color"] = "rgb(230,85,13)"

    # Accent 2
    if accent_2 is not None:
        traces.append(
            Scatter(x = df[df[group]==accent_2][x],
                y = df[df[group]==accent_2][y],
                name = str(accent_2),
                mode = accent_linemode,
                text = df[df[group]==accent_2][y].round(1),textposition = 'top center',
                line=dict(color=config["accent_2_color"], width = 3,shape='spline',
                          smoothing=0.5),
                marker=dict(size=8),textfont=dict(color=config["accent_2_color"])
               )
            )

    # Accent 1
    if accent_1 is not None:
        traces.append(
            Scatter(x = df[df[group]==accent_1][x],
                y = df[df[group]==accent_1][y],
                name = str(accent_1),
                mode = accent_linemode,
                text = df[df[group]==accent_1][y].round(1),textposition = 'top center',
                line=dict(color=config["accent_1_color"], width = 3,shape='spline',
                          smoothing=0.5),
                marker=dict(size=8),textfont=dict(color=config["accent_1_color"])
               )
            )

    data = traces
    
    if xaxis_title is None: 
        xaxis_title = x
    if yaxis_title is None: 
        yaxis_title = y
    if title is None: 
        title = generate_title_string(x,y,group)

    
    layout = Layout(
        title=title,
        xaxis=dict(
            title= xaxis_title,
            titlefont=dict(
                size=config["font_size"]-1,
                color='rgb(107, 107, 107)'
            ),
            tickfont=dict(
                size=config["font_size"]-1,
                color='rgb(107, 107, 107)'
            )
        ),
        yaxis=dict(
            title=yaxis_title,
            titlefont=dict(
                size=config["font_size"]-1,
                color='rgb(107, 107, 107)'
            ),
            tickfont=dict(
                size=config["font_size"]-1,
                color='rgb(107, 107, 107)'
            )
        ),
        legend=dict(orientation="h",x=0,y=-0.22,
            bgcolor='rgba(0, 0, 0, 0)',
            bordercolor='rgba(0, 0, 0, 0)'
        ),
        margin=dict(t=100),
        hovermode="closest", hoverlabel = dict(font=dict(family=config["font_family"], size=config["font_size"])),
        font=dict(family=config["font_family"], size=config["font_size"]))

    
    fig = Figure(data=data, layout=layout)

    if config["output_type"] == "html":
        print("HTML output not enabled yet") 
    else:
        py.iplot(fig, filename='line')
        

In [397]:
plotly_line(config=style_config, df=df, x="MarketID", y="SalesInThousands", 
            group="Promotion",accent_1=3,accent_2=None, accent_linemode="lines+markers+text")

In [398]:
plotly_line(config=style_config, df=df, x="Promotion", y="SalesInThousands", 
            group="MarketID",accent_1=2,accent_2=3, accent_linemode="lines+markers+text")

### 3. Scatter plots

In [399]:
def plotly_scatter(config, df, x, y, group, accent_1=None, accent_2=None, 
               title = None, xaxis_title = None, yaxis_title = None):

    groups = df[group].unique()
    if (accent_1 is not None) or (accent_2 is not None):
        rest = sorted([item for item in groups if item not in [accent_1, accent_2]])[::-1]
    else:
        rest = sorted([item for item in groups])[::-1]
    scl = generate_discrete_scl((37,37,37),(200,200,200),len(rest))
    rest_colors = dict(zip(rest,scl))

    traces = []

    # All other traces
    for item in rest:
        subset = df[df[group]==item]

        traces.append(
        Scatter(x = subset[x],
            y = subset[y],
            name = str(item),
            mode = 'markers',
            marker=dict(color=rest_colors[item]))
        )

    # Append traces for both accents, the order is important for rendering
    # Only these accents get custom colors
    # Fall back to the default accent colors if the config leaves them unset
    # (on a copy, so the caller's config is not modified)
    config = dict(config)
    if config.get("accent_2_color") is None:
        config["accent_2_color"] = "rgb(49,130,189)"
    if config.get("accent_1_color") is None:
        config["accent_1_color"] = "rgb(230,85,13)"

    # Accent 2
    if accent_2 is not None:
        traces.append(
            Scatter(x = df[df[group]==accent_2][x],
                y = df[df[group]==accent_2][y],
                name = str(accent_2),
                mode = "markers",
                marker=dict(color=config["accent_2_color"])
               )
            )

    # Accent 1
    if accent_1 is not None:
        traces.append(
            Scatter(x = df[df[group]==accent_1][x],
                y = df[df[group]==accent_1][y],
                name = str(accent_1),
                mode = "markers",
                marker=dict(color=config["accent_1_color"])
               )
            )

    data = traces
    
    if xaxis_title is None: 
        xaxis_title = x
    if yaxis_title is None: 
        yaxis_title = y
    if title is None: 
        title = generate_title_string(x,y,group)

    
    layout = Layout(
        title=title,
        xaxis=dict(
            title= xaxis_title,
            titlefont=dict(
                size=config["font_size"]-1,
                color='rgb(107, 107, 107)'
            ),
            tickfont=dict(
                size=config["font_size"]-1,
                color='rgb(107, 107, 107)'
            )
        ),
        yaxis=dict(
            title=yaxis_title,
            titlefont=dict(
                size=config["font_size"]-1,
                color='rgb(107, 107, 107)'
            ),
            tickfont=dict(
                size=config["font_size"]-1,
                color='rgb(107, 107, 107)'
            )
        ),
        legend=dict(orientation="h",x=0,y=-0.22,
            bgcolor='rgba(0, 0, 0, 0)',
            bordercolor='rgba(0, 0, 0, 0)'
        ),
        margin=dict(t=100),
        hovermode="closest", hoverlabel = dict(font=dict(family=config["font_family"], size=config["font_size"])),
        font=dict(family=config["font_family"], size=config["font_size"]))

    
    fig = Figure(data=data, layout=layout)

    if config["output_type"] == "html":
        print("HTML output not enabled yet") 
    else:
        py.iplot(fig, filename='scatter')
        

In [400]:
plotly_scatter(config=style_config, df=dat, x="AgeOfStore", y="SalesInThousands", 
            group="MarketID",accent_1=2,accent_2=3)

In [401]:
plotly_scatter(config=style_config, df=dat, x="week", y="SalesInThousands", 
            group="Promotion")
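
The three wrappers above build nearly identical layouts; only the bar chart adds `barmode`, `bargap` and `bargroupgap`. One way to keep the visual style consistent in a single place would be to build the shared part as a plain dictionary. This is a sketch only: `base_layout` is a hypothetical helper, and the values are copied from the functions above.

```python
def base_layout(config, title, xaxis_title, yaxis_title, **extra):
    """Build the layout dict shared by the wrappers (hypothetical helper)."""
    axis_font = dict(size=config["font_size"] - 1, color="rgb(107, 107, 107)")
    base_font = dict(family=config["font_family"], size=config["font_size"])

    def axis(axis_title):
        # Separate copies, so the two axes never share a mutable dict
        return dict(title=axis_title,
                    titlefont=dict(axis_font),
                    tickfont=dict(axis_font))

    layout = dict(
        title=title,
        xaxis=axis(xaxis_title),
        yaxis=axis(yaxis_title),
        legend=dict(orientation="h", x=0, y=-0.22,
                    bgcolor="rgba(0, 0, 0, 0)",
                    bordercolor="rgba(0, 0, 0, 0)"),
        margin=dict(t=100),
        hovermode="closest",
        hoverlabel=dict(font=dict(base_font)),
        font=dict(base_font),
    )
    layout.update(extra)  # chart-specific settings, e.g. barmode for bar charts
    return layout

config = dict(font_family="Helvetica, Droid Sans", font_size=14)
bar_layout = base_layout(config, "Sales", "Promotion", "SalesInThousands",
                         barmode="group", bargap=0.15, bargroupgap=0.1)
print(bar_layout["barmode"], bar_layout["xaxis"]["tickfont"]["size"])  # group 13
```

The resulting dictionary should be usable in place of the `Layout(...)` call, for example as `Figure(data=traces, layout=bar_layout)`.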

To be fixed:  
  
4) Scatter and other charts - make group optional - skip the colorscale and just plot dark grey   
5) Extend for HTML div output: compare (https://stackoverflow.com/questions/36262748/python-save-plotly-plot-to-local-file-and-insert-into-html)  
6) Extend to further chart types (one function per chart type)  
7) Grouping variable name is missing from the legend
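
For item 5, the library's own `plotly.offline.plot(fig, output_type="div")` is probably the most direct route. To illustrate what such a div contains, the sketch below assembles one by hand from plain dictionaries using only the standard library. It assumes that plotly.js is already loaded on the host page; `figure_to_div` is a hypothetical helper, and the sample values are the Market 1 sales from the aggregated table above.

```python
import json
import uuid

def figure_to_div(data, layout, div_id=None):
    """
    Render plain-dict traces and layout as an HTML snippet.
    Assumes plotly.js has already been loaded by the host page.
    Hypothetical helper, not part of the original notebook.
    """
    div_id = div_id or "plot-" + uuid.uuid4().hex
    # json.dumps only handles plain Python types; pandas or NumPy values
    # would need converting first, e.g. with .tolist().
    # "</" is escaped so the payload cannot close the <script> tag early.
    payload = json.dumps({"data": data, "layout": layout}).replace("</", "<\\/")
    return (
        '<div id="{0}"></div>\n'
        "<script>\n"
        "  var fig = {1};\n"
        '  Plotly.newPlot("{0}", fig.data, fig.layout);\n'
        "</script>"
    ).format(div_id, payload)

traces = [dict(type="bar", x=[1, 2, 3], y=[814.38, 603.04, 407.87], name="1")]
html = figure_to_div(traces, dict(title="Sales per promotion"), div_id="bar-demo")
print(html)
```

A wrapper could return this string when `config["output_type"] == "html"` instead of printing the "not enabled yet" message.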

Low prio / nice to have: 
- Further enhancement for scl generation (values over 255, levels over 255.., avoid negative values)
- X Axis ticks for categorical variables    
- Add % of group total option into text for the accents  
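
For the colorscale item in the list above, one possible direction is to interpolate each channel with floats and then round and clamp the result, so that values stay within 0..255 regardless of the inputs. This is a sketch under that assumption, not a drop-in replacement; `discrete_rgb_scale` is a hypothetical name.

```python
def discrete_rgb_scale(rgb_start, rgb_end, n_levels):
    """
    Interpolate n_levels colors between two RGB tuples.
    Channels are rounded and clamped to 0..255.
    Hypothetical variant, not part of the original notebook.
    """
    if n_levels < 1:
        return []

    def clamp(value):
        return max(0, min(255, int(round(value))))

    colors = []
    for level in range(n_levels):
        # Fraction of the way from start to end; 0 when there is a single level
        t = level / (n_levels - 1) if n_levels > 1 else 0.0
        channels = [clamp(s + t * (e - s)) for s, e in zip(rgb_start, rgb_end)]
        colors.append("rgb({},{},{})".format(*channels))
    return colors

print(discrete_rgb_scale((0, 0, 0), (200, 200, 200), 3))
# ['rgb(0,0,0)', 'rgb(100,100,100)', 'rgb(200,200,200)']
print(discrete_rgb_scale((0, 0, 0), (300, -20, 128), 2))
# ['rgb(0,0,0)', 'rgb(255,0,128)']
```

Requesting more levels than there are distinct channel values simply yields repeated colors rather than an error.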
 
