## Dynamic Visualizations with Plotly (Rui Wang)

In [None]:
import plotly
import plotly.plotly as py
import plotly.graph_objs as go
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
plotly.tools.set_credentials_file(username='Rui1521', api_key='pzyDXfPbHKuTcOUg8Uvg')

## 4. 

In [34]:
print("Total number of observations:", data.shape[0])
print("Number of Unique values per column:")
# Print out number of unique values in each column
for item in data.columns:
    print(item, len(data[item].unique()))

Total number of observations: 851264
Number of Unique values per column:
date 1762
symbol 501
open 97522
close 98520
low 97470
high 97784
volume 171073


Taken on its own, every column behaves as a value column: each one contains duplicate values, so no single column is suitable as a key. The combination of date and symbol, however, can be used as the key of this table.
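
The key argument above can be checked directly. The following is a minimal sketch on a toy table (the rows and prices are made up for illustration, and `is_key` is a hypothetical helper, not part of the original notebook). It tests whether a set of columns identifies each row uniquely.

```python
import pandas as pd

# Toy stand-in for the prices table (values are made up for illustration).
toy = pd.DataFrame({
    "date":   ["2016-01-04", "2016-01-04", "2016-01-05", "2016-01-05"],
    "symbol": ["AAPL", "WMT", "AAPL", "WMT"],
    "close":  [105.35, 61.46, 102.71, 62.92],
})

def is_key(frame, columns):
    """Return True if no two rows share the same values in `columns`."""
    return not frame.duplicated(subset=columns).any()

date_is_key = is_key(toy, ["date"])              # dates repeat across symbols
symbol_is_key = is_key(toy, ["symbol"])          # symbols repeat across dates
pair_is_key = is_key(toy, ["date", "symbol"])    # the pair is tested jointly
print(date_is_key, symbol_is_key, pair_is_key)
```

On the real table, the same check would be `is_key(data, ["date", "symbol"])`, assuming `data` still holds the DataFrame loaded from the CSV.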

## 5.

At the high level, the task is to analyze stock prices over time, so that people can discover the trend and stability of prices. At the middle level, stock investors can use the vis to look up historical price information for any time interval. At the low level, a user should be able to easily identify the closing price at any timestamp and compare prices among different stocks. The users' targets are trends and individual values, as well as the similarities and differences among multiple stocks. Because the closing price represents the most up-to-date valuation of a security until trading commences again on the next trading day, we plot the closing prices of five stocks over time.

## 6.

1. Line Chart.

2. Stacked Area Chart.

## 7 & 8

### Visualization One

In [84]:
data = pd.read_csv("nyse/prices-split-adjusted.csv")
# Only analyze and visualize five of these stocks.
stocks = ['AAPL', 'CLX', 'ETR', 'MCK', 'WMT']
AAPL = data[data.symbol == 'AAPL']
CLX = data[data.symbol == 'CLX']
ETR = data[data.symbol == 'ETR']
MCK = data[data.symbol == 'MCK']
WMT = data[data.symbol == 'WMT']

In [97]:
# Line Chart
trace_AAPL = go.Scatter(
    x=AAPL.date,
    y=AAPL.close,
    name = "AAPL Close",
    line = dict(color = '#17BECF'),
    opacity = 0.8)

trace_ETR = go.Scatter(
    x=ETR.date,
    y=ETR.close,
    name = "ETR Close",
    line = dict(color = '#7F7F7F'),
    opacity = 0.8)

trace_MCK = go.Scatter(
    x=MCK.date,
    y=MCK.close,
    name = "MCK Close",
    line = dict(color = "#607d8b"),
    opacity = 0.8)

trace_WMT = go.Scatter(
    x=WMT.date,
    y=WMT.close,
    name = "WMT Close",
    line = dict(color = "#795548"),
    opacity = 0.8)

trace_CLX = go.Scatter(
    x=CLX.date,
    y=CLX.close,
    name = "CLX Close",
    line = dict(color = "#E91E63"),
    opacity = 0.8)

data = [trace_AAPL, trace_ETR, trace_MCK, trace_WMT, trace_CLX]

# Range Slider and Range Selector.
layout = dict(
    title= "Line Chart: Close Price(AAPL, CLX, ETR, MCK, WMT)",
    xaxis=dict(
        # Range Selector.
        rangeselector=dict(
            buttons=list([
                dict(count=1,
                     label='1m',
                     step='month',
                     stepmode='backward'),
                dict(count=6,
                     label='6m',
                     step='month',
                     stepmode='backward'),
                dict(step='all')
            ])
        ),
        # Range Slider 
        rangeslider=dict(
            visible = True
        ),
        type='date'
    ),
    yaxis=dict(
        title='Stock Price',
        titlefont=dict(
            family='Courier New, monospace',
            size=18,
            color='#7f7f7f'
        )
    ),
)

fig = dict(data=data, layout=layout)
py.iplot(fig, filename = "Multiple Line Plots")
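
The five trace definitions in the cell above differ only in symbol and color, so they could also be generated in a loop. The sketch below is one way to do that, assuming plain trace dictionaries with `type="scatter"` (the notebook already passes plain dicts for the figure and layout). The toy prices are made up, and `make_line_traces` is a hypothetical helper.

```python
import pandas as pd

# Toy stand-in for the prices table; values are made up for illustration.
prices = pd.DataFrame({
    "date":   ["2016-01-04", "2016-01-05"] * 2,
    "symbol": ["AAPL", "AAPL", "WMT", "WMT"],
    "close":  [105.0, 103.0, 61.0, 63.0],
})

# One color per symbol, taken from the colors used in the cell above.
colors = {"AAPL": "#17BECF", "WMT": "#795548"}

def make_line_traces(frame, colors):
    """Build one scatter-trace dict per symbol listed in `colors`."""
    traces = []
    for symbol, color in colors.items():
        rows = frame[frame["symbol"] == symbol]
        traces.append(dict(
            type="scatter",
            x=rows["date"].tolist(),
            y=rows["close"].tolist(),
            name=symbol + " Close",
            line=dict(color=color),
            opacity=0.8,
        ))
    return traces

traces = make_line_traces(prices, colors)
print([t["name"] for t in traces])
```

The resulting list could take the place of the hand-written trace list in `fig = dict(data=..., layout=layout)`.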

a. Line Chart. For each stock, we have an ordered key attribute "date" and a quantitative value attribute "close". A line chart encodes the data by plotting a point for each (date, close) pair and connecting consecutive points with line marks, using the position channel for both attributes. It shows how the quantitative value changes over time, which makes it well suited to the abstract task of spotting trends.

 
b. The MCK stock price increased rapidly starting in 2013 but gradually decreased in the most recent years. The other four stock prices were relatively stable from 2011 to 2016.


c. Interactivity: Range Slider and Range Selector. These two interactions help users easily find any time interval they want to analyze. This makes the vis a tool that supports investigation at multiple levels of detail, ranging from a very high-level overview to a fully detailed view of a small part of the data.
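
To make the range selector buttons concrete: a button with `count=6`, `step='month'`, and `stepmode='backward'` sets the start of the visible range by going back six months from its end. The sketch below only illustrates that date arithmetic with pandas. Plotly performs the actual computation in the browser and may differ in edge cases. `backward_window` is a hypothetical helper and the end date is an arbitrary example.

```python
import pandas as pd

def backward_window(last_date, count, step="month"):
    """Return (start, end) for a window ending at `last_date` and reaching `count` steps back."""
    end = pd.Timestamp(last_date)
    if step == "month":
        start = end - pd.DateOffset(months=count)
    elif step == "year":
        start = end - pd.DateOffset(years=count)
    else:
        raise ValueError("unsupported step: " + step)
    return start, end

# Arbitrary example end date; the '6m' button corresponds to count=6, step='month'.
start_6m, end_6m = backward_window("2016-12-30", 6)
print(start_6m.date(), end_6m.date())
```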

### Visualization Two

In [113]:
# Stacked Area Plot
trace_AAPL = go.Scatter(
    x=AAPL.date,
    y=AAPL.close,
    name = "AAPL Close",
    opacity = 0.8,
    stackgroup='one',
    fill='tozeroy')

trace_ETR = go.Scatter(
    x=ETR.date,
    y=ETR.close,
    name = "ETR Close",
    fill='tonexty',
    stackgroup='one',
    opacity = 0.8)

trace_MCK = go.Scatter(
    x=MCK.date,
    y=MCK.close,
    name = "MCK Close",
    fill='tonexty',
    stackgroup='one',
    opacity = 0.8)

trace_WMT = go.Scatter(
    x=WMT.date,
    y=WMT.close,
    name = "WMT Close",
    fill='tonexty',
    stackgroup='one',
    opacity = 0.8)

trace_CLX = go.Scatter(
    x=CLX.date,
    y=CLX.close,
    name = "CLX Close",
    fill='tonexty',
    stackgroup='one',
    opacity = 0.8)


data = [trace_AAPL, trace_ETR, trace_MCK, trace_WMT, trace_CLX]

# Update Button
updatemenus = list([
    dict(type="buttons",
         active=-1,
         buttons=list([
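             # 'update' is needed here: 'relayout' only modifies layout
             # attributes and cannot change trace visibility.
             dict(label = 'Reset',
                  method = 'update',
                  args = [{'visible': [True, True, True, True, True]},
                          {'title': "Stacked Area Chart: Close Price(AAPL, CLX, ETR, MCK, WMT)"}]),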
            dict(label = 'AAPL',
                 method = 'update',
                 args = [{'visible': [True, False, False, False, False]},
                         {'title': 'AAPL'}]),
            dict(label = 'ETR',
                 method = 'update',
                 args = [{'visible': [False, True, False, False, False]},
                         {'title': 'ETR'}]),
            dict(label = 'MCK',
                 method = 'update',
                 args = [{'visible': [False, False, True, False, False]},
                         {'title': 'MCK'}]),
            dict(label = 'WMT',
                 method = 'update',
                 args = [{'visible': [False, False, False, True, False]},
                         {'title': 'WMT'}]), 
            dict(label = 'CLX',
                 method = 'update',
                 args = [{'visible': [False, False, False, False, True]},
                         {'title': 'CLX'}])
        ]),
    )
])

# Range Slider and Range Selector.
layout = dict(
    title= "Stacked Area Chart: Close Price(AAPL, CLX, ETR, MCK, WMT)",
    xaxis=dict(
        rangeselector=dict(
            buttons=list([
                dict(count=1,
                     label='1m',
                     step='month',
                     stepmode='backward'),
                dict(count=6,
                     label='6m',
                     step='month',
                     stepmode='backward'),
                dict(step='all')
            ])
        ),
        rangeslider=dict(
            visible = True
        ),
        type='date'
    ),
    yaxis=dict(
        title='Stock Price',
        titlefont=dict(
            family='Courier New, monospace',
            size=18,
            color='#7f7f7f'
        )
    ),
    updatemenus=updatemenus
)

fig = dict(data=data, layout=layout)
py.iplot(fig, filename='stacked-area-plot')
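
The hand-written visibility lists in the update menu above follow a regular pattern: one `True` per button, at the position of the trace to show. As a sketch, they could be generated from the trace names. `visibility_buttons` is a hypothetical helper, and it uses the `update` method for every button because that method can change trace attributes and layout attributes together.

```python
def visibility_buttons(names, reset_title):
    """Build 'update' buttons: one Reset button plus one button per trace name."""
    buttons = [dict(
        label="Reset",
        method="update",
        args=[{"visible": [True] * len(names)}, {"title": reset_title}],
    )]
    for i, name in enumerate(names):
        # Only the i-th trace stays visible for this button.
        mask = [j == i for j in range(len(names))]
        buttons.append(dict(
            label=name,
            method="update",
            args=[{"visible": mask}, {"title": name}],
        ))
    return buttons

# The order must match the order of the traces in the figure's data list.
names = ["AAPL", "ETR", "MCK", "WMT", "CLX"]
buttons = visibility_buttons(names, "Stacked Area Chart: Close Price")
print(buttons[3]["args"][0]["visible"])
```

The returned list could be passed as the `buttons` entry of the `updatemenus` dictionary.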

a. Stacked Area Chart. It suits a multidimensional table with one ordered key attribute and multiple quantitative attributes. An area chart encodes the data with area marks and the area channel. By stacking the areas on top of each other, we get a visual summation of the time-series values.

b. The MCK stock price increased rapidly starting in 2013 but gradually decreased in the most recent years. The other four stock prices were relatively stable from 2011 to 2016. All five stocks show similar short-term trends.

c. Interactivity: Range Slider, Range Selector, and update buttons. Even though a stacked area chart gives a visual summation of time-series values, stacking can make it difficult to interpret the trend of an individual stock accurately. So, in addition to the Range Slider and Range Selector, I add update buttons that let users select which stock they want to analyze.
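
The visual summation described in (a), and the interpretation problem described in (c), can both be seen in a small numeric sketch. The toy prices below are made up for illustration. The upper edge of each stacked layer is the running sum of that layer and all layers below it.

```python
import pandas as pd

# Toy closing prices; values are made up for illustration.
toy = pd.DataFrame({
    "date":   ["2016-01-04", "2016-01-05"] * 3,
    "symbol": ["AAPL", "AAPL", "ETR", "ETR", "MCK", "MCK"],
    "close":  [100.0, 110.0, 70.0, 70.0, 150.0, 150.0],
})

# One row per date, one column per stock, in stacking order.
wide = toy.pivot(index="date", columns="symbol", values="close")[["AAPL", "ETR", "MCK"]]

# The upper edge of each layer is the running sum of the layers up to it.
upper_edges = wide.cumsum(axis=1)
print(upper_edges)
```

In this toy data, ETR and MCK are flat, yet their upper edges rise because the AAPL layer beneath them rose. Showing one stock at a time removes that distortion.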