# Dynamic Visualizations with Plotly  
## Qinyu Chen

In [8]:
# setup
import plotly.tools  # binds the name `plotly` so the plotly.tools call below resolves
import plotly.plotly as py
import plotly.graph_objs as go
plotly.tools.set_credentials_file(username='crowwwww56', api_key='uZl6muzM8nEjWEtaub9h')
import pandas as pd
import numpy as np

### 1. Label the 7 columns in the dataset as key or value

In [64]:
# load dataset
prices = pd.read_csv("prices.csv")
prices.head()

|   | date | symbol | open | close | low | high | volume |
|---|------|--------|------|-------|-----|------|--------|
| 0 | 2016-01-05 00:00:00 | WLTW | 123.43 | 125.839996 | 122.309998 | 126.25 | 2163600.0 |
| 1 | 2016-01-06 00:00:00 | WLTW | 125.239998 | 119.980003 | 119.940002 | 125.540001 | 2386400.0 |
| 2 | 2016-01-07 00:00:00 | WLTW | 116.379997 | 114.949997 | 114.93 | 119.739998 | 2489500.0 |
| 3 | 2016-01-08 00:00:00 | WLTW | 115.480003 | 116.620003 | 113.5 | 117.440002 | 2006300.0 |
| 4 | 2016-01-11 00:00:00 | WLTW | 117.010002 | 114.970001 | 114.089996 | 117.330002 | 1408600.0 |


In [129]:
prices.shape

(851264, 7)

In [136]:
print(len(prices['date'].unique()))
print(len(prices['symbol'].unique()))
print(len(prices['open'].unique()))
print(len(prices['close'].unique()))
print(len(prices['low'].unique()))
print(len(prices['high'].unique()))
print(len(prices['volume'].unique()))

3524
501
72707
73628
72900
73299
171073


After counting the distinct values in each column, it turns out that no single column is unique: every count is far below the 851,264 rows in the table. According to VAD page 59, this is a simple flat table in which each item corresponds to a row. The key is completely implicit: it is the index of the row. The columns 'date', 'symbol', 'open', 'close', 'low', 'high', and 'volume' are all values.
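The check above can be condensed into a small helper. The following is only a sketch: it is written without pandas so that it runs on its own, the function names are hypothetical, and the toy rows are made up rather than taken from `prices.csv`. A column can serve as a key by itself only if its number of distinct values equals the number of rows.

```python
def unique_counts(rows, columns):
    """Count the distinct values in each column."""
    return {col: len({row[col] for row in rows}) for col in columns}


def single_column_keys(rows, columns):
    """Return the columns whose values are unique across all rows."""
    counts = unique_counts(rows, columns)
    return [col for col in columns if counts[col] == len(rows)]


# Hypothetical toy rows, NOT taken from prices.csv
toy_rows = [
    {"date": "2016-01-05", "symbol": "AAA", "close": 10.0},
    {"date": "2016-01-05", "symbol": "BBB", "close": 10.0},
    {"date": "2016-01-06", "symbol": "AAA", "close": 11.0},
    {"date": "2016-01-06", "symbol": "BBB", "close": 12.0},
]
toy_columns = ["date", "symbol", "close"]

counts = unique_counts(toy_rows, toy_columns)
keys = single_column_keys(toy_rows, toy_columns)
print(counts)  # distinct values per column
print(keys)    # an empty list means no single column is a key
```

With the DataFrame loaded earlier, `prices[col].is_unique` should give the same per-column answer.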

### 2. Come up with a task a user might be interested in performing with this dataset

User: Cheryl has been offered jobs by three big technology companies: Google, Apple, and Microsoft. She wants to work for a company that is growing, and she suggests looking at the trend in the stock close price to determine which company is on the rise.  

Actions that define the user goal: At the high level, Cheryl wants to discover how the stock close prices of these companies evolve over time. At the mid level, she wants to look up the detailed stock close prices. At the low level, she wants to compare the trends to see which one looks best.  

Task abstraction: Compare three companies' stock close price evolution trends.

### 3. Visualization 1

In [74]:
# prepare data for plotting
google = prices[prices['symbol']=='GOOGL']
apple = prices[prices['symbol']=='AAPL']
microsoft = prices[prices['symbol']=='MSFT']

In [127]:
# close-only line chart
trace_google = go.Scatter(
    x=google.date,
    y=google.close,
    name = "Google Close Price",
    line = dict(color = '#1ecbba'),
    opacity = 1)

trace_apple = go.Scatter(
    x=apple.date,
    y=apple.close,
    name = "Apple Close Price",
    line = dict(color = '#f0a616'),
    opacity = 1)

trace_microsoft = go.Scatter(
    x=microsoft.date,
    y=microsoft.close,
    name = "Microsoft Close Price",
    line = dict(color = '#e45a42'),
    opacity = 1)

data = [trace_google,trace_apple,trace_microsoft]

updatemenus = list([
    dict(type="buttons",
         active=-1,
         buttons=list([
            dict(label = 'All',
                 method = 'update',
                 args = [{'visible': [True, True, True]},
                         {'title': 'Stock close price from 2010 to 2016'}]),
            dict(label = 'Google',
                 method = 'update',
                 args = [{'visible': [True, False, False]},
                         {'title': 'Google stock close price from 2010 to 2016'}]),
            dict(label = 'Apple',
                 method = 'update',
                 args = [{'visible': [False, True, False]},
                         {'title': 'Apple stock close price from 2010 to 2016'}]),
            dict(label = 'Microsoft',
                 method = 'update',
                 args = [{'visible': [False, False, True]},
                         {'title': 'Microsoft stock close price from 2010 to 2016'}])
        ]),
    )
])

layout = dict(
    title='Stock close price from 2010 to 2016', 
    updatemenus=updatemenus,
#    xaxis=dict(
#       rangeslider=dict(
#            visible = True
#        ),
#       type='date'
#   )
)

fig = dict(data=data, layout=layout)
py.iplot(fig, filename = "Stock close price from 2010 to 2016")

a. For this visualization, I use lines as marks to encode the trend, which reads clearly for time series data. I use color as the channel to encode the three companies. According to VAD page 249, 'For small regions, designers should use bright, highly saturated colors to ensure that the color coding is distinguishable.' I therefore chose three highly saturated colors. Because the company name is a categorical attribute, I also chose hues that are clearly separated: red, yellow, and blue-green.  
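The mapping from company to color can be made explicit in code. The sketch below builds each trace as a plain dict, which Plotly accepts in place of `go.Scatter`. The helper `make_line_trace` and the `COMPANY_COLORS` name are hypothetical, the hex values are copied from the cell above, and `mode="lines"` is an assumption added here to force line marks.

```python
# Color channel: one clearly separated, saturated hue per company
# (hex values copied from the plotting cell above).
COMPANY_COLORS = {
    "Google": "#1ecbba",
    "Apple": "#f0a616",
    "Microsoft": "#e45a42",
}


def make_line_trace(company, dates, closes):
    """Build one line trace as a plain dict (hypothetical helper)."""
    return dict(
        type="scatter",
        mode="lines",  # assumption: draw line marks only, no point markers
        x=list(dates),
        y=list(closes),
        name=f"{company} Close Price",
        line=dict(color=COMPANY_COLORS[company]),
        opacity=1,
    )


# Hypothetical toy values, only to show the structure of a trace
example = make_line_trace("Apple", ["2016-01-05", "2016-01-06"], [1.0, 2.0])
print(example["name"], example["line"]["color"])
```

Keeping the colors in one dictionary means the categorical encoding is defined in a single place, so a fourth company would only need one more entry.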

b. The plot shows that Google's stock close price is generally higher than Apple's, and Apple's is higher than Microsoft's. For Google and Apple, there appears to be a sharp drop in the stock close price in 2014. Since then, Google has had the highest rate of increase. 

c. The first interaction is hovering: when the user hovers over the graph, the exact values along the trend are shown, which makes the graph more readable. The second interaction is the set of navigation buttons on the left side of the graph, which switch between the comparison view and the detailed view of each company. The third interaction is zooming: the user can zoom in and out with the '+' and '-' buttons at the top right to inspect a shorter or longer period. 
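The navigation buttons all follow one pattern: each button sets a boolean `visible` mask over the traces and updates the title. As a sketch, the four hand-written button dicts could be generated by a helper such as the hypothetical `make_visibility_buttons` below; it assumes the trace order matches the order of `names`.

```python
def make_visibility_buttons(names, base_title):
    """Return an 'All' button plus one button per trace (hypothetical helper)."""
    buttons = [dict(
        label="All",
        method="update",
        args=[{"visible": [True] * len(names)},
              {"title": base_title[0].upper() + base_title[1:]}],
    )]
    for i, name in enumerate(names):
        # Only the i-th trace stays visible for this button
        mask = [j == i for j in range(len(names))]
        buttons.append(dict(
            label=name,
            method="update",
            args=[{"visible": mask}, {"title": f"{name} {base_title}"}],
        ))
    return buttons


buttons = make_visibility_buttons(
    ["Google", "Apple", "Microsoft"],
    "stock close price from 2010 to 2016",
)
updatemenus = [dict(type="buttons", active=-1, buttons=buttons)]
print([b["label"] for b in buttons])
```

The resulting `updatemenus` list has the same shape as the one written out by hand in the plotting cell.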

### 4. Visualization 2

In [None]:
# prepare data for plotting
google = prices[prices['symbol']=='GOOGL']
apple = prices[prices['symbol']=='AAPL']
microsoft = prices[prices['symbol']=='MSFT']

In [137]:
# close-only stacked area chart
trace_google = dict(
    x=google.date,
    y=google.close,
    hoverinfo='x+y',
    mode='lines',
    name = "Google Close Price",
    line=dict(width=1,
              color='#3CAEA3'),
    stackgroup='one'
)
trace_apple = dict(
    x=apple.date,
    y=apple.close,
    hoverinfo='x+y',
    mode='lines',
    name = "Apple Close Price",
    line=dict(width=1,
              color='#446cdf'),
    stackgroup='one'
)
trace_microsoft = dict(
    x=microsoft.date,
    y=microsoft.close,
    hoverinfo='x+y',
    mode='lines',
    name = "Microsoft Close Price",
    line=dict(width=1,
              color='#dd78cd'),
    stackgroup='one'
)

data = [trace_google, trace_apple, trace_microsoft]

updatemenus = list([
    dict(type="buttons",
         active=-1,
         buttons=list([
            dict(label = 'All',
                 method = 'update',
                 args = [{'visible': [True,True, True]},
                         {'title': 'Stock close price from 2010 to 2016'}]),
            dict(label = 'Google',
                 method = 'update',
                 args = [{'visible': [True, False, False]},
                         {'title': 'Google stock close price from 2010 to 2016'}]),
            dict(label = 'Apple',
                 method = 'update',
                 args = [{'visible': [False, True, False]},
                         {'title': 'Apple stock close price from 2010 to 2016'}]),
            dict(label = 'Microsoft',
                 method = 'update',
                 args = [{'visible': [False, False, True]},
                         {'title': 'Microsoft stock close price from 2010 to 2016'}])
        ]),
    )
])

layout = dict(
    title='Stock close price from 2010 to 2016', 
    updatemenus=updatemenus,
#    xaxis=dict(
#        rangeslider=dict(
#            visible = True
#        ),
#        type='date'
#    )
)

fig = dict(data=data,layout=layout)
py.iplot(fig, filename='Stock close price from 2010 to 2016', validate=False)

a. For this visualization, I use areas as marks to encode the price trend, and I stack the areas, which is useful for comparing multiple variables changing over an interval. I use color as the channel to encode the three companies. According to VAD page 249, 'When colored regions are large, as in backgrounds, the design guideline is the opposite: use low-saturation colors; that is, pastels.' I therefore chose three colors with lower saturation. Because the company name is a categorical attribute, I also chose hues that are clearly separated: pink, blue, and green.  

b. The plot shows that Google's stock close price is higher than Apple's, and Apple's is higher than Microsoft's.

c. The first interaction is hovering: when the user hovers over the graph, the exact values along the trend are shown, which makes the graph more readable. The second interaction is the set of navigation buttons on the left side of the graph, which switch between the comparison view and the detailed view of each company. The third interaction is zooming: the user can zoom in and out with the '+' and '-' buttons at the top right to inspect a shorter or longer period. 
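When traces share a `stackgroup`, Plotly adds their y values, so the upper edge of each band is a running total and a company's own price is the thickness of its band. The sketch below shows that arithmetic. It assumes all series are aligned on the same dates, uses made-up numbers, and `stack_tops` is a hypothetical helper.

```python
def stack_tops(series_in_order):
    """Return the upper boundary of each stacked band (hypothetical helper).

    series_in_order: list of equally long lists, bottom band first.
    """
    tops = []
    running = [0.0] * len(series_in_order[0])
    for values in series_in_order:
        # Each band sits on top of the sum of all bands below it
        running = [r + v for r, v in zip(running, values)]
        tops.append(running)
    return tops


# Hypothetical toy prices for two dates, bottom band first
toy_series = [
    [10.0, 20.0],  # bottom band
    [5.0, 5.0],    # middle band
    [1.0, 2.0],    # top band
]
tops = stack_tops(toy_series)
print(tops)  # the last entry is the total height of the stack on each date
```

In the stacked view, the height of a band's upper edge above the axis is therefore a sum, and only the band thickness should be compared across companies.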
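The layout cell also contains a commented-out `xaxis` block. Enabling it would add a range slider under the x-axis, a further way to narrow the displayed period. The sketch below shows what the layout dict would look like with that block active; it omits `updatemenus` for brevity and has not been run against the chart above.

```python
# Layout with the range slider enabled (the block that is commented out above)
layout_with_slider = dict(
    title="Stock close price from 2010 to 2016",
    xaxis=dict(
        rangeslider=dict(visible=True),  # small overview strip under the x-axis
        type="date",                     # treat x values as dates
    ),
)
print(layout_with_slider["xaxis"])
```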