In [55]:
import numpy as np
import math
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
import plotly
import plotly.graph_objs as go
import plotly.plotly as py  # in plotly >= 4 this module lives in the separate chart_studio package (chart_studio.plotly)
from plotly.offline import download_plotlyjs, init_notebook_mode, plot, iplot
import warnings
warnings.filterwarnings('ignore')

# Load Dataset

In [56]:
NYSEprices = pd.read_csv('prices.csv')

# View the first 5 rows of the dataset
NYSEprices.head()

Unnamed: 0,date,symbol,open,close,low,high,volume
0,2016-01-05 00:00:00,WLTW,123.43,125.839996,122.309998,126.25,2163600.0
1,2016-01-06 00:00:00,WLTW,125.239998,119.980003,119.940002,125.540001,2386400.0
2,2016-01-07 00:00:00,WLTW,116.379997,114.949997,114.93,119.739998,2489500.0
3,2016-01-08 00:00:00,WLTW,115.480003,116.620003,113.5,117.440002,2006300.0
4,2016-01-11 00:00:00,WLTW,117.010002,114.970001,114.089996,117.330002,1408600.0


In [57]:
# View the last 5 rows of the dataset
NYSEprices.tail()

Unnamed: 0,date,symbol,open,close,low,high,volume
851259,2016-12-30,ZBH,103.309998,103.199997,102.849998,103.93,973800.0
851260,2016-12-30,ZION,43.07,43.040001,42.689999,43.310001,1938100.0
851261,2016-12-30,ZTS,53.639999,53.529999,53.27,53.740002,1701200.0
851262,2016-12-30 00:00:00,AIV,44.73,45.450001,44.41,45.59,1380900.0
851263,2016-12-30 00:00:00,FTV,54.200001,53.630001,53.389999,54.48,705100.0


Based on the preview above and Munzner 2.6, the combination of "date" and "symbol" is unique for each item, so both columns should be labeled as keys. All other columns can serve as values, depending on the task. They are unsuitable as keys because they are quantitative: nothing prevents multiple items from having the same value.

Key columns: date, symbol

Value columns: open, close, low, high, volume
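The key/value split above can be checked programmatically. The following is a minimal sketch on a small made-up table that reuses the column names of `prices.csv`; the values and the names `prices`, `key_is_unique`, and `close_is_unique` are assumptions for illustration, not part of the original analysis.

```python
import pandas as pd

# Toy stand-in for prices.csv (hypothetical values, same column names).
prices = pd.DataFrame({
    "date": ["2016-01-05", "2016-01-05", "2016-01-06", "2016-01-06"],
    "symbol": ["WLTW", "ZBH", "WLTW", "ZBH"],
    "close": [125.84, 103.20, 119.98, 103.20],
})

# Parse the dates so that equal days compare equal regardless of string format.
prices["date"] = pd.to_datetime(prices["date"])

# A candidate key is valid only if no combination of its columns repeats.
key_is_unique = not prices.duplicated(subset=["date", "symbol"]).any()

# A quantitative value column may repeat, so it cannot identify an item.
close_is_unique = prices["close"].is_unique

print("(date, symbol) unique:", key_is_unique)
print("close unique:", close_is_unique)
```

Running the same two checks on the full `NYSEprices` frame would confirm or refute the uniqueness claim for the real data.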

# Visualization Task

My task is to **discover** the **trends** in the open and close prices of stock WLTW over time. Following the Visual Encoding lecture slides, the action of the task is **analyze->consume->discover**, and the target is **trends**. We therefore need visualizations that help viewers **discover** trends in the data.

# Visualization 1

In [58]:
# Take the subset for stock WLTW (copy it so the assignment below does not write to a slice)
WLTWdata = NYSEprices.loc[NYSEprices['symbol'] == 'WLTW'].copy()

# Parse the dates; every time component is 00:00:00, so only year-month-day matters
WLTWdata['date'] = pd.to_datetime(WLTWdata['date'])

# Build the lists used by the line chart
Date = WLTWdata['date'].tolist()
openPrice = WLTWdata['open'].tolist()
closePrice = WLTWdata['close'].tolist()

trace1 = go.Scatter(
    x = Date,
    y = openPrice,
    mode = 'lines+markers',
    name = 'open price'
)
trace2 = go.Scatter(
    x = Date,
    y = closePrice,
    mode = 'lines+markers',
    name = 'close price'
)
layout = go.Layout(
    title='Open and Close price of stock WLTW over time',
    dragmode='select',
    width=900,
    height=700,
    autosize=False,
    hovermode='closest',
    xaxis = dict(title = 'Date'),
    yaxis = dict(title = 'Price'))
data = [trace1, trace2]
fig1 = go.Figure(data=data, layout=layout)
py.iplot(fig1, filename='Line Chart-WLTW')

Visualization 1 uses a line chart to show the open and close prices of stock WLTW over time. As noted earlier, "date" and "symbol" are both keys, and one key is fixed by the task ("symbol = WLTW"). The other key, "date", is placed on the x-axis, and the values "open price" and "close price" are encoded with marks (points, lines) and channels (position, color). I chose a line chart because it is effective for showing how a continuous variable changes over time.

The plot shows an overall increasing trend in both the open and close prices of WLTW through 2016. The two price lines also overlap for much of the period.
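The two observations above (an overall upward trend and heavy overlap between the lines) can also be expressed as numbers. The sketch below is one possible way to do this, not the method used in this notebook: it fits a least-squares line to each price series and measures the mean absolute gap between open and close. The prices and the names `wltw`, `open_slope`, `close_slope`, and `mean_gap` are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical prices for illustration only; not taken from prices.csv.
wltw = pd.DataFrame({
    "date": pd.to_datetime(["2016-01-05", "2016-01-06", "2016-01-07",
                            "2016-01-08", "2016-01-11"]),
    "open": [100.0, 101.0, 103.0, 102.0, 105.0],
    "close": [101.0, 102.5, 102.0, 104.0, 106.0],
}).sort_values("date")

# Use days since the first observation as the x variable.
days = (wltw["date"] - wltw["date"].iloc[0]).dt.days.to_numpy()

# Slope of the least-squares line: a positive value indicates an upward trend.
open_slope = np.polyfit(days, wltw["open"].to_numpy(), 1)[0]
close_slope = np.polyfit(days, wltw["close"].to_numpy(), 1)[0]

# Mean absolute gap between the series: a small value means the lines overlap.
mean_gap = (wltw["open"] - wltw["close"]).abs().mean()

print(f"open slope per day:  {open_slope:.3f}")
print(f"close slope per day: {close_slope:.3f}")
print(f"mean |open - close|: {mean_gap:.3f}")
```

A single slope only summarizes the overall direction; shorter-term rises and falls still need the chart itself or a windowed statistic.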

The visualization keeps Plotly's default interactivity (hover, drag, zoom). Because the lines overlap, hovering over a point lets viewers read its (x, y) coordinates. Viewers can also drag and zoom to examine part of the data, which helps them discover price trends within specific time ranges.

# Visualization 2

In [60]:
# Create Parallel Bar Charts to discover the trends
from plotly import tools

trace1 = go.Bar(
    x=Date,
    y=openPrice,
    name='open price'
)
trace2 = go.Bar(
    x=Date,
    y=closePrice,
    name='close price'
)

data = [trace1, trace2]

fig2 = tools.make_subplots(rows=2, cols=1) # Create two subplots for comparison

fig2.append_trace(trace1, 1, 1)
fig2.append_trace(trace2, 2, 1)
fig2['layout'].update(
    title = 'Open and Close price of stock WLTW (grouped by weeks)',
    hovermode='closest'
)
fig2['layout']['xaxis1'].update(title='Date')
fig2['layout']['yaxis1'].update(title='Price')
fig2['layout']['xaxis2'].update(title='Date')
fig2['layout']['yaxis2'].update(title='Price')
py.iplot(fig2, filename='Parallel Bar Charts-WLTW')

This is the format of your plot grid:
[ (1,1) x1,y1 ]
[ (2,1) x2,y2 ]



Visualization 2 uses parallel bar charts to show the open and close prices of WLTW separately over time. Because the stock exchange is closed on weekends, the bars naturally fall into weekly groups (the business days of each week), with the gaps marking the weekends. This makes weekly price trends easy to spot, which is of practical value to some viewers. The parallel bar charts also allow a clear comparison between open and close prices and avoid the overlap seen in the previous plot.
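The weekly grouping described above comes from the calendar itself, which a short check can make explicit. This sketch assumes a handful of made-up trading dates (the names `dates`, `gaps`, and `days_per_week` are illustrative). It verifies that no row falls on a weekend, shows the three-day jump from Friday to Monday that appears as a gap on the date axis, and counts trading days per week.

```python
import pandas as pd

# Hypothetical trading dates; 2016-01-05 is a Tuesday, 2016-01-11 a Monday.
dates = pd.Series(pd.to_datetime([
    "2016-01-05", "2016-01-06", "2016-01-07", "2016-01-08",
    "2016-01-11", "2016-01-12",
]))

# dayofweek runs from Monday=0 to Sunday=6, so 5 and 6 are weekend days.
has_weekend_rows = (dates.dt.dayofweek >= 5).any()

# Calendar days between consecutive rows; a value of 3 spans a weekend.
gaps = dates.diff().dt.days.dropna()

# Label each date with the Monday that starts its week, then count rows.
week_start = dates.dt.to_period("W").dt.start_time
days_per_week = dates.groupby(week_start).size()

print("weekend rows present:", has_weekend_rows)
print("largest gap in days:", int(gaps.max()))
print(days_per_week)
```

Market holidays would produce additional gaps inside a week, so on real data a gap does not always mark a weekend.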

The plot shows that when the open price rises or falls overall within a week, the close price usually follows the same overall trend that week.
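The weekly agreement between open and close prices can be estimated rather than read off the bars. The sketch below is an assumption about how one might measure it: for each week it compares the first and last value of each series and reports the share of weeks in which both moved in the same direction. The data and the names `wltw`, `weekly`, and `agreement` are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical prices covering two weeks; not taken from prices.csv.
wltw = pd.DataFrame({
    "date": pd.to_datetime(["2016-01-05", "2016-01-06", "2016-01-07",
                            "2016-01-08", "2016-01-11", "2016-01-12",
                            "2016-01-13"]),
    "open": [100.0, 101.0, 103.0, 104.0, 104.0, 102.0, 101.0],
    "close": [101.0, 102.0, 104.0, 105.0, 103.0, 102.0, 100.0],
}).set_index("date")

# First and last value of each series within every calendar week.
weekly = wltw.groupby(wltw.index.to_period("W")).agg(["first", "last"])

# Sign of the change over the week: +1 rise, -1 fall, 0 flat.
open_dir = np.sign(weekly[("open", "last")] - weekly[("open", "first")])
close_dir = np.sign(weekly[("close", "last")] - weekly[("close", "first")])

# Share of weeks in which open and close moved in the same direction.
agreement = (open_dir == close_dir).mean()

print(f"weeks with matching direction: {agreement:.0%}")
```

Comparing only the first and last day is a coarse definition of a weekly trend; a per-week slope would be a stricter alternative.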

The visualization keeps Plotly's default interactivity (hover, zoom, etc.). Because a bar chart does not display exact price values, hovering over a bar shows its [date, price] information. Viewers can also zoom in to examine part of the data, which helps them discover or compare the weekly trends of the two prices.

