## Comparing Stock Movement

The dataset for this exercise contains temporal stock price data.   
This means we'll be looking at data over a range of time.   
> The dataset for this exercise can be downloaded from the link below and placed in the `data` folder:   
https://www.kaggle.com/dgawlik/nyse#prices.csv

### Loading our dataset

In [99]:
import pandas as pd
from bokeh.io import output_notebook
from typing import List

# make bokeh display figures inside the notebook
output_notebook()

In [100]:
# loading the dataset with pandas
dataset = pd.read_csv('./data/stock_prices.csv')

### Data Exploration

Dataset Description:
- date: the date and time at which the stock data was recorded
- symbol: ticker symbol of the stock
- open: price of the stock when the market opened
- close: price of the stock when the market closed
- high: the highest price of the stock in a given time period
- low: the lowest price of the stock in a given time period
- volume: the number of shares traded in a given time period

In [101]:
# looking at the dataset
dataset.head()

Unnamed: 0,date,symbol,open,close,low,high,volume
0,2016-01-05 00:00:00,WLTW,123.43,125.839996,122.309998,126.25,2163600.0
1,2016-01-06 00:00:00,WLTW,125.239998,119.980003,119.940002,125.540001,2386400.0
2,2016-01-07 00:00:00,WLTW,116.379997,114.949997,114.93,119.739998,2489500.0
3,2016-01-08 00:00:00,WLTW,115.480003,116.620003,113.5,117.440002,2006300.0
4,2016-01-11 00:00:00,WLTW,117.010002,114.970001,114.089996,117.330002,1408600.0


In [102]:
# Number of stocks in the dataset
print('There are {} stocks in the dataset'.format(len(dataset['symbol'].unique())))

There are 501 stocks in the dataset


The date column contains both date and time. I only need the date, so let's strip the time part and store the result in a new column, `short_date`.

In [103]:
# mapping the date of each row to only the year-month-day format
from datetime import datetime

def shorten_time_stamp(timestamp: str) -> str:
    """Take a timestamp string and return it in Y-M-D format"""
    # Only timestamps longer than 'YYYY-MM-DD' (10 characters) carry a time part
    if len(timestamp) > 10:
        parsed_date = datetime.strptime(timestamp, '%Y-%m-%d %H:%M:%S')
        timestamp = datetime.strftime(parsed_date, '%Y-%m-%d')

    return timestamp

dataset['short_date'] = dataset['date'].apply(shorten_time_stamp)
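
As an alternative, here is a minimal sketch that avoids the explicit length check, assuming every timestamp is either `YYYY-MM-DD` or `YYYY-MM-DD HH:MM:SS`. The helper name `to_short_date` is hypothetical and not part of the notebook.

```python
from datetime import datetime

def to_short_date(timestamp: str) -> str:
    """Return the YYYY-MM-DD part of an ISO-style timestamp string."""
    # datetime.fromisoformat accepts both 'YYYY-MM-DD' and 'YYYY-MM-DD HH:MM:SS'
    return datetime.fromisoformat(timestamp).date().isoformat()

print(to_short_date('2016-01-05 00:00:00'))
print(to_short_date('2016-01-05'))
```

It could be applied in the same way, for example `dataset['date'].apply(to_short_date)`.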


Here are the first five rows of the dataset. The short_date column no longer contains hours, minutes, or seconds.

In [104]:
dataset.head()

Unnamed: 0,date,symbol,open,close,low,high,volume,short_date
0,2016-01-05 00:00:00,WLTW,123.43,125.839996,122.309998,126.25,2163600.0,2016-01-05
1,2016-01-06 00:00:00,WLTW,125.239998,119.980003,119.940002,125.540001,2386400.0,2016-01-06
2,2016-01-07 00:00:00,WLTW,116.379997,114.949997,114.93,119.739998,2489500.0,2016-01-07
3,2016-01-08 00:00:00,WLTW,115.480003,116.620003,113.5,117.440002,2006300.0,2016-01-08
4,2016-01-11 00:00:00,WLTW,117.010002,114.970001,114.089996,117.330002,1408600.0,2016-01-11


---

### Building an interactive visualization

The goal is to create an interactive visualization that lets users compare the performance of two stocks within a specific period, as shown below. The plot has several components. First, two dropdown menus let users choose which stocks to compare. Second, a range slider lets users choose the start and end dates of the period to display. Lastly, two radio buttons let users choose which graph they want to see.

<img src="./candle_plot.png" width=500 align="center"/>

The plot itself shows the performance of the selected stocks over the given period. A red bar signals a decrease in the stock's value on a given date, while a green bar indicates an increase. For additional interactivity, clicking a legend entry mutes the display of that stock's performance.


I'll be using `widgets` and `interact` from `ipywidgets` to display interactive plots in the Jupyter notebook.

In [105]:
# importing the necessary dependencies 
from bokeh.plotting import figure, show
from ipywidgets import interact, widgets

In [86]:
# Creating a plot depicting the rise and fall of a stock over time
def add_candle_plot(plot, stock_name, stock_range, color):
    # if open < close, the stock increased; if open > close, it decreased
    inc_1 = stock_range.close > stock_range.open
    dec_1 = stock_range.open > stock_range.close
    # Define the width of the bar
    w = 0.5

    # Grey segment spanning the low and high price of each date
    plot.segment(stock_range['short_date'], stock_range['high'], 
                 stock_range['short_date'], stock_range['low'], 
                 color="grey")
    # If the stock increases, plot the bar representing the performance in green
    # The muted_alpha parameter sets the opacity used when the user mutes a stock by clicking its legend entry.
    # If muted_alpha = 0, muted bars are not shown at all.
    plot.vbar(stock_range['short_date'][inc_1], w, 
              stock_range['high'][inc_1], stock_range['close'][inc_1], 
              fill_color="green", line_color="black",
              legend_label=('Mean price of ' + stock_name), muted_alpha=0.2)
    # If the stock decreases, plot the bar representing the performance in red
    plot.vbar(stock_range['short_date'][dec_1], w, 
              stock_range['high'][dec_1], stock_range['close'][dec_1], 
              fill_color="red", line_color="black",
              legend_label=('Mean price of ' + stock_name), muted_alpha=0.2)
    # Plot a line representing the mean of the high and low prices for each date
    stock_mean_val=stock_range[['high', 'low']].mean(axis=1)
    plot.line(stock_range['short_date'], stock_mean_val, 
              legend_label=('Mean price of ' + stock_name), muted_alpha=0.2,
              line_color=color, alpha=0.5)
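
To make the masks used in `add_candle_plot` concrete, here is a small sketch on toy data (the values are made up, not taken from the dataset). It also shows that a day with `open == close` matches neither mask, so no bar would be drawn for it; the `flat` mask is an illustrative addition.

```python
import pandas as pd

# Toy data for illustration only
toy = pd.DataFrame({
    'short_date': ['2016-01-05', '2016-01-06', '2016-01-07'],
    'open':  [10.0, 12.0, 11.0],
    'close': [12.0, 11.0, 11.0],
    'high':  [13.0, 12.5, 11.5],
    'low':   [9.0, 10.5, 10.5],
})

inc = toy['close'] > toy['open']   # days the stock closed higher than it opened
dec = toy['open'] > toy['close']   # days the stock closed lower than it opened
flat = ~(inc | dec)                # open == close: matched by neither mask

# Row-wise mean of high and low, as used for the line in add_candle_plot
mid_price = toy[['high', 'low']].mean(axis=1)

print(toy.loc[inc, 'short_date'].tolist())
print(toy.loc[dec, 'short_date'].tolist())
print(toy.loc[flat, 'short_date'].tolist())
print(mid_price.tolist())
```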

In [106]:
# Building the whole plot
def get_plot(stock_1: str, stock_2: str, date: tuple, value: str) -> figure:
    """Return an interactive plot to compare stocks' performances
    Input
    ----
    stock_1, stock_2: str
        Names of the stocks
    date: tuple
        The (start, end) dates selected in the widgets.SelectionRangeSlider
    value: str
        Which plot to display, either 'open-close' or 'volume'
    Output
    -----
        Return figure with specified features"""
    # Get values of stock1 and stock2 in the dataset
    stock_1 = dataset[dataset['symbol'] == stock_1]
    stock_2 = dataset[dataset['symbol'] == stock_2]
    
    # Get the name of stock1 and stock2
    stock_1_name=stock_1['symbol'].unique()[0]
    stock_2_name=stock_2['symbol'].unique()[0]
    # Get the range of dates that will be displayed in the plot
    stock_1_range=stock_1[(stock_1['short_date'] >= date[0]) & (stock_1['short_date'] <= date[1])]
    stock_2_range=stock_2[(stock_2['short_date'] >= date[0]) & (stock_2['short_date'] <= date[1])]

    # Define the figure
    plot=figure(title='Stock prices', 
                     x_axis_label='Date', 
                     x_range=stock_1_range['short_date'], 
                     y_axis_label='Price in $USD',
                     plot_width=800, 
                     plot_height=500)
    
    plot.xaxis.major_label_orientation = 1
    plot.grid.grid_line_alpha=0.5
    
    
    # Radio button values:
    # If open-close, display stock's performance
    if value == 'open-close':
        add_candle_plot(plot, stock_1_name, stock_1_range, 'blue')
        add_candle_plot(plot, stock_2_name, stock_2_range, 'green')
    # If volume, display stock's volume    
    if value == 'volume':
        plot.line(stock_1_range['short_date'], stock_1_range['volume'], 
                  legend_label=stock_1_name, muted_alpha=0.2)
        plot.line(stock_2_range['short_date'], stock_2_range['volume'], 
                  legend_label=stock_2_name, muted_alpha=0.2,
                  line_color='yellow')
    
    # Interactive legends, if clicked will mute the specified stock
    plot.legend.click_policy="mute"
    
    return plot
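
The date-range filter in `get_plot` compares `short_date` strings directly. This works because `YYYY-MM-DD` strings sort in the same order as the dates they represent. A minimal sketch on toy rows (the prices are made up) compares the string filter with one based on parsed dates:

```python
from datetime import date

# Toy (short_date, close) rows for illustration only
rows = [
    ('2016-01-04', 10.0),
    ('2016-01-05', 10.5),
    ('2016-01-06', 10.2),
    ('2016-01-07', 10.8),
    ('2016-01-08', 11.0),
]

start, end = '2016-01-05', '2016-01-07'

# Plain string comparison, as in the range filter of get_plot
by_string = [d for d, _ in rows if start <= d <= end]

# The same filter after parsing the strings into date objects
by_date = [
    d for d, _ in rows
    if date.fromisoformat(start) <= date.fromisoformat(d) <= date.fromisoformat(end)
]

print(by_string)
print(by_string == by_date)
```

Both bounds are inclusive, so the start and end dates chosen by the user appear in the plot.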


We want to **start implementing our visualization here**.   

In the following cells, we will extract the necessary data which will be provided to the widget elements.   
In the first cell we want to extract the following information:
- a list of unique stock names that are present in the dataset
- a list of all short_dates that are in 2016
- a sorted list of unique dates generated from the previous list of dates from 2016
- a list with the values `open-close` and `volume`

Once we have this information in place, we can start building our widgets.

In [98]:
# extracting the necessary data
stock_names=dataset['symbol'].unique()
# We'll be using only data from 2016 on
dates_2016=dataset[dataset['short_date'] >= '2016-01-01']['short_date']
unique_dates_2016=sorted(dates_2016.unique())
value_options=['open-close', 'volume']
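
Before handing these lists to the widgets, it can help to check that the intended default selections are present, since a `Dropdown` rejects a `value` that is not among its `options`. The sketch below uses a hypothetical subset of symbols rather than the real `stock_names` array, and the names `defaults` and `missing` are illustrative.

```python
# Hypothetical subset of symbols standing in for stock_names
stock_names = ['WLTW', 'A', 'AAL', 'AAP', 'AAPL', 'AON']

# Defaults that the dropdowns are expected to start with
defaults = ('AAPL', 'AON')

# Collect any default that is not available as an option
missing = [symbol for symbol in defaults if symbol not in stock_names]

if missing:
    print('Missing default symbols:', missing)
else:
    print('All default symbols are available')
```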


As mentioned above, the plot will have several interactive features, including:
- two `Dropdown`s with which users can select the two stocks to compare
    - the first dropdown is labeled "Compare:" and defaults to the `AAPL` stock
    - the second dropdown is labeled "to:" and defaults to the `AON` stock
    
    
- a `SelectionRangeSlider` which will allow us to select a range of dates from the extracted list of unique 2016 dates
    - it is labeled "From-To" and by default selects the dates at index 0 through 25
    - make sure to disable the `continuous_update` parameter here
    - adjust the layout width to 500px to make sure the dates are displayed correctly
    
    
- a `RadioButtons` group that provides the options "open-close" and "volume"
    - it is labeled "Metric" and defaults to "open-close"

In [109]:
# setting up the interaction elements
# Define dropdown menus to choose which stock to display
drp_1=widgets.Dropdown(options=stock_names,
                       value='AAPL',
                       description='Compare:')

drp_2=widgets.Dropdown(options=stock_names,
                       value='AON',
                       description='to:')

# Slider to get the range of dates to display the performance of the chosen stocks
range_slider=widgets.SelectionRangeSlider(options=unique_dates_2016, 
                                          index=(0,25), 
                                          continuous_update=False,
                                          description='From-To',
                                          layout={'width': '500px'})

# Displaying the volume or the open-close values of the stock
value_radio=widgets.RadioButtons(options=value_options,
                                 value='open-close',
                                 description='Metric')
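
The `index=(0,25)` argument of the range slider refers to positions in the `options` list. A rough sketch of how that maps to the selected endpoints, assuming both bounds are inclusive and using a hypothetical list of consecutive days in place of `unique_dates_2016`:

```python
# Hypothetical stand-in for unique_dates_2016: every day of January 2016
options = [f'2016-01-{day:02d}' for day in range(1, 32)]

index = (0, 25)

# The slider's value is the pair of options at the two index positions
start, end = options[index[0]], options[index[1]]

# Number of options covered when both ends are included
n_selected = index[1] - index[0] + 1

print(start, end, n_selected)
```

Under that assumption, the default selection spans 26 entries of the list, not 25.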

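The `interact` decorator builds the controls from the widgets passed as keyword arguments. Each keyword must match the name of a parameter of the decorated function, which receives that widget's current value whenever the selection changes.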

In [110]:
# creating the interact method 
@interact(stock_1=drp_1, stock_2=drp_2, date=range_slider, value=value_radio)
def get_stock_for_2016(stock_1, stock_2, date, value):
    show(get_plot(stock_1, stock_2, date, value))

interactive(children=(Dropdown(description='Compare:', index=4, options=('WLTW', 'A', 'AAL', 'AAP', 'AAPL', 'A…