# Exploring volatility & streaming price action

### By Saeed Amen (@thalesians) - Managing Director & Co-founder of the Thalesians

There are many methods to measure risk. Volatility is one of these measures. In this study, we seek to understand how volatility behaves! We shall examine how sampling frequency impacts volatility measurements for EUR/USD using intraday data. Does measuring volatility using 5 minute data change the result compared to using 60 minute data? Also how do realised and implied volatility differ historically for S&P500? Lastly, we shall be demonstrating how to do a streaming data plot of EUR/USD.

We shall be using Python, together with Plotly for plotting. Plotly is a free web-based platform for making graphs. You can keep graphs private, make them public, and run Plotly on your own servers with [Plotly Enterprise](https://plot.ly/product/enterprise/). You can find more details [here](https://plot.ly/python/getting-started/).

We shall be using market data from Bloomberg (using Brian Smith's TIA wrapper) and Yahoo. We have also written the code to allow importation of market data from CSV files (for example if you'd like to use intraday data from an FX broker). For more information on how to access Bloomberg via Python, please take a look at [this](https://plot.ly/ipython-notebooks/ukelectionbbg/), where I discussed how to use Bloomberg data to do an event study for price action around UK general elections. I have reused some of the code from that earlier project and also added a few extra features to some of that code. I have written this code for Python 2.7, although much of it should work without too much modification if you are running Python 3.4+.

### Python scripts

* bbg_com - low level interaction with the BBG COM object, taken from Brian Smith's TIA project with minor adaptations
* datadownloader - wrapper for BBG COM, Quandl, Yahoo and CSV access to data
* plothelper - reusable functions for interacting with Plotly
* volstudy - computations to analyse realised volatility
* streamingfxstudy - doing live plotting of market prices

### Downloading the market data

We create the DataDownloader class, which acts as a wrapper for various market data sources. The idea is to decouple the precise vendor implementations from our higher level code. For live quotes we use "download_live_quote", which currently supports Yahoo.

In [1]:
# for time series manipulation
import pandas
import pandas.io.data as web

import datetime

class DataDownloader:
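    def download_live_quote(self, vendor_ticker, source):
        if source != 'Yahoo':
            # only Yahoo is implemented for live quotes
            raise ValueError('Unsupported live quote source: ' + str(source))

        from urllib2 import urlopen
        response = urlopen('http://finance.yahoo.com/d/quotes.csv?s=' + vendor_ticker + '&f=sl1d1t1c1ohgv&e=.csv')
        html = response.read()

        # the response is a single CSV line: symbol, last price, date, time, ...
        split = html.split(",")

        # second field (l1) is the last traded price; return it as a number
        return float(split[1])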
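
The parsing step of "download_live_quote" can be tried out without a network connection. The sketch below is an illustration rather than part of the original scripts: `parse_quote_line` and `FIELDS` are hypothetical names, the sample line is made up, and the field order is an assumption based on the `f=sl1d1t1c1ohgv` format string in the URL above. The Yahoo endpoint itself may not respond any more, which is another reason to test the parsing separately.

```python
import csv

# Assumed meaning of the fields requested by f=sl1d1t1c1ohgv
FIELDS = ['symbol', 'last', 'date', 'time', 'change', 'open', 'high', 'low', 'volume']

def parse_quote_line(raw):
    """Return the last traded price from one CSV quote line."""
    # under Python 3 a urlopen response is bytes, so decode it first
    if isinstance(raw, bytes):
        raw = raw.decode('utf-8')

    # csv.reader copes with quoted fields such as "EURUSD=X"
    row = next(csv.reader([raw.strip()]))

    if len(row) < 2:
        raise ValueError('Unexpected quote line: %r' % raw)

    # float() raises ValueError for placeholders such as N/A
    return float(row[1])

# made-up sample line, for illustration only
sample = '"EURUSD=X",1.1234,"6/12/2015","5:30pm",-0.0012,1.1246,1.1290,1.1201,0\n'
price = parse_quote_line(sample)
```

Returning a float (rather than the raw string) means the value can be plotted or used in calculations directly, and a malformed response fails loudly instead of silently producing a bad data point.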

Next, we write code to download historic market data. For daily historic data we use "download_time_series" which has support for Bloomberg, Quandl, Yahoo and CSV data sources. For intraday historic data, Bloomberg and CSV data sources are implemented. Obviously, if your CSV files have different date formats, you can edit the date formatter code below.

In [2]:
    def download_time_series(self, vendor_ticker, pretty_ticker, start_date, source, csv_file = None,
                             freq = 'daily', freq_no = 1):
        if not(isinstance(start_date, list)):
            start_date = [start_date]

        if freq == 'daily':
            if source == 'Quandl':
                import Quandl
                # Quandl requires API key for large number of daily downloads
                # https://www.quandl.com/help/api
                spot = Quandl.get(vendor_ticker)
                spot = pandas.DataFrame(data = spot['Value'], index = spot.index)
                spot.columns = [pretty_ticker]
            elif source == 'Yahoo':
                finish_date = datetime.datetime.utcnow()
                finish_date = datetime.datetime(finish_date.year, finish_date.month, finish_date.day, 0, 0, 0)

                spot = web.DataReader(vendor_ticker, 'yahoo', start_date[0], finish_date)
                spot = pandas.DataFrame(data = spot['Close'].values, index = spot.index, columns = [pretty_ticker])

                spot.index = pandas.DatetimeIndex(spot.index)

            elif source == 'Bloomberg':
                from egthalesians.plotly.helper.bbg_com import HistoricalDataRequest
                req = HistoricalDataRequest([vendor_ticker], ['PX_LAST'], start = start_date[0])
                req.execute()

                spot = req.response_as_single()
                spot.columns = [pretty_ticker]
            elif source == 'CSV':
                dateparse = lambda x: pandas.datetime.strptime(x, '%Y-%m-%d')

                # in case you want to use a source other than Bloomberg/Quandl
                spot = pandas.read_csv(csv_file, index_col=0, parse_dates=0, date_parser=dateparse)

        elif freq == 'intraday':
            if source == 'Bloomberg':
                from bbg_com import IntrdayBarRequest
                req = IntrdayBarRequest(vendor_ticker, freq_no, start = start_date[0])

                req.execute()

                spot = req.response
                spot.columns = [pretty_ticker + '.' + x for x in spot.columns]

                spot = pandas.DataFrame(data = spot[pretty_ticker + ".close"].values,
                                        index = spot.index, columns = [pretty_ticker + ".close"])

            elif source == 'CSV':
                dateparse = lambda x: pandas.datetime.strptime(x, '%Y-%m-%d %H:%M:%S')

                # in case you want to use a source other than Bloomberg/Quandl etc
                try:
                    spot = pandas.read_csv(csv_file, index_col = 0, parse_dates = 0, date_parser = dateparse)
                except:
                    dateparse = lambda x: pandas.datetime.strptime(x, '%d/%m/%Y %H:%M:%S')
                    spot = pandas.read_csv(csv_file, index_col = 0, parse_dates = 0, date_parser = dateparse)

        return spot
    
    DataDownloader.download_time_series = download_time_series
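
The fallback between date formats in the intraday CSV branch can be sketched using only the standard library. This is a simplified illustration, not the author's loader: `parse_timestamp`, `read_price_csv` and `DATE_FORMATS` are hypothetical names, the sample rows are made up, and the file is assumed to have a header row followed by a timestamp column and a price column. Unlike the code above, which retries the whole file with a second format, this version tries each format per row.

```python
import csv
import datetime
import io

# the two timestamp layouts handled by the intraday CSV branch above
DATE_FORMATS = ('%Y-%m-%d %H:%M:%S', '%d/%m/%Y %H:%M:%S')

def parse_timestamp(text, formats=DATE_FORMATS):
    """Try each known format in turn and return the first successful parse."""
    for fmt in formats:
        try:
            return datetime.datetime.strptime(text, fmt)
        except ValueError:
            continue
    raise ValueError('Unrecognised timestamp: %r' % text)

def read_price_csv(handle):
    """Read (timestamp, price) pairs from an open CSV file with a header row."""
    reader = csv.reader(handle)
    next(reader)  # skip the header row
    return [(parse_timestamp(row[0]), float(row[1])) for row in reader if row]

# made-up sample data mixing both date formats
sample_csv = io.StringIO(
    u"Date,EURUSD.close\n"
    u"2015-06-12 09:00:00,1.1234\n"
    u"12/06/2015 09:01:00,1.1236\n"
)
rows = read_price_csv(sample_csv)
```

If your broker uses another layout, adding it to `DATE_FORMATS` is enough. Be aware that day/month ordering cannot be detected automatically for dates such as 03/04/2015, so the order of the formats matters.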

Next we create the PlotHelper class, which deals with all the interaction with Plotly, assuming our input data is in Pandas dataframes. Alternatively, we could have used Jorge Santos' very convenient [Cufflinks library](https://github.com/santosjorge/cufflinks) which makes it easy to plot Pandas dataframes in Plotly. As a first step, we create a simple function to parse dates, which will be useful later on when it comes to defining start dates for our historical data downloads.

In [3]:
# for dates
import datetime

# for plotting data
import plotly
from plotly.graph_objs import *

class PlotHelper:
    def parse_dates(self, str_dates):
        # parse_dates - parses string dates into Python format
        #
        # str_dates = dates to be parsed in the format of day/month/year
        #

        dates = []

        for d in str_dates:
            dates.append(datetime.datetime.strptime(d, '%d/%m/%Y'))

        return dates

Our next function converts Pandas dataframes into Traces, which can be plotted by Plotly. We also have several parameters to control the way the plot will look, such as being able to specify whether or not to display a legend, the colors of the lines, the width of lines, whether to add markers etc. We utilise the ColorLover library to create graduated palettes. This is by no means an exhaustive list of the properties we can set on Plotly, but should make our plots a bit more exciting.

In [4]:
    def convert_df_plotly(self, dataframe, axis_no = 1, color_def = ['default'],
                          special_line = 'Mean', showlegend = True, addmarker = False, gradcolor = None):
        # convert_df_plotly - converts a Pandas data frame to Plotly format for line plots
        # dataframe = data frame due to be converted
        # axis_no = axis for plot to be drawn (default = 1)
        # special_line = make lines named this extra thick
        # color_def = color scheme to be used (default = ['default']), colour will alternate in the list
        # showlegend = True or False to show legend of this line on plot
        # addmarker = True or False to add markers
        # gradcolor = Create a graduated color scheme for the lines
        #
        # Also see http://nbviewer.ipython.org/gist/nipunreddevil/7734529 for converting dataframe to traces
        # Also see http://moderndata.plot.ly/color-scales-in-ipython-notebook/

        x = dataframe.index

        traces = []

        # will be used for marker opacity for the markers
        increments = 0.95 / float(len(dataframe.columns))

        if gradcolor is not None:
            try:
                import colorlover as cl
                color_def = cl.scales[str(len(dataframe.columns))]['seq'][gradcolor]
            except:
                print('Check colorlover installation...')

        i = 0

        for key in dataframe:
            scatter = plotly.graph_objs.Scatter(
                        x = x,
                        y = dataframe[key].values,
                        name = key,
                        xaxis = 'x' + str(axis_no),
                        yaxis = 'y' + str(axis_no),
                        showlegend = showlegend)

            # only apply color/marker properties if not "default"
            if color_def[i % len(color_def)] != "default":
                if special_line in str(key):
                    # special case for lines labelled "mean"
                    # make line thicker
                    scatter['mode'] = 'lines'
                    scatter['line'] = plotly.graph_objs.Line(
                                color = color_def[i % len(color_def)],
                                width = 2
                            )

                else:
                    line_width = 1

                    # set properties for the markers which change opacity
                    # for markers make lines thinner
                    if addmarker:
                        opacity = 0.05 + (increments * i)
                        scatter['mode'] = 'markers+lines'
                        scatter['marker'] = plotly.graph_objs.Marker(
                                    color=color_def[i % len(color_def)],  # marker color
                                    opacity = opacity,
                                    size = 5)
                        line_width = 0.2

                    else:
                        scatter['mode'] = 'lines'

                    scatter['line'] = plotly.graph_objs.Line(
                            color = color_def[i % len(color_def)],
                            width = line_width)

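            # advance the color index for every column, including "default" ones,
            # so that a mixed color_def list keeps cycling
            i = i + 1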

            traces.append(scatter)

        return traces
    
    PlotHelper.convert_df_plotly = convert_df_plotly
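
The color cycling and the opacity ramp used for the markers can be seen more easily in a stripped-down version that builds plain dictionaries, which Plotly also accepts as traces. This is a sketch for illustration only: `build_traces` is a hypothetical helper, it takes a list of (name, values) pairs instead of a DataFrame, and it leaves out the special handling of the "Mean" line.

```python
def build_traces(named_series, x, colors, add_marker=False):
    """Build one scatter trace (as a dict) per (name, values) pair."""
    # opacity rises in equal steps from 0.05 for the first line towards 1.0
    increments = 0.95 / float(len(named_series))
    traces = []

    for i, (name, y) in enumerate(named_series):
        # cycle through the color list if there are more lines than colors
        color = colors[i % len(colors)]

        trace = {
            'type': 'scatter',
            'x': list(x),
            'y': list(y),
            'name': name,
            'mode': 'lines',
            'line': {'color': color, 'width': 1},
        }

        if add_marker:
            # with markers, thin the line and fade earlier series
            trace['mode'] = 'markers+lines'
            trace['marker'] = {'color': color,
                               'opacity': 0.05 + increments * i,
                               'size': 5}
            trace['line']['width'] = 0.2

        traces.append(trace)

    return traces

# made-up values, for illustration only
traces = build_traces([('1min', [9.1, 9.4]), ('5min', [8.7, 8.9])],
                      x=['2015-06-12 09:00', '2015-06-12 10:00'],
                      colors=['rgb(8,81,156)'],
                      add_marker=True)
```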

The "create_layout" function controls the overall layout of our plot and properties such as axes labels and the size of the overall plot.

In [5]:
    def create_layout(self, title, xaxis, yaxis, width = -1, height = -1):
        # create_layout - populates a layout object
        # title = title of the plot
        # xaxis = xaxis label
        # yaxis = yaxis label
        # width (optional) = width of plot
        # height (optional) = height of plot
        #

        layout = Layout(
            title = title,
            xaxis = plotly.graph_objs.XAxis(
                title = xaxis,
                showgrid = False
            ),
            yaxis = plotly.graph_objs.YAxis(
                title = yaxis,
                showline = False
            )
        )

        if width > 0 and height > 0:
            layout['width'] = width
            layout['height'] = height

        return layout
    
    PlotHelper.create_layout = create_layout

### Volatility study

#### How does changing the sampling frequency affect realised volatility?

We have now written the functions for getting market data and also some helper functions for generating Plotly charts. Here, we shall do our computations to generate the actual data to plot. We first need to set our Plotly username and API key.

In [6]:
# for time series/maths
import pandas
import math
import datetime
from datetime import timedelta

# for plotting data
import plotly
import plotly.plotly as py
from plotly.graph_objs import *

postfix = "prod" # what to put at the end of the Plotly URL

def vol_study():
    # Learn about API authentication here: https://plot.ly/python/getting-started
    # Find your api_key here: https://plot.ly/settings/api
    plotly_username = "thalesians"
    plotly_api_key = "8f18dbgilh"

    plotly.tools.set_credentials_file(username = plotly_username, api_key = plotly_api_key)

The next step is to download our market data, using the DataDownloader class. We can either use Bloomberg or a CSV file (simply comment out the data source you are not using). Our aim is to download a few months of EUR/USD 1 minute data.

In [7]:
    ticker = 'EURUSD' # will use in plot titles later (and for creating Plotly URL)

    ##### Download intraday EUR/USD data from Bloomberg or CSV file
    source = "Bloomberg"
    source = "CSV"

    csv_file = None

    plot_helper = PlotHelper()

    data_downloader = DataDownloader()
    start_date = datetime.datetime.utcnow() - timedelta(days = 120)
    freq = 'intraday'

    if source == 'Bloomberg':
        vendor_ticker = 'EURUSD BGN Curncy'
    elif source == 'CSV':
        vendor_ticker = 'EURUSD'
        csv_file = 'EURUSD.csv'

    spot = data_downloader.download_time_series(vendor_ticker, ticker, start_date, source, csv_file = csv_file, freq = freq)

With the data downloaded, we can do our computation. We first downsample the data to the required frequency (1 min, ..., 60 mins). Once this is done, we calculate spot returns. The rolling realised volatility is then calculated over a 1 day window (i.e. over the past 24 hours). We annualise the volatility we calculate, as is general market practice. If we are annualising realised volatility calculated from 1 minute returns, we need to multiply by the square root of 252 * 1440 (given there are 252 business days in a year and 1440 minutes in every day). For returns sampled every N minutes, the factor becomes the square root of 252 * 1440 / N. This is not necessarily the most "accurate" way to annualise volatility, but this approach tends to be most common in the market.

In [8]:
    #### Calculate 1 day realised vol on EUR/USD data from Bloomberg using different data frequency (1 min, ..., 60 min)
    minute_freq = [1, 5, 10, 30, 60]

    realised_vol = None

    for minutes in minute_freq:
        # keep only the bars that fall on a multiple of the sampling interval
        spot_min = spot.loc[spot.index.minute % minutes == 0]
        rets = spot_min / spot_min.shift(1) - 1

        # number of bars in a 24 hour window (the rolling window must be an integer)
        window = 1440 // minutes
        realised_vol_min = pandas.rolling_std(rets, window) * math.sqrt(252.0 * window) * 100
        realised_vol_min.columns = [str(minutes) + 'min']

        if realised_vol is None:
            realised_vol = realised_vol_min
        else:
            realised_vol = realised_vol.join(realised_vol_min, how = 'outer')
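
For readers on newer pandas releases, where `pandas.rolling_std` is no longer available, the same calculation can be written with the `.rolling(window).std()` method. The sketch below is an assumption-laden illustration, not the original study: `annualisation_factor` and `realised_vol_for` are hypothetical names, and the prices are a simulated random walk rather than EUR/USD market data.

```python
import math

import numpy as np
import pandas as pd

def annualisation_factor(minutes_per_bar, business_days=252, minutes_per_day=1440):
    """Square root of the number of bars per year, e.g. sqrt(252 * 1440) for 1 minute bars."""
    return math.sqrt(business_days * minutes_per_day / float(minutes_per_bar))

def realised_vol_for(prices, minutes_per_bar):
    """Annualised realised vol (in %) over a rolling 24 hour window."""
    # keep only bars that fall on a multiple of the sampling interval
    sampled = prices[prices.index.minute % minutes_per_bar == 0]
    rets = sampled / sampled.shift(1) - 1

    # number of bars in 24 hours
    window = 1440 // minutes_per_bar
    return rets.rolling(window).std() * annualisation_factor(minutes_per_bar) * 100

# simulated 1 minute prices over three days (NOT market data)
rng = np.random.RandomState(0)
index = pd.date_range('2015-06-01', periods=3 * 1440, freq='min')
prices = pd.Series(1.10 * np.exp(np.cumsum(rng.normal(0, 1e-4, len(index)))), index=index)

vol_5min = realised_vol_for(prices, 5)
```

The first 24 hours of each output series are empty (NaN), because the rolling window needs a full day of returns before it can produce a value.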

With the data now computed, the final step is to plot with Plotly. We first reduce the number of points to plot to one every hour, to make it quicker to plot. We then set the title and create a Figure object. We use the PlotHelper class to convert the DataFrame into Traces which can be plotted using Plotly.

In [9]:
    # reduce the number of points to plot
    realised_vol = realised_vol.loc[realised_vol.index.minute % 60 == 0]

    xaxis = 'Date'
    yaxis = 'Daily Realised Vol'
    source_label = "Source: @thalesians/BBG"

    # Using varying shades of blue for each line (helped by colorlover library)

    title = ticker + ' Realised Vol 1D Window' + '<BR>' + source_label
    realised_vol.index = pandas.to_datetime(realised_vol.index.values)

    # also apply graduated color scheme of blues (from light to dark)
    # see http://moderndata.plot.ly/color-scales-in-ipython-notebook/ for details on colorlover package
    # which allows you to set scales
    fig = Figure(data = plot_helper.convert_df_plotly(realised_vol, gradcolor = 'Blues', addmarker=False),
                 layout = plot_helper.create_layout(title, xaxis, yaxis),
    )

Finally, we set the filename and display the plot by calling the iplot function. We find that generally speaking the higher the frequency of data, the higher the volatility we calculate. At frequencies such as 1 minute, we also face problems related to what is known as the bid/ask bounce.

In [10]:
    filename = 'realised-vol-freq-' + str(ticker) + str(postfix)
    py.iplot(fig, filename = filename)
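
The bid/ask bounce effect mentioned above can be illustrated with simulated data. The sketch below is not the author's analysis: it builds a hypothetical random walk for the mid price, then records each observation at either the bid or the ask at random. The per-minute volatility (`sigma`) and the `half_spread` are assumed values chosen purely for illustration, and `annualised_vol` is a hypothetical helper.

```python
import math
import random
import statistics

def annualised_vol(prices, step, minutes_per_day=1440, business_days=252):
    """Annualised vol (in %) of 1 minute prices sampled every `step` minutes."""
    sampled = prices[::step]
    rets = [b / a - 1 for a, b in zip(sampled, sampled[1:])]
    bars_per_year = business_days * minutes_per_day / float(step)
    return statistics.stdev(rets) * math.sqrt(bars_per_year) * 100

random.seed(42)

n_minutes = 1440 * 20   # twenty days of 1 minute observations
sigma = 1e-4            # assumed per-minute vol of the mid price
half_spread = 1e-4      # assumed half bid/ask spread, as a fraction of price

# "true" mid price follows a random walk
mid = [1.10]
for _ in range(n_minutes - 1):
    mid.append(mid[-1] * math.exp(random.gauss(0, sigma)))

# each observed print lands on the bid or the ask at random
observed = [m * (1 + random.choice([-1, 1]) * half_spread) for m in mid]

for step in (1, 5, 10, 30, 60):
    print(step, round(annualised_vol(mid, step), 2), round(annualised_vol(observed, step), 2))
```

In this simulation the bounce adds a fixed amount of noise to every return regardless of the sampling interval, so it inflates the 1 minute estimate much more than the 60 minute one, whilst the estimates from the mid price stay roughly the same across frequencies.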

#### Comparing realised and implied volatility

We now do a different volatility study, which involves looking at S&P500 and VIX. We calculate the realised volatility of S&P500 comparing it to VIX, which is a measure of implied volatility on S&P500 (VIX takes different parts of the implied vol curve for options written on S&P500). We can think of implied volatility as the market's expectation for future realised volatility. We need to load data to begin with, to start our analysis. In this instance, we shall use Yahoo to download data, again using our DataDownloader class. We look at the past year of data.

In [11]:
    #### Calculate Realised Vol on S&P500 data from Yahoo and compare with VIX index
    source = 'Yahoo'

    start_date = datetime.datetime.utcnow() - timedelta(days = 365)
    spx = data_downloader.download_time_series('^GSPC', 'S&P500', start_date, source)
    vix = data_downloader.download_time_series('^VIX', 'VIX', start_date, source)

Now that the data is loaded into Pandas dataframes, we calculate the 1M rolling realised volatility on S&P500. We shift it back 20 working days (approximately a month), so that the implied volatility (VIX) and realised volatility are aligned over the same period of time. Strictly speaking, we should take into account a holiday calendar to do this more accurately.

In [12]:
    # calculate realised vol on S&P500 (and shift it to be aligned to VIX - implied vol)
    spx_realised_vol = pandas.rolling_std(spx / spx.shift(1) - 1, 20) * math.sqrt(252) * 100
    spx_realised_vol = spx_realised_vol.shift(-20)

    vol = spx_realised_vol.join(vix, how = 'outer')
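
The effect of the 20 day shift is easier to see on a small example. The sketch below uses simulated prices (not S&P500 data) and the `.rolling(window).std()` method found in newer pandas releases; the variable names are hypothetical. After the shift, the value stored against a given date is the realised volatility of the *following* 20 returns, which is the period that an implied volatility quoted on that date refers to.

```python
import math

import numpy as np
import pandas as pd

# simulated daily closes on business days (NOT market data)
rng = np.random.RandomState(0)
dates = pd.bdate_range('2015-01-01', periods=60)
prices = pd.Series(2000 * np.exp(np.cumsum(rng.normal(0, 0.01, len(dates)))), index=dates)

rets = prices / prices.shift(1) - 1

# trailing vol: value at date t uses the 20 returns up to and including t
trailing = rets.rolling(20).std() * math.sqrt(252) * 100

# forward-looking vol: value at date t uses the 20 returns after t
forward = trailing.shift(-20)

# check against a direct calculation for the first date
direct = rets.iloc[1:21].std() * math.sqrt(252) * 100
```

The last 20 dates of `forward` are empty, because the realised volatility for those dates is not yet known.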

We now plot the two lines using Plotly! We see that generally speaking VIX is higher than S&P500 realised volatility. This difference is known as the volatility risk premium. Implied volatility is the market's expectation of future realised volatility, which is unknown at the time options are priced. The risk premium exists because of this uncertainty: it compensates sellers of this "insurance". When people sell options, they are trying to harvest this premium. Whilst this might generally be a profitable strategy, it can have quite sizable drawdowns during market crises, when realised volatility often ends up being much higher than implied volatility.

In [13]:
    source_label = "Source: @thalesians/Yahoo"
    title = "Comparing S&P500 1M realised vol with VIX" + '<BR>' + source_label
    xaxis = 'Date'
    yaxis = 'Vol'

    fig = Figure(data = plot_helper.convert_df_plotly(vol, addmarker = True),
                 layout = plot_helper.create_layout(title, xaxis, yaxis),
    )

    py.iplot(fig, filename = 'sp500-vix-comparison-' + str(postfix))

### Now for something completely different (kind of!) - streaming FX charts with Plotly

Here we take a break from volatility and instead focus on something totally different, plotting live data! We shall use Yahoo as our data source for live EUR/USD spot data. We have used FX markets because, well, I like FX (I worked in FX markets for a decade!) and also because the market is open for a large part of the week (from Sunday evening to Friday evening). Obviously, if you run this code during the weekend, when there are no live prices, it won't be that exciting! We shall be using the live quote function from the DataDownloader class which we wrote earlier.


Below, we show an animated GIF of the chart in action, which we have prerun to illustrate what the output should look like. We've cropped it to focus on the moving chart.

In [14]:
from IPython.display import Image
Image(url='http://imgur.com/cf9oM8H.gif', width=700)

As a first step, we set our Plotly credentials, including our stream ID. You can get your API key and stream ID from the Plotly website (URLs are listed below in the code).

In [15]:
# for dates
import datetime
import time

# for plotting data
import plotly
import plotly.plotly as py
from plotly.graph_objs import *

postfix = "prod"

def streaming_fx_study():

    # Learn about API authentication here: https://plot.ly/python/getting-started
    # Find your api_key here: https://plot.ly/settings/api
    plotly_username = "thalesians"
    plotly_api_key = "8f18dbgilh"
    
    plotly.tools.set_credentials_file(username=plotly_username, api_key=plotly_api_key)

The next step is to create the various elements of our plot. Our code follows a similar template to the one [here](https://plot.ly/python/streaming-line-tutorial/). We assume that our streaming plot will run for 120 seconds / 120 updates (you can obviously change this, although be aware that having too many points in a plot will make it slow to update). We then create our stream (we need to pass our stream ID for this). Once that is done we create a Scatter object, which at present will have no data in it, and then the Layout and finally the Figure object.

In [16]:
    # Learn about stream id here at: http://help.plot.ly/documentation/python/streaming-tutorial/
    # Find your stream_id here: https://plot.ly/settings/api
    stream_id = "murp1zhvit"
    
    # Code below based on https://plot.ly/python/streaming-line-tutorial/
    max_points = 120

    data_downloader = DataDownloader()

    ticker = 'EURUSD'; vendor_ticker = 'EURUSD=X'

    # Make instance of stream id object
    stream = plotly.graph_objs.Stream(
            token = stream_id,            # (!) link stream id to 'token' key
            maxpoints = max_points        # (!) keep a max of max_points (120) pts on screen
    )

    trace1 = plotly.graph_objs.Scatter(
        x = [],
        y = [],
        mode = 'lines',
        stream = stream            # (!) embed stream id, 1 per trace
    )

We can now send our Figure object to Plotly.

In [17]:
    data = Data([trace1])
    source_label = "Source: @thalesians/Yahoo"
    title = ticker + '<BR>' + source_label
    
    # Add title to layout object
    layout = Layout(title = title)

    # Make a figure object
    fig = Figure(data = data, layout = layout)

    # Send fig to Plotly, initialize streaming plot, open new tab
    py.iplot(fig, filename=ticker + "-stream" + postfix)

We then open up a stream to Plotly. Every second we read in a live quote from Yahoo and then push it to the stream object. If we run this outside of FX market hours, it won't be that exciting! Also, obviously, this graph will only update whilst we are pushing data to it. Whilst we have used Yahoo, you can of course use any other data source for which you have the appropriate licence.

In [18]:
    # Make instance of the Stream link object, with same stream id as Stream id object
    s = py.Stream(stream_id)

    # Open the stream
    s.open()

    #### Grab live FX prices from Yahoo & plot point by point
    #### till termination
    passes = 0
    
    while passes < max_points:
        x = datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S.%f')

        try:
            y = data_downloader.download_live_quote(vendor_ticker, 'Yahoo')
            s.write(dict(x = x, y = y))
        except: pass

        time.sleep(1)   # sleep for 1 second
        passes = passes + 1
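
The polling loop can be separated from Plotly and Yahoo so that it can be tested offline. The sketch below is a hypothetical refactoring, not the original script: `stream_quotes` is an invented name, `get_quote` stands in for a call such as `download_live_quote`, and `write_point` stands in for the stream's `write` method. It also narrows the bare `except` so that only failed downloads and unparseable quotes are skipped.

```python
import datetime
import time

def stream_quotes(get_quote, write_point, max_points, sleep=time.sleep,
                  now=datetime.datetime.now):
    """Poll get_quote() once per second and push each valid quote to write_point().

    Returns the number of points actually written.
    """
    sent = 0

    for _ in range(max_points):
        stamp = now().strftime('%Y-%m-%d %H:%M:%S.%f')

        try:
            price = float(get_quote())
        except (IOError, ValueError):
            # failed download or unparseable quote: skip this point
            price = None

        if price is not None:
            write_point(dict(x=stamp, y=price))
            sent += 1

        sleep(1)   # wait one second between polls

    return sent

# offline usage with made-up quotes and no real waiting
fake_quotes = iter(['1.1234', 'N/A', '1.1236'])
points = []
sent = stream_quotes(lambda: next(fake_quotes), points.append, max_points=3,
                     sleep=lambda seconds: None)
```

With the real objects, the call would look something like `stream_quotes(lambda: data_downloader.download_live_quote(vendor_ticker, 'Yahoo'), s.write, max_points)`.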

### Biography

Saeed Amen is the managing director and co-founder of the Thalesians. He has a decade of experience creating and successfully running systematic trading models at Lehman Brothers, Nomura and now at the Thalesians. Independently, he runs a systematic trading model with proprietary capital. He is the author of Trading Thalesians – What the ancient world can teach us about trading today (Palgrave Macmillan). He graduated with a first class honours master’s degree from Imperial College in Mathematics & Computer Science. He is also a fan of Python and has written an extensive library for financial market backtesting called PyThalesians, which is partially open sourced and available on the [Thalesians GitHub page](https://github.com/thalesians).

Follow the Thalesians on Twitter @thalesians and get my book on Amazon [here](http://www.amazon.co.uk/Trading-Thalesians-Saeed-Amen/dp/113739952X). You can also join our Thalesians Meetup.com group [here](http://www.meetup.com/thalesians) - we do quant finance events in a number of cities including London, New York, Budapest, Prague and Frankfurt.

The Thalesians website can be found [here](http://www.thalesians.com).