### Market data refresh 

### Input Description

Raw OHLC (open, high, low, close) market data files, assumed to be CSV.

### Output  

Clean OHLC data in an HDF5 store.

### Operations

This code takes financial market data files and runs them through a processing pipeline. The following operations are carried out:

- Localise the timestamps: interpret them in the data source's timezone and convert them to UTC
- Merge with existing RAW data based on datetime
- Save the resulting RAW data to HDF5
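
The first two steps can be sketched with plain pandas. This is only an illustration under stated assumptions, not the `quantutils` implementation: `localize_to_utc` and `merge_on_datetime` are hypothetical helper names, the prices are made up, and letting newer rows override older ones on a duplicate timestamp is an assumed reading of `update=True`.

```python
import pandas as pd


def localize_to_utc(df: pd.DataFrame, source_tz: str) -> pd.DataFrame:
    """Treat a naive DatetimeIndex as wall-clock time in source_tz, then convert to UTC."""
    out = df.copy()
    out.index = out.index.tz_localize(source_tz).tz_convert("UTC")
    return out


def merge_on_datetime(existing: pd.DataFrame, new: pd.DataFrame) -> pd.DataFrame:
    """Union two frames on their datetime index; rows from `new` win on duplicates (assumption)."""
    combined = pd.concat([existing, new])
    combined = combined[~combined.index.duplicated(keep="last")]
    return combined.sort_index()


cols = ["Open", "High", "Low", "Close"]

# Data already in the store, indexed in UTC (made-up values).
existing = pd.DataFrame(
    [[100.0, 101.0, 99.0, 100.5], [101.0, 102.0, 100.0, 101.5]],
    columns=cols,
    index=pd.to_datetime(["2018-06-21 14:00", "2018-06-21 15:00"]).tz_localize("UTC"),
)

# Newly loaded raw data with naive London timestamps (BST, i.e. UTC+1 on this date).
raw = pd.DataFrame(
    [[101.2, 102.2, 100.2, 101.7], [102.0, 103.0, 101.0, 102.5]],
    columns=cols,
    index=pd.to_datetime(["2018-06-21 16:00", "2018-06-21 17:00"]),
)

new = localize_to_utc(raw, "Europe/London")
merged = merge_on_datetime(existing, new)
print(merged)

# Saving would then be a call such as merged.to_hdf("store.h5", key="DOW"),
# which requires the optional PyTables dependency; the file name and key are placeholders.
```

The 16:00 London bar lands on 15:00 UTC, so it replaces the existing 15:00 row, and the merged frame holds three rows.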

In [1]:
#!pip install --upgrade "../../quantutils"
import json, os, pandas
import quantutils.dataset.pipeline as ppl
from quantutils.api.datasource import MarketDataStore
mds = MarketDataStore("../datasources")

In [4]:
def refreshMarketData(root, datasource_file):

    # Loop over datasources...
    # TODO: In chronological order

    # Use a context manager so the config file handle is closed after loading
    with open(os.path.join(root, datasource_file)) as f:
        datasources = json.load(f)
    
    for datasource in datasources:

        DS_path = root + "/" + datasource["name"] + "/"
        SRC_path = DS_path + "raw/"

        for market in datasource["markets"]:

            for source in market["sources"]:

                # Loop over any source files...
                for infile in os.listdir(SRC_path):

                    if infile.lower().startswith(source["name"].lower()):

                        print("Adding " + infile + " to " + market["name"] + " table")

                        # Load RAW data (assume CSV)
                        newData = pandas.read_csv(SRC_path + infile,
                                                  index_col=datasource["index_col"],
                                                  parse_dates=datasource["parse_dates"],
                                                  header=None,
                                                  names=["Date", "Time", "Open", "High", "Low", "Close"],
                                                  usecols=range(0, 6),
                                                  skiprows=datasource["skiprows"],
                                                  dayfirst=datasource["dayfirst"]
                                                  )

                        if newData is not None:

                            newData = ppl.localize(newData, datasource["timezone"], "UTC")

                            # TODO: Call Price Store API to append data
                            mds.appendHDF(source["name"], newData, source["sample_unit"], update=True)
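
The function above reads a fixed set of keys from each entry of the datasource file. The sketch below shows what such an entry might look like and checks that those keys are present. Only the key names come from `refreshMarketData`; every value (including the source name `WallSt-hourly` and the `index_col`, `parse_dates` and `skiprows` settings) is a hypothetical placeholder, and `missing_keys` is an illustrative helper rather than part of `quantutils`.

```python
import json

# Hypothetical example entry: key names are taken from refreshMarketData,
# all values are placeholders chosen for illustration.
datasources = [
    {
        "name": "example_provider",
        "timezone": "Europe/London",
        "index_col": 0,
        "parse_dates": True,
        "skiprows": 0,
        "dayfirst": True,
        "markets": [
            {
                "name": "DOW",
                "sources": [{"name": "WallSt-hourly", "sample_unit": "H"}],
            }
        ],
    }
]

DATASOURCE_KEYS = {
    "name", "timezone", "index_col", "parse_dates",
    "skiprows", "dayfirst", "markets",
}
MARKET_KEYS = {"name", "sources"}
SOURCE_KEYS = {"name", "sample_unit"}


def missing_keys(entries):
    """Return a list of (location, missing key names) for incomplete entries."""
    problems = []
    for ds in entries:
        ds_name = ds.get("name", "?")
        if DATASOURCE_KEYS - ds.keys():
            problems.append((ds_name, sorted(DATASOURCE_KEYS - ds.keys())))
        for market in ds.get("markets", []):
            m_name = market.get("name", "?")
            if MARKET_KEYS - market.keys():
                problems.append((f"{ds_name}/{m_name}", sorted(MARKET_KEYS - market.keys())))
            for source in market.get("sources", []):
                if SOURCE_KEYS - source.keys():
                    location = f"{ds_name}/{m_name}/{source.get('name', '?')}"
                    problems.append((location, sorted(SOURCE_KEYS - source.keys())))
    return problems


print(json.dumps(datasources, indent=2))
print(missing_keys(datasources))
```

Running a check like this before the refresh loop would surface a missing key as a readable message instead of a `KeyError` deep inside the nested loops.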

In [5]:
refreshMarketData("../datasources", "datasources.json")

Adding WallSt-hourly-120217.txt to DOW table
Converting from Europe/London to UTC
Resampling to H periods
Appending data...
Adding WallSt-hourly-011116.txt to DOW table
Converting from Europe/London to UTC
Resampling to H periods
Re-writing table data for update...
Adding WallSt-hourly-210618.txt to DOW table
Converting from Europe/London to UTC
Resampling to H periods
Appending data...
Adding WallSt-hourly-230718.txt to DOW table
Converting from Europe/London to UTC
Resampling to H periods
Appending data...
Adding WallSt-hourly-050517.txt to DOW table
Converting from Europe/London to UTC
Resampling to H periods
Re-writing table data for update...
Adding WallSt-hourly-140518.txt to DOW table
Converting from Europe/London to UTC
Resampling to H periods
Re-writing table data for update...
Adding WallSt-hourly-040417.txt to DOW table
Converting from Europe/London to UTC
Resampling to H periods
Re-writing table data for update...
Adding WallSt-hourly-030818.txt to DOW table
Converting from Europe/London to UTC
Resampling to H periods
Re-writing table data for update...
Adding WallSt-hourly-160517.txt to DOW table
Converting from Europe/London to UTC
Resampling to H periods
...
Converting from Europe/London to UTC
Resampling to H periods
Re-writing table data for update...
Adding SP500-hourly-210618.txt to SPY table
Converting from Europe/London to UTC
Resampling to H periods
Appending data...
Adding SP500-hourly-200318.txt to SPY table
Converting from Europe/London to UTC
Resampling to H periods
Re-writing table data for update...
Adding SP500-hourly-180418.txt to SPY table
Converting from Europe/London to UTC
Resampling to H periods
Re-writing table data for update...
Adding SP500-hourly-140518.txt to SPY table
Converting from Europe/London to UTC
Resampling to H periods
Re-writing table data for update...
Adding SP500-hourly-040618.txt to SPY table
Converting from Europe/London to UTC
Resampling to H periods
Re-writing table data for update...
Adding SP500-hourly-030818.txt to SPY table
Converting from Europe/London to UTC
Resampling to H periods
Appending data...
Adding SP500-hourly-230718.txt to SPY table
Converting from Europe/London to UTC
Resampling to H periods
...
Converting from US/Eastern to UTC
Resampling to 5min periods
Re-writing table data for update...
Adding D&J-IND_130101_141231.csv to DOW table
Converting from US/Eastern to UTC
Resampling to 5min periods
Re-writing table data for update...
Adding SANDP-500_161003_180319.csv to SPY table
Converting from US/Eastern to UTC
Resampling to 5min periods
Appending data...
Adding SANDP-500_150101_170519.csv to SPY table
Converting from US/Eastern to UTC
Resampling to 5min periods
Re-writing table data for update...
Adding SANDP-500_130101_141231.csv to SPY table
Converting from US/Eastern to UTC
Resampling to 5min periods
Re-writing table data for update...
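
The log shows each file being resampled to its `sample_unit` ("H" or "5min") before it is stored. How `quantutils` does this internally is not shown in this notebook; a common pandas approach to resampling OHLC bars, given here as a sketch with made-up prices, takes the first open, the highest high, the lowest low and the last close in each period.

```python
import pandas as pd

# Six made-up 10-minute bars, all within the 14:00 UTC hour.
index = pd.date_range("2018-06-21 14:00", periods=6, freq="10min", tz="UTC")
bars = pd.DataFrame(
    {
        "Open": [100.0, 101.0, 102.0, 101.0, 100.0, 99.0],
        "High": [101.0, 102.0, 103.0, 102.0, 101.0, 100.0],
        "Low": [99.0, 100.0, 101.0, 100.0, 99.0, 98.0],
        "Close": [101.0, 102.0, 101.0, 100.0, 99.0, 98.5],
    },
    index=index,
)

# Aggregation rule per column for an OHLC bar.
ohlc_rules = {"Open": "first", "High": "max", "Low": "min", "Close": "last"}

# "60min" is used for the hourly period; dropna() removes periods with no trades.
hourly = bars.resample("60min").agg(ohlc_rules).dropna()
print(hourly)
```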
