![RedTen](http://i.imgur.com/pI7TE3w.png)

## Predicting the SPY's Future Closing Price with a Multi-Model Forecast

Build many machine learning models, cached in Redis, to predict future price movements.

![If only managing my portfolio was this easy...](http://i.imgur.com/2ectZu8.png)

### How?

1. Uses pricing metrics (HLOCV: high, low, open, close, volume)
1. Streamlines development and deployment of machine learning forecasts by storing large, pre-trained models in Redis
1. Uses a custom-rolled dataset (takes about 7 hours per ticker)
1. Uses technical indicators
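
To make the point about pre-trained models in Redis concrete, here is a minimal sketch of caching a model under a key and loading it back. It is an illustration, not the RedTen implementation: `DictStore` is a hypothetical in-memory stand-in exposing the same `set`/`get` calls a Redis client offers, the key name is made up, and the "model" is a plain dict standing in for a trained regressor.

```python
import pickle
import zlib


class DictStore:
    """Hypothetical in-memory stand-in for a Redis client (set/get on bytes)."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = value
        return True

    def get(self, key):
        return self._data.get(key)


def cache_model(store, key, model):
    """Serialize and compress a model, then store it under `key`."""
    blob = zlib.compress(pickle.dumps(model, protocol=pickle.HIGHEST_PROTOCOL))
    store.set(key, blob)
    return len(blob)


def load_model(store, key):
    """Return the cached model, or None when the key is missing."""
    blob = store.get(key)
    if blob is None:
        return None
    return pickle.loads(zlib.decompress(blob))


store = DictStore()
# Stand-in for a trained model; a real one would be a fitted regressor object.
fake_model = {"algo_name": "xgb-regressor", "target": "FClose", "units_ahead": 5}
cache_model(store, "SPY_FClose_5", fake_model)
restored = load_model(store, "SPY_FClose_5")
print(restored == fake_model)
```

With a real Redis server, a `redis.Redis(...)` client from the redis-py package could take the place of `DictStore`, since it also provides `set` and `get`. Only unpickle data from stores you trust.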



### Why?

1. It took too long to manually rebuild the dataset and to build and tune new models
1. Improve model accuracy by tracking success (situational/seasonal risks)
1. Wanted simple, consistent delivery of results
1. Needed a service layer for abstracting model implementation
1. Needed a multi-tenant, distributed machine learning cloud
1. Team needed Jupyter integration
1. Data security, so it had to run both on-premises and in the cloud

__Now it takes 30 minutes to build the dataset and 5 minutes to make new predictions__

## Sample SPY Multi-Model Forecast

![Combined HLOCV forecast](http://i.imgur.com/iraWaZV.png)

## Set Up the Environment

Load the shared core, methods, and environment before starting processing.

In [1]:
from __future__ import print_function
import sys, os, requests, json, datetime

# Load the environment and log in the user
from src.common.load_redten_ipython_env import user_token, user_login, csv_file, run_job, core, api_urls, ppj, rt_url, rt_user, rt_pass, rt_email, lg, good, boom, anmt, mark, uni_key, rest_login_as_user, rest_full_login, wait_for_job_to_finish, wait_on_job, get_job_analysis, get_job_results, get_analysis_manifest, get_job_cache_manifest, build_prediction_results, build_forecast_results, search_ml_jobs, show_logs, show_errors, ipyImage, ipyHTML, ipyDisplay, pd, np

## Configure the job

In [2]:
# dataset name is the ticker
ds_name = "SPY"

# Label and description for job
title = str(ds_name) + " Forecast v5 - " + str(uni_key())
desc = "Forecast simulation - " + str(datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S"))

# Which algorithm do you want to use?
algo_name = "xgb-regressor"

# If your dataset is stored in redis, you can pass in the location
# to the dataset like: <redis endpoint>:<key>
rloc = "" 

# If your dataset is stored in S3, you can pass in the location 
# to the dataset like: <bucket>:<key>
sloc = "" 

# During training, what ratio of test data to training data do you want to use?
# Trade off smarts vs accuracy...how smart are we going?
test_ratio = 0.1

# Customize dataset samples used during the analysis using json dsl
sample_filter_rules = {}

# Which column do you want to predict?
target_column_name = "FClose"

# What columns can the algorithms use for training and learning?
feature_column_names = [ "FHigh", "FLow", "FOpen", "FClose", "FVolume" ]

# values in the Target Column
target_column_values = [ "GoodBuys", "BadBuys", "Not Finished" ] 

# How many units ahead do you want to forecast?
units_ahead_set = [ 5, 10, 15, 20, 25, 30 ]
units_ahead_type = "Days"

# Prune non-int/float columns as needed: 
ignore_features = [       
                "Ticker",
                "Date",
                "FDate",
                "FPrice",
                "DcsnDate",
                "Decision"
            ]

# Set up the XGB parameters
# https://github.com/dmlc/xgboost/blob/master/doc/parameter.md
train_xgb = {
                "learning_rate" : 0.20, 
                "num_estimators" : 50, 
                "sub_sample" : 0.20, 
                "col_sample_by_tree" : 0.90, 
                "col_sample_by_level" : 1.0, 
                "objective" : "reg:linear",
                "max_depth" : 3,
                "max_delta_step" : 0,
                "min_child_weight" : 1, 
                "reg_alpha" : 0, 
                "reg_lambda" : 1,
                "base_score" : 0.6,
                "gamma" : 0,
                "seed" : 42, 
                "silent" : True
            } 

# Predict new price points during the day
predict_row = {
                "High"   : 250.82,
                "Low"    : 245.54,
                "Open"   : 247.77,
                "Close"  : 246.24,
                "Volume" : 77670266
            }
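
The `units_ahead_set` above requests one forecast per horizon. One common way to build training targets for such horizons is to pair each row with the value observed N rows later. The sketch below shows that idea with plain Python lists. It is an assumption about the general technique, not a description of how the service builds its dataset; `build_units_ahead_targets` is a hypothetical helper and the prices are made up.

```python
def build_units_ahead_targets(closes, units_ahead):
    """Pair each close with the close observed `units_ahead` rows later.

    Rows near the end have no known future value yet, so they are dropped.
    """
    if units_ahead <= 0:
        raise ValueError("units_ahead must be positive")
    pairs = []
    for i in range(len(closes) - units_ahead):
        pairs.append((closes[i], closes[i + units_ahead]))
    return pairs


# Hypothetical closing prices for eight consecutive trading days
closes = [236.0, 236.5, 237.0, 238.2, 237.9, 237.4, 237.7, 236.9]

for n in (1, 3, 5):
    pairs = build_units_ahead_targets(closes, n)
    print(n, len(pairs), pairs[0])
```

Longer horizons leave fewer usable rows, because the most recent rows have no future value to learn from yet.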

## Start Forecasting

In [3]:
job_id = None # on success, this will store the actively running job's id
csv_file = ""
post_data = {
            "predict_this_data" : predict_row,
            "title" : title,
            "desc" : desc,
            "ds_name" : ds_name,
            "target_column_name" : target_column_name,
            "feature_column_names" : feature_column_names,
            "ignore_features" : ignore_features,
            "csv_file" : csv_file,
            "rloc" : rloc,
            "sloc" : sloc,
            "algo_name" : algo_name,
            "test_ratio" : test_ratio,
            "target_column_values" : target_column_values,
            "label_column_name" : target_column_name, 
            "prediction_type" : "Forecast",
            "ml_type" : "Playbook-UnitsAhead",
            "train" : train_xgb,
            "tracking_type" : "",
            "units_ahead_set" : units_ahead_set,
            "units_ahead_type" : units_ahead_type,
            "forecast_type" : "ETFPriceForecasting",
            "sample_filters" : sample_filter_rules,
            "predict_units_back" : 90, # how many days back should the final chart go?
            "send_to_email" : [ "jay.p.h.johnson@gmail.com" ] # comma separated list
        }

anmt("Running job: " + str(title))

auth_headers = { 
            "Content-type": "application/json",
            "Authorization" : "JWT " + str(user_token)
        }

job_response = run_job(post_data=post_data, headers=auth_headers)

if job_response["status"] != "valid":
    boom("Forecast job failed with error=" + str(job_response["status"]))
else:
    if "id" not in job_response["data"]:
        boom("Failed to create new forecast job")
    else:
        job_id = job_response["data"]["id"]
        job_status = job_response["data"]["status"]
        lg("Started Forecast job=" + str(job_id) + " with current status=" + str(job_status))
# end of if job was valid or not

Running job: SPY Forecast v5 - 34d6bf175cad4aaf9234aa06e95f4e4
Started Forecast job=558 with current status=requested
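
For readers curious what a helper like `run_job` might do under the hood, here is a hedged sketch that builds (but does not send) an authenticated JSON POST using only the standard library. The `JWT` authorization scheme matches the headers used above; the `/ml/run/` path, the function name `build_job_request`, and the placeholder token are assumptions for illustration, and the real helper may differ.

```python
import json
import urllib.request


def build_job_request(base_url, token, post_data):
    """Build (but do not send) an authenticated JSON POST for a new job."""
    body = json.dumps(post_data).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/ml/run/",  # assumed endpoint path
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": "JWT " + str(token),
        },
        method="POST",
    )


req = build_job_request(
    "https://redten.io",
    "example-token",  # placeholder, not a real credential
    {"ds_name": "SPY", "algo_name": "xgb-regressor", "units_ahead_set": [5, 10]},
)
print(req.get_method(), req.full_url)
print(req.get_header("Authorization"))
print(json.loads(req.data.decode("utf-8"))["ds_name"])
```

Sending the request would be a call to `urllib.request.urlopen(req)`; that step is left out so the sketch runs offline.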


## Wait for the job to finish

In [4]:
job_data = {}
job_report = {}

# Should hook this up to a randomized image loader...
ipyDisplay(ipyImage(url="https://media.giphy.com/media/l397998l2DT0ogare/giphy.gif"))

job_res = {}
if job_id is None:
    boom("Failed to start a new job")
else:
    job_res = wait_on_job(job_id)

    if job_res["status"] != "SUCCESS":
        boom("Job=" + str(job_id) + " failed with status=" + str(job_res["status"]) + " err=" + str(job_res["error"]))
    else:
        job_data = job_res["record"]
        anmt("Job Report:")
        lg(ppj(job_data), 5)
# end of waiting

Waiting on job=558 url=https://redten.io/ml/558/
Job=558 is training - Step 3/10
Job=558 is analyzing - Step 5/10
Job=558 is caching - Step 6/10
Job=558 is plotting - Step 7/10
Job=558 is uploading - Step 9/10
Job=558 completed
Job Report:
{
    "job": {
        "algo_name": "xgb-regressor",
        "control_state": "active",
        "created": "2017-05-26 08-02-13",
        "csv_file": "",
        "desc": "Forecast simulation - 2017-05-26 08:02:13",
        "ds_name": "SPY",
        "feature_column_names": [
            "FHigh",
            "FLow",
            "FOpen",
            "FClose",
            "FVolume"
        ],
        "id": 558,
        "ignore_features": [
            "Ticker",
            "Date",
            "FDate",
            "FPrice",
            "DcsnDate",
            "Decision"
        ],
        "images": [
            {
                "author_name": null,
                "desc": null,
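
The progress lines in this section suggest that `wait_on_job` polls the job until it reaches a final state. A generic version of that pattern is sketched below. `poll_until_done`, the `fetch_status` callable, and the set of terminal states are all assumptions for illustration rather than the helper's actual code.

```python
import time


def poll_until_done(fetch_status, terminal_states=("completed", "failed"),
                    interval_secs=1.0, max_polls=100, sleep=time.sleep):
    """Call fetch_status() until it returns a terminal state or polls run out.

    Returns the list of distinct consecutive states seen, in order.
    """
    seen = []
    for _ in range(max_polls):
        state = fetch_status()
        if not seen or seen[-1] != state:
            seen.append(state)  # record each state change once
        if state in terminal_states:
            return seen
        sleep(interval_secs)
    raise TimeoutError("job did not finish after " + str(max_polls) + " polls")


# Simulated status feed standing in for repeated HTTP GETs on the job URL
feed = iter(["requested", "training", "training", "analyzing",
             "caching", "plotting", "uploading", "completed"])

history = poll_until_done(lambda: next(feed), sleep=lambda secs: None)
print(" -> ".join(history))
```

Passing `sleep` as a parameter keeps the loop testable without real delays; in a notebook the default `time.sleep` would be used.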

## Get Forecast Accuracies

In [5]:
job_report = {}
if job_id is None:
    boom("Failed to start a new job")
else:
    # Get the analysis, but do not auto-show the plots
    job_report = get_job_analysis(job_id, show_plots=False)
    if len(job_report) == 0:
        boom("Job=" + str(job_id) + " failed")
    else:
        lg("")
    # if the job failed
# end of get job analysis

# Build the forecast accuracy dictionary from the analysis
# and show the forecast dataframes
acc_results = build_forecast_results(job_report)
for col in acc_results:
    col_node = acc_results[col]
    
    predictions_df = col_node["predictions_df"]
    date_predictions_df = col_node["date_predictions_df"]
    train_predictions_df = col_node["train_predictions_df"]
    
    lg("--------------------------------------------------")
    # for all columns in the accuracy dictionary:  
    # successful predictions above 90%...how's that error rate though?
    if col_node["accuracy"] > 0.90: 
        good("Column=" + str(col) + " accuracy=" + str(col_node["accuracy"]) + " mse=" + str(col_node["mse"]) + " num_predictions=" + str(len(col_node["date_predictions_df"].index)))
    # successful predictions above 80% and up to 90%...how's that error rate though?
    elif col_node["accuracy"] > 0.80:
        lg("Column=" + str(col) + " accuracy=" + str(col_node["accuracy"]) + " mse=" + str(col_node["mse"]) + " num_predictions=" + str(len(col_node["date_predictions_df"].index)))
    else:
        boom("Column=" + str(col) + " is not very accurate: accuracy=" + str(col_node["accuracy"]) + " mse=" + str(col_node["mse"]) + " num_predictions=" + str(len(col_node["predictions_df"].index)))
    # end of header line
    
    # show the timeseries forecast
    ipyDisplay(date_predictions_df)
    
    lg("")
# end of showing prediction results

Getting analysis for job=558 url=https://redten.io/ml/analysis/558/
SUCCESS - GET Analysis Response Status=200 Reason=OK
Found Job=558 analysis

--------------------------------------------------
Column=FOpen_10 accuracy=0.99848923125 mse=1.87511875363 num_predictions=50


Unnamed: 0,COpen,Date,FOpen
0,236.64,2017-02-27,237.073318
1,236.67,2017-02-28,236.157669
2,238.39,2017-03-01,237.216782
3,239.56,2017-03-02,235.210175
4,238.17,2017-03-03,233.699661
5,237.5,2017-03-06,237.479095
6,237.71,2017-03-07,235.920334
7,237.34,2017-03-08,237.982605
8,236.7,2017-03-09,237.568695
9,237.97,2017-03-10,235.905457



--------------------------------------------------
Column=FLow_5 accuracy=0.998086039927 mse=1.85004769663 num_predictions=55


Unnamed: 0,CLow,Date,FLow
0,236.35,2017-02-27,236.929581
1,236.02,2017-02-28,236.010742
2,238.37,2017-03-01,233.294571
3,238.21,2017-03-02,232.308624
4,237.73,2017-03-03,234.968628
5,237.01,2017-03-06,236.273605
6,237.71,2017-03-07,236.508759
7,236.4,2017-03-08,233.406143
8,235.74,2017-03-09,235.658737
9,236.59,2017-03-10,231.818314



--------------------------------------------------
Column=FOpen_15 accuracy=0.998673216637 mse=1.36712346617 num_predictions=45


Unnamed: 0,COpen,Date,FOpen
0,236.64,2017-02-27,238.666595
1,236.67,2017-02-28,237.613739
2,238.39,2017-03-01,237.458069
3,239.56,2017-03-02,236.777634
4,238.17,2017-03-03,238.085297
5,237.5,2017-03-06,237.372849
6,237.71,2017-03-07,234.226608
7,237.34,2017-03-08,235.627625
8,236.7,2017-03-09,236.55275
9,237.97,2017-03-10,234.387222



--------------------------------------------------
Column=FHigh_25 accuracy=0.99852625367 mse=2.07955530003 num_predictions=35


Unnamed: 0,CHigh,Date,FHigh
0,237.31,2017-02-27,238.076721
1,236.95,2017-02-28,234.146927
2,240.32,2017-03-01,237.973175
3,239.57,2017-03-02,237.950897
4,238.61,2017-03-03,237.457062
5,238.12,2017-03-06,238.205643
6,237.71,2017-03-07,237.868561
7,237.64,2017-03-08,233.265198
8,237.24,2017-03-09,234.387634
9,238.02,2017-03-10,237.551636



--------------------------------------------------
Column=FOpen_30 accuracy=0.998227319751 mse=3.31048708482 num_predictions=30


Unnamed: 0,COpen,Date,FOpen
0,236.64,2017-02-27,234.693985
1,236.67,2017-02-28,236.030411
2,238.39,2017-03-01,235.485245
3,239.56,2017-03-02,237.560745
4,238.17,2017-03-03,235.975784
5,237.5,2017-03-06,235.504822
6,237.34,2017-03-08,235.342697
7,236.7,2017-03-09,236.80101
8,237.97,2017-03-10,237.499802
9,237.62,2017-03-13,234.917587



--------------------------------------------------
Column=FHigh_20 accuracy=0.99830741251 mse=1.25659633163 num_predictions=40


Unnamed: 0,CHigh,Date,FHigh
0,237.31,2017-02-27,237.670395
1,236.95,2017-02-28,235.0
2,240.32,2017-03-01,235.114883
3,239.57,2017-03-02,238.374283
4,238.61,2017-03-03,236.344513
5,238.12,2017-03-06,237.315735
6,237.71,2017-03-07,236.214111
7,237.64,2017-03-08,237.246307
8,237.24,2017-03-09,238.396393
9,238.02,2017-03-10,236.479446



--------------------------------------------------
Column=FVolume_20 accuracy=0.908394205285 mse=1.16457020726e+14 num_predictions=40


Unnamed: 0,CVolume,Date,FVolume
0,56515440.0,2017-02-27,65846012.0
1,96961938.0,2017-02-28,83015680.0
2,149158170.0,2017-03-01,64974404.0
3,70245978.0,2017-03-02,68879856.0
4,81974300.0,2017-03-03,66677188.0
5,55391533.0,2017-03-06,101378352.0
6,393822.0,2017-03-07,99078096.0
7,78168795.0,2017-03-08,97593016.0
8,90683918.0,2017-03-09,64936852.0
9,81991652.0,2017-03-10,95077096.0



--------------------------------------------------
Column=FVolume_25 accuracy=0.875114335535 mse=5.61678331942e+13 num_predictions=35


Unnamed: 0,CVolume,Date,FVolume
0,56515440.0,2017-02-27,109613136.0
1,96961938.0,2017-02-28,84890608.0
2,149158170.0,2017-03-01,56550484.0
3,70245978.0,2017-03-02,64852992.0
4,81974300.0,2017-03-03,78665056.0
5,55391533.0,2017-03-06,62343776.0
6,393822.0,2017-03-07,65068232.0
7,78168795.0,2017-03-08,73619968.0
8,90683918.0,2017-03-09,86562520.0
9,81991652.0,2017-03-10,83784728.0



--------------------------------------------------
Column=FLow_30 accuracy=0.998277970848 mse=2.96286991645 num_predictions=30


Unnamed: 0,CLow,Date,FLow
0,236.35,2017-02-27,235.59082
1,236.02,2017-02-28,236.062424
2,238.37,2017-03-01,235.890472
3,238.21,2017-03-02,236.847061
4,237.73,2017-03-03,234.398666
5,237.01,2017-03-06,234.975235
6,236.4,2017-03-08,234.611038
7,235.74,2017-03-09,235.61763
8,236.59,2017-03-10,236.238098
9,237.24,2017-03-13,233.991272



--------------------------------------------------
Column=FClose_20 accuracy=0.998412598936 mse=1.17535948971 num_predictions=40


Unnamed: 0,CClose,Date,FClose
0,237.11,2017-02-27,237.44249
1,236.47,2017-02-28,232.841202
2,239.78,2017-03-01,235.384415
3,238.27,2017-03-02,237.826065
4,238.42,2017-03-03,236.225693
5,237.71,2017-03-06,237.156509
6,237.71,2017-03-07,236.379303
7,236.56,2017-03-08,236.556427
8,236.86,2017-03-09,238.205902
9,237.69,2017-03-10,234.9021



--------------------------------------------------
Column=FVolume_5 accuracy=0.860508546731 mse=1.54519726446e+14 num_predictions=55


Unnamed: 0,CVolume,Date,FVolume
0,56515440.0,2017-02-27,47996392.0
1,96961938.0,2017-02-28,46576272.0
2,149158170.0,2017-03-01,84940464.0
3,70245978.0,2017-03-02,90956808.0
4,81974300.0,2017-03-03,74963216.0
5,55391533.0,2017-03-06,55247800.0
6,393822.0,2017-03-07,69506984.0
7,78168795.0,2017-03-08,78182704.0
8,90683918.0,2017-03-09,51209964.0
9,81991652.0,2017-03-10,72567152.0



--------------------------------------------------
Column=FClose_25 accuracy=0.99845288618 mse=2.04919274668 num_predictions=35


Unnamed: 0,CClose,Date,FClose
0,237.11,2017-02-27,236.552734
1,236.47,2017-02-28,235.050507
2,239.78,2017-03-01,237.163391
3,238.27,2017-03-02,237.538895
4,238.42,2017-03-03,236.731735
5,237.71,2017-03-06,237.555984
6,237.71,2017-03-07,237.320831
7,236.56,2017-03-08,234.358047
8,236.86,2017-03-09,233.667374
9,237.69,2017-03-10,237.382278



--------------------------------------------------
Column=FLow_10 accuracy=0.998567461387 mse=2.11792497821 num_predictions=50


Unnamed: 0,CLow,Date,FLow
0,236.35,2017-02-27,237.413834
1,236.02,2017-02-28,237.225479
2,238.37,2017-03-01,237.645981
3,238.21,2017-03-02,234.386292
4,237.73,2017-03-03,233.403427
5,237.01,2017-03-06,237.240616
6,237.71,2017-03-07,235.407959
7,236.4,2017-03-08,236.065186
8,235.74,2017-03-09,236.531815
9,236.59,2017-03-10,235.872589



--------------------------------------------------
Column=FOpen_5 accuracy=0.998177191116 mse=2.27659209481 num_predictions=55


Unnamed: 0,COpen,Date,FOpen
0,236.64,2017-02-27,238.192352
1,236.67,2017-02-28,236.068298
2,238.39,2017-03-01,233.407837
3,239.56,2017-03-02,233.426804
4,238.17,2017-03-03,234.870193
5,237.5,2017-03-06,237.075485
6,237.71,2017-03-07,237.380112
7,237.34,2017-03-08,234.219711
8,236.7,2017-03-09,237.943634
9,237.97,2017-03-10,233.489441



--------------------------------------------------
Column=FLow_15 accuracy=0.99868083036 mse=1.15577788337 num_predictions=45


Unnamed: 0,CLow,Date,FLow
0,236.35,2017-02-27,237.317108
1,236.02,2017-02-28,236.028763
2,238.37,2017-03-01,236.467682
3,238.21,2017-03-02,235.270294
4,237.73,2017-03-03,237.266052
5,237.01,2017-03-06,236.366791
6,237.71,2017-03-07,233.293579
7,236.4,2017-03-08,234.402542
8,235.74,2017-03-09,235.478241
9,236.59,2017-03-10,233.113327



--------------------------------------------------
Column=FHigh_10 accuracy=0.998665333528 mse=3.19837957813 num_predictions=50


Unnamed: 0,CHigh,Date,FHigh
0,237.31,2017-02-27,238.116806
1,236.95,2017-02-28,238.073212
2,240.32,2017-03-01,238.702911
3,239.57,2017-03-02,235.622223
4,238.61,2017-03-03,235.265427
5,238.12,2017-03-06,239.444489
6,237.71,2017-03-07,238.095032
7,237.64,2017-03-08,239.200668
8,237.24,2017-03-09,234.270569
9,238.02,2017-03-10,239.583817



--------------------------------------------------
Column=FHigh_15 accuracy=0.998541531411 mse=1.55878995885 num_predictions=45


Unnamed: 0,CHigh,Date,FHigh
0,237.31,2017-02-27,237.879074
1,236.95,2017-02-28,237.778702
2,240.32,2017-03-01,238.200607
3,239.57,2017-03-02,236.513977
4,238.61,2017-03-03,238.252991
5,238.12,2017-03-06,237.138977
6,237.71,2017-03-07,235.989029
7,237.64,2017-03-08,237.235672
8,237.24,2017-03-09,236.047653
9,238.02,2017-03-10,235.282043



--------------------------------------------------
Column=FOpen_25 accuracy=0.99848460541 mse=1.73816066913 num_predictions=35


Unnamed: 0,COpen,Date,FOpen
0,236.64,2017-02-27,237.816574
1,236.67,2017-02-28,235.636246
2,238.39,2017-03-01,238.271469
3,239.56,2017-03-02,236.959274
4,238.17,2017-03-03,237.465118
5,237.5,2017-03-06,237.865158
6,237.71,2017-03-07,236.993179
7,237.34,2017-03-08,234.584763
8,236.7,2017-03-09,233.24411
9,237.97,2017-03-10,237.703644



--------------------------------------------------
Column=FHigh_30 accuracy=0.998540184532 mse=1.89523682882 num_predictions=30


Unnamed: 0,CHigh,Date,FHigh
0,237.31,2017-02-27,238.654388
1,236.95,2017-02-28,237.913757
2,240.32,2017-03-01,237.615326
3,239.57,2017-03-02,237.942841
4,238.61,2017-03-03,235.067856
5,238.12,2017-03-06,236.428833
6,237.64,2017-03-08,237.182312
7,237.24,2017-03-09,237.329041
8,238.02,2017-03-10,237.844208
9,237.86,2017-03-13,235.09314



--------------------------------------------------
Column=FOpen_20 accuracy=0.99825314721 mse=1.28484845039 num_predictions=40


Unnamed: 0,COpen,Date,FOpen
0,236.64,2017-02-27,237.917526
1,236.67,2017-02-28,232.084137
2,238.39,2017-03-01,234.533905
3,239.56,2017-03-02,238.029938
4,238.17,2017-03-03,235.26123
5,237.5,2017-03-06,236.595505
6,237.71,2017-03-07,235.02774
7,237.34,2017-03-08,235.152695
8,236.7,2017-03-09,237.827057
9,237.97,2017-03-10,235.309494



--------------------------------------------------
Column=FVolume_10 accuracy=0.907178016782 mse=1.1376357129e+14 num_predictions=50


Unnamed: 0,CVolume,Date,FVolume
0,56515440.0,2017-02-27,49050020.0
1,96961938.0,2017-02-28,67179288.0
2,149158170.0,2017-03-01,57029576.0
3,70245978.0,2017-03-02,74554640.0
4,81974300.0,2017-03-03,59417156.0
5,55391533.0,2017-03-06,70559072.0
6,393822.0,2017-03-07,95975664.0
7,78168795.0,2017-03-08,75058032.0
8,90683918.0,2017-03-09,80309808.0
9,81991652.0,2017-03-10,117453024.0



--------------------------------------------------
Column=FVolume_30 accuracy=0.930475108578 mse=1.29142162391e+14 num_predictions=30


Unnamed: 0,CVolume,Date,FVolume
0,56515440.0,2017-02-27,154272176.0
1,96961938.0,2017-02-28,66343696.0
2,149158170.0,2017-03-01,45580672.0
3,70245978.0,2017-03-02,76765984.0
4,81974300.0,2017-03-03,74599040.0
5,55391533.0,2017-03-06,87300176.0
6,78168795.0,2017-03-08,95490720.0
7,90683918.0,2017-03-09,54597272.0
8,81991652.0,2017-03-10,70504896.0
9,57256824.0,2017-03-13,61960604.0



--------------------------------------------------
Column=FVolume_15 accuracy=0.881865035722 mse=1.50714346281e+14 num_predictions=45


Unnamed: 0,CVolume,Date,FVolume
0,56515440.0,2017-02-27,58376660.0
1,96961938.0,2017-02-28,114653016.0
2,149158170.0,2017-03-01,68996536.0
3,70245978.0,2017-03-02,98230624.0
4,81974300.0,2017-03-03,53393920.0
5,55391533.0,2017-03-06,49446616.0
6,393822.0,2017-03-07,81472504.0
7,78168795.0,2017-03-08,100004472.0
8,90683918.0,2017-03-09,57248996.0
9,81991652.0,2017-03-10,71268888.0



--------------------------------------------------
Column=FLow_20 accuracy=0.998418492538 mse=1.78245389644 num_predictions=40


Unnamed: 0,CLow,Date,FLow
0,236.35,2017-02-27,236.516693
1,236.02,2017-02-28,232.71936
2,238.37,2017-03-01,234.400543
3,238.21,2017-03-02,236.472946
4,237.73,2017-03-03,235.309799
5,237.01,2017-03-06,235.155243
6,237.71,2017-03-07,234.749527
7,236.4,2017-03-08,234.886383
8,235.74,2017-03-09,237.38765
9,236.59,2017-03-10,234.101929



--------------------------------------------------
Column=FLow_25 accuracy=0.998601484905 mse=1.80604115137 num_predictions=35


Unnamed: 0,CLow,Date,FLow
0,236.35,2017-02-27,236.538483
1,236.02,2017-02-28,233.490814
2,238.37,2017-03-01,236.478592
3,238.21,2017-03-02,237.355896
4,237.73,2017-03-03,236.198212
5,237.01,2017-03-06,237.066391
6,237.71,2017-03-07,236.528244
7,236.4,2017-03-08,233.600281
8,235.74,2017-03-09,232.905045
9,236.59,2017-03-10,236.264847



--------------------------------------------------
Column=FClose_30 accuracy=0.998366330784 mse=1.4605830873 num_predictions=30


Unnamed: 0,CClose,Date,FClose
0,237.11,2017-02-27,236.038269
1,236.47,2017-02-28,238.432297
2,239.78,2017-03-01,238.986664
3,238.27,2017-03-02,237.920959
4,238.42,2017-03-03,234.579483
5,237.71,2017-03-06,236.220596
6,236.56,2017-03-08,235.853409
7,236.86,2017-03-09,238.039886
8,237.69,2017-03-10,237.464264
9,237.81,2017-03-13,233.276535



--------------------------------------------------
Column=FClose_15 accuracy=0.998329081518 mse=1.55153303883 num_predictions=45


Unnamed: 0,CClose,Date,FClose
0,237.11,2017-02-27,237.514938
1,236.47,2017-02-28,236.349167
2,239.78,2017-03-01,237.232834
3,238.27,2017-03-02,235.958359
4,238.42,2017-03-03,237.573029
5,237.71,2017-03-06,235.776993
6,237.71,2017-03-07,233.257751
7,236.56,2017-03-08,234.547958
8,236.86,2017-03-09,236.204132
9,237.69,2017-03-10,234.440323



--------------------------------------------------
Column=FClose_10 accuracy=0.998336184677 mse=3.67349565905 num_predictions=50


Unnamed: 0,CClose,Date,FClose
0,237.11,2017-02-27,236.719559
1,236.47,2017-02-28,235.765656
2,239.78,2017-03-01,236.341888
3,238.27,2017-03-02,234.455933
4,238.42,2017-03-03,234.006851
5,237.71,2017-03-06,239.427414
6,237.71,2017-03-07,237.065964
7,236.56,2017-03-08,238.87265
8,236.86,2017-03-09,232.399292
9,237.69,2017-03-10,238.442032



--------------------------------------------------
Column=FHigh_5 accuracy=0.998197941367 mse=1.80702890975 num_predictions=55


Unnamed: 0,CHigh,Date,FHigh
0,237.31,2017-02-27,237.312714
1,236.95,2017-02-28,236.647583
2,240.32,2017-03-01,235.408691
3,239.57,2017-03-02,235.41217
4,238.61,2017-03-03,236.592255
5,238.12,2017-03-06,236.942337
6,237.71,2017-03-07,237.805847
7,237.64,2017-03-08,235.71109
8,237.24,2017-03-09,238.066299
9,238.02,2017-03-10,234.807449



--------------------------------------------------
Column=FClose_5 accuracy=0.998150179646 mse=1.98280524804 num_predictions=55


Unnamed: 0,CClose,Date,FClose
0,237.11,2017-02-27,236.434555
1,236.47,2017-02-28,236.510025
2,239.78,2017-03-01,234.04158
3,238.27,2017-03-02,232.594925
4,238.42,2017-03-03,235.686096
5,237.71,2017-03-06,236.494537
6,237.71,2017-03-07,235.777527
7,236.56,2017-03-08,234.72818
8,236.86,2017-03-09,237.905029
9,237.69,2017-03-10,231.17926
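
The `mse` values in this output are in squared units, which is why the volume columns report values near 1e14 while the price columns stay in single digits. Taking the square root gives an error in the column's own units. The sketch below shows the MSE calculation on a tiny made-up example and then converts two of the reported values to RMSE. How the service derives its `accuracy` figure is not stated here, so it is not reproduced.

```python
import math


def mean_squared_error(actual, predicted):
    """Average of squared differences between paired values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and the same length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Tiny made-up example to show the calculation itself: (0 + 0 + 4) / 3
print(mean_squared_error([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))

# MSE values copied from the job output in this section
reported_mse = {
    "FClose_5": 1.98280524804,
    "FVolume_5": 1.54519726446e+14,
}

# RMSE converts each error back to the column's own units
for column, mse in reported_mse.items():
    print(column, "rmse =", round(math.sqrt(mse), 2))
```

Read this way, the `FClose_5` error is roughly 1.41 in price units, while the `FVolume_5` error is on the order of 12 million shares, which makes the two columns easier to compare than the raw MSE does.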





## Get the Analysis Images

In [6]:
job_res = get_job_analysis(job_id, show_plots=True)

Getting analysis for job=558 url=https://redten.io/ml/analysis/558/
SUCCESS - GET Analysis Response Status=200 Reason=OK
Found Job=558 analysis
SPY-2-558 5-Days - Predictive Accuracy
Predicted Close 5 Days vs Actual Close 5 Days
URL: https://rt-media.s3.amazonaws.com/media/imagesml/20170526/2_558_12122_7c49dccd964548fc.png


---------------------------------------------------------------------------------------
SPY-2-558 5-Days - Predictive Accuracy
Predicted High 5 Days vs Actual High 5 Days
URL: https://rt-media.s3.amazonaws.com/media/imagesml/20170526/2_558_12121_5ae84fdafa034174.png


---------------------------------------------------------------------------------------
SPY-2-558 10-Days - Predictive Accuracy
Predicted Close 10 Days vs Actual Close 10 Days
URL: https://rt-media.s3.amazonaws.com/media/imagesml/20170526/2_558_12120_2d87f32772344921.png


---------------------------------------------------------------------------------------
SPY-2-558 15-Days - Predictive Accuracy
Predicted Close 15 Days vs Actual Close 15 Days
URL: https://rt-media.s3.amazonaws.com/media/imagesml/20170526/2_558_12119_c54a1003037d4eb3.png


---------------------------------------------------------------------------------------
SPY-2-558 30-Days - Predictive Accuracy
Predicted Close 30 Days vs Actual Close 30 Days
URL: https://rt-media.s3.amazonaws.com/media/imagesml/20170526/2_558_12118_192e7a3c4ba840fc.png


---------------------------------------------------------------------------------------
SPY-2-558 25-Days - Predictive Accuracy
Predicted Low 25 Days vs Actual Low 25 Days
URL: https://rt-media.s3.amazonaws.com/media/imagesml/20170526/2_558_12117_6b74cc14075f4f20.png


---------------------------------------------------------------------------------------
SPY-2-558 20-Days - Predictive Accuracy
Predicted Low 20 Days vs Actual Low 20 Days
URL: https://rt-media.s3.amazonaws.com/media/imagesml/20170526/2_558_12116_1fed112e3a4e42ff.png


---------------------------------------------------------------------------------------
SPY-2-558 15-Days - Predictive Accuracy
Predicted Volume 15 Days vs Actual Volume 15 Days
URL: https://rt-media.s3.amazonaws.com/media/imagesml/20170526/2_558_12115_795d9657c2534f76.png


---------------------------------------------------------------------------------------
SPY-2-558 30-Days - Predictive Accuracy
Predicted Volume 30 Days vs Actual Volume 30 Days
URL: https://rt-media.s3.amazonaws.com/media/imagesml/20170526/2_558_12114_c19203024ec84944.png


---------------------------------------------------------------------------------------
SPY-2-558 10-Days - Predictive Accuracy
Predicted Volume 10 Days vs Actual Volume 10 Days
URL: https://rt-media.s3.amazonaws.com/media/imagesml/20170526/2_558_12113_1b000088cf154451.png


---------------------------------------------------------------------------------------
SPY-2-558 20-Days - Predictive Accuracy
Predicted Open 20 Days vs Actual Open 20 Days
URL: https://rt-media.s3.amazonaws.com/media/imagesml/20170526/2_558_12112_c72e079979774d17.png


---------------------------------------------------------------------------------------
SPY-2-558 30-Days - Predictive Accuracy
Predicted High 30 Days vs Actual High 30 Days
URL: https://rt-media.s3.amazonaws.com/media/imagesml/20170526/2_558_12111_b0c21c4c4d0d4e7a.png


---------------------------------------------------------------------------------------
SPY-2-558 25-Days - Predictive Accuracy
Predicted Open 25 Days vs Actual Open 25 Days
URL: https://rt-media.s3.amazonaws.com/media/imagesml/20170526/2_558_12110_2a27883839464a6c.png


---------------------------------------------------------------------------------------
SPY-2-558 15-Days - Predictive Accuracy
Predicted High 15 Days vs Actual High 15 Days
URL: https://rt-media.s3.amazonaws.com/media/imagesml/20170526/2_558_12109_325e5fefa07a444d.png


---------------------------------------------------------------------------------------
SPY-2-558 10-Days - Predictive Accuracy
Predicted High 10 Days vs Actual High 10 Days
URL: https://rt-media.s3.amazonaws.com/media/imagesml/20170526/2_558_12108_7dfe5cad6dd341c2.png


---------------------------------------------------------------------------------------
SPY-2-558 15-Days - Predictive Accuracy
Predicted Low 15 Days vs Actual Low 15 Days
URL: https://rt-media.s3.amazonaws.com/media/imagesml/20170526/2_558_12107_2fcaa612a10346d9.png


---------------------------------------------------------------------------------------
SPY-2-558 5-Days - Predictive Accuracy
Predicted Open 5 Days vs Actual Open 5 Days
URL: https://rt-media.s3.amazonaws.com/media/imagesml/20170526/2_558_12106_29c5e186270348f4.png


---------------------------------------------------------------------------------------
SPY-2-558 10-Days - Predictive Accuracy
Predicted Low 10 Days vs Actual Low 10 Days
URL: https://rt-media.s3.amazonaws.com/media/imagesml/20170526/2_558_12105_7139a2dd4df5447b.png


---------------------------------------------------------------------------------------
SPY-2-558 25-Days - Predictive Accuracy
Predicted Close 25 Days vs Actual Close 25 Days
URL: https://rt-media.s3.amazonaws.com/media/imagesml/20170526/2_558_12104_74d4b72e2bd342d6.png


---------------------------------------------------------------------------------------
SPY-2-558 5-Days - Predictive Accuracy
Predicted Volume 5 Days vs Actual Volume 5 Days
URL: https://rt-media.s3.amazonaws.com/media/imagesml/20170526/2_558_12103_286431b3eabc4af5.png


---------------------------------------------------------------------------------------
SPY-2-558 20-Days - Predictive Accuracy
Predicted Close 20 Days vs Actual Close 20 Days
URL: https://rt-media.s3.amazonaws.com/media/imagesml/20170526/2_558_12102_b8f6174000f5455c.png


---------------------------------------------------------------------------------------
SPY-2-558 30-Days - Predictive Accuracy
Predicted Low 30 Days vs Actual Low 30 Days
URL: https://rt-media.s3.amazonaws.com/media/imagesml/20170526/2_558_12101_bfe9888ad15648f9.png


---------------------------------------------------------------------------------------
SPY-2-558 25-Days - Predictive Accuracy
Predicted Volume 25 Days vs Actual Volume 25 Days
URL: https://rt-media.s3.amazonaws.com/media/imagesml/20170526/2_558_12100_0761ea1b0f684ce9.png


---------------------------------------------------------------------------------------
SPY-2-558 20-Days - Predictive Accuracy
Predicted Volume 20 Days vs Actual Volume 20 Days
URL: https://rt-media.s3.amazonaws.com/media/imagesml/20170526/2_558_12099_8f73f41290794595.png


---------------------------------------------------------------------------------------
SPY-2-558 20-Days - Predictive Accuracy
Predicted High 20 Days vs Actual High 20 Days
URL: https://rt-media.s3.amazonaws.com/media/imagesml/20170526/2_558_12098_5fd3e28826754f24.png


---------------------------------------------------------------------------------------
SPY-2-558 30-Days - Predictive Accuracy
Predicted Open 30 Days vs Actual Open 30 Days
URL: https://rt-media.s3.amazonaws.com/media/imagesml/20170526/2_558_12097_214858c79e704f4c.png


---------------------------------------------------------------------------------------
SPY-2-558 25-Days - Predictive Accuracy
Predicted High 25 Days vs Actual High 25 Days
URL: https://rt-media.s3.amazonaws.com/media/imagesml/20170526/2_558_12096_168bcf41ff5142ad.png


---------------------------------------------------------------------------------------
SPY-2-558 15-Days - Predictive Accuracy
Predicted Open 15 Days vs Actual Open 15 Days
URL: https://rt-media.s3.amazonaws.com/media/imagesml/20170526/2_558_12095_de06fd5548504b41.png


---------------------------------------------------------------------------------------
SPY-2-558 5-Days - Predictive Accuracy
Predicted Low 5 Days vs Actual Low 5 Days
URL: https://rt-media.s3.amazonaws.com/media/imagesml/20170526/2_558_12094_354bb5267c694ba5.png


---------------------------------------------------------------------------------------
SPY-2-558 10-Days - Predictive Accuracy
Predicted Open 10 Days vs Actual Open 10 Days
URL: https://rt-media.s3.amazonaws.com/media/imagesml/20170526/2_558_12093_e2ecdc03bc8a4f30.png


---------------------------------------------------------------------------------------
SPY Close forecast overlay between 2017-02-27 00:00:00 - 2017-05-15 00:00:00
URL: https://rt-media.s3.amazonaws.com/media/imagesml/20170526/2_558_12092_954ad32f51ef47f4.png


---------------------------------------------------------------------------------------


## Get the Recent Machine Learning Jobs

In [7]:
# Log in and build the JWT authorization header
user_token = user_login(rt_user, rt_pass, rt_url)
auth_headers = {
    "Authorization": "JWT " + str(user_token)
}
query_params = {}

# List the recent ML jobs for this user
resource_url = rt_url + "/ml/jobs/"

lg("Running Get ML Job url=" + str(resource_url), 6)
get_response = requests.get(resource_url, params=query_params, headers=auth_headers)

if get_response.status_code not in (200, 201):
    lg("Failed with GET Response Status=" + str(get_response.status_code) + " Reason=" + str(get_response.reason), 0)
    lg("Details:\n" + str(get_response.text) + "\n", 0)
else:
    lg("SUCCESS - GET Response Status=" + str(get_response.status_code) + " Reason=" + str(get_response.reason)[0:10], 5)

    # Decode the JSON body and pretty-print the job records
    record = json.loads(get_response.text)
    lg(ppj(record))
# end of GET for listing the recent ML jobs

Running Get ML Job url=https://redten.io/ml/jobs/
SUCCESS - GET Response Status=200 Reason=OK
{
    "jobs": [
        {
            "algo_name": "xgb-regressor",
            "control_state": "active",
            "created": "2017-05-26 08-02-13",
            "csv_file": "",
            "desc": "Forecast simulation - 2017-05-26 08:02:13",
            "ds_name": "SPY",
            "feature_column_names": [
                "FHigh",
                "FLow",
                "FOpen",
                "FClose",
                "FVolume"
            ],
            "id": 558,
            "ignore_features": [
                "Ticker",
                "Date",
                "FDate",
                "FPrice",
                "DcsnDate",
                "Decision"
            ],
            "images": [
                "https://rt-media.s3.amazonaws.com/media/imagesml/20170526/2_558_12122_7c49dccd964548fc.png",
                "https://rt-media.s3.amazonaws.com/media/imagesml/20170526/2_558_

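Once the job list is decoded, picking out one job is plain dictionary work. The snippet below is a minimal sketch, not part of the Red10 client: `latest_job_for` is a hypothetical helper, and the sample record copies only a few of the fields shown in the response above. It assumes `created` always uses the `YYYY-MM-DD HH-MM-SS` layout seen in that response.

```python
from datetime import datetime

# Trimmed copy of the response above; only a few fields are kept.
record = {
    "jobs": [
        {
            "algo_name": "xgb-regressor",
            "control_state": "active",
            "created": "2017-05-26 08-02-13",
            "ds_name": "SPY",
            "feature_column_names": ["FHigh", "FLow", "FOpen", "FClose", "FVolume"],
            "id": 558,
        }
    ]
}


def latest_job_for(record, ds_name):
    """Return the most recently created job for a dataset, or None (hypothetical helper)."""
    jobs = [job for job in record.get("jobs", []) if job.get("ds_name") == ds_name]
    if not jobs:
        return None
    # Assumption: "created" follows the "YYYY-MM-DD HH-MM-SS" layout shown in the output
    return max(jobs, key=lambda job: datetime.strptime(job["created"], "%Y-%m-%d %H-%M-%S"))


latest = latest_job_for(record, "SPY")
print(latest["id"], latest["algo_name"], latest["feature_column_names"])
```

In the notebook, the `record` decoded in the cell above could be passed in place of the sample dictionary.
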
## Redis Machine Learning Manifest

Each job uses a manifest so that concurrent in-flight jobs and their models do not collide with those of other users or with historical machine learning jobs.

![Redis is awesome](http://i.imgur.com/BobKxGp.png)

#### A manifest contains:

1. A dictionary of Redis model locations
1. S3 archival locations
1. Tracking data for import and export across environments
1. Large model files (8 GB files in S3) decoupled from tracking and deployment
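
As a rough sketch of how such a manifest can be consumed, the snippet below indexes entries by forecast target. The two entries reuse values from this job's manifest with the tracking fields trimmed. `index_manifest_by_target` is a hypothetical helper, and the parsing rests on assumptions read off those values: `rloc` looks like `<redis namespace>:<key>`, `sloc` like `<s3 bucket>:<s3 key>`, and `target` like `<column>_<days ahead>`.

```python
# Two entries shaped like this job's manifest (tracking fields trimmed).
manifest = {
    "11811": {
        "rloc": "MODELS:_MD_SPY-2-558_cd1d9a_0",
        "sloc": "redten-models-west:rt_models_userid_2_job_558_train_450_jobresults_413_modelid_11811_modelkey_MD_SPY-2-558_cd1d9a_0.cache.pickle.zlib",
        "target": "FOpen_10",
    },
    "11812": {
        "rloc": "MODELS:_MD_SPY-2-558_cd1d9a_1",
        "sloc": "redten-models-west:rt_models_userid_2_job_558_train_450_jobresults_413_modelid_11812_modelkey_MD_SPY-2-558_cd1d9a_1.cache.pickle.zlib",
        "target": "FLow_5",
    },
}


def index_manifest_by_target(manifest):
    """Map each forecast target to its model id and storage locations (hypothetical helper)."""
    index = {}
    for model_id, entry in manifest.items():
        # Assumption: both locations are "<container>:<key>" strings
        redis_namespace, redis_key = entry["rloc"].split(":", 1)
        s3_bucket, s3_key = entry["sloc"].split(":", 1)
        # Assumption: a target is "<column>_<days ahead>", e.g. "FOpen_10"
        column, days_ahead = entry["target"].rsplit("_", 1)
        index[entry["target"]] = {
            "model_id": model_id,
            "column": column,
            "days_ahead": int(days_ahead),
            "redis_namespace": redis_namespace,
            "redis_key": redis_key,
            "s3_bucket": s3_bucket,
            "s3_key": s3_key,
        }
    return index


index = index_manifest_by_target(manifest)
print(index["FOpen_10"]["redis_key"], index["FOpen_10"]["days_ahead"])
```

Keeping only locations in the manifest is what lets the large model files stay in S3 or Redis while the tracking data moves between environments.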


In [8]:
job_manifest = get_job_cache_manifest(job_report)
lg(ppj(job_manifest))

{
    "11811": {
        "rloc": "MODELS:_MD_SPY-2-558_cd1d9a_0",
        "sloc": "redten-models-west:rt_models_userid_2_job_558_train_450_jobresults_413_modelid_11811_modelkey_MD_SPY-2-558_cd1d9a_0.cache.pickle.zlib",
        "target": "FOpen_10",
        "tracking_id": "ML_SPY-2-558_95aa375d14684a8182fe20b566090a3",
        "tracking_name": "SPY-2-558",
        "tracking_type": "UseTargetColAndUnits"
    },
    "11812": {
        "rloc": "MODELS:_MD_SPY-2-558_cd1d9a_1",
        "sloc": "redten-models-west:rt_models_userid_2_job_558_train_450_jobresults_413_modelid_11812_modelkey_MD_SPY-2-558_cd1d9a_1.cache.pickle.zlib",
        "target": "FLow_5",
        "tracking_id": "ML_SPY-2-558_95aa375d14684a8182fe20b566090a3",
        "tracking_name": "SPY-2-558",
        "tracking_type": "UseTargetColAndUnits"
    },
    "11813": {
        "rloc": "MODELS:_MD_SPY-2-558_cd1d9a_2",
        "sloc": "redten-models-west:rt_models_userid_2_job_558_train_450_jobresults_413_modelid_11813_modelkey_MD_

## Multiple Models stored in Redis

Here's how models are stored in the Redis machine learning data store:

![Forecast storing multiple models in Redis](http://i.imgur.com/zUxspVL.gif)
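
The `.cache.pickle.zlib` suffix on the archived model files suggests models are pickled and then zlib-compressed before storage. The snippet below is a minimal sketch under that assumption, not Red10's actual storage code. `pack_model`, `unpack_model`, and `InMemoryStore` are hypothetical names, and the in-memory store only stands in for a Redis client so the example runs without a server.

```python
import pickle
import zlib


def pack_model(model):
    """Serialize a model with pickle, then compress the bytes with zlib."""
    return zlib.compress(pickle.dumps(model, protocol=pickle.HIGHEST_PROTOCOL))


def unpack_model(blob):
    """Reverse pack_model: decompress, then unpickle."""
    return pickle.loads(zlib.decompress(blob))


class InMemoryStore:
    """Stand-in exposing set/get like a Redis client (assumption, for illustration only)."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = value
        return True

    def get(self, key):
        return self._data.get(key)


store = InMemoryStore()

# A toy object in place of a trained regressor; the key name is made up.
toy_model = {"target": "FClose_20", "coefficients": [0.1, 0.2, 0.3]}
store.set("MODELS:_MD_EXAMPLE_0", pack_model(toy_model))

restored = unpack_model(store.get("MODELS:_MD_EXAMPLE_0"))
print(restored == toy_model)
```

With the redis-py package, `redis.Redis(...)` offers `set` and `get` calls of the same shape. Only unpickle blobs from a trusted store, since unpickling can execute arbitrary code.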

### Conclusion

Today's talk focused on:

1. Using Redis as a machine learning data store for housing thousands of pre-trained models
1. Streamlining model pipelines to automate build + train + predict + export/import using a REST API
1. How time-intensive and expensive it is to continually rebuild machine learning models from scratch
1. The importance of tracking model accuracy and performance over time
1. How a system like Red10 lets an organization or team of data scientists quickly test datasets and new ideas without stomping on each other's work
1. How this approach can make predictions from __any dataset__, not just stocks
1. How this approach can make predictions with many different technologies

![Red10 Use Cases for Machine Learning](http://i.imgur.com/WWDYzGb.jpg)