![Intro](http://i.imgur.com/65dsKzX.png)

### Why build Red10?

I wanted to use machine learning to accurately predict buy and sell opportunities for stocks.

While building the original version, I realized how important a machine learning data store is to this type of intelligent application stack. Today's talk is about using Redis as a database for thousands of machine learning models. Using Redis as the data store lets my team focus on making better predictions through iterative improvements in an ever-changing market.

Once the API was defined, I realized this approach would work for any dataset, not just stocks. (Please see Appendix D for how Red10 works with the IRIS dataset.)

#### Objectives

1. Enable dataset evolution and feature engineering - create 1000s of machine learning models and find the best performing ones as you refine your data
1. Manage **unique** machine learning models - how can I find accurate models and benchmark them?
1. Apply **DevOps for machine learning** - treat models like build artifacts
1. Simple iterative workflow: upload dataset, run job, evaluate
1. Automatic analysis, model training, prediction and forecasting supported out of the box
1. Export/Import models across environments
1. Run pre-trained machine learning models on IoT, healthcare, drones and other resource-constrained environments that need predictive capabilities

### What is a Machine Learning Model?

An algorithm is the general approach you will take. The model is what you get when you run the algorithm over your training data and what you use to make predictions on new data. You can generate a new model with the same algorithm with different data, or a different model from the same data with a different algorithm.

Source: https://www.quora.com/What-is-the-difference-between-machine-learning-model-and-ML-algorithm
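
To make the distinction concrete, here is a minimal sketch in plain Python. The function names `fit_line` and `predict` and the training values are made up for illustration; they are not part of Red10. The least-squares routine plays the role of the algorithm, and each dictionary it returns is a model.

```python
def fit_line(xs, ys):
    """The 'algorithm': ordinary least squares for y = slope * x + intercept."""
    n = float(len(xs))
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    intercept = mean_y - slope * mean_x
    # The returned dict is the 'model': parameters learned from this data
    return {"slope": slope, "intercept": intercept}


def predict(model, x):
    """Use a trained model to make a prediction for a new value."""
    return model["slope"] * x + model["intercept"]


# Same algorithm, two made-up training sets -> two different models
model_a = fit_line([1, 2, 3, 4], [2, 4, 6, 8])
model_b = fit_line([1, 2, 3, 4], [5, 4, 3, 2])
print(predict(model_a, 5), predict(model_b, 5))
```

Both models come from the same algorithm, yet they give different predictions for the same input because they were trained on different data.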

Please refer to [Appendix G](https://redten.io:8900/notebooks/notebooks/redten/RedTen-Intro.ipynb#Appendix-G---Terminology-and-Definitions) for more definitions.

### Machine Learning Example

Iterative approach for navigating data to find the best predictions

![Iterative approach for navigating data to find the best predictions](http://i.imgur.com/fAZGfFY.jpg)

#### Navigating with a "simple" Model configuration - Test 1

![Models learn by trial/error through data - here is a possible outcome](http://i.imgur.com/2QdciOm.jpg)

#### Navigating with a different Model configuration - Test 2

![Models learn by trial/error through data - here is a possible outcome](http://i.imgur.com/blyf8LT.jpg)

#### Navigating with a different Model configuration - Test 3

![Models learn by trial/error through data - here is a possible outcome](http://i.imgur.com/m8aEYZz.jpg)

#### Navigating with a different Model configuration - Test 4

![Models learn by trial/error through data - here is a possible outcome](http://i.imgur.com/nvDohTD.jpg)

#### Navigating with the best case Model configuration

![Models learn by trial/error through data - here is a possible outcome](http://i.imgur.com/WfAhHlm.jpg)

### Redis as a Machine Learning Data Store

Redis is a great scalable, in-memory storage solution for handling CRUD machine learning use cases.

#### Origin
After years of using Redis for caching, pub/sub and automatic reloading on restart, it was an obvious first choice as a scalable storage solution for many pre-trained machine learning models. In my humble opinion, pulling gigabytes of pickled objects from a database would take too long and is not an ideal use case for a relational or NoSQL database (MySQL/Postgres/Oracle/Mongo).

#### Redis Machine Learning Data Store Responsibilities

1. Store pre-trained models (it takes time and compute power to build them)
1. Store model accuracies and predictions (this may be broken out into a separate instance in the future)
1. Provide an API for making new predictions from stored models
1. Provide a naming system for tracking deployed models across environments (focused on reducing model in-memory collisions)
1. Provide a model deployment API (import/export) - DevOps for machine learning
1. Implement automatic model reloading - using rdb snapshots
1. Stability and scaling
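
The first and fourth responsibilities can be sketched roughly as follows. This is an illustration under assumptions, not Red10's implementation: the `_MD_...` key layout, `model_key`, `store_model` and `load_model` are hypothetical, and `InMemoryStore` is a stand-in so the sketch runs without a server. A redis-py client exposes `set(name, value)` and `get(name)` with the same shape and could be passed in instead.

```python
import pickle


class InMemoryStore(object):
    """Stand-in for a Redis client so the sketch runs without a server."""

    def __init__(self):
        self._data = {}

    def set(self, name, value):
        self._data[name] = value
        return True

    def get(self, name):
        return self._data.get(name)


def model_key(user_id, job_id, dataset, target_column):
    # Hypothetical naming scheme; unique parts reduce key collisions
    return "_MD_{}_{}_{}_{}".format(user_id, job_id, dataset, target_column)


def store_model(store, key, model):
    # Serialize the trained model to bytes before writing it
    store.set(key, pickle.dumps(model, protocol=2))


def load_model(store, key):
    blob = store.get(key)
    # Only unpickle data written by a trusted source
    return None if blob is None else pickle.loads(blob)


store = InMemoryStore()
key = model_key(2, 555, "SPY", "FClose")
store_model(store, key, {"algo_name": "xgb-regressor", "weights": [0.1, 0.2]})
restored = load_model(store, key)
print(key, restored)
```

Putting the user, job, dataset and target column in the key is one way to keep models from different tenants and environments from overwriting each other.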

## Over 10,000 Machine Learning Models

Here's an analysis of the Redis machine learning data store after it broke through 10,000 pre-trained models in-memory.

__Anyone that can use a REST API can use Red10 to do the same thing__

In [1]:
from __future__ import print_function
import sys, os, requests, json, datetime

# Load the environment and login the user
from src.common.load_redten_ipython_env import user_token, user_login, csv_file, run_job, core, api_urls, ppj, rt_url, rt_user, rt_pass, rt_email, lg, good, boom, anmt, mark, ppj, uni_key, rest_login_as_user, rest_full_login, wait_for_job_to_finish, wait_on_job, get_job_analysis, get_job_results, get_analysis_manifest, get_job_cache_manifest, build_prediction_results, build_forecast_results, get_job_cache_manifest, search_ml_jobs, show_logs, show_errors, ipyImage, ipyHTML, ipyDisplay, pd, np

# header
lg("")
good("Starting Redis Key Analysis: " + str(datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S")))
lg("")

# store the model mappings by dataset name
models_iris = {}
models_spy = {}
models_xle = {}
models_xlf = {}
models_xli = {}
models_xlu = {}
total_models_in_redis = 0

# walk the redis dbs
for db_idx in range(0, 16):
    
    db_file = "/tmp/db-" + str(db_idx) + "-keys"
    os.system("echo \"select " + str(db_idx) + " \n keys *\" | redis-cli -p 6100 > " + str(db_file))
    database_keys = {}
    if os.path.exists(db_file): # if the file exists
        with open(db_file) as f: # open it
            lines_to_parse = f.readlines() # read all the lines
            for org_line in lines_to_parse:
                cur_line = org_line.rstrip("\n").strip().lstrip() # remove the newline and any other whitespace
                if str(cur_line) != "OK" and str(cur_line) != "":
                    database_keys[cur_line] = True
                # if it's not the redis OK response or an empty string
            # for all lines
        # with the db file open
    else:
        boom("Failed parsing Redis db_file=" + str(db_file))
    # end of parsing db_file
    
    if len(database_keys) > 1:
        for idx,k in enumerate(database_keys):

            # ignore the predictions and accuracies
            if "_PredictionsDF" not in k and "_Accuracy" not in k:
                num_underscores = len(str(k).split("_"))

                # ignore the analysis/manifest keys
                if num_underscores > 2:
                    if "_IRIS" in k:
                        models_iris[k] = True
                        total_models_in_redis += 1
                    elif "_SPY" in k:
                        models_spy[k] = True
                        total_models_in_redis += 1
                    elif "_XLE" in k:
                        models_xle[k] = True
                        total_models_in_redis += 1
                    elif "_XLF" in k:
                        models_xlf[k] = True
                        total_models_in_redis += 1
                    elif "_XLI" in k:   
                        models_xli[k] = True 
                        total_models_in_redis += 1    
                    elif "_XLU" in k:   
                        models_xlu[k] = True 
                        total_models_in_redis += 1
                # end of checking it's not an analysis/manifest key
                
            # end of if it's not a prediction and not an accuracy key
        # for the large db keyset

        lg("IRIS models=" + str(len(models_iris)), 5)
        lg("SPY models=" + str(len(models_spy)), 5)
        lg("XLE models=" + str(len(models_xle)), 5)
        lg("XLF models=" + str(len(models_xlf)), 5)
        lg("XLI models=" + str(len(models_xli)), 5)
        lg("XLU models=" + str(len(models_xlu)), 5)

        lg("")
        lg("---------------------------------------------")
        anmt("Total Pre-trained Machine Learning Models in Redis:")
        boom(str(total_models_in_redis))
        lg("---------------------------------------------")
        lg("")
        
    # end of if there's database keys in the redis instance
# end for all db files

Logged in: https://redten.io

Starting Redis Key Analysis: 2017-05-23 18:20:04

IRIS models=18
SPY models=2263
XLE models=2077
XLF models=2170
XLI models=2108
XLU models=2170

---------------------------------------------
Total Pre-trained Machine Learning Models in Redis:
10806
---------------------------------------------
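
The filtering in the cell above can also be written as a small, testable function. This is a sketch under assumptions: `count_models` is a hypothetical helper, and the sample key names are made up because the real key layout is not shown here. With redis-py, the keys could be streamed with `scan_iter()` rather than `KEYS *`, which blocks the server while it walks a large keyspace (redis-py returns bytes unless the client is created with `decode_responses=True`).

```python
from collections import Counter

DATASETS = ("IRIS", "SPY", "XLE", "XLF", "XLI", "XLU")


def count_models(keys):
    """Count model keys per dataset using the same filters as the cell above."""
    counts = Counter()
    for key in keys:
        # skip prediction and accuracy records
        if "_PredictionsDF" in key or "_Accuracy" in key:
            continue
        # skip analysis/manifest keys (fewer than 3 "_"-separated parts)
        if len(key.split("_")) <= 2:
            continue
        # first matching dataset wins, like the if/elif chain
        for name in DATASETS:
            if "_" + name in key:
                counts[name] += 1
                break
    return counts


# Made-up key names for illustration only
sample_keys = [
    "_MD_2_555_SPY_FClose",
    "_MD_2_555_SPY_FClose_Accuracy",
    "_MD_2_555_SPY_FClose_PredictionsDF",
    "_MD_2_556_XLE_FHigh",
    "_MANIFEST",
    "_MD_2_1_IRIS_SepalLength",
]
counts = count_models(sample_keys)
print(dict(counts), sum(counts.values()))
```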



### Redis Machine Learning Data Store in Action

Here's the machine learning data store saving all 10,000+ pre-trained machine learning models to an RDB file, which can then be moved to other environments and entirely different systems to make predictions.

![Storing Over 10000 Pre-trained Machine Learning Models in Redis](http://i.imgur.com/5waiLdD.gif)

## Generalized Data Science Workflow

1. Understanding what is important to the product or business - What can we improve?
1. Iterating on datasets (collect, build, define, implement, feature engineering, etc.)
1. Teaching algorithms to make predictions - What algorithm(s) should we use?
1. Evaluating predictive success - How can we benchmark success between models?
1. Model tuning - Can we improve the success rate by changing how the model learns?
1. Model deployment - How do we take this awesome model to production or deploy it to a mobile app/IoT/drone?
1. Regression testing predictive success with new data points - How good is that model from last year (model deprecation)? 
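
One simple way to benchmark models against each other (step 4) is to score every model on the same held-out data and rank the results. The sketch below uses root mean squared error; the model names and numbers are made up, and `rmse` and `rank_models` are hypothetical helpers rather than Red10 functions.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error; lower is better."""
    errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(errors) / float(len(errors)))


def rank_models(actual, predictions_by_model):
    """Return (model_name, rmse) pairs sorted from best to worst."""
    scores = [(name, rmse(actual, preds))
              for name, preds in predictions_by_model.items()]
    return sorted(scores, key=lambda pair: pair[1])


# Made-up holdout values and predictions from three hypothetical models
actual = [10.0, 12.0, 11.0, 13.0]
predictions_by_model = {
    "model_a": [10.5, 11.5, 11.0, 12.0],
    "model_b": [9.0, 13.0, 12.0, 14.0],
    "model_c": [10.0, 12.0, 11.5, 13.0],
}
ranking = rank_models(actual, predictions_by_model)
best_name, best_score = ranking[0]
print(best_name, best_score)
```

Re-running the same ranking when new data points arrive is a cheap form of the regression testing described in the last step.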


__Additional reading__
- https://www.quora.com/What-is-a-typical-day-like-as-a-data-scientist
- http://www.kdnuggets.com/2016/11/ibm-dsx-data-science-experience.html
- https://www.teamleada.com/handbook

## The Machine Learning Ecosystem

Great tools with a complex, ever-changing user manual

#### Perception
![Space man...space](http://i.imgur.com/Sz7EUCA.jpg)

#### And the Reality

Sorry kerbals. We'll get to space next time I promise!

![Space is super easy](http://i.imgur.com/BnFxfAu.gif)

definitely a data problem...

## The toolchains

#### Easy to use vs granular control

![Space Shuttle Endeavour's Control Panels](http://i.imgur.com/gkPWm7I.jpg)

### What data can we use?

1. Pricing
1. Sales
1. User events
1. Accounting
1. Real Estate
1. Fraud
1. Risk
1. Does it have numbers?


### What algorithm should we use?

1. Who wins the Kaggle competitions? [eXtreme gradient boosting (XGB) won a bunch in 2016](http://www.kdnuggets.com/2016/03/xgboost-implementing-winningest-kaggle-algorithm-spark-flink.html)

1. Many, many more choices
![Linear Regression](http://i.imgur.com/ChidNeu.png)
(source: http://aiplaybook.a16z.com/docs/guides/dl)

1. Start iterating in a notebook
http://nbviewer.jupyter.org/github/jmsteinw/Notebooks/blob/master/XG_Boost_NB.ipynb

## How does XGB work?

#### 1. Time is a factor when training models

![Unique Models](http://i.imgur.com/rbRvbas.png)

#### 2. Highly parameterized

```
def __init__(self, max_depth=3, learning_rate=0.1, n_estimators=100,
                 silent=True, objective="reg:linear",
                 nthread=-1, gamma=0, min_child_weight=1, max_delta_step=0,
                 subsample=1, colsample_bytree=1, colsample_bylevel=1,
                 reg_alpha=0, reg_lambda=1, scale_pos_weight=1,
                 base_score=0.5, seed=0, missing=None):
```
https://github.com/dmlc/xgboost/blob/master/python-package/xgboost/sklearn.py
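
To see how quickly the number of distinct configurations grows, here is a small sketch that expands a search space into keyword-argument dictionaries. The parameter names come from the signature above; the candidate values and the `expand_grid` helper are assumptions chosen for illustration.

```python
import itertools

# Candidate values are made up; parameter names come from the signature above
search_space = {
    "max_depth": [3, 5, 7],
    "learning_rate": [0.01, 0.1, 0.3],
    "n_estimators": [100, 500],
    "subsample": [0.8, 1.0],
}


def expand_grid(space):
    """Yield one keyword-argument dict per combination of values."""
    names = sorted(space)
    for values in itertools.product(*(space[name] for name in names)):
        yield dict(zip(names, values))


configs = list(expand_grid(search_space))
print(len(configs))  # 3 * 3 * 2 * 2 = 36
```

Four parameters with two or three values each already give 36 configurations. Repeating that for several datasets and target columns is one way a model store can climb into the thousands.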

## What's the support story?

#### 1. Updates are a breeze

![Don't worry I got this](https://media.giphy.com/media/2ZFuPKWcSw16E/giphy.gif)

#### 2. You're gonna need a bigger boat

![Eats memory like pacman...chomp](http://i.imgur.com/KTVTwIh.png)

#### 3. What if we just add some new data for this conference?

SELL SELL SELL!

![awww yea my own flash crash indicator perfect](http://i.imgur.com/qSAtWX2.png)

#### 4. More data does not always lead to better predictions

![Just Keep Stirring](http://i.imgur.com/rxGROk8.png)

### How does Red10 work?

1. Two modes: manual and cloud service
1. Runs anywhere with docker (virtualbox, on-prem, AWS, OpenShift, Swarm, Kubernetes)

Please refer to [Appendix A](https://redten.io:8900/notebooks/notebooks/redten/RedTen-Intro.ipynb#Appendix-A---What-is-Red10?) for more details on Red10's architecture.

### 1. Original, manual version using the GitHub Repo

The GitHub repo https://github.com/jay-johnson/sci-pype is built around this workflow:

![Sci-pype Machine Learning Workflow](http://i.imgur.com/UsLbBE2.jpg)

### 2. Red10 - multi-tenant, machine learning REST API built with Jupyter integration

![AWS Deployment](http://i.imgur.com/LD1jvKY.png)

### Where can this be used?

![Machine Learning Use Cases](http://i.imgur.com/Ua49NAP.jpg)

Using Red10 for price forecasting:

### What's next?

##### [Pricing Multi-Model Forecast](https://redten.io/forecast/) 

Please refer to the appendices for architecture slides and developer-centric tooling for reviewing offline.

### Appendix A - What is Red10?

1. A containerized, distributed machine learning platform for streamlining analysis to build highly predictive models
1. Multi-tenant REST API wrapping https://github.com/jay-johnson/sci-pype inside of Django REST Framework with JWT authentication.
1. Users build, train and predict using machine learning models (https://github.com/dmlc/xgboost) housed in Redis (Tensorflow coming soon).
1. Export and import pre-trained models with archival to S3.
1. Use the same API to make new predictions using pre-trained models.
1. Streamline analysis - every column in a dataset will be analyzed and compared.
1. Ensemble learning - every column gets a distinct, trained model for helping improve predictive accuracy.
1. Horizontally scalable machine learning cloud - maximize infrastructure by scaling up the number of Celery workers consuming published jobs out of Redis.
1. S3 integration - images, analysis, predictions, export + import pre-trained models, manifest file.
1. Security (pre-trained models, analysis, predictions) - the user must present a secret key to access any of the machine learning job's analysis and predictions.
1. Swagger API - easy to build new web applications derived from the service layer.
1. Centralized logging (Elasticsearch + Logstash + Kibana) - logs from Django REST Framework + Celery are published to logstash over Redis.
1. Search historical analysis using Elasticsearch - machine learning jobs are automatically published as json documents to Elasticsearch on successful completion.
1. Dockerized - runs anywhere with docker (virtualbox, on-prem, AWS, OpenShift, Swarm, K8).
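
The "every column gets a distinct, trained model" idea from the list above might be set up along these lines. This is only a guess at the shape, not Red10's code: `build_column_jobs` is hypothetical, and the field names are borrowed from the job search output shown in Appendix F.

```python
def build_column_jobs(columns, ignore_features):
    """Build one hypothetical job spec per usable column."""
    usable = [c for c in columns if c not in ignore_features]
    jobs = []
    for target in usable:
        jobs.append({
            # each usable column takes a turn as the prediction target
            "target_column": target,
            # the remaining usable columns become the features
            "feature_column_names": [c for c in usable if c != target],
            "ignore_features": list(ignore_features),
        })
    return jobs


columns = ["Ticker", "Date", "FHigh", "FLow", "FOpen", "FClose", "FVolume"]
ignore = ["Ticker", "Date"]
jobs = build_column_jobs(columns, ignore)
for job in jobs:
    print(job["target_column"], "<-", job["feature_column_names"])
```

Each job spec could then be published for a worker to train, so five usable columns would yield five separately trained models.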


### Appendix B - Red10 Machine Learning Workflow

### Build, Train, Analyze and Store

![Easy as 1-2-3](http://i.imgur.com/HuRpcws.png)

### Make New Predictions

![Once trained and stored, new predictions use the same api](http://i.imgur.com/XWBZiGz.png)

### Moving Model Artifacts - Export / Import

![Treat trained models like build artifacts](http://i.imgur.com/TyDi3xD.png)
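
Treating a model like a build artifact usually means packaging it with enough metadata to verify it on the other side. The sketch below is an assumption about what such a package could look like, not Red10's export format: `export_model` and `import_model` are hypothetical, and the artifact is a JSON document holding a name, a SHA-256 checksum and the base64-encoded pickle.

```python
import base64
import hashlib
import json
import pickle


def export_model(name, model):
    """Package a model as a JSON artifact with a checksum."""
    blob = pickle.dumps(model, protocol=2)
    return json.dumps({
        "name": name,
        "sha256": hashlib.sha256(blob).hexdigest(),
        "payload": base64.b64encode(blob).decode("ascii"),
    })


def import_model(artifact_json):
    """Verify the checksum, then restore the model."""
    artifact = json.loads(artifact_json)
    blob = base64.b64decode(artifact["payload"])
    if hashlib.sha256(blob).hexdigest() != artifact["sha256"]:
        raise ValueError("checksum mismatch for " + artifact["name"])
    # Only unpickle artifacts that come from a trusted source
    return artifact["name"], pickle.loads(blob)


artifact = export_model("spy-close-v1", {"algo_name": "xgb-regressor", "depth": 3})
name, model = import_model(artifact)
print(name, model)
```

The checksum detects corrupted transfers before anything is unpickled, and the JSON text can be stored in S3 or moved between environments like any other build output.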


### Appendix C - Red10 Manifest Mapping to Models in Redis

### Storing Multiple Models in Redis

![Forecast storing multiple models in Redis](http://i.imgur.com/oDS9iSE.gif)

### Analysis Archival in S3

![Bucket per user or co-located](http://i.imgur.com/lEET3Fx.png)



### Appendix D - Additional Presentations

##### 1. [IRIS Multi-Model Predictions](https://redten.io/predictions/)
##### 2. [Screen recorded - IRIS Multi-Model Predictions](https://redten.io/recorded-iris/)
##### 3. [Screen recorded - Pricing Multi-Model Forecast](https://redten.io/recorded-forecast/) 

### Appendix E - Redis Key Overview (Non-ML Data Store)

Here is an overview of how Red10 uses Redis databases and keys for handling everything outside of the machine learning use cases.

In [2]:
from __future__ import print_function
import sys, os, requests, json, datetime

# Load the environment and login the user
from src.common.load_redten_ipython_env import user_token, user_login, csv_file, run_job, core, api_urls, ppj, rt_url, rt_user, rt_pass, rt_email, lg, good, boom, anmt, mark, ppj, uni_key, rest_login_as_user, rest_full_login, wait_for_job_to_finish, wait_on_job, get_job_analysis, get_job_results, get_analysis_manifest, get_job_cache_manifest, build_prediction_results, build_forecast_results, get_job_cache_manifest, search_ml_jobs, show_logs, show_errors, ipyImage, ipyHTML, ipyDisplay, pd, np

# header
lg("")
good("Starting Redis Key Analysis: " + str(datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S")))
lg("")

# walk the redis dbs
for db_idx in range(0, 16):
    db_file = "/tmp/db-" + str(db_idx) + "-keys"
    os.system("echo \"select " + str(db_idx) + " \n keys *\" | redis-cli > " + str(db_file))
    database_keys = {}
    if os.path.exists(db_file): # if the file exists
        with open(db_file) as f: # open it
            lines_to_parse = f.readlines() # read all the lines
            for org_line in lines_to_parse:
                cur_line = org_line.rstrip("\n").strip().lstrip() # remove the newline and any other whitespace
                if str(cur_line) != "OK" and str(cur_line) != "":
                    database_keys[cur_line] = True
                # if it's not the redis OK response
            # for all lines
        # with the db file open
    else:
        boom("Failed parsing Redis db_file=" + str(db_file))
    # end of parsing db_file
    
    anmt("Redis DB=" + str(db_idx) + " keys=" + str(len(database_keys)))

    # used for manual inspection
    sample_cache_record = {}
    
    if db_idx == 0:
        equity_ticker = "SPY"
        num_equities = 0
        detailed_keys = []
        for idx,k in enumerate(database_keys):
            if "EQTY_" in str(k) or "EQID_" in str(k):
                num_equities += 1
            else:
                lg(" - key=" + str(idx) + " name: " + str(k))
                detailed_keys.append(k)
                
                # Pull this record from the "STOCKS:EQ_DAILY_SPY" 
                # redis location => <cache app name:redis key> 
                if "EQ_DAILY_SPY" == k:
                    sample_cache_record = core.get_cache_from_redis(core.m_rds["STOCKS"], k, False, False)
                # end of pulling a sample
                
        # for the large db keyset
        
        if len(sample_cache_record) > 0:
            lg("")
            lg("Daily Sticks for Ticker(" + str(equity_ticker) + ") StartDate(" + str(sample_cache_record["Record"]["StartDate"]) + ") Sticks(" + str(len(sample_cache_record["Record"]["Sticks"])) + ")", 6)
            lg("Date, High, Low, Open, Close, Volume", 5)
            for idx,record in enumerate(reversed(sample_cache_record["Record"]["Sticks"])):
                lg(record["Date"] + ", " + record["High"] + ", " + record["Low"] + ", " + record["Open"] + ", " + record["Close"] + ", " + record["Volume"])
                # stop after a few
                if idx > 10:
                    break
            # end for all sticks in the cache
            
            lg("")
        # end of inspecting the sample record
        
        lg("DB=" + str(db_idx) + " detailed_keys=" + str(len(detailed_keys)) + " equities=" + str(num_equities))
    else:
        for k in database_keys:
            if "session:" in k:
                lg(" - key: session:<redacted>")
            else:
                lg(" - key: " + str(k))
        # end of for all keys
    # end of for the post processing keyset in this redis db
    
    lg("---------------------------------------------")
    lg("")
    
# end for all db files


Starting Redis Key Analysis: 2017-05-23 18:20:41

Redis DB=0 keys=16856
 - key=279 name: DS_01_day_XLF_STICKS
 - key=421 name: _STATS_DAILY_SUMMARY_XLF
 - key=949 name: _OPTS_XLF_2017-05-19
 - key=1159 name: _ALLBESTSSPREADS_XLU
 - key=1166 name: _ALLBESTSSPREADS_XLF
 - key=1167 name: _ALLBESTSSPREADS_XLE
 - key=1171 name: _ALLBESTSSPREADS_XLI
 - key=1368 name: _OPTS_XLU_2017-06-16
 - key=1758 name: _390_min_XLU
 - key=1843 name: _LST_SPY_PRICING
 - key=3408 name: _390_min_XLI
 - key=3768 name: _OPTS_XLI_LATEST
 - key=4979 name: _OPTS_SPY_2017-06-16
 - key=5013 name: _OPTS_XLF_2017-06-16
 - key=5595 name: _STATS_DAILY_SUMMARY_XLE
 - key=5599 name: _STATS_DAILY_SUMMARY_XLI
 - key=5601 name: _STATS_DAILY_SUMMARY_XLU
 - key=6129 name: _LAST_390_min_XLI
 - key=6131 name: _LAST_390_min_XLF
 - key=6132 name: _LAST_390_min_XLE
 - key=6138 name: _LAST_390_min_XLU
 - key=6589 name: _OPTS_XLE_2017-05-19
 - key=6654 name: _OPTS_XLU_LATEST
 - key=6713 name: _STATS_DAILY_SUMMARY_

### Appendix F - Debugging Tools

### Develop with Swagger

![Red10 Swagger](http://i.imgur.com/fYvg1mh.gif)

### Search Jobs in Elasticsearch

Red10 is running an ELK stack for searching by user, dataset identifiers, and jobs.

![ELK](http://i.imgur.com/cUCebca.png)

Once a job completes, it is automatically published to Elasticsearch.

In [3]:
from __future__ import print_function
import sys, os, requests, json, datetime

# Load the environment and login the user
from src.common.load_redten_ipython_env import user_token, user_login, csv_file, run_job, core, api_urls, ppj, rt_url, rt_user, rt_pass, rt_email, lg, good, boom, anmt, mark, ppj, uni_key, rest_login_as_user, rest_full_login, wait_for_job_to_finish, wait_on_job, get_job_analysis, get_job_results, get_analysis_manifest, get_job_cache_manifest, build_prediction_results, build_forecast_results, get_job_cache_manifest, search_ml_jobs, show_logs, show_errors, ipyImage, ipyHTML, ipyDisplay, pd, np

search_req = {
        "title" : "", # job title with completion
        "dsname" : "SPY", # dataset name with completion
        "desc" : "", # description with completion
        "features" : "", # feature search
        "target_column" : "" # name of target column for this analysis
    }
job_search = {}
job_res = {}
if len(search_req) == 0 :
    boom("Please create a valid search request")
else:
    job_res = search_ml_jobs(search_req)

    if job_res["status"] != "SUCCESS":
        # job_id is not defined in this cell, so report the search failure itself
        boom("Job search failed with status=" + str(job_res["status"]) + " err=" + str(job_res.get("error")))
    else:
        job_search = job_res["record"]
        anmt("Job Matches=" + str(len(job_search)))
        lg(ppj(job_search), 5)
    # found jobs
# end of searching for job

Searching ML Jobs url=https://redten.io/ml/search/
SUCCESS - Job Search Response Status=200 Reason=OK
Found Job={'target_column': '', 'desc': '', 'dsname': 'SPY', 'features': '', 'title': ''} results
Job Matches=1
{
    "jobs": [
        {
            "algo_name": "xgb-regressor",
            "desc": "Forecast simulation - 2017-05-23 22:00:09",
            "ds_name": "SPY",
            "feature_column_names": [
                "FHigh",
                "FLow",
                "FOpen",
                "FClose",
                "FVolume"
            ],
            "ignore_features": [
                "Ticker",
                "Date",
                "FDate",
                "FPrice",
                "DcsnDate",
                "Decision"
            ],
            "images": [
                {
                    "id": 12006,
                    "image": "https://rt-media.s3.amazonaws.com/media/imagesml/20170523/2_555_12006_75950bb597ae44cf.png",
          

### Pulling the Latest Error Logs from Elasticsearch

In [4]:
boom("Finding latest error logs:")
show_errors(limit=50)

Finding latest error logs:
2017-05-24 06:38:28 - ERROR - Demo error logs from celery workers - uuid=ec9440aa1c764cf4a33dcdfe6026f88
2017-05-24 06:38:28 - ERROR - Demo error logs from celery workers - uuid=f650f23f0a664927a96a71875ca3068
2017-05-24 06:38:28 - ERROR - Demo error logs from celery workers - uuid=08ceca09a27b4f78b930a005c274dc9
2017-05-24 06:38:28 - ERROR - Demo error logs from celery workers - uuid=9ebef890badc4c1580465d8b91dd63d
2017-05-24 06:38:28 - ERROR - Demo error logs from celery workers - uuid=f3f20e7324e44627b3f92a7799eaa3a
2017-05-24 06:38:28 - ERROR - Demo error logs from celery workers - uuid=2a9c417e87f64166bd8c2a6a6599f84
2017-05-24 06:38:28 - ERROR - Demo error logs from celery workers - uuid=68362d7935b04f8388923c4ed824818
2017-05-24 06:38:28 - ERROR - Demo error logs from celery workers - uuid=68e882a944904f7bb6d936720b97721
2017-05-24 06:38:28 - ERROR - Demo error logs from celery workers - uuid=b5746788a0044d00babb9f07214c8b2
2017-05-24 06:38:28

### Pulling the Latest Logs from Elasticsearch

In [5]:
anmt("Finding latest logs:")
show_logs(limit=50)

Finding latest logs:
2017-05-24 06:38:28 - INFO - Demo info logs from celery workers - uuid=75e406e1515d451c8af06136c52e478
2017-05-24 06:38:28 - INFO - Demo info logs from celery workers - uuid=ffb1ac5fe34f4130a8eeea4fe188ef9
2017-05-24 06:38:28 - INFO - Demo info logs from celery workers - uuid=1f6954c4cd5e46ed9b0b0584199d725
2017-05-24 06:38:28 - INFO - Demo info logs from celery workers - uuid=c9a5a8f696a04c4bb7716a559718224
2017-05-24 06:38:28 - INFO - Demo info logs from celery workers - uuid=2b3850f444cd4e0cad5f1f1ae72168c
2017-05-24 06:38:28 - INFO - Demo info logs from celery workers - uuid=d056dc9c3c604d9a916a86127d7e31a
2017-05-24 06:38:28 - INFO - Demo info logs from celery workers - uuid=8d111ebd1eee4c68ac031dcd1f4c530
2017-05-24 06:38:28 - INFO - Demo info logs from celery workers - uuid=5aa041864fb24c34988b259330d1332
2017-05-24 06:38:28 - INFO - Demo info logs from celery workers - uuid=fd2647c217304a9ab87d1f31adb6ea9
2017-05-24 06:38:28 - INFO - Demo info logs

### Appendix G - Terminology and Definitions

1. What is a Machine Learning Model?
An algorithm is the general approach you will take. The model is what you get when you run the algorithm over your training data and what you use to make predictions on new data. You can generate a new model with the same algorithm with different data, or a different model from the same data with a different algorithm. https://www.quora.com/What-is-the-difference-between-machine-learning-model-and-ML-algorithm
1. XGB / eXtreme Gradient Boosting - http://xgboost.readthedocs.io/en/latest/model.html
1. Pub/Sub - publisher to subscriber - https://en.wikipedia.org/wiki/Publish%E2%80%93subscribe_pattern
1. RDB - https://redis.io/topics/persistence
1. Redis - https://redislabs.com/ and https://redis.io/ 
1. Docker - https://www.docker.com/
1. Django Rest Framework - http://www.django-rest-framework.org/
1. Celery - http://celery.readthedocs.io/en/latest/
1. Jupyter - http://jupyter.org/
1. S3 - https://aws.amazon.com/s3/
1. Elasticsearch - https://www.elastic.co/
1. Logstash - https://www.elastic.co/products/logstash
1. Kibana - https://www.elastic.co/products/kibana
1. nginx - https://nginx.org/en/

### Appendix H - Machine learning tools and products

1. Powerful tools and predictive systems that take time to master, automate and integrate: https://github.com/dmlc, https://www.tensorflow.org, http://caffe.berkeleyvision.org/, https://github.com/Theano/Theano, https://keras.io/
1. Automated cloud platforms with vendor lock: https://aws.amazon.com/machine-learning/, https://azure.microsoft.com/en-us/services/machine-learning/, https://www.ibm.com/watson/
1. Interactive notebook technologies - https://jupyter-notebook.readthedocs.io/en/latest/
1. Tons of visualization tools