Alejandro Saucedo
@AxSaucedo
in/axsaucedo
[NEXT]
![portrait](images/alejandro.jpg)
Alejandro Saucedo
[NEXT]
An overview of caveats in deploying ML
Very high level talk
Intuitive overview of ML
Going distributed, and beyond
[NEXT]
Today we are...
[NEXT]
Let's jump the hype-train!
An ML framework for crypto-data
Supporting heavy compute/memory ML
Can our system survive the 2017 crypto-craze?
[NEXT]
All historical data from top 100 cryptos
Data goes from beginning to 11/2017
563871 daily-price (close) points
Objectives:
* Supporting heavy ML computations
* Supporting increasing traffic
[NEXT]
from crypto_ml.data_loader import CryptoLoader as cl
loader = cl()
loader.get_prices("bitcoin")
> array([ 134.21, 144.54, 139. , ..., 3905.95, 3631.04, 3630.7 ])
loader.get_df("bitcoin").head()
> Date Open High Low Close Market Cap
> 1608 2013-04-28 135.30 135.98 132.10 134.21 1,500,520,000
> 1607 2013-04-29 134.44 147.49 134.00 144.54 1,491,160,000
> 1606 2013-04-30 144.00 146.93 134.05 139.00 1,597,780,000
> 1605 2013-05-01 139.00 139.89 107.72 116.99 1,542,820,000
> 1604 2013-05-02 116.38 125.60 92.28 105.21 1,292,190,000
[NEXT]
from crypto_ml.manager import CryptoManager as cm
manager = cm()
manager.send_tasks()
> bitcoin [[4952.28323284 5492.85474648 6033.42626011 6573.99777374 7114.56928738
> 7655.14080101 8195.71231465 8736.28382828 9276.85534192 9817.42685555]]
> bitconnect [[157.70136155 181.86603134 206.03070113 230.19537092 254.36004071
> 278.5247105 302.6893803 326.85405009 351.01871988 375.18338967]]
[NEXT]
https://github.com/axsauze/crypto-ml
http://github.com/axsauze/industrial-machine-learning
[NEXT]
[NEXT]
[NEXT]
Crypto-ML Ltd. managed to obtain access to a unique dataset, which allowed them
to build their initial prototype and raise massive amounts of VC money
import random

def predict_crypto(crypto_data):
    # I have no clue what I'm doing
    return crypto_data * random.uniform(0, 1)
[NEXT]
[NEXT] Given some input data, predict the correct output
Let's try to predict whether a shape is a square or a triangle
[NEXT]
- Imagine a 2-d plot
- The x-axis is the area of the input shape
- The y-axis is the perimeter of the input shape
[NEXT]
**x̄** is input (area & perimeter)
**m** and **b** are weights/bias
The result
(i.e. if it's larger than 0.5 it's triangle otherwise square)
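
A minimal sketch of the decision rule above, assuming the linear score is squashed with a sigmoid so that 0.5 is a meaningful threshold. The function names and the weight values are hypothetical: they are hand-picked for this one pair of shapes, not learned.

```python
import math


def shape_score(area, perimeter, m, b):
    """Linear score m . x + b, squashed into (0, 1) with a sigmoid (an assumption)."""
    z = m[0] * area + m[1] * perimeter + b
    return 1.0 / (1.0 + math.exp(-z))


def classify_shape(area, perimeter, m, b):
    """Apply the 0.5 threshold: above it is a triangle, otherwise a square."""
    return "triangle" if shape_score(area, perimeter, m, b) > 0.5 else "square"


# Hypothetical, hand-picked weights and bias (not learned from data)
M = (-2.0, 2.0)
B = -9.0

# A square with side 2: area 4, perimeter 8
SQUARE = (4.0, 8.0)

# An equilateral triangle with the same area has a longer perimeter
_side = math.sqrt(16.0 / math.sqrt(3.0))
TRIANGLE = (4.0, 3.0 * _side)

print(classify_shape(*SQUARE, M, B))
print(classify_shape(*TRIANGLE, M, B))
```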
[NEXT]
[NEXT]
[NEXT]
[NEXT]
We optimise the model by minimising its loss.
Keep adjusting the weights...
...until loss is not getting any smaller.
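
The loop described above can be sketched as plain gradient descent on a mean squared error loss for f(x) = mx + b. The learning rate, tolerance and toy data are assumptions chosen for illustration.

```python
def mse_loss(m, b, xs, ys):
    """Mean squared error of the line m*x + b on the data."""
    return sum((m * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def fit_by_gradient_descent(xs, ys, learning_rate=0.05, tolerance=1e-12, max_steps=10_000):
    """Keep adjusting m and b until the loss stops getting smaller."""
    m, b = 0.0, 0.0
    n = len(xs)
    loss = mse_loss(m, b, xs, ys)
    for _ in range(max_steps):
        # Gradients of the MSE loss with respect to m and b
        grad_m = 2.0 / n * sum((m * x + b - y) * x for x, y in zip(xs, ys))
        grad_b = 2.0 / n * sum((m * x + b - y) for x, y in zip(xs, ys))
        m -= learning_rate * grad_m
        b -= learning_rate * grad_b
        new_loss = mse_loss(m, b, xs, ys)
        improved_by = loss - new_loss
        loss = new_loss
        if improved_by < tolerance:
            break
    return m, b, loss


# Toy data generated from y = 2x + 1
XS = [0.0, 1.0, 2.0, 3.0, 4.0]
YS = [1.0, 3.0, 5.0, 7.0, 9.0]
print(fit_by_gradient_descent(XS, YS))
```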
[NEXT]
When it finishes, we find optimised weights and biases
i.e.
[NEXT]
Once we have our function, we can predict NEW data!
[NEXT]
Please collect your certificates after the talk
These are valid in:
- Your Linkedin profile
- Non-tech Meetups and Parties
- Any time you reply to a tweet
[NEXT] The Crypto-ML devs asked themselves...
from crypto_ml.data_loader import CryptoLoader
btc = CryptoLoader().get_df("bitcoin")
btc.head()
> Date Open High Low Close Market Cap
> 1608 2013-04-28 135.30 135.98 132.10 134.21 1,500,520,000
> 1607 2013-04-29 134.44 147.49 134.00 144.54 1,491,160,000
> 1606 2013-04-30 144.00 146.93 134.05 139.00 1,597,780,000
> 1605 2013-05-01 139.00 139.89 107.72 116.99 1,542,820,000
> 1604 2013-05-02 116.38 125.60 92.28 105.21 1,292,190,000
...can this be used for our cryptocurrency price data?
[NEXT]
Processing sequential data requires a different approach.
Instead of trying to predict two classes...
...we want to predict future steps
[NEXT]
Sequential models are often used to predict future data.
The same approach still applies:
f(x) = mx + b
We find the weights and biases,
but on time-sequence data, e.g. prices, words, characters, etc.
[NEXT]
Predicting prices by fitting a line on a set of time-series points
[NEXT]
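from sklearn import linear_model

def predict(prices, times, predict=10):
    # Fit a straight line to the (time, price) points
    model = linear_model.LinearRegression()
    model.fit(times, prices)
    # Extend the timeline by `predict` future steps and evaluate the line there
    predict_times = get_prediction_timeline(times, predict)
    return model.predict(predict_times)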
[NEXT]
from crypto_ml.data_loader import CryptoLoader
cl = CryptoLoader()
df = cl.get_df("bitcoin")
times = df[["Date"]].values
prices = df[["Close"]].values
results = predict(prices, times, 5)
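
For intuition, here is a dependency-free sketch of the same idea using the closed-form least-squares line. `get_prediction_timeline` is not shown in the talk, so the version below is a hypothetical stand-in, and it assumes times are evenly spaced numbers (day indices) rather than date strings.

```python
def get_prediction_timeline(times, steps):
    """Hypothetical helper: continue an evenly spaced timeline by `steps` points."""
    step = times[-1] - times[-2]
    return [times[-1] + step * (i + 1) for i in range(steps)]


def least_squares_line(times, prices):
    """Closed-form slope and intercept of the best-fit line."""
    n = len(times)
    mean_t = sum(times) / n
    mean_p = sum(prices) / n
    covariance = sum((t - mean_t) * (p - mean_p) for t, p in zip(times, prices))
    variance = sum((t - mean_t) ** 2 for t in times)
    slope = covariance / variance
    return slope, mean_p - slope * mean_t


def predict_prices(prices, times, steps=10):
    """Fit a line on past points and evaluate it on the future timeline."""
    slope, intercept = least_squares_line(times, prices)
    return [slope * t + intercept for t in get_prediction_timeline(times, steps)]


# First five bitcoin closes from the table shown earlier, indexed by day
closes = [134.21, 144.54, 139.00, 116.99, 105.21]
days = list(range(len(closes)))
print(predict_prices(closes, days, steps=3))
```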
[NEXT]
#cutting edge tech
[NEXT]
[NEXT] The Crypto-ML team realised that their use cases were much more complex
[NEXT] The team had to learn many critical points about machine learning development!
- Extending our feature space
- Increasing number of inputs
- Regularisation techniques (dropout, batch normalisation)
- Normalising datasets
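
As one concrete instance of the last point, a minimal min-max normalisation sketch. The function names are hypothetical, and min-max scaling is only one of several common choices.

```python
def min_max_normalise(values):
    """Scale values into [0, 1]; also return the bounds needed to undo it."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # A constant series carries no scale information
        return [0.0 for _ in values], lo, hi
    return [(v - lo) / span for v in values], lo, hi


def denormalise(scaled, lo, hi):
    """Map normalised values (e.g. model outputs) back to the original scale."""
    return [s * (hi - lo) + lo for s in scaled]


scaled, lo, hi = min_max_normalise([10.0, 20.0, 30.0])
print(scaled, denormalise(scaled, lo, hi))
```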
[NEXT] But they also stumbled upon some neural network tutorials...
[NEXT]
f(x) = mx + b
Now on a simple perceptron function
in a neural network
[NEXT]
[NEXT]
This gives the function more flexibility
[NEXT]
This gives more flexibility for learning
[NEXT]
[NEXT]
Deep Networks — many hidden layers
[NEXT]
For sequential models?
[NEXT]
One node represents a full layer of neurons.
[NEXT]
One node represents a full layer of neurons.
[NEXT]
Previous predictions help make the next prediction.
Each prediction is a time step.
[NEXT]
Hidden layer's input includes the output of itself during the last run of the network.
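
A toy sketch of that feedback loop with a single recurrent unit. The scalar weights `w_x`, `w_h`, `b` and the tanh activation are assumptions for illustration; a real layer uses weight matrices.

```python
import math


def run_recurrent_cell(inputs, w_x, w_h, b, h0=0.0):
    """Feed a sequence through one recurrent unit, one time step at a time."""
    h = h0
    outputs = []
    for x in inputs:
        # The new hidden state depends on the current input AND the previous state
        h = math.tanh(w_x * x + w_h * h + b)
        outputs.append(h)
    return outputs


# Same input at every step, but the output changes because of the memory term
print(run_recurrent_cell([1.0, 1.0, 1.0], w_x=0.5, w_h=0.8, b=0.0))
```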
[NEXT]
Cost function is based on getting the prediction right!
[NEXT]
def deep_predict(prices):
    # Build model
    model = get_rnn_model()
    # Compute the weights
    x, y = build_lstm_data(prices, 50)
    model.fit(x, y, batch_size=512, nb_epoch=1, validation_split=0.05)
    # Return the prediction
    return rnn_predict(model, x, prices)
- Build model
- Compute the weights
- Return the prediction
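
`build_lstm_data` is not shown in the talk. A plausible reading, sketched here under that assumption, is a sliding window: each training input is a run of consecutive prices and the target is the price that follows it. The name `build_windows` is hypothetical.

```python
def build_windows(prices, window):
    """Turn a price series into (input window, next value) training pairs."""
    xs, ys = [], []
    for start in range(len(prices) - window):
        xs.append(prices[start:start + window])
        ys.append(prices[start + window])
    return xs, ys


x, y = build_windows([1, 2, 3, 4, 5], window=2)
print(x, y)
```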
[NEXT]
from sklearn import linear_model

def predict(prices, times, predict=10):
    model = linear_model.LinearRegression()
    model.fit(times, prices)
    predict_times = get_prediction_timeline(times, predict)
    return model.predict(predict_times)

def deep_predict(prices):
    model = get_rnn_model()
    x, y = build_lstm_data(prices, 50)
    model.fit(x, y, batch_size=512, nb_epoch=1, validation_split=0.05)
    return rnn_predict(model, x, prices)
[NEXT]
from keras.layers.core import Dense, Activation, Dropout
from keras.layers.recurrent import LSTM
from keras.models import Sequential
import lstm

def get_rnn_model():
    model = Sequential()
    model.add(LSTM(input_dim=1, output_dim=50, return_sequences=True))
    model.add(Dropout(0.2))
    model.add(LSTM(100, return_sequences=False))
    model.add(Dropout(0.2))
    model.add(Dense(output_dim=1))
    model.add(Activation('linear'))
    model.compile(loss="mse", optimizer="rmsprop")
    return model
A linear dense layer to aggregate the data into a single value
Compile with mean sq. error & gradient descent as optimizer
Simple!
[NEXT]
from crypto_ml.data_loader import CryptoLoader
cl = CryptoLoader()
df = cl.get_df("bitcoin")
prices = df[["Close"]].values
results = deep_predict(prices)
[NEXT]
In this example we are training and predicting in the same function.
Normally you would train your model, and then run the model in "production" for predictions
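
A minimal sketch of that separation, assuming a model simple enough to store as JSON (here a fitted line). The file name and function names are hypothetical; a real deployment would use the ML framework's own save/load format.

```python
import json
import os
import tempfile


def train(xs, ys):
    """Training step: fit a line and return its parameters."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return {"slope": slope, "intercept": mean_y - slope * mean_x}


def save_model(params, path):
    with open(path, "w") as handle:
        json.dump(params, handle)


def load_model(path):
    with open(path) as handle:
        return json.load(handle)


def predict(params, x):
    """Prediction step: no training data needed, only the stored parameters."""
    return params["slope"] * x + params["intercept"]


# Offline: train once and persist the result
model_path = os.path.join(tempfile.gettempdir(), "crypto_line_model.json")
save_model(train([0, 1, 2], [1, 3, 5]), model_path)

# "Production": load the stored model and serve predictions
print(predict(load_model(model_path), 3))
```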
[NEXT]
Are we done then?
The fun is just starting
[NEXT]
[NEXT]
After CryptoML was caught using deep learning...
...they were featured in the top 10 global news
[NEXT]
Now they have quite a few users coming in every day
Each user is running several ML algorithms concurrently
They tried getting larger and larger AWS servers
[NEXT]
- Machine learning is known for being compute-heavy
- But it's often forgotten how memory-heavy it is
- I'm talking VERY heavy: holding whole models in memory
- Scaling to bigger instances with more cores is expensive
- Having everything in one node is a single point of failure
[NEXT]
The Crypto-ML Devs thought going distributed was too hard
It's not.
[NEXT]
- Distributed
- Asynchronous
- Task Queue
- For Python
[NEXT]
import utils

def deep_predict(prices, times, predict=10):
    model = utils.get_rnn_model()
    model.fit(times, prices, batch_size=512, nb_epoch=1, validation_split=0.05)
    predict_times = get_prediction_timeline(times, predict)
    return model.predict(predict_times)
[NEXT]
from celery import Celery
import utils
from utils import load, dump

# Initialise celery with rabbitMQ address
app = Celery('crypto_ml',
             backend='amqp://guest@localhost/',
             broker='amqp://guest@localhost/')

# Add decorator for task (can also be class)
@app.task
def deep_predict(d_prices, d_times, predict=10):
    # Load from stringified binaries (Pandas Dataframes)
    prices = load(d_prices)
    times = load(d_times)
    model = utils.get_rnn_model()
    model.fit(times, prices, batch_size=512, nb_epoch=1, validation_split=0.05)
    predict_times = get_prediction_timeline(times, predict)
    return dump(model.predict(predict_times))
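
The `load` and `dump` helpers come from a `utils` module that is not shown. One hypothetical way to write them, assuming "stringified binaries" means pickled objects encoded as text so they can travel through the message broker:

```python
import base64
import pickle


def dump(obj):
    """Serialise any picklable object into an ASCII string."""
    return base64.b64encode(pickle.dumps(obj)).decode("ascii")


def load(text):
    """Inverse of dump. Only call this on data from a trusted source."""
    return pickle.loads(base64.b64decode(text.encode("ascii")))


payload = dump({"bitcoin": [134.21, 144.54, 139.0]})
print(type(payload).__name__, load(payload))
```

Unpickling executes arbitrary code, so this is only reasonable when producers and workers trust each other.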
[NEXT]
from crypto_ml.models import deep_predict
$ celery -A crypto_ml worker
Monitor the activity logs:
$ celery -A crypto_ml worker
Darwin-15.6.0-x86_64-i386-64bit 2018-03-10 00:43:28
[config]
.> app: crypto_ml:0x10e81a9e8
.> transport: amqp://user:**@localhost:5672//
.> results: amqp://
.> concurrency: 8 (prefork)
.> task events: OFF (enable -E to monitor tasks in this worker)
[queues]
.> celery exchange=celery(direct) key=celery
[NEXT]
Now we just need to make the producer!
We can just follow the same recipe
[NEXT]
cl = CryptoLoader()
results = {}
# Compute results
for name in cl.get_crypto_names():
    prices, times = cl.get_prices(name)
    result = deep_predict(prices, times)
    results[name] = result
# Print results
for k, v in results.items():
    print(k, v)
[NEXT]
from crypto_ml.data_loader import CryptoLoader
from crypto_ml.models import deep_predict
from utils import load, dump

cl = CryptoLoader()
results = {}
# Send task for distributed computing
for name in cl.get_crypto_names():
    prices, times = cl.get_prices(name)
    task = deep_predict.delay(dump(prices), dump(times))
    results[name] = task
# Wait for results and print
for k, v in results.items():
    p_result = v.get()
    if p_result:
        result = load(p_result)
        print(k, result)
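
The same "send everything first, collect afterwards" pattern can be tried without a broker. This sketch uses a local thread pool from the standard library as a stand-in, not Celery: `submit` plays the role of `.delay()` and `.result()` the role of `.get()`. `fake_predict` is a hypothetical placeholder for the real model.

```python
from concurrent.futures import ThreadPoolExecutor


def fake_predict(prices):
    """Placeholder for an expensive prediction: returns the mean price."""
    return sum(prices) / len(prices)


def predict_all(price_table, workers=4):
    """Submit one task per crypto, then wait for all of the results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Send tasks without blocking
        pending = {name: pool.submit(fake_predict, prices)
                   for name, prices in price_table.items()}
        # Wait for results
        return {name: future.result() for name, future in pending.items()}


print(predict_all({"bitcoin": [1.0, 3.0], "bitconnect": [2.0, 4.0]}))
```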
[NEXT]
By just running the Python in a shell command!
[NEXT]
You can visualise through Celery "Flower"
$ celery -A crypto_ml flower
> INFO: flower.command: Visit me at http://localhost:5555
[NEXT]
We now have ML, and are distributed.
We have surely won.
We can pack our ba- oh, not yet?
[NEXT]
[NEXT]
Their data pipeline is getting unmanageable!
[NEXT]
- There is a growing need to pull data from different sources
- There is a growing need to pre- and post-process data
- As complexity increases, tasks start depending on others
- If a task fails we wouldn't want to run the children tasks
- Some tasks need to be triggered chronologically
- Data pipelines can get quite complex
- Having just celerized tasks run by a Linux cron job gets messy
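
To make the dependency points concrete, here is a toy pipeline runner, assuming Python 3.9+ for `graphlib`. It runs tasks in dependency order and skips any task whose parent did not succeed. The function and task names are hypothetical, and a real scheduler does far more (retries, schedules, monitoring).

```python
from graphlib import TopologicalSorter


def run_pipeline(tasks, dependencies):
    """tasks: name -> callable; dependencies: name -> set of parent task names."""
    status = {}
    for name in TopologicalSorter(dependencies).static_order():
        parents = dependencies.get(name, ())
        if any(status[parent] != "success" for parent in parents):
            # A parent failed or was skipped, so the child must not run
            status[name] = "skipped"
            continue
        try:
            tasks[name]()
            status[name] = "success"
        except Exception:
            status[name] = "failed"
    return status


def broken_transform():
    raise ValueError("bad data")


TASKS = {
    "pull": lambda: None,
    "transform": broken_transform,
    "predict": lambda: None,
    "store": lambda: None,
}
DEPENDENCIES = {
    "pull": set(),
    "transform": {"pull"},
    "predict": {"transform"},
    "store": {"predict"},
}
print(run_pipeline(TASKS, DEPENDENCIES))
```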
[NEXT]
[NEXT]
[NEXT]
The swiss army knife of data pipelines
[NEXT]
- Written in Python!
- Alternative to Chronos (also built by Airbnb) + Luigi
- Has a scheduler (like chronjob, but not like cronjob)
- Can define tasks and dependent tasks (as a pipeline)
- Has real-time visualisation of jobs
- Modular separation between framework and logic
- Can run on top of celery (without any modifications)
- Being introduced to the Apache family (incubation)
- Actively maintained and growing community
- Used by tons of companies (Airbnb, PayPal, Quora)
[NEXT]
- Airflow is not perfect (but the best out there)
- It's not a Lambda/FaaS framework (but can be programmed)
- Is not extremely mature (i.e. still in incubation)
- Airflow is not a data streaming solution (e.g. Storm/Spark Streaming)
[NEXT]
[NEXT]
[NEXT]
[NEXT]
Scheduled task:
- Operator: polls new Crypto-data + triggers each sub-DAG
Sub-DAG:
- Operator: Transforms the data
- Operator: Sends data to the crypto-prediction engine
- Sensor: Polls until the prediction is finished
- Branch: Modifies data and triggers an action based on rules
- Operator: Stores data in database
- Operator: Sends request to trade
[NEXT]
[NEXT]
[NEXT]
[NEXT] The CryptoML team remembers the easy days...
...when they only had a few new users a day
Now they have much heavier traffic!
They are working with large organisations!
They are processing massive loads of ML requests!
### Their DevOps infrastructure
# Can't keep up!
[NEXT]
- Complexity of staging and deploying ML models
- Backwards compatibility of feature-spaces/models
- Distributing load across infrastructure
- Idle resource time minimisation
- Node failure back-up strategies
- Testing of Machine Learning functionality
- Monitoring of ML ecosystem
- And the list goes on and on...
[NEXT]
Package it. Ship it.
[NEXT]
FROM conda/miniconda3-centos7:latest
ADD . /crypto_ml/
WORKDIR /crypto_ml/
# Dependency for one of the ML predictors
RUN yum install mesa-libGL -y
# Install all the dependencies
RUN conda env create -f crypto_ml.yml
and...
docker build -t crypto_ml .
it's done!
[NEXT]
#### Yes it can!
Packaging it up and installing through pip/conda.
FROM conda/miniconda3-centos7:latest
RUN conda install <YOUR_PACKAGE>
Nice and simple!
[NEXT]
version: '2'
services:
  manager:
    container_name: crypto_manager
    image: crypto_ml
    build: .
    links:
      - rabbitmq
    depends_on:
      - rabbitmq
    command: tail -f /dev/null
  worker:
    # No container_name here: a fixed name would prevent scaling past one container
    image: crypto_ml
    build: .
    links:
      - rabbitmq
    depends_on:
      - rabbitmq
    command: /usr/local/envs/crypto_ml/bin/celery -A crypto_ml worker --prefetch-multiplier 1 --max-tasks-per-child 1 -O fair
  rabbitmq:
    container_name: rabbitmq
    image: rabbitmq:3.6.0
    environment:
      - RABBITMQ_DEFAULT_USER=user
      - RABBITMQ_DEFAULT_PASS=1234
    ports:
      - 4369
      - 5672
      - 15672
It all just works automagically!
docker-compose up --scale worker=4
[NEXT]
[NEXT]
Creating all the services and controllers
kubectl create -f k8s/
[NEXT]
- Minikube
- Docker for Mac
[NEXT]
- Elastic Container Service for K8s (AWS)
- CoreOS Tectonic
- Red Hat OpenShift
[NEXT]
- CloudFormation (AWS)
- Terraform
- Kops
[NEXT]
Mini fist-pump
[NEXT] What's next for Crypto-ML?
As with every other tech startup
the roller-coaster keeps going!
[NEXT]
Obtained an intuitive understanding of ML
Learned about caveats in practical ML
Obtained tips on building distributed architectures
Got a taste of elastic DevOps infrastructure
[NEXT]
https://github.com/axsauze/crypto-ml
http://github.com/axsauze/industrial-machine-learning
[NEXT]
![portrait](images/alejandro.jpg)
Alejandro Saucedo