Docker support for PredictionIO
Switch branches/tags
Nothing to show
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Type Name Latest commit message Commit time
Failed to load latest commit information.
.ivy2
bin
charts
deploy
elasticsearch
jupyter-dev
jupyter
localfs
mysql
pgsql
pio
templates
JUPYTER.md
LICENSE
README.md
docker-compose.deploy.yml
docker-compose.jupyter.yml
docker-compose.spark.yml
docker-compose.yml

README.md

Apache PredictionIO Docker

Overview

PredictionIO Docker provides Docker image for use in development and production environment.

Usage

Run PredictionIO with Selectable docker-compose Files

You can choose storages for event/meta/model to select docker-compose.yml.

docker-compose -f docker-compose.yml -f ... up

Supported storages are as below:

Type Storage
Event Postgresql, MySQL, Elasticsearch
Meta Postgresql, MySQL, Elasticsearch
Model Postgresql, MySQL, LocalFS

If you run PredictionIO with Postgresql, run as below:

docker-compose -f docker-compose.yml \
  -f pgsql/docker-compose.base.yml \
  -f pgsql/docker-compose.meta.yml \
  -f pgsql/docker-compose.event.yml \
  -f pgsql/docker-compose.model.yml \
  up

To use localfs as model storage, change as below:

docker-compose -f docker-compose.yml \
  -f pgsql/docker-compose.base.yml \
  -f pgsql/docker-compose.meta.yml \
  -f pgsql/docker-compose.event.yml \
  -f localfs/docker-compose.model.yml \
  up

Tutorial

In this demo, we will show you how to build a recommendation template.

Run PredictionIO environment

The following command starts PredictionIO with an event server. PredictionIO docker image mounts ./templates directory to /templates.

$ docker-compose -f docker-compose.yml \
    -f pgsql/docker-compose.base.yml \
    -f pgsql/docker-compose.meta.yml \
    -f pgsql/docker-compose.event.yml \
    -f pgsql/docker-compose.model.yml \
    up

We provide pio-docker command as an utility for pio command. pio-docker invokes pio command in PredictionIO container.

$ export PATH=`pwd`/bin:$PATH
$ pio-docker status
...
[INFO] [Management$] Your system is all ready to go.

Download Recommendation Template

This demo uses predictionio-template-recommender.

$ cd templates
$ git clone https://github.com/apache/predictionio-template-recommender.git MyRecommendation
$ cd MyRecommendation

Register Application

You need to register this application to PredictionIO:

$ pio-docker app new MyApp1
[INFO] [App$] Initialized Event Store for this app ID: 1.
[INFO] [Pio$] Created a new app:
[INFO] [Pio$]       Name: MyApp1
[INFO] [Pio$]         ID: 1
[INFO] [Pio$] Access Key: i-zc4EleEM577EJhx3CzQhZZ0NnjBKKdSbp3MiR5JDb2zdTKKzH9nF6KLqjlMnvl

Since an access key is required in subsequent steps, set it to ACCESS_KEY.

$ ACCESS_KEY=i-zc4EleEM577EJhx3CzQhZZ0NnjBKKdSbp3MiR5JDb2zdTKKzH9nF6KLqjlMnvl

engine.json contains an application name, so replace INVALID_APP_NAME with MyApp1.

...
"datasource": {
  "params" : {
    "appName": "MyApp1"
  }
},
...

Import Data

To import training data to Event server for PredictionIO, this template provides an import tool. The tool depends on PredictionIO Python SDK and install as below:

$ pip install predictionio

and then import data:

$ curl https://raw.githubusercontent.com/apache/spark/master/data/mllib/sample_movielens_data.txt --create-dirs -o data/sample_movielens_data.txt
$ python data/import_eventserver.py --access_key $ACCESS_KEY

Build Template

This is Scala based template. So, you need to build this template by pio command.

$ pio-docker build --verbose

Train and Create Model

To train a recommendation model, run train sub-command:

$ pio-docker train

Deploy Model

If a recommendation model is created successfully, deploy it to Prediction server for PredictionIO.

$ pio-docker deploy

You can check predictions as below:

$ curl -H "Content-Type: application/json" \
-d '{ "user": "1", "num": 4 }' http://localhost:8000/queries.json

Advanced Topics

Run with Elasticsearch

For Elasticsearch, Meta and Event storage are available. To start PredictionIO with Elasticsearch,

docker-compose -f docker-compose.yml \
  -f elasticsearch/docker-compose.base.yml \
  -f elasticsearch/docker-compose.meta.yml \
  -f elasticsearch/docker-compose.event.yml \
  -f localfs/docker-compose.model.yml \
  up

Run with Spark Cluster

Adding docker-compose.spark.yml, you can use Spark cluster on pio train.

docker-compose -f docker-compose.yml \
  -f docker-compose.spark.yml \
  -f elasticsearch/docker-compose.base.yml \
  -f elasticsearch/docker-compose.meta.yml \
  -f elasticsearch/docker-compose.event.yml \
  -f localfs/docker-compose.model.yml \
  up

To submit a training task to Spark Cluster, run pio-deploy train with --master option:

pio-docker train -- --master spark://spark-master:7077

See docker-compose.spark.yml if changing settings for Spark Cluster.

Run Engine Server

To deploy your engine and start an engine server, run Docker with docker-compose.deploy.yml.

docker-compose -f docker-compose.yml \
  -f pgsql/docker-compose.base.yml \
  -f pgsql/docker-compose.meta.yml \
  -f pgsql/docker-compose.event.yml \
  -f pgsql/docker-compose.model.yml \
  -f docker-compose.deploy.yml \
  up

See deploy/run.sh and docker-compose.deploy.yml if changing a deployment.

Run with Jupyter

You can launch PredictionIO with Jupyter.

docker-compose -f docker-compose.jupyter.yml \
  -f pgsql/docker-compose.base.yml \
  -f pgsql/docker-compose.meta.yml \
  -f pgsql/docker-compose.event.yml \
  -f pgsql/docker-compose.model.yml \
  up

For more information, see JUPYTER.md.

Development

Build Base Docker Image

docker build -t predictionio/pio pio

Build Jupyter Docker Image

docker build -t predictionio/pio-jupyter jupyter

Push Docker Image

docker push predictionio/pio:latest
docker tag predictionio/pio:latest predictionio/pio:$PIO_VERSION
docker push predictionio/pio:$PIO_VERSION