Getting started with SinaraML
After completing this tutorial you will have created a SinaraML Server, set up an example ML pipeline, and trained a model. After that you will create a model image with a REST service.
RAM: 16 GB
vCPU: 4
Free disk space: 20 GB + space for user data
SinaraML components can be run on Linux, macOS, and Windows. The following programs should be installed.
- Docker is running
- Python 3.6+ installed
- Unzip installed
- Git installed
To check prerequisites please follow Prerequisites checklist for Linux
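The checklist above can also be verified quickly from a terminal. A minimal sketch, assuming a POSIX shell (`check_tool` is an illustrative helper, not part of SinaraML):

```shell
# check_tool prints "<name>: found" or "<name>: MISSING"
# depending on whether the command is on PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: MISSING"
    fi
}

for tool in docker python3 unzip git; do
    check_tool "$tool"
done

# Being installed is not enough for Docker -- the daemon must be running:
docker info >/dev/null 2>&1 && echo "docker daemon: running" || echo "docker daemon: not running"
```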
- Docker Desktop is running
- Ubuntu installed
- Python 3.6+ installed in Ubuntu
- Unzip installed in Ubuntu
- Git installed in Ubuntu
To check prerequisites please follow Prerequisites checklist for Windows
For now, only macOS running on Intel CPUs is supported. macOS devices running on Apple M-series CPUs can experience issues when running Apache Spark tasks.
- Docker Desktop is running
- Python 3.6+ installed
- Unzip installed
- Git installed
- Before sinaraml installation, add Python 3 scripts to PATH:
echo 'export PATH="/Users/$(whoami)/Library/Python/3.8/bin:$PATH"' >> ~/.zshrc
- All commands should use pip3 instead of pip
To check prerequisites please follow Prerequisites checklist for MacOS
Use SinaraML CLI for SinaraML Server management - https://pypi.org/project/sinaraml/
Important
On Linux and macOS, use the built-in terminal
On Windows, use the Ubuntu terminal
pip install sinaraml
If you see a warning at the end of the installation process:
WARNING: The script sinara is installed in '/home/testuser/.local/bin' which is not on PATH.
Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
Add $HOME/.local/bin to your $PATH environment variable to enable CLI commands.
export PATH=$PATH:$HOME/.local/bin
You may need to reload your shell or reboot your machine after installation to enable CLI commands.
Perform Linux post-installation steps for Docker Engine to run SinaraML CLI commands without sudo:
sudo groupadd -f docker
sudo usermod -aG docker $USER
For additional details check Remote Platform
After installing SinaraML CLI you can use the 'sinara' command in a terminal (or a remote machine terminal via SSH).
First, create a SinaraML server.
sinara server create
Second, run SinaraML server:
sinara server start
After the SinaraML Server starts, you will see the URLs where the running server is available, so you can open it in your browser.
Important
Please read Known Issues to address inconveniences you may experience while running pipelines.
Open Jupyter Notebook Server at http://127.0.0.1:8888/lab in any browser.
Inside Jupyter server terminal execute:
git clone --recursive https://github.com/4-DS/pipeline-step_template.git
cd pipeline-step_template
Perform the following actions:
- Open the 'prepare_data_for_template.ipynb' notebook and run all cells to get sample data, or execute the following command in the server terminal:
ipython3 prepare_data_for_template.ipynb
- Execute 'step.dev.py' in the Jupyter server terminal:
python step.dev.py
To stop the SinaraML Server execute:
sinara server stop
To continue using SinaraML, execute:
sinara server start
To remove the SinaraML Server execute:
sinara server remove
Note
By default, when creating and removing a SinaraML Server, the setup scripts create Docker volumes for data, code, and temporary data. For day-to-day usage it is recommended to map folders on the local disk, using the create command with the option:
sinara server create --runMode b
You will be asked to enter host path where to store server's '/data', '/tmp' and '/work' folders.
The following example shows how to build a model serving pipeline, from a raw dataset to an ML-model Docker container with a REST API.
The ML pipeline is based on the SinaraML Framework and tools.
This example ML model predicts the median house price. It is based on an open data sample.
Example pipeline includes 5 steps, which must be run sequentially:
- Data Load
- Data Preparation
- Model Train
- Model Evaluation
- Model Test
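Each of the five steps below is run the same way (clone the repository, then execute `python step.dev.py`), so the whole sequence can be scripted as a loop. A sketch, assuming the repository naming shown in the steps below; the `RUN_PIPELINE` guard is an illustrative safety switch, not a SinaraML convention:

```shell
# The five example steps, in the order they must run.
STEPS="data_load data_prep model_train model_eval model_test"

# Clone a step's repository and execute it.
run_step() {
    repo="house_price-$1"
    git clone --recursive "https://github.com/4-DS/${repo}.git" &&
        (cd "$repo" && python step.dev.py)
}

# Set RUN_PIPELINE=1 to actually execute; off by default so that
# sourcing this sketch does not start cloning repositories.
if [ "${RUN_PIPELINE:-0}" = "1" ]; then
    for step in $STEPS; do
        run_step "$step" || { echo "step $step failed" >&2; exit 1; }
    done
fi
```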
Warning
Creating a new pipeline from the examples is not recommended, since the examples can be outdated and use an old version of the SinaraML Library. Please see the Creating New Pipeline Tutorial
This step downloads a CSV dataset from the internet and converts it to partitioned Parquet files that can later be read efficiently by Apache Spark
To run the step do:
- Clone git repository:
git clone --recursive https://github.com/4-DS/house_price-data_load.git
cd house_price-data_load
- Run step:
python step.dev.py
This step splits the dataset into train, test, and evaluation sets using the partitioned Parquet files made by the data load step
To run the step do:
- Clone git repository:
git clone --recursive https://github.com/4-DS/house_price-data_prep.git
cd house_price-data_prep
- Run step:
python step.dev.py
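The splitting logic can be sketched with NumPy as follows. The 70/15/15 ratio is an illustrative assumption; the real step defines its own split:

```python
import numpy as np

rng = np.random.default_rng(seed=42)  # fixed seed for reproducibility
n_rows = 100
shuffled = rng.permutation(n_rows)    # shuffled row indices

# 70% train, 15% test, 15% evaluation (illustrative ratios).
train_idx = shuffled[:70]
test_idx = shuffled[70:85]
eval_idx = shuffled[85:]
```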
This step
- Trains a GradientBoostingRegressor model on the train set made by the data prep step
- Packs the model into a BentoService using the BentoML library
To run the step do:
- Clone git repository:
git clone --recursive https://github.com/4-DS/house_price-model_train.git
cd house_price-model_train
- Run step:
python step.dev.py
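The training part can be sketched with scikit-learn. Synthetic data stands in for the real train set, and the BentoML packing step is omitted here because its API depends on the installed version:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Synthetic regression data standing in for the prepared train set:
# a noisy linear target over three features.
rng = np.random.default_rng(seed=0)
X = rng.normal(size=(200, 3))
y = X @ np.array([3.0, -2.0, 1.0]) + rng.normal(scale=0.1, size=200)

model = GradientBoostingRegressor(random_state=0)
model.fit(X, y)

# R^2 on the training data; a well-fit model should be close to 1.
train_r2 = model.score(X, y)
```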
This step checks the quality of the model made by the model train step
To run the step do:
- Clone git repository:
git clone --recursive https://github.com/4-DS/house_price-model_eval.git
cd house_price-model_eval
- Run step:
python step.dev.py
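A quality check like this typically computes regression metrics on the evaluation set. A minimal sketch of one common metric, RMSE (the real step may use different metrics):

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error between true and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))
```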
This step checks that the BentoService's REST API (made by the model train step) is working properly before it is built into a Docker image
To run the step do:
- Clone git repository:
git clone --recursive https://github.com/4-DS/house_price-model_test.git
cd house_price-model_test
- Run step:
python step.dev.py
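A REST smoke test of this kind typically POSTs a JSON feature row to the service and checks the response. A sketch using only the standard library; the URL, endpoint path, and feature names below are hypothetical, and the actual request schema depends on the BentoService:

```python
import json
from urllib import request

def build_predict_payload(rows):
    """Serialize feature rows into a JSON request body
    (illustrative schema: a JSON list of feature dicts)."""
    return json.dumps(rows).encode("utf-8")

def smoke_test(url, rows, timeout=10):
    """Return True if the service answers the prediction request with HTTP 200."""
    req = request.Request(
        url,
        data=build_predict_payload(rows),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req, timeout=timeout) as resp:
        return resp.status == 200

# Example (hypothetical endpoint and feature names):
# smoke_test("http://127.0.0.1:5000/predict", [{"rooms": 3, "area": 120}])
```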
Important
The following commands should be executed in the host terminal.
After running all 5 steps we are ready to build a Docker image with a REST API. To build, do the following:
- SinaraML CLI should be installed
- Create docker image for model service - execute:
sinara model containerize
- Enter the path to the BentoService's model.zip inside your running dev Jupyter environment
- Enter the repository URL to push the image to (enter "local" to use the local Docker repository on the machine where Docker is installed)
- When the docker build command finishes, it will output the model image name, which we'll run
- Execute docker run command with the image name from previous step.
Ensure that port 5000 is free on the host system, or choose your own port at which the REST service will be available:
docker run -p 0.0.0.0:5000:5000 %model_image_name%
- The model's Swagger UI should be available at http://127.0.0.1:5000
To create your own pipeline please read Creating New Pipeline Tutorial