
Getting started with SinaraML

Maksim Buslovaev edited this page Feb 12, 2025 · 1 revision

What you'll get after completion of this tutorial

After completing this tutorial you will have created a SinaraML Server, set up an example ML pipeline, and trained a model. After that you will build a model image with a REST service.

SinaraML System Requirements

RAM: 16 GB

vCPU: 4

Free disk space: 20 GB plus space for user data

Prerequisites

SinaraML components can run on Linux, macOS, and Windows. The following programs should be installed.

Prerequisites for Linux

  1. Docker is running
  2. Python 3.6+ installed
  3. Unzip installed
  4. Git installed

To check the prerequisites, please follow the Prerequisites checklist for Linux.
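The checks above can also be scripted. The following is a minimal sketch of my own (the `check` helper is not part of SinaraML) that reports which prerequisites are missing:

```shell
#!/bin/sh
# check: report whether a command is available on PATH
check() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "OK      $1"
    else
        echo "MISSING $1"
    fi
}

check docker
check python3
check unzip
check git

# Docker must not only be installed but also running:
if docker info >/dev/null 2>&1; then
    echo "OK      docker daemon"
else
    echo "MISSING docker daemon (is Docker running?)"
fi
```

Anything reported as MISSING should be installed before continuing.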

Prerequisites for Windows

  1. Docker Desktop is running
  2. Ubuntu installed
  3. Python 3.6+ installed in Ubuntu
  4. Unzip installed in Ubuntu
  5. Git installed in Ubuntu

To check the prerequisites, please follow the Prerequisites checklist for Windows.

Prerequisites for macOS

For now, only macOS running on Intel CPUs is supported. macOS devices with Apple M-series CPUs can experience issues when running Apache Spark tasks.

  1. Docker Desktop is running
  2. Python 3.6+ installed
  3. Unzip installed
  4. Git installed
  5. Before installing sinaraml, add the Python 3 user scripts directory to PATH (the Python version in the path may differ on your system):
echo 'export PATH="$HOME/Library/Python/3.8/bin:$PATH"' >> ~/.zshrc
  • All commands should use pip3 instead of pip

To check the prerequisites, please follow the Prerequisites checklist for macOS.

Setup SinaraML on your desktop

Use the SinaraML CLI to manage the SinaraML Server - https://pypi.org/project/sinaraml/

Important

On Linux and macOS, use the built-in terminal.

On Windows, use the Ubuntu terminal.

CLI Installation

pip install sinaraml

If you see a warning at the end of the installation process like:

WARNING: The script sinara is installed in '/home/testuser/.local/bin' which is not on PATH.
Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.

add $HOME/.local/bin to your $PATH environment variable to enable the CLI commands:

export PATH=$PATH:$HOME/.local/bin

You may need to reload your shell or reboot your machine after installation for the CLI commands to become available.
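To confirm the fix took effect, a small sketch like the following (my own, not part of SinaraML) checks whether `$HOME/.local/bin` is already on PATH and whether the `sinara` entry point resolves:

```shell
#!/bin/sh
BIN_DIR="$HOME/.local/bin"

# Append the pip user-install bin directory to PATH if it is not there yet.
case ":$PATH:" in
    *":$BIN_DIR:"*) echo "PATH already contains $BIN_DIR" ;;
    *) PATH="$PATH:$BIN_DIR"; export PATH; echo "added $BIN_DIR to PATH" ;;
esac

# The sinara entry point should now resolve.
if command -v sinara >/dev/null 2>&1; then
    echo "sinara CLI found"
else
    echo "sinara CLI not found - reload your shell or check the install log"
fi
```

Note that an `export` run in a terminal only affects that session; add the line to your shell profile to make it permanent.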

Configure Docker on Linux or Windows

Perform the Linux post-installation steps for Docker Engine so that SinaraML CLI commands can run without sudo:

sudo groupadd -f docker
sudo usermod -aG docker $USER

Log out and log back in (or run newgrp docker) so that the group membership is re-evaluated.
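The group change only applies to new login sessions. Here is a quick sketch of my own to check whether the current session already has it:

```shell
#!/bin/sh
# List the current user's groups and look for "docker".
if id -nG | tr ' ' '\n' | grep -qx docker; then
    echo "current session is in the docker group"
else
    echo "not in the docker group yet - log out and back in, or run: newgrp docker"
fi
```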

Creating and running server

For additional details check Remote Platform

After installing the SinaraML CLI you can use the 'sinara' command in a terminal (or a remote machine terminal via SSH).

First, create a SinaraML server.

sinara server create

Second, run SinaraML server:

sinara server start

After the SinaraML Server starts, you will see the URLs where the running server is available, so you can open it in your browser.

Running step template

Important

Please read Known Issues to address problems you may encounter while running pipelines.

Open the Jupyter Notebook Server at http://127.0.0.1:8888/lab in any browser.
In the Jupyter server terminal, execute:

git clone --recursive https://github.com/4-DS/pipeline-step_template.git
cd pipeline-step_template

Perform the following actions:

  1. Open the 'prepare_data_for_template.ipynb' notebook and run all cells to get sample data, or execute the following command in the server terminal:
ipython3 prepare_data_for_template.ipynb
  2. Execute 'step.dev.py' in the Jupyter server terminal:
python step.dev.py

Stopping and removing server

To stop SinaraML Server execute:

sinara server stop

To continue using SinaraML, execute:

sinara server start

To remove the SinaraML Server execute:

sinara server remove

Note

By default, when creating a SinaraML Server, the setup scripts create Docker volumes for data, code, and temporary data. For day-to-day usage it is recommended to map folders on the local disk instead, using the create command with this option:

sinara server create --runMode b

You will be asked to enter the host paths where the server's '/data', '/tmp' and '/work' folders will be stored.

SinaraML Pipeline Example

The following example shows how to build a model serving pipeline, from a raw dataset to an ML model Docker container with a REST API.
The pipeline is based on the SinaraML Framework and tools.
The example model predicts the median house price and is based on an open data sample.
The pipeline includes 5 steps, which must be run sequentially:

  1. Data Load
  2. Data Preparation
  3. Model Train
  4. Model Evaluation
  5. Model Test

Warning

Creating a new pipeline from the examples is not recommended, since the examples can be outdated and use an old version of the SinaraML Library. Please see the Creating New Pipeline Tutorial.

1. Data Load

This step downloads a CSV dataset from the internet and converts it to partitioned Parquet files that Apache Spark can later read efficiently.
To run the step:

  1. Clone the git repository:
git clone --recursive https://github.com/4-DS/house_price-data_load.git
cd house_price-data_load
  2. Run the step:
python step.dev.py

2. Data Preparation

This step splits the dataset into train, test, and evaluation sets using the partitioned Parquet files made by the data load step.
To run the step:

  1. Clone the git repository:
git clone --recursive https://github.com/4-DS/house_price-data_prep.git
cd house_price-data_prep
  2. Run the step:
python step.dev.py

3. Model Train

This step:

  • Trains a GradientBoostingRegressor model on the train set made by the data prep step
  • Packs the model into a BentoService using the BentoML library

To run the step:

  1. Clone the git repository:
git clone --recursive https://github.com/4-DS/house_price-model_train.git
cd house_price-model_train
  2. Run the step:
python step.dev.py

4. Model Evaluation

This step checks the quality of the model made by the model train step.
To run the step:

  1. Clone the git repository:
git clone --recursive https://github.com/4-DS/house_price-model_eval.git
cd house_price-model_eval
  2. Run the step:
python step.dev.py

5. Model Test

This step checks that the bento service's REST API (made by the model train step) works properly before it is built into a Docker image.
To run the step:

  1. Clone the git repository:
git clone --recursive https://github.com/4-DS/house_price-model_test.git
cd house_price-model_test
  2. Run the step:
python step.dev.py
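Since all five steps share the same clone-and-run pattern, they can be driven by one loop. The following is a sketch of my own, not an official SinaraML script; it defaults to a dry run that only prints the commands, so set DRY_RUN=0 to actually execute them:

```shell
#!/bin/sh
set -e

# Dry-run wrapper: print the command instead of executing it unless DRY_RUN=0.
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Repository names of the five pipeline steps, in execution order.
STEPS="house_price-data_load house_price-data_prep house_price-model_train house_price-model_eval house_price-model_test"

for step in $STEPS; do
    run git clone --recursive "https://github.com/4-DS/$step.git"
    run sh -c "cd $step && python step.dev.py"
done
```

The order matters: each step consumes the outputs of the previous one, so the loop must not be parallelized.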

Model Image build

Important

The following commands should be executed in the host terminal.

After running all 5 steps we are ready to build a Docker image with a REST API. To build it, do the following:

  1. Make sure the SinaraML CLI is installed
  2. Create the Docker image for the model service by executing:
sinara model containerize
  3. Enter the path to the bento service's model.zip inside your running dev Jupyter environment
  4. Enter the repository URL to push the image to (enter "local" to use the local Docker repository on the machine where Docker is installed)
  5. When the docker build command finishes, it will output a model image name, which we'll run
  6. Execute the docker run command with the image name from the previous step.
    Make sure port 5000 is free on the host system, or choose your own port for the REST service:
docker run -p 0.0.0.0:5000:5000 %model_image_name%
  7. The Swagger UI of the model should be available at http://127.0.0.1:5000
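Once the container is up, the service can also be exercised from the command line. In the sketch below (my own), the /predict endpoint and the JSON payload are assumptions based on typical BentoML services; check the Swagger UI for the real endpoint name and request schema:

```shell
#!/bin/sh
# Probe the model service and, if reachable, send a prediction request.
URL=${MODEL_URL:-http://127.0.0.1:5000}

if curl -s -o /dev/null --max-time 2 "$URL"; then
    # Endpoint and payload are placeholders - adapt them to the Swagger schema.
    curl -s -X POST "$URL/predict" \
         -H "Content-Type: application/json" \
         -d '[{"feature_1": 1.0}]'
else
    echo "model service not reachable at $URL"
fi
```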

How to create your own pipeline

To create your own pipeline, please read the Creating New Pipeline Tutorial.
