# Continuous Training of AI for IBM NorthPole Accelerator with HPE MLOps Platform

author: Andrew Mendez, andrew.mendez@hpe.com

Version: 0.0.1

Date: 1.18.24

* In this notebook, we build an AI application that updates automatically as we add more data.
* Specifically, we continuously train an object detection model to detect vehicles and personnel in full-motion video (FMV).
* We use MLDM to manage data and pipeline orchestration, and Streamlit for the user-facing application.

`Pre-requisites: This demo requires a GPU`

Details:
* We fine-tune a YOLOv4 architecture (original codebase: [WongKinYiu/PyTorch_YOLOv4](https://github.com/WongKinYiu/PyTorch_YOLOv4))
* We develop an MLOps pipeline that does the following:
    * Load the initial dataset
    * Preprocess the dataset
    * Fine-tune a pretrained YOLOv4 model on FMV data
    * Export the model to run on the IBM NorthPole chip
    * Deploy a user-facing application to process new FMV videos
* We create an end-to-end MLOps pipeline to show how we can continuously update a model based on new data
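
The stages above form a small dependency graph, which is what lets a new commit to the data repo ripple through training, export, and deployment. Below is a minimal sketch of that ordering, using the repo and pipeline names defined later in this notebook (`data`, `train`, `export`, `deploy`). The `run_order` and `downstream_of` helpers are hypothetical and only illustrate the idea; they are not how MLDM schedules jobs internally.

```python
from graphlib import TopologicalSorter

# Each pipeline maps to the repos it reads from (taken from the pipeline specs in this notebook).
PIPELINE_INPUTS = {
    "train": {"data"},
    "export": {"data", "train"},
    "deploy": {"data", "export"},
}

def run_order(inputs):
    """Return nodes in an order where every input precedes its consumers."""
    return list(TopologicalSorter(inputs).static_order())

def downstream_of(repo, inputs):
    """Hypothetical helper: pipelines that must re-run when `repo` changes."""
    affected = set()
    changed = True
    while changed:
        changed = False
        for pipeline, deps in inputs.items():
            if pipeline not in affected and (repo in deps or deps & affected):
                affected.add(pipeline)
                changed = True
    return affected

order = run_order(PIPELINE_INPUTS)
print(order)
print(sorted(downstream_of("data", PIPELINE_INPUTS)))
```

Because every pipeline depends directly or transitively on `data`, adding new FMV frames to that repo is enough to retrain, re-export, and redeploy.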

In [None]:
# How will we build this? 
Using HPE's Machine Learning Operations (MLOps) platform
<img src="./static/platform_step3.png" alt="Enterprise Machine Learning platform architecture" width="850">

In [None]:
# Overview of MLOps Pipeline

Our ML pipeline consists of three stages, each defined as a pipeline spec below: `train`, `export`, and `deploy`.

In [None]:
## Install pachctl and connect to pachyderm

In [1]:
# Connect to deployed pachyderm application
!pachctl connect pachd-peer.pachyderm.svc.cluster.local:30653
# Print the pachctl (client) and pachd (server) versions to confirm the connection
!pachctl version

New context 'pachd-peer.pachyderm.svc.cluster.local:30653' created, will connect to Pachyderm at grpc://pachd-peer.pachyderm.svc.cluster.local:30653
Context 'pachd-peer.pachyderm.svc.cluster.local:30653' set as active
COMPONENT           VERSION             
pachctl             2.8.2               
pachd               2.8.2               
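
If we want to check the connection programmatically rather than by eye, the version table can be parsed into a dict. This is a rough sketch, not part of the original workflow: `parse_versions` is a hypothetical helper, and the sample text is copied from the cell output above instead of being read from a live `pachctl` call.

```python
def parse_versions(text):
    """Parse the two-column table printed by `pachctl version` into a dict."""
    versions = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip blank lines, the header row, and anything that is not a "name version" pair.
        if len(parts) != 2 or parts[0] == "COMPONENT":
            continue
        versions[parts[0]] = parts[1]
    return versions

# Sample taken from the cell output above.
sample = """\
COMPONENT           VERSION
pachctl             2.8.2
pachd               2.8.2
"""

versions = parse_versions(sample)
client_matches_server = versions.get("pachctl") == versions.get("pachd")
print(versions, client_matches_server)
```

Comparing the two entries can help catch a mismatched client before any pipelines are created.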


In [2]:
## Create project and set active context

In [3]:
# Create the Pachyderm project
!pachctl create project north-pole
# Set pachctl's active context to the north-pole project
!pachctl config update context --project north-pole

project "north-pole" already exists
project north-pole already exists
editing the currently active context "pachd-peer.pachyderm.svc.cluster.local:30653"
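
Re-running the cell produces the "already exists" message shown above, which is harmless here. If the setup were scripted, one option is to treat that reply as success. The sketch below assumes the message text matches the output above; `ensure_project` and `fake_run` are hypothetical names, and the fake runner stands in for a real cluster so the logic can run anywhere.

```python
import subprocess
from types import SimpleNamespace

def ensure_project(name, run=subprocess.run):
    """Create a project, treating an 'already exists' reply as success.

    `run` is injectable so the logic can be exercised without a cluster.
    Returns True if the project was created, False if it already existed.
    """
    result = run(
        ["pachctl", "create", "project", name],
        capture_output=True,
        text=True,
    )
    if result.returncode == 0:
        return True
    message = (result.stdout or "") + (result.stderr or "")
    if "already exists" in message:
        return False
    raise RuntimeError(f"pachctl create project failed: {message.strip()}")

def fake_run(cmd, **kwargs):
    """Stand-in for subprocess.run that mimics the reply seen in this notebook."""
    return SimpleNamespace(
        returncode=1, stdout="", stderr='project "north-pole" already exists'
    )

created = ensure_project("north-pole", run=fake_run)
print(created)
```

Calling `ensure_project("north-pole")` without the `run` argument would invoke the real `pachctl` binary.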


In [4]:
## Create the data repo. 
* The data repo holds the FMV dataset used to fine-tune the object detection model


In [5]:
!pachctl create repo data

In [6]:
%%capture
!pachctl put file -r  data@master:/ -f /nvmefs1/andrew.mendez/virat-aerial-156-frames-v2-coco-yolov5-subset/
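
Before a recursive upload like the one above, it can be useful to see what the directory contains. The sketch below counts files by extension. `count_by_suffix` is a hypothetical helper, and the file names in the demo (`frame_0001.jpg`, `labels.txt`) are made up for illustration; they are not the layout of the actual dataset.

```python
import tempfile
from collections import Counter
from pathlib import Path

def count_by_suffix(root):
    """Count regular files under `root`, grouped by lowercase file extension."""
    return Counter(p.suffix.lower() for p in Path(root).rglob("*") if p.is_file())

# Demo on a throwaway directory with made-up file names;
# point `count_by_suffix` at the real dataset path instead.
with tempfile.TemporaryDirectory() as tmp:
    images = Path(tmp) / "images"
    images.mkdir()
    (images / "frame_0001.jpg").write_bytes(b"")
    (images / "frame_0002.JPG").write_bytes(b"")
    (Path(tmp) / "labels.txt").write_text("")
    counts = count_by_suffix(tmp)

print(dict(counts))
```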

In [7]:
# Train Pipeline

In [8]:
%%writefile train.yaml
pipeline:
    name: 'train'
description: 'Finetune model on FMV dataset'
input:
    cross:
        - pfs: 
            repo: 'data'
            branch: 'master'
            glob: '/'
transform:
    image: mendeza/yolov4-env:0.0.2
    cmd: 
        - '/bin/sh'
    stdin: 
        - 'bash /nvmefs1/shared_nb/01\ -\ Users/andrew.mendez/2024/PyTorch_YOLOv4/train-pipeline-runner.sh'
        - 'echo "$(openssl rand -base64 12)" > /pfs/out/random_file.txt'
    secrets:
        - name: pipeline-secret
          key: det_master
          env_var: DET_MASTER
        - name: pipeline-secret
          key: det_user
          env_var: DET_USER
        - name: pipeline-secret
          key: det_password
          env_var: DET_PASSWORD
        - name: pipeline-secret
          key: pac_token
          env_var: PAC_TOKEN
autoscaling: False
pod_patch: >-
  [{"op": "add","path": "/volumes/-","value": {"name":
  "host-shared","hostpath": {"path":
  "/nvmefs1/","type": "Directory"}}}, {"op":
  "add","path": "/containers/0/volumeMounts/-","value": {"mountPath":
  "/nvmefs1/","name": "host-shared"}}]

Overwriting train.yaml
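
The `pod_patch` field in the spec is a JSON Patch document folded onto several lines, which makes it hard to read. As a rough aid, the sketch below parses the same string and prints each operation. The patch text is copied from `train.yaml`; the variable names are our own.

```python
import json

# The pod_patch string from train.yaml, joined onto one line.
POD_PATCH = (
    '[{"op": "add","path": "/volumes/-","value": {"name": '
    '"host-shared","hostpath": {"path": '
    '"/nvmefs1/","type": "Directory"}}}, {"op": '
    '"add","path": "/containers/0/volumeMounts/-","value": {"mountPath": '
    '"/nvmefs1/","name": "host-shared"}}]'
)

operations = json.loads(POD_PATCH)
for op in operations:
    print(op["op"], op["path"], "->", json.dumps(op["value"]))

# Both operations must refer to the same volume name for the mount to resolve.
volume_name = operations[0]["value"]["name"]
mount_name = operations[1]["value"]["name"]
print(volume_name == mount_name)
```

The first operation appends a host-path volume for `/nvmefs1/` to the pod's volume list (a trailing `/-` in a JSON Patch path means "append to the array"), and the second mounts that volume into the first container at the same path. This is how the runner scripts under `/nvmefs1/` become visible inside the pipeline container.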


In [9]:
!pachctl create pipeline -f train.yaml

In [12]:
%%writefile export.yaml
pipeline:
    name: 'export'
description: 'Export trained model for NorthPole Accelerator'
input:
    cross:
        - pfs: 
            repo: 'data'
            branch: 'master'
            glob: '/'
        - pfs: 
            repo: 'train'
            branch: 'master'
            glob: '/'
transform:
    image: mendeza/yolov4-env:0.0.2
    cmd: 
        - '/bin/sh'
    stdin: 
        - 'bash /nvmefs1/shared_nb/01\ -\ Users/andrew.mendez/2024/PyTorch_YOLOv4/export-model-runner.sh'
    secrets:
        - name: pipeline-secret
          key: det_master
          env_var: DET_MASTER
        - name: pipeline-secret
          key: det_user
          env_var: DET_USER
        - name: pipeline-secret
          key: det_password
          env_var: DET_PASSWORD
        - name: pipeline-secret
          key: pac_token
          env_var: PAC_TOKEN
autoscaling: False
pod_patch: >-
  [{"op": "add","path": "/volumes/-","value": {"name":
  "host-shared","hostpath": {"path":
  "/nvmefs1/","type": "Directory"}}}, {"op":
  "add","path": "/containers/0/volumeMounts/-","value": {"mountPath":
  "/nvmefs1/","name": "host-shared"}}]

Overwriting export.yaml
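
The export spec combines two inputs with `cross`, and each input uses `glob: '/'`, so each repo is treated as a single datum. A small sketch of how the datum count follows from that choice is shown below. The `"repo:/path"` strings are just labels for illustration, and the three-directory case is a hypothetical contrast, not something this notebook configures.

```python
from itertools import product

def cross_datums(*inputs):
    """Pair every datum of each input with every datum of the others."""
    return list(product(*inputs))

# With glob '/', each repo contributes exactly one datum: its root.
data_datums = ["data:/"]
train_datums = ["train:/"]
whole_repo = cross_datums(data_datums, train_datums)

# Hypothetical contrast: a per-directory glob on data with three top-level directories.
split_data = ["data:/a", "data:/b", "data:/c"]
per_directory = cross_datums(split_data, train_datums)

print(len(whole_repo), len(per_directory))
```

With one datum per input, each new commit results in a single job datum that sees the full dataset and the full training output together, which suits an export step that needs the whole trained model.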


In [13]:
!pachctl create pipeline -f export.yaml

In [26]:
%%writefile deploy.yaml
pipeline:
    name: 'deploy'
description: 'Deploy application'
input:
    cross:
        - pfs: 
            repo: 'data'
            branch: 'master'
            glob: '/'
        - pfs: 
            repo: 'export'
            branch: 'master'
            glob: '/'
transform:
    image: mendeza/yolov4-env:0.0.2
    cmd: 
        - '/bin/sh'
    stdin: 
        - 'bash /nvmefs1/shared_nb/01\ -\ Users/andrew.mendez/2024/PyTorch_YOLOv4/app/deploy.sh'
        - 'echo "$(openssl rand -base64 12)" > /pfs/out/random_file.txt'
    secrets:
        - name: pipeline-secret
          key: det_master
          env_var: DET_MASTER
        - name: pipeline-secret
          key: det_user
          env_var: DET_USER
        - name: pipeline-secret
          key: det_password
          env_var: DET_PASSWORD
        - name: pipeline-secret
          key: pac_token
          env_var: PAC_TOKEN
autoscaling: False
pod_patch: >-
  [{"op": "add","path": "/volumes/-","value": {"name":
  "host-shared","hostpath": {"path":
  "/nvmefs1/","type": "Directory"}}}, {"op":
  "add","path": "/containers/0/volumeMounts/-","value": {"mountPath":
  "/nvmefs1/","name": "host-shared"}}]

Overwriting deploy.yaml
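
The three specs differ only in name, description, input repos, and the script they run; the image, secrets, and pod patch are repeated. One way to reduce the duplication is to generate the specs from a single function. The sketch below is an illustration under assumptions, not the notebook's method: `make_spec` and `SECRET_KEYS` are hypothetical names, the spec is emitted as JSON rather than YAML, and the field names simply mirror the YAML above.

```python
import json

SECRET_KEYS = {  # secret key -> environment variable, as in the specs above
    "det_master": "DET_MASTER",
    "det_user": "DET_USER",
    "det_password": "DET_PASSWORD",
    "pac_token": "PAC_TOKEN",
}

def make_spec(name, description, input_repos, stdin_lines, pod_patch=None):
    """Build one pipeline spec as a dict mirroring the YAML layout above."""
    spec = {
        "pipeline": {"name": name},
        "description": description,
        "input": {
            "cross": [
                {"pfs": {"repo": repo, "branch": "master", "glob": "/"}}
                for repo in input_repos
            ]
        },
        "transform": {
            "image": "mendeza/yolov4-env:0.0.2",
            "cmd": ["/bin/sh"],
            "stdin": list(stdin_lines),
            "secrets": [
                {"name": "pipeline-secret", "key": key, "env_var": env}
                for key, env in SECRET_KEYS.items()
            ],
        },
        "autoscaling": False,
    }
    if pod_patch is not None:
        spec["pod_patch"] = pod_patch
    return spec

export_spec = make_spec(
    name="export",
    description="Export trained model for NorthPole Accelerator",
    input_repos=["data", "train"],
    stdin_lines=[
        r"bash /nvmefs1/shared_nb/01\ -\ Users/andrew.mendez/2024/PyTorch_YOLOv4/export-model-runner.sh"
    ],
)
print(json.dumps(export_spec, indent=2))
```

The same call with different arguments would produce the `train` and `deploy` specs; the `pod_patch` string from the YAML can be passed through unchanged via the optional argument.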


In [19]:
# Run pipeline

In [27]:
!pachctl create pipeline -f deploy.yaml