# Face Recognition Pipeline 

<a id="gs-mlrun-install"></a>This notebook runs the face recognition pipeline flow.
The flow consists of the following steps:
1. Encode images: use OpenCV to encode the images into numeric vectors.
2. Train: train the model on the images encoded in the previous step, and save the model as an MLRun artifact.
3. Deploy the Nuclio face-recognition function from the nuclio-face-prediction notebook, which predicts a person's identity by using the trained model.
4. Deploy the API-serving function from the nuclio-api-serving notebook, which receives images sent from clients and returns responses to them.


## Prerequisites

### Installing the MLRun Python Package (mlrun)

To use the MLRun Python library, you need to install the `mlrun` Python package in your development environment.
This needs to be done only once, although you might occasionally need to update the package version.
When running on the Iguazio Data Science Platform you can use the provided **align_mlrun.sh** script in your **/User** directory to install the MLRun package or upgrade the version of an installed package.
By default, the script attempts to download the latest version of the MLRun package that matches the version of the running MLRun service.
To manually install the MLRun package, run `pip install mlrun==<version>`, replacing `<version>` with the MLRun version that matches your MLRun service.

> **Note:** After installing or updating the MLRun package, restart the notebook kernel in your environment.

In [1]:
!/User/align_mlrun.sh

Already installed: mlrun

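To confirm which package version ended up in the environment, a check along the following lines can help. This is a minimal sketch that uses only the Python standard library; the `installed_version` helper is a hypothetical name introduced here and is not part of MLRun.

```python
from importlib import metadata


def installed_version(package: str):
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Report the installed mlrun version, if any
version = installed_version("mlrun")
print(f"mlrun version: {version}" if version else "mlrun is not installed")
```

Compare the printed version with the version of the running MLRun service before you continue.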

## Configure

The configuration below is shared across the notebooks. Change the values in this subsection if you would like different configuration settings.

### Project

Projects in the platform are used to package multiple functions, workflows, and artifacts. Set the project base name here.

In [2]:
PROJECT_BASE_NAME = "faces"

### Data

All data in the platform is stored in user-defined data containers. In this case we use the predefined "users" container. For more information, refer to the [Data containers, collections, and objects documentation](https://www.iguazio.com/docs/latest-release/concepts/containers-collections-objects).

In [3]:
CONTAINER = 'users'
WEB_API = "http://v3io-webapi:8081"

In [4]:
from os import getenv, path

# The platform user name; the relative paths below start with it
V3IO_USERNAME = getenv('V3IO_USERNAME')
USER_NAME = V3IO_USERNAME

# Paths under the /User mount
DATA_PATH = path.join('/User', 'examples', PROJECT_BASE_NAME, 'data/')
USER_ARTIFACTS_PATH = path.join('/User', 'examples', PROJECT_BASE_NAME, 'artifacts')
MODEL_PATH = path.join(USER_ARTIFACTS_PATH, 'model.bst')

# Paths that start with the user name (no /User prefix)
ARTIFACTS_PATH = path.join(V3IO_USERNAME, 'examples', PROJECT_BASE_NAME, 'artifacts')
CLASSES_MAP = path.join(ARTIFACTS_PATH, 'idx2name.csv')
ENCODINGS_PATH = path.join(ARTIFACTS_PATH, 'encodings')

# Location of the function code
FUNCTIONS_PATH = path.abspath('./functions')
MODELS_PATH = path.join(FUNCTIONS_PATH, 'models.py')

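Note that `path.join` raises a `TypeError` when its first argument is `None`, which is what happens above if the `V3IO_USERNAME` environment variable is not set. The following sketch shows one way to fail with a clearer message. The `build_paths` helper is hypothetical, `"avia"` is only an example user name, and `posixpath` is used so that the result is the same on any operating system.

```python
import posixpath


def build_paths(username, base_name="faces"):
    """Build the artifact-related paths, failing early if the user name is missing."""
    if not username:
        raise ValueError("V3IO_USERNAME is not set; cannot build the artifact paths")
    artifacts = posixpath.join(username, "examples", base_name, "artifacts")
    user_artifacts = posixpath.join("/User", "examples", base_name, "artifacts")
    return {
        "ARTIFACTS_PATH": artifacts,
        "USER_ARTIFACTS_PATH": user_artifacts,
        "MODEL_PATH": posixpath.join(user_artifacts, "model.bst"),
        "CLASSES_MAP": posixpath.join(artifacts, "idx2name.csv"),
        "ENCODINGS_PATH": posixpath.join(artifacts, "encodings"),
    }


paths = build_paths("avia")  # "avia" is an example user name
for key, value in paths.items():
    print(f"{key} = {value}")
```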

### Create the MLRun Project

In [5]:
from mlrun import new_project

project_name = '-'.join(filter(None, [PROJECT_BASE_NAME, getenv('V3IO_USERNAME', None)]))
project_path = path.abspath('./')
project = new_project(project_name, project_path)

print(f'Project path: {project_path}\nProject name: {project_name}')



Project path: /User/mlrun/demos/realtime-face-recognition/notebooks
Project name: faces-avia

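The project name is built by joining the base name and the user name with a hyphen, and `filter(None, ...)` drops the user name when it is missing or empty. The following sketch isolates that expression; `make_project_name` is a hypothetical helper name used only for illustration.

```python
def make_project_name(base_name, username=None):
    """Join the non-empty parts with a hyphen, as in the cell above."""
    return "-".join(filter(None, [base_name, username]))


print(make_project_name("faces", "avia"))  # faces-avia
print(make_project_name("faces"))          # faces
```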

In [6]:
from mlrun import mlconf

# Target location for storing pipeline artifacts
project.artifact_path = ARTIFACTS_PATH

# mlconf.dbpath holds the MLRun DB path or API service URL
print(f'Artifacts path: {project.artifact_path}\nMLRun DB path: {mlconf.dbpath}')

Artifacts path: avia/examples/faces/artifacts
MLRun DB path: http://mlrun-api:8080


## Shared Configuration

Store the configuration defined in this notebook in the project `params`. We will use these values in subsequent notebooks.

In [7]:
project.spec.params = {}

project.spec.params['PROJECT_BASE_NAME'] = PROJECT_BASE_NAME
project.spec.params['CONTAINER'] = CONTAINER
project.spec.params['WEB_API'] = WEB_API
project.spec.params['DATA_PATH'] = DATA_PATH
project.spec.params['ENCODINGS_PATH'] = ENCODINGS_PATH
project.spec.params['MODELS_PATH'] = MODELS_PATH
project.spec.params['MODEL_PATH'] = MODEL_PATH
project.spec.params['ARTIFACTS_PATH'] = ARTIFACTS_PATH
project.spec.params['USER_ARTIFACTS_PATH'] = USER_ARTIFACTS_PATH
project.spec.params['CLASSES_MAP'] = CLASSES_MAP

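Code that reads these values with `params.get(...)` receives `None` for any key that was not stored. The following sketch shows one way to fail early instead. It uses a plain dictionary as a stand-in for `project.spec.params`, and `require_params` is a hypothetical helper, not an MLRun API.

```python
def require_params(params, names):
    """Return the requested params, raising KeyError if any of them is missing."""
    missing = [name for name in names if params.get(name) is None]
    if missing:
        raise KeyError(f"Missing project params: {', '.join(missing)}")
    return {name: params[name] for name in names}


# Stand-in for project.spec.params with two of the values stored above
params = {"DATA_PATH": "/User/examples/faces/data/", "CONTAINER": "users"}
selected = require_params(params, ["DATA_PATH", "CONTAINER"])
print(selected)
```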
### Declare the encode-images Function for Encoding the Initial Image Dataset

In [8]:
from mlrun import mount_v3io, code_to_function

encode_images_func = code_to_function(
    'encode-images',
    kind='job',
    filename='functions/encode_images.py',
    image='aviaigz/faces:0.6.0',
)
encode_images_func.deploy(with_mlrun=False)

True

### Declare the train Function for Training on the Encoded Image Dataset

In [9]:
from mlrun import mount_v3io, code_to_function

train_func = code_to_function(
    'train',
    kind='job',
    filename='functions/train.py',
    image='aviaigz/faces:0.6.0',
)
train_func.deploy(with_mlrun=False)

True

### Declare the face-prediction Nuclio Function for Predicting Faces with the Trained Model

In [10]:
import nuclio
import os
from mlrun import mount_v3io, code_to_function
nuclio_face_prediction_func = code_to_function('nuclio-face-prediction', kind='nuclio', filename='nuclio-face-prediction.ipynb')
# set the API/trigger, attach the home dir to the function
nuclio_face_prediction_func.with_http(workers=2).apply(mount_v3io())

# set environment variables
nuclio_face_prediction_func.set_env('MODELS_PATH', MODELS_PATH)
nuclio_face_prediction_func.set_env('MODEL_PATH', MODEL_PATH)
nuclio_face_prediction_func.set_env('CLASSES_MAP', CLASSES_MAP)
nuclio_face_prediction_func.set_env('V3IO_ACCESS_KEY', os.environ['V3IO_ACCESS_KEY'])
nuclio_face_prediction_func.spec.build.base_image = 'mlrun/ml-models'

### Declare the api-serving Nuclio Function for Handling Image Requests and Processing face-prediction Responses

In [11]:
import nuclio
import os
from mlrun import mount_v3io, code_to_function
nuclio_api_serving_func = code_to_function('nuclio-api-serving', kind='nuclio', filename='nuclio-api-serving.ipynb')
# set the API/trigger, attach the home dir to the function
nuclio_api_serving_func.with_http(workers=2).apply(mount_v3io())

# set environment variables
nuclio_api_serving_func.set_env('DATA_PATH', DATA_PATH)
nuclio_api_serving_func.set_env('V3IO_ACCESS_KEY', os.environ['V3IO_ACCESS_KEY'])
nuclio_api_serving_func.spec.build.base_image = 'mlrun/ml-models'

### Set the Project Functions

In [12]:
from mlrun import mount_v3io, code_to_function


project.set_function(encode_images_func, name='encode-images')
project.set_function(train_func, name='train')
project.set_function(nuclio_face_prediction_func, name='nuclio-face-prediction')
project.set_function(nuclio_api_serving_func, name='nuclio-api-serving')

project.func('encode-images').apply(mount_v3io())
project.func('train').apply(mount_v3io())
project.func('nuclio-face-prediction').apply(mount_v3io())
project.func('nuclio-api-serving').apply(mount_v3io())


project.func('encode-images').set_env('PYTHONPATH', project_path)
project.func('train').set_env('PYTHONPATH', project_path)
project.func('nuclio-face-prediction').set_env('PYTHONPATH', project_path)
project.func('nuclio-api-serving').set_env('PYTHONPATH', project_path)


project.func('encode-images').spec.artifact_path = ARTIFACTS_PATH
project.func('train').spec.artifact_path = ARTIFACTS_PATH
project.func('nuclio-face-prediction').spec.artifact_path = ARTIFACTS_PATH
project.func('nuclio-api-serving').spec.artifact_path = ARTIFACTS_PATH

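The cell above repeats the same three settings for each of the four functions, so a loop can express the same configuration more compactly. The following sketch demonstrates the pattern with a hypothetical `StubFunction` class that only records what it receives; it is not an MLRun class, and `configure_functions` is a helper name introduced here for illustration.

```python
from types import SimpleNamespace


class StubFunction:
    """Hypothetical stand-in that records the settings applied to it."""

    def __init__(self):
        self.env = {}
        self.modifiers = []
        self.spec = SimpleNamespace(artifact_path=None)

    def apply(self, modifier):
        self.modifiers.append(modifier)
        return self

    def set_env(self, name, value):
        self.env[name] = value
        return self


def configure_functions(functions, modifier, python_path, artifact_path):
    """Apply the same mount, PYTHONPATH, and artifact path to every function."""
    for fn in functions:
        fn.apply(modifier)
        fn.set_env("PYTHONPATH", python_path)
        fn.spec.artifact_path = artifact_path


names = ["encode-images", "train", "nuclio-face-prediction", "nuclio-api-serving"]
stubs = {name: StubFunction() for name in names}
configure_functions(
    stubs.values(),
    modifier="mount",  # placeholder for the value returned by mount_v3io()
    python_path="/User/project",  # example value
    artifact_path="avia/examples/faces/artifacts",  # example value
)
print({name: stub.env for name, stub in stubs.items()})
```

In the notebook itself, the equivalent call would presumably pass `[project.func(name) for name in names]`, `mount_v3io()`, `project_path`, and `ARTIFACTS_PATH`.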




<a id="gs-step-create-n-run-ml-pipeline"></a>
## Create and Run a Fully Automated ML Pipeline

You're now ready to create a full ML pipeline.
This is done by using [Kubeflow Pipelines](https://www.kubeflow.org/docs/pipelines/overview/pipelines-overview/), which is integrated into the Iguazio Data Science Platform.
Kubeflow Pipelines is an open-source framework for building and deploying portable, scalable machine-learning workflows based on Docker containers.
MLRun leverages this framework to take your existing code and deploy it as steps in the pipeline.

In [13]:
%%writefile {path.join(project_path, 'workflow.py')}

from kfp import dsl
from mlrun import mount_v3io, load_project
from os import getenv, path

project_path = path.abspath('./')
project = load_project(project_path)

DATA_PATH = project.spec.params.get('DATA_PATH')
USER_ARTIFACTS_PATH = project.spec.params.get('USER_ARTIFACTS_PATH')
ARTIFACTS_PATH = project.spec.params.get('ARTIFACTS_PATH')

MODELS_PATH = project.spec.params.get('MODELS_PATH')
FRAMES_URL = 'framesd:8081'
V3IO_ACCESS_KEY = getenv('V3IO_ACCESS_KEY')
WEB_API = "http://v3io-webapi:8081"
ENCODINGS_PATH = project.spec.params.get('ENCODINGS_PATH')

# MLRun fills this dictionary with the project functions when it loads the workflow
funcs = {}

# Parameters that are passed to the encode-images and train steps
faces_params = {
    'data_path': DATA_PATH,
    'artifacts_path': USER_ARTIFACTS_PATH,
    'models_path': MODELS_PATH,
    'frames_url': FRAMES_URL,
    'token': V3IO_ACCESS_KEY,
    'encodings_path': ENCODINGS_PATH,
}

# Configure function resources and local settings
def init_functions(functions: dict, project=None, secrets=None):
    project_path = path.abspath('./')
    for f in functions.values():
        f.apply(mount_v3io())
        f.set_env('PYTHONPATH', project_path)
        f.spec.artifact_path = ARTIFACTS_PATH
        
        
        
# Create a Kubeflow Pipelines pipeline
@dsl.pipeline(
    name = "faces-pipeline",
    description = "faces demo pipeline"
)
def kfpipeline():
    # encode images
    encode = funcs['encode-images'].as_step(
        name="encode_images",
        params=faces_params,
        outputs=['encode']
    )
    
    # train the model based on the images
    train = funcs['train'].as_step(
        name="train",
        params = faces_params,
        inputs={'table': encode.outputs},                       
        outputs=['training']
    )
    # deploy the model as nuclio function
    nuclio_face_prediction = funcs['nuclio-face-prediction'].deploy_step(                
        models={"nuclio-face-prediction": train.outputs['training']}        
    )    
    
    # deploy api serving as nuclio function
    nuclio_api_serving = funcs['nuclio-api-serving'].deploy_step()
    nuclio_api_serving.after(nuclio_face_prediction)
    
    

Overwriting /User/mlrun/demos/realtime-face-recognition/notebooks/workflow.py

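The workflow defines a linear chain: `train` consumes the outputs of `encode-images`, `nuclio-face-prediction` consumes the `train` output, and `nuclio-api-serving` is explicitly scheduled after `nuclio-face-prediction` with `.after()`. The following sketch models those dependencies with the standard-library `graphlib` module (Python 3.9 or later) to show the resulting execution order. It is only an illustration of the ordering and does not use Kubeflow Pipelines.

```python
from graphlib import TopologicalSorter

# Each step maps to the set of steps it depends on
dependencies = {
    "encode-images": set(),
    "train": {"encode-images"},
    "nuclio-face-prediction": {"train"},
    "nuclio-api-serving": {"nuclio-face-prediction"},
}

# A topological sort yields an order in which every step follows its dependencies
order = list(TopologicalSorter(dependencies).static_order())
print(order)
# ['encode-images', 'train', 'nuclio-face-prediction', 'nuclio-api-serving']
```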

<a id="gs-register-workflow"></a>
### Register the Workflow

Use the `set_workflow` MLRun project method to register your workflow with MLRun.
The following code sets the `name` parameter to the selected workflow name ("main") and the `code` parameter to the name of the workflow file that is found in your project directory (**workflow.py**).

In [14]:
# Register the workflow file as "main"
project.set_workflow('main', 'workflow.py')

In [15]:
project.save()

In [16]:
run_id = project.run(
    'main',
    arguments={},
    artifact_path=path.abspath(path.join('pipeline', '{{workflow.uid}}')),
    dirty=True,
)

> 2021-01-25 07:22:00,173 [info] using in-cluster config.


> 2021-01-25 07:22:00,895 [info] Pipeline run id=5318559d-406d-46eb-a7fc-8b1063e5225e, check UI or DB for progress

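The `artifact_path` argument contains the literal text `{{workflow.uid}}`: in a regular (non-f) Python string, the double braces are passed through unchanged. The assumption here is that the workflow engine replaces this placeholder with the unique ID of each run, so that every run writes to its own directory. The following sketch simulates that substitution; the `resolve` helper, the `/User/project` base directory, and the `1234-abcd` ID are made up for illustration.

```python
import posixpath

# The placeholder is kept as literal text in the path template
template = posixpath.join("/User/project", "pipeline", "{{workflow.uid}}")
print(template)


def resolve(path_template, workflow_uid):
    """Simulate the run-time replacement of the placeholder with a run ID."""
    return path_template.replace("{{workflow.uid}}", workflow_uid)


print(resolve(template, "1234-abcd"))  # "1234-abcd" is a made-up ID
```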

## Stream images

Continue to the [**client README.md**](../client/README.md) to clone the client and generate images from your webcam.