# Face Recognition Pipeline 

## Prerequisites

<a id="gs-mlrun-install"></a>The tutorial uses MLRun to create a project, implement and execute an ML pipeline, and track the execution.
(For more information about MLRun, see Step 1.)
To use MLRun, you must first ensure that it's installed and running as a service on your platform cluster.
Look for an `mlrun` service on the **Services** page of the platform dashboard.
For more information and additional assistance, contact the Iguazio [support team](mailto:support@iguazio.com).

To use MLRun from Jupyter Notebook, you need to run the following code to install the `mlrun` Python package.
This needs to be done only once per Jupyter Notebook service.
> **Note:** You must **restart the Jupyter kernel** to complete the installation.

In [125]:
import sys
import subprocess
import pkg_resources
import IPython

required = {'mlrun'}
installed = {pkg.key for pkg in pkg_resources.working_set}
missing = required - installed
previously_installed = required.intersection(installed)

if missing:
    print(f'Installing {",".join(missing)}')
    python = sys.executable
    subprocess.check_call([python, '-m', 'pip', 'install', *missing], stdout=subprocess.DEVNULL)
    print('Restarting kernel')
    IPython.Application.instance().kernel.do_shutdown(True) #automatically restarts kernel
if previously_installed:
    print(f'Already installed: {",".join(previously_installed)}')

Already installed: mlrun
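
`pkg_resources` is deprecated in recent setuptools releases. As a minimal sketch, assuming Python 3.8 or later, the same "is the package installed?" check can be written with the standard-library `importlib.metadata` module. The helper name `missing_packages` is hypothetical and is not part of MLRun or of the original tutorial.

```python
from importlib import metadata


def missing_packages(required):
    """Return the subset of `required` distribution names that are not installed."""
    missing = set()
    for name in required:
        try:
            # Raises PackageNotFoundError when the distribution is absent
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.add(name)
    return missing


# The result depends on the environment: an empty set means mlrun is installed
print(missing_packages({'mlrun'}))
```

The returned set could be passed to the same `pip install` call used in the cell above.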


### Create the MLRun Project

In [126]:
from os import path, getenv
from mlrun import new_project, mlconf

#project_name = '-'.join(filter(None, ['getting-started-iris', getenv('V3IO_USERNAME', None)]))
project_name = "faces"
project_path = path.abspath('./')
project = new_project(project_name, project_path)
project.save()
print(f'Project path: {project_path}\nProject name: {project_name}')

Project path: /User/mlrun/demos/faces/notebooks
Project name: faces


In [127]:
out = mlconf.artifact_path or path.abspath('./data')
# {{run.uid}} is substituted with the run ID, so each run writes its output to a separate directory
artifact_path = path.join(out, '{{run.uid}}')
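
To illustrate the effect of the `{{run.uid}}` placeholder, the following sketch renders the template for two made-up run IDs. MLRun performs the real substitution at run time; the plain string replacement, the `render_artifact_path` helper, the base directory, and the run IDs below are all assumptions used only for illustration.

```python
import posixpath


def render_artifact_path(template, run_uid):
    """Replace the {{run.uid}} placeholder with a concrete run ID (illustration only)."""
    return template.replace('{{run.uid}}', run_uid)


# Hypothetical base directory; the notebook uses mlconf.artifact_path or ./data
template = posixpath.join('/User/artifacts', '{{run.uid}}')

# Each run ID yields its own output directory
print(render_artifact_path(template, 'run-a'))
print(render_artifact_path(template, 'run-b'))
```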

### Declare the encode-images Function for Encoding the Initial Image Dataset

In [128]:
from mlrun import mount_v3io, code_to_function
encode_images_func = code_to_function('encode-images', kind='job', filename='functions/encode_images.py', image='aviaigz/ml-models:0.5.4')
#encode_images_func.spec.build.base_image = 'aviaigz/ml-models:0.5.4'
encode_images_func.deploy(with_mlrun=False)

True

### Declare the train Function for Training on the Encoded Image Dataset

In [129]:
from mlrun import mount_v3io, code_to_function
train_func = code_to_function('train', kind='job', filename='functions/train.py', image='aviaigz/ml-models:0.5.4')
#train_func.spec.build.base_image = 'aviaigz/ml-models:0.5.4'
train_func.deploy(with_mlrun=False)

True

### Declare the face-prediction Nuclio Function for Predicting Faces with the Trained Model

In [130]:
import nuclio
import os
from mlrun import mount_v3io, code_to_function
nuclio_face_prediction_func = code_to_function('nuclio-face-prediction', kind='nuclio', filename='nuclio-face-prediction.ipynb')
# set the API/trigger, attach the home dir to the function
nuclio_face_prediction_func.with_http(workers=2).apply(mount_v3io())

# set environment variables
nuclio_face_prediction_func.set_env('MODELS_PATH', '/User/mlrun/demos/faces/notebooks/functions/models.py')
nuclio_face_prediction_func.set_env('MODEL_PATH', '/User/faces/artifacts/model.bst')
nuclio_face_prediction_func.set_env('CLASSES_MAP', '/User/faces/artifacts/idx2name.csv')
nuclio_face_prediction_func.set_env('V3IO_ACCESS_KEY', os.environ['V3IO_ACCESS_KEY'])
nuclio_face_prediction_func.spec.build.base_image = 'mlrun/ml-models'
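
The cell above reads `os.environ['V3IO_ACCESS_KEY']` directly, which raises a bare `KeyError` when the variable is not set. A small guard, sketched here under the assumption that a descriptive error is preferable, can make that failure easier to diagnose. The `require_env` helper is hypothetical and is not part of MLRun or Nuclio.

```python
import os


def require_env(name):
    """Return the value of environment variable `name`, or raise a descriptive error."""
    value = os.environ.get(name)
    if not value:
        # Covers both a missing variable and an empty value
        raise RuntimeError(f'Environment variable {name} is not set')
    return value


# Possible usage in the cell above (not executed here):
# nuclio_face_prediction_func.set_env('V3IO_ACCESS_KEY', require_env('V3IO_ACCESS_KEY'))
```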

### Declare the api-serving Nuclio Function for Handling Image Requests and Processing Face-Prediction Responses

In [131]:
import nuclio
import os
from mlrun import mount_v3io, code_to_function
nuclio_api_serving_func = code_to_function('nuclio-api-serving', kind='nuclio', filename='nuclio-api-serving.ipynb')
# set the API/trigger, attach the home dir to the function
nuclio_api_serving_func.with_http(workers=2).apply(mount_v3io())

# set environment variables
nuclio_api_serving_func.set_env('DATA_PATH', '/User/faces/dataset/')
nuclio_api_serving_func.set_env('V3IO_ACCESS_KEY', os.environ['V3IO_ACCESS_KEY'])
nuclio_api_serving_func.spec.build.base_image = 'mlrun/ml-models'

### Set the Project Functions

In [132]:
from mlrun import mount_v3io, code_to_function

ARTIFACTS_PATH = '/User/faces/artifacts/'

project.set_function(encode_images_func, name='encode-images')
project.set_function(train_func, name='train')
project.set_function(nuclio_face_prediction_func, name='nuclio-face-prediction')
project.set_function(nuclio_api_serving_func, name='nuclio-api-serving')

project.func('encode-images').apply(mount_v3io())
project.func('train').apply(mount_v3io())
project.func('nuclio-face-prediction').apply(mount_v3io())
project.func('nuclio-api-serving').apply(mount_v3io())


project.func('encode-images').set_env('PYTHONPATH', project_path)
project.func('train').set_env('PYTHONPATH', project_path)
project.func('nuclio-face-prediction').set_env('PYTHONPATH', project_path)
project.func('nuclio-api-serving').set_env('PYTHONPATH', project_path)


project.func('encode-images').spec.artifact_path = ARTIFACTS_PATH
project.func('train').spec.artifact_path = ARTIFACTS_PATH
project.func('nuclio-face-prediction').spec.artifact_path = ARTIFACTS_PATH
project.func('nuclio-api-serving').spec.artifact_path = ARTIFACTS_PATH





<a id="gs-step-create-n-run-ml-pipeline"></a>
## Create and Run a Fully Automated ML Pipeline

You're now ready to create a full ML pipeline.
The pipeline is built with [Kubeflow Pipelines](https://www.kubeflow.org/docs/pipelines/overview/pipelines-overview/), which is integrated into the Iguazio Data Science Platform.
Kubeflow Pipelines is an open-source framework for building and deploying portable, scalable machine-learning workflows based on Docker containers.
MLRun leverages this framework to take your existing code and deploy it as steps in the pipeline.

In [133]:
%%writefile {path.join(project_path, 'workflow.py')}

from kfp import dsl
from mlrun import mount_v3io
from os import getenv, path

DATA_PATH = '/User/faces/dataset/'
ARTIFACTS_PATH = '/User/faces/artifacts/'
MODELS_PATH = '/User/mlrun/demos/faces/notebooks/functions/models.py'
FRAMES_URL = 'framesd:8081'
V3IO_ACCESS_KEY = getenv('V3IO_ACCESS_KEY')
USER_NAME = getenv('V3IO_USERNAME')
ENCODINGS_PATH = '/'.join([USER_NAME, 'faces', 'encodings'])
WEB_API = "http://v3io-webapi:8081"


funcs = {}
project_path = path.abspath('./')
faces_params = {'data_path' : DATA_PATH,
                'artifacts_path': ARTIFACTS_PATH,
                'models_path': MODELS_PATH,
                'frames_url': FRAMES_URL,
                'token' : V3IO_ACCESS_KEY, 
                'encodings_path': ENCODINGS_PATH }

# Configure function resources and local settings
def init_functions(functions: dict, project=None, secrets=None):
    project_path = path.abspath('./')
    for f in functions.values():
        f.apply(mount_v3io())
        f.set_env('PYTHONPATH', project_path)
        f.spec.artifact_path = ARTIFACTS_PATH
        
        
        
# Create a Kubeflow Pipelines pipeline
@dsl.pipeline(
    name = "faces-pipeline",
    description = "faces demo pipeline"
)
def kfpipeline():
    # encode images
    encode = funcs['encode-images'].as_step(
        name="encode_images",
        params=faces_params,
        outputs=['encode']
    )
    
    # train the model based on the images
    train = funcs['train'].as_step(
        name="train",
        params=faces_params,
        inputs={'table': encode.outputs},
        outputs=['training']
    )
    # deploy the model as nuclio function
    nuclio_face_prediction = funcs['nuclio-face-prediction'].deploy_step(
        models={"nuclio-face-prediction": train.outputs['training']}
    )
    
    # deploy api serving as nuclio function
    nuclio_api_serving = funcs['nuclio-api-serving'].deploy_step()
    nuclio_api_serving.after(nuclio_face_prediction)
    
    

Overwriting /User/mlrun/demos/faces/notebooks/workflow.py
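
The step dependencies declared in **workflow.py** form a simple chain: `train` consumes the output of `encode-images`, `nuclio-face-prediction` consumes the trained model, and `nuclio-api-serving` is scheduled after `nuclio-face-prediction`. The sketch below reproduces that ordering with the standard-library `graphlib` module (Python 3.9 or later). It only illustrates the dependency graph and is not how Kubeflow Pipelines schedules steps internally; the step labels reuse the function names from the project.

```python
from graphlib import TopologicalSorter

# Each step maps to the set of steps it depends on, as read from workflow.py
dependencies = {
    'encode-images': set(),
    'train': {'encode-images'},
    'nuclio-face-prediction': {'train'},
    'nuclio-api-serving': {'nuclio-face-prediction'},
}

# A linear chain has exactly one valid execution order
execution_order = list(TopologicalSorter(dependencies).static_order())
print(execution_order)
```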


<a id="gs-register-workflow"></a>
#### Register the Workflow

Use the `set_workflow` MLRun project method to register your workflow with MLRun.
The following code sets the `name` parameter to the workflow name ("main") and the `code` parameter to the name of the workflow file in your project directory (**workflow.py**).

In [134]:
# Register the workflow file as "main"
project.set_workflow('main', 'workflow.py')

In [135]:
project.save()

In [136]:
run_id = project.run(
    'main',
    arguments={},
    artifact_path=path.abspath(path.join('pipeline', '{{workflow.uid}}')),
    dirty=True)

> 2020-11-30 14:21:53,149 [info] Pipeline run id=dc581f66-c692-42ba-b0ea-4f174c51e530, check UI or DB for progress
