# OpenCV Edge Detection 
This OpenCV Jupyter notebook is based on the [Pachyderm beginner tutorial](https://docs.pachyderm.com/latest/getting_started/beginner_tutorial/).
It assumes that you have JupyterHub and Pachyderm running on your cluster (OpenShift/Kubernetes). If you don't, you can use Open Data Hub as a test environment. The following blog posts explain how to deploy Pachyderm with Open Data Hub and how to open this notebook from the ODH JupyterHub.
- [OpenCV Edge Detection with OpenDatahub + Pachyderm](https://developers.redhat.com/articles/2022/03/25/opencv-edge-detection-opendatahub-pachyderm)
- [The easiest way to install Pachyderm with OpenDataHub](https://developers.redhat.com/articles/2022/01/04/easiest-way-install-pachyderm-opendatahub)

## Download pachctl cli
Pachyderm recommends using the same version of the `pachctl` CLI as the Pachyderm cluster. You can check the Pachyderm version in the Pachyderm CR.
![Pachyderm version](./edited_pachyderm_version.png)
Then change the `PACH_VERSION` value `2.1.6` in the following cell to the version shown in the CR.

In [None]:
# Download pachctl binary file
PACH_VERSION='2.1.6'
! curl -o /tmp/pachctl.tar.gz -L https://github.com/pachyderm/pachyderm/releases/download/v{PACH_VERSION}/pachctl_{PACH_VERSION}_linux_amd64.tar.gz && tar -xvf /tmp/pachctl.tar.gz -C /tmp && cp /tmp/pachctl_{PACH_VERSION}_linux_amd64/pachctl /opt/app-root/bin/
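
The download cell above builds the release URL from the version string. As a minimal sketch, the same URL pattern can be assembled in plain Python, which makes it easy to print and check the URL before downloading. The helper name `pachctl_download_url` is hypothetical; the pattern itself is copied from the command above.

```python
def pachctl_download_url(version: str) -> str:
    """Build the GitHub release URL for the Linux amd64 pachctl tarball."""
    version = version.lstrip("v")  # accept both '2.1.6' and 'v2.1.6'
    return (
        "https://github.com/pachyderm/pachyderm/releases/download/"
        f"v{version}/pachctl_{version}_linux_amd64.tar.gz"
    )


# Print the URL for the version used in this notebook.
print(pachctl_download_url("2.1.6"))
```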

## Create a Pachyderm Context to use Pachyderm 
OpenShift automatically gives every service a cluster-internal hostname in the following format:
`%ServiceName%.%Project%.svc.cluster.local`
In this case, `pachd` is the service name and `opendatahub` is the project name, so you can reach Pachyderm at `pachd.opendatahub.svc.cluster.local` on port 30650.

In [None]:
# Create a new context with pachd_address < pachd.%Project_Name%.svc.cluster.local:%Port >
!echo '{"pachd_address":"pachd.opendatahub.svc.cluster.local:30650"}' | pachctl config set context pachyderm

# Switch the pachd context to a new context
!pachctl config set active-context pachyderm

# Verify active context is the new context
!pachctl config get active-context

# Check pachctl/pachd version
!pachctl version
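
If your Pachyderm instance runs in a different project, the JSON passed to `pachctl config set context` changes accordingly. The sketch below builds that JSON from the hostname format described above; `pachd_context_json` is a hypothetical helper, and the service name, project, and port are the values used in this notebook.

```python
import json


def pachd_context_json(service: str, project: str, port: int) -> str:
    """Return the JSON document that is piped into `pachctl config set context`."""
    address = f"{service}.{project}.svc.cluster.local:{port}"
    return json.dumps({"pachd_address": address})


# Values from this notebook; replace the project name to match your cluster.
print(pachd_context_json("pachd", "opendatahub", 30650))
```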

## Create a source repository "images"
`images` is the source repository in Pachyderm. Whenever a new `png` image is pushed into the `images` repository, the pipeline processes the original image and pushes the result to a new repository, `edges`.

In [None]:
# Create repo `images`
!pachctl create repo images

# Check the images repo
!pachctl list repo

## Push an image to source repo and check the commits

In [None]:
# Put a file to the repo `images`
!pachctl put file images@master:liberty.png -f https://raw.githubusercontent.com/Jooho/pachyderm-operator-manifests/master/notebooks/liberty.png

# Check the repo used storage
!pachctl list repo

# List commits on the repo `images`
!pachctl list commit images

## Display the image that is pushed into images repo

In [None]:
# See the original image
!pachctl get file images@master:liberty.png -o original_liberty.png
from IPython.display import Image, display
Image(filename='original_liberty.png') 

## Create an Edge Pipeline
Create an `edges` pipeline that runs edge detection on every image pushed into the source repo `images`.
After you create the pipeline, check that the pipeline pod is Ready in the project. The pod performs the edge detection, so the processed image does not appear in the `edges` repository until the pod is Ready.

If you are curious about the pipeline source, see [the Pachyderm tutorial](https://docs.pachyderm.com/latest/getting_started/beginner_tutorial/#create-a-pipeline).

In [None]:
# Create a pipeline
!pachctl create pipeline -f https://raw.githubusercontent.com/pachyderm/pachyderm/master/examples/opencv/edges.pipeline.json

# Check pipeline created jobs
!pachctl list job
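
For orientation, a pipeline spec of the kind stored in `edges.pipeline.json` has roughly the shape sketched below. This is an illustration of the general Pachyderm pipeline spec layout, not a copy of the file: the glob pattern, the command, and the image name are assumptions, so check the linked JSON for the real values.

```python
import json

# Sketch of a pipeline spec; values marked "assumed" or "hypothetical"
# are placeholders and may differ from the real edges.pipeline.json.
edges_spec = {
    "pipeline": {"name": "edges"},  # the output repo gets the same name
    "input": {
        "pfs": {
            "repo": "images",  # source repo created earlier in this notebook
            "glob": "/*",      # assumed: each top-level file is processed separately
        }
    },
    "transform": {
        "cmd": ["python3", "/edges.py"],   # assumed entry point
        "image": "example/opencv-edges",   # hypothetical image name
    },
}

print(json.dumps(edges_spec, indent=2))
```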

## Get the processed image from edges repository
After the edges pipeline pod is Ready, the images in the source repo `images` are processed and pushed into the `edges` repo.
You can then download the processed image from `edges`.
Before you run the following cell, make sure the pipeline pod is Ready:
`oc get pod -l pipelineName=edges -n opendatahub`

In [None]:
# Check that a new repo `edges`, which holds the processed image, exists after the pipeline was created
!pachctl list repo

# See the processed image
!pachctl get file edges@master:liberty.png -o edge_liberty.png

Image(filename='edge_liberty.png')   
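
Because the processed image only appears once the pipeline pod is Ready, running the cell above too early fails. One way to avoid checking by hand is a small polling loop, sketched below. Both `wait_until` and `edge_file_ready` are hypothetical helpers, and treating a non-zero exit code of `pachctl get file` as "not ready yet" is an assumption about how the CLI reports a missing file.

```python
import subprocess
import time
from typing import Callable


def wait_until(check: Callable[[], bool], timeout: float = 300.0, interval: float = 5.0) -> bool:
    """Call `check` repeatedly until it returns True or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


def edge_file_ready() -> bool:
    """Try to download the processed image; assume exit code 0 means it exists."""
    result = subprocess.run(
        ["pachctl", "get", "file", "edges@master:liberty.png", "-o", "edge_liberty.png"],
        capture_output=True,
    )
    return result.returncode == 0


# Usage in the notebook (not executed here):
# wait_until(edge_file_ready)
```

Passing the check as a function keeps the loop generic, so the same helper could wrap a different readiness test if this one does not fit your setup.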

## Push images to test the pipeline
Your pipeline and repositories are now configured, which means the pipeline runs whenever an image is pushed into the `images` repository. Let's push a few test images into `images`.

In [None]:
# Try more images. Put 2 images into repo `images`
!pachctl put file images@master:AT-AT.png -f https://raw.githubusercontent.com/Jooho/pachyderm-operator-manifests/master/notebooks/AT-AT.png
!pachctl put file images@master:kitten.png -f https://raw.githubusercontent.com/Jooho/pachyderm-operator-manifests/master/notebooks/kitten.png

# Check that new jobs were created
!pachctl list job

## Verify that the images were pushed into the source repository

In [None]:
# See original images
!pachctl get file images@master:AT-AT.png -o original_at_at.png
!pachctl get file images@master:kitten.png  -o original_kitten.png

listOfImageNames = ['original_at_at.png',
                    'original_kitten.png']

for imageName in listOfImageNames:
    display(Image(filename=imageName))

## Check processed images from target repository.

In [None]:
# See edge images
!pachctl get file edges@master:AT-AT.png -o edge_at_at.png
!pachctl get file edges@master:kitten.png  -o edge_kitten.png

listOfImageNames = ['edge_at_at.png',
                    'edge_kitten.png']

for imageName in listOfImageNames:
    display(Image(filename=imageName))
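
The last two cells repeat the same `pachctl get file` pattern for each image and repo. As a rough sketch, the commands can be generated from a list of file names; `get_file_command` is a hypothetical helper, and the local naming rule (lowercase, `-` replaced by `_`) simply mirrors the file names used above.

```python
from typing import List


def get_file_command(repo: str, path: str, prefix: str) -> List[str]:
    """Build a `pachctl get file` command that saves to '<prefix>_<name>' locally."""
    local_name = f"{prefix}_{path.lower().replace('-', '_')}"
    return ["pachctl", "get", "file", f"{repo}@master:{path}", "-o", local_name]


# Print the commands for the original and the processed version of each image.
for name in ["AT-AT.png", "kitten.png"]:
    print(" ".join(get_file_command("images", name, "original")))
    print(" ".join(get_file_command("edges", name, "edge")))
```

Each returned list can be passed to `subprocess.run` if you prefer to run the downloads from Python instead of `!` shell cells.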
