# Ray

This tutorial demonstrates submitting and monitoring a job that runs on an external [Ray](https://www.ray.io/) cluster. Jobs can be submitted to an external Ray cluster in three ways: by using the Ray pod directly, through the Bridge operator, or through a Kubeflow Pipelines script that uses the Ray pod. The tutorial covers the setup and deployment for running a test script with each of the three implementations, including how to use S3 for file upload and download.

--------------------------------------------------------------------------------------------------------------------

##  Setup 

#### S3
Create the S3 bucket with input files

- Create a test bucket on S3 called "mybucket" and upload the files parameters.json, metadata.json and code.py to /mybucket/ray/
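
If you prefer to script the upload rather than use a web console, the step above can be sketched in Python. This is a minimal sketch, not part of the tutorial's tooling: the helper name `upload_inputs` is hypothetical, and it assumes an S3 client object that exposes `upload_file(Filename, Bucket, Key)`, as the boto3 S3 client does.

```python
from pathlib import Path

# Input files named in this tutorial.
INPUT_FILES = ("parameters.json", "metadata.json", "code.py")


def upload_inputs(client, local_dir, bucket="mybucket", prefix="ray"):
    """Upload the tutorial input files to <bucket>/<prefix>/ and return their keys.

    `client` is any object with an upload_file(Filename, Bucket, Key) method
    (assumption: a boto3 S3 client).
    """
    keys = []
    for name in INPUT_FILES:
        key = f"{prefix}/{name}"
        client.upload_file(Filename=str(Path(local_dir) / name), Bucket=bucket, Key=key)
        keys.append(key)
    return keys


# Example use (assumes boto3 is installed and credentials are configured):
# import boto3
# s3 = boto3.client("s3", endpoint_url="https://<your-s3-endpoint>")
# upload_inputs(s3, ".")
```

Passing the client in as an argument keeps the helper independent of how the endpoint and credentials are configured.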


#### Create environment variables

For these tests we need to specify the S3 endpoint, the resource (Ray cluster) endpoint and the S3 bucket name. If the job script, parameter file and metadata file are in S3, then we also need to provide their locations in the form ```bucket:folder/filename```.

In [None]:
%env RESOURCE_URL=http://10.0.57.51:8265
%env ENDPOINT=minio-kubeflow.apps.adp-rosa-2.5wcf.p1.openshiftapps.com
%env BUCKET=mybucket
%env JOBSCRIPT=mybucket:ray/code.py
%env SCRIPT_MD=mybucket:ray/metadata.json
%env PARAMS=mybucket:ray/parameters.json
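
To make the ```bucket:folder/filename``` convention concrete, here is a small sketch of how such a value splits into a bucket name and an object key. The function name `split_s3_location` is hypothetical, and this is only an illustration of the format, not the parser used by the Ray pod or the Bridge operator.

```python
def split_s3_location(location):
    """Split 'bucket:folder/filename' into (bucket, key)."""
    # Split on the first ':' only, so the key itself may contain ':'.
    bucket, sep, key = location.partition(":")
    if not sep or not bucket or not key:
        raise ValueError(f"expected 'bucket:folder/filename', got {location!r}")
    return bucket, key


print(split_s3_location("mybucket:ray/code.py"))  # ('mybucket', 'ray/code.py')
```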

#### Create the S3 and Ray secrets needed by the pod

Edit the S3 secret yaml file with the credentials needed to access S3. Then create the secrets in the namespace you wish to run jobs in. For example, to run in ```bridge-operator-system```, set the secret names and apply the yaml files as follows:

In [None]:
# Define env names for secrets to be used for all jobs
%env S3_SECRET=mysecret-s3
%env RESOURCE_SECRET=mysecret

!sed -i '' "s#{{S3_SECRET}}#$S3_SECRET#g" ../core/secrets/s3secret.yaml 
!sed -i '' "s#{{RESOURCE_SECRET}}#$RESOURCE_SECRET#g" ../core/secrets/raysecret.yaml 
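
The `sed -i ''` form used above is the BSD/macOS syntax; GNU sed on Linux expects `-i` without the empty argument. As a portable alternative, the same ```{{NAME}}``` substitution can be sketched in Python. The helper name `fill_placeholders` is hypothetical, and the sketch assumes placeholders consist only of letters, digits and underscores, as in the yaml files used here.

```python
import os
import re
from pathlib import Path

# Matches {{NAME}} where NAME is letters, digits or underscores.
PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")


def fill_placeholders(path, values=None):
    """Replace {{NAME}} placeholders in the file at `path`, in place.

    Values come from `values` (default: the process environment).
    Placeholders with no matching value are left untouched.
    """
    values = os.environ if values is None else values
    text = Path(path).read_text()
    filled = PLACEHOLDER.sub(lambda m: values.get(m.group(1), m.group(0)), text)
    Path(path).write_text(filled)
    return filled


# Example use, after the %env cell above has set the variables:
# fill_placeholders("../core/secrets/s3secret.yaml")
# fill_placeholders("../core/secrets/raysecret.yaml")
```

Leaving unknown placeholders untouched makes it possible to fill a file in several passes, as the sed commands do.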


In [15]:
!kubectl apply -f ../core/secrets/raysecret.yaml -n bridge-operator-system
!kubectl apply -f ../core/secrets/s3secret.yaml -n bridge-operator-system

secret/mysecret created
secret/mysecret-s3 created


--------------------------------------------------------------------------------------------------------------------

## 1. Testing the Ray pod directly

Testing of individual pods can be done directly without invoking the Bridge operator.

For Ray the ```samples/tests/ray/ray_job.yaml``` specifies
- the pod image to use ```quay.io/ibmdpdev/ray-pod:v0.0.1```
- the configmap ```rayjob-bridge-cm```

In [None]:
!sed -i '' "s#{{S3_SECRET}}#$S3_SECRET#g" ../test/ray/ray_job.yaml
!sed -i '' "s#{{RESOURCE_SECRET}}#$RESOURCE_SECRET#g" ../test/ray/ray_job.yaml


The configmap yamls are in ```samples/tests/ray/``` and in them you must specify
- the address of the Ray cluster
- the Minio endpoint
- the bucket name

Edit the yamls and create the configmap. Then submit the job:

In [None]:
!sed -i '' "s#{{BUCKET}}#$BUCKET#g" ../test/ray/ray_sample0_cm.yaml 
!sed -i '' "s#{{ENDPOINT}}#$ENDPOINT#g" ../test/ray/ray_sample0_cm.yaml 
!sed -i '' "s#{{RESOURCE_URL}}#$RESOURCE_URL#g" ../test/ray/ray_sample0_cm.yaml
!sed -i '' "s#{{S3_SECRET}}#$S3_SECRET#g" ../test/ray/ray_sample0_cm.yaml

In [None]:
!kubectl apply -f ../test/ray/ray_sample0_cm.yaml 
!kubectl apply -f ../test/ray/ray_job.yaml 

In [None]:
#Monitor the job
!kubectl logs hpcjob-pod
!kubectl describe pod hpcjob-pod
# Once the job completes the log file will be in the S3 bucket specified in the configmap
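
Rather than re-running the monitoring commands by hand, one option is to poll the pod phase until it reaches a terminal state. The following is a minimal sketch under stated assumptions: the helper names `kubectl_pod_phase` and `wait_for_pod` are hypothetical, `kubectl` is assumed to be on the PATH and pointed at the right cluster, and the pod is assumed to end in the standard Kubernetes phase `Succeeded` or `Failed`.

```python
import subprocess
import time

# Kubernetes pod phases after which the pod will not run any further.
TERMINAL_PHASES = {"Succeeded", "Failed"}


def kubectl_pod_phase(pod, namespace=None):
    """Return the pod's .status.phase as reported by kubectl."""
    cmd = ["kubectl", "get", "pod", pod, "-o", "jsonpath={.status.phase}"]
    if namespace:
        cmd += ["-n", namespace]
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout.strip()


def wait_for_pod(pod, get_phase=kubectl_pod_phase, interval=5.0, timeout=600.0):
    """Poll until the pod reaches a terminal phase, then return that phase."""
    deadline = time.monotonic() + timeout
    while True:
        phase = get_phase(pod)
        if phase in TERMINAL_PHASES:
            return phase
        if time.monotonic() >= deadline:
            raise TimeoutError(f"pod {pod} still in phase {phase!r} after {timeout}s")
        time.sleep(interval)


# Example use:
# print(wait_for_pod("hpcjob-pod"))
```

The `get_phase` argument is injectable so the polling loop can be exercised without a cluster.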

## 2. Bridge operator for Ray

There are two sample yaml files in ```samples/core/operator``` for running Ray jobs using the Bridge operator.
Before running either job edit the files so that 

- resourceURL corresponds to your Ray cluster
- S3storage: endpoint: corresponds to your S3 endpoint
- S3upload: bucket: corresponds to your bucket in S3

### Inline script and job parameters example 
The ```job0ray.yaml``` submits a Python job script that is given 'inline'; the log output from the job is saved into the S3upload bucket at ```<BUCKET_NAME>/bridgejob-ray```. The input variables to the Python script are defined in the ```jobparameters``` dictionary, and environment settings and package installations can be specified in ```scriptmetadata```.
To edit the yaml and run the job:

In [None]:
!sed -i '' "s#{{BUCKET}}#$BUCKET#g" ../core/operator/job0ray.yaml 
!sed -i '' "s#{{ENDPOINT}}#$ENDPOINT#g" ../core/operator/job0ray.yaml 
!sed -i '' "s#{{RESOURCE_URL}}#$RESOURCE_URL#g" ../core/operator/job0ray.yaml
!sed -i '' "s#{{S3_SECRET}}#$S3_SECRET#g" ../core/operator/job0ray.yaml
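
Before applying the yaml, it can be useful to confirm that the substitutions above left no ```{{NAME}}``` placeholder behind, for example because an environment variable was never set. A minimal sketch follows; the function name `unresolved_placeholders` is hypothetical and the yaml fragment is illustrative only, not the actual layout of ```job0ray.yaml```.

```python
import re


def unresolved_placeholders(text):
    """Return the sorted, de-duplicated {{NAME}} placeholders still in `text`."""
    return sorted(set(re.findall(r"\{\{\s*(\w+)\s*\}\}", text)))


# Illustrative fragment only (assumption: not the real file contents).
sample = """
resourceURL: http://10.0.57.51:8265
endpoint: {{ENDPOINT}}
bucket: {{BUCKET}}
"""
print(unresolved_placeholders(sample))  # ['BUCKET', 'ENDPOINT']

# Example use on a real file:
# from pathlib import Path
# print(unresolved_placeholders(Path("../core/operator/job0ray.yaml").read_text()))
```

An empty list means every placeholder was filled in.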

In [18]:
!kubectl apply -f ../core/operator/job0ray.yaml 

bridgejob.bridgeoperator.ibm.com/bridgejob-ray created


In [None]:
#check the pod logs
!kubectl describe pod bridgejob-ray-bridge-pod
!kubectl logs bridgejob-ray-bridge-pod

### Script and job parameters in S3 example
The ```job1ray.yaml``` submits a Python job script which is in S3 at ```<BUCKET_NAME>/ray/code.py```. The log output from the job is saved into the S3upload bucket at ```<BUCKET_NAME>/bridgejob-ray```. The input variables to the Python script are defined in the ```parameters.json``` file in S3 at ```<BUCKET_NAME>/ray/```, and environment settings and package installations are specified in ```metadata.json``` in S3 at ```<BUCKET_NAME>/ray/```.
To run the job:

In [None]:
!sed -i '' "s#{{BUCKET}}#$BUCKET#g" ../core/operator/job1ray.yaml 
!sed -i '' "s#{{ENDPOINT}}#$ENDPOINT#g" ../core/operator/job1ray.yaml 
!sed -i '' "s#{{RESOURCE_URL}}#$RESOURCE_URL#g" ../core/operator/job1ray.yaml 
!sed -i '' "s#{{JOBSCRIPT}}#$JOBSCRIPT#g" ../core/operator/job1ray.yaml 
!sed -i '' "s#{{SCRIPT_MD}}#$SCRIPT_MD#g" ../core/operator/job1ray.yaml 
!sed -i '' "s#{{PARAMS}}#$PARAMS#g" ../core/operator/job1ray.yaml 
!sed -i '' "s#{{S3_SECRET}}#$S3_SECRET#g" ../core/operator/job1ray.yaml

In [36]:
!kubectl apply -f ../core/operator/job1ray.yaml 

bridgejob.bridgeoperator.ibm.com/bridgejob-ray unchanged


In [None]:
#check the pod logs
!kubectl describe pod bridgejob-ray-bridge-pod
!kubectl logs bridgejob-ray-bridge-pod

--------------------------------------------------------------------------------------------------------------------

## 3. Kubeflow Pipelines

These examples assume you have access to a Kubeflow Pipelines (KFP) with Tekton installation where you can submit and run jobs or upload pipelines to the KFP UI. See e.g. ```bridge-operator/kubeflow/```.

The credentials for S3 and the external resource should be saved to the kubeflow namespace:

In [19]:
!kubectl apply -f ../core/secrets/raysecret.yaml -n kubeflow
!kubectl apply -f ../core/secrets/s3secret.yaml -n kubeflow

secret/mysecret configured
secret/mysecret-s3 configured


The implementation with Kubeflow Pipelines uses a general ```bridge-pipeline``` given in ```kubeflow/bridge_pipeline_handler.py```, and the specific implementation for Ray is in ```kubeflow/implementations/ray_invoker.py```.

1. Compile the bridge pipeline:

``` $ python bridge_pipeline_handler.py ```

2. Upload the generated yaml to the KFP UI > pipelines.


3. Run ```kubeflow/implementations/ray_invoker.py```, providing:

- a host endpoint for KFP
- a ```RESOURCEURL``` for the Ray cluster
- an ```s3endpoint``` for S3
- an ```s3uploadbucket``` name
- a bucket name in ```jobparams```, ```script``` and ```scriptmd``` if ```scriptlocation``` and ```scriptextraloc``` are 'S3'

In [None]:
# submit the job
!python ../../kubeflow/implementations/ray_invoker.py --kfphost=<KFP_HOST> --resource_url=<RESOURCE_URL> \
                                                      --s3endpoint=<s3ENDPOINT> --s3uploadbucket=<BUCKET> \
                                                      --script=<BUCKET:SCRIPT> --scriptmd=<BUCKET:SCRIPTMD> \
                                                      --jobparams=<BUCKET:JOBPARAMS>
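
The angle-bracket placeholders in the command above have to be replaced by real values before the cell will run. One way to avoid hand-editing is to assemble the argument list in Python from the values defined earlier in this tutorial. This is a sketch only: the function name `ray_invoker_command` is hypothetical, the flag names are copied from the command above, and the KFP host is left for you to supply.

```python
import shlex


def ray_invoker_command(kfphost, resource_url, s3endpoint, s3uploadbucket,
                        script, scriptmd, jobparams,
                        invoker="../../kubeflow/implementations/ray_invoker.py"):
    """Build the argument list for ray_invoker.py using the flags shown above."""
    return [
        "python", invoker,
        f"--kfphost={kfphost}",
        f"--resource_url={resource_url}",
        f"--s3endpoint={s3endpoint}",
        f"--s3uploadbucket={s3uploadbucket}",
        f"--script={script}",
        f"--scriptmd={scriptmd}",
        f"--jobparams={jobparams}",
    ]


cmd = ray_invoker_command(
    kfphost="<KFP_HOST>",  # replace with your KFP host endpoint
    resource_url="http://10.0.57.51:8265",
    s3endpoint="minio-kubeflow.apps.adp-rosa-2.5wcf.p1.openshiftapps.com",
    s3uploadbucket="mybucket",
    script="mybucket:ray/code.py",
    scriptmd="mybucket:ray/metadata.json",
    jobparams="mybucket:ray/parameters.json",
)
print(shlex.join(cmd))  # shell-quoted form, for inspection

# To run it from the notebook:
# import subprocess
# subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting issues with values such as URLs.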

Output from the KFP job can be viewed in the UI, and the logs are uploaded to S3 at ```<BUCKET>/rayjob-kfp```.

--------------------------------------------------------------------------------------------------------------------