## **Setting up the Ray cluster**

**Let's first log in to the OpenShift cluster and switch to the project**

In [2]:
! oc login --token=your-token --server=https://your-cluster

Logged into "https://your-cluster" as "opentlc-mgr" using the token provided.

You have access to 91 projects, the list has been suppressed. You can list all projects with 'oc projects'

Using project "default".


In [None]:
! oc project default

**We will import the CodeFlare pieces from codeflare-sdk**

In [3]:
from codeflare_sdk.cluster.cluster import Cluster, ClusterConfiguration

### **Request aggregated resources using CodeFlare**

**cluster.up() will create an AppWrapper custom resource that requests aggregated resources and, once they are available,
creates a Ray cluster with a Ray head and one Ray worker node (each represented by a pod), matching the min_worker=1, max_worker=1 configuration below.
If resources are not available, the request waits in a queue and the Ray cluster is deployed as soon as resources free up.**

In [4]:
# Define the cluster configuration; this writes the AppWrapper YAML, which cluster.up() submits later
cluster = Cluster(ClusterConfiguration(name='road-ray', min_worker=1, max_worker=1, min_cpus=2, max_cpus=2, min_memory=8, max_memory=8, gpu=0))

Written to: road-ray.yaml


In [25]:
cluster.up()

In [6]:
cluster.is_ready()

(<CodeFlareClusterStatus.READY: 1>, True)
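
Because the request may sit in a queue before the cluster is deployed, it can be convenient to poll for readiness instead of re-running the cell by hand. The following is a minimal sketch, not part of the SDK: `wait_until_ready` is a hypothetical helper, and it assumes only that the check returns a `(status, ready)` tuple like the `cluster.is_ready()` output above.

```python
import time


def wait_until_ready(check, timeout_s=600.0, poll_s=5.0, sleep=time.sleep):
    """Poll check() until it reports ready, or raise after timeout_s seconds.

    Assumption: check() returns a (status, ready) tuple, as the
    cluster.is_ready() output in this notebook does.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        status, ready = check()
        if ready:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"cluster not ready after {timeout_s}s (last status: {status})"
            )
        sleep(poll_s)


# Intended use in this notebook (not executed here):
#   wait_until_ready(cluster.is_ready)
```

The `sleep` parameter exists only so the helper can be exercised without real delays.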

In [7]:
cluster.status()

<RayClusterStatus.READY: 'ready'>

In [8]:
ray_cluster_uri = cluster.cluster_uri()

**Below we will go ahead and connect to this cluster so that we can run our code on it.**

In [9]:
#before proceeding make sure the cluster exists and the uri is not empty
assert ray_cluster_uri, "Ray cluster needs to be started and set before proceeding"

import ray

# reset the Ray context in case there's already one
ray.shutdown()

# install additional libraries that will be required for this training
runtime_env = {"pip": ["scikit-learn"]}

# establish connection to the Ray cluster
ray.init(address=f'{ray_cluster_uri}', runtime_env=runtime_env)

print("Ray cluster is up and running: ", ray.is_initialized())

Ray cluster is up and running:  True
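
The `runtime_env` above installs whatever scikit-learn version pip resolves on the workers, which may differ from the version in the notebook environment. One way to keep the two aligned is to reuse the pins from the requirements file. The sketch below is an assumption-laden illustration: `pinned_runtime_env` is a hypothetical helper, and the example requirements text is made up from the versions pip reports in the Load Data section, not read from the real `requirements.txt`.

```python
def pinned_runtime_env(requirements_text, packages):
    """Build a Ray runtime_env dict that reuses `name==version` pins.

    Only exact pins (`name==version`) are kept; `packages` names the
    subset that the remote workers actually need.
    """
    wanted = {p.lower() for p in packages}
    pins = []
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" not in line:
            continue
        name = line.split("==", 1)[0].strip()
        if name.lower() in wanted:
            pins.append(line)
    return {"pip": pins}


# Hypothetical requirements content (assumption, for illustration only).
example_requirements = "pandas==1.5.1\nscikit-learn==1.1.3\nFlask==2.2.2\n"
runtime_env = pinned_runtime_env(example_requirements, ["scikit-learn"])
print(runtime_env)  # {'pip': ['scikit-learn==1.1.3']}
```

The resulting dictionary has the same shape as the `runtime_env` passed to `ray.init` above.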


## Load Data

In [13]:
!pip install -r requirements.txt
import joblib
import pandas as pd

df = pd.read_csv('road_roughness_data.csv')
print(df)

Collecting pandas==1.5.1
  Downloading pandas-1.5.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (12.2 MB)
Collecting scikit-learn==1.1.3
  Downloading scikit_learn-1.1.3-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (31.2 MB)
Collecting Flask==2.2.2
  Downloading Flask-2.2.2-py3-none-any.whl (101 kB)
Collecting itsdangerous>=2.0
  Downloading itsdangerous-2.1.2-py3-none-any.whl (15 kB)
Installing collected packages: scikit-learn, itsdangerous, pandas, Flask
  Attempting uninstall: scikit-learn
    Found existing installation: scikit-learn 1.1.1
    Uninstalling scikit-learn-1.1.1:
ERROR: Could not install packages due to an OSError: [Errno 13] Permission denied: 'COPYING'
Check the permissions.
You should consider upgrading vi

## Features

In [14]:
df.iloc[:,:-1]

Unnamed: 0,acc_x,acc_y,acc_z,speed,gyro_x,gyro_y,gyro_z
0,0.365116,0.167893,9.793961,0.009128,-0.133896,-0.018883,0.138092
1,0.392649,0.176273,9.771216,0.009128,-0.027084,-0.003624,0.000763
2,0.409408,0.181062,9.732909,0.009128,0.125504,-0.186729,-0.090790
3,0.371101,0.164302,9.749668,0.009128,-0.088120,-0.034142,0.046539
4,0.390255,0.159514,9.869378,0.009128,-0.179672,0.118446,-0.182343
...,...,...,...,...,...,...,...
144031,-0.527921,-0.322918,9.583271,0.005715,-0.332260,-0.095177,-0.060272
144032,-0.663194,-0.575506,9.433633,0.005715,-0.240707,-0.308800,-0.182343
144033,-0.375890,-0.245106,9.957964,0.005715,0.064468,-0.156212,0.000763
144034,-0.385466,-0.091877,9.840648,0.005715,-0.423813,0.820351,-0.243378


## Target variable

In [15]:
df.iloc[:,-1:]

Unnamed: 0,road_condition
0,2
1,2
2,2
3,2
4,2
...,...
144031,2
144032,2
144033,2
144034,2


## Split data into Train and Test sets

In [16]:
# Import train_test_split function
from sklearn.model_selection import train_test_split

X = df.iloc[:,:-1]
#y = df.iloc[:,-1:]
y = df['road_condition']

# Split dataset into training set and test set
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.80) # test_size=0.80: 20% training and 80% test

In [17]:
# Create Ray object references
X_train_remote, X_test_remote, y_train_remote, y_test_remote = ray.put(X_train), ray.put(X_test), ray.put(y_train), ray.put(y_test)

## Fit Random Forest Classifier to Train set and Run prediction on test data

In [18]:
@ray.remote
def train_fn(X_train, y_train, X_test):
    #Import Random Forest Model
    from sklearn.ensemble import RandomForestClassifier

    #Create a Random Forest classifier with 100 trees
    clf = RandomForestClassifier(n_estimators=100,verbose=1)

    #Train the model using the training set
    clf.fit(X_train,y_train)
    
    #Run prediction on test data and return the results
    y_pred = clf.predict(X_test)
    return y_pred, clf

In [19]:
y_pred, clf = ray.get(train_fn.remote(X_train_remote, y_train_remote, X_test_remote))

[2m[36m(train_fn pid=241)[0m [Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[2m[36m(train_fn pid=241)[0m [Parallel(n_jobs=1)]: Done 100 out of 100 | elapsed:    4.0s finished
[2m[36m(train_fn pid=241)[0m [Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[2m[36m(train_fn pid=241)[0m [Parallel(n_jobs=1)]: Done 100 out of 100 | elapsed:    1.4s finished
https://scikit-learn.org/stable/model_persistence.html#security-maintainability-limitations


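**Training on the Ray cluster is done. After the model has been evaluated and saved, the cluster.down() call at the end of this notebook will delete the Ray cluster, free up its resources, and delete the AppWrapper.**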

## Test model accuracy

In [20]:
#Import scikit-learn metrics module for accuracy calculation
from sklearn import metrics
# Model Accuracy, how often is the classifier correct?
print("Accuracy:",metrics.accuracy_score(y_test, y_pred))

Accuracy: 0.8925617683048538
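
Accuracy is the fraction of test samples whose predicted label matches the true label, and a single number can hide how individual classes behave. The sketch below recomputes accuracy and a per-class recall with the standard library only. The label lists are made-up illustration data, not values from `road_roughness_data.csv`, and the class labels 0 and 1 are assumptions (the outputs above only show the value 2).

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of positions where the prediction equals the true label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_recall(y_true, y_pred):
    """For each true class, the share of its samples predicted correctly."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in sorted(totals)}


# Made-up labels for illustration only.
y_true_demo = [2, 2, 2, 2, 2, 2, 1, 1, 0, 0]
y_pred_demo = [2, 2, 2, 2, 2, 2, 2, 1, 2, 0]

print(accuracy(y_true_demo, y_pred_demo))          # 0.8
print(per_class_recall(y_true_demo, y_pred_demo))  # {0: 0.5, 1: 0.5, 2: 1.0}
```

In this toy data, overall accuracy is 0.8 even though two of the three classes are recognized only half the time, which is why a per-class view can be worth checking alongside `metrics.accuracy_score`.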


## Save model

In [21]:
# save the model to disk
filename = 'road-model.joblib'
joblib.dump(clf, filename)

['road-model.joblib']
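
The training output above links to scikit-learn's notes on the security and maintainability limits of model persistence; a saved model is generally only safe to load with compatible library versions. A simple precaution is to store the versions next to the model file. This is a minimal sketch under stated assumptions: `write_model_metadata` and the `.meta.json` suffix are hypothetical names chosen for illustration, not a scikit-learn or joblib convention.

```python
import json
import platform
from importlib import metadata
from pathlib import Path


def write_model_metadata(model_path, packages=("scikit-learn", "joblib")):
    """Write a JSON sidecar recording Python and package versions.

    Packages that are not installed are recorded as None.
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    info = {
        "model_file": Path(model_path).name,
        "python": platform.python_version(),
        "packages": versions,
    }
    sidecar = Path(str(model_path) + ".meta.json")
    sidecar.write_text(json.dumps(info, indent=2))
    return sidecar


# Intended use in this notebook (not executed here):
#   write_model_metadata(filename)
```

Comparing the recorded versions with the loading environment before calling `joblib.load` gives an early hint when they have drifted apart.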

## Load and Test prediction from saved model

In [22]:
# load the model from disk
loaded_model = joblib.load(filename)
result = loaded_model.score(X_test, y_test)
print(result)

[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.


0.8925617683048538


[Parallel(n_jobs=1)]: Done 100 out of 100 | elapsed:    1.4s finished


In [26]:
cluster.down()

In [None]:
!nvidia-smi