## Steps to reproduce the Job Agent bug
To build the codeflare-sdk wheel file that is installed below, run the following in your terminal:
```bash
poetry lock
poetry install
poetry build
```

Then run the cell below to install it. The install pulls in a dev build of the KubeRay Python client, published here: https://test.pypi.org/project/odh-kuberay-client/#history. The referenced KubeRay Python client version is working as intended.

In [None]:
%pip install --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple/ dist/codeflare_sdk-0.0.0.dev0-py3-none-any.whl --force-reinstall

Below is the code I used to confirm that the KubeRay Python client is functional. It uses the cluster API to create a cluster, and submitting a job to that cluster succeeds. You don't need to run this.

In [None]:

from odh_kuberay_client import kuberay_cluster_api
from odh_kuberay_client.utils import kuberay_cluster_builder

cluster_name = "working"
namespace = "default"

director = kuberay_cluster_builder.Director()
cluster_api = kuberay_cluster_api.RayClusterApi()

# Build a small cluster
cluster_body = director.build_small_cluster(
    name=cluster_name,
    k8s_namespace=namespace,
    labels={"ray.io/cluster": cluster_name},
)

created_cluster = cluster_api.create_ray_cluster(
    body=cluster_body, k8s_namespace=namespace
)

The cell below sets the stock Ray image explicitly instead of relying on the SDK default. Creating this cluster and submitting a job to it works. To reproduce the bug, comment out the `image="ray-project/ray:2.47.1"` line and submit the job again. Before resubmitting, make sure the RayJob CR from the working job submission has been deleted.

Note: you will need a Kind cluster with KubeRay installed for this to work.
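
One way to clear the earlier RayJob CR before retrying is through the official `kubernetes` Python client. The sketch below is not part of the original reproduction. It assumes the `kubernetes` package is installed, that the current kube context points at the Kind cluster, and that the RayJob is served as `ray.io/v1` with the plural `rayjobs` (matching the `apiVersion` used in this notebook). The helper names `rayjob_ref` and `delete_rayjob` are hypothetical.

```python
def rayjob_ref(name, namespace="default"):
    """Return the keyword arguments that identify a RayJob custom resource."""
    return {
        "group": "ray.io",
        "version": "v1",
        "namespace": namespace,
        "plural": "rayjobs",  # assumed plural name of the RayJob CRD
        "name": name,
    }


def delete_rayjob(name, namespace="default"):
    """Delete the RayJob CR. Returns False if it did not exist."""
    # Imported lazily so rayjob_ref() can be used without the package installed.
    from kubernetes import client, config
    from kubernetes.client.rest import ApiException

    # Assumption: the local kubeconfig's current context is the Kind cluster.
    config.load_kube_config()
    api = client.CustomObjectsApi()
    try:
        api.delete_namespaced_custom_object(**rayjob_ref(name, namespace))
    except ApiException as exc:
        if exc.status == 404:
            return False  # nothing to delete
        raise
    return True
```

Calling `delete_rayjob("test-job")` before the second submission should have the same effect as `kubectl delete rayjob test-job -n default`.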

In [None]:
from codeflare_sdk import Cluster, ClusterConfiguration

cluster = Cluster(ClusterConfiguration(
    name='test-cluster',
    namespace='default',
    head_cpu_requests='1',
    head_cpu_limits='2',
    head_memory_requests=4,
    head_memory_limits=5,
    head_extended_resource_requests={'nvidia.com/gpu':0},
    worker_extended_resource_requests={'nvidia.com/gpu':0},
    num_workers=1,
    worker_cpu_requests='1',
    worker_cpu_limits='2',
    worker_memory_requests=3,
    worker_memory_limits=4,
    image="ray-project/ray:2.47.1"  # this image works; omit this line to use our default image, which does not work
))

cluster.apply()

The cell below creates a pod that submits the job. If the submission succeeds, the pod is marked as Completed and you can view the job in the Dashboard as normal. If it fails, the pod fails and the RayJob CR keeps creating new pods in a loop.

In [None]:
from odh_kuberay_client.kuberay_job_api import RayjobApi

rayjob_api = RayjobApi()

job_body = {
    "apiVersion": "ray.io/v1",
    "kind": "RayJob",
    "metadata": {
        "name": "test-job",
        "namespace": "default",
        "labels": {
            "app.kubernetes.io/name": "test-job",
            "app.kubernetes.io/managed-by": "kuberay",
        },
    },
    "spec": {
        "clusterSelector": {
            "ray.io/cluster": "test-cluster",
        },
        "entrypoint": 'python -c "import time; time.sleep(20)"',
        "submissionMode": "K8sJobMode",
    },
}

rayjob_api.submit_job(job=job_body, k8s_namespace="default")
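
To tell the two outcomes apart without watching pods, the RayJob's status can be read back from the API server. This is a minimal sketch, not part of the original reproduction. It assumes the `kubernetes` package is installed and that the RayJob status exposes `jobStatus` and `jobDeploymentStatus` fields, as in the KubeRay RayJob CRD. The helper names `summarize_rayjob` and `fetch_rayjob` are hypothetical.

```python
def summarize_rayjob(rayjob):
    """Extract the name and status fields from a RayJob object (a plain dict)."""
    status = rayjob.get("status") or {}
    return {
        "name": rayjob.get("metadata", {}).get("name"),
        # Assumed field names; empty or missing values are reported as UNKNOWN.
        "job_status": status.get("jobStatus") or "UNKNOWN",
        "deployment_status": status.get("jobDeploymentStatus") or "UNKNOWN",
    }


def fetch_rayjob(name, namespace="default"):
    """Fetch the RayJob CR from the cluster as a dict."""
    # Imported lazily so summarize_rayjob() works without the package installed.
    from kubernetes import client, config

    # Assumption: the local kubeconfig's current context is the Kind cluster.
    config.load_kube_config()
    return client.CustomObjectsApi().get_namespaced_custom_object(
        group="ray.io",
        version="v1",
        namespace=namespace,
        plural="rayjobs",
        name=name,
    )
```

For example, `summarize_rayjob(fetch_rayjob("test-job"))` can be run a few times after submission. In the failing case the status would be expected to stay short of a successful completion while new submitter pods keep appearing.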