fix: get_cluster logic #476
Fiona-Waters
left a comment
Have run through a notebook on OpenShift and all works as expected. Will run on a kind cluster next. Just a couple of nitpicks at the moment. 🙂
@Bobbins228 Cluster details fetching successfully with Note: Job submission works fine without getting cluster details using get_cluster() function. Error log : |
@Srihari1192
I was able to submit a job without failure. I am unsure what other differences there could be between our setups, or how to recreate the error, unless I missed a step.
I have a local fix for that. I will push later today :)
With the way |
changed the get cluster logic to reflect the new volume mounts added References RHOAIENG-4028
@Bobbins228 Tested with the new changes , |
KPostOffice
left a comment
lgtm other than one small nit
KPostOffice
left a comment
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: KPostOffice
The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |


Issue link
References RHOAIENG-4028
What changes have been made
The addition of the new volumeMounts has caused the get_cluster() function to fail, because it assumes the new volume mounts are the local interactive volume mounts. This makes the function try to recreate the local_interactive related resources, which raises the error
ValueError: ingress_domain is invalid. For creating the client route/ingress please specify an ingress domain
because ingress_domain would not be specified.
Also added a simplified import for the get_cluster() function. It can now be imported like this:
from codeflare_sdk import get_cluster
Verification steps
Run through a demo notebook and create a Ray Cluster on Kind and OpenShift.
Run
cluster = get_cluster(name, namespace)
cluster.down() and other cluster. commands should work.
Checks
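The verification steps above could be scripted roughly as follows. This is a minimal sketch, not part of the PR: the helper name check_get_cluster is hypothetical, and get_cluster is passed in as an argument so the helper can be exercised with a stand-in as well as with the real function imported via from codeflare_sdk import get_cluster. It assumes only what the description states, namely that get_cluster(name, namespace) returns an object with a down() method and that the failure shows up as a ValueError.

```python
from typing import Any, Callable


def check_get_cluster(
    get_cluster: Callable[[str, str], Any], name: str, namespace: str
) -> str:
    """Fetch an existing Ray cluster by name and namespace, then bring it down.

    `get_cluster` is injected (hypothetical design choice) so the check can run
    against the real SDK function or a stand-in.
    Returns "ok" on success, or a message carrying the ValueError text.
    """
    try:
        cluster = get_cluster(name, namespace)
    except ValueError as err:
        # The ingress_domain error described above would surface here.
        return f"get_cluster failed: {err}"
    cluster.down()
    return "ok"


# Usage with the real SDK (requires an authenticated cluster session;
# the cluster name and namespace below are placeholders):
#   from codeflare_sdk import get_cluster
#   print(check_get_cluster(get_cluster, "my-ray-cluster", "default"))
```

Returning a message instead of re-raising keeps the check easy to run cell by cell in a notebook on both Kind and OpenShift.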