The cleanPolicy field of PyTorchJob is incorrectly placed. #966
Comments
Hi @cheyang, considering the incompatibility between the Kubeflow training-operator CRD and the arena code, what are your thoughts?
If there are no objections, I'll start on the fix now.
Signed-off-by: Peter Pan <Peter.Pan@daocloud.io>
@lvxian-kjl @panpan0000 In the previous tf-operator and pytorch-operator, cleanPolicy was directly under spec, but in the training-operator CRD it has been moved to spec.runPolicy, which is a breaking change. To stay compatible with users who rely on the current behavior, arena has not switched directly to the training-operator CRD.
Thank you for your PR, but it would impact existing users. We are also considering how to make arena compatible with the training-operator CRD, for example by adding a new template to the arena charts and choosing between the templates when installing arena.
Thank you, @Syulin7, but people's intuition is that "arena will be the CLI tool to manage the jobs, after installing Kubeflow and its training-operator". May I suggest that:
@panpan0000 Thank you for your suggestion, I agree. Supporting the training-operator CRD and maintaining backward compatibility are both very important. Creating a new branch is one solution, but it may add complexity and cause confusion. I am trying to achieve compatibility with the training-operator CRD by identifying which CRD is installed and using a different template when submitting jobs. In summary, our goal is to support the latest training-operator CRD.
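For illustration, the "identify the CRD, then pick a template" idea could be sketched roughly as below. This is only a sketch under assumptions, not the code that arena actually uses: it assumes the CRD object has already been fetched and parsed into a dict (for example from `kubectl get crd pytorchjobs.kubeflow.org -o json`), that the training-operator CRD publishes a structural schema listing `runPolicy` under `spec`, and that a CRD without such a schema is treated as the legacy layout. The function names and template names are hypothetical.

```python
def crd_has_run_policy(crd, version="v1"):
    """Return True if the CRD schema for `version` declares spec.runPolicy."""
    for v in crd.get("spec", {}).get("versions", []):
        if v.get("name") != version:
            continue
        schema = v.get("schema", {}).get("openAPIV3Schema", {})
        spec_props = schema.get("properties", {}).get("spec", {}).get("properties", {})
        return "runPolicy" in spec_props
    # Version not served, or no schema published: assume the legacy layout.
    return False


def select_template(crd):
    # Template names are hypothetical placeholders, not arena's real chart names.
    return "pytorchjob-run-policy" if crd_has_run_policy(crd) else "pytorchjob-legacy"


# Minimal hand-written CRD fragments for demonstration only.
training_operator_crd = {
    "spec": {"versions": [{
        "name": "v1",
        "schema": {"openAPIV3Schema": {"properties": {
            "spec": {"properties": {"runPolicy": {"type": "object"}}}
        }}},
    }]}
}
legacy_crd = {"spec": {"versions": [{"name": "v1"}]}}

print(select_template(training_operator_crd))
print(select_template(legacy_crd))
```

Since both CRDs share the same apiVersion, the check has to look at the schema rather than the version string.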
So did you mean the old CRD has a different
No, the apiVersion is kubeflow.org/v1 in both cases.
#1024 fixed this issue. If you have any suggestions, please let me know. @panpan0000 @lvxian-kjl
Sorry, I found that the release binary is missing, so I cannot test the update.
@panpan0000 I just updated it, PTAL.
I just verified with v0.9.12 and the issue is gone. Thank you, @Syulin7. (my verification log)
The cleanPolicy, activeDeadlineSeconds, and ttlSecondsAfterFinished within spec should be under spec.runPolicy, not directly under spec.
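As a rough illustration of the relocation described above, the following sketch moves the three fields from `spec` into `spec.runPolicy` on a manifest that has been loaded into a dict. It is an assumption-laden example, not arena's fix: the helper name is hypothetical, and the field names are taken as written in this issue, so the exact spelling should be checked against the installed CRD.

```python
import copy

# Field names as written in this issue; verify the spelling against your CRD
# (upstream Kubeflow may spell the first one cleanPodPolicy).
RUN_POLICY_FIELDS = ("cleanPolicy", "activeDeadlineSeconds", "ttlSecondsAfterFinished")


def move_to_run_policy(manifest, fields=RUN_POLICY_FIELDS):
    """Return a copy of `manifest` with `fields` moved from spec to spec.runPolicy."""
    result = copy.deepcopy(manifest)
    spec = result.setdefault("spec", {})
    for name in fields:
        if name in spec:
            # Create runPolicy lazily; a value already set under runPolicy wins.
            run_policy = spec.setdefault("runPolicy", {})
            run_policy.setdefault(name, spec.pop(name))
    return result


old = {
    "apiVersion": "kubeflow.org/v1",
    "kind": "PyTorchJob",
    "spec": {
        "cleanPolicy": "None",
        "ttlSecondsAfterFinished": 60,
        "pytorchReplicaSpecs": {},
    },
}
new = move_to_run_policy(old)
print(new["spec"])
```

The input manifest is left unmodified, so the old and new layouts can be compared side by side.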