Why manifests/base/service.yaml does not include webhook server port (443) in version 1.7.0~1.5.0? #2080

Closed
yeonhooy opened this issue Apr 24, 2024 · 7 comments

@yeonhooy

Why does manifests/base/service.yaml not include the webhook server port (443) in versions 1.7.0~1.4.0?

I tried to use training-operator version 1.7.0 or 1.5.0 (stable versions) instead of the master version.
However, I found that the Service has no webhook port for the training operator, which seems to cause this error:
Internal error occurred: failed calling webhook "validator.tfjob.training-operator.kubeflow.org": Post "https://training-operator.kubeflow.svc:443/validate-kubeflow-org-v1-tfjob?timeout=10s": no service port 443 found for service "training-operator"

Thanks

@show981111

show981111 commented Apr 28, 2024

Same problem.

Edit: I deleted my minikube cluster and started a fresh one. It seems to be working again.

@yeonhooy
Author

Thanks for sharing :)

The missing webhook server port in previous versions may not be a big issue,
because I guess they did not require webhook validation (maybe I'm wrong).

For now I'm just using the master version, which does not cause any problems.

@tenzen-y
Member

@yeonhooy @show981111 Hi, thank you for creating this issue.
How did you install training-operator? I guess that you ran kubectl apply -k "github.com/kubeflow/training-operator/manifests/overlays/standalone", right?

@yeonhooy
Author

yeonhooy commented May 1, 2024

@tenzen-y Right. I ran kubectl apply -k "github.com/kubeflow/training-operator/manifests/overlays/standalone?ref=v1.7.0" to install the stable version.
But it failed because the webhook server port is missing.

I'm now using the master version without ValidatingWebhookConfiguration.

@yeonhooy yeonhooy reopened this May 1, 2024
@tenzen-y
Member

tenzen-y commented May 1, 2024

> @tenzen-y Right. I ran kubectl apply -k "github.com/kubeflow/training-operator/manifests/overlays/standalone?ref=v1.7.0" to install the stable version. But it failed because the webhook server port is missing.
>
> I'm now using the master version without ValidatingWebhookConfiguration.

That's curious. In v1.7.0, the training-operator doesn't have any webhook validations, so the above error should not happen.
We only introduced webhook validations in v1.8.0-rc0 (and on the master branch). So I suspect that you accidentally ran kubectl apply -k "github.com/kubeflow/training-operator/manifests/overlays/standalone" (the master branch) beforehand, which would have left its webhook configuration in the cluster.

@tenzen-y
Member

tenzen-y commented May 1, 2024

> But, we introduced webhook validations since v1.8.0-rc0 (as well as the master branch).

In other words, we never provided manifests for validator.tfjob.training-operator.kubeflow.org until v1.8.0.
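
For anyone who lands here with the same error, the diagnosis above can be checked with a small sketch like the following. It is only an illustration, not something from the maintainers: the namespace `kubeflow` and Service name `training-operator` are read off the URL in the error message, the helper function names are made up, and matching webhook configurations by the substring `training-operator` in their name is an assumption about how they are named.

```shell
#!/bin/sh
# Sketch: detect a leftover webhook configuration after switching refs.
# Assumed values, taken from the URL in the error message; adjust as needed.
NAMESPACE="${NAMESPACE:-kubeflow}"
SERVICE="${SERVICE:-training-operator}"

# Print the ports exposed by the operator Service, space separated.
service_ports() {
  kubectl get service "$SERVICE" -n "$NAMESPACE" \
    -o jsonpath='{.spec.ports[*].port}'
}

# Print validating webhook configurations whose name mentions the operator.
# These objects are cluster-scoped, so they can outlive a reinstall from
# another ref. Matching on the name is an assumption, not a documented rule.
operator_webhooks() {
  kubectl get validatingwebhookconfigurations -o name \
    | grep training-operator || true
}

# Report "stale" when a webhook configuration exists but the Service
# does not expose port 443, which is the situation the error describes.
check_stale_webhook() {
  ports="$(service_ports)"
  hooks="$(operator_webhooks)"
  case " $ports " in
    *" 443 "*) echo "ok: service exposes 443" ;;
    *)
      if [ -n "$hooks" ]; then
        echo "stale: $hooks"
      else
        echo "ok: no webhook configuration"
      fi
      ;;
  esac
}
```

Calling `check_stale_webhook` after sourcing the script prints one status line. If it reports a stale entry, removing that object with `kubectl delete` on the printed name (or recreating the cluster, as mentioned earlier in the thread) is one way to get back to a clean v1.7.0 install.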

@yeonhooy
Author

yeonhooy commented May 1, 2024

I see. Maybe I applied the master version before applying v1.7.0.
My question is resolved :)

Thanks!!

@yeonhooy yeonhooy closed this as completed May 1, 2024