Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Updated examples/pytorch to disable istio sidecar injection #2004

Merged
merged 2 commits into from
Feb 23, 2024

Conversation

jdcfd
Copy link
Contributor

@jdcfd jdcfd commented Feb 11, 2024

What this PR does / why we need it:
Examples do not work because of istio sidecar injection causing trouble
Which issue(s) this PR fixes (optional, in Fixes #<issue number>, #<issue number>, ... format, will close the issue(s) when PR gets merged):
Fixes #1637

Checklist:

Copy link

google-cla bot commented Feb 11, 2024

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

Copy link
Member

@tenzen-y tenzen-y left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!
/lgtm
/approve

/assign @kubeflow/wg-training-leads
for approval CI.

@tenzen-y
Copy link
Member

tenzen-y commented Feb 13, 2024

/hold
for CI

@andreyvelich
Copy link
Member

andreyvelich commented Feb 13, 2024

Thank you for creating this PR @jdcfd!
@tenzen-y @johnugeorge I remember previously we had discussion that we should not add Istio annotation to the Examples to not confuse users: kubeflow/katib#1359 (comment)
At least we did it for Katib.
Should we follow the same model for Training Operator examples, and just update the documentation ?

@tenzen-y
Copy link
Member

Thank you for creating this PR @jdcfd! @tenzen-y @johnugeorge I remember previously we had discussion that we should not add Istio annotation to the Examples to not confuse users: kubeflow/katib#1359 (comment) At least we did it for Katib. Should we follow the same model for Training Operator examples, and just update the documentation ?

@andreyvelich I'm ok with either way. I think that the inconsistent examples and docs brought confusions :(
So, if we remove the istio annotation, we should remove the annotation from all examples and add documentation as you say.

@coveralls
Copy link

coveralls commented Feb 13, 2024

Pull Request Test Coverage Report for Build 8015652309

Details

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • 8 unchanged lines in 1 file lost coverage.
  • Overall coverage decreased (-0.09%) to 42.805%

Files with Coverage Reduction New Missed Lines %
pkg/controller.v1/mpi/mpijob_controller.go 8 80.48%
Totals Coverage Status
Change from base Build 7758874698: -0.09%
Covered Lines: 3748
Relevant Lines: 8756

💛 - Coveralls

@jdcfd
Copy link
Contributor Author

jdcfd commented Feb 14, 2024

Hi all, thanks for the feedback. I have to admit this is all new to me, even the process of submitting a PR to an open-source project so bear with me.

From what I understood, it is not always necessary to add the Istio annotation, therefore, the user should do it manually if necessary. That was the path you took for katib. In that case, one option is to remove annotation from all examples and add a comment in examples/pytorch/README.md .

Is this the same with all other training operators? or only pytorch?

@tenzen-y
Copy link
Member

Hi all, thanks for the feedback. I have to admit this is all new to me, even the process of submitting a PR to an open-source project so bear with me.

From what I understood, it is not always necessary to add the Istio annotation, therefore, the user should do it manually if necessary. That was the path you took for katib. In that case, one option is to remove annotation from all examples and add a comment in examples/pytorch/README.md .

Is this the same with all other training operators? or only pytorch?

Oh, I see. Welcome to the kubeflow community :)
Yes, we should remove the istio annotations from all training operators.

@andreyvelich
Copy link
Member

Welcome to the community @jdcfd!
Yes, @tenzen-y is right, we can remove istio annotations from all examples.
That would be great if you could submit PR in the Kubeflow Website repo to explain how users can disable Istio for every job.
Maybe we can follow the TFJob guide and update other CRs accordingly: https://www.kubeflow.org/docs/components/training/tftraining/#what-is-tfjob

@tenzen-y
Copy link
Member

/lgtm cancel
/approve cancel

@google-oss-prow google-oss-prow bot removed the lgtm label Feb 16, 2024
@tenzen-y
Copy link
Member

/approve cancel

@jdcfd
Copy link
Contributor Author

jdcfd commented Feb 22, 2024

@tenzen-y @andreyvelich I have removed the istio annotations from the examples, added some comments to the README file for PyTorch jobs and submitted a PR to the kubeflow/website repo: kubeflow/website#3686

@@ -13,7 +13,7 @@ spec:
containers:
- name: pytorch
image: pytorch-cpu:py3.8
imagePullPolicy: Always
imagePullPolicy: IfNotPresent
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can remove this imagePullPolicy, since IfNotPresent is set by default in Kubernetes container.

jdcfd and others added 2 commits February 22, 2024 22:41
…ytorch/README.md

Signed-off-by: Juan Diego Colmenares Fernandez <cielocfd@gmail.com>
Removed pull policy

Signed-off-by: Juan Diego Colmenares Fernandez <cielocfd@gmail.com>
@jdcfd
Copy link
Contributor Author

jdcfd commented Feb 23, 2024

@tenzen-y I removed the istio annotations as discussed. I also submitted a PR to the documentation website repo.

Copy link
Member

@andreyvelich andreyvelich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your contribution @jdcfd 🎉
/lgtm
/approve

Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: andreyvelich, jdcfd

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@andreyvelich
Copy link
Member

/hold cancel

@google-oss-prow google-oss-prow bot merged commit 426ec21 into kubeflow:master Feb 23, 2024
36 checks passed
deepanker13 pushed a commit to deepanker13/deepanker-training-operator that referenced this pull request Apr 8, 2024
…#2004)

* Removed istio annotations from examples. Added comments to examples/pytorch/README.md

Signed-off-by: Juan Diego Colmenares Fernandez <cielocfd@gmail.com>

* Update demo.yaml

Removed pull policy

Signed-off-by: Juan Diego Colmenares Fernandez <cielocfd@gmail.com>

---------

Signed-off-by: Juan Diego Colmenares Fernandez <cielocfd@gmail.com>
Signed-off-by: deepanker13 <deepanker.gupta@nutanix.com>
johnugeorge pushed a commit to johnugeorge/training-operator that referenced this pull request Apr 28, 2024
…#2004)

* Removed istio annotations from examples. Added comments to examples/pytorch/README.md

Signed-off-by: Juan Diego Colmenares Fernandez <cielocfd@gmail.com>

* Update demo.yaml

Removed pull policy

Signed-off-by: Juan Diego Colmenares Fernandez <cielocfd@gmail.com>

---------

Signed-off-by: Juan Diego Colmenares Fernandez <cielocfd@gmail.com>
johnugeorge pushed a commit to johnugeorge/training-operator that referenced this pull request Apr 28, 2024
…#2004)

* Removed istio annotations from examples. Added comments to examples/pytorch/README.md

Signed-off-by: Juan Diego Colmenares Fernandez <cielocfd@gmail.com>

* Update demo.yaml

Removed pull policy

Signed-off-by: Juan Diego Colmenares Fernandez <cielocfd@gmail.com>

---------

Signed-off-by: Juan Diego Colmenares Fernandez <cielocfd@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

example is missing annotation to prevent istio side car injection
4 participants