Proposal: Donate the MPI-Operator.V2 to kubernetes-sigs #557

Closed

ArangoGutierrez

Signed-off-by: Carlos Eduardo Arango Gutierrez carangog@redhat.com
@ArangoGutierrez
Author

/cc @alculquicondor

@ArangoGutierrez
Author

@terrytangyuan
Member

@kubeflow/project-steering-group

proposals/donate-mpi-operator.md: 6 review threads (outdated, resolved)
Co-authored-by: Aldo Culquicondor <1299064+alculquicondor@users.noreply.github.com>
@ArangoGutierrez
Author

/assign @Bobgy

@alculquicondor

/assign @theadactyl

Signed-off-by: Carlos Eduardo Arango Gutierrez <carangog@redhat.com>
@google-oss-prow

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: ArangoGutierrez
To complete the pull request process, please ask for approval from bobgy after the PR has been reviewed.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@theadactyl
Contributor

Thanks for opening this! I don't see an assessment anywhere of how this impacts existing users of the MPI operator and/or the unified operator, what the implications are for the unified operator going forward, or how this might impact other parts of Kubeflow. I think that's important to include. Can you add it?

Additionally, what in particular is the value of donation here, versus just maintaining the MPI operator within Kubeflow with fewer dependencies that would block HPC users? Donation isn't a lightweight process or decision, so I'm interested to see the specific advantages and disadvantages, so that we can call out and validate any assumptions being made here.

Also, once there are a couple more updates, I'd like to get feedback from the community. This would benefit from being sent to kubeflow-discuss and having a spot for discussion in an upcoming Community Meeting.

How does that sound?

@richardsliu

I would like to see a more detailed proposal for the migration plan. Specifically:

  • How do we avoid having two divergent versions of the MPIJob?
  • Assuming that the new MPIJob replaces the current version, how do we handle installation of the new MPI operator?
  • How do we handle common dependencies?
  • Will the new MPIJob API be backward compatible?
  • What will be the release and versioning plan going forward?

Member

@Arhell Arhell left a comment

/lgtm

@google-oss-prow google-oss-prow bot added the lgtm label May 10, 2022
@Arhell Arhell removed their assignment May 10, 2022
@denkensk
Member

denkensk commented Aug 5, 2022

Hi @ArangoGutierrez, any new progress?

@ArangoGutierrez
Author

Given cncf/toc#950, let's
/hold

@ArangoGutierrez
Author

I think I now have time to get back to this thread.
PING @theadactyl @alculquicondor @richardsliu
What would it take to revive this thread?

@alculquicondor

While cncf/toc#950 is still open, this would be a lot of effort from a copyright perspective.

So I think we should wait.

@theadactyl
Contributor

theadactyl commented Feb 9, 2023 via email

@tenzen-y
Member

cc

@ArangoGutierrez
Author

cncf/toc#1042

@ArangoGutierrez
Author

I guess we can restart this effort?

@alculquicondor

I think we can rather drop it.

Now that kubeflow is part of CNCF, it's easier for organizations to contribute to the project.

@ArangoGutierrez do you have an additional motivation to make this a kubernetes subproject?

@ArangoGutierrez
Author

> I think we can rather drop it.
>
> Now that kubeflow is part of CNCF, it's easier for organizations to contribute to the project.
>
> @ArangoGutierrez do you have an additional motivation to make this a kubernetes subproject?

Agree
/Close

@google-oss-prow google-oss-prow bot closed this Jul 26, 2023
@google-oss-prow

@ArangoGutierrez: Closed this PR.

@thesuperzapper
Member

If this is the case, can we please make some effort to update the documentation on the Kubeflow website to reflect that there are now two separate components:

  1. The unified training operator: https://github.com/kubeflow/training-operator
  2. The MPI Operator: https://github.com/kubeflow/mpi-operator

We need to update and split the following component page: https://www.kubeflow.org/docs/components/training/