Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add write access to kubeflow repo for wg-notebooks-leads #618

Merged

Conversation

thesuperzapper
Copy link
Member

@thesuperzapper thesuperzapper commented Oct 4, 2023

Currently, none of the @kubeflow/wg-notebooks-leads have write access to the kubeflow/kubeflow repo.

I believe that @kimwnasptd and I are trusted enough to have the additional permissions, and we both agree not to make direct commits to the master branch of repo (which can be enforced with a branch-protection rule if the admins would like).

Reason 1

This is causing significant problems because we are unable to approve GitHub action runs for new contributors (meaning our CI/CD never runs for these users).

GitHub introduced new security requirements for GitHub actions, and only members with write access can approve runs from PRs created by new contributors:

Reason 2

We cant manage our own GitHub actions jobs (e.g. canceling hung jobs and re-running them), as this requires write access to the repo.

Reason 3

We cant create release tags for testing versions (or full releases).

@thesuperzapper
Copy link
Member Author

@james-jwu and @zijianjoy, I would appreciate your review of this, to help unblock the Notebooks WG.

@thesuperzapper
Copy link
Member Author

The admins should ensure that the following repo branch-protection rules are enabled on master and v*:

  • Require a pull request before merging
  • Require status checks to pass before merging
  • Restrict who can push to matching branches (this one is important, because it prevents us from manually clicking "merge" on PRs, and keeps the bot as the only way to approve PRs)

Require a pull request before merging:

Screenshot 2023-10-04 at 16 01 53

Require status checks to pass before merging:

Screenshot 2023-10-04 at 15 59 36

Restrict who can push to matching branches:

Screenshot 2023-10-04 at 15 59 57

@zijianjoy
Copy link
Contributor

Hello @thesuperzapper ,

  • I have been helping Kimonas to trigger Github Action per request for a while. I think it is fine for now.
  • GitHub action or computing resource required presubmit testing should not be triggered by default for new contributors.

If there is any need for repo-level write permission, please raise it to Kubeflow Steering Committee, or propose to separate Notebook component into its own repo going forward.

@thesuperzapper
Copy link
Member Author

@zijianjoy just copying the main points raised during the community meeting:

  • I might not have clearly articulated this, but as long as the branch protection rules are correctly set, having write permission at the github level does not allow us to merge PRs (except via the normal OWNERS approval process)
  • Even though you can approve the new committers, this is unnecessary and puts blocks on us (as sometimes the same PR has to be approved for github actions multiple times)
  • (Also, I checked, and actually I was mistaken about being able to move issues with write permission, as we wont have write on the target repos, like pipelines)

@kimwnasptd I am interested in your feedback here.

@juliusvonkohout
Copy link
Member

juliusvonkohout commented Jan 2, 2024

@zijianjoy @kimwnasptd @thesuperzapper i think we need the same for Kubeflow/manifests

Signed-off-by: Mathew Wicks <5735406+thesuperzapper@users.noreply.github.com>
@thesuperzapper thesuperzapper changed the title add kubeflow repo maintainers add write access to kubeflow repo for wg-notebooks-leads Mar 1, 2024
@thesuperzapper
Copy link
Member Author

@andreyvelich @james-jwu this really needs to be resolved soon (it's making it hard to release 1.8.1 for notebooks).

Can the KSC confirm they are happy with this, and if so, please merge?

@juliusvonkohout
Copy link
Member

Yes please and I also need write access to create tags for Kubeflow/manifests and the 1.8.1 prerelease.

Cc @rimolive

@thesuperzapper
Copy link
Member Author

Yes please and I also need write access to create tags for Kubeflow/manifests and the 1.8.1 prerelease.

Cc @rimolive

@juliusvonkohout raise a separate PR, because I don't want that to block this one.

@juliusvonkohout
Copy link
Member

@thesuperzapper it is already the case for wg-manifests-leads
https://github.com/juliusvonkohout/internal-acls/blob/83113d5f0d5bd55a17bbd5578780b07547b85616/github-orgs/kubeflow/org.yaml#L1047-L1047 so this should be merged rather soon.

@juliusvonkohout
Copy link
Member

But i still cannot create tags. So there is another problem.

@thesuperzapper
Copy link
Member Author

But i still cannot create tags. So there is another problem.

@juliusvonkohout I bet there is a "protected tags" rule, like this one from another repo:

Screenshot 2024-03-04 at 17 32 00

The only way around that is to be an admin (which is obviously not allowed), however, GitHub now has more granular "rule sets" in which we can add specific groups as "except from":

Screenshot 2024-03-04 at 17 35 13

We can remove the "protected tags" and replace them with:

  1. A ruleset to prevent v* tag deletion by anyone (including admins)
  2. A ruleset to allow working group leads to create new tags

Copy link
Member

@terrytangyuan terrytangyuan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you considered moving notebooks to a separate repo?

@juliusvonkohout
Copy link
Member

Have you considered moving notebooks to a separate repo?

@terrytangyuan @andreyvelich There are several Problems here.

  1. It is slowing down development. I am very happy that I finally got write access to Kubeflow/manifests. It made life so much easier. This should happen as soon as possible for Kubeflow/notebooks maintainers as well to speed up development.

  2. We need to be able to create tags as explained above for Kubeflow/manifests and Kubeflow/kubeflow

  3. We need more maintainers for both repositories. I am not sure whether Kimonas is still active and we have 400 open issues and PRs for botebooks. So it is mainly me for manifests and Matthew for notebooks. Notebooks is not properly maintained anymore due to lack of maintainer time and permissions. So I will apply for Kubeflow/Kubeflow maintainership especially for Gsoc and cleaning up issues and merging PRs @thesuperzapper what do you think? Would you be willing to add me as Kubeflow/notebooks maintainer? I want to clean it up as I cleaned up Kubeflow/manifests with 25 real issues and 2 PRs now. Before it was probably 8 times that.

  4. After this is done and development is unblocked again we can break notebooks/workspaces into its own repository.

@zijianjoy @james-jwu can approve this PR.
While the KSC can decide about the tagging permissions.

I hope that @kimwnasptd comes back as maintainer and that the release manager @rimolive can back up my statements.

@james-jwu
Copy link
Contributor

I am okay with granting write access until longer term solution can be found. I would like to get 2 more lgtms from @kubeflow/kubeflow-steering-committee

@andreyvelich
Copy link
Member

I understand these concerns @juliusvonkohout and I agree with you that WG co-chairs should have write permission to the appropriate repositories.
As @james-jwu said let's bring this up to the next KSC meeting and move this forward.
At the same time, for the long-term solution can you create an issue in kubeflow/kubeflow to track migration of Notebook, Central Dashboard, KFAM, Profiles, PVCs, TensorBoards, KFAM to the separate GitHub repositories.

So kubeflow/kubeflow will be just a landing page for users like in Argo: https://github.com/argoproj/argoproj

That will make very clear to the users that Kubeflow is a combination of different sub-services for end-to-end AI/ML platform on Kubernetes.

We can also disable issue creation in the kubeflow/kubeflow repo to avoid miss-triaging issues for WG leads.

@StefanoFioravanzo
Copy link
Member

@andreyvelich I agree on disabling issues under kubeflow/kubeflow (as a long-term goal), we can redirect people to kubeflow/community for anything concerning the org, governance, etc, and kubeflow/manifests for anything concerning installation, integration, etc. I think we still need to capture a category of issues and discussion that involve e2e use of Kubeflow (cross-component). Let's discuss more in a dedicated issue

@StefanoFioravanzo
Copy link
Member

StefanoFioravanzo commented Mar 6, 2024

@juliusvonkohout @thesuperzapper I want to help you guys to stabilize the issue count and the repo migration. We can form a plan and assign tasks. Migration out of kubeflow/kubeflow and dealing with triaging the high number of issues/PR is top of mind. We could have a separate meeting outside of the WG meeting so that we can have a focused discussion on this and make a plan.

@andreyvelich
Copy link
Member

I think we still need to capture a category of issues and discussion that involve e2e use of Kubeflow (cross-component). Let's discuss more in a dedicated issue

We can use kubeflow/community repo or other repos. Let's discuss it some time. Usually, these issues are not created by Kubeflow users so we can find better place.

@thesuperzapper
Copy link
Member Author

@andreyvelich @james-jwu so can we please merge this?

It simply adds myself and @kimwnasptd as write access on kubeflow/kubeflow, clearly there's a lot of separate work to discuss around the future of the kubeflow/kubeflow repo, but in the short term, it would make our lives a lot easier, and help us get releases out.

@andreyvelich
Copy link
Member

@andreyvelich @james-jwu so can we please merge this?

It simply adds myself and @kimwnasptd as write access on kubeflow/kubeflow, clearly there's a lot of separate work to discuss around the future of the kubeflow/kubeflow repo, but in the short term, it would make our lives a lot easier, and help us get releases out.

Please give us a few more hours to discuss it during today's KSC meeting.
/assign @johnugeorge @jbottum @james-jwu @terrytangyuan

@andreyvelich
Copy link
Member

Hi @thesuperzapper, during today KSC meeting we discussed this PR to give write access to kubeflow/kubeflow repo with branch protection.
We are agreed to give Notebooks WG write access if you can create an issue where you will establish an estimate timeline to migrate Kubeflow components to the separate repositories. This timeline should not be tight, but at least we will start discussions.

@andreyvelich
Copy link
Member

andreyvelich commented Apr 3, 2024

Hi @thesuperzapper @juliusvonkohout @kubeflow/wg-notebooks-leads, do we have any updates here with our plan?

We think that we should solve this problem relatively soon since many users/contributors are getting confused with What is Kubeflow? question.
We still see issues created in kubeflow/kubeflow which are related to other Kubeflow components. For example: kubeflow/kubeflow#7504, kubeflow/kubeflow#7488.

By solving this problem we will give much more clarity to our users and will see much more adoption when users realise the full potential for the Kubeflow project.

cc @kubeflow/kubeflow-steering-committee

@thesuperzapper
Copy link
Member Author

thesuperzapper commented Apr 3, 2024

@andreyvelich with the amount of resources we currently have in the @kubeflow/wg-notebooks-leads, I just don't see any way that splitting the repo up will be possible in the short to medium term. We barely have enough contributors to maintain the existing components.

Realistically, I can't continue to maintain the notebooks and core components without write access to kubeflow/kubeflow, it's simply not possible to develop software without the ability to regularly create releases (especially pre-releases).

For example, there are critical issues with 1.8.0 (kubeflow/kubeflow#7453), but as I am unable cherry-pick and cut a 1.8.1-beta.0, we are unable to release anything.

@thesuperzapper
Copy link
Member Author

Even if we get more resources for "core" + "notebooks", we still have to consider the following:

  • kubeflow/kubeflow contains all the "core" components (dashboard, profile controller, kfam, volumes, etc)
  • Kubeflow Notebooks can't be run without the shared components (it requires the dashboard, istio, profiles), and that is unlikely to change.

I am not saying we can't split it up, just that there are no clear lines to split down, and we will likely still end up with the core components in kubeflow/kubeflow anyway.

@thesuperzapper
Copy link
Member Author

Also, to help direct people to the right repo for issues, we can use issue templates and make users tick a box saying "yes, this issue is related to notebooks or core" before allowing them to submit a new issue.

But as a backup, issues can be transferred between repos by people who have write access to BOTH the source and target repo.

@andreyvelich
Copy link
Member

andreyvelich commented Apr 3, 2024

We barely have enough contributors to maintain the existing components.

I understand your frustration around not enough contributors to the existing components, but don't you think if we separate Kubeflow Notebooks in a separate GitHub repo it will be easier for folks to understand where and how to contribute to Kubeflow Notebooks ?
Also, we will open doors to make Kubeflow Notebooks more portable and make it possible to deploy without other Kubeflow components (dashboard, profile controller, etc.).

kubeflow/kubeflow contains all the "core" components (dashboard, profile controller, kfam, volumes, etc). I am not saying we can't split it up, just that there are no clear lines to split down, and we will likely still end up with the core components in kubeflow/kubeflow anyway.

I am not sure if kubeflow/kubeflow should contain core components given the confusion and feedback from our users and new contributors. We can always create kubeflow/multi-tenancy or kubeflow/control-plane or kubeflow/central-dashboard repos to isolate them.
Also, this migration should help us to deprecate/remove un-used components and reduce complexity of Kubeflow control plane.

If I remember correctly @kubeflow/wg-notebooks-leads discussed previously that we can remove KFAM.

Also, to help direct people to the right repo for issues, we can use issue templates and make users tick a box saying "yes, this issue is related to notebooks or core" before allowing them to submit a new issue.

It's not only about issue template it is more about answering the question What is Kubeflow ?
For example, new users can easily get lost when they open https://github.com/kubeflow/kubeflow and see just part of Kubeflow components. Also, many users still think that all Kubeflow components should be deployed to use it as part of their ML Platform by looking at the Installing Kubeflow Website.

Right now given the energy around new Kubeflow components (e.g. Model Registry, Spark Operator) we should work towards:

  1. Explaining the composability of Kubeflow components
  2. Explaining the capability to use Kubeflow as an end-to-end ML platform.

I saw benefits in both approaches if we want to grow our community.

I believe, this first step of migrating Kubeflow Notebooks and Central Dashboard from kubeflow/kubeflow should help us with that.

What do others think about it?
cc @kubeflow/wg-training-leads @kubeflow/wg-pipeline-leads @kubeflow/wg-model-registry-leads @kubeflow/kubeflow-steering-committee @jeremyeder @bigsur0 @StefanoFioravanzo @zijianjoy @akgraner

@juliusvonkohout
Copy link
Member

juliusvonkohout commented Apr 3, 2024

@andreyvelich

I think for the time being @thesuperzapper and @kimwnasptd aka the "notebooks" leads just need the write access, independent of the repository splitting.

I talked to Kimonas at Kubecon and he will support me to become the third maintainer for kubeflow/kubeflow, but it might take until the end of the year. Then I might be able to help Matthew a bit and improve the maintainer situation.

Until then we have to at least make it possible to release Kubeflow 1.9. At the moment Mathew can't even approve workflows for PRs which really stalls the development.

@kimwnasptd
Copy link
Member

Hey folks, it's been a minute.

I agree with @andreyvelich on the importance of keeping kubeflow/kubeflow repo clean/empty and how people can add issues for all sorts of things. @thesuperzapper let's have a sync and write a plan and checklist with the things that need to get migrated and where. I can help with some of the toil work there as well as we tackle it in parallel.

My high level proposal is the following, after reading the comments:

  1. Create a new kubeflow/notebooks repo and move there the Jupyter web app, example notebook servers and notebook controller
    • The jupyter web app can reference the common code in another repo similarly to models web app and katib web app
    • The notebook controller needs some common code, but we could start with copy/paste just to unblock for now
  2. Create new kubeflow/control-plane and move there everything else
  3. Setup dockerhub (or ghcr.io) credentials
  4. Copy and slightly modify the existing CI to the corresponding repos
  5. Update the docs with new references

I am not saying we can't split it up, just that there are no clear lines to split down, and we will likely still end up with the core components in kubeflow/kubeflow anyway.

@thesuperzapper I also agree that indeed the lines of what to split is not fully clear, and potentially then raise the question: What is kubeflow/control-plane and versioning scheme? But the above is good enough for me at this point to unblock the effort and I'm sure @andreyvelich and the @kubeflow/kubeflow-steering-committee will be willing to further define this in the future.

TL;DR: @thesuperzapper let's have a sync to see how we can most painlessly empty the kubeflow/kubeflow repo

@kimwnasptd
Copy link
Member

But @andreyvelich in the interim, once we provide a plan, we'll really need to have write access to even work on the 1.9 release

#618 (comment)

@andreyvelich
Copy link
Member

Thank you for your comments @juliusvonkohout @kimwnasptd.

What is kubeflow/control-plane and versioning scheme?

Another solution could be to move it to kubeflow/manifests if these control plane components will be always versioned together with Kubeflow releases and will be used only when all Kubeflow components are installed, but we can discuss it separately.

But @andreyvelich in the interim, once we provide a plan

That makes sense @kimwnasptd, I am happy to unblock it. Let me get confirmation today form @kubeflow/kubeflow-steering-committee to move this forward.

In the meantime @kimwnasptd please can I ask you to create tracking issue in kubeflow/community repo with high-level proposal that you suggested above: #618 (comment)
So we can track our ideas and discussions. We will work together in the future to improve it.

@juliusvonkohout
Copy link
Member

juliusvonkohout commented Apr 8, 2024

Another solution could be to move it to kubeflow/manifests if these control plane components will be always versioned together with Kubeflow releases and will be used only when all Kubeflow components are installed, but we can discuss it separately.

@andreyvelich
Yes, that could actually be more sensible since Kubeflow/manifests is to some degree kubeflow/control-plane already. We already have istio, knative, cert-manager and other stuff.

@james-jwu
Copy link
Contributor

The KSC discussed the issues presented in this PR. The short term need for Notebook WG leads to obtain write access to github/kubeflow repo is acknowledged. KSC would like to ask the Notebook WG to work towards the long term solution of moving notebooks to a separate repo. The approval of this PR will be temporarily granted for 3 months, during which time the Notebook WG is asked to:

  1. Create an issue for notebook repo migration (see comment for suggested work items)
  2. Work with KSC on repo migration.

Please ACK this message if you are in agreement with this decision @thesuperzapper, @kimwnasptd, after which we will unblock this PR.

Thanks! - James (on behalf of KSC)

@andreyvelich
Copy link
Member

@kimwnasptd @thesuperzapper Please can you reply on the message above from @james-jwu to move this forward ?
As I mentioned before that should significantly help Kubeflow project to be more successful.

@thesuperzapper
Copy link
Member Author

I have made an issue that we can use to discuss the next steps

@james-jwu I am ok with trying to move as much as we can out of the kubeflow/kubefow repo (as long as we are careful about it).

@andreyvelich
Copy link
Member

Since @thesuperzapper created tracking issue, I am happy to unblock this PR as @james-jwu suggested here: #618 (comment)
/assign @kubeflow/kubeflow-steering-committee

@kimwnasptd
Copy link
Member

Thank you for this @andreyvelich @james-jwu!

@rimolive
Copy link
Member

rimolive commented Apr 24, 2024

Any ETA to merge this PR and give @thesuperzapper write access to cut a release for kubeflow/kubeflow? This is blocking Kubeflow 1.9.0-rc0 release at this moment.

cc @kubeflow/kubeflow-steering-committee

@andreyvelich
Copy link
Member

Yes, we can merge it, sorry for the delay.
/lgtm
/assign @james-jwu @johnugeorge @jbottum @terrytangyuan

@jbottum
Copy link
Contributor

jbottum commented Apr 24, 2024

/lgtm

@james-jwu
Copy link
Contributor

/lgtm
/approve

Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: james-jwu, thesuperzapper

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@google-oss-prow google-oss-prow bot merged commit 35b98a2 into kubeflow:master Apr 24, 2024
4 checks passed
@thesuperzapper
Copy link
Member Author

@james-jwu I think someone will need to run the sync script, because I still don't have write access on kubeflow/kubeflow

@james-jwu
Copy link
Contributor

@thesuperzapper this is fixed by @zijianjoy

@thesuperzapper thesuperzapper deleted the add-kubeflow-repo-maintainers branch April 26, 2024 00:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.