[TRACKING] discussion & planning for future of kubeflow/kubeflow repo #7549
Comments
Thank you for opening this @thesuperzapper! |
I would like to add @andreyvelich's option mentioned here kubeflow/internal-acls#618 (comment). My personal opinion based on Andrey's proposal is: The more repositories we have, the more cumbersome development becomes. There is also the requirement from the KSC to split kubeflow/kubeflow. kubeflow/control-plane + kubeflow/workspaces and kubeflow/manifests adds too much development and synchronization overhead in my opinion. Moving the multi-user/multi-tenancy stuff into kubeflow/manifests, where most of the multi-tenancy stuff already lives, makes more sense to me. Splitting what we have to release together anyway over multiple repositories does not make much sense to me. KFAM, profile controller, etc. are tightly coupled with kubeflow/manifests. I am also fine with renaming kubeflow/manifests to kubeflow/control-plane or kubeflow/platform. But we need as few repositories as possible and a common place for multi-user stuff. For example notebooks, PVC-viewer and maybe some other things could stay in kubeflow/workspaces. |
I like @juliusvonkohout's arguments. My proposal would as well be on moving multi-tenancy closer to manifests and the rest on @thesuperzapper has a good point that right now the manifests repo is laser focused only on providing a catalogue of manifests, and I agree it should stick to this. It will be confusing from a user point of view to suddenly see multi-tenancy code in a manifests repository. So my proposal is the following, after taking into consideration @juliusvonkohout, @thesuperzapper and @andreyvelich's points:
Note that for multi-tenancy, I explicitly didn't mention new WGs. Although I believe this makes sense down the road, we can start with this being a subproject of manifests, since all the discussions are already happening there between @juliusvonkohout and me. |
The above is to immediately unblock the effort of cleaning up Specifically I would suggest we think about and have answers with the @kubeflow/kubeflow-steering-committee on the following:
IMO with answering the above will also help the |
cc @kubeflow/wg-pipeline-leads @kubeflow/wg-training-leads @kubeflow/wg-deployment-leads @kubeflow/wg-model-registry-leads @kubeflow/kubeflow-steering-committee for the feedback. |
I agree with @kimwnasptd's proposal. @kimwnasptd just to clarify: you are suggesting that the My 2 cents on:
These would all make sense as standalone projects, but they cannot stay outside of one of the existing working groups just yet. I am wondering why @juliusvonkohout suggests that we should have as few repos as possible. What is stopping us from having
Good observation. We don't have data points as to how popular this is. I don't think that this component should stay under the Notebooks WG. I think we should:
|
It adds so much overhead as a software developer, maintainer and reviewer. Just try it out yourself :-D. You will usually get lost in processes and talking, with less code and way too much communication and synchronization overhead. |
Based on some discussions today I have updated my proposed "Option 3" above to suggest splitting the repo up into The goal would be to build "Notebooks 2.0" in a separate branch of |
This is a great idea, it can be WG Manifests responsibility as they are already working on both.
Maybe the overhead comes from the fact that there are too many manual steps to accomplish this? If so, we can figure out a way to automate it and trigger it only when we need to cut releases for Kubeflow. |
Exactly! This seems like something that should not block the creation of new repos, but rather encourage us to find ways to remove barriers and simplify the process. |
@thesuperzapper what do you mean with that? |
In addition to @thesuperzapper's comment above: #7549 (comment), I would like to add the following ideas based on our recent discussion. I propose the idea that we should create GitHub repos for Kubeflow components only when it makes sense to call the tool as an individual sub-project and can be deployed as a standalone application. Thus, from my perspective, to find a place for the "common" components (e.g. profile controller, central dashboard, TensorBoard, PVC Viewer) we should define a new entity called Kubeflow Platform, which provides a way to deploy all things together and requires those "common" components. That should help us to explain more clearly how Kubeflow can be used:
Option 1: Short-term simple solution
Since we don't need to version these "common" components separately, move them to the
Option 2: Create
|
Option 2 seems to be the most future proof and avoids confusion |
I'll go with Andrey's Option 2 as well. Having KF component code into |
As far as I understand @andreyvelich, the second option just implies renaming kubeflow/manifests to kubeflow/platform, but still having the same content as in Option 1. I am in favor of Option 1 with the renaming to kubeflow/platform, because I agree with "I propose the idea that we should create GitHub repos for Kubeflow components only when it makes sense to call the tool as an individual sub-project and can be deployed as a standalone application." |
From my point of view this is the right approach since |
I strongly believe that the manifests repo should ONLY aggregate the manifests (which are authored in the other repos). There is no benefit to bringing code into the manifests aggregation repo. We would only create problems that make it harder to develop both the code and aggregate the manifests in a usable way. The three "components" (isolatable sections of Kubeflow) that live in
Given that, I think we should create 2-3 new repos for these components, so they can be versioned on their own lifecycle:
|
Can you please explain what kind of problems we are going to have if we combine platform components and manifests in a single repo? From my point of view the benefit of combining
@thesuperzapper Why do you want to include PVC Management and Tensorboards to the |
Throwing in my 2c here. I think we are trying to solve a number of issues: engineering velocity, customer perception & experience, architectural "cleanliness", and ensuring maintainability of repos. IMO not all of these issues can be solved by repo organization, but maybe we can focus on the most important issues, and find a compromise on the less important ones. To Andrey's point about "we should create GitHub repos for Kubeflow components only when it makes sense to call the tool as an individual sub-project [1] and can be deployed as a standalone application.[2]", I agree with part [1]. On part [2], I think there is a chance of the "platform" growing too big, so it is not necessary to have a single "platform" repo. Rather, we could create repos for large feature areas. Today the only user-facing feature in "platform" is the central dashboard, so we could create a repo named as such. This will reduce confusion for customers when they try to file issues, etc. For the common components (profile controller, etc.), there are several options: leave them with central dashboard, create a separate repo, or move some/all to the manifests repo. I think we can let the current maintainers of these components and of the manifests repo decide. It may not result in the cleanest option from an architectural perspective, but I want to optimize the developer/maintainer workflow, because it's not like we have a large group of developers behind each component. Another point I want to make is whether we should separate the workspace and notebook repos. IMO it may be a better product strategy to give customers confidence about product continuity from Notebooks to its next version. So maybe having workspaces developed in the notebooks repo is a better choice from this perspective. |
@james-jwu I 100% agree that the repos should be named by their "user-facing purpose". That's why my initial proposal was to name the repo which contains In any case, it's not possible to use the While we could develop notebooks/workspaces in the same repo, because we want to version/release them separately (and allow them to be deployed alongside each other), I think separate repos is cleaner. My initial thought was to use separate branches of In any case, it's clear the next steps are to:
If @kimwnasptd agrees on steps 1 and 2, what is the process to create those new repos @james-jwu? |
As @juliusvonkohout said before: #7549 (comment) most of the multi-tenancy stuff is already living in I think we should gather feedback from folks in the community who are planning to maintain these components (e.g. @kubeflow/wg-manifests-leads @kubeflow/wg-training-leads). Also, what do you think about this question: "Which repo should users go to if they have issues with Profile Controller?"
Why can't you use the same branches for Notebooks 1.0 and 2.0 in For example, when @kimwnasptd and team worked on the new Katib UI: https://github.com/kubeflow/katib/projects/1, we created a new directory |
I agree with @james-jwu on keeping workspaces in the same repo as notebooks. It's much better to provide continuity by keeping both workstreams in the same repo, even if versioning/releasing may be slightly more challenging at the beginning. It makes much more sense from a product perspective and gives much more clarity to users and contributors alike. |
@andreyvelich since you asked me directly in the community call: Yes, I am willing to maintain the components in the kubeflow/platform + kubeflow/workspaces split and also in Matthew Wicks' kubeflow/dashboard + kubeflow/workspaces split. |
Hey @andreyvelich @james-jwu @kimwnasptd, we discussed this in the Notebooks WG meeting today, and we are happy to keep notebooks and workspaces in the same Therefore, since we are all in agreement that at least |
We need to work on a 1.9.0-rc0 release by the week of April 29th, so when separating notebooks from the main repo, make sure this doesn't affect or block cutting a notebooks release for 1.9.0-rc0. |
Sure, we can start that. Let's discuss this tomorrow @kubeflow/kubeflow-steering-committee. @kimwnasptd @thesuperzapper Do you want to use |
Yes, that's what I meant by my message in #7549 (comment). Both the existing notebooks and new workspaces code will be in the same repo (and in the same branch). |
We discussed this topic today during KSC call and we are happy to create this new repo |
@andreyvelich sounds good! |
@kubeflow/wg-notebooks-leads |
@james-jwu @zijianjoy thanks! Can we please also:
|
I have raised a separate PR to give @kimwnasptd and @thesuperzapper write access to the new |
@andreyvelich what are the next steps planned then? |
@james-jwu @zijianjoy Please can you let us know if you made changes according to the @thesuperzapper comment: #7549 (comment)
The next steps are:
Maybe we can spend a few minutes on this in tomorrow's community call. cc @jbottum |
@andreyvelich @james-jwu @zijianjoy I have raised a PR in the (So we don't have drive-by LGTMs accidentally merging PRs which are not ready). |
Regarding "Transfer Notebooks PRs and Issues to the new repo from kubeflow/kubeflow." maybe our GSOC Student @hansinikarunarathne can help with that @rimolive |
@juliusvonkohout while I agree with some of these being closed as they are no longer relevant, we should actually transfer issues that are still relevant, rather than just closing them. Some of those issues have many, many likes, and it's a bad experience to just close them. Also, please leave the notebooks/dashboard/profiles issues for @kubeflow/wg-notebooks-leads to migrate once we are ready. EDIT: also, in some cases, it's ambiguous which repo the issue should live in, especially if it's a suggestion for the project overall, like supporting Cilium vs Istio, and in those cases it probably makes sense to keep them on |
I think I transferred the most relevant ones for manifests/platform. For the remaining ones it probably makes sense to see how active the discussion is, especially on closed ones. If someone responds in the issues I can investigate more. For active relevant issues we first need a finalized migration plan from @andreyvelich to see which subcomponent goes where. e.g. kubeflow/notebooks, kubeflow/platform/manifests etc. Based on that KSC decision we can transfer the issues to the right place. |
@juliusvonkohout is right, we need to find maintainers and repos where we are going to transfer the Kubeflow Platform components. |
@andreyvelich see the top of the issue under "option 3" for the split we ended up deciding on. We have already created Initially, the @kubeflow/wg-notebooks-leads will maintain the new |
We haven't decided on the splitting yet, as @james-jwu said here: #7549 (comment) it is up to the component developers to decide where these components should live to simplify development and issue triaging.
So we need to know for how long @kubeflow/wg-notebooks-leads are willing to maintain the dashboard components, and what should we do with the other components (profile-controller, kfam, admission-webhook, crud-web-apps)? As @juliusvonkohout mentioned above: #7549 (comment), he can maintain these components. Given that these components will always have the same release schedule as Kubeflow Platform, for me it makes sense to maintain them in a single repo together with Kubeflow manifests for Distribution Owners. Currently, @juliusvonkohout is creating Kubeflow Platform issues in the |
@kimwnasptd can confirm our thinking, but as @james-jwu said in #7549 (comment), the most sensible "user-facing" component in "kubeflow platform" is the central dashboard, so it makes sense to name the repo Regarding new maintainers, we always welcome new participants in the Notebooks WG (who currently maintain these components), and will progress people through the normal approver process as they show commitment to a specific component by raising PRs that get merged. Also, as we discussed previously, we will not be combining the dashboard with the manifests; this would not make any sense and would hinder our ability to maintain both sides. |
How are you going to explain that users should use We need to get confirmation from the current Notebooks WG leads: @StefanoFioravanzo @kimwnasptd @thesuperzapper @elikatsis @yanniszark, that you are planning to maintain these components and triage all user-related issues for a long period of time.
Can you please explain how it makes the component development cycle harder? |
On moving the code of the components to Regarding having Profiles and KFAM to the same code repo as the dashboard, I have 2 proposals:
Regarding maintaining and triaging the components, yes I will from my side. These components are crucial for anyone that wants to use Kubeflow as a whole. |
Hi folks, do we have any updates here? If @kubeflow/wg-notebooks-leads don't have enough bandwidth to drive it, should we find more volunteers from the Kubeflow community to migrate the platform's components from cc @kubeflow/kubeflow-steering-committee |
@andreyvelich current status is we are doing the "option 3" listed in the issue description: #7549 (comment) That is, we will have:
However, we can't move anything until after the 1.9.0 release (and probably 1.9.1), so we are waiting. To be ready to migrate the dashboard components once that happens, we will need @james-jwu to make us a new Note, we are already developing the Notebook v2 in the new repo, it's in a separate |
I'm also planning to help with the migration of the code of the |
Great, thank you for this @kimwnasptd and @thesuperzapper. Also, who is going to be listed in the OWNER file in |
Creating |
I feel like a code transfer during the last phase of a big Kubeflow release is not a good idea, and as @thesuperzapper said in the community meeting, this would affect 1.9 and its patch releases. However, I have a suggestion: what if we start by moving the issues to their respective repos? We can disable the issue tab in this repo and follow @thesuperzapper's suggestion to enable Prow's transfer-issue plugin to move these issues to the new repos. |
As part of kubeflow/internal-acls#618 (giving @kimwnasptd and @thesuperzapper write access to the kubeflow/kubeflow repo), we were asked by the @kubeflow/kubeflow-steering-committee to make a plan for the future of the kubeflow/kubeflow repo.
Background
All components that currently live in kubeflow/kubeflow (under the ./components/ folder) are owned and maintained by @kubeflow/wg-notebooks-leads.
People have identified a few issues with this:
2.X.X release on the kubeflow/kubeflow repo)
Options
There is not a clear "best option" for the future of the kubeflow/kubeflow repo, but here are the 3 ones I can see.
Option 1: do nothing
We could just leave everything as is.
The existing code has lived for so long in its current location, and we can address most concerns with better documentation.
Option 2: move non-core components
REMOVED
Option 3: move everything ⭐
I think there are 2 isolatable sections of Kubeflow that live in kubeflow/kubeflow right now:
That would leave us with the following:
- kubeflow/kubeflow:
- kubeflow/dashboard
  - access-management (KFAM, Auth)
  - admission-webhook (PodDefaults) (used by both Kubeflow Notebooks and Kubeflow Pipelines)
  - centraldashboard
  - profile-controller
- kubeflow/notebooks
  - crud-web-apps (UIs for: Volumes, Notebooks, Tensorboards*)
  - notebook-controller
  - pvcviewer-controller
  - tensorboard-controller *will be removed in Kubeflow 1.10
  - example-notebook-servers (pre-built Docker images for Notebooks 1.0)
  - workspace-controller
  - workspace-spawner-ui
  - example-workspace-images (pre-built Docker images for Notebooks 2.0)
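The Option 3 split can also be written down as a small lookup table, which makes it easy to answer the question raised in the thread about which repo an issue for a given component belongs in. This is only a sketch: the repo and component names are copied from the list in this issue, while the dictionary layout and the `repo_for_component` helper are hypothetical names made up for illustration and are not part of any Kubeflow tooling.

```python
# Sketch of the "Option 3" split from this issue.
# Repo and component names come from the issue text; the data structure and
# the helper function are illustrative assumptions, not Kubeflow tooling.
OPTION_3_SPLIT = {
    "kubeflow/dashboard": [
        "access-management",
        "admission-webhook",
        "centraldashboard",
        "profile-controller",
    ],
    "kubeflow/notebooks": [
        "crud-web-apps",
        "notebook-controller",
        "pvcviewer-controller",
        "tensorboard-controller",
        "example-notebook-servers",
        "workspace-controller",
        "workspace-spawner-ui",
        "example-workspace-images",
    ],
}


def repo_for_component(component: str) -> str:
    """Return the repo that would own `component` under the Option 3 split."""
    for repo, components in OPTION_3_SPLIT.items():
        if component in components:
            return repo
    raise KeyError(f"{component!r} is not listed in the Option 3 split")


if __name__ == "__main__":
    # Example: where would a Profile Controller issue be filed?
    print(repo_for_component("profile-controller"))
```

Under this sketch, an issue about the profile controller would be routed to kubeflow/dashboard, while anything about the notebook or workspace controllers would go to kubeflow/notebooks. A component that is not in the table raises an error rather than guessing, which mirrors the point made in the comments that some issues are ambiguous and need a human decision.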