Proposal future ownership/development of Kubeflow distributions should be outside Kubeflow #434

Merged
merged 1 commit into from Nov 4, 2020

jlewi
Contributor

@jlewi jlewi commented Oct 27, 2020

There has been significant debate about what to do about Kubeflow distributions

See: #402
Thread
Arrikto Proposal

I believe this proposal captures the consensus among the leads.

/cc @kubeflow/project-steering-group
/cc @kubeflow/wg-notebook-leads
/cc @kubeflow/wg-pipeline-leads
/cc @kubeflow/wg-automl-leads
/cc @kubeflow/wg-training-leads

/cc @cvenets @yanniszark @RFMVasconcelos @animeshsingh @paveldournov @johnugeorge @krishnadurai @Jeffwan

@k8s-ci-robot

@jlewi: GitHub didn't allow me to request PR reviews from the following users: cvenets.

Note that only kubeflow members and repo collaborators can review this PR, and authors cannot review their own PRs.

@google-cla google-cla bot added the cla: yes label Oct 27, 2020
@thesuperzapper
Member

I have a few points I want to raise:

  • I agree that vendor distributions of Kubeflow should be maintained separate from the Kubeflow org
  • I think it is dangerous for Kubeflow to not have a minimal "generic" distribution which Kubeflow org is responsible for
    • This ensures we actually release a platform which is capable of being deployed
    • This gives us a place to agree on things like minimum Istio versions, and what other cluster-level dependencies we allow for core-components
    • This can act as a base for vendors who don't want to spend significant effort on a custom solution
    • This will help prevent any specific vendor taking control of Kubeflow

Questions:

  • Should the official docs continue to include installation guides for vendor Kubeflow? Or should we just link to each vendor's own docs?

@PatrickXYS
Member

Should the official docs continue to include installation guides for vendor Kubeflow? Or should we just link to each vendor's own docs?

I think if vendors want to continue working on official installation docs, they can do so; otherwise, they can redirect to their own pages.

Besides that, I think it's reasonable to have a minimal generic distribution, for two reasons:

  1. It helps us find potential issues that do not show up in any single Kubeflow component, only in a distribution.
  2. It can help drive some feature requests, as @thesuperzapper suggests, e.g. around common Kubernetes components, including Istio.


## Proposal

Going forward all distributions of Kubeflow should be owned and maintained outside of Kubeflow.

All? I thought we were going to have a generic version owned by the deployment WG.

Contributor Author

Please review: #402 distributions of Kubeflow are out of scope for the deployments WG.


### Ownership & Development

Going forward new distributions of Kubeflow should be developed outside of the Kubeflow GitHub org. This ensures

My issue is if this is completely outside the org, then what's the end user experience? A new user should be able to come to kubeflow/kubeflow and install a great experience (jupyter, kale, tf/pytorch job, katib, tfserving/seldon/triton/whatever).

Contributor Author

The install experience will be

install pipelines

git clone git@github.com:kubeflow/manifests.git git_manifests
kustomize build ./pipelines | kubectl apply -f -
kustomize build ./notebooks | kubectl apply -f -
....

In other words, the install experience will be exactly what it is for every other Kubernetes application, and similar to what it is for Tekton.
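
A minimal sketch of that per-application flow, assuming a local checkout of kubeflow/manifests. The helper names `install_component` and `install_selected`, the `DRY_RUN` switch, and the directory names are hypothetical and only illustrate the idea; the real repository layout may differ.

```shell
#!/usr/bin/env bash
# Sketch only: install a chosen subset of Kubeflow applications, one at a time.
# The directory names passed in (e.g. pipelines, notebooks) are illustrative and
# are NOT guaranteed to match the real layout of kubeflow/manifests.

# DRY_RUN=1 (the default here) prints each command instead of running it.
DRY_RUN="${DRY_RUN:-1}"

# install_component MANIFESTS_DIR COMPONENT
install_component() {
  local manifests_dir="$1" component="$2"
  if [ "$DRY_RUN" = "1" ]; then
    echo "kustomize build ${manifests_dir}/${component} | kubectl apply -f -"
  else
    # "-f -" tells kubectl to read the rendered manifests from stdin.
    kustomize build "${manifests_dir}/${component}" | kubectl apply -f -
  fi
}

# install_selected MANIFESTS_DIR COMPONENT...
install_selected() {
  local manifests_dir="$1" component
  shift
  for component in "$@"; do
    install_component "$manifests_dir" "$component"
  done
}
```

For example, `install_selected git_manifests pipelines notebooks` prints the two commands it would run; setting `DRY_RUN=0` runs them against the current kubectl context.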

Contributor

my 2-cents:

This "install the piece you want" experience will be ideal for some use cases, but it will weaken the "try-out Kubeflow" UX, which is key for adoption.

Member

@jlewi I am not quite sure it will be that simple; there will also be at least some cluster-level stuff which is required for all apps.

What the user chooses for the cluster-level stuff is probably going to affect the YAML for each of the sub-apps too.


## Proposal

Going forward all distributions of Kubeflow should be owned and maintained outside of Kubeflow.
Member

I work with lots of customers trying to adopt Kubeflow in production. One major blocking issue is that they cannot easily integrate the entire Kubeflow stack into their existing infrastructure.

  1. Some teams don't own the underlying k8s cluster, and the infra team gives them only limited permissions to install the entire KF stack.
  2. Some teams have to integrate with their own data models, and a native Kubernetes solution is not acceptable. They end up building additional wrappers on top of Kubeflow concepts.

But I have to say, small teams really love the upstream distribution: they get an out-of-the-box ML solution and don't have to put extra effort into a lot of things.

At the same time, I see a few users using Kubeflow in a different way: they leverage lower-level capabilities from the training operators, pipeline API, etc., and pick the pieces they need from Kubeflow.

I think these are definitely different directions. Do we want to concentrate on the techniques for ML on k8s itself, like how to run distributed training on k8s and how to orchestrate ML workflows on k8s? Or do we want to build an e2e ML solution? In that case the community needs to put effort into better integration between components, user-friendly interfaces, etc.

Contributor Author

I agree that there is a lot of value in distributions. That's why I think we want to embrace a philosophy of letting a thousand flowers bloom.

There will be no one distribution that works well or even moderately well for all or even most people. So instead of arguing over what should be in the "generic" distribution I want to encourage folks to organize around use cases, target users and develop distributions optimized for those use cases.

As an example, a distribution targeting someone wanting to kick the tires on their laptop will probably be very different from one running on a small but multi-node cluster.

Member

I think I agree on this part. If we look at the Kubernetes community, it only distributes binaries. There are tons of tools like Minikube, Kind, kops, etc. to help users provision Kubernetes in different environments. The unclear part is: even if community folks organize their own stacks and offer them to users, how does the Kubeflow community certify them? Or does Kubeflow not need to certify them, and just point users to a list of the different solutions?

@jlewi
Contributor Author

jlewi commented Oct 28, 2020

I think it is dangerous for Kubeflow to not have a minimal "generic" distribution which Kubeflow org is responsible for

There have been multiple efforts to create generic distributions. None of them, IMO, has gained the contributor support necessary to warrant a KF WG.

Every discussion about generic distributions ends up spiraling into debates that reflect people's various opinions:

  • How should people connect? kubectl port-forward? Should we use a loadbalancer? Should we use Dex?
  • Does it need to be able to run on a desktop install (e.g. Minikube)? How do we make it fit?
  • Should it support GPUs?
  • Should we assume a default storage provider?

This will help prevent any specific vendor taking control of Kubeflow

On the contrary; I think creating an officially endorsed "generic" distribution gives significant weight to whatever opinions the owners of that distribution hold. In Kubeflow, decision making is based on OWNERS files. So decisions about a generic distribution would not be made by the community; they would be made by the 1-3 organizations actively developing it.

If you feel that strongly about a generic distribution, then my suggestion would be to work with others on building a distribution that satisfies your goals; e.g. set up testing for different K8s versions.

Maybe take a look at what Arrikto is doing with kfctl_k8s_istio_dex.yaml and see how you can help there. Or talk to @RFMVasconcelos about the MicroK8s install.

Contributor

@rui-vas rui-vas left a comment

I generally agree with @thesuperzapper's argument that having a "generic" distribution would have benefits.

However, to @jlewi's point:

On the contrary; I think creating an officially endorsed "generic" distribution gives significant weight to whatever opinions the owners of that distribution hold. In Kubeflow, decision making is based on OWNERS files. So decisions about a generic distribution would not be made by the community; they would be made by the 1-3 organizations actively developing it.

giving 1-3 orgs OWNERs right to that Kubeflow endorsed distro effectively gives these vendors control over Kubeflow. I believe this is a risk in what is happening atm with kfctl (#402).

In sum:

  1. There seems to be clear value in KF having end-to-end platform stories, not scattered apps
  2. Unfortunately, if there are 1-3 orgs providing it, which is likely if there is one "generic" distribution, then these orgs control the story


@rui-vas rui-vas mentioned this pull request Oct 29, 2020
@animeshsingh
Contributor

animeshsingh commented Oct 29, 2020

+1 on moving all vendor distributions out. I still think there should be a core generic distribution; otherwise there is no reference point. Instead of calling it a distribution, it can be positioned as sample multi-tenant architectural guidance and an implementation for a particular Kubeflow release, not something which Kubeflow dictates or endorses. But there is still value in having one developed under the Kubeflow org.

Vis a vis concerns over "giving 1-3 orgs OWNERs right to that Kubeflow endorsed distro effectively gives these vendors control over Kubeflow" - please contribute, earn your stripes, and join the WG as a lead. This can be one WG where the limits on the number of OWNERS can be non-existent. If your org has a track record of contributing back to Kubeflow, join the WG.

@Jeffwan
Member

Jeffwan commented Oct 29, 2020

Does Kubeflow want to provide some conformance tests, or is that totally on vendors? From a user's perspective, I hope each component itself works fine (Kubeflow WGs) and the glue between components works fine (vendors).

@jlewi
Contributor Author

jlewi commented Oct 30, 2020

Does Kubeflow want to provide some conformance tests, or is that totally on vendors?

Conformance tests are where we need to get to.

+1 on moving all vendor distributions out. I still think there should be a core generic distribution; otherwise there is no reference point.

I'm hoping that conformance tests, not a generic distribution, provide the reference. The distinction in my mind is that conformance tests should mandate functionality and capabilities, not implementation.

As an example, conformance tests would require that certain KFP pipelines run that exercise functionality but are agnostic to the engine (Argo and Tekton).

So we will not have a "core" distribution. Instead we will have N conformant distributions.
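
One way to picture "functionality, not implementation" is a suite that lists required capabilities and leaves the engine to each distribution. The sketch below is purely illustrative: the check names, the `argo_distro` and `tekton_distro` adapters, and `check_conformance` are all hypothetical, and the adapters are stubs rather than real calls to KFP, Argo, or Tekton.

```shell
#!/usr/bin/env bash
# Sketch only: a conformance suite names required capabilities and never
# mentions the engine. Every name below is hypothetical.

# Capabilities a conformant distribution must provide.
REQUIRED_CHECKS="run_sample_pipeline list_runs"

# Stub adapters. A real adapter would submit a pipeline to the cluster and
# return non-zero on failure; these only simulate success.
argo_distro_run_sample_pipeline()   { return 0; }
argo_distro_list_runs()             { return 0; }
tekton_distro_run_sample_pipeline() { return 0; }
tekton_distro_list_runs()           { return 0; }

# check_conformance DISTRO: succeeds only if DISTRO implements and passes
# every required check, whatever engine sits behind its adapter functions.
check_conformance() {
  local distro="$1" check fn
  for check in $REQUIRED_CHECKS; do
    fn="${distro}_${check}"
    if ! command -v "$fn" >/dev/null 2>&1; then
      echo "FAIL ${distro}: ${check} not implemented"
      return 1
    fi
    if ! "$fn"; then
      echo "FAIL ${distro}: ${check} failed"
      return 1
    fi
  done
  echo "PASS ${distro}"
}
```

Both stub distributions pass the same checks even though they would use different engines, which is the sense in which there can be N conformant distributions rather than one "core" one.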

@thesuperzapper
Member

About the "generic" distribution:

Let's stop calling it "generic" and call it "reference"

I believe we are thinking of Kubeflow too much like Kubernetes. Kubernetes is a very complex platform with effectively infinite deployment methods, whereas Kubeflow is a set of apps which run on Kubernetes and comprise a data/ml platform.

Setting aside what tool you use to submit your YAML (and which Kubeflow apps you pick), there are very few deployment options for Kubeflow. If you look at the distributions in kubeflow/manifests, most currently deploy very similar YAMLs.

Realistically the "reference" distribution is just those YAMLs, and a basic validation that you can actually deploy them.
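
As a rough illustration of what such a "basic validation" could look like, the sketch below renders each manifest directory and passes the result to a client-side dry run. The function name `validate_manifests` and the `RENDER` / `VALIDATE` variables are hypothetical; the defaults assume `kustomize` and a kubectl that supports `--dry-run=client`. A dry run only shows that the YAML renders and is accepted by kubectl, not that the applications actually work.

```shell
#!/usr/bin/env bash
# Sketch only: "basic validation" as render-then-dry-run for each manifest
# directory. RENDER and VALIDATE are hypothetical knobs so the tools can be
# swapped or stubbed; the defaults call kustomize and kubectl.
RENDER="${RENDER:-kustomize build}"
VALIDATE="${VALIDATE:-kubectl apply --dry-run=client -f -}"

# validate_manifests DIR...: prints one ok/FAIL line per directory and
# returns non-zero if any directory failed.
validate_manifests() {
  local dir rendered failed=0
  for dir in "$@"; do
    # Render first so a rendering failure is not hidden by the pipe.
    if rendered="$($RENDER "$dir" 2>/dev/null)" &&
       printf '%s\n' "$rendered" | $VALIDATE >/dev/null 2>&1; then
      echo "ok   $dir"
    else
      echo "FAIL $dir"
      failed=1
    fi
  done
  return "$failed"
}
```

Called as `validate_manifests ./pipelines ./notebooks` (directory names illustrative), it reports each directory separately, so one broken application does not mask the others.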

Important note:

I really doubt we will be able to enforce "conformance" tests without creating a formal governance structure in the Kubeflow ORG (which people actually buy into).

I am not talking about individual apps, but rather the whole project.
What group of people gets the final say on the direction of the Kubeflow project?

Right now, Kubeflow feels like it's drifting in the wind, especially as previously important vendors/people distance themselves from the open-source part of the project.

I think the community needs to elect a group of people who actually have steering control of Kubeflow as a whole, so they can make decisions like this.

@jlewi
Contributor Author

jlewi commented Oct 30, 2020

@thesuperzapper

Right now, Kubeflow feels like it's drifting in the wind, especially as previously important vendors/people distance themselves from the open-source part of the project.

This is not what I see. With the formation of WGs the bulk of development & leadership has shifted to the WGs. Some of the WGs are functioning really well while others are still ramping up.

As an example KFP just

As you are one of the leads for notebooks, you would be one of the individuals everyone is looking to for answers to the following questions:

  • What is the current state of notebooks?
  • Where is notebooks going?
  • How is notebooks going to get from here to there?

Do you feel the @kubeflow/wg-notebook-leads are able to answer those questions?

I think the community needs to elect a group of people who actually have steering control of Kubeflow as a whole, so they can make decisions like this.

I would suggest we table discussions on broader governance and get back to the topic at hand.

I don't want to completely ignore this question so here's a short reply.

Since July we've focused on forming WGs. Right now the focus should be on ensuring those WGs are fully operational and successful (kubeflow/projects#42).

Anyone looking to have broader influence needs to earn that influence by showing they can successfully execute on important priorities.

Realistically the "reference" distribution is just those YAMLs, and a basic validation that you can actually deploy them.

@thesuperzapper I love the enthusiasm, are you planning on contributing to one of the existing distributions or starting a new one?

Looking at https://www.kubeflow.org/docs/started/k8s/, it looks like the following might fit your description

Which one of these is closest to what you envision? What would be the next steps? Are you planning on helping to drive it forward?

Would you be willing to fix some of the open issues for the kfctl_k8s_istio_dex?

Perhaps you could help get one of these distributions ready/tested for KF 1.2?

@theadactyl
Contributor

It seems like most of the points relating to this proposal have been settled, so I'd love to move forward with adopting this. I share @jlewi's viewpoint that this is the best path for community productivity in the near term.

/lgtm
/approve

@k8s-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jlewi, theadactyl

@k8s-ci-robot k8s-ci-robot merged commit 950e35b into kubeflow:master Nov 4, 2020
@Bobgy
Contributor

Bobgy commented Nov 4, 2020

Realistically the "reference" distribution is just those YAMLs, and a basic validation that you can actually deploy them.

Those YAMLs are already maintained by the @kubeflow/manifests WG, and application owners should be responsible for documenting how to deploy their application on bare Kubernetes. For example, pipelines provides https://www.kubeflow.org/docs/pipelines/installation/standalone-deployment/, which anyone can deploy to any k8s cluster.

If all we need is to simply put them together, then why do we need a WG to do that? Each vendor should be able to easily pick what they want, put together the reference distribution, and adapt it to their own needs. Anyone is also free to put together the "generic" distribution or form a group that contributes to one "generic" distribution.
