[feature] generic viewer operator for managing user webapps in Kubeflow #5681

Bobgy · 2021-05-19T03:34:34Z

Feature Area

/area backend

What feature would you like to see?

Besides TensorBoard, the KFP viewer controller should support generic viewers.

A viewer is a long-running container that exposes a web app on a given port, along with the setup required to expose it through ingress (e.g. a VirtualService in Istio).
It can help visualize outputs of a pipeline component, but it can also be used outside of KFP like #5651.

There are a few different use-cases we are currently getting:

tensorboard (supported)
file browser [feature] Add PVC File Browser to Viewer CRD #5651
captum insights https://captum.ai/docs/captum_insights
jupyter notebooks / vscode / rstudio (if we unify with Kubeflow notebooks controller)

All of them fit into this category, which suggests that a generic viewer operator that abstracts only ingress setup and lifecycle control would be a good fit. The configuration specific to each type of service we want to expose can be supplied by users of the Viewer CRD.

Strawman Proposal

A generic viewer CRD like the following:

apiVersion: pipelines.kubeflow.org/v1beta2
kind: Viewer
spec:
  ingress:
    type: istio.virtualservice # maybe we can have more type supports
  containers:
  - name: main
    image: tensorflow:2.3
    command: ['python3', '-m']
    arguments: ['tensorboard', '--port', '8080', '--bind-all']
    envs:
    - name: AWS_SECRET
      valueFrom:
      - xxxx
    port: 8080

This custom resource will be used to set up the web app for external access with:

deployment
service
virtualservice
authorizationpolicy

The major value of the generic viewer operator is that it unifies the resources needed to make this web app available to users securely. Also, when this custom resource is created, updated, or deleted, the operator makes sure the whole group of resources is created, updated, or deleted with it.
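As a rough illustration of the fan-out described above, the sketch below expands one Viewer-like object into the four child resources. It is a minimal sketch under stated assumptions, not the KFP controller: `render_children`, the `allowed_user` parameter, the `app` label key, the `/viewer/<namespace>/<name>/` route prefix, the `kubeflow/kubeflow-gateway` gateway, and the `kubeflow-userid` header are all assumptions for illustration, and the manifests are trimmed to the essentials.

```python
from typing import Any, Dict, List

Manifest = Dict[str, Any]


def render_children(viewer: Manifest, allowed_user: str) -> List[Manifest]:
    """Expand one Viewer object into Deployment, Service, VirtualService
    and AuthorizationPolicy manifests (hypothetical helper, trimmed fields)."""
    name = viewer["metadata"]["name"]
    namespace = viewer["metadata"]["namespace"]
    container = viewer["spec"]["containers"][0]  # strawman: one main container
    port = container["port"]
    labels = {"app": name}  # assumed label key tying the resources together

    def meta() -> Manifest:
        # Fresh copy per resource so the manifests do not share state.
        return {"name": name, "namespace": namespace, "labels": dict(labels)}

    deployment = {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": meta(),
        "spec": {
            "replicas": 1,
            "selector": {"matchLabels": dict(labels)},
            "template": {
                "metadata": {"labels": dict(labels)},
                "spec": {"containers": [{
                    "name": container["name"],
                    "image": container["image"],
                    "command": list(container.get("command", [])),
                    # The strawman's "arguments" maps to the container "args".
                    "args": list(container.get("arguments", [])),
                    "ports": [{"containerPort": port}],
                }]},
            },
        },
    }
    service = {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": meta(),
        "spec": {
            "selector": dict(labels),
            "ports": [{"name": "http", "port": 80, "targetPort": port}],
        },
    }
    virtual_service = {
        "apiVersion": "networking.istio.io/v1beta1",
        "kind": "VirtualService",
        "metadata": meta(),
        "spec": {
            "hosts": ["*"],
            "gateways": ["kubeflow/kubeflow-gateway"],  # assumed gateway
            "http": [{
                "match": [{"uri": {"prefix": f"/viewer/{namespace}/{name}/"}}],
                "route": [{"destination": {
                    "host": f"{name}.{namespace}.svc.cluster.local",
                    "port": {"number": 80},
                }}],
            }],
        },
    }
    authorization_policy = {
        "apiVersion": "security.istio.io/v1beta1",
        "kind": "AuthorizationPolicy",
        "metadata": meta(),
        "spec": {
            "selector": {"matchLabels": dict(labels)},
            "action": "ALLOW",
            # Assumed identity header; only this user may reach the pod.
            "rules": [{"when": [{
                "key": "request.headers[kubeflow-userid]",
                "values": [allowed_user],
            }]}],
        },
    }
    return [deployment, service, virtual_service, authorization_policy]
```

A real operator would additionally set owner references on the children so that deleting the Viewer garbage-collects them, and would reconcile drift on every update rather than rendering once.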

I think the main point of contention is whether the viewer should encode domain knowledge about each type of service it starts. With the number of different use-cases we have seen, it sounds to me that we'd better leave that domain knowledge to a different layer of abstraction. Curious what others think about that.

What is the use case or pain point?

This also helps mitigate the problem that the Kubeflow community has two operators to support these features: kubeflow/kubeflow#5921.

Love this idea? Give it a 👍. We prioritize fulfilling features with the most 👍.


davidspek · 2021-05-19T09:29:23Z

I think this would be a great step forward in making it possible to easily run any type of web app with Kubeflow. The example resource above would of course need some additional values, such as resource limits and requests. Another important value to add is a type classification, so that the various web apps (such as the Jupyter Web App and the Tensorboards Web App) can filter through the different types of viewer.

Another possibility is to adopt the newer (I think) CRD naming scheme, resulting in something similar to notebook.webapps.kubeflow.org, tensorboard.webapps.kubeflow.org and filebrowser.webapps.kubeflow.org. However, I'm not sure whether the same reconciliation code can be used for each type with this setup.

If going for the first option with a type that is specified in the resource, it would be a great feature to have this configurable for the cluster admin. This way, if they have some custom web app that they would like to deploy, the existing controller can be used. Another option would be to have the type not actually affect anything in the reconciliation loop, allowing any arbitrary value to be used.

kimwnasptd · 2021-05-25T13:41:43Z

Thank you for starting this discussion @Bobgy! This is a feature that we in the Notebooks WG have also been discussing for a long time: kubeflow/kubeflow#3578 (comment). Let me add some insights from running Notebooks, which could help us evaluate such a proposal.

The first thing I'd like to point out is we should consider exposing a PodTemplateSpec instead of ContainerSpec, in the CR's spec, since it will allow us to configure at least the following things:

A PVC to be used with the app's pod
The ServiceAccount
Affinities and Tolerations

"With the number of different use-cases we have seen, it sounds to me that we'd better leave that domain knowledge to a different layer of abstraction."

I totally agree with that approach. If we'd like to include different apps then we should be thinking of only the common APIs that someone would need to configure for exposing such apps.

Then it will be up to another abstraction, for example a managing web app, to craft CRs for specific applications [Jupyter, TensorBoard, Captum etc], as you mentioned.

kimwnasptd · 2021-05-25T13:42:44Z

I'd also like to expose some configuration that we've found necessary for making Notebook servers run under a prefix. Some apps expect requests under the prefix path, while others always expect requests under /.

The minimum necessary configuration for this is the spec.http[i].rewrite.uri field in the VirtualService.

Also, we've seen applications, like RStudio, that require the prefix to be included in a custom header in each request.

The most flexible solution here would be to allow users to modify the entire spec of a VirtualService. But this will make it more involved to create such CRs, since the user will need to provide the Service name, gateways etc. in the spec.
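To make the trade-off above concrete, here is a minimal sketch of the narrower option: the operator builds the VirtualService route itself and exposes only a rewrite path and request headers. `build_http_route` and its parameters are hypothetical names; the emitted keys (`match`, `rewrite.uri`, `headers.request.set`, `route`) are fields of an Istio VirtualService HTTP route.

```python
from typing import Any, Dict, Optional


def build_http_route(
    prefix: str,
    service_host: str,
    service_port: int,
    path_rewrite: Optional[str] = None,
    request_headers: Optional[Dict[str, str]] = None,
) -> Dict[str, Any]:
    """Build one entry of a VirtualService's spec.http list (hypothetical helper)."""
    route: Dict[str, Any] = {
        "match": [{"uri": {"prefix": prefix}}],
        "route": [{
            "destination": {"host": service_host, "port": {"number": service_port}},
        }],
    }
    if path_rewrite is not None:
        # Apps that always expect requests under "/" need the prefix stripped.
        route["rewrite"] = {"uri": path_rewrite}
    if request_headers:
        # Apps that need the prefix passed in a header on every request.
        route["headers"] = {"request": {"set": dict(request_headers)}}
    return route


# An app served at "/" behind a prefixed URL, which is also told its prefix.
# The header name is only an example value.
example_route = build_http_route(
    prefix="/tensorboard/kubeflow/tb-instance/",
    service_host="tb-instance.kubeflow.svc.cluster.local",
    service_port=80,
    path_rewrite="/",
    request_headers={"X-Forwarded-Prefix": "/tensorboard/kubeflow/tb-instance"},
)
```

With this shape the user never has to supply the Service name or gateways; an app that expects requests under its prefix simply omits `path_rewrite`.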

kimwnasptd · 2021-05-25T13:43:24Z

Lastly, I'd also like to point out that an interesting feature would be to allow users to configure the replicas of the underlying Deployment.

This will essentially allow users to start/stop the underlying Pods, while still maintaining the CR.

So by taking all the above into consideration I'd propose the following iteration:

Strawman proposal v2

apiVersion: pipelines.kubeflow.org/v1beta2
kind: Viewer
spec:
  ingress:
    type: istio.virtualservice # maybe we can have more types
    pathRewrite: /
    httpHeaders:
    - name: X-Forwarded-Prefix
      value: /tensorboard/kubeflow/tb-instance
  replicas: 1
  template:
    spec:
      containers:
      - name: main
        image: tensorflow:2.3
        command: ['python3', '-m']
        arguments: ['tensorboard', '--port', '8080', '--bind-all']
        envs:
        - name: AWS_SECRET
          valueFrom:
          - xxxx
        port: 8080

Would really like to hear your feedback. Also I believe another useful thing to discuss is how to handle the ports the container exposes and the underlying Service. Should we take for granted that the Service will only be sending traffic to Pod's 8080 port?

Bobgy · 2021-05-27T08:56:06Z

That already looks great!
I think we can also use a ports field instead, because Istio often requires port names to start with the protocol name.

Shall we put this into a design doc now?

davidspek · 2021-05-27T16:06:21Z

I was just going to say the same thing about the ports. This will also be useful for pods that expose services on multiple ports (such as metrics for example).
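A small sketch of how a ports list could be turned into Service ports that follow Istio's `<protocol>[-<suffix>]` naming convention. The input layout (`name`, `port`, and an `appProtocol` key defaulting to `http`) is a hypothetical shape for the Viewer spec, not an agreed API, and the sketch does not enforce Kubernetes' length limit on port names.

```python
from typing import Any, Dict, List


def service_ports(viewer_ports: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Derive Service ports whose names start with the protocol name."""
    result = []
    for entry in viewer_ports:
        protocol = str(entry.get("appProtocol", "http")).lower()
        name = str(entry.get("name", ""))
        already_prefixed = name == protocol or name.startswith(protocol + "-")
        if not already_prefixed:
            # Prefix the user-chosen name, or fall back to the bare protocol.
            name = f"{protocol}-{name}" if name else protocol
        result.append({
            "name": name,
            "port": entry["port"],
            "targetPort": entry["port"],
        })
    return result


# A main web port plus a metrics port, as in the multi-port case above.
ports = service_ports([
    {"name": "web", "port": 8080},
    {"name": "http-metrics", "port": 9090},
    {"port": 6006},
])
```

Here `ports` carries the names `http-web`, `http-metrics`, and `http`, so each port name tells Istio which protocol to expect.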

kimwnasptd · 2021-06-02T17:19:01Z

@Bobgy glad to hear!

I will start working on a first iteration of a design doc so that we can further iterate and evaluate some edge cases. Do you have a template for design docs in pipelines I should follow?

Bobgy · 2021-06-07T09:15:38Z

@kimwnasptd Great! I am not opinionated about a template, you can use whatever template you prefer.

ca-scribner · 2021-06-14T18:55:57Z

Related to this, would it be helpful to think about enabling generic dashboards through this same method (I'm thinking of a Dash or Flask dashboard, or maybe even R Shiny)? I don't want to overload it, but general dashboards are the other use case that feels similar to this. In some ways I guess it is a generic version of the tensorboard use case.

davidspek · 2021-06-14T19:00:17Z

@ca-scribner The viewer CRD would be a candidate for these dashboard applications as well. Creating a web app that allows you to launch dashboards is something I’ve discussed with multiple people recently and something I would like to add in the future. Some thought will need to go into the authorization policies, so that non-Kubeflow users can access the dashboards as well (probably by using an OIDC group). Similar functionality will probably be desired for KFServing endpoints as well.

ca-scribner · 2021-06-14T19:20:01Z

This all sounds perfect, thanks!


davidspek · 2021-06-16T13:40:18Z

One thing that will need careful consideration with the generic viewer is how to deal with RBAC permissions, for example if you want to allow a user to create tensorboards but not a file browser instance. To support this, I think it will be necessary to define multiple Kinds for the different viewers but have them share (most of) the reconciliation loop. This also allows for some domain-specific implementations. Adding a layer of abstraction above this controller would probably require another controller, partially defeating the purpose of a single unified controller. The different specs would look similar to the following:

apiVersion: viewer.kubeflow.org/v1beta2
kind: Tensorboard
....

apiVersion: viewer.kubeflow.org/v1beta2
kind: Filebrowser
....
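The RBAC point above can be illustrated with a simplified rule check. Kubernetes RBAC rules match on API group, resource, and verb, so separate Kinds give separate resources to grant. In this sketch the plural resource names `tensorboards`, `filebrowsers`, and `viewers`, the Role name, and the namespace are assumptions, and `role_allows` is a hypothetical helper that ignores `resourceNames`, subresources, and partial wildcards.

```python
from typing import Any, Dict, List

API_GROUP = "viewer.kubeflow.org"  # group used in the example specs above

# Role that lets a user manage Tensorboard objects only (names assumed).
tensorboard_only_role: Dict[str, Any] = {
    "apiVersion": "rbac.authorization.k8s.io/v1",
    "kind": "Role",
    "metadata": {"name": "tensorboard-editor", "namespace": "team-a"},
    "rules": [{
        "apiGroups": [API_GROUP],
        "resources": ["tensorboards"],
        "verbs": ["get", "list", "watch", "create", "delete"],
    }],
}


def _matches(allowed: List[str], value: str) -> bool:
    """True if the rule list names the value or uses the '*' wildcard."""
    return "*" in allowed or value in allowed


def role_allows(role: Dict[str, Any], api_group: str, resource: str, verb: str) -> bool:
    """Simplified RBAC evaluation: any single rule matching all three fields grants access."""
    return any(
        _matches(rule.get("apiGroups", []), api_group)
        and _matches(rule.get("resources", []), resource)
        and _matches(rule.get("verbs", []), verb)
        for rule in role.get("rules", [])
    )


can_create_tensorboard = role_allows(tensorboard_only_role, API_GROUP, "tensorboards", "create")
can_create_filebrowser = role_allows(tensorboard_only_role, API_GROUP, "filebrowsers", "create")
```

With a single generic `viewers` resource, a rule of this kind could not tell a tensorboard from a file browser, because the viewer type would live inside the spec where RBAC does not look.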

stale · 2021-10-02T01:03:47Z

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale · 2022-03-03T03:04:57Z

This issue has been automatically closed because it has not had recent activity. Please comment "/reopen" to reopen it.

Bobgy added the kind/feature label May 19, 2021

google-oss-robot added the area/backend label May 19, 2021

This was referenced May 19, 2021

[feature] Add PVC File Browser to Viewer CRD #5651

Closed

Kubeflow contains 2 controllers for launching Tensorboards kubeflow/kubeflow#5921

Closed

Bobgy changed the title ~~[feature] generic viewer support~~ [feature] generic viewer operator for managing user webapps in Kubeflow May 19, 2021

zijianjoy added the needs more info label May 21, 2021

zijianjoy added this to Needs triage in KFP Runtime Triage via automation May 21, 2021

zijianjoy moved this from Needs triage to Needs More Info in KFP Runtime Triage May 21, 2021

davidspek mentioned this issue May 24, 2021

[feature] Merge Pipelines Profile Controller functionality into the general profile controller #5728

Closed

kimwnasptd mentioned this issue Jun 14, 2021

Notebooks WG Roadmap for 1.4 kubeflow/kubeflow#5978

Closed

6 tasks

davidspek mentioned this issue Jun 22, 2021

Tensorboard-controller's tensorboard image should be configurable kubeflow/kubeflow#6008

Closed

stale bot added the lifecycle/stale The issue / pull request is stale, any activities remove this label. label Oct 2, 2021

stale bot closed this as completed Mar 3, 2022

KFP Runtime Triage automation moved this from Needs More Info to Closed Mar 3, 2022

TobiasGoerke mentioned this issue Jan 5, 2023

Re-Introducing the Volumes Viewer kubeflow/kubeflow#6876

Merged
