
v1 visualizations backward compatible in v2 compatible #5666

Closed · 3 tasks done
Bobgy opened this issue May 18, 2021 · 9 comments
Assignees
Labels: area/launcher · area/sdk · lifecycle/stale · size/L

Comments

Bobgy (Contributor) commented May 18, 2021

Support v1 visualizations:
https://www.kubeflow.org/docs/components/pipelines/sdk/output-viewer/

Changes needed:

@Bobgy Bobgy created this issue from a note in KFP v2 compatible mode (To do (P0)) May 18, 2021
Bobgy (Contributor, Author) commented May 18, 2021

/assign @chensun
for SDK
/assign @zijianjoy
for UI

@Bobgy Bobgy changed the title [v2compat] support v1 visualizations [v2compat] backward compatible support to v1 visualizations May 18, 2021
@Bobgy Bobgy changed the title [v2compat] backward compatible support to v1 visualizations [v2compat] v1 visualizations backward compatible support May 18, 2021
@Bobgy Bobgy changed the title [v2compat] v1 visualizations backward compatible support v1 visualizations backward compatible support May 18, 2021
@Bobgy Bobgy changed the title v1 visualizations backward compatible support v1 visualizations backward compatible in v2 compatible May 18, 2021
Ark-kun (Contributor) commented May 18, 2021

I'd really like to see mlpipeline-ui-metadata gone in the future.
But first there needs to be an alternative to it.
Supporting "type-based visualizations" and "visualizations as components" will provide a better alternative.

Bobgy (Contributor, Author) commented May 19, 2021

@Ark-kun the goal of this issue is to ease migration: when switching to v2-compatible pipelines, people can keep their existing pipelines and gradually move to type-based visualizations.
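
For context, a minimal sketch of what the type-based alternative can look like in v2 compatible mode, assuming the kfp.v2.dsl API (the ClassificationMetrics artifact type and its log_confusion_matrix helper); the component name and the hard-coded values are illustrative only:

from kfp.v2.dsl import ClassificationMetrics, Output, component

@component
def confusion_matrix_vis(metrics: Output[ClassificationMetrics]):
    # Log a tiny hard-coded confusion matrix; the UI renders it from the
    # typed artifact, with no mlpipeline-ui-metadata file involved.
    metrics.log_confusion_matrix(['cat', 'dog'], [[10, 2], [3, 15]])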

@Bobgy Bobgy moved this from P0 ETA 6.4 to P1 ETA 6.18 in KFP v2 compatible mode May 29, 2021
Bobgy (Contributor, Author) commented May 29, 2021

Moving to P1, because Chen does not have time to work on this before 6.4.

We may work on the doc update first.

@Bobgy mentioned this issue May 29, 2021
google-oss-robot pushed a commit that referenced this issue Jun 11, 2021
…art of #5666 (#5832)

* fix(sdk/compiler): v2 compat - fix mlpipeline-ui-metadata artifact

* fix

* add test case

* address feedback, fix the bug in less hacky way
jagadeeshi2i pushed a commit to chauhang/pipelines that referenced this issue Jun 12, 2021
…art of kubeflow#5666 (kubeflow#5832)

zijianjoy (Collaborator) commented

I have a few questions regarding writing sample pipelines and visualizing them on the UI:

Sample Pipeline

Is there a sample pipeline which writes a v1 mlpipeline-ui-metadata artifact in V2 compatible mode using Python? It would be great to have a pipeline that creates both V1 and V2 visualizations, so I can test the scenario where both V1 and V2 artifacts exist in a single execution. Furthermore:

mlpipeline-ui-metadata artifact naming in Python code

I tried to write the following component, but was not successful because Python doesn't accept - in the parameter name mlpipeline-ui-metadata:

@component
def visall(mlpipeline-ui-metadata: Output[Artifact]):
  import json
    
  metadata = {
    'outputs' : [
    # Markdown that is hardcoded inline
    {
      'storage': 'inline',
      'source': '# Inline Markdown\n[A link](https://www.kubeflow.org/)',
      'type': 'markdown',
    }]
  }
  with open(mlpipeline-ui-metadata.path, 'w') as metadata_file:
    json.dump(metadata, metadata_file)

I slightly changed the parameter name to something else so that I could preview the content of the output artifact. Here is the content:

{"outputs": [{"storage": "inline", "source": "# Inline Markdown\n[A link](https://www.kubeflow.org/)", "type": "markdown"}]}
Source data sample for each visualization

There are 6 different V1 output viewers in https://www.kubeflow.org/docs/components/pipelines/sdk/output-viewer/#available-output-viewers, but I am looking for sample data (a CSV file or another format) that I can use as the visualization content. See below:

  • Confusion matrix
    • CONFUSION_MATRIX_CSV_FILE and vocab from the doc example.
  • Markdown
    • A Google Cloud Storage path for a Markdown file (what kind of permission do we need to provide?)
  • ROC curve
    • the roc_file file
  • Table
    • the prediction_results file
  • TensorBoard
    • the args.job_dir path
  • Web app
    • A Google Cloud Storage path for an HTML file.
Read file permission and test cases

If we use a GCS path or a MinIO path for storing the visualization data, what kind of permission do we need to give KFP to access those files? Can they be added to the KFP repo?

Bobgy (Contributor, Author) commented Jun 26, 2021

> Sample Pipeline
>
> Is there a sample pipeline which writes a v1 mlpipeline-ui-metadata artifact in V2 compatible mode using Python? It would be great to have a pipeline that creates both V1 and V2 visualizations, so I can test the scenario where both V1 and V2 artifacts exist in a single execution. Furthermore:

It's blocked by #5831. Ideally, all v1 samples should be able to run in v2 compatible mode.

> mlpipeline-ui-metadata artifact naming in Python code
>
> I tried to write the following component, but was not successful because Python doesn't accept - in the parameter name mlpipeline-ui-metadata: […]
>
> I slightly changed the parameter name to something else so that I could preview the content of the output artifact. […]

Traditionally, KFP v1 sanitizes artifact names by lower-casing all letters and turning all non-letter characters into -. Therefore, you don't need to match the exact name in Python.

I think it's worth considering letting the UI do the name sanitization for this specific artifact name if SDK v2 doesn't.
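
For illustration, a rough sketch of the sanitization rule as described above; the actual SDK implementation may differ (for example in how digits are handled), so treat this as an approximation:

import re

def sanitize_artifact_name(name: str) -> str:
    # Approximation of the v1 behavior described above: lowercase everything
    # and collapse runs of non-letter characters into '-'.
    return re.sub('[^a-z]+', '-', name.lower()).strip('-')

assert sanitize_artifact_name('mlpipeline_ui_metadata') == 'mlpipeline-ui-metadata'
assert sanitize_artifact_name('MLPipeline UI Metadata') == 'mlpipeline-ui-metadata'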

> Source data sample for each visualization
>
> There are 6 different V1 output viewers in https://www.kubeflow.org/docs/components/pipelines/sdk/output-viewer/#available-output-viewers, but I am looking for sample data (a CSV file or another format) that I can use as the visualization content. […]

I'm not quite following. Do you want samples for these cases?

> Read file permission and test cases
>
> If we use a GCS path or a MinIO path for storing the visualization data, what kind of permission do we need to give KFP to access those files? Can they be added to the KFP repo?

The KFP UI server fetches these artifacts. By default we already mount MinIO secrets on it, and on GCP it gets the cluster's default service account permissions. So you don't need to worry about permissions; users are responsible for setting them up in their cluster.

zijianjoy (Collaborator) commented Jun 26, 2021

Thank you @Bobgy for the answers!

> It's blocked by #5831. Ideally, all v1 samples should be able to run in v2 compatible mode.

So we cannot create a Python component with a - parameter name, and cannot create a YAML component with a space in the name yet. Can we use the lightweight Python component approach to create v1 mlpipeline_ui_metadata in v2 compatible mode? https://github.com/kubeflow/pipelines/blob/master/samples/core/lightweight_component/lightweight_component.ipynb

Or is there any other approach to create such a pipeline while waiting for the blocking issue to be resolved?

> Traditionally, KFP v1 sanitizes artifact names by lower-casing all letters and turning all non-letter characters into -. Therefore, you don't need to match the exact name in Python.
>
> I think it's worth considering letting the UI do the name sanitization for this specific artifact name if SDK v2 doesn't.

I would like to learn more about the sanitize-in-the-UI approach: does this logic already handle sanitization on the SDK side? https://github.com/kubeflow/pipelines/pull/5832/files#diff-6e6c99d682c8b3b8d586a8c50d381dee529c0c58622316532c6574c4f511b655

Also, how does a user define a component in the v1 fashion that is still runnable in v2 compatible mode? More specifically, how does a user define a mlpipeline-ui-metadata parameter in Python? I tried the component definition def visall(mlpipeline_ui_metadata: Output[Artifact]) but the SDK fails with TypeError: 'ContainerOp' object is not callable.

> I'm not quite following. Do you want samples for these cases?

Sorry for the confusion. Yes, I want a simplified sample for each of these output viewer cases. For example, we could define a simple confusion matrix with only 2 labels, but with the actual CSV file and vocab living in the KFP repo, or with the confusion matrix values hard-coded as an array in Python code, so they are convenient to use. These placeholders are referenced in the Kubeflow doc: https://www.kubeflow.org/docs/components/pipelines/sdk/output-viewer/#available-output-viewers.
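
For illustration, a hedged sketch of the kind of hard-coded sample being described here, using the v1 confusion_matrix metadata schema from the output-viewer doc; the two labels, the counts, and the use of inline storage are assumptions to verify (if inline is not supported for this viewer, write the CSV to an object-store path and point source at it):

import json

# Hypothetical two-label confusion matrix, hard-coded for convenience.
labels = ['cat', 'dog']
rows = [
    ('cat', 'cat', 10),
    ('cat', 'dog', 2),
    ('dog', 'cat', 3),
    ('dog', 'dog', 15),
]
csv_data = '\n'.join(f'{target},{predicted},{count}' for target, predicted, count in rows)

metadata = {
    'outputs': [{
        'type': 'confusion_matrix',
        'format': 'csv',
        'schema': [
            {'name': 'target', 'type': 'CATEGORY'},
            {'name': 'predicted', 'type': 'CATEGORY'},
            {'name': 'count', 'type': 'NUMBER'},
        ],
        'labels': labels,
        'storage': 'inline',
        'source': csv_data,
    }]
}
print(json.dumps(metadata))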

Is there an existing sample which achieves this goal? I found https://github.com/kubeflow/pipelines/blob/master/components/local/confusion_matrix/component.yaml but got stuck trying to create a GCSPath object for the Predictions parameter.

> The KFP UI server fetches these artifacts. By default we already mount MinIO secrets on it, and on GCP it gets the cluster's default service account permissions. So you don't need to worry about permissions; users are responsible for setting them up in their cluster.

Sounds good, thank you for clarifying!

Bobgy (Contributor, Author) commented Jun 26, 2021

@zijianjoy https://www.github.com/kubeflow/pipelines/tree/f3e368410e2d29cf880cd6debc95572b636dd208/samples/core/visualization/tensorboard_minio.py

You can adapt this example first. The only way to write a visualization component that is compatible with both v1 and v2 right now is to write a YAML component with the artifact name mlpipeline-ui-metadata (it doesn't have spaces in the name).
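
As a rough, untested sketch of that suggestion: a YAML component whose output artifact is named exactly mlpipeline-ui-metadata, loaded with kfp.components.load_component_from_text; the component name, image, and inline shell command are placeholders, not a vetted recipe:

from kfp import components

# The output name "mlpipeline-ui-metadata" is the part that matters here;
# everything else is illustrative.
markdown_vis_op = components.load_component_from_text('''
name: Markdown visualization
outputs:
- {name: mlpipeline-ui-metadata}
implementation:
  container:
    image: python:3.7
    command:
    - sh
    - -ec
    - |
      mkdir -p "$(dirname "$0")"
      echo '{"outputs": [{"storage": "inline", "source": "# Hello from v1 metadata", "type": "markdown"}]}' > "$0"
    - {outputPath: mlpipeline-ui-metadata}
''')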

Another option for lightweight components is to return the JSON metadata as a NamedTuple output. See the example at https://github.com/kubeflow/pipelines/blob/master/samples/core/lightweight_component/lightweight_component.ipynb.
I think it also works in v2 compatible mode, but you may need to confirm.
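
A minimal sketch of that NamedTuple pattern, modeled on the v1 output-viewer docs; whether the mlpipeline_ui_metadata output is picked up the same way in v2 compatible mode is exactly the part to confirm:

from typing import NamedTuple

def markdown_vis() -> NamedTuple('Outputs', [('mlpipeline_ui_metadata', 'UI_metadata')]):
    # Lightweight Python function: return the ui-metadata JSON as a named output.
    import json
    from collections import namedtuple

    metadata = {
        'outputs': [{
            'storage': 'inline',
            'source': '# Inline Markdown\n[A link](https://www.kubeflow.org/)',
            'type': 'markdown',
        }]
    }
    outputs = namedtuple('Outputs', ['mlpipeline_ui_metadata'])
    return outputs(json.dumps(metadata))

# Hedged: with the v1 SDK this would typically be wrapped via
# kfp.components.create_component_from_func(markdown_vis).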

No, I don't have any existing examples; the existing examples are all built around a specific use case. It would be very helpful to build an example with all the available visualizations simply hard-coded.

google-oss-robot pushed a commit that referenced this issue Jul 5, 2021
…ial #5666 (#5961)

* feat(frontend) Support v1 metrics in v2 compatible mode

* address comments

* address
google-oss-robot pushed a commit to kubeflow/website that referenced this issue Aug 12, 2021
…s#5666 (#2867)

* feat(pipelines) KFP V2 visualization doc

* ROC Curve

* scalar and final wording polish

* typo

* address comments
stale bot commented Mar 2, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale bot added the lifecycle/stale label Mar 2, 2022