diff --git a/content/en/docs/components/pipelines/v2/components/additional-functionality.md b/content/en/docs/components/pipelines/v2/components/additional-functionality.md
new file mode 100644
index 0000000000..01a4382853
--- /dev/null
+++ b/content/en/docs/components/pipelines/v2/components/additional-functionality.md
@@ -0,0 +1,70 @@
++++
+title = "Additional Functionality"
+description = "More information about authoring KFP components"
+weight = 6
++++
+
+### Component docstring format
+KFP allows you to document your components and pipelines using Python docstrings. The KFP SDK automatically parses your docstrings and includes certain fields in [IR YAML][ir-yaml] when you compile components and pipelines.
+
+For components, KFP can extract your component **input descriptions** and **output descriptions**.
+
+For pipelines, KFP can extract your pipeline **input descriptions** and **output descriptions**, as well as a **description of your full pipeline**.
+
+For the KFP SDK to correctly parse your docstrings, you should write your docstrings in the KFP docstring style. The KFP docstring style is a particular variant of the [Google docstring style][google-docstring-style], with the following changes:
+* The `Returns:` section takes the same structure as the `Args:` section, where each return value in the `Returns:` section should take the form `<name>: <description>`. This is distinct from the typical Google docstring `Returns:` section, which takes the form `<type>: <description>`, with no names for return values.
+* Component outputs should be included in the `Returns:` section, even though they are declared via component function input parameters. This applies to function parameters annotated with [`dsl.OutputPath`][dsl-outputpath] and the [`Output[]`][output-type-marker] type marker for declaring [output artifacts][output-artifacts].
+* *Suggested:* Type information, including which inputs are optional/required, should be omitted from the input/output descriptions. This information duplicates the annotations.
+
+For example, the KFP SDK can extract input and output descriptions from the following component docstring, which uses the KFP docstring style:
+
+
+```python
+@dsl.component
+def join_datasets(
+    dataset_a: Input[Dataset],
+    dataset_b: Input[Dataset],
+    out_dataset: Output[Dataset],
+) -> str:
+    """Concatenates two datasets.
+
+    Args:
+        dataset_a: First dataset.
+        dataset_b: Second dataset.
+
+    Returns:
+        out_dataset: The concatenated dataset.
+        Output: The concatenated string.
+    """
+    ...
+```
+
+Similarly, KFP can extract the pipeline input descriptions, the pipeline output descriptions, and the pipeline description from the following pipeline docstring:
+
+```python
+@dsl.pipeline(display_name='Concatenation pipeline')
+def dataset_concatenator(
+    string: str,
+    in_dataset: Input[Dataset],
+) -> Dataset:
+    """Pipeline to convert string to a Dataset, then concatenate with
+    in_dataset.
+
+    Args:
+        string: String to concatenate to in_dataset.
+        in_dataset: Dataset to which to concatenate string.
+
+    Returns:
+        Output: The final concatenated dataset.
+    """
+    ...
+```
+
+Note that if you provide a `description` argument to the [`@dsl.pipeline`][dsl-pipeline] decorator, KFP will use this description instead of the docstring description.
+
+[ir-yaml]: /docs/components/pipelines/v2/compile-a-pipeline#ir-yaml
+[google-docstring-style]: https://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_google.html
+[dsl-pipeline]: https://kubeflow-pipelines.readthedocs.io/en/master/source/dsl.html#kfp.dsl.pipeline
+[output-artifacts]: /docs/components/pipelines/v2/data-types/artifacts#declaring-inputoutput-artifacts
+[dsl-outputpath]: https://kubeflow-pipelines.readthedocs.io/en/latest/source/dsl.html#kfp.dsl.OutputPath
+[output-type-marker]: https://kubeflow-pipelines.readthedocs.io/en/latest/source/dsl.html#kfp.dsl.Output
\ No newline at end of file
diff --git a/content/en/docs/components/pipelines/v2/components/importer-component.md b/content/en/docs/components/pipelines/v2/components/importer-component.md
index 44f7234cf5..7f97a3b6c6 100644
--- a/content/en/docs/components/pipelines/v2/components/importer-component.md
+++ b/content/en/docs/components/pipelines/v2/components/importer-component.md
@@ -1,5 +1,5 @@
 +++
-title = "Special case: Importer Components"
+title = "Special Case: Importer Components"
 description = "Import artifacts from outside your pipeline"
 weight = 5
 +++
diff --git a/content/en/docs/components/pipelines/v2/pipelines/pipeline-basics.md b/content/en/docs/components/pipelines/v2/pipelines/pipeline-basics.md
index c3e107c31f..6a8061f9cb 100644
--- a/content/en/docs/components/pipelines/v2/pipelines/pipeline-basics.md
+++ b/content/en/docs/components/pipelines/v2/pipelines/pipeline-basics.md
@@ -50,17 +50,21 @@ KFP pipelines are defined inside functions decorated with the `@dsl.pipeline` de
 * `name` is the name of your pipeline. If not provided, the name defaults to a sanitized version of the pipeline function name.
 * `description` is a description of the pipeline.
 * `pipeline_root` is the root path of the remote storage destination within which the tasks in your pipeline will create outputs. `pipeline_root` may also be set or overridden by pipeline submission clients.
+* `display_name` is a human-readable name for your pipeline.
 
 You can modify the definition of `pythagorean` to use these arguments:
 
 ```python
 @dsl.pipeline(name='pythagorean-theorem-pipeline',
               description='Solve for the length of a hypotenuse of a triangle with sides length `a` and `b`.',
-              pipeline_root='gs://my-pipelines-bucket')
+              pipeline_root='gs://my-pipelines-bucket',
+              display_name='Pythagorean pipeline.')
 def pythagorean(a: float, b: float) -> float:
     ...
 ```
 
+Also see [Additional Functionality: Component docstring format][component-docstring-format] for information on how to provide pipeline metadata via docstrings.
+
 ### Pipeline inputs and outputs
 Like [components][components], pipeline inputs and outputs are defined by the parameters and annotations in the pipeline function signature.
 
@@ -190,4 +194,5 @@ def pythagorean(a: float = 1.2, b: float = 1.2) -> float:
 [output-artifacts]: /docs/components/pipelines/v2/data-types/artifacts#using-output-artifacts
 [container-component-outputs]: /docs/components/pipelines/v2/components/container-components#create-component-outputs
 [parameters-namedtuple]: /docs/components/pipelines/v2/data-types/parameters#multiple-output-parameters
-[dsl-pipeline-job-name-placeholder]: https://kubeflow-pipelines.readthedocs.io/en/master/source/dsl.html#kfp.dsl.PIPELINE_JOB_NAME_PLACEHOLDER
\ No newline at end of file
+[dsl-pipeline-job-name-placeholder]: https://kubeflow-pipelines.readthedocs.io/en/master/source/dsl.html#kfp.dsl.PIPELINE_JOB_NAME_PLACEHOLDER
+[component-docstring-format]: /docs/components/pipelines/v2/components/additional-functionality#component-docstring-format
\ No newline at end of file
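
The docstring convention documented in this change can be illustrated without the KFP SDK. The following is a minimal sketch and not KFP's actual parser: `parse_kfp_docstring` is a hypothetical helper, and it assumes that every entry in the `Args:` and `Returns:` sections sits on a single line with a name, a colon, and a description. It shows how named `Returns:` entries make outputs addressable in the same way as inputs.

```python
import inspect


def parse_kfp_docstring(docstring: str) -> dict:
    """Splits a KFP-style docstring into description, inputs, and outputs.

    Hypothetical helper for illustration only; assumes one-line entries.
    """
    sections = {"description": [], "Args:": {}, "Returns:": {}}
    current = "description"
    # cleandoc removes the indentation that docstrings inherit from the code.
    for line in inspect.cleandoc(docstring).splitlines():
        stripped = line.strip()
        if stripped in ("Args:", "Returns:"):
            current = stripped  # switch to the named section
        elif not stripped:
            continue  # skip blank separator lines
        elif current == "description":
            sections["description"].append(stripped)
        else:
            # Each entry is a name, a colon, and a description.
            name, _, description = stripped.partition(":")
            sections[current][name.strip()] = description.strip()
    return {
        "description": " ".join(sections["description"]),
        "inputs": sections["Args:"],
        "outputs": sections["Returns:"],
    }


def join_datasets():
    """Concatenates two datasets.

    Args:
        dataset_a: First dataset.
        dataset_b: Second dataset.

    Returns:
        out_dataset: The concatenated dataset.
        Output: The concatenated string.
    """


parsed = parse_kfp_docstring(join_datasets.__doc__)
print(parsed["inputs"])
print(parsed["outputs"])
```

Because the `Returns:` entries carry names (`out_dataset` and `Output`), each description can be matched to a specific output, which a plain Google-style `Returns:` section without names would not allow.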