## Data Types
KFP components and pipelines can accept inputs and create outputs. To do so, they must declare typed interfaces through their function signatures and annotations.

There are two groups of types in KFP: parameters and artifacts. Parameters are useful for passing small amounts of data between components. Artifact types are the mechanism by which KFP provides first-class support for ML artifact outputs, such as datasets, models, and metrics.

KFP automatically tracks the way parameters and artifacts are passed between components and stores this data-passing history in ML Metadata. This enables out-of-the-box ML artifact lineage tracking and easily reproducible pipeline executions. Furthermore, KFP’s strongly-typed components provide a data contract between tasks in a pipeline.

## Pass small amounts of data between components
Parameters are useful for passing small amounts of data between components, and for cases where the data created by a component does not represent a machine learning artifact such as a model, dataset, or more complex data type. As with normal Python functions, input parameters can have default values.

KFP maps Python type annotations to the types stored in ML Metadata according to the following table:

Python object        | KFP type
---------------------|---------
`str`                | string
`int`                | number
`float`              | number
`bool`               | boolean
`typing.List` / `list` | object
`typing.Dict` / `dict` | object

Under the hood, KFP passes all parameters to and from components by serializing them as JSON. For all Python Components, parameter serialization and deserialization is invisible to the user; KFP handles this automatically. For Container Components, input parameter deserialization is also invisible to the user; KFP passes inputs to the component automatically. For Container Component outputs, however, the user code in the Container Component must handle serializing the output parameters.
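
To make the JSON round trip concrete, here is a minimal sketch that uses only the standard `json` module, not the KFP SDK. The `samples` dictionary and the `round_trip` helper are illustrative names, with one sample value per row of the type table above.

```python
import json

# One sample value per row of the type table above (illustrative values).
samples = {
    "str": "hello",
    "int": 3,
    "float": 0.5,
    "bool": True,
    "list": [1, 2, 3],
    "dict": {"lr": 0.01, "layers": [64, 32]},
}


def round_trip(value):
    """Serialize a value to JSON text and parse it back."""
    return json.loads(json.dumps(value))


for name, value in samples.items():
    restored = round_trip(value)
    # Show the serialized text and the Python type that comes back.
    print(f"{name:5} -> {json.dumps(value)!r} -> {type(restored).__name__}")
```

One practical consequence of plain JSON serialization is that values without a native JSON form do not survive unchanged: a tuple, for example, comes back from `json` as a list.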

### Input parameters
To declare input parameters, annotate your component function's arguments with their types and, optionally, default values.
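
As a rough illustration, the sketch below shows a signature with typed inputs and defaults. The function name `train` and its arguments are hypothetical. The `@dsl.component` decorator that a real component would carry is left as a comment so the snippet runs without the KFP SDK, and `inspect` is used only to show the information the signature exposes.

```python
import inspect


# In a real pipeline this function would be decorated with `@dsl.component`
# (from the `kfp` package); the decorator is omitted so this runs standalone.
def train(
    dataset_name: str,
    epochs: int = 10,
    learning_rate: float = 0.01,
    shuffle: bool = True,
):
    ...


# Annotations and defaults are plain signature metadata.
for name, param in inspect.signature(train).parameters.items():
    has_default = param.default is not inspect.Parameter.empty
    default = param.default if has_default else "required"
    print(name, param.annotation.__name__, default)
```

Arguments without a default (here `dataset_name`) must be supplied by the caller, exactly as with an ordinary Python function.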

### Output parameters
For Python Components and pipelines, output parameters are indicated via return annotations.

For Container Components, output parameters are indicated using a `dsl.OutputPath` annotation.
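
The two styles can be contrasted in a short sketch that does not depend on the KFP SDK. `add` and `write_output_parameter` are hypothetical names. The second function stands in for the user code of a Container Component, which receives the file path that a `dsl.OutputPath` placeholder resolves to and has to write the serialized value itself. The example uses an integer, whose JSON text is simply its decimal form; how other types (strings in particular) should be written is worth checking against the backend in use.

```python
import json
import os
import tempfile


# Python Component style: the return annotation declares the output parameter.
# (A real component would also carry `@dsl.component`.)
def add(a: int, b: int) -> int:
    return a + b


# Container Component style: user code is handed a path and must write the
# serialized output value to it.
def write_output_parameter(output_path: str, value) -> None:
    parent = os.path.dirname(output_path)
    if parent:
        # The parent directory is not assumed to exist yet.
        os.makedirs(parent, exist_ok=True)
    with open(output_path, "w") as f:
        f.write(json.dumps(value))


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "outputs", "sum")
    write_output_parameter(path, add(2, 3))
    with open(path) as f:
        print(f.read())
```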

### Multiple output parameters 
You can specify multiple named output parameters using a `typing.NamedTuple`. You can access a named output using `.outputs['<output-key>']` on `PipelineTask`:

```python
def my_comp() -> NamedTuple('outputs', a=int, b=str):
    ...
```
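
A fuller sketch of such a function, including a body that builds and returns the tuple, might look as follows. It uses the list-of-pairs form of `typing.NamedTuple` because the keyword form is deprecated in recent Python versions. The tuple type is rebuilt inside the body so the function stays self-contained, and the `@dsl.component` decorator is again omitted so the snippet runs without the KFP SDK.

```python
from typing import NamedTuple


# A real component would also carry `@dsl.component`.
def my_comp() -> NamedTuple('outputs', [('a', int), ('b', str)]):
    # Rebuild the tuple type inside the body so the function is self-contained.
    outputs = NamedTuple('outputs', [('a', int), ('b', str)])
    return outputs(a=1, b='hello')


result = my_comp()
print(result.a, result.b)
```

Inside a pipeline, the corresponding values would be read from the task as `.outputs['a']` and `.outputs['b']` rather than as attributes of a returned tuple.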

## Create, use, pass, and track ML artifacts
KFP provides first-class support for creating machine learning artifacts via the `dsl.Artifact` class and other artifact subclasses. KFP maps these artifacts to their underlying ML Metadata schema title, the canonical name for the artifact type.  
Artifacts are simply a thin wrapper around some artifact properties, including the `.path` from which the artifact can be read/written and the artifact’s `.metadata`. 

### Artifact properties
To create and consume artifacts from components, you’ll use the available properties on artifact instances. Artifacts feature four properties:
- `.name`, the name of the artifact (cannot be overwritten on Vertex Pipelines).
- `.uri`, the location of your artifact object. For input artifacts, this is where the object resides currently. For output artifacts, this is where you will write the artifact from within your component.
- `.metadata`, additional key-value pairs about the artifact.
- `.path`, a local path that corresponds to the artifact’s `.uri`.

The artifact `.path` attribute is particularly helpful. When you write the contents of your artifact to the location provided by the artifact’s `.path` attribute, the pipelines backend will handle copying the file at `.path` to the URI at `.uri` automatically, allowing you to create artifact files within a component by only interacting with the task’s local filesystem.
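
The relationship between `.uri` and `.path` can be pictured with a small stand-in class. This is not the SDK's `Artifact` implementation: `SketchArtifact` is a hypothetical name, and the mapping from `gs://` URIs to a `/gcs/` local prefix is an assumption made for illustration. Other storage schemes and backends may use a different correspondence.

```python
from dataclasses import dataclass, field

# Assumption for this sketch: `gs://bucket/x` appears locally at `/gcs/bucket/x`.
_GCS_PREFIX = "gs://"
_GCS_MOUNT = "/gcs/"


@dataclass
class SketchArtifact:
    """Hypothetical stand-in mirroring the four properties listed above."""
    name: str
    uri: str
    metadata: dict = field(default_factory=dict)

    @property
    def path(self) -> str:
        # Translate the remote URI into the local path that corresponds to it.
        if self.uri.startswith(_GCS_PREFIX):
            return _GCS_MOUNT + self.uri[len(_GCS_PREFIX):]
        # Anything else is returned unchanged in this simplified sketch.
        return self.uri


art = SketchArtifact(name="model", uri="gs://my-bucket/run-1/model")
art.metadata["framework"] = "sklearn"
print(art.path)
```

The point of the sketch is only that component code can work with an ordinary filesystem path while the artifact's identity remains its URI.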

Note that input artifacts should be treated as immutable. You should not try to modify the contents of the file at `.path` and any changes to the artifact’s properties will not affect the artifact’s metadata in ML Metadata.

### Artifacts in components
The KFP SDK supports two forms of artifact authoring syntax for components: traditional and Pythonic.

The **traditional artifact** authoring syntax is the original artifact authoring style provided by the KFP SDK. The traditional artifact authoring syntax is supported for both Python Components and Container Components. It is supported at runtime by the open source KFP backend and the Google Cloud Vertex Pipelines backend.

The **Pythonic artifact** authoring syntax provides an alternative artifact I/O syntax that is familiar to Python developers. The Pythonic artifact authoring syntax is supported for Python Components only; it is not supported for Container Components. It is currently only supported at runtime by the Google Cloud Vertex Pipelines backend.

#### Traditional artifact syntax
When using the traditional artifact authoring syntax, all artifacts are provided to the component function as inputs wrapped in an `Input` or `Output` type marker.

```python
def my_component(in_artifact: Input[Artifact], out_artifact: Output[Artifact]):
    ...
```

For input artifacts, you can read the artifact using its `.uri` or `.path` attribute.

For output artifacts, a pre-constructed output artifact will be passed into the component. You can update the output artifact’s properties in place and write the artifact’s contents to the artifact’s `.path` or `.uri` attribute. You should not return the artifact instance from your component.
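
The body of a traditional-syntax component can be sketched as follows. To keep the snippet runnable without the KFP SDK, the `Input[...]`/`Output[...]` annotations appear only in a comment, and `types.SimpleNamespace` objects stand in for the artifact instances the backend would pass in. The file names and the `num_rows` metadata key are illustrative.

```python
import os
import tempfile
from types import SimpleNamespace


# In a real component the signature would read
#   def my_component(in_artifact: Input[Dataset], out_artifact: Output[Dataset])
# and carry `@dsl.component`; both are omitted so this runs without the SDK.
def my_component(in_artifact, out_artifact):
    # Read the input through its local path; the input itself is not modified.
    with open(in_artifact.path) as f:
        rows = [line.strip() for line in f if line.strip()]

    # Write the output contents to the pre-constructed artifact's path.
    with open(out_artifact.path, "w") as f:
        f.write("\n".join(row.upper() for row in rows))

    # Update the output artifact's properties in place; nothing is returned.
    out_artifact.metadata["num_rows"] = len(rows)


# Stand-ins for the artifact objects the backend would provide.
with tempfile.TemporaryDirectory() as tmp:
    src = SimpleNamespace(path=os.path.join(tmp, "in.txt"), metadata={})
    dst = SimpleNamespace(path=os.path.join(tmp, "out.txt"), metadata={})
    with open(src.path, "w") as f:
        f.write("a\nb\n")
    my_component(src, dst)
    print(dst.metadata)
```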

#### New Pythonic artifact syntax
To use the Pythonic artifact authoring syntax, simply annotate your components with the artifact class as you would when writing normal Python.

```python
def my_component(in_artifact: Artifact) -> Artifact:
    ...
```

Inside the body of your component, you can read artifacts passed in as input (no change from the traditional artifact authoring syntax). For artifact outputs, you’ll construct the artifact in your component code, then return the artifact as an output.

Multiple output artifacts should be specified similarly to multiple output parameters:

```python
def train_multiple_models(dataset: Dataset) -> NamedTuple('outputs', model1=Model, model2=Model):
    ...
```

### Artifacts in pipelines
Irrespective of whether your components use the Pythonic or traditional artifact authoring syntax, pipelines that use artifacts should be annotated with the Pythonic artifact syntax.

### Lists of artifacts
KFP supports input lists of artifacts, annotated as `List[Artifact]` or `Input[List[Artifact]]`. This is useful for collecting output artifacts from a loop of tasks using the `dsl.ParallelFor` and `dsl.Collected` control flow objects.

Pipelines can also return an output list of artifacts by using a `-> List[Artifact]` return annotation and returning a `dsl.Collected` instance.

Creating output lists of artifacts from a single-step component is not currently supported.

### Artifact types
The artifact annotation indicates the type of the artifact. KFP provides several artifact types within the DSL:

DSL object                    | Artifact schema title
------------------------------|----------------------
`Artifact`                    | system.Artifact
`Dataset`                     | system.Dataset
`Model`                       | system.Model
`Metrics`                     | system.Metrics
`ClassificationMetrics`       | system.ClassificationMetrics
`SlicedClassificationMetrics` | system.SlicedClassificationMetrics
`HTML`                        | system.HTML
`Markdown`                    | system.Markdown

`Artifact`, `Dataset`, `Model`, and `Metrics` are the most generic and commonly used artifact types. `Artifact` is the default artifact base type and should be used in cases where the artifact type does not fit neatly into another artifact category. `Artifact` is also compatible with all other artifact types. In this sense, the `Artifact` type is also an artifact “any” type.

## PipelineParameterChannel
A **PipelineParameterChannel** in Kubeflow Pipelines represents a future value that is passed between pipeline components. It can be used as a pipeline function argument, making it a pipeline artifact or parameter that appears in the ML Pipelines system UI. Essentially, it allows for the dynamic passing of data between different parts of the pipeline.

Here are some key points about PipelineParameterChannel:
- **Future Value**: It represents a value that will be available in the future, typically produced by a task within the pipeline.
- **Pipeline Function Argument**: It can be used as an argument in pipeline functions, enabling the passing of parameters or artifacts between components.
- **Intermediate Value**: It can also represent intermediate values passed between tasks, allowing for complex workflows and data dependencies.

For example, when you define a pipeline function with parameters, each parameter becomes a `PipelineParameterChannel` object. These objects can then be passed to components as arguments, creating tasks and managing data flow within the pipeline.

> `Output[Artifact]`: gets a metadata-rich handle to the output artifact of type `Artifact`. Use `OutputArtifact.path` to access a local file path for writing. One can also use `OutputArtifact.uri` to access the actual URI file path.

> `OutputPath("Artifact")`: A locally accessible filepath for another output artifact of type `Artifact`. `OutputPath` is used to just pass the local file path of the output artifact to the function.

> `OutputPath(str)`: A locally accessible filepath for an output parameter of type string.

> `Output[Model]`: An output artifact of type `Model`. Use `model.get()` to get a Model artifact, which has a `.metadata` dictionary to store arbitrary metadata for the output artifact. This metadata is recorded in Managed Metadata and can be queried later. It also shows up in the Google Cloud console.

> `InputPath("Dataset")`: Use `InputPath` to get a locally accessible path for the input artifact of type `Dataset`. This lets you access the passed-in GCS URI directly as a local file (uses GCSFuse).

> `Input[Dataset]`: Use `InputArtifact` to get a metadata-rich handle to the input artifact of type `Dataset`. This gives an `Artifact` handle. Use `InputArtifact.path` to get a local file path (uses GCSFuse). Alternatively, use `InputArtifact.uri` to access the GCS URI directly.

> Use `NamedTuple` to return either artifacts or parameters. When returning artifacts like this, return the contents of the artifact. The assumption here is that this return value fits in memory.