The sample pipelines give you a quick start to build and deploy machine learning pipelines with Kubeflow Pipeline.
- Follow the guide to deploy the Kubeflow pipelines service.
- Build and deploy your pipeline using the provided samples.
The samples are organized into the core set and the contrib set.
Core samples demonstrates the full KFP functionalities and are covered by the sample test infra. The current status is not all samples are covered by the tests but they will be all covered in the near future. A selected set of these core samples will also be preloaded to the KFP during deployment. The core samples will also include intermediate samples that are more complex than basic samples such as flip coins but simpler than TFX samples. It serves to demonstrate a set of the outstanding features and offers users the next level KFP experience.
Contrib samples are not tested by KFP and could potentially be moved to the core samples if the samples are of good qualty and tests are covered and it demonstrates certain KFP functionality. Another reason to put some samples in this directory is that some samples require certain platform support that is hard to support in our test infra.
In the Core directory, each sample will be in a separate directory. In the Contrib directory, there is an intermediate directory for each contributor, e.g. ibm and arena, within which each sample is in a separate directory. An example of the resulting structure is as follows:
pipelines/samples/ Core/ dsl_static_type_checking/ dsl_static_type_checking.ipynb xgboost_training_cm/ xgboost_training_cm.py condition/ condition.py recursion/ recursion.py Contrib/ IBM/ ffdl-seldon/ ffdl_pipeline.ipynb ffdl_pipeline.py README.md
Compile the pipeline specification
Follow the guide to building a pipeline to install the Kubeflow
Pipelines SDK and compile the sample Python into a workflow specification.
The specification takes one of the three forms: YAML file, YAML compressed into a
.tar.gz file, and YAML compressed into a
For convenience, you can use the preloaded samples in the pipeline system. This saves you the steps required to compile and compress the pipeline specification.
Upload the pipeline to the Kubeflow Pipeline
Open the Kubeflow pipelines UI, and follow the prompts to create a new pipeline and upload the generated workflow
Run the pipeline
Follow the pipeline UI to create pipeline runs.
Useful parameter values:
- For the "exit_handler" and "sequential" samples:
- For the "parallel_join" sample:
Notes: component source codes
All samples use pre-built components. The command to run for each container is built into the pipeline file.
For better readability and integrations with the sample test infrastructure, samples are encouraged to adopt the following conventions.
- The sample file should be either
*.ipynb, and its file name is consistent with its directory name.
*.pysample, it's recommended to have a main invoking
kfp.compiler.Compiler().compile()to compile the pipeline function into pipeline yaml spec.
*.ipynbsample, parameters (e.g.,
project_name) should be defined in a dedicated cell and tagged as parameter. (If the author would like the sample test infra to run it by setting the
run_pipelineflag to True in the associated
config.yamlfile, the sample test infra will expect the sample to use the
kfp.Client().create_run_from_pipeline_funcmethod for starting the run so that the sample test can watch the run.) Detailed guideline is here. Also, all the environment setup and preparation should be within the notebook, such as by
!pip install packages
How to add a sample test
Here are the ordered steps to add the sample tests for samples. Only the core samples are expected to be added to the sample test infrastructure.
- Make sure the sample follows the sample conventions.
- If the sample requires argument inputs, they can be specified in a config yaml file
xgboost_training_cm.config.yamlas an example. The config yaml file will be validated according to
schema.config.yaml. If no config yaml is provided, pipeline parameters will be substituted by their default values.
- Add your test name (in consistency with the file name and dir name) in
- (Optional) The current sample test infra only checks if runs succeed without custom validation logic.
If needed, runtime checks should be included in the sample itself. However, there is no custom validation logic
injection support for
*.pysamples, in which case the test infra compiles the sample, submit and run the sample, and check if it succeeds.