Added and fixed links, fixed branding and text (#128)
Updated branding; added a link to the guide to building your own components; fixed 2 links; made other minor textual fixes.
sarahmaddox authored and k8s-ci-robot committed Nov 7, 2018
1 parent 63e9c54 commit 8c75eca
Showing 1 changed file with 18 additions and 16 deletions.
components/README.md (34 changes: 18 additions & 16 deletions)
@@ -1,32 +1,34 @@
-# ML Pipeline Components
+# Kubeflow pipeline components
 
-ML Pipeline Components are implementation of ML Pipeline tasks. Each task takes
-one or more [artifacts](../artifacts) as input and may produce one or more
-[artifacts](../artifacts).
+Kubeflow pipeline components are implementations of Kubeflow pipeline tasks. Each task takes
+one or more [artifacts](https://github.com/kubeflow/pipelines/wiki/Concepts#step-output-artifacts)
+as input and may produce one or more
+[artifacts](https://github.com/kubeflow/pipelines/wiki/Concepts#step-output-artifacts) as output.
 
 
-## XGBoost DataProc Components
-* [Setup Cluster](dataproc/xgboost/create_cluster.py)
+**Example: XGBoost DataProc components**
+* [Set up cluster](dataproc/xgboost/create_cluster.py)
 * [Analyze](dataproc/xgboost/analyze.py)
 * [Transform](dataproc/xgboost/transform.py)
-* [Distributed Train](dataproc/xgboost/train.py)
-* [Delete Cluster](dataproc/xgboost/delete_cluster.py)
+* [Distributed train](dataproc/xgboost/train.py)
+* [Delete cluster](dataproc/xgboost/delete_cluster.py)
 
 Each task usually includes two parts:
 
-``Client Code``
+``Client code``
 The code that talks to endpoints to submit jobs. For example, code to talk to Google
-Dataproc API to submit Spark job.
+Dataproc API to submit a Spark job.
 
-``Runtime Code``
-The code that does the actual job and usually run in cluster. For example, Spark code
-that transform raw data into preprocessed data.
+``Runtime code``
+The code that does the actual job and usually runs in the cluster. For example, Spark code
+that transforms raw data into preprocessed data.
 
 ``Container``
 A container image that runs the client code.
 
-There is a naming convention to client code and runtime code. For a task named "mytask",
-there is mytask.py including client code, and there is also a mytask directory holding
-all runtime code.
+Note the naming convention for client code and runtime code. For a task named "mytask":
 
+* The `mytask.py` program contains the client code.
+* The `mytask` directory contains all the runtime code.
 
+See [how to build your own components](https://github.com/kubeflow/pipelines/wiki/Build-Your-Own-Component)
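
To make the client/runtime split and the "mytask" naming convention concrete, here is a minimal sketch of what a component's client code could look like. Nothing below comes from this commit: the component name `mytask`, the `submit_job` helper, and its flags are all illustrative assumptions, standing in for a real job-submission API such as Dataproc's.

```python
# mytask.py -- hypothetical client code for a component named "mytask".
# Per the naming convention above, the runtime code lives in a sibling
# mytask/ directory; this file only submits the job to an endpoint.

import argparse


def submit_job(endpoint: str, input_path: str, output_path: str) -> str:
    """Stand-in for a call to a job-submission API; returns a made-up job ID."""
    print(f"Submitting mytask to {endpoint}: {input_path} -> {output_path}")
    return "job-0001"


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="mytask client code")
    parser.add_argument("--endpoint", required=True)
    parser.add_argument("--input", required=True)
    parser.add_argument("--output", required=True)
    args = parser.parse_args()
    print("Job ID:", submit_job(args.endpoint, args.input, args.output))
```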

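The runtime counterpart would then sit in the `mytask/` directory. As a sketch of the "Spark code that transforms raw data into preprocessed data" idea, assuming PySpark and made-up bucket paths and column names:

```python
# mytask/transform.py -- hypothetical runtime code for "mytask".
# This is the part that actually runs on the cluster once the client submits it.

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("mytask-transform").getOrCreate()

# Read raw data, derive one simple feature, and write the result out.
# The bucket, paths, and "amount" column are illustrative assumptions.
raw = spark.read.csv("gs://example-bucket/raw/", header=True, inferSchema=True)
preprocessed = raw.withColumn("amount_log", F.log1p("amount"))
preprocessed.write.mode("overwrite").parquet("gs://example-bucket/preprocessed/")

spark.stop()
```

Following the README's description, the ``Container`` part would then be an image that packages and runs the client code above.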