
Commit 2ae5f1b

sarahmaddox authored and k8s-ci-robot committed
Kubeflow overview (#1339)
* WIP Initial commit of Kubeflow overview.
* Added note for authors.
* Updated the architectural diagram.
* Trying clickable SVG.
* Added link from 'About Kubeflow' to new overview. Removed links from SVG.
* Finished first draft of overview.
* Tweaked wording and workflow diagrams.
* Added example of specific workflow (MNIST tutorial).
* Addressed review comments.
* Next iteration of the platform diagram.
* Changed 'architectural' to 'conceptual' to better convey purpose of diagram.
* Swapped to a simplified workflow for the GCP e2e example.
* In progress: Addressing tech writer review comments.
* Finished addressing comments from tech writer review.
1 parent 35d1e8f commit 2ae5f1b

12 files changed: +193 -30 lines changed

Diff for: content/docs/about/kubeflow.md

+13 -22
Original file line number | Diff line number | Diff line change
@@ -13,24 +13,12 @@ you are running Kubernetes, you should be able to run Kubeflow.
1313

1414
## Getting started with Kubeflow
1515

16-
Follow the [getting-started guide](/docs/started/getting-started) to set up your
17-
environment.
16+
Read the [Kubeflow overview](/docs/started/kubeflow-overview/) for an
17+
introduction to the Kubeflow architecture and to see how you can use Kubeflow
18+
to manage your ML workflow.
1819

19-
Then read the [documentation](/docs/) to learn about the features of Kubeflow,
20-
including the following guides to Kubeflow components:
21-
22-
* Kubeflow includes services for spawning and managing
23-
[Jupyter notebooks](/docs/notebooks/). [Project Jupyter](https://jupyter.org/)
24-
is a non-profit, open source project that supports interactive data science
25-
and scientific computing across many programming languages.
26-
27-
* [Kubeflow Pipelines](/docs/pipelines/pipelines-overview/) is a platform for
28-
building, deploying, and managing multi-step ML workflows based on Docker
29-
containers.
30-
31-
* Kubeflow offers a number of [components](/docs/components/) that you can use
32-
to build your ML training, hyperparameter tuning, and serving workloads across
33-
multiple platforms.
20+
Follow the [getting-started guide](/docs/started/getting-started/) to set up
21+
your environment and install Kubeflow.
3422

3523
## What is Kubeflow?
3624

@@ -40,23 +28,26 @@ To use Kubeflow, the basic workflow is:
4028

4129
* Download and run the Kubeflow deployment binary.
4230
* Customize the resulting configuration files.
43-
* Run the specified scripts to deploy your containers to your specific
31+
* Run the specified script to deploy your containers to your specific
4432
environment.
4533

4634
You can adapt the configuration to choose the platforms and services that you
4735
want to use for each stage of the ML workflow: data preparation, model training,
4836
prediction serving, and service management.
4937

50-
You can choose to deploy your Kubernetes workloads locally or to a cloud
51-
environment.
38+
You can choose to deploy your Kubernetes workloads locally, on-premises, or to
39+
a cloud environment.
40+
41+
Read the [Kubeflow overview](/docs/started/kubeflow-overview/) for more details.
5242

5343
## The Kubeflow mission
5444

5545
Our goal is to make scaling machine learning (ML) models and deploying them to
5646
production as simple as possible, by letting Kubernetes do what it's great at:
5747

58-
* Easy, repeatable, portable deployments on a diverse infrastructure (laptop
59-
<-> ML rig <-> training cluster <-> production cluster)
48+
* Easy, repeatable, portable deployments on a diverse infrastructure
49+
(for example, experimenting on a laptop, then moving to an on-premises
50+
cluster or to the cloud)
6051
* Deploying and managing loosely-coupled microservices
6152
* Scaling based on demand
6253

Diff for: content/docs/gke/gcp-e2e.md

+9 -2
@@ -60,7 +60,14 @@ is a 7.
6060

6161
### The overall workflow
6262

63-
Here's an overview of what you accomplish by following this guide:
63+
The following diagram shows what you accomplish by following this guide:
64+
65+
<img src="/docs/images/kubeflow-gcp-e2e-tutorial.svg"
66+
alt="ML workflow for training and serving an MNIST model"
67+
class="mt-3 mb-3 border border-info rounded">
68+
69+
70+
In summary:
6471

6572
* Setting up [Kubeflow][kubeflow] in a [GKE][kubernetes-engine]
6673
cluster.
@@ -80,7 +87,7 @@ Here's an overview of what you accomplish by following this guide:
8087
* Running a simple web app to send a prediction request to the model and
8188
display the result.
8289

83-
Let's get started!
90+
It's time to get started!
8491

8592
## Set up your environment
8693

Diff for: content/docs/images/kubeflow-gcp-e2e-tutorial-simplified.svg

+1

Diff for: content/docs/images/kubeflow-gcp-e2e-tutorial.svg

+1

Diff for: content/docs/images/kubeflow-overview-platform-diagram.svg

+1

Diff for: content/docs/images/kubeflow-overview-workflow-diagram-1.svg

+1

Diff for: content/docs/images/kubeflow-overview-workflow-diagram-2.svg

+1

Diff for: content/docs/started/cloud/_index.md

+1 -1
@@ -1,5 +1,5 @@
11
+++
22
title = "Cloud Installation"
33
description = "Instructions for installing Kubeflow on a public cloud"
4-
weight = 2
4+
weight = 30
55
+++

Diff for: content/docs/started/getting-started.md

+3 -3
@@ -1,7 +1,7 @@
11
+++
2-
title = "Getting Started with Kubeflow"
3-
description = "Overview"
4-
weight = 1
2+
title = "Installing Kubeflow"
3+
description = "Overview of installation choices for various environments"
4+
weight = 20
55
+++
66

77
## Before you begin

Diff for: content/docs/started/k8s/_index.md

+1 -1
@@ -1,5 +1,5 @@
11
+++
22
title = "Kubernetes Installation"
33
description = "Instructions for installing Kubeflow on an existing Kubernetes cluster"
4-
weight = 2
4+
weight = 40
55
+++

Diff for: content/docs/started/kubeflow-overview.md

+160
@@ -0,0 +1,160 @@
1+
+++
2+
title = "Kubeflow Overview"
3+
description = "How Kubeflow helps you organize your ML workflow"
4+
weight = 10
5+
+++
6+
7+
<!--
8+
Note for authors: The source of the diagrams is held in Google Slides decks,
9+
in the "Doc diagrams" folder in the public Kubeflow shared drive.
10+
-->
11+
12+
This guide introduces Kubeflow as a platform for developing and deploying a
13+
machine learning (ML) system.
14+
15+
Kubeflow is a platform for data scientists who want to build and experiment with
16+
ML pipelines. Kubeflow is also for ML engineers and operational teams who want
17+
to deploy ML systems to various environments for development, testing, and
18+
production-level serving.
19+
20+
## Conceptual overview
21+
22+
Kubeflow is *the ML toolkit for Kubernetes*.
23+
The following diagram shows Kubeflow as a platform for arranging the
24+
components of your ML system on top of Kubernetes:
25+
26+
<img src="/docs/images/kubeflow-overview-platform-diagram.svg"
27+
alt="A conceptual overview of Kubeflow on Kubernetes"
28+
class="mt-3 mb-3 border border-info rounded">
29+
30+
Kubeflow builds on [Kubernetes](https://kubernetes.io/) as a system for
31+
deploying, scaling, and managing complex systems.
32+
33+
Using the Kubeflow configuration interfaces (see [below](#interfaces)), you can
34+
specify the ML tools required for your workflow. Then you can deploy the
35+
workflow to various cloud, local, and on-premises platforms for experimentation and
36+
for production use.
37+
38+
## Introducing the ML workflow
39+
40+
When you develop and deploy an ML system, the ML workflow typically consists of
41+
several stages. Developing an ML system is an iterative process.
42+
You need to evaluate the output of various stages of the ML workflow, and apply
43+
changes to the model and parameters when necessary to ensure the model keeps
44+
producing the results you need.
45+
46+
For the sake of simplicity, the following diagram
47+
shows the workflow stages in sequence. The arrow at the end of the workflow
48+
points back into the flow to indicate the iterative nature of the process:
49+
50+
<img src="/docs/images/kubeflow-overview-workflow-diagram-1.svg"
51+
alt="A typical machine learning workflow"
52+
class="mt-3 mb-3 border border-info rounded">
53+
54+
Looking at the stages in more detail:
55+
56+
* In the experimental phase, you develop your model based on initial
57+
assumptions, and test and update the model iteratively to produce the
58+
results you're looking for:
59+
60+
* Identify the problem you want the ML system to solve.
61+
* Collect and analyze the data you need to train your ML model.
62+
* Choose an ML framework and algorithm, and code the initial version of your
63+
model.
64+
* Experiment with the data and with training your model.
65+
* Tune the model hyperparameters to ensure the most efficient processing and the
66+
most accurate results possible.
67+
68+
* In the production phase, you deploy a system that performs the following
69+
processes:
70+
71+
* Transform the data into the format that your training system needs.
72+
To ensure that your model behaves consistently during training and
73+
prediction, the transformation process must be the same in the experimental
74+
and production phases.
75+
* Train the ML model.
76+
* Serve the model for online prediction or for running in batch mode.
77+
* Monitor the model's performance, and feed the results into your processes
78+
for tuning or retraining the model.
79+
80+
## Kubeflow components in the ML workflow
81+
82+
The next diagram adds Kubeflow to the workflow, showing which Kubeflow
83+
components are useful at each stage:
84+
85+
<img src="/docs/images/kubeflow-overview-workflow-diagram-2.svg"
86+
alt="Where Kubeflow fits into a typical machine learning workflow"
87+
class="mt-3 mb-3 border border-info rounded">
88+
89+
To learn more, read the following guides to the Kubeflow components:
90+
91+
* Kubeflow includes services for spawning and managing
92+
[Jupyter notebooks](/docs/notebooks/). Use notebooks for interactive data
93+
science and experimenting with ML workflows.
94+
95+
* [Kubeflow Pipelines](/docs/pipelines/pipelines-overview/) is a platform for
96+
building, deploying, and managing multi-step ML workflows based on Docker
97+
containers.
98+
99+
* Kubeflow offers several [components](/docs/components/) that you can use
100+
to build your ML training, hyperparameter tuning, and serving workloads across
101+
multiple platforms.
102+
103+
## Example of a specific ML workflow
104+
105+
The following diagram shows a simple example of a specific ML workflow that you
106+
can use to train and serve a model trained on the MNIST dataset:
107+
108+
<img src="/docs/images/kubeflow-gcp-e2e-tutorial-simplified.svg"
109+
alt="ML workflow for training and serving an MNIST model"
110+
class="mt-3 mb-3 border border-info rounded">
111+
112+
For details of the workflow and to run the system yourself, see the
113+
[end-to-end tutorial for Kubeflow on GCP](/docs/gke/gcp-e2e/).
114+
115+
<a id="interfaces"></a>
116+
## Kubeflow interfaces
117+
118+
This section introduces the interfaces that you can use to interact with
119+
Kubeflow and to build and run your ML workflows on Kubeflow.
120+
121+
### Kubeflow user interface (UI)
122+
123+
The Kubeflow UI looks like this:
124+
125+
<img src="/docs/images/central-ui.png"
126+
alt="The Kubeflow UI"
127+
class="mt-3 mb-3 border border-info rounded">
128+
129+
The UI offers a central dashboard that you can use to access the components
130+
of your Kubeflow deployment. Read
131+
[how to access the UI](/docs/other-guides/accessing-uis/).
132+
133+
### Kubeflow command line interface (CLI)
134+
135+
**kfctl** is the Kubeflow CLI that you can use to install and configure
136+
Kubeflow. Read about kfctl in the guide to
137+
[configuring Kubeflow](/docs/other-guides/kustomize/).
138+
139+
The Kubernetes CLI, **kubectl**, is useful for running commands against your
140+
Kubeflow cluster. You can use kubectl to deploy applications, inspect and manage
141+
cluster resources, and view logs. Read about kubectl in the [Kubernetes
142+
documentation](https://kubernetes.io/docs/tasks/tools/install-kubectl/).
143+
144+
## Kubeflow APIs and SDKs
145+
146+
Various components of Kubeflow offer APIs and Python SDKs. See the following
147+
sets of reference documentation:
148+
149+
* [Kubeflow reference docs](/docs/reference/) for guides to the Kubeflow
150+
Metadata API and SDK, the PyTorchJob CRD, and the TFJob CRD.
151+
* [Pipelines reference docs](/docs/pipelines/reference/) for the Kubeflow
152+
Pipelines API and SDK, including the Kubeflow Pipelines domain-specific
153+
language (DSL).
154+
* [Fairing reference docs](/docs/fairing/reference/) for the Kubeflow Fairing
155+
SDK.
156+
157+
## Next steps
158+
159+
See how to [install Kubeflow](/docs/started/getting-started/) depending on
160+
your chosen environment (local, cloud, or on-premises).
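
The overview added in this file says that the transformation process must be the same in the experimental and production phases, so that the model behaves consistently during training and prediction. One common way to get that guarantee is to route both paths through a single shared function. The following is a minimal sketch of that idea, not code from the Kubeflow project: the function names, the pixel-scaling step, and the request shape are all hypothetical and chosen only to fit the MNIST example mentioned in the overview.

```python
# Minimal sketch: one shared transform used by both the training path and
# the serving path. All names here are hypothetical, for illustration only.

def transform(pixels):
    """Scale raw 0-255 pixel values to floats in the range 0.0-1.0."""
    return [p / 255.0 for p in pixels]


def build_training_example(pixels, label):
    """Training path: transform the features and pair them with the label."""
    return transform(pixels), label


def build_prediction_request(pixels):
    """Serving path: apply the same transform before calling the model.

    The {"instances": [...]} shape is an assumed request format.
    """
    return {"instances": [transform(pixels)]}


if __name__ == "__main__":
    raw = [0, 128, 255]
    features, label = build_training_example(raw, 7)
    request = build_prediction_request(raw)
    # Both paths produce identical features because they share transform().
    print(features == request["instances"][0])
```

Because both builders call `transform`, a change to the preprocessing step automatically applies to training and to serving, which avoids the two paths drifting apart.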

Diff for: content/docs/started/workstation/_index.md

+1 -1
@@ -1,5 +1,5 @@
11
+++
22
title = "Workstation Installation"
33
description = "Instructions for installing Kubeflow on a workstation or server"
4-
weight = 2
4+
weight = 50
55
+++
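
The `weight` changes across the front matter files in this commit reorder the pages in the "started" section. The sketch below illustrates the effect, assuming the site generator lists pages in ascending order of `weight` (the usual behavior for Hugo-style `+++` front matter). The titles and weights are copied from the diffs above; the tie-break on title is an assumption made only so the example is deterministic, and `menu_order` is a hypothetical helper, not part of any site tooling.

```python
# Sketch of how the front matter weights in this commit affect page order.
# Assumption: pages are listed by ascending weight; ties are broken by title
# here purely to keep the example deterministic.

OLD_PAGES = [
    {"title": "Getting Started with Kubeflow", "weight": 1},
    {"title": "Cloud Installation", "weight": 2},
    {"title": "Kubernetes Installation", "weight": 2},
    {"title": "Workstation Installation", "weight": 2},
]

NEW_PAGES = [
    {"title": "Kubeflow Overview", "weight": 10},
    {"title": "Installing Kubeflow", "weight": 20},
    {"title": "Cloud Installation", "weight": 30},
    {"title": "Kubernetes Installation", "weight": 40},
    {"title": "Workstation Installation", "weight": 50},
]


def menu_order(pages):
    """Return page titles sorted by ascending weight, then by title."""
    ordered = sorted(pages, key=lambda page: (page["weight"], page["title"]))
    return [page["title"] for page in ordered]


def has_weight_ties(pages):
    """Return True if two or more pages share the same weight."""
    weights = [page["weight"] for page in pages]
    return len(weights) != len(set(weights))


if __name__ == "__main__":
    print(menu_order(NEW_PAGES))
    print("old weights had ties:", has_weight_ties(OLD_PAGES))
    print("new weights have ties:", has_weight_ties(NEW_PAGES))
```

Spacing the new weights in steps of 10 gives each page a distinct position and leaves room to insert future pages without renumbering the existing ones.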
