Kubeflow Training Operator


Starting from v1.3, the Training Operator provides Kubernetes custom resources that make it easy to run distributed or non-distributed TensorFlow, PyTorch, Apache MXNet, XGBoost, and MPI jobs on Kubernetes.

Note: Before the v1.2 release, the Kubeflow Training Operator supported only TFJob on Kubernetes.
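
To make the idea of a training custom resource concrete, the sketch below builds a minimal PyTorchJob manifest as a plain Python dictionary and prints it as JSON. It is an illustration under stated assumptions, not the project's reference example: the field names follow the kubeflow.org/v1 PyTorchJob schema as commonly documented, and the job name, image, and the helper `pytorch_job_manifest` are hypothetical.

```python
import json


def pytorch_job_manifest(name, image, workers=2, namespace="default"):
    """Build a PyTorchJob custom resource as a plain dict (illustrative sketch)."""

    def replica(count):
        # One replica spec: how many pods to run and which container to start.
        return {
            "replicas": count,
            "restartPolicy": "OnFailure",
            "template": {
                "spec": {
                    "containers": [
                        # Assumption: the training container is named "pytorch".
                        {"name": "pytorch", "image": image}
                    ]
                }
            },
        }

    return {
        "apiVersion": "kubeflow.org/v1",
        "kind": "PyTorchJob",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "pytorchReplicaSpecs": {
                "Master": replica(1),
                "Worker": replica(workers),
            }
        },
    }


if __name__ == "__main__":
    # Hypothetical job name and image; replace them with your own.
    manifest = pytorch_job_manifest(
        "example-job", "example.com/my-training-image:latest"
    )
    print(json.dumps(manifest, indent=2))
```

Because kubectl accepts JSON as well as YAML, the printed manifest could be saved to a file and submitted with `kubectl apply -f`, assuming the operator is already installed in the cluster.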


Prerequisites

  • A Kubernetes cluster and kubectl, both version 1.23 or later


Master Branch

kubectl apply -k ""

Stable Release

kubectl apply -k ""

TensorFlow Release Only

Users who prefer the original TensorFlow controllers can check out the v1.2-branch. Bug-fix patches are still accepted on this branch.

kubectl apply -k ""

Python SDK for Kubeflow Training Operator

The Training Operator provides a Python SDK for its custom resources. More docs are available in the sdk/python folder.

Use the pip install command to install the latest release of the SDK:

pip install kubeflow-training

Quick Start

Please refer to the Kubeflow Training User Guide for more information.

API Documentation

Please refer to the following API Documentation:


The following links provide information about getting involved in the community:

This project is part of Kubeflow, so please see the README in kubeflow/kubeflow to get in touch with the community.


Please refer to the DEVELOPMENT guide.

Change Log

Please refer to CHANGELOG

Version Matrix

The following table lists the most recent few versions of the operator.

| Operator Version | API Version | Kubernetes Version |
| --- | --- | --- |
| v1.0.x | v1 | 1.16+ |
| v1.1.x | v1 | 1.16+ |
| v1.2.x | v1 | 1.16+ |
| v1.3.x | v1 | 1.18+ |
| v1.4.x | v1 | 1.23+ |
| v1.5.x | v1 | 1.23+ |
| latest (master HEAD) | v1 | 1.23+ |
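
The matrix above can also be encoded as a small lookup for scripted checks. The following is a minimal sketch, not part of the operator or its SDK: the helper names `parse_major_minor` and `is_supported` are hypothetical, and the minimum versions are copied directly from the table.

```python
import re

# Minimum Kubernetes (major, minor) per operator release line, from the table above.
MIN_KUBERNETES = {
    "v1.0": (1, 16),
    "v1.1": (1, 16),
    "v1.2": (1, 16),
    "v1.3": (1, 18),
    "v1.4": (1, 23),
    "v1.5": (1, 23),
    "latest": (1, 23),
}


def parse_major_minor(version):
    """Extract (major, minor) from strings such as 'v1.23.4' or '1.18+'."""
    match = re.match(r"v?(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version: {version!r}")
    return int(match.group(1)), int(match.group(2))


def is_supported(operator_version, kubernetes_version):
    """Return True if the cluster meets the operator's minimum Kubernetes version."""
    if operator_version == "latest":
        key = "latest"
    else:
        key = "v%d.%d" % parse_major_minor(operator_version)
    if key not in MIN_KUBERNETES:
        raise KeyError(f"operator version not in the matrix: {operator_version!r}")
    # Tuples compare element by element, so (1, 23) >= (1, 18) holds.
    return parse_major_minor(kubernetes_version) >= MIN_KUBERNETES[key]
```

For example, `is_supported("v1.4.0", "v1.22.0")` returns `False` because v1.4.x requires Kubernetes 1.23 or later, while `is_supported("v1.3.0", "v1.18.2")` returns `True`.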


This project originally started as a distributed training operator for TensorFlow. We later merged efforts from other Kubeflow training operators to provide a unified and simplified experience for both users and developers. We are very grateful to everyone who filed issues or helped resolve them, asked and answered questions, and took part in inspiring discussions. We would also like to thank everyone who contributed to and maintained the original operators.