natalian98/training-operator

 
 

Training Operator

Overview

This repository hosts the Kubernetes Training Operator for Kubeflow training jobs.

Description

The Kubeflow Training Operator provides Kubernetes custom resources for running distributed or non-distributed training jobs, such as TFJobs and PyTorchJobs. The Training Operator in this repository is a Python script that wraps the latest released Kubeflow Training Operator manifests, providing lifecycle management and handling events (install, upgrade, integrate, remove). It is one of the Charmed Kubeflow operators.
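To make the idea of a training-job custom resource concrete, here is a minimal sketch of a PyTorchJob manifest built as a plain Python dictionary. It assumes the upstream `kubeflow.org/v1` API with `Master` and `Worker` replica specs and a container named `pytorch`. The helper `build_pytorchjob`, the job name, and the image are hypothetical placeholders, not part of this repository.

```python
import json


def build_pytorchjob(name, image, workers=2, namespace="default"):
    """Return a PyTorchJob manifest as a plain dict (illustrative helper)."""

    def replica(count):
        # Each replica type wraps an ordinary pod template.
        return {
            "replicas": count,
            "restartPolicy": "OnFailure",
            "template": {
                "spec": {
                    # Assumption: the training container is named "pytorch".
                    "containers": [{"name": "pytorch", "image": image}]
                }
            },
        }

    return {
        "apiVersion": "kubeflow.org/v1",
        "kind": "PyTorchJob",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "pytorchReplicaSpecs": {
                "Master": replica(1),
                "Worker": replica(workers),
            }
        },
    }


if __name__ == "__main__":
    # Hypothetical job name and image, for illustration only.
    manifest = build_pytorchjob("example-job", "example.org/train:latest")
    print(json.dumps(manifest, indent=2))
```

The printed JSON could be saved to a file and submitted with `kubectl apply -f` on a cluster where the Training Operator is installed. Field values such as the restart policy and replica counts are illustrative defaults.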

Usage

The Training Operator can be deployed as a standalone operator, but it works best alongside the other components in the Kubeflow bundle. For installation steps, refer to the installation guide.
