Kubeflow common for operators

This repo contains libraries for writing custom job operators such as tf-operator and pytorch-operator. To write a custom operator, follow these steps:

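  • Define your job types and reuse the common API:
import (
    commonv1 "github.com/kubeflow/common/pkg/apis/common/v1"
)

// Reuse the commonv1 API in your type.go, for example as fields of your job
// spec struct (the struct name TestJobSpec is illustrative).
type TestJobSpec struct {
    RunPolicy        *commonv1.RunPolicy                        `json:"runPolicy,omitempty"`
    TestReplicaSpecs map[TestReplicaType]*commonv1.ReplicaSpec `json:"testReplicaSpecs"`
}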
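  • Write a custom controller that implements the controller interface (see interface.go) and instantiate it:
testJobController := TestJobController{
    ...
}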
  • Instantiate a JobController struct and pass in the custom controller written in the previous step as a parameter:
import "github.com/kubeflow/common/pkg/controller.v1/common"

jobController := common.JobController{
    Controller: testJobController,
    Config:     v1.JobControllerConfiguration{EnableGangScheduling: false},
    Recorder:   recorder,
}
  • Within your main reconcile loop, call the JobController.ReconcileJobs method:
reconcile(...) {
    // Your main reconcile loop.
    ...
    jobController.ReconcileJobs(...)
    ...
}
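
The steps above follow a delegation pattern: a shared controller owns the generic reconcile logic and asks the custom controller for anything job-specific. The toy program below is a minimal sketch of that pattern only. Everything in it is an assumption made for illustration: the two-method ControllerInterface, the DesiredReplicas method, the running map, and the single-argument ReconcileJobs are simplified stand-ins, not the real kubeflow/common API, whose interface and signatures are larger and different.

```go
package main

import (
	"errors"
	"fmt"
)

// ControllerInterface is a hypothetical, simplified stand-in for the
// interface a custom controller implements. The real interface lives in
// interface.go and has a different method set.
type ControllerInterface interface {
	// ControllerName identifies the custom controller in error messages.
	ControllerName() string
	// DesiredReplicas reports how many workers the named job should have.
	DesiredReplicas(jobName string) (int, error)
}

// JobController is a simplified stand-in for the shared controller. It owns
// the generic reconcile logic and delegates job-specific questions to the
// custom controller it was given.
type JobController struct {
	Controller ControllerInterface
	// running tracks the worker count per job (a stand-in for real pods).
	running map[string]int
}

// ReconcileJobs moves the observed worker count to the desired count and
// returns how many workers were created (positive) or deleted (negative).
func (jc *JobController) ReconcileJobs(jobName string) (int, error) {
	if jc.Controller == nil {
		return 0, errors.New("no custom controller configured")
	}
	desired, err := jc.Controller.DesiredReplicas(jobName)
	if err != nil {
		return 0, fmt.Errorf("%s: %w", jc.Controller.ControllerName(), err)
	}
	if jc.running == nil {
		jc.running = make(map[string]int)
	}
	delta := desired - jc.running[jobName]
	jc.running[jobName] = desired
	return delta, nil
}

// TestJobController is a toy custom controller backed by a fixed table.
type TestJobController struct {
	replicas map[string]int
}

func (c TestJobController) ControllerName() string { return "test-job-controller" }

func (c TestJobController) DesiredReplicas(jobName string) (int, error) {
	n, ok := c.replicas[jobName]
	if !ok {
		return 0, fmt.Errorf("unknown job %q", jobName)
	}
	return n, nil
}

func main() {
	testJobController := TestJobController{replicas: map[string]int{"demo": 3}}
	jobController := JobController{Controller: testJobController}

	// The first pass creates the missing workers; the second finds nothing to do.
	for i := 0; i < 2; i++ {
		delta, err := jobController.ReconcileJobs("demo")
		fmt.Println(delta, err)
	}
}
```

Because the shared controller depends only on the interface, a different job type can plug in its own controller without changing the reconcile logic, which appears to be the motivation for sharing this library across operators.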

Note that this repo is still under construction; API compatibility is not guaranteed at this point.

API Reference

Please refer to the API documentation.

The API files are located under pkg/apis/common/v1:

  • constants.go: constants such as label keys.
  • interface.go: the interfaces to be implemented by custom controllers.
  • controller.go: the main JobController, which contains the ReconcileJobs API method to be invoked by the user. This is the entry point of the JobController logic. The rest of the code under the job_controller/ folder contains the core logic that makes the JobController work, such as creating and managing worker pods and services.
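
To illustrate why shared label keys matter when a controller creates and later looks up worker pods, here is a minimal sketch. The key names under example.org, the GenLabels and Selector helpers, and the lowercasing of the replica type are all hypothetical choices for this example; the real keys are defined in constants.go and may differ.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Hypothetical label keys, used only for this sketch.
const (
	jobNameLabel     = "example.org/job-name"
	replicaTypeLabel = "example.org/replica-type"
)

// GenLabels returns the labels a controller might attach to every pod and
// service it creates for one replica type of a job.
func GenLabels(jobName, replicaType string) map[string]string {
	return map[string]string{
		jobNameLabel:     jobName,
		replicaTypeLabel: strings.ToLower(replicaType),
	}
}

// Selector renders labels as a comma-separated equality-based selector,
// with keys sorted so the output is stable.
func Selector(labels map[string]string) string {
	keys := make([]string, 0, len(labels))
	for k := range labels {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	parts := make([]string, 0, len(keys))
	for _, k := range keys {
		parts = append(parts, k+"="+labels[k])
	}
	return strings.Join(parts, ",")
}

func main() {
	// Prints: example.org/job-name=demo,example.org/replica-type=worker
	fmt.Println(Selector(GenLabels("demo", "Worker")))
}
```

Keeping the keys in one place means the code that creates resources and the code that queries them cannot drift apart.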