Design patterns, conventions and practices.

Error Handling

Errors are either handled locally within a method, or logged and returned. The custom Logger ensures that the error is logged only once. Subsequent Error() and Trace() calls higher up the stack are ignored.

This ensures:

  • The logged stack trace reflects where the error occurred.
  • Errors are always handled or logged.
  • Consistency makes PR review easier.

Errors that cannot be handled locally are deemed unrecoverable. They are logged and returned up the stack to Reconcile().

The reconciler will:

  • Log the error
  • Set a ReconcileFailed condition
  • Re-queue the event


All constructs should be organized, scoped, and named based on a specific topic or concern. Constructs named util, helper, or misc are strongly discouraged because they are an anti-pattern. Everything should fit within an appropriately named package, .go file, struct, or function. Thoughtful organization and naming reflect a thoughtful design.


Packages should have a narrowly focused concern and be placed in the hierarchy as locally as possible.

Top level infrastructure packages:

Provides Kubernetes API types.

The model.go file provides convenience functions to fetch k8s resources and CRs. All of these functions swallow NotFound errors and return nil. This means that any error they do return is unexpected and should be logged and returned by the caller. The caller must also check for a returned nil pointer.
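The calling convention above can be sketched as follows. This is only an illustration under stated assumptions: ErrNotFound, Plan, store and GetPlan are hypothetical stand-ins, not the actual model.go API.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNotFound stands in for the Kubernetes NotFound API error (hypothetical).
var ErrNotFound = errors.New("not found")

// Plan is a hypothetical CR type used only for this sketch.
type Plan struct{ Name string }

// store is a stand-in for the cluster API.
type store map[string]*Plan

func (s store) get(name string) (*Plan, error) {
	if p, ok := s[name]; ok {
		return p, nil
	}
	return nil, ErrNotFound
}

// GetPlan swallows NotFound and returns (nil, nil).
// Any other error is returned to the caller unchanged.
func GetPlan(s store, name string) (*Plan, error) {
	p, err := s.get(name)
	if err != nil {
		if errors.Is(err, ErrNotFound) {
			return nil, nil
		}
		return nil, err
	}
	return p, nil
}

func main() {
	s := store{"a": {Name: "a"}}
	for _, name := range []string{"a", "missing"} {
		plan, err := GetPlan(s, name)
		if err != nil {
			// Unexpected error: real code would log it and return it.
			fmt.Println("unexpected error:", err)
			return
		}
		if plan == nil {
			// NotFound was swallowed, so the nil pointer must be checked.
			fmt.Println(name, "not found")
			continue
		}
		fmt.Println("found", plan.Name)
	}
}
```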

The resource.go provides the MigResource interface. ALL of the CRs implement this interface which defines common behavior.

The labels.go provides support for correlation labels which are used to correlate resources created by a controller to one of our CRs.

Provides controllers.

Provides a custom logger that supports de-duplication of logged errors. In addition, it provides a Trace() method which is like Error() but does not require a message. The logger includes a short header in the form of: <name>|<short digest>: <message>. The digest is updated on each Reset() and provides a means to correlate all of the entries for the call chain (such as a specific reconcile). The Logger also filters out error=ConflictError entries as they are considered noisy and unhelpful.


if err != nil {
    log.Trace(err)
    return err
}

The Logger.Reset() must be called at the beginning of each call chain. This is usually the Reconciler.Reconcile().
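The de-duplication behavior can be illustrated with a much simplified logger. This is a sketch, not the project's implementation: the Logger fields, NewLogger and the random digest are assumptions, and it assumes the same (comparable) error value is returned unchanged up the stack.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"errors"
	"fmt"
)

// Logger is a simplified stand-in for the custom logger.
type Logger struct {
	name    string
	digest  string
	logged  map[error]bool // errors already logged in this call chain
	Entries []string       // captured output, for the sketch only
}

// NewLogger is a hypothetical constructor.
func NewLogger(name string) *Logger {
	l := &Logger{name: name}
	l.Reset()
	return l
}

// Reset starts a new call chain: a new digest and no remembered errors.
func (l *Logger) Reset() {
	b := make([]byte, 2)
	_, _ = rand.Read(b)
	l.digest = hex.EncodeToString(b)
	l.logged = map[error]bool{}
}

// Error logs err once per call chain; repeated calls are ignored.
func (l *Logger) Error(err error, msg string) {
	if err == nil || l.logged[err] {
		return
	}
	l.logged[err] = true
	if msg == "" {
		msg = err.Error()
	}
	// Header format: <name>|<short digest>: <message>
	l.Entries = append(l.Entries, fmt.Sprintf("%s|%s: %s", l.name, l.digest, msg))
}

// Trace is like Error but does not require a message.
func (l *Logger) Trace(err error) { l.Error(err, "") }

func inner(log *Logger) error {
	err := errors.New("boom")
	log.Trace(err) // logged here, where the error occurred
	return err
}

func outer(log *Logger) error {
	if err := inner(log); err != nil {
		log.Trace(err) // ignored: already logged lower in the stack
		return err
	}
	return nil
}

func main() {
	log := NewLogger("plan")
	log.Reset() // beginning of the call chain
	_ = outer(log)
	for _, entry := range log.Entries {
		fmt.Println(entry)
	}
}
```

Only one entry is recorded for the call chain even though Trace() is called at two levels.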

Provides k8s compatibility. This includes a custom Client which performs automatic type conversion to/from the cluster based on the cluster's version. The Client also implements the DiscoveryInterface and includes the REST Config and the cluster version (Major, Minor). To use these extended capabilities, the client must be type-asserted.


dClient := client.(dapi.DiscoveryInterface)
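When it is not certain that a client supports the extended capabilities, the two-value form of the type assertion avoids a panic. The sketch below uses hypothetical local interfaces (Client, VersionInfo) in place of the real client and discovery types.

```go
package main

import "fmt"

// Client is a stand-in for the basic client interface (hypothetical).
type Client interface {
	Get(name string) (string, error)
}

// VersionInfo is a hypothetical extended capability of the custom client.
type VersionInfo interface {
	MajorVersion() int
	MinorVersion() int
}

// compatClient implements both Client and VersionInfo.
type compatClient struct{ major, minor int }

func (c compatClient) Get(name string) (string, error) { return name, nil }
func (c compatClient) MajorVersion() int               { return c.major }
func (c compatClient) MinorVersion() int               { return c.minor }

// plainClient implements only Client.
type plainClient struct{}

func (plainClient) Get(name string) (string, error) { return name, nil }

// clusterVersion type-asserts the client and reports whether the
// extended capability is available.
func clusterVersion(c Client) (string, bool) {
	v, ok := c.(VersionInfo)
	if !ok {
		return "", false
	}
	return fmt.Sprintf("%d.%d", v.MajorVersion(), v.MinorVersion()), true
}

func main() {
	for _, c := range []Client{compatClient{1, 17}, plainClient{}} {
		if version, ok := clusterVersion(c); ok {
			fmt.Println("cluster version:", version)
		} else {
			fmt.Println("client does not expose the cluster version")
		}
	}
}
```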

Provides application settings. The global Settings object loads and includes settings primarily from environment variables. All settings are scoped by concern.

  • Role - Manager roles
  • Proxy - Manager proxy settings
  • Plan - Plan controller settings
  • Migration - Migration controller settings
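Scoping settings by concern might look roughly like the sketch below. The struct names, the environment variable names (EXAMPLE_HTTP_PROXY, EXAMPLE_PLAN_NS_LIMIT) and the default value are all assumptions made for illustration; they are not the project's real settings.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// Lookup matches the signature of os.LookupEnv so it can be swapped in tests.
type Lookup func(key string) (string, bool)

// ProxySettings holds proxy-related settings (hypothetical field).
type ProxySettings struct{ HTTPProxy string }

func (p *ProxySettings) Load(env Lookup) error {
	p.HTTPProxy, _ = env("EXAMPLE_HTTP_PROXY")
	return nil
}

// PlanSettings holds plan controller settings (hypothetical field).
type PlanSettings struct{ NamespaceLimit int }

func (p *PlanSettings) Load(env Lookup) error {
	value, found := env("EXAMPLE_PLAN_NS_LIMIT")
	if !found {
		p.NamespaceLimit = 10 // assumed default
		return nil
	}
	n, err := strconv.Atoi(value)
	if err != nil {
		return fmt.Errorf("EXAMPLE_PLAN_NS_LIMIT: %w", err)
	}
	p.NamespaceLimit = n
	return nil
}

// AppSettings groups the settings by concern.
type AppSettings struct {
	Proxy ProxySettings
	Plan  PlanSettings
}

// Load delegates to each concern in turn.
func (s *AppSettings) Load(env Lookup) error {
	if err := s.Proxy.Load(env); err != nil {
		return err
	}
	return s.Plan.Load(env)
}

// Settings is the global settings object.
var Settings AppSettings

func main() {
	if err := Settings.Load(os.LookupEnv); err != nil {
		fmt.Println("invalid settings:", err)
		os.Exit(1)
	}
	fmt.Println("namespace limit:", Settings.Plan.NamespaceLimit)
}
```

Passing the lookup function in, rather than calling os.LookupEnv directly, keeps each concern testable without touching the process environment.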

Provides support for CR references. The global Map correlates resources referenced by ObjectReference fields on the CR to the CR itself. When watched using the provided watch event Handler, a reconcile event is queued for the owner CR instead of an event for the watched (target) resource.
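The correlation performed by the reference map can be sketched as a lookup from a watched (target) resource to the CRs that reference it. Ref, Map, Add and Owners below are hypothetical names for illustration only; the real package and its watch event Handler may differ.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// Ref identifies a resource (hypothetical, loosely modeled on ObjectReference).
type Ref struct{ Kind, Namespace, Name string }

// Map correlates target resources to the owner CRs that reference them.
type Map struct {
	mu     sync.RWMutex
	owners map[Ref]map[Ref]bool // target -> set of owners
}

func NewMap() *Map { return &Map{owners: map[Ref]map[Ref]bool{}} }

// Add records that owner references target.
func (m *Map) Add(owner, target Ref) {
	m.mu.Lock()
	defer m.mu.Unlock()
	if m.owners[target] == nil {
		m.owners[target] = map[Ref]bool{}
	}
	m.owners[target][owner] = true
}

// Owners returns the CRs that reference target, sorted by name.
// A watch handler would queue a reconcile for each of them
// instead of for the target itself.
func (m *Map) Owners(target Ref) []Ref {
	m.mu.RLock()
	defer m.mu.RUnlock()
	owners := make([]Ref, 0, len(m.owners[target]))
	for owner := range m.owners[target] {
		owners = append(owners, owner)
	}
	sort.Slice(owners, func(i, j int) bool { return owners[i].Name < owners[j].Name })
	return owners
}

func main() {
	refs := NewMap()
	secret := Ref{"Secret", "ns", "creds"}
	refs.Add(Ref{"MigPlan", "ns", "plan-b"}, secret)
	refs.Add(Ref{"MigPlan", "ns", "plan-a"}, secret)

	// An event on the secret is translated into events for its owners.
	for _, owner := range refs.Owners(secret) {
		fmt.Printf("queue reconcile for %s %s/%s\n", owner.Kind, owner.Namespace, owner.Name)
	}
}
```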

Provides support for Pod actions such as: PodExec.


Each controller provides a Reconciler which has a main method named Reconcile(). In an effort to keep this method maintainable, it delegates all application logic to a method defined in a separate .go file. Each Reconcile() has the standard anatomy:

  • Logger.Reset()
  • Fetch the resource.
  • Begin condition staging.
  • Perform validation (call r.validate() defined in validation.go).
  • Reconcile (delegate to methods).
  • End condition staging.
  • Mark as reconciled (See: ObservedGeneration|ObservedDigest)
  • Update the resource.

On error, the reconciler will:

  • Log the error.
  • Return ReconcileResult{Requeue: true}

The methods that Reconcile() delegates to follow the naming convention of an Ensure prefix. For example: EnsureSomething().
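The anatomy above can be outlined as follows. This is a heavily simplified sketch: Resource, Reconciler, the in-memory store and the string-based conditions are hypothetical stand-ins, and condition staging is reduced to clearing a slice.

```go
package main

import (
	"errors"
	"fmt"
)

// Resource is a hypothetical CR with just enough state for the sketch.
type Resource struct {
	Generation         int64
	ObservedGeneration int64
	Conditions         []string
	FailEnsure         bool // forces an error, for the sketch only
}

// ReconcileResult mirrors the result type named in the text.
type ReconcileResult struct{ Requeue bool }

type Reconciler struct {
	store map[string]*Resource // stand-in for the cluster
	log   []string             // stand-in for the logger
}

// validate would live in validation.go; it only sets conditions.
func (r *Reconciler) validate(res *Resource) error { return nil }

// EnsureSomething holds application logic, kept out of Reconcile().
func (r *Reconciler) EnsureSomething(res *Resource) error {
	if res.FailEnsure {
		return errors.New("something could not be ensured")
	}
	return nil
}

// failed logs the error, sets ReconcileFailed and requests a re-queue.
func (r *Reconciler) failed(res *Resource, err error) ReconcileResult {
	r.log = append(r.log, err.Error())
	res.Conditions = []string{"ReconcileFailed"}
	return ReconcileResult{Requeue: true}
}

func (r *Reconciler) Reconcile(name string) ReconcileResult {
	r.log = nil                 // Logger.Reset()
	res, found := r.store[name] // Fetch the resource.
	if !found {
		return ReconcileResult{}
	}
	res.Conditions = nil // Begin condition staging (simplified).
	if err := r.validate(res); err != nil {
		return r.failed(res, err)
	}
	if err := r.EnsureSomething(res); err != nil {
		return r.failed(res, err)
	}
	// End condition staging (nothing to do in this simplified form).
	res.ObservedGeneration = res.Generation // Mark as reconciled.
	r.store[name] = res                     // Update the resource.
	return ReconcileResult{}
}

func main() {
	r := &Reconciler{store: map[string]*Resource{
		"ok":  {Generation: 3},
		"bad": {Generation: 1, FailEnsure: true},
	}}
	fmt.Println(r.Reconcile("ok"), r.store["ok"].ObservedGeneration)
	fmt.Println(r.Reconcile("bad"), r.store["bad"].Conditions)
}
```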


The validation.go file contains a validate() error method which performs validations. Each discrete validation is delegated to a separate method and roughly corresponds to a specific condition (or group of conditions). Since all conditions have been unstaged by the time validate() runs, validations only need to set conditions. They do not need to delete (clear) them. In the event that a validation is skipped, the related condition should be re-staged.


Each CR status includes the Conditions collection and the Condition object. The collection is basically a list of Condition that provides enhanced functionality.

The Conditions collection also introduces the concept of staging. The goal of staging is to preserve conditions across reconciles. Condition staging provides these benefits:

  1. Preservation of condition timestamps
  2. Support for durable conditions
  3. Re-staging of conditions when validations are skipped.

A condition is set using SetCondition():

    cr.Status.SetCondition(Condition{
        Type:     SomeCondition,
        Status:   True,
        Reason:   NotSet,
        Category: Critical,
        Message:  "Something happened.",
    })

A condition is re-staged using StageCondition():


A condition may be marked as Durable: true, which means it is never unstaged. Durable conditions must be explicitly deleted using DeleteCondition().

The Condition.Items array may be used to list details about the condition. The Message field may contain [], which is substituted with the Items when staging ends.
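The [] substitution can be pictured with the small sketch below. The exact rendering of the items (comma-separated inside brackets) is an assumption made for illustration, and expand() is a hypothetical helper rather than part of the Conditions API.

```go
package main

import (
	"fmt"
	"strings"
)

// Condition carries only the fields needed for this sketch.
type Condition struct {
	Message string
	Items   []string
}

// expand replaces the first "[]" in Message with the bracketed,
// comma-separated Items. The output format is assumed.
func expand(c Condition) string {
	if len(c.Items) == 0 {
		return c.Message
	}
	return strings.Replace(c.Message, "[]", "["+strings.Join(c.Items, ",")+"]", 1)
}

func main() {
	c := Condition{
		Message: "Namespaces [] not found.",
		Items:   []string{"ns-a", "ns-b"},
	}
	fmt.Println(expand(c))
}
```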

All Conditions methods are idempotent and most support varargs.

Working with Conditions

  1. SetCondition() is required to create all conditions. If a condition doesn't exist yet, you can't StageCondition() it into existence.
  2. StageCondition() checks whether a condition already exists in the conditions array from the previous reconcile and, if it does, stops it from being removed. This is useful to preserve original timestamps and to stop conditions from flickering.
  3. A durable SetCondition() is the same as SetCondition(), but the condition does not need to be re-staged on every reconcile.
  4. Any non-durable conditions that are not re-staged during a reconcile will disappear.
  5. DeleteCondition() should only be used for removing durable conditions, since regular conditions are removed simply by not re-staging them.
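The staging rules above can be modeled with a minimal collection. This is a sketch of the semantics only: BeginStaging(), EndStaging() and HasCondition() are assumed names, timestamps and the other Condition fields are left out, and the real collection may behave differently in detail.

```go
package main

import "fmt"

type Condition struct {
	Type    string
	Durable bool
	staged  bool
}

type Conditions struct{ List []Condition }

func (c *Conditions) find(t string) *Condition {
	for i := range c.List {
		if c.List[i].Type == t {
			return &c.List[i]
		}
	}
	return nil
}

// BeginStaging unstages every condition except durable ones.
func (c *Conditions) BeginStaging() {
	for i := range c.List {
		c.List[i].staged = c.List[i].Durable
	}
}

// SetCondition creates or updates a condition and stages it.
func (c *Conditions) SetCondition(cnd Condition) {
	cnd.staged = true
	if found := c.find(cnd.Type); found != nil {
		*found = cnd
		return
	}
	c.List = append(c.List, cnd)
}

// StageCondition re-stages existing conditions; unknown types are ignored.
func (c *Conditions) StageCondition(types ...string) {
	for _, t := range types {
		if found := c.find(t); found != nil {
			found.staged = true
		}
	}
}

// DeleteCondition removes conditions outright, durable or not.
func (c *Conditions) DeleteCondition(types ...string) {
	for _, t := range types {
		kept := c.List[:0]
		for _, cnd := range c.List {
			if cnd.Type != t {
				kept = append(kept, cnd)
			}
		}
		c.List = kept
	}
}

// EndStaging drops every condition that was not staged.
func (c *Conditions) EndStaging() {
	kept := c.List[:0]
	for _, cnd := range c.List {
		if cnd.staged {
			kept = append(kept, cnd)
		}
	}
	c.List = kept
}

func (c *Conditions) HasCondition(t string) bool { return c.find(t) != nil }

func main() {
	var status Conditions

	// First reconcile: all validations run and set their conditions.
	status.BeginStaging()
	status.SetCondition(Condition{Type: "A"})
	status.SetCondition(Condition{Type: "B"})
	status.SetCondition(Condition{Type: "D", Durable: true})
	status.EndStaging()

	// Second reconcile: the validation for A is skipped, so A is re-staged.
	// B is neither set nor re-staged and disappears. D is durable and stays.
	status.BeginStaging()
	status.StageCondition("A", "X") // X never existed, so nothing happens
	status.EndStaging()

	fmt.Println(status.HasCondition("A"), status.HasCondition("B"),
		status.HasCondition("D"), status.HasCondition("X"))

	// Durable conditions must be deleted explicitly.
	status.DeleteCondition("D")
	fmt.Println(status.HasCondition("D"))
}
```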


User Experience

The impact of new changes in MTC on the overall user experience of migrations must be taken into account.

Progress Reporting

Migrations in production environments can take a significant amount of time due to the scale of deployed resources. The easiest way to provide a better user experience in such cases is to make the migration process transparent to the end user. The migration controller provides a way to report information about ongoing work back to the user in the status field of the MigMigration CR in the form of progress messages. Consider leveraging this existing progress reporting mechanism to improve visibility into the migration process.

Progress messages are arrays of strings and are associated with Migration Steps. The progress messages are written to the MigMigration CR at the end of every reconciliation. Until then, they are stored in-memory in Task.Status.Pipeline. To report progress, use the task.setProgress([]string) function. This sets the array of progress messages in-memory, and they will be applied before the next reconcile returns.

The standard format followed for each progress message in the array is:

<kind> <namespace>/<name>: <message>

For instance, if the migration controller is waiting for a stage pod to come up, the progress message would look like:

Pod test-app/stage-pod-1: Pending

Please note that the Migration UI is designed to read the progress messages set through the t.setProgress() function. Only progress messages that follow the above format will be parsed by the UI.
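A small helper can make it harder to drift from the standard format. The sketch below builds and parses messages of the form <kind> <namespace>/<name>: <message>. The Progress type and ParseProgress are hypothetical, the parser is not the UI's actual one, and strings.Cut requires Go 1.18 or later.

```go
package main

import (
	"fmt"
	"strings"
)

// Progress holds the parts of one progress message (hypothetical type).
type Progress struct{ Kind, Namespace, Name, Message string }

// String renders the message in the standard format.
func (p Progress) String() string {
	return fmt.Sprintf("%s %s/%s: %s", p.Kind, p.Namespace, p.Name, p.Message)
}

// ParseProgress splits a message in the standard format into its parts.
// It reports false when the input does not follow the format.
func ParseProgress(s string) (Progress, bool) {
	head, message, ok := strings.Cut(s, ": ")
	if !ok {
		return Progress{}, false
	}
	kind, nsName, ok := strings.Cut(head, " ")
	if !ok {
		return Progress{}, false
	}
	namespace, name, ok := strings.Cut(nsName, "/")
	if !ok || kind == "" || namespace == "" || name == "" {
		return Progress{}, false
	}
	return Progress{kind, namespace, name, message}, true
}

func main() {
	p := Progress{"Pod", "test-app", "stage-pod-1", "Pending"}
	// The resulting strings are what would be passed to setProgress().
	progress := []string{p.String()}
	fmt.Println(progress[0])

	parsed, ok := ParseProgress(progress[0])
	fmt.Println(parsed.Kind, parsed.Namespace, parsed.Name, parsed.Message, ok)
}
```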