Feature: Stream hook logs back to helm client #2298

@johnw188

When performing helm release operations such as install, upgrade, delete, and rollback, the helm client blocks without feedback while hooks execute. If chart owners aren't careful with their error handling, Kubernetes can put the hook job into an infinite loop of failure. Conversely, long-running hooks such as database migrations can be executing successfully while the user has no indication that anything is running.

To mitigate these issues, helm should stream hook logs back to the helm client.

UI

Successful install:

› helm install internal/jenkins
Executing pre-install hooks: ldap-update-job, credential-update-job
[credential-update-job] Updating Jenkins credentials
[ldap-update-job] Updating LDAP configuration
[credential-update-job] Validating credentials
[credential-update-job] Successfully updated credentials!
[ldap-update-job] Successfully updated LDAP!

Executing post-install hooks: migrate-db
[migrate-db] Starting database migration
[migrate-db] Database migration completed

Install was a success! Happy Helming!

Install with issues:

› helm install internal/jenkins
[credential-update-job] Updating Jenkins credentials
[ldap-update-job] Updating LDAP configuration
[credential-update-job] Validating credentials
[credential-update-job] Successfully updated credentials!
[ldap-update-job] ERROR: Invalid password/token for user: john.welsh
[ldap-update-job] Updating LDAP failed
[credential-update-job] Updating Jenkins credentials
[ldap-update-job] ERROR: Invalid password/token for user: john.welsh
[ldap-update-job] Updating LDAP failed
...

Open questions

  • Timestamps?
  • Should pod name be surfaced somewhere?
    • Note that the pod name can change if the hook fails or gets evicted before it completes
    • Pod names are super long (john-ci-jenkins-update-credentials-xzz5f-1z166)

Future enhancements

  • Retry counter for jobs?

Implementation

In the tiller proto definition:

// GetReleaseLogs gets a stream of log events for a given release.
rpc GetReleaseLogs(GetReleaseLogsRequest) returns (stream GetReleaseLogsResponse) {
}

message GetReleaseLogsRequest {
    // Name is the name of the release
    string name = 1;
    // Version is the version of the release
    int32 version = 2;
}

message GetReleaseLogsResponse {
    // Source is the name of the hook that generated the log
    string source = 1;
    // Log is a single log line
    string log = 2;
}

The helm client will then make two gRPC calls: one to install, the other to stream logs.
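The two-call pattern might look roughly like this on the client side. The sketch avoids any gRPC dependency: a channel stands in for the `GetReleaseLogs` stream and `runInstall` is a hypothetical stand-in for the blocking install call, so none of these names are real helm or tiller APIs.

```go
package main

import (
	"fmt"
	"sync"
)

// LogEvent mirrors the proposed GetReleaseLogsResponse message.
type LogEvent struct {
	Source string // name of the hook that generated the log
	Log    string // a single log line
}

// runInstall is a hypothetical stand-in for the blocking install call.
// It emits hook log events while "hooks" run, then closes the stream.
func runInstall(events chan<- LogEvent) error {
	defer close(events)
	events <- LogEvent{Source: "migrate-db", Log: "Starting database migration"}
	events <- LogEvent{Source: "migrate-db", Log: "Database migration completed"}
	return nil
}

// installWithLogs runs the install while a second goroutine drains the
// log stream, and returns the rendered lines plus the install result.
func installWithLogs() ([]string, error) {
	events := make(chan LogEvent)
	var lines []string
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		for ev := range events {
			lines = append(lines, fmt.Sprintf("[%s] %s", ev.Source, ev.Log))
		}
	}()
	err := runInstall(events)
	wg.Wait() // all log lines are collected before the result is reported
	return lines, err
}

func main() {
	lines, err := installWithLogs()
	for _, l := range lines {
		fmt.Println(l)
	}
	if err != nil {
		fmt.Println("install failed:", err)
	}
}
```

In a real client the log goroutine would read from the server-streaming RPC and print lines as they arrive instead of collecting them.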

Questions

  • Is this too chatty? Should logs be collected and passed in chunks?
  • Should the API be made specific to hooks (GetReleaseHookLogsRequest) or left more generic for future expansion?
  • What other fields should be in the response object?
