Send success for pods for a deployment when its rolled out successfully #6534

Merged
27 changes: 20 additions & 7 deletions pkg/skaffold/kubernetes/status/resource/deployment.go
@@ -30,6 +30,7 @@ import (
 	"github.com/GoogleContainerTools/skaffold/pkg/skaffold/kubectl"
 	"github.com/GoogleContainerTools/skaffold/pkg/skaffold/output/log"
 	"github.com/GoogleContainerTools/skaffold/proto/v1"
+	protoV2 "github.com/GoogleContainerTools/skaffold/proto/v2"
 )

 const (
@@ -132,12 +133,24 @@ func (d *Deployment) CheckStatus(ctx context.Context, cfg kubectl.Config) {

 	details := d.cleanupStatus(string(b))

-	ae := parseKubectlRolloutError(details, err)
-	if ae.ErrCode == proto.StatusCode_STATUSCHECK_KUBECTL_PID_KILLED {
-		ae.Message = fmt.Sprintf("received Ctrl-C or deployments could not stabilize within %v: %v", d.deadline, err)
-	}
-
+	ae := parseKubectlRolloutError(details, d.deadline, err)
 	d.UpdateStatus(ae)
+	// send event update in check status.
+	event.ResourceStatusCheckEventCompleted(d.String(), ae)
+	eventV2.ResourceStatusCheckEventCompleted(d.String(), sErrors.V2fromV1(ae))
+	// if deployment is successfully rolled out, send pod success event to make sure
+	// all pod are marked as success in V2
+	// See https://github.com/GoogleCloudPlatform/cloud-code-vscode-internal/issues/5277
+	if ae.ErrCode == proto.StatusCode_STATUSCHECK_SUCCESS {
+		for _, pod := range d.pods {
+			eventV2.ResourceStatusCheckEventCompletedMessage(
+				pod.String(),
+				fmt.Sprintf("%s %s: running.\n", tabHeader, pod.String()),
+				protoV2.ActionableErr{ErrCode: proto.StatusCode_STATUSCHECK_SUCCESS},
+			)
+		}
+		return
Comment on lines +144 to +152
Contributor

Does this mean we'll send duplicate pod success messages, since we're also sending them in the fetchPods() func?

Contributor

Yes, I think this would be the case unless I am confusing something. From the fetchPods() snippet here - https://github.com/GoogleContainerTools/skaffold/blob/main/pkg/skaffold/kubernetes/status/resource/deployment.go#L298-L303:

			case proto.StatusCode_STATUSCHECK_SUCCESS:
				event.ResourceStatusCheckEventCompleted(p.String(), p.ActionableError())
				eventV2.ResourceStatusCheckEventCompletedMessage(
					p.String(),
					fmt.Sprintf("%s running.\n", prefix),
					sErrors.V2fromV1(p.ActionableError()))

I believe we also send the same ResourceStatusCheckEventCompletedMessage event there.

Member Author

@tejal29 Aug 28, 2021

Yes, I agree there could be duplicate successful events, depending on timing.
For example, in this sequence the pod gets a single resource complete event:

  1. dep check -> waiting for rollout
  2. pod check -> unhealthy
    skaffold sends a resource in progress event for the pod
  3. pod becomes healthy
  4. Status check sleeps for 100 ms
  5. dep rollout status -> successful
    skaffold sends a resource complete event for the pod

However, in this sequence the pod gets two:

  1. dep check -> waiting for rollout
  2. pod becomes healthy and running
  3. pod check -> returns success
    skaffold sends a resource complete event for the pod
  4. Status check sleeps for 100 ms
  5. dep rollout status -> successful
    skaffold sends another resource complete event for the pod

It should be fine to send two successful resource complete events for a pod. It would be a no-op on the IDE side, as the UI node was already marked green.

+	}
 	if err := d.fetchPods(ctx); err != nil {
 		log.Entry(ctx).Debugf("pod statuses could not be fetched this time due to %s", err)
 	}
@@ -233,7 +246,7 @@ func (d *Deployment) cleanupStatus(msg string) string {
 // $kubectl logs testPod -f
 // 2020/06/18 17:28:31 service is running
 // Killed: 9
-func parseKubectlRolloutError(details string, err error) proto.ActionableErr {
+func parseKubectlRolloutError(details string, deadline time.Duration, err error) proto.ActionableErr {
 	switch {
 	case err == nil && strings.Contains(details, rollOutSuccess):
 		return proto.ActionableErr{
@@ -253,7 +266,7 @@ func parseKubectlRolloutError(details string, err error) proto.ActionableErr {
 	case strings.Contains(err.Error(), killedErrMsg):
 		return proto.ActionableErr{
 			ErrCode: proto.StatusCode_STATUSCHECK_KUBECTL_PID_KILLED,
-			Message: msgKubectlKilled,
+			Message: fmt.Sprintf("received Ctrl-C or deployments could not stabilize within %v: %s", deadline, msgKubectlKilled),
 		}
 	default:
 		return proto.ActionableErr{
8 changes: 6 additions & 2 deletions pkg/skaffold/kubernetes/status/resource/deployment_test.go
@@ -23,12 +23,15 @@ import (
 	"os"
 	"path/filepath"
 	"testing"
+	"time"

 	"github.com/GoogleContainerTools/skaffold/pkg/diag/validator"
 	"github.com/GoogleContainerTools/skaffold/pkg/skaffold/runner/runcontext"
+	latestV1 "github.com/GoogleContainerTools/skaffold/pkg/skaffold/schema/latest/v1"
 	"github.com/GoogleContainerTools/skaffold/pkg/skaffold/util"
 	"github.com/GoogleContainerTools/skaffold/proto/v1"
 	"github.com/GoogleContainerTools/skaffold/testutil"
+	testEvent "github.com/GoogleContainerTools/skaffold/testutil/event"
 )

 func TestDeploymentCheckStatus(t *testing.T) {
@@ -100,6 +103,7 @@ func TestDeploymentCheckStatus(t *testing.T) {
 	for _, test := range tests {
 		testutil.Run(t, test.description, func(t *testutil.T) {
 			t.Override(&util.DefaultExecCommand, test.commands)
+			testEvent.InitializeState([]latestV1.Pipeline{{}})

 			r := NewDeployment("graph", "test", 0)
 			r.CheckStatus(context.Background(), &statusConfig{})
@@ -140,7 +144,7 @@ func TestParseKubectlError(t *testing.T) {
 			err: errors.New("signal: killed"),
 			expectedAe: proto.ActionableErr{
 				ErrCode: proto.StatusCode_STATUSCHECK_KUBECTL_PID_KILLED,
-				Message: msgKubectlKilled,
+				Message: "received Ctrl-C or deployments could not stabilize within 10s: kubectl rollout status command interrupted\n",
 			},
 		},
 		{
@@ -162,7 +166,7 @@ func TestParseKubectlError(t *testing.T) {
 	}
 	for _, test := range tests {
 		testutil.Run(t, test.description, func(t *testutil.T) {
-			ae := parseKubectlRolloutError(test.details, test.err)
+			ae := parseKubectlRolloutError(test.details, 10*time.Second, test.err)
 			t.CheckDeepEqual(test.expectedAe, ae)
 		})
 	}
2 changes: 0 additions & 2 deletions pkg/skaffold/kubernetes/status/status_check.go
@@ -288,8 +288,6 @@ func (s *Monitor) printStatusCheckSummary(out io.Writer, r *resource.Deployment,
 		// another deployment failed
 		return
 	}
-	event.ResourceStatusCheckEventCompleted(r.String(), ae)
-	eventV2.ResourceStatusCheckEventCompleted(r.String(), sErrors.V2fromV1(ae))
 	out, _ = output.WithEventContext(context.Background(), out, constants.Deploy, r.String())
 	status := fmt.Sprintf("%s %s", tabHeader, r)
 	if ae.ErrCode != proto.StatusCode_STATUSCHECK_SUCCESS {
8 changes: 0 additions & 8 deletions pkg/skaffold/kubernetes/status/status_check_test.go
@@ -596,14 +596,6 @@ func TestPollDeployment(t *testing.T) {
 					"Pending",
 					proto.ActionableErr{ErrCode: proto.StatusCode_STATUSCHECK_NODE_DISK_PRESSURE},
 					[]string{"err"})},
-				// pod recovered
-				{validator.NewResource(
-					"test",
-					"pod",
-					"dep-pod",
-					"Running",
-					proto.ActionableErr{ErrCode: proto.StatusCode_STATUSCHECK_SUCCESS},
-					nil)},
 			},
 			expected: proto.StatusCode_STATUSCHECK_SUCCESS,
 		},