Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

k8s.io/apimachinery/pkg/util/wait.PollUntilContextCancel immediately executes condition twice #119533

Closed
invidian opened this issue Jul 24, 2023 · 6 comments · Fixed by #119762
Closed
Assignees
Labels
kind/bug Categorizes issue or PR as related to a bug. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. triage/accepted Indicates an issue or PR is ready to be actively worked on.

Comments

@invidian
Copy link
Member

invidian commented Jul 24, 2023

What happened?

While trying to update https://github.com/flatcar/flatcar-linux-update-operator/blob/7a1b0ff2769af91e8ff2bae82fc93f2f5baf0b0c/pkg/agent/agent.go#L328 to use latest version of wait package and to migrate away from deprecated PollImmediateUntil, function, our tests fail as it seems new PollUntilContextCancel function immediately executes given condition twice, which is not mentioned anywhere.

Result of executed test specified below:

$ go test . -v
=== RUN   Test_test
=== PAUSE Test_test
=== CONT  Test_test
=== RUN   Test_test/PollUntilContextCancel
    prog_test.go:25: Executed at 2009-11-10 23:00:00 +0000 UTC m=+0.000000001
    prog_test.go:25: Executed at 2009-11-10 23:00:00 +0000 UTC m=+0.000000001
    prog_test.go:25: Unexpected early timestamp
=== RUN   Test_test/PollImmediateUntil
    prog_test.go:31: Executed at 2009-11-10 23:00:00 +0000 UTC m=+0.000000001
    prog_test.go:31: Executed at 2009-11-10 23:00:02 +0000 UTC m=+2.000000001
--- FAIL: Test_test (3.00s)
    --- FAIL: Test_test/PollUntilContextCancel (0.00s)
    --- PASS: Test_test/PollImmediateUntil (3.00s)
FAIL

What did you expect to happen?

Since k8s.io/apimachinery/pkg/util/wait.PollImmediateUntil has been deprecated and it recommends using k8s.io/apimachinery/pkg/util/wait.PollUntilContextCancel, I expected it to have the equivalent semantics to allow easy migration from deprecated functions, as no changelog or documentation mentions new semantics.

How can we reproduce it (as minimally and precisely as possible)?

Playground: https://go.dev/play/p/qQBnWM4nufF

Local:

  1. go mod init foo && go get k8s.io/apimachinery/pkg/util/wait
  2. Create main_test.go file with the following content:
package main

import (
	"context"
	"testing"
	"time"

	"k8s.io/apimachinery/pkg/util/wait"
)

const (
	testTimeout           = 3 * time.Second
	callsExpectedInterval = 2 * time.Second
)

func Test_test(t *testing.T) {
	t.Parallel()

	ctx, cancel := context.WithTimeout(context.Background(), testTimeout)
	t.Cleanup(cancel)

	t.Run("PollUntilContextCancel", func(t *testing.T) {
		f := testIntervals(t)

		wait.PollUntilContextCancel(ctx, callsExpectedInterval, true, func(context.Context) (bool, error) { f(); return false, nil })
	})

	t.Run("PollImmediateUntil", func(t *testing.T) {
		f := testIntervals(t)

		wait.PollImmediateUntil(callsExpectedInterval, func() (bool, error) { f(); return false, nil }, ctx.Done())
	})
}

func testIntervals(t *testing.T) func() {
	t.Helper()

	timestamps := make(chan time.Time, 1)

	timestamps <- time.Now().Add(-1 * 2 * callsExpectedInterval)

	return func() {
		t.Helper()

		t.Logf("Executed at %s", time.Now())

		if timestamp := <-timestamps; timestamp.After(time.Now().Add(-1 * callsExpectedInterval / 2)) {
			t.Fatalf("Unexpected early timestamp")
		}

		timestamps <- time.Now()
	}
}
  1. Run as go test . -v.

Anything else we need to know?

No response

Kubernetes version

$ kubectl version
# paste output here

Cloud provider

OS version

# On Linux:
$ cat /etc/os-release
# paste output here
$ uname -a
# paste output here

# On Windows:
C:\> wmic os get Caption, Version, BuildNumber, OSArchitecture
# paste output here

Install tools

Container runtime (CRI) and version (if applicable)

Related plugins (CNI, CSI, ...) and versions (if applicable)

@invidian invidian added the kind/bug Categorizes issue or PR as related to a bug. label Jul 24, 2023
@k8s-ci-robot k8s-ci-robot added needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Jul 24, 2023
@invidian
Copy link
Member Author

/sig api-machinery

@k8s-ci-robot k8s-ci-robot added sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Jul 24, 2023
@invidian
Copy link
Member Author

CC @smarterclayton @aojea since you were involved in #118671 #118686 #107826

@alexzielenski
Copy link
Contributor

/assign @aojea
/triage accepted

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Jul 25, 2023
@AxeZhan
Copy link
Member

AxeZhan commented Jul 26, 2023

I'd like to work on this one if it's allowed @aojea @smarterclayton
I think this is a simple fix with few lines removed:

	if immediate {
		if ok, err := func() (bool, error) {
			defer runtime.HandleCrash()
			return condition(ctx)
		}(); err != nil || ok {
			return err
		}
-       } else {
-               timeCh = t.C()
-               select {
-               case <-doneCh:
-                       return ctx.Err()
-               case <-timeCh:
-               }
+       }
+
+       timeCh = t.C()
+       select {
+       case <-doneCh:
+               return ctx.Err()
+       case <-timeCh:
        }

i.e.: instead of only wait a loop if immediate is false, we wait a time loop after special execution also if the immediate is true.

	if immediate {
		if ok, err := func() (bool, error) {
			defer runtime.HandleCrash()
			return condition(ctx)
		}(); err != nil || ok {
			return err
		}
	}

	timeCh = t.C()
	select {
	case <-doneCh:
		return ctx.Err()
	case <-timeCh:
	}

@AxeZhan
Copy link
Member

AxeZhan commented Aug 4, 2023

/assign

@aojea
Copy link
Member

aojea commented Aug 6, 2023

/assign @AxeZhan

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug Categorizes issue or PR as related to a bug. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants
@alexzielenski @aojea @invidian @k8s-ci-robot @AxeZhan and others