Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(test): wait enough time to Trigger Running Hook. Fixes: #12844 #12855

Merged
merged 4 commits into from Mar 30, 2024

Conversation

shuangkun
Copy link
Member

sleep 1s is sometimes too short to trigger the running hook

Fixes #12844

Motivation

Modifications

Verification

Signed-off-by: shuangkun <tsk2013uestc@163.com>
test/e2e/hooks_test.go Outdated Show resolved Hide resolved
Signed-off-by: shuangkun <tsk2013uestc@163.com>
Signed-off-by: shuangkun <tsk2013uestc@163.com>
@shuangkun
Copy link
Member Author

shuangkun commented Mar 28, 2024

I saw there are many comment:

	// TODO: Temporarily comment out this assertion since it's flaky:
	// 	  The running hook is occasionally not triggered. Possibly because the step finishes too quickly
	//	  while the controller did not get a chance to trigger this hook.

This one we can sleep longer. Because it only verifies running hooks, unlike others that also verify other types of hooks.

Signed-off-by: shuangkun <tsk2013uestc@163.com>
@tczhao
Copy link
Member

tczhao commented Mar 28, 2024

Could you let us how you tested it to make sure it doesn't flake

@shuangkun
Copy link
Member Author

shuangkun commented Mar 28, 2024

Could you let us how you tested it to make sure it doesn't flake

It is indeed not guaranteed, but now sleep 1s does often lead to a higher probability of failure of this test. In many other tests, the running hook check is skipped. If it is better for you to skip the task,I can skip it.

@tczhao
Copy link
Member

tczhao commented Mar 28, 2024

I wonder if we can run https://argo-workflows.readthedocs.io/en/stable/running-locally/#running-e2e-tests-locally a few times to reproduce the original issue

@shuangkun
Copy link
Member Author

Could you let us how you tested it to make sure it doesn't flake

It is indeed not guaranteed, but now sleep 1s does often lead to a higher probability of failure of this test. In many other tests, the running hook check is skipped. If it is better for you to skip the task, you can skip it.

I wonder if we can run https://argo-workflows.readthedocs.io/en/stable/running-locally/#running-e2e-tests-locally a few times to reproduce the original issue

It is difficult to reproduce, but it may happen in the test after submitting the PR. Like TestStoppedWorkflow and TestTemplateExecutor, It is almost difficult to reproduce locally when running a test alone. It is easy to occur only after the PR is submitted and several people trigger the pipeline at the same time.

Copy link
Member

@tczhao tczhao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can keep monitor and see if this change reduces flakiness

@tczhao tczhao self-assigned this Mar 28, 2024
@shuangkun
Copy link
Member Author

I think we can keep monitor and see if this change reduces flakiness

Yes, I think so. Thanks

@agilgur5 agilgur5 added area/build Build or GithubAction/CI issues area/hooks labels Mar 29, 2024
Copy link
Member

@toyamagu-2021 toyamagu-2021 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! We should observe this change resolves the issue.

I wonder if we can run argo-workflows.readthedocs.io/en/stable/running-locally/#running-e2e-tests-locally a few times to reproduce the original issue

It might be hard to reproduce this issue by local testing, because this fakiness may be caused by poor spec of CI runner (2core, 7mem).

Ref: #11534

@agilgur5 Thanks for mentioning me :)

@agilgur5
Copy link
Member

poor spec of CI runner (2core, 7mem).

This did get doubled recently (see also Slack thread), it's now 4 vCPU & 16GB RAM. But 4 vCPU is still likely less vCPU than most local dev setups

@agilgur5 agilgur5 merged commit daa5f7e into argoproj:main Mar 30, 2024
27 checks passed
@agilgur5 agilgur5 added this to the v3.5.x patches milestone Apr 19, 2024
agilgur5 pushed a commit that referenced this pull request Apr 19, 2024
…12855)

Signed-off-by: shuangkun <tsk2013uestc@163.com>
(cherry picked from commit daa5f7e)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/build Build or GithubAction/CI issues area/hooks
Projects
None yet
Development

Successfully merging this pull request may close these issues.

TestTemplateLevelHooksWaitForTriggeredHook fail sometimes (flakey test)
4 participants