New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add timeout config for node_e2e tests #83268
Conversation
@odinuge: The label(s) In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
hack/make-rules/test-e2e-node.sh
Outdated
@@ -95,6 +95,7 @@ if [ "${remote}" = true ] ; then | |||
cleanup=${CLEANUP:-"true"} | |||
delete_instances=${DELETE_INSTANCES:-"false"} | |||
preemptible_instances=${PREEMPTIBLE_INSTANCES:-"false"} | |||
test_timeout=${TIMEOUT:-"45m"} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we avoid hardcoding "45m" here (ref. https://github.com/kubernetes/kubernetes/blob/master/test/e2e_node/remote/remote.go#L34), and only use the test-timeout
flag when the timeout is given? And if so, should we then still write Defaults to 45m
in the Makefile?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I'm plus one on avoiding hardcoding 45m - lets keep remote.go
as the source of truth for the default. I think the Makefile
should also indicate that remote.go
contains the source of truth.
/cc BenTheElder |
@kubernetes/sig-node-pr-reviews |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can only see benefits from adding the ability to configure the timeout :) None of the default behaviors are changed, nor is the behavior when running via CI. We are just giving the individual dev more flexibility.
LGTM modulo I agree with your comment around not hard coding "45m" in two places. Please ping me when you've updated the diff and I'll mark as "lgtm".
hack/make-rules/test-e2e-node.sh
Outdated
@@ -95,6 +95,7 @@ if [ "${remote}" = true ] ; then | |||
cleanup=${CLEANUP:-"true"} | |||
delete_instances=${DELETE_INSTANCES:-"false"} | |||
preemptible_instances=${PREEMPTIBLE_INSTANCES:-"false"} | |||
test_timeout=${TIMEOUT:-"45m"} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I'm plus one on avoiding hardcoding 45m - lets keep remote.go
as the source of truth for the default. I think the Makefile
should also indicate that remote.go
contains the source of truth.
e5fcce1
to
14d386a
Compare
/test pull-kubernetes-node-e2e-containerd |
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with Send feedback to sig-testing, kubernetes/test-infra and/or fejta. |
@@ -151,6 +155,7 @@ if [ "${remote}" = true ] ; then | |||
--delete-instances="${delete_instances}" --test_args="${test_args}" --instance-metadata="${metadata}" \ | |||
--image-config-file="${image_config_file}" --system-spec-name="${system_spec_name}" \ | |||
--preemptible-instances="${preemptible_instances}" --extra-envs="${extra_envs}" --test-suite="${test_suite}" \ | |||
"${timeout_arg}" \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what does this do if empty? is an empty arg tolerated?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh, yeah, that works fine! Running with TIMEOUT=20s
, TIMEOUT=""
and without TIMEOUT
works.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ping @liggitt
@odinuge would you be able to continue working on this? |
14d386a
to
08d808b
Compare
Some test suits use more than the default 45m, resulting in the test to crash.
08d808b
to
6dfdf05
Compare
/retest |
1 similar comment
/retest |
Looking through owners of build and hack... |
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with Send feedback to sig-testing, kubernetes/test-infra and/or fejta. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/remove-lifecycle stale
/approve
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: derekwaynecarr, odinuge, spiffxp The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/retest Review the full test history for this PR. Silence the bot with an |
What type of PR is this?
/kind feature
/sig testing
What this PR does / why we need it:
Some test suits use more than the default 45m, resulting in the test to
crash. Without invoking the go runner manually, it is currently impossible to set the timeout when running the tests.
Here is an example of a test suite taking ~2 hours.
The current 45m limit comes from here: https://github.com/kubernetes/kubernetes/blob/master/test/e2e_node/remote/remote.go#L34
Which issue(s) this PR fixes:
Fixes #
N/A
Special notes for your reviewer:
Does this PR introduce a user-facing change?:
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: