OCPBUGS-19037: Handle agent tui failure gracefully #7490

Merged


zaneb
Member

@zaneb zaneb commented Sep 15, 2023

If agent-tui fails to run, just carry on and don't block the console and ssh login.

Because agent-interactive-service is declared RequiredBy sshd and
systemd-logind, the user can never log in to the host if the agent-tui
process fails. This makes it impossible to debug the failure.

We want to start the agent-tui whenever tty1 exists (which in practice
seems to be always, even if there is only a serial console). To ensure
that we don't block subsequent login to the console after a failure, use
WantedBy instead of RequiredBy. The other install dependencies are not
needed.

If agent-tui fails to start, we still want to switch back to the regular
VT and re-enable systemd status. Since ExecStartPost commands only run
if ExecStart succeeds, use ExecStopPost instead. This requires disabling
RemainAfterExit (which is not needed anyway) so that the service also
stops, and the ExecStopPost commands run, on success.
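
For illustration, the approach described above could look roughly like the following unit fragment. This is a minimal sketch, not the unit file changed in this PR: the binary path, the VT number, and the `getty@tty1.service` install target are assumptions made for the example, and ordering directives and the command that re-enables systemd status output are omitted.

```ini
# Illustrative sketch only. The path, VT number, and install target
# below are assumptions, not taken from the actual change.
[Unit]
Description=Run agent-tui on the console (sketch)
# Only start when tty1 exists.
ConditionPathExists=/dev/tty1

[Service]
Type=oneshot
# RemainAfterExit is left at its default (no), so the service stops as
# soon as ExecStart exits and the ExecStopPost commands run.
ExecStart=/usr/local/bin/agent-tui
# ExecStopPost runs after the service stops, including when ExecStart
# failed. ExecStartPost would run only after a successful ExecStart.
ExecStopPost=/usr/bin/chvt 1

[Install]
# WantedBy adds a Wants= dependency to the listed unit: if this service
# fails, the wanting unit is still started. RequiredBy would add a
# Requires= dependency, which can block the requiring unit on failure.
WantedBy=getty@tty1.service
```

With a layout like this, a failing agent-tui leaves the service in a failed state that can be inspected with `systemctl status`, while console and ssh logins remain available for debugging.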
@openshift-ci-robot openshift-ci-robot added jira/severity-important Referenced Jira bug's severity is important for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. labels Sep 15, 2023
@openshift-ci-robot
Contributor

@zaneb: This pull request references Jira Issue OCPBUGS-19037, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.15.0) matches configured target version for branch (4.15.0)
  • bug is in the state New, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact:
/cc @mhanss

The bug has been updated to refer to the pull request using the external bug tracker.

In response to this:

If agent-tui fails to run, just carry on and don't block the console and ssh login.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci-robot openshift-ci-robot added the jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. label Sep 15, 2023
@zaneb
Member Author

zaneb commented Sep 15, 2023

/cherry-pick release-4.14

@openshift-cherrypick-robot

@zaneb: once the present PR merges, I will cherry-pick it on top of release-4.14 in a new PR and assign it to you.

In response to this:

/cherry-pick release-4.14


@zaneb
Member Author

zaneb commented Sep 15, 2023

/retest-required

@zaneb
Member Author

zaneb commented Sep 15, 2023

/cc @rwsu

@openshift-ci openshift-ci bot requested a review from rwsu September 15, 2023 09:48
@andfasano
Contributor

/test e2e-agent-sno-ipv4-pxe
/test e2e-agent-compact-ipv4-none-platform

@bfournie
Contributor

/approve
/retest

@openshift-ci
Contributor

openshift-ci bot commented Sep 15, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bfournie

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 15, 2023
@openshift-ci
Contributor

openshift-ci bot commented Sep 15, 2023

@zaneb: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-agent-compact-ipv4-appliance | 2357623 | link | false | /test e2e-agent-compact-ipv4-appliance |
| ci/prow/okd-scos-e2e-agent-compact-ipv4 | 2357623 | link | false | /test okd-scos-e2e-agent-compact-ipv4 |
| ci/prow/okd-e2e-agent-compact-ipv4 | 2357623 | link | false | /test okd-e2e-agent-compact-ipv4 |
| ci/prow/e2e-agent-compact-ipv4-none-platform | 2357623 | link | false | /test e2e-agent-compact-ipv4-none-platform |
| ci/prow/okd-scos-e2e-agent-sno-ipv6 | 2357623 | link | false | /test okd-scos-e2e-agent-sno-ipv6 |
| ci/prow/okd-e2e-agent-sno-ipv6 | 2357623 | link | false | /test okd-e2e-agent-sno-ipv6 |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@andfasano
Contributor

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Sep 18, 2023
@openshift-merge-robot openshift-merge-robot merged commit 4e4eec6 into openshift:master Sep 18, 2023
25 of 31 checks passed
@openshift-ci-robot
Contributor

@zaneb: Jira Issue OCPBUGS-19037: All pull requests linked via external trackers have merged:

Jira Issue OCPBUGS-19037 has been moved to the MODIFIED state.

In response to this:

If agent-tui fails to run, just carry on and don't block the console and ssh login.


@openshift-cherrypick-robot

@zaneb: new pull request created: #7497

In response to this:

/cherry-pick release-4.14


Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/severity-important Referenced Jira bug's severity is important for the branch this PR is targeting. jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged.

6 participants