OCPBUGS-22200: Retry fetching OpenStack hostname if it fails #3990

Merged

Conversation

mdbooth
Contributor

@mdbooth mdbooth commented Oct 20, 2023

Common cause with #3979

- What I did

First, a refactor: the node name is now supplied through an EnvironmentFile, loaded via a drop-in on kubelet.service, rather than through systemd config in kubelet.service.d. This is a slightly nicer way of doing it and is consistent with the approach for AWS in #3979. It requires an upgrade step that moves any existing file to the new location and format; that step can be removed in a release after all legacy nodes have been upgraded.

Most significantly, the generator script now retries indefinitely until it gets a value back.
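The retry behaviour described here can be sketched as a small loop. This is only an illustration in Python, not the generator script from this PR: `fetch_until_available`, its `fetch` callback, and the 5-second delay are hypothetical names and values chosen for the example, and the sketch assumes that both an error and an empty result mean "not available yet".

```python
import time
from typing import Callable, Optional


def fetch_until_available(
    fetch: Callable[[], Optional[str]],
    delay_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call fetch() until it returns a non-empty string, then return it.

    Hypothetical helper: `fetch` stands in for whatever queries the
    instance name. There is deliberately no retry limit.
    """
    while True:
        try:
            value = fetch()
        except Exception as err:  # broad on purpose: any failure means "try again"
            print(f"fetch failed: {err}; retrying in {delay_seconds}s")
            value = None
        if value and value.strip():
            # Got a usable value: strip surrounding whitespace and stop retrying.
            return value.strip()
        sleep(delay_seconds)
```

Passing `sleep` as a parameter keeps the loop testable without real delays. Looping without an upper bound matches the "retry indefinitely" wording above, on the assumption that starting kubelet without a node name is worse than waiting.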

- How to verify it

  • Ensure a newly installed machine gets /etc/kubernetes/node.env correctly populated with the instance name.
  • Ensure that on an upgraded machine, /etc/systemd/system/kubelet.service.d/20-openstack-node-name.conf is upgraded to /etc/kubernetes/node.env. The upgrade is logged by the openstack-kubelet-nodename system service on first boot after the upgrade.
  • Ensure the node name is set to the instance name, not the hostname, on clouds where the two differ. I believe Vexxhost is a candidate.
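The upgrade in the second check above amounts to a format conversion: from a systemd drop-in to a plain KEY=VALUE file that an EnvironmentFile directive can load. The sketch below shows one way such a conversion could look. It is not the code from this PR: the function name is hypothetical, the contents of the old drop-in are assumed to be `Environment=` assignments, the variable name `KUBELET_NODE_NAME` in the usage comment is an assumption, and `shlex` only approximates systemd's quoting rules. Values containing whitespace are not handled.

```python
import shlex


def dropin_to_env_file(dropin_text: str) -> str:
    """Convert Environment= assignments from a systemd drop-in into
    KEY=VALUE lines suitable for an EnvironmentFile."""
    lines = []
    for raw in dropin_text.splitlines():
        line = raw.strip()
        if not line.startswith("Environment="):
            continue  # skip section headers, comments, and other directives
        # Environment= may hold several space-separated, optionally quoted assignments.
        for assignment in shlex.split(line[len("Environment="):]):
            key, sep, value = assignment.partition("=")
            if sep and key:
                lines.append(f"{key}={value}")
    return "\n".join(lines) + ("\n" if lines else "")


# Usage with assumed legacy contents (the variable name is hypothetical):
legacy = '[Service]\nEnvironment="KUBELET_NODE_NAME=worker-0"\n'
print(dropin_to_env_file(legacy), end="")
```

A real upgrade would also need to write the result to the new path and remove the old file, which this sketch leaves out.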

- Description for the changelog

@openshift-ci openshift-ci bot added the approved label (Indicates a PR has been approved by an approver from all required OWNERS files.) Oct 20, 2023
@mdbooth mdbooth changed the title from "Retry fetching OpenStack hostname if it fails" to "OCPBUGS-22200: Retry fetching OpenStack hostname if it fails" Oct 20, 2023
@openshift-ci-robot openshift-ci-robot added the following labels Oct 20, 2023:
  • jira/severity-moderate: Referenced Jira bug's severity is moderate for the branch this PR is targeting.
  • jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.
  • jira/valid-bug: Indicates that a referenced Jira bug is valid for the branch this PR is targeting.
@openshift-ci-robot
Contributor

@mdbooth: This pull request references Jira Issue OCPBUGS-22200, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.15.0) matches configured target version for branch (4.15.0)
  • bug is in the state New, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact:
/cc @sunzhaohua2

The bug has been updated to refer to the pull request using the external bug tracker.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci-robot
Contributor

@mdbooth: This pull request references Jira Issue OCPBUGS-22200, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.15.0) matches configured target version for branch (4.15.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact:
/cc @sunzhaohua2


@JoelSpeed
Contributor

/test all

@openshift-ci
Contributor

openshift-ci bot commented Oct 23, 2023

@mdbooth: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/okd-scos-e2e-aws-ovn | c738160 | link | false | /test okd-scos-e2e-aws-ovn |

Full PR test history. Your PR dashboard.


@EmilienM
Member

EmilienM commented Nov 9, 2023

/lgtm

@openshift-ci openshift-ci bot added the lgtm label (Indicates that a PR is ready to be merged.) Nov 9, 2023
Contributor

openshift-ci bot commented Nov 9, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: EmilienM, mdbooth

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@EmilienM
Member

EmilienM commented Nov 9, 2023

/retest-required

@openshift-merge-bot openshift-merge-bot bot merged commit 8e2f544 into openshift:master Nov 10, 2023
14 of 15 checks passed
@openshift-ci-robot
Contributor

@mdbooth: Jira Issue OCPBUGS-22200: All pull requests linked via external trackers have merged:

Jira Issue OCPBUGS-22200 has been moved to the MODIFIED state.


@openshift-merge-robot
Contributor

Fix included in accepted release 4.15.0-0.nightly-2023-11-13-174800

@mandre mandre deleted the retry-openstack-hostname branch November 17, 2023 06:39
Labels
  • approved: Indicates a PR has been approved by an approver from all required OWNERS files.
  • jira/severity-moderate: Referenced Jira bug's severity is moderate for the branch this PR is targeting.
  • jira/valid-bug: Indicates that a referenced Jira bug is valid for the branch this PR is targeting.
  • jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.
  • lgtm: Indicates that a PR is ready to be merged.

5 participants