test(hotplug): fix race getting ipv6 #5271

Merged: 1 commit merged on Jun 4, 2024

@aciba90 aciba90 commented May 7, 2024

Proposed Commit Message

test(hotplug): fix race getting ipv6

Add retry logic for _get_ip_addr to properly get the ipv6 address.

The output of `ip --brief addr` can show
ens6 UP 192.168.13.34/20 metric 200 fe80::8fd:afff:fea3:f4ad/64
instead of
ens6 UP 192.168.13.34/20 metric 200 2a05:d012:ea0:c500:1373:45f4:aa83:517c/128 fe80::8fd:afff:fea3:f4ad/64
if executed so early that the kernel didn't expose the wanted ipv6 address.

Additional Context

https://jenkins.canonical.com/server-team/view/cloud-init/job/cloud-init-integration-jammy-ec2/lastCompletedBuild/testReport/tests.integration_tests.modules/test_hotplug/test_multi_nic_hotplug_vpc/

...
            assert "routing-policy" in new_nic_cfg
>           assert [
                {"from": secondary_priv_ip4, "table": 101},
                {"from": secondary_priv_ip6, "table": 101},
            ] == new_nic_cfg["routing-policy"]
E           AssertionError: assert [{'from': '192.168.1.57', 'table': 101}, {'from': 'fe80::f5:aeff:fe8a:d7d', 'table': 101}] == [{'from': '192.168.1.57', 'table': 101}, {'from': '2600:1f16:7b9:e900:ab5a:a297:8501:714', 'table': 101}]
E             
E             At index 1 diff: {'from': 'fe80::f5:aeff:fe8a:d7d', 'table': 101} != {'from': '2600:1f16:7b9:e900:ab5a:a297:8501:714', 'table': 101}
E             
E             Full diff:
E               [
E                   {
E                       'from': '192.168.1.57',
E                       'table': 101,
E                   },
E                   {
E             -         'from': '2600:1f16:7b9:e900:ab5a:a297:8501:714',
E             +         'from': 'fe80::f5:aeff:fe8a:d7d',
E                       'table': 101,
E                   },
E               ]
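
The failing assertion got a link-local (`fe80::/10`) address where a global one was expected. As a minimal sketch (not the code from this PR), Python's standard `ipaddress` module can tell the two apart on a single `ip --brief addr` line. `pick_global_ipv6` is a hypothetical helper name, and the sample lines are the ones quoted in the commit message.

```python
import ipaddress
from typing import Optional


def pick_global_ipv6(brief_line: str) -> Optional[str]:
    """Return the first non-link-local IPv6 address on one `ip --brief addr` line.

    Hypothetical helper for illustration; returns None when only a
    link-local address (or no IPv6 address at all) is present.
    """
    # Fields 0 and 1 are the interface name and its state; the rest are
    # addresses mixed with extra words such as "metric 200".
    for token in brief_line.split()[2:]:
        try:
            iface = ipaddress.ip_interface(token)
        except ValueError:
            continue  # not an address, e.g. "metric" or "200"
        if iface.version == 6 and not iface.ip.is_link_local:
            return str(iface.ip)
    return None


EARLY = "ens6 UP 192.168.13.34/20 metric 200 fe80::8fd:afff:fea3:f4ad/64"
LATE = (
    "ens6 UP 192.168.13.34/20 metric 200 "
    "2a05:d012:ea0:c500:1373:45f4:aa83:517c/128 fe80::8fd:afff:fea3:f4ad/64"
)

print(pick_global_ipv6(EARLY))  # None
print(pick_global_ipv6(LATE))   # 2a05:d012:ea0:c500:1373:45f4:aa83:517c
```

A `None` result on the early line is the condition a caller would need to retry on, rather than falling back to the `fe80::` address.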

Test Steps

#!/bin/bash
set -euxo pipefail

export CLOUD_INIT_PLATFORM=ec2
export CLOUD_INIT_OS_IMAGE=jammy

for i in $(seq 1 10); do
        tox -e integration-tests -- --pdb tests/integration_tests/modules/test_hotplug.py::test_multi_nic_hotplug_vpc
done

Checklist

Merge type

  • Squash merge using "Proposed Commit Message"
  • Rebase and merge unique commits. Requires commit messages per-commit each referencing the pull request number (#<PR_NUM>)

@aciba90 aciba90 requested a review from a-dubs May 7, 2024 19:10

@a-dubs a-dubs left a comment

Looks great!

@holmanb holmanb self-assigned this May 8, 2024

@holmanb holmanb left a comment

@aciba90 Nice, thanks for fixing this!

A couple of minor questions, but this looks good to me!

  1. I don't remember why we call the .launch() context manager twice, is that intentional?

  2. We already have _wait_till_hotplug_complete(), and the purpose of this operation is to wait for hotplug to complete (the outcome of hotplug, if not necessarily the systemd service). Would the fix make more sense in that function maybe?

aciba90 commented May 9, 2024

Thanks for the reviews!

> 1. I don't remember why we call the `.launch()` context manager twice, is that intentional?

We need a second instance in the same subnet to test that connectivity works properly from there. It is explained in the test docstring.

> 2. We already have _wait_till_hotplug_complete(), and the purpose of this operation is to wait for hotplug to complete (the outcome of hotplug, if not necessarily the systemd service). Would the fix make more sense in that function maybe?

I do not think so, for two reasons:

  1. _wait_till_hotplug_complete would grow a lot of logic regarding NICs, interface state, etc., which IMO would make the function more complex.
  2. While I also think that the fix (retry logic) does not 100% belong in the current function, I think it fits more naturally there than in _wait_till_hotplug_complete.

Thoughts?
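
For readers following the discussion, here is a rough sketch of what retry logic wrapped around an address getter can look like. It is not the implementation merged in this PR: `retry`, `make_ipv6_getter`, and the `run_ip_brief` callable are hypothetical names, and the real `_get_ip_addr` signature is not shown in this conversation.

```python
import functools
import ipaddress
import time


def retry(tries: int = 3, delay: float = 1.0):
    """Re-run the wrapped function on AssertionError, up to `tries` attempts.

    Illustrative decorator only; names and defaults are assumptions.
    """
    if tries < 1:
        raise ValueError("tries must be >= 1")

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, tries + 1):
                try:
                    return func(*args, **kwargs)
                except AssertionError:
                    if attempt == tries:
                        raise  # out of attempts: surface the last failure
                    time.sleep(delay)

        return wrapper

    return decorator


def make_ipv6_getter(run_ip_brief, tries: int = 3, delay: float = 1.0):
    """Build a getter that retries until a non-link-local IPv6 shows up.

    `run_ip_brief` is a hypothetical callable returning one line of
    `ip --brief addr` output for the NIC of interest.
    """

    @retry(tries=tries, delay=delay)
    def get_ipv6() -> str:
        for token in run_ip_brief().split():
            try:
                iface = ipaddress.ip_interface(token)
            except ValueError:
                continue  # interface name, state, "metric", ...
            if iface.version == 6 and not iface.ip.is_link_local:
                return str(iface.ip)
        raise AssertionError("global IPv6 address not visible yet")

    return get_ipv6
```

Keeping the retry on the getter, as in this sketch, leaves a function like `_wait_till_hotplug_complete` free of NIC-specific checks, which is the trade-off described in the reply above.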

Hello! Thank you for this proposed change to cloud-init. This pull request is now marked as stale as it has not seen any activity in 14 days. If no activity occurs within the next 7 days, this pull request will automatically close.

If you are waiting for code review and you are seeing this message, apologies! Please reply, tagging TheRealFalcon, and he will ensure that someone takes a look soon.

(If the pull request is closed and you would like to continue working on it, please do tag TheRealFalcon to reopen it.)

@github-actions github-actions bot added the stale-pr Pull request is stale; will be auto-closed soon label May 24, 2024
@aciba90 aciba90 requested a review from holmanb May 27, 2024 14:39
@aciba90 aciba90 removed the stale-pr Pull request is stale; will be auto-closed soon label May 27, 2024
@blackboxsw blackboxsw added this to the cloud-init-24.2 milestone May 27, 2024
@aciba90 aciba90 merged commit 7e4d293 into canonical:main Jun 4, 2024
23 checks passed
@aciba90 aciba90 deleted the aws-hotplug-ipv6-test branch June 4, 2024 16:36
holmanb pushed a commit that referenced this pull request Jun 28, 2024