overlay: initramfs teardown: handle vlan devices during teardown #544

Merged
dustymabe merged 1 commit into coreos:testing-devel from dusty-fix-vlan-teardown on Aug 6, 2020

dustymabe
Member

When we are taking down network devices it's possible that taking
down one will remove another (if they are connected in some way).
This will cause the down_interface() call to fail if the device
has already disappeared. Let's do another check to see if the device
still exists right before trying to take it down.

As an example of this failing, suppose we have kargs of

rd.neednet=1
vlan=vlan251:bond0
ip=192.168.122.111::192.168.122.1:255.255.255.0:initrdhost:vlan251:none:192.168.122.1
bond=bond0:ens2,ens3:mode=active-backup,miimon=100

Then we have a vlan on top of a bond and we get an error during teardown:

[  191.091838] coreos-teardown-initramfs[746]: info: taking down network device: ens2
[  191.093836] coreos-teardown-initramfs[763]: RTNETLINK answers: Operation not supported
[  191.098439] coreos-teardown-initramfs[746]: info: taking down network device: ens3
[  191.100318] coreos-teardown-initramfs[767]: RTNETLINK answers: Operation not supported
[  191.105143] coreos-teardown-initramfs[746]: info: taking down network device: vlan251
[  191.110188] coreos-teardown-initramfs[772]: Cannot find device "vlan251"
[  191.114853] coreos-teardown-initramfs[775]: Cannot find device "vlan251"
[  191.117254] systemd[1]: coreos-teardown-initramfs.service: Control process exited, code=exited, status=1/FAILURE
[  191.119333] systemd[1]: coreos-teardown-initramfs.service: Failed with result 'exit-code'.

@dustymabe dustymabe marked this pull request as draft August 4, 2020 20:16
@dustymabe dustymabe marked this pull request as ready for review August 4, 2020 21:17
@dustymabe
Member Author

example of this working:

[   11.792368] coreos-teardown-initramfs[707]: info: skipping teardown of vlan251. It no longer exists.
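
For readers who want to try the idea outside the initramfs, here is a minimal sketch of the re-check described above. It is not the actual teardown script: a temporary directory stands in for the kernel's list of network devices, `netdir` and `teardown_all` are hypothetical names, and `down_interface()` is replaced by a stub that only simulates a dependent device disappearing.

```sh
#!/bin/sh
# Sketch only: a temp directory stands in for the real device list, and
# down_interface() below is a hypothetical stub, not the real helper.
netdir=$(mktemp -d)
touch "$netdir/bond0" "$netdir/vlan251"

down_interface() {
    echo "info: taking down network device: $1"
    # Simulate a stacked device: removing bond0 also removes vlan251.
    if [ "$1" = "bond0" ]; then
        rm -f "$netdir/vlan251"
    fi
    rm -f "$netdir/$1"
}

teardown_all() {
    # The glob is expanded once, before any device is taken down,
    # so the list can go stale while the loop is running.
    for f in "$netdir"/*; do
        interface=$(basename "$f")
        # Re-check right before acting: an earlier iteration may
        # already have removed this device.
        if [ ! -e "$f" ]; then
            echo "info: skipping teardown of ${interface}. It no longer exists."
            continue
        fi
        down_interface "$interface"
    done
}

output=$(teardown_all)
echo "$output"
rm -rf "$netdir"
```

Because the loop's file list is fixed when the glob expands, the existence test has to sit inside the loop body, immediately before the call that would otherwise fail.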

Member

@jlebon jlebon left a comment

One bikeshed, but LGTM as is.

# If the device we're about to take down has disappeared
# since the start of this loop then skip taking it down.
if [ ! -e $f ]; then
    echo "info: skipping teardown of ${interface}. It no longer exists."
Member

Bikeshed:

Suggested change
- echo "info: skipping teardown of ${interface}. It no longer exists."
+ echo "info: skipping teardown of ${interface}; no longer exists"

to keep with the theme of single-sentence info messages.

Contributor

@darkmuggle darkmuggle left a comment

lgtm

Although quoting `$f` in the `! -e $f` test would be safer, it's unlikely to ever matter, because the preceding `for f in ...` loop populates the variable.
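
To illustrate the quoting point, here is a small sketch of how the unquoted and quoted forms of the test differ when the variable is empty. The empty `f` is a contrived assumption for demonstration only; as noted, the loop in the real script should always populate it.

```sh
#!/bin/sh
# Illustration only: f is deliberately empty here, which should not
# happen inside the real for loop.
f=""

# Unquoted: $f expands to nothing, so the test becomes [ ! -e ].
# With two arguments, test checks whether the string "-e" is empty
# and negates that, so the result is false and the device looks present.
if [ ! -e $f ]; then unquoted="missing"; else unquoted="exists"; fi

# Quoted: test receives an empty path argument, which does not exist,
# so the negated check is true.
if [ ! -e "$f" ]; then quoted="missing"; else quoted="exists"; fi

echo "unquoted: $unquoted"
echo "quoted:   $quoted"
```

With quotes the check fails safe: an empty value is treated as a missing device and would be skipped rather than passed on to the teardown call.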

@dustymabe
Member Author

Addressed code review comments. Going through another round of testing real quick.

@dustymabe
Member Author

Tests look good locally.

The CI failure is because koji is under scheduled maintenance right now.

@dustymabe
Member Author

Will merge when I can get CI green since the earlier code reviews were already ✅

@dustymabe dustymabe merged commit 31c5bc8 into coreos:testing-devel Aug 6, 2020
@dustymabe dustymabe deleted the dusty-fix-vlan-teardown branch August 6, 2020 02:39
c4rt0 pushed a commit to c4rt0/fedora-coreos-config that referenced this pull request Mar 27, 2023
Revert "kola-denylist: skip crio test on x86_64"