-
Notifications
You must be signed in to change notification settings - Fork 837
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
net: add the ability to blacklist network interfaces based on driver during enumeration of physical network devices #591
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for putting this together. I've some suggestions and some questions inline.
cloudinit/sources/DataSourceAzure.py
Outdated
@@ -529,6 +543,9 @@ def _get_data(self): | |||
except Exception as e: | |||
LOG.warning("Failed to get system information: %s", e) | |||
|
|||
if self.distro is not None: | |||
self.distro.networking.blacklist_drivers = BLACKLIST_DRIVERS |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is OK, but wanted to check with @OddBloke about restoring a datasource from obj.pkl where distro.networking might not be present? What did we do about that?
Further, if the distro.networking is there, would we need to check if it hasattr? before setting
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not really an issue for Azure datasource because the obj.pkl gets deleted on every boot (due to Azure's requirement to have network re-configured on boot).
I also did test on upgrading an existing VM with this new code and keeping the obj.pkl (by disabling the systemd unit that would delete the obj.pkl on boot) and it works fine. If the datasource was pickled the _get_data method would not get called since the data was cached
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From our IRC conversation: there might be an existing issue before my change with networking being an instance variable. This change, however, might make that bug easier to run into. @OddBloke mentioned he had a local commit that I can incorporate into this PR to address the issue altogether.
Thanks for the review! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unrelatedly…
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the update. A couple more suggestions and I think we need to move Azure blacklist driver testing to the test_azure.py so we can operate on an instantiated Azure datasource object to confirm the changes are plumbed through from the datasource to the networking class it creates.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good. One last comment, convert the comment to a docstrict for the unittest.
Proposed Commit Message
net: add the ability to blacklist network interfaces based on driver during enumeration of physical network devices
Additional Context
Relevant bug: https://bugs.launchpad.net/cloud-init/+bug/1874544
Test Steps
Deploy a Standard_HB120rs_V2 instance on Azure
Install latest Mellanox OFED driver
Reboot and observe Traceback in log about ib0 and eth1 having duplicate mac addresses
Apply this PR's patch, reboot
Observe no traceback in cloud-init log
Checklist: