Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add network teaming support (and clean up bond support) for RHEL/CentOS #57775

Merged
merged 13 commits into from
Jul 30, 2020

Conversation

terminalmage
Copy link
Contributor

@terminalmage terminalmage commented Jun 24, 2020

This makes the following changes:

  1. Network teaming support added via two new interface types: team and teamport.

  2. Bond support cleaned up significantly. A lot of code was duplicated in the functions that build bonding opts for the various bonding modes. This duplicated code now resides in helper functions. In addition, for the bonding types that have required configuration parameters, the error messages now actually make sense and lead you to what is missing.

  3. Fix shadowing of type in network.managed state by pulling it out of the kwargs. In addition, type now defaults to eth if no type is provided. If you just want to disable an interface, you shouldn't be required to also include an interface type.

  4. Documentation for the network.managed state has been completely overhauled, with explanations for the different interface types, instead of just one large code-block with example states and zero explanation/context.

  5. Various weirdness introduced by blackening of the codebase has been cleaned up.

  6. Interface templates for EL5 and EL6 removed, as Salt is no longer supported on these platforms.

  7. Usage of six has been retired, as it is no longer needed.

  8. Unit tests for each bond mode, as well as the new team/teamport interface types, have been added.

  9. ip.get_bond and ip.build_bond removed from rh_ip since no supported release needs them.

This makes the following changes:

1. Network teaming support added via two new interface types: `team` and
   `teamport`.

2. Bond support cleaned up significantly. A lot of code was duplicated
   in the functions that build bonding opts for the various bonding
   modes. This duplicated code now resides in helper functions. In
   addition, for the bonding types that have required configuration
   parameters, the error messages now actually make sense and lead you
   to what is missing.

3. Fix shadowing of `type` in `network.managed` state by pulling it out
   of the kwargs. In addition, `type` now defaults to `eth` if no type
   is provided. If you just want to disable an interface, you shouldn't
   be required to also include an interface type.

4. Documentation for the `network.managed` state has been completely
   overhauled, with explanations for the different interface types,
   instead of just one large code-block with example states and zero
   explanation/context.

5. Various weirdness introduced by blackening of the codebase has been
   cleaned up.

6. Interface templates for EL5 and EL6 removed, as Salt is no longer
   supported on these platforms.

7. Usage of six has been retired, as it is no longer needed.

8. Unit tests for each bond mode, as well as the new team/teamport
   interfaces types, have been added.
@terminalmage terminalmage requested a review from a team as a code owner June 24, 2020 00:12
@ghost ghost requested review from DmitryKuzmenko and removed request for a team June 24, 2020 00:13
@ixs
Copy link
Contributor

ixs commented Jun 24, 2020

Hahahahaha. I just pushed #57776 which is adding teaming support. 🤦🏻‍♂️

Small thing I noticed though on your change: you're using DEVICETYPE as well, which is sensible for better compatibility.
DEVICETYPE is not team specific though, but can be used by other interface types as well. So removing the team prefix might make sense.

@terminalmage
Copy link
Contributor Author

terminalmage commented Jun 24, 2020

@ixs Haha, great timing! 😆

The reason that I am using both DEVICETYPE and TYPE for team interfaces is that I noticed that my team interfaces would not work properly when DEVICETYPE was omitted.

I did not consider trying to leave out TYPE.

@ixs
Copy link
Contributor

ixs commented Jun 24, 2020

Have a look at https://unix.stackexchange.com/a/157252 which explains what is going on better than I could.
Basically it comes down to if you use DEVICETYPE an RPM can ship an ifcfg-script that can extend the base scripts.
If you use TYPE support needs to be baked into the base scripts.
That's why I'm saying DEVICETYPE is not teaming specific.

When you use DEVICETYPE, TYPE should be omitted to prevent the wrong scripts from running.

On another note: I noticed you're deleting the rh5 and rh6 templates. Sure that's a good idea? That would mean no support for managing network interfaces on those OS releases. rhel5 might make sense but rhel6 is still supported by RH for at least half a year. I do not believe we should be breaking that.
And if that is really the right thing to do, you should also remove the references to rh_major for 6 inside the code.

@terminalmage
Copy link
Contributor Author

terminalmage commented Jun 24, 2020

Have a look at https://unix.stackexchange.com/a/157252 which explains what is going on better than I could.
Basically it comes down to if you use DEVICETYPE an RPM can ship an ifcfg-script that can extend the base scripts.
If you use TYPE support needs to be baked into the base scripts.
That's why I'm saying DEVICETYPE is not teaming specific.

When you use DEVICETYPE, TYPE should be omitted to prevent the wrong scripts from running.

OK, I will get this changed on Wednesday.

On another note: I noticed you're deleting the rh5 and rh6 templates. Sure that's a good idea? That would mean no support for managing network interfaces on those OS releases. rhel5 might make sense but rhel6 is still supported by RH for at least half a year. I do not believe we should be breaking that.

As noted in the OP, Salt version 3001 does not support EL6 (actually now that I look, I did not specifically mention 3001), so there is no point in keeping these templates.

And if that is really the right thing to do, you should also remove the references to rh_major for 6 inside the code.

It looks like the references there were a bit out-of-date, and should reflect newer Fedora using an rh_major of 8. Thanks for the heads-up, I've just pushed a commit to take care of this.

salt/modules/rh_ip.py Outdated Show resolved Hide resolved
@ixs
Copy link
Contributor

ixs commented Jun 24, 2020

Okay, I had a quick look at the code, teaming support looks good. You solved it basically the same way I did with the only difference that I did a |tojson in the template and you do json.dumps() in the module. 😂

There's only one small issue at the hwaddr/addr_auto step where it should only do this for the team master and not for the team members.
Other than that and the already known DEVICETYPE issue, this looks good to me.

Did not review the bonding code as I do not use it.

@terminalmage
Copy link
Contributor Author

@ixs Can you have another look when you get a moment?

@terminalmage
Copy link
Contributor Author

@ixs I saw your review comment but it seems to be gone now. I'm working on a change to address that.

@terminalmage
Copy link
Contributor Author

OK, just pushed a change to warn when "addr" is used to set the interface's hwaddr. I checked the other ip modules and none of them use "addr" for anything.

@ixs
Copy link
Contributor

ixs commented Jun 24, 2020

I wouldn't worry about addr. I deleted it when I noticed that nobody ever used addr and every sls uses hwaddr.

@terminalmage
Copy link
Contributor Author

Ah, OK... well, probably still doesn't hurt to have the warning there.

@terminalmage
Copy link
Contributor Author

@ixs Thanks a lot for your help!

@ixs
Copy link
Contributor

ixs commented Jun 24, 2020

Did have another look. Looks good. I initially thought the addr -> hwaddr change would break existing SLS files, but I was mistaken.

The team change looks good, DEVICETYPE being a general thing now is good, the hwaddr logic fix looks good.
I'll had a quick look at the bonding part to see if I can make sense of it. But if I think about bonding and the fact that we are getting rid of RHEL5 and RHEL6 support, I am wondering if it wouldn't make sense to simplify the bonding config completely:

RHEL4 was the last OS where the bonding options go into the modprobe.d config file.

Since RHEL5 there's a ifcfg file parameter called BONDING_OPTS that gets all the kernel config parameters. Which saltstack already supports.

  • So maybe we should just get rid of get_bond() and build_bond()? That would mean some simplification?
  • _RH_CONFIG_BONDING_OPTS seems to be unused. Can we get rid of it? Am I missing something? If not, we could get rid of the special handler in
    # Setup up bond modprobe script if required

Otherwise I'll have a closer look at the bonding changes. Quick glance looks good, but I need to read closer to get the logic right.

@ixs
Copy link
Contributor

ixs commented Jun 24, 2020

Ohh, another note. It might make sense for you to review #57772 and land that in master before applying your change.

That's a one line fix for a KeyError cascade in rh_ip. Spotted that yesterday while doing my work for teaming.

These are redundant and no longer necessary. The `network.managed`
state has been modified to only try to build modprobe configurations if
the functions to do so exist, allowing other platforms to continue to
work as before.
@terminalmage
Copy link
Contributor Author

terminalmage commented Jun 24, 2020

@ixs You're right about get_bond and build_bond. In fact, salt was configuring the bonding opts in both places (modprobe configs and ifcfg-XXXXX files). I removed these functions and modified the network.managed state so that it only tries to build the modprobe configs if there are functions in the __salt__ dunder to do so. This will allow the other two supported platforms to continue to use this method until we can determine whether or not that functionality can be removed.

I also pulled in your fix from #57772, and I added a release notes file for the upcoming version 3002 which mentions the function removal as well as the new network teaming support.

@ixs
Copy link
Contributor

ixs commented Jun 24, 2020

Nice. Looks good. Of course, now my commit stats to salt look so much worse. 😂

I'm gonna have a look over the current change in a bit with a focus on the reworked bonding now.

@terminalmage
Copy link
Contributor Author

Haha, sorry! I've been working on this for a few weeks and I did mention it in the community slack's #develop channel. That said, you came up with a lot that I missed, so thanks very much for that.

Copy link
Contributor

@dwoz dwoz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks really good. Is it possible to add an integration test in addition to the unit tests for this?

@@ -808,7 +722,7 @@ def _parse_settings_eth(opts, iface_type, enabled, iface):
result["enable_ipv6"] = opts["enable_ipv6"]

valid = _CONFIG_TRUE + _CONFIG_FALSE
for opt in [
for opt in (
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious why you are changing this from a list to tuple here but in other places are changing these kinds of checks from tuples to lists? Would it be better to standardize on tuples if they don't need to be mutable?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think any of my changes involved changing tuples to lists, did I miss anything?

@terminalmage
Copy link
Contributor Author

@dwoz I felt that unit tests were better here for a few reasons:

  1. Integration tests would alter networking on the test VMs, which if done incorrectly could mess with the test suite.
  2. The existing integration tests for network.managed don't do anything but run the states with test=True and check the state return dictionary. This doesn't really give you any insight into what would be done, nor does it build the interface configuration.
  3. Unit tests, on the other hand, allowed me to run the underlying ip.build_interface function that creates the network configuration file, using test=True to return the rendered template as a string instead of writing it to disk, and then craft asserts against the expected contents of those files.

That said, I'm 100% open to adding integration tests, if you have any ideas on what pieces would be feasible to test.

DmitryKuzmenko
DmitryKuzmenko previously approved these changes Jun 25, 2020
@ixs
Copy link
Contributor

ixs commented Jun 25, 2020

Haha, sorry! I've been working on this for a few weeks and I did mention it in the community slack's #develop channel.

Wait? you worked on that for a few weeks? Sorry man, I hacked mine up in half an evening. No wonder your's better and has tests. 😀
I never understand how these mocked tests work in the salt test suite. If you ever wanna do a writeup about "How to write good unit tests for Salt contributors" I'd be the first one to sing your praise. :-)

Anyway, I had a final read over rh_ip.py looking at the bonding stuff. Looks good, tests finish, commit and ship it!

@terminalmage
Copy link
Contributor Author

terminalmage commented Jun 25, 2020

Haha, sorry! I've been working on this for a few weeks and I did mention it in the community slack's #develop channel.

Wait? you worked on that for a few weeks? Sorry man, I hacked mine up in half an evening. No wonder your's better and has tests. 😀

Well it wasn't the only thing I worked on during that time 😄

I never understand how these mocked tests work in the salt test suite. If you ever wanna do a writeup about "How to write good unit tests for Salt contributors" I'd be the first one to sing your praise. :-)

One of the core devs streams testing stuff on Twitch, if you have Google Calendar you can add the SaltStack calendar via this link.

Mostly I just learned by looking at tests to see what other people already did. Also, this article is a good resource for learning how to use mocking in unit tests.

Anyway, I had a final read over rh_ip.py looking at the bonding stuff. Looks good, tests finish, commit and ship it!

👍

ixs
ixs previously approved these changes Jun 25, 2020
@DmitryKuzmenko
Copy link
Contributor

@terminalmage thank you for still to be here! =)

@terminalmage
Copy link
Contributor Author

Just added a changelog file.

Copy link
Contributor

@ixs ixs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should probably also mention Fedora? Or maybe just say RH family of operating systems?

@terminalmage
Copy link
Contributor Author

@ixs How does it look now?

@dwoz dwoz merged commit b2bf8ef into saltstack:master Jul 30, 2020
@sagetherage sagetherage added the Magnesium Mg release after Na prior to Al label Sep 14, 2020
@terminalmage terminalmage deleted the network-teaming branch February 12, 2024 23:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Magnesium Mg release after Na prior to Al Tests-Passed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants