
Debian networking fix #38043

Merged · 16 commits into saltstack:2016.11 on Dec 7, 2016

Conversation

@MTecknology

Resolves #38042; regression introduced in 2016.11.0.

Not all interfaces will have an address. Bug #38042
I wouldn't consider this a complete solution, for the reason left in the comment, but it /does/ at least help.
After looking at this a bit more closely, I believe "interface.address" was meant to be "interface.addrfam", which /does/ make sense on this line.

Because of the previous notion that all interfaces will have an address, it seems a second line was created that assumed the only scenario where an iface won't have an address is when dhcp is used. Again, not an accurate assumption. I believe this /should/ handle most expected situations.
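
A minimal sketch in Python (hypothetical names, not the actual template logic) of the safer check being described: branch on the address family, and only emit an address line when one is actually present, since dhcp is not the only addressless case.

def render_iface(name, iface):
    # iface['addrfam'] is 'inet' or 'inet6'; iface['proto'] may be
    # 'static', 'dhcp', 'manual', 'loopback', ...
    lines = ['iface {0} {1} {2}'.format(name, iface['addrfam'], iface['proto'])]
    if 'address' in iface:  # not all interfaces carry an address
        lines.append('    address {0}'.format(iface['address']))
    return '\n'.join(lines)
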
@cachedout

I would like to see if @jbfriedrich could look at this, since it appears he was in this code over in #33622.

@MTecknology

Heheh... I have no qualms thoroughly debating every line of this template! :D

I see #33622 seemed to mostly focus on inet vs. inet6. I believe that should be handled by this chunk.

{%- if data.data['inet'] -%}
  [...]
{%- endif%}
{%- if data.data['inet6'] -%}
  [...]
{%- endif%}

If that doesn't prevent two inet lines from being added, we should probably focus on these lines and not on individual lines within.

I also noticed this line is commented out. I'm curious why (looking up git history on it).

{# if interface.hwaddr %}    hwaddress {{interface.hwaddr}}
{%endif#}

@cachedout

@MTecknology No worries! I just like to bring as many voices in as I can. :] Thanks!

@MTecknology

I see my commit is failing on this test.

https://github.com/MTecknology/salt/blob/2016.11/tests/unit/modules/debian_ip_test.py#L152

I'm not sure if the lo interface was created prior to this test running. If so, it would explain the unexpected extra line. It's also possible this was from a change in defaults (but seems less likely).

I'm planning to update the test to check for both valid conditions.
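
A rough sketch of what that could look like (test and variable names are hypothetical): accept either valid rendering rather than hard-coding a single expected string.

import unittest

class DebianIpTestCase(unittest.TestCase):
    def test_build_interface_loopback(self):
        result = 'auto lo\niface lo inet loopback\n'  # stand-in for real output
        accepted = (
            'auto lo\niface lo inet loopback\n',  # auto line present
            'iface lo inet loopback\n',           # auto line stripped
        )
        self.assertIn(result, accepted)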

Added a check to make sure inet6 iface lines have the variables needed to prevent a render error.

Also added some whitespace and jinja comment blocks to more easily identify what's going on in this file.
Tests were failing because _parse_interfaces is returning the correct data for the mocked input; the assertion was expecting the iface line to have been stripped.
Commit b2fa638 added a lot of extra options. One of these lines (hwaddr) was commented out when it was added. It appears this was a typo, because the documentation indicates this line /should/ show up in the interfaces file.
@jbfriedrich

If I remember correctly, my issue at the time was that the config generator always produced an "inet" object, and therefore the additional empty inet6 line was always generated. I didn't keep track of development in these files; I just saw that a lot was backported from develop before 2016.11 was released. But I'll be happy to apply the PR locally and see if it doesn't re-introduce the issue I had. From what I can see so far, it looks great. The sooner we can merge this, the sooner I can ditch my own version and use stock 2016.11 again. 😃

@cachedout

@jbfriedrich I appreciate the quick response. If you wouldn't mind testing this locally so we can be sure we don't regress then I'll be happy to go ahead and get this in once @MTecknology is ready and the tests pass. Thanks so much! I really appreciate it.

@cachedout changed the title from "2016.11" to "Debian networking fix" on Dec 2, 2016
@MTecknology

Man, I didn't expect to end up spending this much time on it. Now is probably an excellent time to mention that we can thank Suitable Technologies for sponsoring my work on this!

I see that the unit tests are finally completing successfully. This is great, but I'm a bit concerned by the simplicity of the current tests. I believe they are now all excellent and valid tests, but I'd like to add two more tests that check things a bit more.

I'm in the process of writing those tests now and hoping they'll be relatively easy and just copy/paste/tweak.

@cachedout

@MTecknology We appreciate your hard work here! Thanks also to your employer for supporting open-source. Awesome!

@jbfriedrich

👍 Kudos for your hard work, and to your employer for supporting Salt and open-source! Seriously, I wish I was allowed to spend as much time on this as necessary at work.

@MTecknology

.... Nooooo!!!

auto lo
iface lo inet loopback

auto eth2
iface eth2 inet static
        address 10.48.7.20
        netmask 255.255.255.0
        gateway 10.48.7.1

auto bond0
iface bond0 inet manual
        slaves eth0 eth1
        bond_miimon 100
        bond_mode 802.3ad
        bond_xmit_hash_policy layer3+4

auto vmbr0
iface vmbr0 inet static
        address  10.48.0.20
        netmask  255.255.255.0
        gateway  10.48.0.1
        bridge_ports bond0
        bridge_stp off
        bridge_fd 0

I'm noticing that the slaves line is not read. :'(

This /should/ be possible, but will be stripped.

iface bond0 inet manual
        slaves eth0 eth1
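
A hedged sketch of one way to stop losing such lines: retain every per-iface attribute during parsing instead of whitelisting known keys (helper name is made up).

def parse_iface_attrs(lines):
    attrs = {}
    for line in lines:
        key, _, value = line.strip().partition(' ')
        if key:
            attrs[key] = value  # unknown keys such as 'slaves' survive
    return attrs

# parse_iface_attrs(['  slaves eth0 eth1', '  bond_miimon 100'])
# -> {'slaves': 'eth0 eth1', 'bond_miimon': '100'}
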
@MTecknology

I'm struggling to get this unit test figured out. I did a push to share, but this is definitely gonna fail.

If anyone is feeling particularly teachy today... I'm on IRC and more than eager to learn!

Sorry, I just don't have the time to learn all of this right now.
@MTecknology

Sorry, I just don't have the time to learn/write unit tests right now. I rolled that change back and I'm gonna offer this PR up for review.

@MTecknology

I can't reproduce that failed test locally and it doesn't look like it has anything to do with any of my changes. :S

@jbfriedrich

I just applied this PR to my 2016.11 version and tested it with my IPv4 and IPv6 configuration. Unfortunately, it re-introduces the old behaviour I tried to fix in #33622. I ran the following state file:

eth0:
  network.managed:
    - type: eth
    - enable_ipv6: False
    - proto: none
    - ipaddr: {{ pillar['ipv4subnet']['ip3'] }}
    - netmask: {{ pillar['ipv4subnet']['ntmsk'] }}
    - gateway: {{ pillar['ipv4subnet']['rtr'] }}
    - noifupdown: True
    - dns:
      - 8.8.8.8
      - 8.8.4.4

eth1:
  network.managed:
    - type: eth
    - enable_ipv6: True
    - ipv6proto: static
    - ipv6ipaddr: 2a01:xxx:xxx:xxxx::3
    - ipv6netmask: 64
    - ipv6gateway: 2a01:xxx:xxx:xxxx::1

eth2:
  network.managed:
    - type: eth
    - enable_ipv6: False
    - proto: none
    - ipaddr: 192.168.17.8
    - netmask: 255.255.255.0
    - noifupdown: True

eth2_routes:
  network.routes:
    - name: eth2
    - routes:
      - name: vpn
        ipaddr: 10.8.0.0
        netmask: 255.255.0.0
        gateway: 192.168.17.1

and ended up with the following network configuration file:

auto lo
iface lo inet loopback
auto eth0
iface eth0 inet static
    address 138.xxx.xx.xx5
    netmask 255.255.255.248
    gateway 138.xxx.xx.xx3
    dns-nameservers 8.8.8.8 8.8.4.4
auto eth1
iface eth1 inet static
iface eth1 inet6 static
    address 2a01:xxx:xxx:xxxx::3
    netmask 64
    gateway 2a01:xxx:xxx:xxxx::1
auto eth2
iface eth2 inet static
    address 192.168.17.8
    netmask 255.255.255.0

As you can see, eth1 has both an inet and an inet6 line, even though I only applied an IPv6 address to the interface. I will take another look at the config generator and the Jinja template later and hope I will have an epiphany.

@MTecknology

@jbfriedrich It's late at night and I'm no longer sane; perhaps you could sanity-check this one for me? :)

@jbfriedrich

Tested. Unfortunately, it's still the same: if I assign only an IPv6 address to the interface, I end up with both an inet and an inet6 line in the configuration.

@MTecknology

At this point, I believe the issues with the template have been resolved.

I started stepping through this process from start to end with pudb to track down where things go awry, and I think I found it. This template expects to create individual blocks for every ethernet device, including a separate block for IPv4 and IPv6. To my knowledge, this is the correct behavior.

However, because of that behavior, the template also assumes it will either create an IPv4 OR an IPv6 configuration block. This seems like a logical choice by the template designer, but the assumption is misleading...

In the file modules/debian_ip.py, at approx. line 1200, there is a function _parse_settings_eth(), which builds the data passed to the template. I've verified the correct data structures/values are being passed to this function. However!!! Looking at just this function alone, we can see that it will ALWAYS return an "inet" dictionary. Because these "inet" bits are always added, the template thinks that the interface always has IPv4 settings, even if it's meant to be an IPv6 block.

So!!! I'm going to step away from this for a few hours and come back to the logic in this function.
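
For reference, a minimal sketch (not the actual Salt code) of the gating I have in mind: only build the 'inet' dictionary when IPv4 settings are in play, so a pure-IPv6 interface no longer grows an empty inet block.

def _parse_settings_eth_sketch(opts):
    iface_data = {}
    if 'ipaddr' in opts or not opts.get('enable_ipv6'):
        iface_data['inet'] = {'addrfam': 'inet'}
        # ... copy ipaddr/netmask/gateway etc. from opts ...
    if opts.get('enable_ipv6'):
        iface_data['inet6'] = {'addrfam': 'inet6'}
        # ... copy ipv6ipaddr/ipv6netmask/ipv6gateway from opts ...
    return iface_data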

@jbfriedrich

Yes, exactly what I found last time. I just went and retraced my steps and compared my notes from back then. I didn't feel comfortable enough to change the logic of the _parse_settings_eth function then, and I don't feel that confident now either 🙁. I'll take a look nonetheless and will be happy to test. Thanks again for spending so much time on this!

@MTecknology

By the logic of this file, it's impossible for an IPv6 interface to ever be used for a bond, vlan, slave, etc. type of interface. I have an idea... wish me luck!

@MTecknology

/Should/ it be valid to specify both ipaddr AND ipv6addr in the same state?

@jbfriedrich

jbfriedrich commented Dec 5, 2016

@MTecknology If you change line 1224 in debian_ip.py from

iface_data['inet']['addrfam'] = 'inet'

to

if opts['enable_ipv6'] == False:
    iface_data['inet']['addrfam'] = 'inet'

it seems to work for my test cases. So far I am only getting correctly generated configurations.

@MTecknology

MTecknology commented Dec 5, 2016

I really hope it's not as simple as that, because I'm making a mess out of this thing. I've been testing with the following states; feel free to prod me (MTecknology) on Freenode if you're interested!

# IPv4 Only
eth0:
  network.managed:
    - type: eth   
    - enable_ipv6: False
    - proto: none 
    - ipaddr: 192.168.4.9
    - netmask: 255.255.255.0
    - gateway: 192.168.4.1
    - noifupdown: True
    - dns:
      - 8.8.8.8   
      - 8.8.4.4   

# IPv6 Only
eth1:
  network.managed:
    - type: eth   
    - enable_ipv6: True
    - ipv6proto: static
    - ipv6ipaddr: 2001:db8:dead:beef::3
    - ipv6netmask: 64
    - ipv6gateway: 2001:db8:dead:beef::1
    - noifupdown: True

# IPv4 and IPv6   
eth2:
  network.managed:
    - type: eth   
    - noifupdown: True

    # IPv4
    - proto: static
    - ipaddr: 192.168.4.9
    - netmask: 255.255.255.0
    - gateway: 192.168.4.1
    - enable_ipv6: True

    # IPv6
    - ipv6proto: static
    - ipv6ipaddr: 2001:db8:dead:c0::3
    - ipv6netmask: 64
    - ipv6gateway: 2001:db8:dead:c0::1
    # override shared; makes those options v4-only
    - ipv6ttl: 15 

    # Shared
    - mtu: 1480   
    - ttl: 18
    - dns:
      - 8.8.8.8   
      - 8.8.4.4   

# Interface with routes
eth3:
  network.managed:
    - type: eth   
    - enable_ipv6: False
    - proto: none 
    - ipaddr: 192.168.17.8
    - netmask: 255.255.255.0
    - noifupdown: True

eth3_routes:
  network.routes: 
    - name: eth3  
    - routes:
      - name: vpn 
        ipaddr: 10.8.0.0
        netmask: 255.255.0.0
        gateway: 192.168.17.1
    - noifupdown: True

# Slave interfaces
eth4:
  network.managed:
    - enabled: True
    - type: slave 
    - master: bond0
    - noifupdown: True

eth5:
  network.managed:
    - enabled: True
    - type: slave 
    - master: bond0
    - noifupdown: True

eth6:
  network.managed:
    - enabled: True
    - type: slave 
    - master: bond1
    - noifupdown: True

eth7:
  network.managed:
    - enabled: True
    - type: slave 
    - master: bond1
    - noifupdown: True

# Bonded interface with IPv4 and IPv6
bond0:
  network.managed:
    - noifupdown: True
    - type: bond

    - mode: 802.3ad
    - proto: static
    - slaves: eth4 eth5

    # IPv4
    - ipaddr: 10.1.0.1
    - netmask: 255.255.255.0
    - enable_ipv6: False

    # IPv6
    - ipv6proto: static
    - ipv6ipaddr: 2001:db8:dead:c0::3
    - ipv6netmask: 64
    - ipv6gateway: 2001:db8:dead:c0::1

bond0.3:
  network.managed:
    - noifupdown: True
    - type: vlan  
    - ipaddr: 10.1.0.3
    - use:
      - network: bond0
    - require:
      - network: bond0

# Bonded interface without IP addresses
bond1:
  network.managed:
    - noifupdown: True
    - type: bond
    - mode: 802.3ad
    - proto: static
    - slaves: eth6 eth7

bond1.4:
  network.managed:
    - noifupdown: True
    - type: vlan  
    - ipaddr: 10.4.0.8
    - use:
      - network: bond1
    - require:
      - network: bond1

@jbfriedrich

I made a little mess in my local files as well, so I would like to test some more before I say that the easy solution above works for all configurations 😄. Thanks for the config; I will use it for testing as well. It is 2:44am here (CET/GMT+1) and I have to work tomorrow, but I will ping you on IRC after work.

This is a bit of a refactor that may need to come with a documentation
update as well. It appears our previous authors made a few assumptions
for the sake of simplicity. Thankfully, they were nice enough to write
the rest of the system to be flexible enough for me to wedge this in.

I rewrote this so that it can handle an interface having v4-only,
v6-only, or v4-and-v6 addresses assigned. It's possible some options I
copied into the V6IF block are not valid inet6 options, but these can be
removed as needed.

An IPv6 interface can have any number of IPv6 addresses attached to it;
this will also need to be included.
@MTecknology

That last commit came with dragons. However, it seems to resolve a lot of bugs that I came across while reviewing the logic. I'm worried your patch would produce errors if someone tried to set up a {bond,vlan,pppoe,slave,bridge} device without having IPv4 settings set.

The last commit attempts to allow a person to have any of ipv4-only / ipv6-only / ipv4-and-ipv6 addresses assigned to an interface. The "ipv6"-prefixed options (e.g. ipv6addr/ipv6netmask) take precedence in the inet6 block, so when a v4 version of an option is encountered, it acts as the default and is overridden by the v6 option.
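
A hedged sketch of that precedence rule (helper name is made up): shared options seed the inet6 block, then any 'ipv6'-prefixed option overrides its v4 default.

def build_inet6_opts(opts):
    shared = ('mtu', 'ttl', 'netmask', 'gateway')
    inet6 = dict((k, opts[k]) for k in shared if k in opts)
    for key, value in opts.items():
        if key.startswith('ipv6'):
            inet6[key[len('ipv6'):]] = value  # e.g. ipv6ttl overrides ttl
    return inet6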

I just realized my testing isn't including any bonded interfaces. I'll update my previous post with a better example.

@MTecknology

We may want to consider dropping support for ipaddrs and ipv6addrs. I noticed some of the configurations suggested in the docs [1] aren't valid and wouldn't yield a valid networking configuration file. We should correct the documentation with simplified, tested, and useful examples.

[1] https://docs.saltstack.com/en/latest/ref/states/all/salt.states.network.html

Also removed the merge between IPv4 and IPv6, which didn't seem to work
correctly in the first place. This removes the inaccurate assumption that
all IPv4 options are valid IPv6 options.
@MTecknology

I know this patch can be cleaned up by someone who knows Python better than I do, but I think the logic is finally correct. I attempted to remove the previous assumptions and bugs without modifying functionality.

To test, I removed /etc/network/interfaces and ran a highstate using the big blob above, which brought back the whole file. After the correct file was generated, I modified lines and added another pretend interface to the file. The subsequent highstate left my manually added interface alone and corrected the other lines.

I welcome testing by others!! :D

@cachedout

@jbfriedrich How are you feeling about this one as it sits?

@jbfriedrich

@cachedout From all my tests, I can say it works pretty well. I had one or two test cases where it took quite some time for the state execution to finish, but they produced valid configs. That long runtime could be due to interface timeouts in my VMs (it seems that adding vNICs after deployment sometimes gets my interfaces shuffled around), so I will build a new VM with all vNICs built in directly at deployment. But I am good with it. I appreciate all the hard work @MTecknology has put into this!

@MTecknology

I also had issues with very long delays. They only occurred when I was trying to manage non-existent interfaces without including - noifupdown: True. At first I thought it was a bug; then I realized I have no idea.

Is there ever a situation where ifup would succeed if the interface is currently unavailable? If so, this behavior should remain. If not, then there is another bug further down the line, unrelated to this PR, that could easily be addressed by doing the same as - noifupdown: True when the interface doesn't exist. If we can be sure /sys is always present, then checking the interface against glob('/sys/class/net/*') seems easy. Heck, even if it's not present, os.path.exists() has us covered.
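
A quick sketch of that check (outside this PR's scope; the helper name is made up):

import os.path

def interface_exists(iface):
    # sysfs exposes network devices under /sys/class/net/<name>
    return os.path.exists(os.path.join('/sys/class/net', iface))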

Anyway... it's definitely outside the scope of this PR!

@cachedout

I'm going to take from the comments above that this is ready to go and if there are lingering issues, we can solve them in follow-up PRs. I'll go ahead and merge this.

Thanks, @MTecknology for your really hard work here and to @jbfriedrich for your quick feedback. We're really grateful!

@cachedout merged commit fd06bab into saltstack:2016.11 on Dec 7, 2016
@MTecknology

👍 What's it take to also push this to develop so we don't lose the changes in 2017.x.0? I imagine 2016.11 merges don't automatically make it to develop, or do they?

@gtmanfred

gtmanfred commented Dec 7, 2016 via email
