Fedora Networking Issues (Still) #1997

Closed
nelz9999 opened this Issue Jul 27, 2013 · 23 comments

10 participants

@nelz9999

I had hoped that #1738 as released in 1.2.5 was going to fix this issue for me, but it hasn't.

I'm on:
vagrant - 1.2.5
virtualbox - 4.2.16
operating system - OS X 10.8.4

Here's the error:


The following SSH command responded with a non-zero exit status.
Vagrant assumes that this means the command failed!

/sbin/ifup p7p1 2> /dev/null

Stdout from the command:

ERROR : [/etc/sysconfig/network-scripts/ifup-eth] Device eth1 does not seem to be present, delaying initialization.

Stderr from the command:


Here's more from the cmd-line after the failure:

nelz-air:fedora nelz$ vagrant ssh
Last login: Sun Mar 17 21:11:02 2013 from 10.0.2.2
[vagrant@localhost ~]$ sudo /sbin/ifup p7p1
ERROR : [/etc/sysconfig/network-scripts/ifup-eth] Device eth1 does not seem to be present, delaying initialization.
[vagrant@localhost ~]$ sudo /sbin/ifup eth1
ERROR : [/etc/sysconfig/network-scripts/ifup-eth] Device eth1 does not seem to be present, delaying initialization.

Here's the Gist with my debugging output and my Vagrantfile: https://gist.github.com/nelz9999/6096052

I've been trying to do my due diligence with this report, but I'm new to the Ops-y side of things, coming from a long history of DEV-side. Let me know if there's more info I can provide.

@nelz9999

New Info!

So I get to work today, and tell my coworker about this bug report. And just to verify my work he downloads the .box file and tries it. AND IT WORKED! (angry face)

The differences between his environment and mine: he has 0 plugins whereas I have 2 (vagrant-berkshelf and vagrant-cachier); and he's on 1.2.2 whereas I am on 1.2.5.

So, I rip out my plugins, and it's still failing for me.

So, I go back to 1.2.2 and LO AND BEHOLD: It works!

So, I start iterating through the releases:
1.2.2 - Success
1.2.3 - Success
1.2.4 - Success
1.2.5 - Failure
1.2.6 - Failure
1.2.7 - Failure

So, the workaround for me is to go back to 1.2.4.

I have updated my gist with new debug-level output. (The striking difference between the two runs: 2.5k lines of output for the 1.2.4 success versus 6.0k lines for the failure.)

@Sharpie
Contributor
Sharpie commented Aug 14, 2013

Something is screwy. After #1738, ifup should not be referencing p7p interfaces and should instead be using eth.

@mitchellh
Owner

This should be fixed with #1738, let me know if it isn't. I don't think it is in a release yet, but will be soon.

@mitchellh mitchellh closed this Aug 29, 2013

@paul-krohn
Contributor

Can this possibly be re-opened? I'm the co-worker of Nelz9999.

Systemd has implemented something called "Predictable Network Interface Names", the intent being that when hardware is added or removed, the names will remain the same. This means you get Ethernet device names like "enp0s3" instead of "eth0". It's pretty well documented here: http://www.freedesktop.org/wiki/Software/systemd/PredictableNetworkInterfaceNames/

The workaround is pretty simple:
ln -s /dev/null /etc/udev/rules.d/80-net-name-slot.rules

The presence of this symlink overrides the "predictable" names and uses the old-school "ethN".

Critically, this needs to be done in the packaged up box. Also, this issue only appears when you have more than one interface set up. So the guidance is, set up your box/image this way, and you are all set.
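
Since the override has to live inside the packaged box, a box builder may want to confirm it is present before packaging. A minimal sketch, assuming the rules file name from the workaround above; `predictable_names_masked?` is a hypothetical helper, not part of Vagrant:

```ruby
# Hypothetical helper (not part of Vagrant): returns true when the udev rules
# file named in the workaround is a symlink pointing at /dev/null.
def predictable_names_masked?(rules_dir = "/etc/udev/rules.d")
  path = File.join(rules_dir, "80-net-name-slot.rules")
  File.symlink?(path) && File.readlink(path) == "/dev/null"
end
```

Run inside the guest with no argument, it checks the default /etc/udev/rules.d directory.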

At some point, it would be great to detect and notify Vagrant users of this circumstance, or be able to handle it gracefully.

Cheers,

Paul

@mitchellh
Owner

@paul-krohn I'd be willing to look at a PR for such functionality, but can't pursue this personally at the moment. For guest OSes I mostly depend on community contributions. This issue comment helps a lot though. Thanks!

@ryansb
Contributor
ryansb commented Jan 2, 2014

This should be fixed by #2742

@rdefauw
rdefauw commented Feb 19, 2014

It looks like 8a3d7aa reverted the fixes for Fedora and the private networking issue is cropping up again.

@dguaraglia

@rdefauw is right, Vagrant 1.5.0 breaks on our Fedora 18 box. What's the ideal solution for this?

@mitchellh
Owner

Oy, that's annoying. We need to figure out a good solution to this, because it always ping-pongs. We need to detect whether it's eth or p7p.

@dguaraglia

It seems to be an issue with the way the Fedora boxes are configured. Some of them implement "Predictable Network Interface Names" (let's settle on PNIN for short), others don't. The base box we used happens to have PNIN disabled, maybe because back in the day Vagrant always assumed eth0.

I could try to re-enable PNIN on my box (even though it seems to be disabled using a non-standard method), but a more permanent solution would be to check whether PNIN is enabled on the box before trying to bring the network up. Maybe by checking the net.ifnames kernel setting, or whether the biosdevname parameter is set on boot (I'm trying to figure out how to check that without parsing the grub.cfg file).
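
One way to check boot parameters without parsing grub.cfg is to read /proc/cmdline, which holds the command line the running kernel was booted with. A rough sketch, not a tested patch: `kernel_params` and `naming_overrides` are hypothetical names, and the code only looks at the net.ifnames and biosdevname parameters mentioned above.

```ruby
# Hypothetical helper: turns a kernel command line into a Hash.
# Naive whitespace split; quoted values containing spaces are not handled.
def kernel_params(cmdline)
  cmdline.split.each_with_object({}) do |token, params|
    key, value = token.split("=", 2)
    params[key] = value # value is nil for bare flags such as "ro"
  end
end

# Reports which naming schemes were explicitly switched off at boot.
def naming_overrides(cmdline)
  params = kernel_params(cmdline)
  {
    systemd_names_disabled: params["net.ifnames"] == "0",
    biosdevname_disabled: params["biosdevname"] == "0",
  }
end
```

On the guest this would be called as `naming_overrides(File.read("/proc/cmdline"))`. Both values being false does not prove PNIN is active, since naming can also be disabled by other means, such as a udev rules override.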

I'd be happy to contribute a patch if I find a consistent solution.

@ryansb
Contributor
ryansb commented Mar 11, 2014

This may not be the best approach, but checking the exit status of ip link show dev eth0 could work.

Cases:

  • the ip command does not exist: use eth0, because a system that predates the ip command won't have a modern systemd/udev and will use the old naming scheme
  • ip link show dev eth0 exits 1: eth0 does not exist, so we can be confident that PNIN is enabled
  • ip link show dev eth0 exits 0: eth0 exists, so it's safe to use

I can write up a patch for this tonight if it sounds like the way we want to go.
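
The three cases above can be sketched roughly as follows. This is an illustration of the proposal, not the patch itself: `naming_scheme` and `probe_eth0` are hypothetical names, and the probe runs the command locally, whereas Vagrant would have to run it on the guest over SSH.

```ruby
# Maps the outcome of `ip link show dev eth0` to a naming scheme.
# exit_status is nil when the ip command could not be run at all.
def naming_scheme(exit_status)
  case exit_status
  when nil then :legacy      # no ip command: old system, old ethN names
  when 0   then :legacy      # eth0 exists, so it is safe to use
  else          :predictable # eth0 is missing: assume PNIN is enabled
  end
end

# Local probe for illustration only. Kernel#system returns nil when the
# command cannot be executed, for example when ip is not installed.
def probe_eth0
  ran = system("ip", "link", "show", "dev", "eth0",
               out: File::NULL, err: File::NULL)
  ran.nil? ? nil : $?.exitstatus
end
```

Calling `naming_scheme(probe_eth0)` then gives `:legacy` or `:predictable` for the machine it runs on.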

@dguaraglia

@ryansb the problem with that approach is a configuration like ours, where both eth0 and p7p1 exist. This is the output from ifconfig in my Vagrant box:

eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.0.2.15  netmask 255.255.255.0  broadcast 10.0.2.255
        inet6 fe80::a00:27ff:feec:43e  prefixlen 64  scopeid 0x20<link>
        ether 08:00:27:ec:04:3e  txqueuelen 1000  (Ethernet)
        RX packets 2262  bytes 530255 (517.8 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 2276  bytes 327389 (319.7 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 16436
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 0  (Local Loopback)
        RX packets 148  bytes 25217 (24.6 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 148  bytes 25217 (24.6 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

p7p1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.33.10  netmask 255.255.255.0  broadcast 192.168.33.255
        inet6 fe80::a00:27ff:fe9d:69df  prefixlen 64  scopeid 0x20<link>
        ether 08:00:27:9d:69:df  txqueuelen 1000  (Ethernet)
        RX packets 160126  bytes 118630466 (113.1 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 116971  bytes 43710658 (41.6 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

I guess this is because we are using host-only networking.

@ryansb
Contributor
ryansb commented Mar 11, 2014

But in that case would using eth0 by default necessarily be wrong? The problem is picking one that doesn't exist, but if they both exist we just need to pick a sane default.

@khiro
Contributor
khiro commented Mar 11, 2014

I tried to use "public_network", but this issue occurred in my environment: Fedora 20, Vagrant 1.5.0, VirtualBox 4.3.6.
With PNIN, the bridged interface "enp0s8" and the NAT interface "p2p1" were created on the guest.
However, Vagrant could not set a specific IP address on the bridged interface "enp0s8".

Vagrant.configure(VAGRANTFILE_API_VERSION) do |config|
  config.vm.box = "fedora20_x86_64"
  config.vm.network :public_network, ip: "192.168.140.201"
  config.ssh.forward_agent = true
end

$ vagrant up
Bringing machine 'default' up with 'virtualbox' provider...
...
The following SSH command responded with a non-zero exit status.
Vagrant assumes that this means the command failed!
/sbin/ifup p7p1 2> /dev/null
Stdout from the command:
ERROR    : [/etc/sysconfig/network-scripts/ifup-eth] Device p7p1 does not seem to be present, delaying initialization.
Stderr from the command:

$ vagrant ssh
Last login: Tue Mar 11 17:38:43 2014 from 10.0.2.2
[vagrant@localhost]$ ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: p2p1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 08:00:27:bd:39:7f brd ff:ff:ff:ff:ff:ff
    inet 10.0.2.15/24 scope global dynamic p2p1
       valid_lft 85490sec preferred_lft 85490sec
    inet6 fe80::a00:27ff:febd:397f/64 scope link
       valid_lft forever preferred_lft forever
3: enp0s8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 08:00:27:91:de:dd brd ff:ff:ff:ff:ff:ff
    inet 192.168.140.114/24 scope global dynamic enp0s8
       valid_lft 258290sec preferred_lft 258290sec
    inet6 fe80::a00:27ff:fe91:dedd/64 scope link
       valid_lft forever preferred_lft forever

@ryansb
Contributor
ryansb commented Mar 11, 2014

@khiro would you be able to provide a box URL (if the box is public) so I can repro your issue?

As a note, we may need to actually make the effort to understand the interface names, which are generated here, and not hardcode the slot numbers.
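
As a rough illustration of not hardcoding names, the guest's interfaces could be discovered from the output of ip addr instead. `interface_names` is a hypothetical helper, and the regular expression assumes the header-line format shown in the ip addr output earlier in this thread.

```ruby
# Hypothetical helper: extracts non-loopback interface names, ordered by
# kernel interface index, from the text printed by `ip addr` or `ip link`.
def interface_names(ip_output)
  # Header lines look like "3: enp0s8: <BROADCAST,...>"; detail lines are
  # indented, so they never match a digit at the start of a line.
  ip_output.scan(/^(\d+):\s+([^:@\s]+)[:@]/)
           .map { |index, name| [index.to_i, name] }
           .sort_by(&:first)
           .map(&:last)
           .reject { |name| name == "lo" }
end
```

For the Fedora 20 guest shown above this yields p2p1 followed by enp0s8. Mapping the Nth configured adapter to the Nth name is an assumption; the guest is not guaranteed to enumerate adapters in the order Vagrant configured them.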

@khiro
Contributor
khiro commented Mar 11, 2014

@ryansb The Fedora 20 box package I used is here: https://vagrantcloud.com/chef/fedora-20

@khiro khiro added a commit to khiro/vagrant that referenced this issue Mar 14, 2014
Fix a network configuration issue of Fedora [GH-1997]
Support Predictable Network Interface Names.
9edb299

@khiro
Contributor
khiro commented Mar 14, 2014

I sent PR #3207

@ryansb
Contributor
ryansb commented Mar 14, 2014

Excellent @khiro, I'll test this on a few of my fedora boxes tonight.

@khiro
Contributor
khiro commented Mar 14, 2014

@ryansb Thanks, if you find a bug, tell me about it.

@nickchappell

@ryansb, @khiro, did either of you have any luck in getting this to work? I'm running into the same underlying issue with CentOS 7 and Vagrant 1.6.3.

@khiro
Contributor
khiro commented Jul 14, 2014

@nickchappell I think CentOS 7 and RHEL 7 have the same bug (#4078, #4107, #4171).
This bug might be fixed by #3207.
However, my patch causes another issue, #4104.

@sontek
sontek commented Aug 2, 2014

I just ran into the issue. Is there a temporary fix? Is the solution just downgrading vagrant?

[default] Configuring and enabling network interfaces...
The following SSH command responded with a non-zero exit status.
Vagrant assumes that this means the command failed!

ARPCHECK=no /sbin/ifup eth1 2> /dev/null

Stdout from the command:

ERROR    : [/etc/sysconfig/network-scripts/ifup-eth] Device eth1 does not seem to be present, delaying initialization.


Stderr from the command:

Fedora 20 Guest, Ubuntu 14.04 Host

$ vagrant --version
Vagrant 1.4.3

and I'm using VirtualBox 4.3.14

@sontek
sontek commented Aug 2, 2014

Never mind, 1.6.3 made it work :)
