Parsing configuration failed at startup between bridge and dhcp6c on WAN #3199

Closed
tduboys opened this issue Feb 6, 2019 · 31 comments
Labels: bug (Production bug)

@tduboys
Contributor

tduboys commented Feb 6, 2019

Hi all,

Following opnsense/dhcp6c#7, I have an issue that is probably not in the dhcp6c client itself but may be in the config checks performed during the boot sequence.

I'm using OPNsense with this setup: https://wiki.opnsense.org/manual/how-tos/orange_fr_fttp.html

My box is set up as follows:

  • igb0 as WAN
  • igb1, igb2, igb3… as bridge0
  • bridge0 as LAN

I'm using QOTOM hardware with six Intel interfaces.

On startup, the dhcp6c client parses its config file and sees that my LAN interface (bridge0) is tracking the WAN interface to get its IPv6 configuration. However, bridge0 does not exist yet at that point, so parsing fails and dhcp6c does not start.

Jan  8 20:19:12 opnsense dhcp6c[38999]: /var/etc/dhcp6c_wan.conf:13 invalid interface (bridge0): Device not configured
Jan  8 20:19:12 opnsense dhcp6c[38999]: called
Jan  8 20:19:12 opnsense dhcp6c[38999]: failed to parse configuration file
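A minimal sketch of pulling the failing interface out of a log line like the first one above. The pattern is inferred only from that single sample line, not from the dhcp6c source, and the function name is hypothetical.

```python
import re

# Pattern for the "invalid interface" message shown in the log above.
# The layout is inferred from that one sample line (an assumption).
INVALID_IF = re.compile(
    r"dhcp6c\[(?P<pid>\d+)\]: (?P<file>\S+):(?P<line>\d+) "
    r"invalid interface \((?P<ifname>[^)]+)\): (?P<reason>.+)$"
)


def parse_invalid_interface(log_line):
    """Return a dict describing the failing interface, or None if no match."""
    match = INVALID_IF.search(log_line)
    if match is None:
        return None
    info = match.groupdict()
    info["pid"] = int(info["pid"])
    info["line"] = int(info["line"])
    return info


sample = (
    "Jan  8 20:19:12 opnsense dhcp6c[38999]: /var/etc/dhcp6c_wan.conf:13 "
    "invalid interface (bridge0): Device not configured"
)
result = parse_invalid_interface(sample)
print(result["ifname"], result["file"], result["line"])
```

Run against the sample, this reports bridge0 as the interface named on line 13 of /var/etc/dhcp6c_wan.conf, which matches the prefix-interface statement described in this report.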

After every startup, I need to force a DHCP refresh by saving the WAN page settings and clicking "Apply" to get IPv6 working correctly.

How to reproduce:

  • set up a WAN interface with dhcp6c
  • set up a LAN interface as a bridge over multiple physical interfaces
  • set up the id-assoc to point at the bridge interface:
id-assoc pd 0 {
  prefix-interface bridge0 {
    sla-id 0;
    sla-len 8;
  };
};

  • reboot
  • see logs
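The failure condition in these steps can be sketched as a small pre-flight check: collect the names used in `prefix-interface` statements and compare them with the interfaces that exist at that moment. This is only an illustration under stated assumptions. The helper names are hypothetical, the regular expression covers just the config layout shown in this report, and nothing here is part of dhcp6c or OPNsense.

```python
import re
import socket


def prefix_interfaces(conf_text):
    """Return interface names used in `prefix-interface <name> {` statements."""
    return re.findall(r"^\s*prefix-interface\s+(\S+)\s*\{", conf_text, re.MULTILINE)


def missing_interfaces(conf_text, existing):
    """List prefix-interface names that are absent from `existing`."""
    return [name for name in prefix_interfaces(conf_text) if name not in existing]


def system_interfaces():
    """Names of interfaces currently known to the OS (Unix-like systems)."""
    return {name for _index, name in socket.if_nameindex()}


conf = """
id-assoc pd 0 {
  prefix-interface bridge0 {
    sla-id 0;
    sla-len 8;
  };
};
"""

# Simulate the boot-time state from the report: igb0 exists, bridge0 does not yet.
print(missing_interfaces(conf, existing={"igb0", "lo0"}))
```

On a live system, `system_interfaces()` could be passed as `existing`; the example uses a fixed set so that it behaves the same on any machine.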

Expected behavior:

  • no errors during boot, IPv6 working immediately

I had this issue on 1.8.7 and also after upgrading to 1.9.1

@marjohn56
Member

Are you still having this issue?

@tduboys
Contributor Author

tduboys commented Mar 6, 2019

Hi,
I'm on version 19.1.1 and rebooted this morning.
The issue is still present, with the same message in dhcpd.log.

@marjohn56
Member

Can you send me your config.xml file? I'll take a look and see what gives.

@marjohn56
Member

The thing that confuses me is that you say bridge0 is not set up, yet the configuration is finding an interface called bridge0. You must have entered that somewhere.

@tduboys
Contributor Author

tduboys commented Mar 6, 2019

I just updated to 19.1.2, but nothing changed.
I have the config.xml ready (I only removed my passwords/keys and the DHCP auth settings, not the other DHCP settings).
What's the best way to send it to you?

@marjohn56
Member

marjohn56 commented Mar 6, 2019

dropbox or martin@team-rebellion.net or you can attach it here

@tduboys
Contributor Author

tduboys commented Mar 6, 2019

Sent by mail.
This is the part of the config about the bridge:

    <opt5>
      <if>bridge0</if>
      <descr>LAN</descr>
      <enable>1</enable>
      <spoofmac/>
      <ipaddr>192.168.0.254</ipaddr>
      <subnet>24</subnet>
      <ipaddrv6>track6</ipaddrv6>
      <track6-interface>wan</track6-interface>
      <track6-prefix-id>0</track6-prefix-id>
    </opt5>
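To make the relationship in this fragment explicit, here is a rough sketch that lists which interfaces use track6 and which parent they follow. The `<interfaces>` wrapper element is added so the fragment parses on its own, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def track6_interfaces(interfaces_xml):
    """Map each track6 interface key to (device, tracked parent, prefix id)."""
    root = ET.fromstring(interfaces_xml)
    result = {}
    for entry in root:
        if entry.findtext("ipaddrv6") != "track6":
            continue
        result[entry.tag] = (
            entry.findtext("if"),
            entry.findtext("track6-interface"),
            int(entry.findtext("track6-prefix-id", default="0")),
        )
    return result


# The <interfaces> wrapper is an assumption made so the fragment parses alone.
fragment = """
<interfaces>
  <opt5>
    <if>bridge0</if>
    <descr>LAN</descr>
    <enable>1</enable>
    <spoofmac/>
    <ipaddr>192.168.0.254</ipaddr>
    <subnet>24</subnet>
    <ipaddrv6>track6</ipaddrv6>
    <track6-interface>wan</track6-interface>
    <track6-prefix-id>0</track6-prefix-id>
  </opt5>
</interfaces>
"""

print(track6_interfaces(fragment))
```

For this fragment the result shows opt5 (device bridge0) tracking wan with prefix id 0, which is why the WAN dhcp6c configuration refers to bridge0.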

@marjohn56
Member

So you have a bridge declared then...

If you are not using a bridge, then delete it.

@tduboys
Contributor Author

tduboys commented Mar 6, 2019

I need to bridge two interfaces for my LAN, and sometimes three. That's why I chose hardware with multiple interfaces.

@marjohn56
Member

I use a bridge interface on both of my Qotoms and have zero issues with dhcp6c, so it's a config error.

Config file please.

@tduboys
Contributor Author

tduboys commented Mar 6, 2019

"Config file please."

It's in your mail.

@marjohn56
Member

Not received here. Push it to dropbox and send me a link.

@marjohn56
Member

Could you also send the dhcp6c conf file from when the error happens? We have seen an issue with OR France when using RAW options, and I wonder if that's what is causing the problem.

@tduboys
Contributor Author

tduboys commented Mar 6, 2019

@marjohn56 here is the link to my config file: https://dl.plik.ovh/file/scPayUvR7BADuU9x/nMn0eN3q9y1250eu/config.xml (temporary link)
I just removed some options.

As for the dhcp6c config file, here is mine:

interface igb0_vlan832 {
  send ia-pd 0;
  send raw-option 6 00:0b:00:11:00:17:00:18;
  send raw-option 15 00:2b:46:53:56:44:53:4c:5f:6c:69:[…];
  send raw-option 16 00:00:04:0e:00:05:73:61:67:65:6d;
  send raw-option 11 00:00:00:00:00:00:00:00:00:00:00:1a:09:00:00:05:58:01:03:41:01:0d:66:74:69:2f:67:[…];
  script "/var/etc/dhcp6c_wan_script.sh";
};
id-assoc pd 0 {
  prefix-interface bridge0 {
    sla-id 0;
    sla-len 8;
  };
};
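For readers unfamiliar with the raw options, the two complete payloads above can be decoded by hand. The sketch below assumes that the bytes after `raw-option <code>` are the option payload in the standard DHCPv6 layout (option 6 as a list of 16-bit option codes, option 16 as a 32-bit enterprise number followed by length-prefixed items). Options 15 and 11 are truncated in the post and are left alone.

```python
import struct


def hex_bytes(colon_hex):
    """Convert 'aa:bb:cc' into bytes."""
    return bytes.fromhex(colon_hex.replace(":", ""))


def decode_oro(colon_hex):
    """Option 6 payload: a sequence of 16-bit requested option codes."""
    data = hex_bytes(colon_hex)
    count = len(data) // 2
    return list(struct.unpack(f"!{count}H", data[: count * 2]))


def decode_vendor_class(colon_hex):
    """Option 16 payload: 32-bit enterprise number, then length-prefixed items."""
    data = hex_bytes(colon_hex)
    (enterprise,) = struct.unpack("!I", data[:4])
    items, offset = [], 4
    while offset + 2 <= len(data):
        (length,) = struct.unpack("!H", data[offset:offset + 2])
        offset += 2
        items.append(data[offset:offset + length])
        offset += length
    return enterprise, items


print(decode_oro("00:0b:00:11:00:17:00:18"))
print(decode_vendor_class("00:00:04:0e:00:05:73:61:67:65:6d"))
```

Under that assumption, option 6 requests option codes 11, 17, 23 and 24, and option 16 carries enterprise number 1038 with the single item "sagem".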

The raw options work great, but only if I reload the interface (or all services) after a reboot.

@marjohn56
Member

Yes, I think it's the same issue we are having here:

https://github.com/opnsense/core/issues/2774#issuecomment-469763514

@marjohn56
Member

Odd... very odd. I have just tested this using RAW options and I cannot break it. I must have tested it thirty or more times with no issues. It gets a v6 address straight from boot. I can reproduce neither your issue nor the other one with dhcp6c.

Can you try sending your config.xml to me again? Check that you have the correct address: martin@team-rebellion.net

@marjohn56
Member

OK, I've been able to replicate this, and sure enough the bridge interface is not ready when dhcp6c tries to add an address to it, which causes dhcp6c to fail and exit. I have sent Thomas a workaround for testing and we'll see where we go from there.

What's happening is that dhcp6c gets run too early during boot, so I have moved the call that fires it up, and that seems to cure it. However, it needs a lot of testing, as it may break something else!

@fichtner - FYI

@tduboys
Contributor Author

tduboys commented Mar 12, 2019

Hi all,
Unfortunately, the fix didn't resolve my issue.

I suppose that the sequence during boot is something like this:

  • /usr/local/etc/rc.bootup is executed
  • this file does several things sequentially, such as setting up interfaces in a certain order (probably alphabetical)
  • it calls the function that generates each interface's config and executes it

In my case, the first interface, igb0, is the WAN with the dhcp6c client.
The other interfaces, igb1-5, are in a bridge.
So, sequentially, it generates the script /var/etc/rtsold_igb0_vlan832_script.sh (in my case) and calls it,
and only after that does it create the bridge.

But here, the WAN script needs the bridge to be set up already.

I tried the following, just as a test:

  • add a "sleep 5" to the rtsold script (in interfaces.inc)
  • call the rtsold script in the background
    It worked (for this part) but created other issues. It's not a fix, just a test.
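As an illustration of why a fixed delay is fragile, the same experiment could poll for the interface with a timeout instead of sleeping blindly. This is a hypothetical sketch, not what the test above or any later patch does; the function names and the injectable `list_interfaces` parameter are assumptions made for the example.

```python
import socket
import time


def current_interfaces():
    """Interface names currently known to the OS (Unix-like systems)."""
    return {name for _index, name in socket.if_nameindex()}


def wait_for_interface(name, timeout=5.0, interval=0.25,
                       list_interfaces=current_interfaces):
    """Poll until `name` exists; return True if it appeared before the timeout."""
    deadline = time.monotonic() + timeout
    while True:
        if name in list_interfaces():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Demonstration with a fake lister: bridge0 shows up on the third poll.
polls = iter([{"igb0"}, {"igb0"}, {"igb0", "bridge0"}])
print(wait_for_interface("bridge0", timeout=2.0, interval=0.01,
                         list_interfaces=lambda: next(polls)))
```

Polling returns as soon as the bridge exists and reports a clear failure otherwise, but it still only works around the ordering problem rather than fixing it.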

I suppose that a better fix would be to create bridge interfaces before the others (as seems to be done for lagg and VLANs).

See this code in interfaces.inc:

foreach (get_configured_interface_with_descr() as $if => $ifname) {
    $realif = $config['interfaces'][$if]['if'];
    if (strstr($realif, "bridge")) {
        $bridge_list[$if] = $ifname;
    } elseif (strstr($realif, 'gre') || strstr($realif, 'gif') || strstr($realif, 'ovpn')) {
        $delayed_list[$if] = $ifname;
    } elseif (!empty($config['interfaces'][$if]['ipaddrv6']) && $config['interfaces'][$if]['ipaddrv6'] == 'track6') {
        $track6_list[$if] = $ifname;
    } else {
        interface_configure($verbose, $if);
    }
}

If I understand correctly, it creates the bridge only when the first interface of this bridge is set up.
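To see how that loop treats the interfaces from this report, here is a rough Python model of the same branching. It mirrors only the excerpt above. What happens to each deferred list afterwards is not shown there and is not modelled, and the "dhcp6" value for the WAN is an assumption.

```python
def classify_interfaces(interfaces):
    """Sort interface keys into the four groups used by the PHP excerpt above.

    `interfaces` maps a key such as "wan" or "opt5" to a dict holding at least
    "if" (the device name) and optionally "ipaddrv6".
    """
    buckets = {"immediate": [], "bridge": [], "delayed": [], "track6": []}
    for key, settings in interfaces.items():
        realif = settings["if"]
        if "bridge" in realif:
            buckets["bridge"].append(key)
        elif any(kind in realif for kind in ("gre", "gif", "ovpn")):
            buckets["delayed"].append(key)
        elif settings.get("ipaddrv6") == "track6":
            buckets["track6"].append(key)
        else:
            # In the PHP excerpt this is where interface_configure() runs at once.
            buckets["immediate"].append(key)
    return buckets


# Values taken from the thread; "dhcp6" for the WAN is an assumption.
reported = {
    "wan": {"if": "igb0_vlan832", "ipaddrv6": "dhcp6"},
    "opt5": {"if": "bridge0", "ipaddrv6": "track6"},
}
print(classify_interfaces(reported))
```

In this model the WAN is configured immediately while opt5 (bridge0) lands in the deferred bridge group, which is consistent with dhcp6c seeing a bridge0 that does not exist yet.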

I will try some things here (and maybe break my whole system :) )

@fichtner fichtner self-assigned this Apr 22, 2019
@fichtner fichtner added the feature Adding new functionality label Apr 22, 2019
@fichtner fichtner added this to the 19.7 milestone Apr 22, 2019
@fichtner fichtner added bug Production bug and removed feature Adding new functionality labels Apr 24, 2019
fichtner added a commit that referenced this issue Apr 24, 2019
This isn't meant as a fix.  Need to find out what this code really does...
@fichtner
Member

@tduboys Okay, I'm classifying this as a bug now after a deep dive into the code. The reason is that the handling of track6, bridge and delayed items is completely out of whack... 561a783 works for me, but is merely meant as a base for further work. I'm not sure if it will be in 19.1.x or wait for 19.7 to shine. Nevertheless, you can try it via:

# opnsense-patch 561a783

@tduboys
Contributor Author

tduboys commented Apr 24, 2019

Thanks @fichtner, I will try this patch. Currently, I'm using my old patch. I will revert it and let you know if everything works.

BTW, I didn't know about the opnsense-patch command :)

@fichtner
Member

Both opnsense-patch and opnsense-revert are handy sometimes :)

I'm not 100% satisfied with the current state, but it seems to work well enough to cover your use case without fuss.

@tduboys
Contributor Author

tduboys commented May 2, 2019

Hi @fichtner, I confirm that your patch is working great for my case.

@fichtner
Member

fichtner commented May 2, 2019

I had to rewrite a larger portion of that function and I'm ok with the new state. Would you be able to try the development version of 19.1.7 to confirm this is also working fine for you?

@tduboys
Contributor Author

tduboys commented May 2, 2019

I just updated to the latest 19.1.7 stable version plus your patch, but I'm OK to test the dev version.
Do I just need to change the release type to dev, roll back your patch, and check for updates?

@fichtner
Member

fichtner commented May 2, 2019

No need to roll back, just switch to dev, save, check for updates and hit upgrade

@tduboys
Contributor Author

tduboys commented May 2, 2019

I just switched to dev, updated and rebooted. I got an IPv6 address, so the fix seems to work.

@fichtner
Member

fichtner commented May 2, 2019

@tduboys OK, great. So the last question is when to ship a fix... Are you OK with reapplying the patch for all 19.1.x updates? I would like to leave the real rework on the development track for 19.7. What do you think?

@tduboys
Contributor Author

tduboys commented May 2, 2019

Yes, no problem manually applying the patch on the 19.1.x branch. It's probably safer not to change this part in a "small" update.

@fichtner
Member

fichtner commented May 2, 2019

Ok thanks, you will have to do it for every update you perform since patches aren't sticky (they are a test tool after all), but then I can close the issue now and test further to make sure 19.7 is as good as it gets. :)

Thank you,
Franco

@fichtner fichtner closed this as completed May 2, 2019
EugenMayer pushed a commit to KontextWork/opnsense_core that referenced this issue Jul 22, 2019
This isn't meant as a fix.  Need to find out what this code really does...
@chantra

chantra commented Nov 18, 2020

There is an equivalent bug report in the pfSense tracker: https://redmine.pfsense.org/issues/3965

@fichtner
Member

This was solved a long time ago?
