ix0 no carrier #2591
Comments
|
the joys of intel driver updates :( run this and reboot |
|
ps booting from kernel.old should also work via boot menu |
|
Thanks |
|
if this indeed works we need to take this to FreeBSD soon if 11.2 has the same defect |
|
Doesn't help :( |
|
makes no sense at all ?! what's the output of: |
|
FreeBSD asterix2.lan.neratec.com 11.1-RELEASE-p11 FreeBSD 11.1-RELEASE-p11 21b4c8ea1d5(stable/18.7) amd64 |
|
Did the |
|
ok, kernels are correctly replaced. I'm unsure how any other change would relate to the reported issue of the driver reporting no carrier anymore. |
|
Would it make sense to try install new with 18.7 and import the config? |
|
going back to 18.1.13 would make more sense than trying again with 18.7 (18.1.6 config import + install and update back to 18.1.13). but the chances of ix0 finding its carrier are not more than 50%. it could be hardware related. |
|
|
Could it be that FreeBSD 11.2 ships with a new binary firmware blob that bricked your NIC? That's the only theory I have besides it's fully the hardware's fault. |
|
No idea, the server has a second NIC (ix1), will try this one and report... |
So with ix NICs better not update to 18.7. System Mainboard Firmware versions NIC CPU |
|
@fichtner I have two identical boxes configured like the one above, but not in production. Ping me the usual way when you need access. |
|
FYI: a server reset or power off/on by ipmi/bmc doesn't "fix" the NIC hang. Had to remove power from the box. |
|
I am seeing something that could well be related to this issue. After upgrading my company setup (A HA setup of two firewalls) to 18.7, the result was that none of my VLAN interfaces on ix1 activate properly, however the non-VLAN ix0 works fine. On the main page, the VLAN interfaces are marked with red saying "Ethernet autoselect", and under System --> Interfaces --> Overview their status says "no carrier". All of my igb* interfaces come up without any issue. I have tried changing the VLAN configuration a bit hoping that a re-write of the configuration would solve it, but to no avail. Update: I just tried booting kernel.old, as was suggested earlier in this thread, and just doing that has resolved the problem on both of my nodes! I did not have to do any actual power cycling, just booting into kernel.old did it. |
|
@Tsuroerusu Can you try the 18.1.11 kernel as well? (reboot) @abplfab said you need to remove the power, otherwise the carrier will not come up on a quick reboot. |
It was mentioned that several ix(4) devices are stuck in "no carrier", see opnsense/core#2591 This reverts commit ae8c90c.
|
Relevant driver update was reverted and will be gone from 18.7.1. It's unclear if the issue exists in FreeBSD 11.2 but we'll find out soon (@mimugmail could you check this with your test system). |
|
@fichtner Update is in progress .. need to figure out what exactly happens with and without VLANs. |
|
@mimugmail thanks a lot! |
|
@fichtner I am using the hardware that I have, because it was working just fine with 18.1, I also said that I fully understand why developers cannot test my particular setup. The only thing I insisted on is that I am not using incompatible hardware as everything I am using is officially validated by Supermicro, WHICH WAS WORKING WITH 18.1, and thus my modest claim was that it was unreasonable for you guys to keep telling me that I just need to change my hardware and spend hundreds of euros on that. Why do you keep putting words into my mouth? I never said anything about the "stance" of the OPNsense project, all I wanted to ask was about whether there were driver backports planned in 19.1, that was all! I am not using this platform for anything! I simply asked a question about drivers, and then I ended up responding to the absurd accusations that you made against me, how is that unreasonable? I fully accept that I am responsible for my hardware. I am, frankly, a little bit shocked that this is how you choose to treat your users, who, 1. have reported a problem, 2. tried to help, and 3. just asked a question. But do rest assured, I can promise you that I have no intention of participating in this discussion anymore, and I will not be reporting any problems in the future given that I simply got insulted and shown contempt for simply asking a question. I wish you and everybody else a pleasant Sunday. |
|
@Tsuroerusu Okay, listen. Tell me what you in very precise words want us to do and I'll objectively explain why that may or may not be feasible. |
This is the heart of the matter, Franco. I did not, and still do not actually want you to do anything, and that is why I have been so amazed (in the negative sense) by your responses today (specifically). The ONLY thing that I requested was a simple "yes/no" answer to this question: |
|
@Tsuroerusu I understand that you're frustrated. Believe me, I am too! But instead of arguing, why not test the latest 19.1 release to see if it works? That would help all users far more IMHO. Maybe I can save a few bucks on a new cable. :D |
@enoch85 I actually agree with you on that, and that is why I was so disappointed that instead of a simple answer to the question I asked, I got absurd accusations thrown at me (not by you), which I then had to respond to. Your suggestion of testing the 19.1 beta, I have no issue with considering that, it is a perfectly reasonable thing to ask of me. In fact, let me just go further and say, that the reason I asked about the drivers in 19.1 was precisely because I was interested in potentially testing it with my setup, however for that it would be useful to know which drivers I would actually be testing (Because earlier I got the vanilla FreeBSD 11.2 live media to work fine). My setup is at production-level and because of that, I have to image it before testing things, and before spending an hour or two doing that, I simply needed some information as I have explained. |
|
Some update from me, i tried the current 19.1 beta image via live cd, and both ports of the intel x520-da2 are online now. the driver version is the same as before: [1] ix0: <Intel(R) PRO/10GbE PCI-Express Network Driver, Version - 3.2.12-k> port 0xc020-0xc03f mem 0xdfe80000-0xdfefffff,0xdff04000-0xdff07fff irq 22 at device 0.0 on pci4 I'll install the beta now, hopefully config from 18.7 can be imported. edit: performance is really bad, iperf3 measures around 1.2 gbit/s while hitting massive iowait. |
The answer is no, but that does not mean the drivers differ: the backport in 18.7 is from FreeBSD 11.2, and 19.1 is based on stock HardenedBSD/FreeBSD 11.2, so the same drivers are included. I was under the impression that this had been communicated clearly in several places and I apologise if that wasn't actually the case. |
Thank you for answering my question. :)
Earlier you linked to a forum post, which stated the following: However, for me, that did not exclude the possibility of any potential driver backports from 11-STABLE or from Intel directly, so that was why I sought clarification from you regarding that point. The reason why I was interested in this is that shortly after I started experiencing problems with 18.7, I tried booting up the FreeBSD 11.2 live USB media, and configured my ix interfaces with VLANs and that worked in the way I expected them to. So my thinking was that if 19.1 did not contain any modifications to the ixgbe driver compared to the one in FreeBSD 11.2, perhaps it would fix my problem. I hope this makes sense from my perspective of being a user/sysadmin, and not developer, just trying to use pattern matching and logical inferences to try to find solutions. |
|
The cable won't show up until 2018-12-13 , so I won't be able to test if it's a cabling issue or not until then. Can someone else please confirm that 19.1 works? |
|
19.1 (installed from ISO) works for me without changing the cables. interestingly, i switched back to 18.7 by doing a fresh install and found the following: running in live cd mode: ix0 and ix1 are working. the first boot of the fresh install from disk: ix0 & ix1 online. but if i restore a backup, or change the interface settings via GUI ix0 "dies" after the reboot with the ominous "no carrier" status. ix1 stays online. |
|
Can you make a diff of config.xml and your backup? |
Please I will get the cable tomorrow, so I will test this weekend. :D |
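For the config.xml comparison requested above, a minimal sketch using Python's standard `difflib` might look like this. The function names `diff_text` and `diff_files` and the file names are assumptions for illustration, not anything provided by OPNsense.

```python
import difflib


def diff_text(backup_text, current_text,
              backup_name="backup.xml", current_name="config.xml"):
    """Return a unified diff (as one string) between two config contents."""
    lines = difflib.unified_diff(
        backup_text.splitlines(keepends=True),
        current_text.splitlines(keepends=True),
        fromfile=backup_name,
        tofile=current_name,
    )
    return "".join(lines)


def diff_files(backup_path, current_path):
    """Read both files from disk and diff them (paths are hypothetical)."""
    with open(backup_path, encoding="utf-8") as f:
        backup_text = f.read()
    with open(current_path, encoding="utf-8") as f:
        current_text = f.read()
    return diff_text(backup_text, current_text, backup_path, current_path)
```

Calling `print(diff_files("backup.xml", "config.xml"))` would show only the changed lines with some context, which makes it easier to spot whether restoring the backup alters anything in the interface sections.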
|
IT WORKS! What I did:
Conclusion |
|
I have been on 19.1-rc1 for a few weeks now; until today both ix0 and ix1 were fine. After a reboot ix0 didn't come online -> the good old "no carrier" problem is back to haunt me. I tried a few things, but for now ix0 stays dead. But i made an interesting discovery: if i reset the machine via the reset button, the link comes up at 10 Gbit/s and stays online until "Configuring LAN interface..." during the OPNsense bootup, then it goes down until the next reboot. Maybe there is some invalid config applied? Are there any files or something i can provide? |
|
I just upgraded one of my firewalls to 19.1.5 from 18.7 (but using "kernel.old", i.e. the kernel from 18.1), which I had been running since August because of the issue I had with VLANs on my ix interfaces saying "no carrier", as described earlier in this thread. However, I am sad to say that the issue still persists despite the update to FreeBSD 11.2 in OPNsense, which is really depressing :-( The strange thing for me is that, as I mentioned before, when 11.2 came out, I tried using the Live boot with my machines, and I could configure VLANs without any issue and run network traffic through them. Which makes this issue even more baffling to me. And before anybody jumps in with this: before I went to do the upgrade, the firewalls were working fine with the kernel from 18.1. So this is not a cabling issue, unless the newer Intel drivers have some change that in itself causes compatibility issues with the Supermicro cables that I am using. At this point, I am millimeters from giving up and buying a different NIC, and hoping that I will not face the same issue, because I have been without security updates since August. Unfortunately, that will cost me 500 euros before I even know whether it will actually solve the problem. |
|
I switched to a Mellanox Connect-X3 EN 2x SFP+ (used), which is working without any flaws. You could try to compile a newer version of the intel driver (last post in this thread) |
Thanks for the suggestion, I appreciate that. Unfortunately, the newer driver has the same issue. |
|
yes, i do, but only one (which led to the "no carrier" problem on the intel x550) |
I must say, the ix driver sure is a wuss, 'ey? A single VLAN, in your case, and it melts down! It would be funny if it wasn't so annoying. I run something like 8 VLANs and I REALLY need them to work, so Mellanox it is for me then. Thanks for the recommendation, as I think I have found a good source for them! Hopefully this will solve my problem. :-) |
Mention of the no carrier bug, recommending the 3.3.6 driver.
This may also be related.
Talk of permanently allowing unsupported SFP modules in the driver; dated, but a similar principle.
Another patch for the ix driver.
It would be great if no artificial restrictions were present in the driver and we only had to worry about actual hardware compatibility. |
|
With 20.1 the problem is back. "no carrier" on the 10Ge I/F. :( |
|
Booting kernel.old (19.7) doesn't help. |
|
Sounds occasional. What happens when unplugging the cable and plugging it in again? |
|
Doesn't help. Exactly the same behavior as in the beginning of this thread. Hardware unchanged, "only" updated to 20.1. |
|
ifconfig -vvvvvv please. It worked with 19.7.10? |
|
Yes with 19.7.10 it worked. ix0: flags=8802<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500 |
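As a reading aid for the `flags=8802` value in the output above, here is a small sketch that decodes it. The bit values follow FreeBSD's `net/if.h` (only a subset is listed), and the helper name `decode_flags` is made up for illustration. Note that the missing UP and RUNNING bits only describe the interface state at the time of capture; they do not by themselves explain why the link is down.

```python
# Subset of the interface flag bits from FreeBSD's <net/if.h>.
IFF_FLAGS = {
    0x0001: "UP",
    0x0002: "BROADCAST",
    0x0004: "DEBUG",
    0x0008: "LOOPBACK",
    0x0010: "POINTOPOINT",
    0x0040: "RUNNING",
    0x0080: "NOARP",
    0x0100: "PROMISC",
    0x0200: "ALLMULTI",
    0x0400: "OACTIVE",
    0x0800: "SIMPLEX",
    0x8000: "MULTICAST",
}


def decode_flags(value):
    """Return the names of all known flag bits set in value, lowest bit first."""
    return [name for bit, name in sorted(IFF_FLAGS.items()) if value & bit]


# ifconfig prints the flags word in hexadecimal, so "flags=8802" is 0x8802.
names = decode_flags(0x8802)
print(names)
print("administratively up:", "UP" in names)
```

For comparison, a working interface typically shows `flags=8843`, which adds UP and RUNNING to the same set.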
|
19.7.x OS = 20.1.x OS with no modifications. It does not look like this is a software issue if it works sometimes, but not always. |
|
19.7.x always stable. After upgrading to 20.1.x no chance to get it working. |
|
It’s probably a boot timing issue then. On 19.7 the netgraph drivers are loaded, on 20.1 not. See https://forum.opnsense.org/index.php?topic=15653.0
I can’t think of anything else that would make sense software-wise.
… On 31. Jan 2020, at 10:04, Fabian Abplanalp ***@***.***> wrote:
19.7.x always stable. After upgrading to 20.1.x no chance to get it working.
|
|
other sfp modules (or cables) very often help fix these kinds of issues in our experience; unstable connections often point to issues there. The connector contains the transceiver, which is responsible for the connection (some cards even check for specific firmware in the transceiver). |

After upgrading to opnsense 18.7 the ix NIC (attached with a DAC to a switch) reports "media: no carrier". Setting the media to fixed 10Gbase-Twinax doesn't help...
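To check quickly which interfaces report the symptom described here, a rough sketch that scans `ifconfig` text for `status: no carrier` could look like the following. The `SAMPLE` text is illustrative only (the igb0 entry is invented, not taken from this thread), and `interfaces_without_carrier` is a hypothetical helper name.

```python
import re

# An interface block in ifconfig output starts with "<name>: flags=".
HEADER = re.compile(r"^(\S+): flags=")


def interfaces_without_carrier(ifconfig_output):
    """Return interface names whose block contains 'status: no carrier'."""
    current = None
    result = []
    for line in ifconfig_output.splitlines():
        match = HEADER.match(line)
        if match:
            current = match.group(1)
        elif current and line.strip() == "status: no carrier":
            result.append(current)
    return result


# Illustrative sample, not real output from the affected machine.
SAMPLE = """\
ix0: flags=8802<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
\tmedia: Ethernet autoselect
\tstatus: no carrier
igb0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
\tmedia: Ethernet autoselect (1000baseT <full-duplex>)
\tstatus: active
"""

print(interfaces_without_carrier(SAMPLE))
```

On a live system the input could come from `subprocess.run(["ifconfig"], capture_output=True, text=True).stdout` instead of the sample string.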