Linux AX25: address already in use error after first connection #352
I've been trying to reproduce this today running two Pat instances (P2P) (
The connection teardown works as intended, with no dangling connections. This leads me to believe that this is either a bug in recent Linux kernels or has already been resolved in recent commits. Can you please try building the develop branch and see if you're able to reproduce the issue? Thanks! |
This specific kernel bug (among others) has finally been getting some review. Please see https://www.spinics.net/lists/linux-hams/ for some of the archives. Specifically, here is the thread on the stale AX.25 session issue: https://www.spinics.net/lists/linux-hams/msg04952.html . Once a viable fix is available that aligns with all the other fixes in flight, it will take some time (months or quarters) until a standard Linux distribution publishes it for end-user consumption.
Same here with latest Pat and Linux 5.15.0-37-generic #39-Ubuntu. |
Not sure what you mean by "unplug/plug", but an ugly workaround is to do:
The symptom described by Tyler-2 is the symptom fixed for kernel 4.2.0 in June of 2016. See submission here: AX.25: Close socket connection on session completion. @Tyler-2 which Linux kernel are you using? |
I don't know how to check the status of that patch in this custom kernel but naturally I assume it's in there. |
I am seeing the same behavior. I am running pat v0.13 with Linux 5.4.0-124-generic #140-Ubuntu SMP Thu Aug 4 02:23:37 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux. This is the current version of Linux Mint. The direwolf version is 1.7. |
As this is most likely a kernel bug, I'm tagging this issue as wontfix. Keeping it open for now, so we can track progress and discuss this further. It would be great if someone could dig into this and investigate why the bug still exists in Linux >= 5.4 if the patch was merged in 4.2.0. Maybe a new bug was introduced? 🤔 I guess a good place to start would be to verify that the patch did in fact fix the issue. I saw similar issues back when I was running Debian Jessie (kernel 3.16). When I upgraded to Debian Stretch (kernel 4.9) the issue was resolved IIRC. Maybe a similar bug was introduced in a later kernel release? 🤷 Anyway, the local RMS packet node in my area has been QRT for a couple of years now, so it's difficult for me to investigate further right now 😞.
For what it's worth, I did a custom build of Linux 5.19.9 on Debian Bullseye and this issue no longer seems to be present. It was present for me on |
Stumbled upon this issue while researching. I suspect this kernel patch might resolve the issue entirely: https://github.com/torvalds/linux/blob/v5.19/net/ax25/af_ax25.c so 5.19 kernels might be OK now. Yet to test.
Just upgraded to 5.19 kernel on Ubuntu and this appears to be resolved :) |
I am still stuck with this issue on Linux Mint. Curiously, I have the same version of Linux running on 2 different machines: |
There have been fixes in very new versions of the Linux kernel but I don't think they will be backported to the old 5.4.0 series. It's not clear what version of Mint you're running but does Mint have the HWE (hardware enablement) kernel like what Ubuntu offers? This should get you a newer kernel. For example, read this: https://forums.linuxmint.com/viewtopic.php?t=367736 |
they are the same mint release: |
OK, but on my Ubuntu 20.04 system, I'm running the following HWE kernel: 5.15.0-53-generic #59~20.04.1-Ubuntu SMP Thu Oct 20 15:10:22 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
From looking at the linux kernel changes you are probably going to have a bad time with anything older than 5.19. |
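A quick way to compare the running kernel against the 5.19 threshold mentioned in this thread is sketched below. This is only a heuristic: the `(5, 19)` cutoff comes from reports in this thread, the helper names are made up for illustration, and a distribution kernel with an older version number may still carry backported fixes.

```python
import platform
import re

# Assumption: 5.19 is the first release reported as working in this thread.
# It is not taken from kernel changelogs, and distro backports are ignored.
FIXED_IN = (5, 19)


def parse_kernel_release(release):
    """Return (major, minor) from a string such as '5.15.0-53-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def likely_affected(release, fixed_in=FIXED_IN):
    """True if the release is older than the version reported as fixed."""
    return parse_kernel_release(release) < fixed_in


if __name__ == "__main__":
    running = platform.release()  # same string as `uname -r` on Linux
    try:
        status = "likely affected" if likely_affected(running) else "probably OK"
    except ValueError:
        status = "unknown (could not parse release string)"
    print(f"{running}: {status}")
```

For the kernels quoted in this thread, `likely_affected("5.4.0-124-generic")` and `likely_affected("5.15.0-53-generic")` are true, while `likely_affected("5.19.9")` is false.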
they are the same mint release: |
OK, before I update the kernel, can someone please explain how the same kernel and OS release result in different behaviors on the two machines? I am having a hard time seeing the logic in this.
I believe the issue was related to race conditions with the timers used for managing AX25 connections - so different CPUs and different workloads would have different results. On my hardware I could only get the issue to occur occasionally. |
I could upgrade to Mint 21 but that only has kernel 5.15. How do I get upgraded to the 5.19 version?
Workaround: I ran into this issue. I now use the AGWPE branch of Pat on Linux and I have no problems. The AGWPE feature does not require kernel AX.25 support, and communicates directly with Direwolf. Install info is at: Note that I had an error installing, so I used:
Closing this one now. It's a kernel issue. AGWPE support has been added, so affected users might want to try that 🙂 |
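Before switching Pat over to AGWPE, it may help to confirm that Direwolf's AGW TCP port is actually accepting connections. The sketch below assumes Direwolf is on the same machine and uses its default AGW port of 8000; adjust both if your `direwolf.conf` differs. The function name is hypothetical, and the check only opens a TCP connection without speaking the AGWPE protocol.

```python
import socket


def agw_port_reachable(host="127.0.0.1", port=8000, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened.

    Assumption: 8000 is Direwolf's default AGW port; change it if your
    configuration sets a different AGWPORT.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    if agw_port_reachable():
        print("AGW port is accepting connections")
    else:
        print("AGW port is not reachable; is Direwolf running?")
```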
Very new kernels (6.x) should have this issue fixed now, though I'm not exactly clear which specific modern kernels have the fix or which older kernels have had it backported.
Thanks! That is certainly good news 😊 Debian Bookworm (current stable) has 6.1 🥳 |
I've got Direwolf running as a software TNC with the `-p` command, and I run `kissattach` to connect the Linux AX.25 stack to that port.
This works, and when I `connect` in Pat, my emails are sent and received, and Pat disconnects.
Afterwards, any attempts to connect again are met with:
I have no similar problems when using simply `axcall`. I can connect and reconnect all day, even after Pat gets into this bad state. But Pat will not work.
Restarting Pat doesn't resolve it - I have to restart everything all the way up to Direwolf.
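For readers unfamiliar with the error, "address already in use" is the operating system's `EADDRINUSE`, returned when a socket tries to bind to an address that another socket still holds. The sketch below reproduces the same errno with ordinary TCP sockets on localhost. It is only an analogy: it does not touch AX.25 or Pat, and the function name is made up for illustration.

```python
import errno
import socket


def second_bind_errno():
    """Bind two TCP sockets to the same address and return the second's errno.

    Returns None if the second bind unexpectedly succeeds.
    """
    first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        first.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
        first.listen(1)
        port = first.getsockname()[1]
        try:
            second.bind(("127.0.0.1", port))  # same address, still held by `first`
        except OSError as exc:
            return exc.errno
        return None
    finally:
        first.close()
        second.close()


if __name__ == "__main__":
    code = second_bind_errno()
    print(code == errno.EADDRINUSE)
```

In the situation reported here, the role of `first` would be played by a stale session that was never fully released after the first connection, which would explain why restarting Pat alone does not clear the error.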
Pat Version 0.12.1-2+b1
Direwolf Version 1.6
$ uname -rv
5.17.5-surface #3 SMP PREEMPT Sat May 7 16:37:05 UTC 2022
(A custom kernel for Surface Laptops running Debian)