Worker process frequently exits with Recv() error Out of memory [-5] #235

Closed
database64128 opened this issue May 17, 2022 · 10 comments

@database64128

database64128 commented May 17, 2022

  • Client: Windows 11 Build 22621
  • ksmbd: 3.4.2, Arch Linux stock kernel 5.17.7
  • ksmbd-tools: 3.4.4 from AUR
ksmbd[910]: [ksmbd-worker/910]: ERROR: Recv() error Out of memory [-5]
ksmbd[909]: [ksmbd.mountd/909]: ERROR: WARNING: child process exited abnormally: 910
ksmbd[909]: [ksmbd.mountd/909]: ERROR: Fatal IPC error. Terminating. Check dmesg.
ksmbd[909]: [ksmbd.mountd/909]: ERROR: can't execute kill 910: No such process
ksmbd[909]: [ksmbd.mountd/909]: INFO: Exiting. Bye!

It says out of memory, even though the system didn't actually run out of memory.

Afterward ksmbd.service has to be manually restarted to recover from this.
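
The misleading wording is explained by the fix later in this thread: `recvmsg(2)` fails with `ENOBUFS`, and libnl reports that as its own "Out of memory" error. A minimal sketch of that translation, assuming libnl's `NLE_NOMEM` code is 5 (consistent with the `[-5]` in the log); the table and the `describe` helper are hypothetical stand-ins for illustration, not libnl's actual code:

```python
import errno

# Assumed subset of libnl's error codes and messages.
NLE_NOMEM = 5
NLE_MESSAGES = {NLE_NOMEM: "Out of memory"}

# Hypothetical stand-in for libnl's errno translation: two different
# system errors collapse into one libnl code, so the cause is lost.
SYSERR_TO_NLE = {errno.ENOBUFS: NLE_NOMEM, errno.ENOMEM: NLE_NOMEM}

def describe(sys_errno):
    """Format a system errno the way the worker's log line shows it."""
    code = SYSERR_TO_NLE[sys_errno]
    return f"Recv() error {NLE_MESSAGES[code]} [-{code}]"

if __name__ == "__main__":
    # A full socket buffer (ENOBUFS) reads exactly like real memory exhaustion.
    print(describe(errno.ENOBUFS))
```

Because both errno values map to the same code, the log line alone cannot distinguish a full socket buffer from a genuine allocation failure.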

@namjaejeon
Member

Can you tell us how to reproduce this issue?

@database64128
Author

database64128 commented May 17, 2022

This happens randomly several times a week in daily use. The client runs Windows 11 Build 22621. The server is on the latest stock kernel of Arch Linux (5.17.7), but I've also seen this on previous kernel versions like 5.16.x. It usually happens when I'm not actively transferring anything but the client is still connected to the network. I have not been able to find a way to reproduce it reliably.

@namjaejeon
Member

Please let me know your ksmbd module and ksmbd-tools versions.

@database64128
Author

The ksmbd module 3.4.2 came with the kernel. ksmbd-tools 3.4.4 was built and installed from AUR.

@namjaejeon
Member

The ksmbd module 3.4.2 came with the kernel.

Ah, I want to know your kernel version.

@database64128
Author

I mentioned it earlier:

The server is on the latest stock kernel of Arch Linux (5.17.7). But I've also seen this on previous kernel versions like 5.16.x.

I just added this info to the issue body so it's easier to find.

@mmakassikis

ksmbd[909]: [ksmbd.mountd/909]: ERROR: Fatal IPC error. Terminating. Check dmesg.

Can you check if dmesg contains any message from ksmbd when the error occurs?

@database64128
Author

Can you check if dmesg contains any message from ksmbd when the error occurs?

Nothing.

@database64128
Author

According to thom311/libnl#104, both NetworkManager and systemd-networkd set their Netlink socket's buffer size to 8MiB to work around the same issue. I'm going to try to do the same thing here and see if this resolves my issue.
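
As a rough illustration of the workaround described above, the sketch below opens a generic netlink socket and asks the kernel for a larger receive buffer through `SO_RCVBUF`. It is written in Python for brevity and is not the ksmbd-tools change itself; `raise_rcvbuf`, `WANTED`, and the locally defined `NETLINK_GENERIC` constant are names chosen for this example.

```python
import socket

NETLINK_GENERIC = 16        # protocol number from <linux/netlink.h>
WANTED = 8 * 1024 * 1024    # 8 MiB, the size mentioned above

def raise_rcvbuf(sock, size):
    """Request a larger receive buffer and return what the kernel granted."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, size)
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)

if __name__ == "__main__":
    if not hasattr(socket, "AF_NETLINK"):
        print("AF_NETLINK is Linux-only; nothing to demonstrate here")
    else:
        with socket.socket(socket.AF_NETLINK, socket.SOCK_RAW,
                           NETLINK_GENERIC) as s:
            before = s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
            after = raise_rcvbuf(s, WANTED)
            print(f"receive buffer: {before} -> {after} bytes")
```

The granted size can differ from the requested one: for unprivileged processes Linux caps the request at `net.core.rmem_max`, so the printed value is worth checking. libnl offers `nl_socket_set_buffer_size()` for the same purpose; this thread does not say which call the fix uses.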

database64128 added a commit to database64128/ksmbd-tools that referenced this issue Aug 22, 2022
@database64128
Author

I can confirm that this issue can be fixed by raising the Netlink socket buffer size. Opened #277 for the fix.

database64128 added a commit to database64128/ksmbd-tools that referenced this issue Aug 25, 2022
This commit fixes the intermittent `Recv() error Out of memory [-5]`
crashes of worker process by raising the netlink socket's buffer size.

A netlink socket's default buffer size is 32KiB. When a message exceeds
the socket buffer size, the recvmsg(2) call returns -ENOBUFS, which is
then translated to the Out of memory error above by libnl.

Both NetworkManager and systemd-networkd raise their netlink socket's
buffer size to 8MiB, which seems to be reasonable. So we do the same.

Fixes cifsd-team#235.

Signed-off-by: database64128 <free122448@hotmail.com>
database64128 added a commit to database64128/ksmbd-tools that referenced this issue Sep 4, 2022
This commit fixes the intermittent `Recv() error Out of memory [-5]`
crashes of worker process by raising the netlink socket's receive
buffer size to 1 MiB.

A netlink socket's default receive buffer size is 208 KiB (taken from
net.core.rmem_default). When incoming messages fill up the receive
buffer, the recvmsg(2) call returns -ENOBUFS, which is then translated
to the Out of memory error above by libnl.

Both NetworkManager and systemd-networkd raise their netlink socket's
buffer size to work around the same issue. More details on
systemd-networkd's decision can be found at
systemd/systemd#14417 and
systemd/systemd#14434.

Fixes cifsd-team#235.

Signed-off-by: database64128 <free122448@hotmail.com>
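
To check the numbers quoted in this commit message on a given machine, the following sketch reads the relevant sysctls and estimates what a plain `SO_RCVBUF` request would yield. It assumes the behaviour documented in socket(7): the request is capped at `net.core.rmem_max` and then doubled by the kernel. The helpers `read_sysctl` and `effective_rcvbuf` are hypothetical names for this example, and the kernel's small minimum buffer size is ignored.

```python
from pathlib import Path

def read_sysctl(name, root="/proc/sys"):
    """Read an integer sysctl such as 'net.core.rmem_default'."""
    path = Path(root) / name.replace(".", "/")
    return int(path.read_text().split()[0])

def effective_rcvbuf(requested, rmem_max):
    """Model SO_RCVBUF: cap at rmem_max, then double for bookkeeping.

    Simplification: the kernel's minimum buffer size is not modelled.
    """
    return 2 * min(requested, rmem_max)

if __name__ == "__main__":
    default = read_sysctl("net.core.rmem_default")
    maximum = read_sysctl("net.core.rmem_max")
    print(f"rmem_default = {default} bytes ({default / 1024:.0f} KiB)")
    print(f"rmem_max     = {maximum} bytes ({maximum / 1024:.0f} KiB)")
    print("1 MiB request would yield about",
          effective_rcvbuf(1024 * 1024, maximum), "bytes")
```

If `rmem_max` is still at 208 KiB, an unprivileged 1 MiB request would be capped well below 1 MiB; socket(7) documents `SO_RCVBUFFORCE` as the way for a privileged process to exceed that limit.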
namjaejeon pushed a commit that referenced this issue Sep 7, 2022
This commit fixes the intermittent `Recv() error Out of memory [-5]`
crashes of worker process by raising the netlink socket's receive
buffer size to 1 MiB.

A netlink socket's default receive buffer size is 208 KiB (taken from
net.core.rmem_default). When incoming messages fill up the receive
buffer, the recvmsg(2) call returns -ENOBUFS, which is then translated
to the Out of memory error above by libnl.

Both NetworkManager and systemd-networkd raise their netlink socket's
buffer size to work around the same issue. More details on
systemd-networkd's decision can be found at
systemd/systemd#14417 and
systemd/systemd#14434.

Fixes #235.

Signed-off-by: database64128 <free122448@hotmail.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
namjaejeon pushed a commit to namjaejeon/ksmbd-tools that referenced this issue Sep 8, 2022
namjaejeon pushed a commit that referenced this issue Sep 8, 2022