inotify_init() failing does not result in a clear error message #1019

Closed
Alex-K37 opened this issue Apr 16, 2024 · 12 comments · Fixed by #1020
Labels
defect: unison fails to meet its specification (but doesn't crash; see also "crash")
effort-low: issue is likely resolvable with <= 5h of effort
impact-low: low importance

Comments

@Alex-K37

When running Unison, I quite often encountered the following error:

Fatal error: exception Unix.Unix_error(Unix.EBADF, "set_nonblock", "")

Running "strace unison-fsmonitor" yielded:

fcntl(0, F_GETFL)                       = 0x802 (flags O_RDWR|O_NONBLOCK)
fcntl(0, F_SETFL, O_RDWR|O_NONBLOCK)    = 0
fcntl(1, F_GETFL)                       = 0x802 (flags O_RDWR|O_NONBLOCK)
fcntl(1, F_SETFL, O_RDWR|O_NONBLOCK)    = 0
inotify_init()                          = -1 EMFILE (Too many open files)
fcntl(-1, F_GETFL)                      = -1 EBADF (Bad file descriptor)
write(2, "Fatal error: exception Unix.Unix"..., 71Fatal error: exception Unix.Unix_error(Unix.EBADF, "set_nonblock", "")) = 71
exit_group(2)                           = ?
+++ exited with 2 +++

I have fiddled around with 'ulimit -n', without lasting success.

The real solution was to change the following:

sysctl -w fs.inotify.max_user_instances=256 >> /etc/sysctl.d/unison.conf

The original OS vendor setting (openSUSE) was 128. With Nextcloud and another sync tool running alongside Unison, my system seems to have used up too many inotify resources.

=> I would appreciate it if you could add a hint somewhere in your documentation that, on Linux, inotify kernel limits may have to be increased if the aforementioned error is encountered. Cf. #803.

@gdt
Collaborator

gdt commented Apr 16, 2024

Can you explain how many inotify resources unison uses? If it's not many, this feels like an openSUSE bug, not a unison bug. You left out the URL where you filed a bug with them to fix the default...

@tleedjarv
Contributor

128 does sound unlikely to be the default. I'd expect something more like 8192, at least.

@gdt this is not so much about Unison using these resources but all applications combined. It's like asking who is at fault for OOM when no process is objectively leaking or wasting memory.

A fix is incoming in 3...2...1...

@gdt
Collaborator

gdt commented Apr 16, 2024

Sure, but if unison uses 4, and a default is 128, and others are piggy, then this is not a unison issue. I would like people to report issues in other software to that software, instead of expecting unison to remediate them.

@tleedjarv
Contributor

I don't think that's how it works. If you have too little RAM then you're going to run out of memory even if all programs work as intended. The same here; it doesn't mean that some software is buggy (but it might); it just means you have limited the resources too much for your intended usage.

Either way, unison-fsmonitor should only call inotify_init(2) once. Whether that counts as 1 "instance", I don't know.
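
For anyone wanting to check this locally: per inotify(7), every successful inotify_init() call creates one instance, and fs.inotify.max_user_instances caps the number of instances per real user ID. Below is a rough Python sketch that counts instances per process by scanning procfs. It assumes Linux, where an inotify descriptor appears in /proc/<pid>/fd as a symlink to "anon_inode:inotify"; the function name count_inotify_instances is made up for illustration, and without root you will only see your own processes.

```python
import os
from collections import Counter


def count_inotify_instances(proc_root="/proc"):
    """Return a Counter mapping PID -> number of open inotify instances.

    Assumption: Linux procfs, where each inotify descriptor is a symlink
    in <proc_root>/<pid>/fd whose target is "anon_inode:inotify".
    """
    counts = Counter()
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        fd_dir = os.path.join(proc_root, entry, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited, or we lack permission
        for fd in fds:
            try:
                target = os.readlink(os.path.join(fd_dir, fd))
            except OSError:
                continue  # descriptor was closed while we were looking
            if target == "anon_inode:inotify":
                counts[int(entry)] += 1
    return counts
```

Calling `count_inotify_instances().most_common(5)` lists the five processes holding the most instances, which should show whether Unison or some other program is the heavy user.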

@Alex-K37
Author

Alex-K37 commented Apr 16, 2024

To avoid misunderstanding: I am not asking to have this treated as a bug, but rather as a request for documentation. I have had issues with this for many months and across multiple distribution upgrades. I do not regularly fiddle with kernel settings; I need to work productively with my system. All of this assumes that something else is also(?) consuming inotify resources, but I do not know what.

The current settings after my intervention on fs.inotify.max_user_instances are:

fs.inotify.max_queued_events = 16384
fs.inotify.max_user_instances = 256
fs.inotify.max_user_watches = 65536

It seems "instances" is something different. I believe "watches" is related to the number of files directly and this default seems sufficient. I found a hint with respect to fs.inotify.max_user_instances somewhere.
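
On the difference between the two limits: per inotify(7), max_user_instances caps how many inotify instances (one per inotify_init() call) a user may create, while max_user_watches caps the total number of watches (roughly, watched files and directories) per user. Here is a minimal Python sketch for reading the current values, assuming the Linux procfs layout under /proc/sys/fs/inotify; read_inotify_limits is a hypothetical helper name.

```python
import os

# Assumption: Linux exposes the inotify limits as one file per setting here.
INOTIFY_DIR = "/proc/sys/fs/inotify"


def read_inotify_limits(base=INOTIFY_DIR):
    """Return {setting_name: int_value} for every file under `base`."""
    limits = {}
    for name in sorted(os.listdir(base)):
        with open(os.path.join(base, name)) as f:
            limits[name] = int(f.read().strip())
    return limits
```

Iterating over `read_inotify_limits().items()` and printing each pair as `fs.inotify.<name> = <value>` gives the same kind of listing as above.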

@gdt
Copy link
Collaborator

gdt commented Apr 16, 2024

Sure, it's like RAM. I just meant that only programs that are piggy about RAM should be expected to talk about it.

How many instances does unison use? If more than a tiny number, is that ok? If a tiny number, then this is not really about unison. As always, I would like issue entries to have clarity, and if they don't, then the mailing list is more appropriate.

@Alex-K37
Author

You might consider it a bug that

inotify_init() = -1 EMFILE (Too many open files)

does not become visible to the user, but only Unix.EBADF on set_nonblock. This is nothing to follow up on for anyone but a programmer.
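
To illustrate the pattern (this is not the code from the fix PR, which is not shown in this thread): check the return value of inotify_init() before handing the descriptor to anything else, so that the original errno is what gets reported. Here is a minimal Python sketch via ctypes, assuming Linux; checked and open_inotify are hypothetical names.

```python
import ctypes
import os


def checked(ret, call_name):
    """Return `ret`, or raise OSError naming the failed call if it is -1."""
    if ret == -1:
        err = ctypes.get_errno()
        raise OSError(err, f"{call_name}: {os.strerror(err)}")
    return ret


def open_inotify():
    """Create an inotify instance, failing at the call that actually failed.

    Assumption: running on Linux, where the C library exports inotify_init().
    """
    libc = ctypes.CDLL(None, use_errno=True)
    return checked(libc.inotify_init(), "inotify_init")
```

With a check like this, hitting the instance limit would surface as an error mentioning inotify_init and "Too many open files", rather than a later EBADF from operating on descriptor -1.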

@gdt changed the title from "Add documentation: Unix.EBADF with unison / unison-fsmonitor - change /proc/sys/fs/inotify/max_user_instances" to "inotify_init() failing does not result in a clear error message" Apr 16, 2024
@gdt added the defect, effort-low, and impact-low labels Apr 16, 2024
@gdt
Collaborator

gdt commented Apr 16, 2024

Sure, but that's different from what this issue asked for. I've retitled it.

@Alex-K37
Author

I still recommend adding a hint to the FAQ about the potential need to increase inotify resources on Linux. It took me months to find out that the message was not really about open files.

I have been quite satisfied with Unison for years, but this behaviour made me look for alternatives.

@tleedjarv
Contributor

> You might consider it a bug that
>
> inotify_init() = -1 EMFILE (Too many open files)
>
> does not become visible to the user, but only Unix.EBADF on set_nonblock. This is nothing to follow up on for anyone but a programmer.

Yes, it absolutely is a bug and there is a PR to fix it.

@gdt closed this as completed in #1020 Apr 16, 2024
@Alex-K37
Author

Alex-K37 commented Apr 17, 2024

Reported at https://bugzilla.opensuse.org/show_bug.cgi?id=1222946

Thank you for taking this seriously and so quickly!

@gdt
Collaborator

gdt commented Apr 17, 2024

Thanks for reporting to openSUSE. It will be interesting to see their reaction. 128 strikes me as enough if programs do not open many instances, which it seems they shouldn't, but I really don't know.

There is now a better error message in unison (merging that closed this issue), so that should help people running into this. It doesn't contain specific remediation hints, which I think is right because those would be necessarily non-portable.


3 participants