Long-running PHP processes: LDAP timeout #4831
Bug initially reported against nextant (nextant issue #175), but it could be caused by a more general user_ldap problem.
For long-running jobs, some LDAP servers (a Samba domain controller in my case) seem to kill the LDAP connection after some time. This connection is not re-established by nextcloud/user_ldap when required.
Possible explanation: during normal operation, most of Nextcloud's PHP processes run under the Apache configuration with a relatively short runtime - typically 30 s - which is most likely not long enough to hit any LDAP timeout. However, Nextcloud's cron.php (when started by system cron) runs with the PHP CLI configuration and unlimited runtime; in this case a reconnect to LDAP may be required.
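The difference in runtime limits is easy to check (a sketch, assuming default configurations; under mod_php/FPM the value is commonly 30, while the CLI SAPI forces it to 0, i.e. unlimited):

```php
<?php
// Print the effective execution limit for the current SAPI.
// Under Apache/FPM this is commonly "30"; the CLI SAPI forces it to "0"
// (unlimited), which is why cron.php can run for days and outlive the
// LDAP server's idle timeout.
echo PHP_SAPI, ': max_execution_time=', ini_get('max_execution_time'), "\n";
```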
As apps are (most likely?) unaware of the actual user backend, the problem cannot be solved by the app developer, but needs to be handled in the user backend itself.
Steps to reproduce
Expected behaviour
Job should finish normally.
Actual behaviour
Job breaks after indexing all files (takes days to finish!)
I did some code research, and the cause of this seems to be the LDAP error handling here:
Looks like an exception is thrown, but nothing catches and handles it.
The LDAP code at this point is a little hard to read and understand for someone not working with it on a daily basis. I had to match the numeric error codes like "-1":
from LDAP header files:
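For reference, these are the client-side (API) result codes as defined in OpenLDAP's ldap.h, mirrored here as PHP constants (names and values taken from the OpenLDAP headers; PHP's ldap extension does not export them itself):

```php
<?php
// Client-side result codes from OpenLDAP's ldap.h, mirrored for reference.
// -1 (LDAP_SERVER_DOWN) is the code seen when the server has closed the
// connection - exactly the situation described above.
const LDAP_SERVER_DOWN    = -1;
const LDAP_LOCAL_ERROR    = -2;
const LDAP_ENCODING_ERROR = -3;
const LDAP_DECODING_ERROR = -4;
const LDAP_TIMEOUT        = -5;
```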
As a side note: it'd be nice to have those constants mentioned in the source for better readability.
This could potentially result in other bugs when paged results are being used: if those are restarted, it's possible to run into a timeout again, and the fun starts over.
Perhaps it makes more sense to set/request a higher timeout from the LDAP server when a background task is running (keyword
Alternatively, can the background job be split into smaller chunks?
The handling of long running paged results may require special treatment, that's right.
I'm not sure whether splitting a background job into smaller chunks will help: as far as I observed, the LDAP connection during "nextant:index" is opened (and used?) at the very beginning, then a long period with no LDAP activity follows - and I guess that's what makes the LDAP server close the connection.
So splitting the background job into smaller chunks would only improve the situation if every chunk was processed by a new process (with a new LDAP connection that won't hit the timeout).
I also thought about changing some LDAP connection related settings, however most of the options I found seem to be OpenLDAP specific (and therefore won't help with other LDAP implementations like my Samba DC LDAP):
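For example, the TCP keepalive options exposed through PHP's ldap extension (available since PHP 7.1 when built against OpenLDAP; hostname and values below are illustrative - these map to libldap's LDAP_OPT_X_KEEPALIVE_* and are ignored by other client libraries):

```php
<?php
// Illustrative only: enable TCP keepalive probes on the client connection
// so a dead link is noticed sooner. OpenLDAP-specific; hostname made up.
$link = ldap_connect('ldap://dc.example.com');
ldap_set_option($link, LDAP_OPT_PROTOCOL_VERSION, 3);
ldap_set_option($link, LDAP_OPT_X_KEEPALIVE_IDLE, 60);     // idle seconds before first probe
ldap_set_option($link, LDAP_OPT_X_KEEPALIVE_PROBES, 3);    // probes before giving up
ldap_set_option($link, LDAP_OPT_X_KEEPALIVE_INTERVAL, 10); // seconds between probes
```

Note that this only keeps the TCP link alive from the client side; a server-side idle-disconnect policy (as Samba may enforce) can still close the session.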
In the end I still think reconnecting is the best solution, even if it may require some extra handling (e.g. for paged results).
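A minimal sketch of what such a reconnect could look like (hypothetical helper, not the actual user_ldap code; assumes the URI and bind credentials are still available):

```php
<?php
// Hypothetical reconnect-and-retry wrapper around ldap_search().
// When the server has dropped the link, ldap_errno() reports -1
// (LDAP_SERVER_DOWN, "Can't contact LDAP server").
function searchWithReconnect(&$link, string $uri, string $dn, string $pw,
                             string $base, string $filter)
{
    $result = @ldap_search($link, $base, $filter);
    if ($result === false && ldap_errno($link) === -1) {
        // Connection was killed by the server: open a fresh one,
        // re-bind, and retry the search once.
        $link = ldap_connect($uri);
        ldap_set_option($link, LDAP_OPT_PROTOCOL_VERSION, 3);
        ldap_bind($link, $dn, $pw);
        $result = @ldap_search($link, $base, $filter);
    }
    return $result;
}
```

Paged results would need extra care on top of this: after a reconnect, the paging cookie from the old connection is no longer valid.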
How do other user backends handle situations like this?
@daita you don't. Try to set $resource to null before https://github.com/nextcloud/server/blob/master/apps/user_ldap/lib/LDAP.php#L333. Not sure whether it works, and as stated above this might create more issues without further adjustments. For you it's probably okay, since you don't interact much with LDAP, as I interpret it.
I've been running a similar test overnight:
Instead of setting $resource to null I replaced the exception with a simple debug warning:
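Roughly along these lines (a paraphrased sketch, not the actual diff; it assumes the throw of ServerNotAvailableException in apps/user_ldap/lib/LDAP.php is the spot in question):

```php
<?php
// Sketch only: where LDAP.php previously did
//     throw new ServerNotAvailableException('Lost connection to LDAP server.');
// log a debug message instead and let the caller carry on:
\OCP\Util::writeLog('user_ldap', 'Lost connection to LDAP server', \OCP\Util::DEBUG);
```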
"occ nextant:index" now finishes without (obvious) error.