Crash due to number of open files #132

Closed
diseq opened this issue Oct 1, 2021 · 7 comments

diseq commented Oct 1, 2021

headscale serve seems to accumulate open file descriptors until it crashes.
What number of open files should be expected?

btw. headscale is awesome work!!

Version: 0.9.2

Oct 01 13:18:50 server-1 headscale[8588]: 2021-10-01T13:18:50Z ERR Error accessing db error="unable to open database file: too many open files"
Oct 01 13:18:50 server-1 headscale[8588]: 2021-10-01T13:18:50Z ERR Cannot fetch peers error="unable to open database file: too many open files" func=getMapResponse

for pid in /proc/[0-9]*; do printf "PID %6d has %4d FDs\n" $(basename $pid) $(ls $pid/fd | wc -l); done

PID   8588 has 1024 FDs

ps -ef |grep 8588

headsca+    8588       1 13 Sep30 ?        03:02:09 /usr/sbin/headscale serve
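To see what kind of descriptors are piling up, and which limit the process is hitting, the /proc entries for the same PID can be inspected directly (a quick sketch; adjust the PID as needed):

ls -l /proc/8588/fd | head -n 20            # are the open entries sockets, the sqlite file, pipes, ...?
grep 'Max open files' /proc/8588/limits     # the per-process soft/hard limit

The 1024 FDs counted above matches the usual Linux default soft limit of 1024, which would explain the crash at exactly that point.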

Config

{
    "server_url": "https://server.domain.com",
    "listen_addr": "0.0.0.0:8080",
    "private_key_path": "/etc/headscale/private.key",
    "derp_map_path": "/etc/headscale/derp.yaml",
    "ephemeral_node_inactivity_timeout": "30m",
    "db_type": "sqlite3",
    "db_path": "/mnt/data/headscale/db.sqlite",
    "tls_letsencrypt_hostname": "",
    "tls_letsencrypt_listen": ":http",
    "tls_letsencrypt_cache_dir": ".cache",
    "tls_letsencrypt_challenge_type": "HTTP-01",
    "tls_cert_path": "",
    "tls_key_path": "",
    "acl_policy_path": "/mnt/data/headscale/policy.hujson",
    "dns_config": {
        "nameservers": [
            "1.1.1.1"
        ]
    }
}
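As a stopgap while the leak itself is tracked down, the per-service descriptor limit can be raised with a systemd drop-in. This is only a sketch; the unit name, drop-in path and limit value are assumptions about this setup, not taken from the thread:

# /etc/systemd/system/headscale.service.d/limits.conf
[Service]
LimitNOFILE=65535

# then reload and restart
systemctl daemon-reload
systemctl restart headscale

This doesn't fix the underlying leak, it only delays hitting "too many open files".
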
qbit (Contributor) commented Oct 1, 2021

I hit this as well. Mine is fronted by nginx. I meant to check whether nginx or headscale is misbehaving, but I haven't had time to track it down.

diseq (Author) commented Oct 1, 2021

Same here, nginx in front.
It seems to happen after ephemeral nodes are added. Not sure if this is related; it might be a coincidence.

Running the same command repeatedly over time:

for pid in /proc/[0-9]*; do printf "PID %6d has %4d FDs\n" $(basename $pid) $(ls $pid/fd | wc -l); done | grep 78809

PID  78809 has   10 FDs
PID  78809 has   10 FDs
PID  78809 has   10 FDs
PID  78809 has   10 FDs
PID  78809 has   14 FDs
PID  78809 has   15 FDs
PID  78809 has   16 FDs
PID  78809 has   19 FDs
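The sampling above can also be left running unattended; a minimal sketch assuming the same PID and GNU coreutils:

while true; do echo "$(date -Is) $(ls /proc/78809/fd | wc -l)"; sleep 60; done >> headscale-fds.log

That produces a timestamped log of the FD count, which makes it easier to correlate growth with ephemeral nodes joining.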

juanfont (Owner) commented Oct 1, 2021

@qbit is it also happening for you with ephemeral nodes?

qbit (Contributor) commented Oct 1, 2021

I am not using pre-auth keys, so I don't think it is.

qbit (Contributor) commented Oct 4, 2021

I switched to a non-nginx configuration and things seem happy. I wonder if tweaking the nginx timeouts would resolve this?

Maybe proxy_connect_timeout 300; or something in the location block?

(I'll try to test the above, but it might be a bit before I can.)
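For reference, here is what such a location block might look like, with longer timeouts and the connection-upgrade headers long-lived requests need. The upstream address and timeout values are assumptions (headscale listens on port 8080 in the config above), not a tested configuration:

location / {
    proxy_pass http://127.0.0.1:8080;
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection $http_connection;
    proxy_set_header Host $host;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_connect_timeout 300;
    proxy_send_timeout 300;
    proxy_read_timeout 300;
}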

diseq (Author) commented Oct 7, 2021

0.9.3 seems to have resolved the accumulating open fds.
I'm testing on the same configuration, so I can rule out other changes.

Will leave it running for some time.
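For anyone double-checking the same thing, the running version and the current FD count can be compared before and after the upgrade (assuming a single headscale serve process):

headscale version
ls /proc/$(pgrep -f 'headscale serve')/fd | wc -l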

diseq (Author) commented Oct 8, 2021

The issue is gone. Thanks!

diseq closed this as completed Oct 8, 2021