2.10.0 unable to start on clean install #2753

Open
troykelly opened this issue Mar 27, 2023 · 136 comments

@troykelly
Contributor

Checklist

  • Have you pulled and found the error with jc21/nginx-proxy-manager:latest docker image?
    • Yes / No
  • Are you sure you're not using someone else's docker image?
    • Yes / No
  • Have you searched for similar issues (both open and closed)?
    • Yes / No

Describe the bug

The :latest and 2.10.0 images fail to start, both with an existing configuration and on a clean install.

Nginx Proxy Manager Version

2.10.0

To Reproduce
Steps to reproduce the behavior:

  1. Start a container
  2. Watch it fail

Expected behavior

The container should start

Screenshots

➜  lb-pi003 docker compose up -d && docker compose logs -f app
[+] Running 3/3
 ⠿ Network lb-pi003_default  Created                                                                                                                                                        0.8s
 ⠿ Container lb-pi003-db-1   Started                                                                                                                                                       27.7s
 ⠿ Container lb-pi003-app-1  Started                                                                                                                                                       18.7s
lb-pi003-app-1  | s6-rc: info: service s6rc-oneshot-runner: starting
lb-pi003-app-1  | s6-rc: info: service s6rc-oneshot-runner successfully started
lb-pi003-app-1  | s6-rc: info: service fix-attrs: starting
lb-pi003-app-1  | s6-rc: info: service fix-attrs successfully started
lb-pi003-app-1  | s6-rc: info: service legacy-cont-init: starting
lb-pi003-app-1  | s6-rc: info: service legacy-cont-init successfully started
lb-pi003-app-1  | s6-rc: info: service prepare: starting
lb-pi003-app-1  | ❯ Configuring npmuser ...
lb-pi003-app-1  | id: 'npmuser': no such user
lb-pi003-app-1  | ❯ Checking paths ...
lb-pi003-app-1  | ❯ Setting ownership ...
lb-pi003-app-1  | s6-rc: fatal: timed out
lb-pi003-app-1  | s6-sudoc: fatal: unable to get exit status from server: Operation timed out
lb-pi003-app-1  | /run/s6/basedir/scripts/rc.init: warning: s6-rc failed to properly bring all the services up! Check your logs (in /run/uncaught-logs/current if you have in-container logging) for more information.

Operating System

Rpi

Additional context

@troykelly troykelly added the bug label Mar 27, 2023
@troykelly
Contributor Author

I'm assuming this is different from #2734 because the same error occurs on both a clean install and an existing install (and it isn't resolved by a restart, as it was for the original issue poster).

@Pacogens

I have the same problem on a host running OpenMediaVault. On another host running Ubuntu Server I have no problem.

@tristanXme

I have a similar issue on multiple hosts:

s6-rc: info: service s6rc-oneshot-runner: starting
s6-rc: info: service s6rc-oneshot-runner successfully started
s6-rc: info: service fix-attrs: starting
s6-rc: info: service fix-attrs successfully started
s6-rc: info: service legacy-cont-init: starting
s6-rc: info: service legacy-cont-init successfully started
s6-rc: info: service prepare: starting
❯ Configuring npmuser ...
id: 'npmuser': no such user
useradd: UID 0 is not unique
s6-rc: warning: unable to start service prepare: command exited 1
/run/s6/basedir/scripts/rc.init: warning: s6-rc failed to properly bring all the services up! Check your logs (in /run/uncaught-logs/current if you have in-container logging) for more information.

@nitro424

nitro424 commented Mar 27, 2023

After updating from 2.9.22 to 2.10.0 on my Synology DS it failed to start:

nginx: [emerg] bind() to 0.0.0.0:80 failed (13: Permission denied)

I did a fresh new install with minimal configuration and got the error:

id: 'npmuser': no such user
s6-rc: fatal: timed out
s6-sudoc: fatal: unable to get exit status from server: Operation timed out

Rolling back to 2.9.22 fixed the issue.

2.10.0 works on my laptop (Pop OS).
Synology OS has no user with ID 1000. Maybe that's a hint.
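
To check whether a given UID already exists on the host (a quick diagnostic, not part of any fix), something like this works:

# prints the matching passwd entry, or nothing if the UID is free
getent passwd 1000
id -nu 1000 2>/dev/null || echo "no user with UID 1000"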

@jicho

jicho commented Mar 27, 2023

When I do a Portainer recreate including "re-pull image", I'm getting the error:

s6-rc: info: service s6rc-oneshot-runner: starting
s6-rc: info: service s6rc-oneshot-runner successfully started
s6-rc: info: service fix-attrs: starting
s6-rc: info: service fix-attrs successfully started
s6-rc: info: service legacy-cont-init: starting
s6-rc: info: service legacy-cont-init successfully started
s6-rc: info: service prepare: starting
❯ Configuring npmuser ...
id: 'npmuser': no such user
s6-rc: fatal: timed out
s6-sudoc: fatal: unable to get exit status from server: Operation timed out
/run/s6/basedir/scripts/rc.init: warning: s6-rc failed to properly bring all the services up! Check your logs (in /run/uncaught-logs/current if you have in-container logging) for more information.

I'm running on jc21/nginx-proxy-manager:2

Back to 2.9.22 "solves" the problem for now :)

@jk-andersen

jk-andersen commented Mar 27, 2023

Can confirm this issue on Synology for me. Rolling back to 2.9.22 worked.

@adammau2

adammau2 commented Mar 27, 2023

Hi @jicho, I also rolled back to 2.9.22 but got this log, and the login shows a Bad Gateway. Did you get that log too?

proxy-manager-app-1 | [3/27/2023] [8:17:30 AM] [Global ] › ✖ error create table migrations (id int unsigned not null auto_increment primary key, name varchar(255), batch int, migration_time timestamp) - ER_CANT_CREATE_TABLE: Can't create table proxy-mgr.migrations (errno: 13 "Permission denied")

@jicho

jicho commented Mar 27, 2023

Hi @jicho , I also rolled back to 2.9.22 but got this log, and the login has a Bad Gateway. did you get that log too?

proxy-manager-app-1 | [3/27/2023] [8:17:30 AM] [Global ] › ✖ error create table migrations (id int unsigned not null auto_increment primary key, name varchar(255), batch int, migration_time timestamp) - ER_CANT_CREATE_TABLE: Can't create table proxy-mgr.migrations (errno: 13 "Permission denied")

Hi @adammau2, after going back to tag 2.9.22 I had no issues at all.
I can log in without any issues.

Some more info:

  • I run NPM with a SQLite db
  • I'm running NPM on a Synology NAS, but do stuff (most of the time) with portainer.

@Adrianos712

Hi, same issue here. Rolling back to 2.9.22 did the job for now...

@Reupireup

Same for me, running on arm7

@dietrichmd

dietrichmd commented Mar 27, 2023

Same issue here. Ubuntu 22.04 LTS (docker). Confirmed fix on rollback to 2.9.22

@yurividal

Same issue on Ubuntu. Confirmed rollback works fine.

@taimadoCE

Same on an Arm7.
Back to 2.9.22.

@rwood

rwood commented Mar 27, 2023

Ditto. 2.10.0 has the error "'npmuser': no such user" and will not start. Switch back to 2.9.22, and everything works.
Host Kernel: Linux 5.19.9-Unraid x86_64

@siancu

siancu commented Mar 27, 2023

Same for me on Synology. Switch back to 2.9.22, it works!

@dglueckstadt

dglueckstadt commented Mar 27, 2023

Same for me on Synology DSM 6.2.4
Switching back to 2.9.22 works, but I can't log in to the dashboard.
User/password invalid.
Last login was on Sat 2023-03-25 with no problems.
Was something changed in the database tables?

@wolfiiy

wolfiiy commented Mar 27, 2023

Same problem on Debian (Docker). 2.9.22 works and I can log into the dashboard without any issue.

@Martydog

Same on Synology, rolling back to 2.9.22 fixed it for now.

@ptC7H12

ptC7H12 commented Mar 27, 2023

Same on Unraid, rolling back to 2.9.22 fixed it.

@pifou25

pifou25 commented Mar 27, 2023

Hi, same for me with Debian Bullseye on an RPi 3. Rolling back to 2.9.22 also fixed the issue.

@jc21
Member

jc21 commented Mar 27, 2023

For the s6-rc: fatal: timed out errors, which are the main subject of this issue, I've put a fix up and it's available in the github-develop docker tag. Can you please try that and let me know if you get further?
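
For anyone testing, this is just a tag change and a re-pull; roughly (assuming a compose service named app, as in the compose examples in this thread):

# in docker-compose.yml:
#   image: 'jc21/nginx-proxy-manager:github-develop'
docker compose pull app
docker compose up -d app
docker compose logs -f app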

@nitro424

nitro424 commented Mar 28, 2023

s6-rc: info: service s6rc-oneshot-runner: starting
s6-rc: info: service s6rc-oneshot-runner successfully started
s6-rc: info: service fix-attrs: starting
s6-rc: info: service fix-attrs successfully started
s6-rc: info: service legacy-cont-init: starting
s6-rc: info: service legacy-cont-init successfully started
s6-rc: info: service prepare: starting
❯ Configuring npmuser ...
id: 'npmuser': no such user
s6-rc: fatal: timed out
s6-sudoc: fatal: unable to get exit status from server: Operation timed out
/run/s6/basedir/scripts/rc.init: warning: s6-rc failed to properly bring all the services up! Check your logs (in /run/uncaught-logs/current if you have in-container logging) for more information.

compose file

version: "3"
services:
  app:
    image: 'jc21/nginx-proxy-manager:github-develop'
    restart: unless-stopped
    ports:
      # These ports are in format <host-port>:<container-port>
      - '8093:80' # Public HTTP Port
      - '8094:443' # Public HTTPS Port
      - '8095:81' # Admin Web Port

on latest Synology DSM

@jc21
Member

jc21 commented Mar 28, 2023

@nitro424 pull and try again please?

@Emeriz-M

Emeriz-M commented Mar 28, 2023

Same issue for me on Synology with latest DSM.

nginx: [emerg] bind() to 0.0.0.0:80 failed (13: Permission denied)

Rollback to 2.9.22 resolved for now as well.

@codysnider

Same for Debian 10 with Docker; rolling back to 2.9.22 fixed it.

@jicho

jicho commented Mar 28, 2023

@jc21 when I change the tag to github-develop in Portainer I get the following after updating (this is on Synology):

s6-rc: info: service s6rc-oneshot-runner: starting
s6-rc: info: service s6rc-oneshot-runner successfully started
s6-rc: info: service fix-attrs: starting
s6-rc: info: service fix-attrs successfully started
s6-rc: info: service legacy-cont-init: starting
s6-rc: info: service legacy-cont-init successfully started
s6-rc: info: service prepare: starting
❯ Configuring npmuser ...
id: 'npmuser': no such user
❯ Checking paths ...
❯ Setting ownership ...
❯ Dynamic resolvers ...
❯ IPv6 ...
Enabling IPV6 in hosts in: /etc/nginx/conf.d
s6-rc: fatal: timed out
s6-sudoc: fatal: unable to get exit status from server: Operation timed out
/run/s6/basedir/scripts/rc.init: warning: s6-rc failed to properly bring all the services up! Check your logs (in /run/uncaught-logs/current if you have in-container logging) for more information.
- /etc/nginx/conf.d/default.conf
- /etc/nginx/conf.d/include/assets.conf
- /etc/nginx/conf.d/include/block-exploits.conf
- /etc/nginx/conf.d/include/force-ssl.conf

After a complete container restart I get:

- /etc/nginx/conf.d/default.conf
Enabling IPV6 in hosts in: /data/nginx
- /data/nginx/default_host/site.conf
- /data/nginx/proxy_host/4.conf
- /data/nginx/proxy_host/5.conf
- /data/nginx/proxy_host/3.conf
- /data/nginx/proxy_host/18.conf
- /data/nginx/proxy_host/6.conf
- /data/nginx/proxy_host/2.conf
- /data/nginx/proxy_host/17.conf
- /data/nginx/redirection_host/1.conf
❯ Docker secrets ...
-------------------------------------
 _   _ ____  __  __
| \ | |  _ \|  \/  |
|  \| | |_) | |\/| |
| |\  |  __/| |  | |
|_| \_|_|   |_|  |_|
-------------------------------------
User UID: 911
User GID: 911
-------------------------------------
s6-rc: info: service prepare successfully started
s6-rc: info: service nginx: starting
s6-rc: info: service frontend: starting
s6-rc: info: service backend: starting
s6-rc: info: service nginx successfully started
s6-rc: info: service backend successfully started
❯ Starting nginx ...
s6-rc: info: service frontend successfully started
❯ Starting backend ...
s6-rc: info: service legacy-services: starting
s6-rc: info: service legacy-services successfully started
nginx: [emerg] bind() to 0.0.0.0:80 failed (13: Permission denied)
❯ Starting nginx ...
nginx: [emerg] bind() to 0.0.0.0:80 failed (13: Permission denied)
[3/28/2023] [7:59:11 AM] [Global   ] › ℹ  info      Using Sqlite: /data/database.sqlite
[3/28/2023] [7:59:11 AM] [Global   ] › ℹ  info      Creating a new JWT key pair...
❯ Starting nginx ...
nginx: [emerg] bind() to 0.0.0.0:80 failed (13: Permission denied)

In both situations I can't access any of my sites; when I go back to 2.9.22 everything is back to normal again.

It looks like the user UID/GID causes problems when you leave this setting at its default in the config/environment variables.
This is all I could test quickly, hope it helps!
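
A quick way to check which UID/GID the container actually ended up with (the container name is a placeholder, and the service user is called npmuser in 2.10.0 but npm in later builds):

docker exec <npm-container> id npm || docker exec <npm-container> id npmuser
docker exec <npm-container> ls -ldn /data /etc/letsencrypt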

@jc21
Member

jc21 commented Mar 28, 2023

@jicho Nothing has changed from the port-number side of things; if 2.9.22 could start listening on that port previously, then it should be fine for 2.10.0 to do so :/ Does port 81 work for the admin interface?

@jicho

jicho commented Mar 28, 2023

@jc21 When I change the tag back to github-develop in Portainer, the first run breaks (it just doesn't start):

s6-rc: info: service s6rc-oneshot-runner: starting
s6-rc: info: service s6rc-oneshot-runner successfully started
s6-rc: info: service fix-attrs: starting
s6-rc: info: service fix-attrs successfully started
s6-rc: info: service legacy-cont-init: starting
s6-rc: info: service legacy-cont-init successfully started
s6-rc: info: service prepare: starting
❯ Configuring npmuser ...
id: 'npmuser': no such user
s6-rc: fatal: timed out
s6-sudoc: fatal: unable to get exit status from server: Operation timed out
/run/s6/basedir/scripts/rc.init: warning: s6-rc failed to properly bring all the services up! Check your logs (in /run/uncaught-logs/current if you have in-container logging) for more information.

So after a restart I'm getting the nginx: [emerg] bind() to 0.0.0.0:80 failed (13: Permission denied) error.

When I go to port 81, Safari tells me that it can't connect.

It's the same when I do a stop / start in Portainer.

Logs are the same:

-------------------------------------
 _   _ ____  __  __
| \ | |  _ \|  \/  |
|  \| | |_) | |\/| |
| |\  |  __/| |  | |
|_| \_|_|   |_|  |_|
-------------------------------------
User UID: 911
User GID: 911
-------------------------------------
s6-rc: info: service prepare successfully started
s6-rc: info: service nginx: starting
s6-rc: info: service frontend: starting
s6-rc: info: service backend: starting
s6-rc: info: service frontend successfully started
s6-rc: info: service backend successfully started
s6-rc: info: service nginx successfully started
s6-rc: info: service legacy-services: starting
❯ Starting nginx ...
❯ Starting backend ...
s6-rc: info: service legacy-services successfully started
nginx: [emerg] bind() to 0.0.0.0:80 failed (13: Permission denied)
❯ Starting nginx ...
nginx: [emerg] bind() to 0.0.0.0:80 failed (13: Permission denied)
[3/28/2023] [9:27:40 AM] [Global   ] › ℹ  info      Using Sqlite: /data/database.sqlite
❯ Starting nginx ...
nginx: [emerg] bind() to 0.0.0.0:80 failed (13: Permission denied)
❯ Starting nginx ...
nginx: [emerg] bind() to 0.0.0.0:80 failed (13: Permission denied)
❯ Starting nginx ...
nginx: [emerg] bind() to 0.0.0.0:80 failed (13: Permission denied)
[3/28/2023] [9:27:44 AM] [Migrate  ] › ℹ  info      Current database version: none
❯ Starting nginx ...
nginx: [emerg] bind() to 0.0.0.0:80 failed (13: Permission denied)
❯ Starting nginx ...
nginx: [emerg] bind() to 0.0.0.0:80 failed (13: Permission denied)
❯ Starting nginx ...
nginx: [emerg] bind() to 0.0.0.0:80 failed (13: Permission denied)
❯ Starting nginx ...
nginx: [emerg] bind() to 0.0.0.0:80 failed (13: Permission denied)
❯ Starting nginx ...
nginx: [emerg] bind() to 0.0.0.0:80 failed (13: Permission denied)
❯ Starting nginx ...
nginx: [emerg] bind() to 0.0.0.0:80 failed (13: Permission denied)
❯ Starting nginx ...
nginx: [emerg] bind() to 0.0.0.0:80 failed (13: Permission denied)
❯ Starting nginx ...
nginx: [emerg] bind() to 0.0.0.0:80 failed (13: Permission denied)
❯ Starting nginx ...
nginx: [emerg] bind() to 0.0.0.0:80 failed (13: Permission denied)
❯ Starting nginx ...
nginx: [emerg] bind() to 0.0.0.0:80 failed (13: Permission denied)
[3/28/2023] [9:27:56 AM] [Setup    ] › ℹ  info      Added Certbot plugins certbot-dns-cloudflare==$(certbot --version | grep -Eo '[0-9](\.[0-9]+)+') cloudflare
[3/28/2023] [9:27:56 AM] [Setup    ] › ℹ  info      Logrotate Timer initialized
❯ Starting nginx ...
[3/28/2023] [9:27:56 AM] [Setup    ] › ℹ  info      Logrotate completed.
[3/28/2023] [9:27:56 AM] [IP Ranges] › ℹ  info      Fetching IP Ranges from online services...
[3/28/2023] [9:27:56 AM] [IP Ranges] › ℹ  info      Fetching https://ip-ranges.amazonaws.com/ip-ranges.json
nginx: [emerg] bind() to 0.0.0.0:80 failed (13: Permission denied)
[3/28/2023] [9:27:57 AM] [IP Ranges] › ℹ  info      Fetching https://www.cloudflare.com/ips-v4
[3/28/2023] [9:27:57 AM] [IP Ranges] › ℹ  info      Fetching https://www.cloudflare.com/ips-v6
❯ Starting nginx ...
[3/28/2023] [9:27:57 AM] [SSL      ] › ℹ  info      Let's Encrypt Renewal Timer initialized
[3/28/2023] [9:27:57 AM] [SSL      ] › ℹ  info      Renewing SSL certs close to expiry...
[3/28/2023] [9:27:57 AM] [IP Ranges] › ℹ  info      IP Ranges Renewal Timer initialized
[3/28/2023] [9:27:57 AM] [Global   ] › ℹ  info      Backend PID 145 listening on port 3000 ...
nginx: [emerg] bind() to 0.0.0.0:80 failed (13: Permission denied)
❯ Starting nginx ...
nginx: [emerg] bind() to 0.0.0.0:80 failed (13: Permission denied)
[3/28/2023] [9:27:59 AM] [SSL      ] › ✖  error     Error: Command failed: /usr/sbin/nginx -t -g "error_log off;" 
nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
nginx: [emerg] open() "/etc/nginx/nginx/off" failed (13: Permission denied)
nginx: configuration file /etc/nginx/nginx.conf test failed
    at ChildProcess.exithandler (node:child_process:402:12)
    at ChildProcess.emit (node:events:513:28)
    at maybeClose (node:internal/child_process:1100:16)
    at Socket.<anonymous> (node:internal/child_process:458:11)
    at Socket.emit (node:events:513:28)
    at Pipe.<anonymous> (node:net:301:12)
❯ Starting nginx ...
nginx: [emerg] bind() to 0.0.0.0:80 failed (13: Permission denied)
❯ Starting nginx ...
nginx: [emerg] bind() to 0.0.0.0:80 failed (13: Permission denied)
❯ Starting nginx ...
nginx: [emerg] bind() to 0.0.0.0:80 failed (13: Permission denied)
❯ Starting nginx ...
nginx: [emerg] bind() to 0.0.0.0:80 failed (13: Permission denied)
❯ Starting nginx ...
nginx: [emerg] bind() to 0.0.0.0:80 failed (13: Permission denied)
❯ Starting nginx ...
nginx: [emerg] bind() to 0.0.0.0:80 failed (13: Permission denied)
❯ Starting nginx ...
nginx: [emerg] bind() to 0.0.0.0:80 failed (13: Permission denied)
❯ Starting nginx ...
nginx: [emerg] bind() to 0.0.0.0:80 failed (13: Permission denied)
❯ Starting nginx ...
nginx: [emerg] bind() to 0.0.0.0:80 failed (13: Permission denied)

Back to 2.9.22 (just a tag change) makes everything work again...

Okay... another test... I'm using the tag 2.10.0, the logs are the same.
This time I removed my MacVLAN and kept the bridge connection.

I'm still getting nginx: [emerg] bind() to 0.0.0.0:80 failed (13: Permission denied)

As soon as I'm back to 2.9.22 everything is back to normal :)
Even when I connect my container to both macvlan and bridge.

@rymancl

rymancl commented May 4, 2023

Ah yep can you add DEBUG=true to the docker environment variables?

I'm not who your comment was directed at, but even with DEBUG=true I don't get verbose logs on Synology.
On-boot failure to start (screenshot omitted). The compose service definition:

    image: jc21/nginx-proxy-manager:github-s6-verbose
    container_name: nginx_proxy_manager
    profiles:
      - all
      - core
    network_mode: synobridge
    environment:
      - TZ=America/New_York
      - PUID=0
      - PGID=0
      - DEBUG=true
      # - S6_CMD_WAIT_FOR_SERVICES_MAXTIME=60000
    ports:
      - "8341:80"
      - "81:81"
      - "8766:443"
    volumes:
      - /volume1/docker/npm/config.json:/app/config/production.json
      - /volume1/docker/npm/data:/data
      - /volume1/docker/npm/letsencrypt:/etc/letsencrypt
    restart: unless-stopped

@Sungray

Sungray commented May 4, 2023

Here it is with DEBUG=true; the timeout happens on /etc/letsencrypt:

s6-rc: info: service s6rc-oneshot-runner: starting
s6-rc: info: service s6rc-oneshot-runner successfully started
s6-rc: info: service fix-attrs: starting
s6-rc: info: service fix-attrs successfully started
s6-rc: info: service legacy-cont-init: starting
s6-rc: info: service legacy-cont-init successfully started
s6-rc: info: service prepare: starting
+ . /etc/s6-overlay/s6-rc.d/prepare/10-usergroup.sh
++ set -e
++ log_info 'Configuring npm user ...'
++ echo -e '\E[1;34m❯ \E[1;36mConfiguring npm user ...\E[0m'
++ id -u npm
++ useradd -o -u 99 -U -d /tmp/npmuserhome -s /bin/false npm
++ log_info 'Configuring npm group ...'
++ echo -e '\E[1;34m❯ \E[1;36mConfiguring npm group ...\E[0m'
+++ get_group_id npm
+++ '[' npm '!=' '' ']'
+++ getent group npm
+++ cut -d: -f3
++ '[' 1000 = '' ']'
++ groupmod -o -g 100 npm
++ groupmod -o -g 100 npm
+++ get_group_id npm
+++ '[' npm '!=' '' ']'
+++ getent group npm
+++ cut -d: -f3
++ '[' 100 '!=' 100 ']'
++ usermod -G 100 npm
+++ id -g npm
++ '[' 100 '!=' 100 ']'
++ mkdir -p /tmp/npmuserhome
++ chown -R 99:100 /tmp/npmuserhome
+ . /etc/s6-overlay/s6-rc.d/prepare/20-paths.sh
++ set -e
++ log_info 'Checking paths ...'
++ echo -e '\E[1;34m❯ \E[1;36mChecking paths ...\E[0m'
++ '[' '!' -d /data ']'
++ '[' '!' -d /etc/letsencrypt ']'
++ mkdir -p /data/nginx /data/custom_ssl /data/logs /data/access /data/nginx/default_host /data/nginx/default_www /data/nginx/proxy_host /data/nginx/redirection_host /data/nginx/stream /data/nginx/dead_host /data/nginx/temp /data/letsencrypt-acme-challenge /run/nginx /tmp/nginx/body /var/log/nginx /var/lib/nginx/cache/public /var/lib/nginx/cache/private /var/cache/nginx/proxy_temp
++ touch /var/log/nginx/error.log
++ chmod 777 /var/log/nginx/error.log
++ chmod -R 777 /var/cache/nginx
++ chmod 644 /etc/logrotate.d/nginx-proxy-manager
+ . /etc/s6-overlay/s6-rc.d/prepare/30-ownership.sh
++ set -e
++ log_info 'Setting ownership ...'
++ echo -e '\E[1;34m❯ \E[1;36mSetting ownership ...\E[0m'
++ chown root /tmp/nginx
++ chown -R 99:100 /data
++ chown -R 99:100 /etc/letsencrypt
s6-rc: fatal: timed out
s6-sudoc: fatal: unable to get exit status from server: Operation timed out
/run/s6/basedir/scripts/rc.init: warning: s6-rc failed to properly bring all the services up! Check your logs (in /run/uncaught-logs/current if you have in-container logging) for more information.
❯ Configuring npm user ...
❯ Configuring npm group ...
❯ Checking paths ...
❯ Setting ownership ...

@Sungray

Sungray commented May 4, 2023

I noticed that there are 23,398 files in both the /etc/letsencrypt/csr and /etc/letsencrypt/keys folders. I'm guessing the chown isn't fast enough: it has a set timeout of around 1000 ms, which isn't enough for that many files. How can I clean these folders safely?
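
To check whether you're in the same situation, counting the files inside the container is enough (container name is a placeholder):

docker exec <npm-container> sh -c 'ls /etc/letsencrypt/csr | wc -l; ls /etc/letsencrypt/keys | wc -l'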

@jc21
Member

jc21 commented May 4, 2023

@rymancl yes, in fact I am seeing more information in your output, as expected.

@Sungray you can run cert-prune inside the docker container to clean up those archived files. Just be sure to back up your letsencrypt folder first.
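
A minimal sketch of that, assuming the compose service is named app and the letsencrypt folder is bind-mounted at ./letsencrypt:

# back up the letsencrypt data first
tar czf letsencrypt-backup-$(date +%F).tar.gz ./letsencrypt
# then prune the archived certbot files inside the container
docker compose exec app cert-prune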

@Sungray

Sungray commented May 4, 2023

@Sungray you can run cert-prune inside the docker container to clean up those archived files. Just be sure to back up your letsencrypt folder first.

Yes, cert-prune fixed the problem for me. It deleted around 500 stale certificates, 23k csr and 23k keys. There must have been a big issue at some point that went unnoticed, and the change to 2.10.0 introduced new timeouts on chown which were not handled fast enough.

Now all is good. Thank you!

@rymancl

rymancl commented May 12, 2023

Cross-posting this from #2750


With v2.10.3, npm is now working perfectly again on my Synology. 🎉
I removed

- PUID=0
- PGID=0

from my env vars and that's it.

I tested a fresh install and several server reboots and npm didn't have any issues starting up anymore.
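
For reference, that just means the environment block from the compose snippet earlier in this thread shrinks to something like this (TZ kept, PUID/PGID dropped):

    environment:
      - TZ=America/New_York
      # PUID=0 / PGID=0 removed - let the image manage its own user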

@janaxhell

janaxhell commented May 19, 2023

@jc21

@janaxhell pull the new build of the github-s6-verbose tag and try again and let me know how that goes

Hi, it looks to me like 2.10.3 is working fine now on an Orange Pi 3 LTS. After a reboot it starts normally. Unfortunately I see that with :github-s6-verbose the logs have grown huge, 4 GB+, and my whole internal storage is 8 GB. How do I get rid of those huge logs? I am on :latest now.

@anto294

anto294 commented May 21, 2023

Hello,
I have a strange issue.
My npm gets stuck at setting ownership, exactly on "chown -R 1000:1000 /opt/certbot".
The OS is TrueNAS Scale, but I use the plain Docker :latest image (not the ix-sys version).

In fact, if I run that command from the shell, it never ends.

If I quickly run these commands after deploy, npm starts fine:

rm -rf /etc/s6-overlay/s6-rc.d/prepare/30-ownership.sh
touch /etc/s6-overlay/s6-rc.d/prepare/30-ownership.sh
chown -R "1000:1000" /data
chown -R "1000:1000" /etc/letsencrypt
chown -R "1000:1000" /run/nginx
chown -R "1000:1000" /tmp/nginx
chown -R "1000:1000" /var/cache/nginx
chown -R "1000:1000" /var/lib/logrotate
chown -R "1000:1000" /var/lib/nginx
chown -R "1000:1000" /var/log/nginx
chown -R "1000:1000" /etc/nginx/nginx
chown -R "1000:1000" /etc/nginx/nginx.conf
chown -R "1000:1000" /etc/nginx/conf.d

EDIT

chown -R  1000:1000  /opt/certbot -v

logs one file per second, so the deploy times out.
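
The same workaround can be scripted from the host right after the container starts (container name is a placeholder; emptying the script has the same effect as the rm/touch above):

docker exec <npm-container> sh -c '
  : > /etc/s6-overlay/s6-rc.d/prepare/30-ownership.sh
  chown -R 1000:1000 /data /etc/letsencrypt /run/nginx /tmp/nginx \
    /var/cache/nginx /var/lib/logrotate /var/lib/nginx /var/log/nginx \
    /etc/nginx/nginx /etc/nginx/nginx.conf /etc/nginx/conf.d
'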

@TristanHarms

TristanHarms commented May 21, 2023

Hello, I have a strange issue. My npm stuck at setting ownership, exactly on "chown -R 1000:1000 /opt/certbot". The OS is TrueNAS scale, but I use direct docker :latest image (not ix-sys version).

In fact, if I run that command from the shell, it never ends.

If I do quickly after deploy this commands, npm start well. rm -rf /etc/s6-overlay/s6-rc.d/prepare/30-ownership.sh touch /etc/s6-overlay/s6-rc.d/prepare/30-ownership.sh chown -R "1000:1000" /data chown -R "1000:1000" /etc/letsencrypt chown -R "1000:1000" /run/nginx chown -R "1000:1000" /tmp/nginx chown -R "1000:1000" /var/cache/nginx chown -R "1000:1000" /var/lib/logrotate chown -R "1000:1000" /var/lib/nginx chown -R "1000:1000" /var/log/nginx chown -R "1000:1000" /etc/nginx/nginx chown -R "1000:1000" /etc/nginx/nginx.conf chown -R "1000:1000" /etc/nginx/conf.d

EDIT "chown -R 1000:1000 /opt/certbot -v" log 1 file per second, so the deploy go timeout

I can confirm that this is exactly the same issue I've been seeing from my side as well.
Also running TrueNAS Scale, broken since 2.10.x.

@Sungray

Sungray commented May 21, 2023

@anto294 @TristanHarms did you guys try the cert-prune command in the container? For me it was a problem of letsencrypt, thousands of junk files kept for a long time causing a timeout. Fixed the problem instantly.

@TristanHarms

@anto294 @TristanHarms did you guys try the cert-prune command in the container? For me it was a problem of letsencrypt, thousands of junk files kept for a long time causing a timeout. Fixed the problem instantly.

I did; it only removed 12 entries in my case and didn't fix the issue. It also didn't change the behaviour @anto294 described, where chowning files is extremely slow, leading to a timeout on deploy.

@blaine07

@anto294 @TristanHarms did you guys try the cert-prune command in the container? For me it was a problem of letsencrypt, thousands of junk files kept for a long time causing a timeout. Fixed the problem instantly.

Does the cert-prune command work now? At one point it was defunct.

@anto294

anto294 commented May 21, 2023

@anto294 @TristanHarms did you guys try the cert-prune command in the container? For me it was a problem of letsencrypt, thousands of junk files kept for a long time causing a timeout. Fixed the problem instantly.

Of course, I have also tested with fresh containers.

@JBake130

I'm running Unraid and get this if I use :latest, but if I use :2.10.2 (or now .3) it starts up fine.

I just deleted the containers and images and started fresh. I still get permission denied with :latest.

@DcR-NL

DcR-NL commented May 24, 2023

Im running unraid, and get this if I use :latest, but if i use :2.10.2 or now .3 it starts up fine.

You must be using an old "latest" then. Currently the tags "latest", "2.10.3" and "v2" are the same image.

@JBake130

JBake130 commented May 24, 2023

Im running unraid, and get this if I use :latest, but if i use :2.10.2 or now .3 it starts up fine.

You must be using an old "latest" than. Currently tag "latest", "2.10.3" and "v2" are the same image.

Ya, it's a fresh install. v2 just removed my container completely since it failed. Latest and 2.10.3 should be the exact same thing, but it only works if I define the version.

Also, to add: just jc21/nginx-proxy-manager (the default on install) fails to start too. It only works when I specify a version. Weird.

@SHASHWATAA

I was having the exact same issue, @JBake130. Thanks, that fixed it.

@ljpaff

ljpaff commented Jun 1, 2023

Is there any update for the RPi ecosystem?

@IgnatBeresnev

IgnatBeresnev commented Jun 3, 2023

Can confirm, I have the same issue in TrueNAS Scale with the container getting stuck at chown -R xx:xx /opt/certbot, exactly as in #2753 (comment). Very annoying

I ended up bind-mounting /etc/s6-overlay/s6-rc.d/prepare/30-ownership.sh to a script I created on the host, with all of the same contents but with the problematic chown line modified to just run in the background:

...
# Prevents errors when installing python certbot plugins when non-root
nohup chown -R "$PUID:$PGID" /opt/certbot -v &
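
A sketch of what that bind mount looks like in compose, assuming the modified script is saved as ./30-ownership.sh next to the compose file:

    volumes:
      - ./30-ownership.sh:/etc/s6-overlay/s6-rc.d/prepare/30-ownership.sh:ro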

@carsten-walther

I had trouble installing Nginx Proxy Manager on Proxmox in an LXC container.

The install script (https://github.com/ej52/proxmox-scripts/blob/main/lxc/nginx-proxy-manager/install/alpine.sh) was telling me:

  • rc-service openresty start

nginx: [emerg] getpwnam("npm") failed ....

The solution was to create a new user npm:

  • adduser -s /sbin/nologin -D -H npm

  • adduser npm wheel

  • apk add doas

  • nano /etc/doas.d/doas.conf
    -- add line: permit persist :wheel

  • rc-service openresty start

This fixed it for me.
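
Consolidated, the same steps as one shell block (Alpine inside the LXC container, run as root; the echo line is equivalent to editing doas.conf with nano):

adduser -s /sbin/nologin -D -H npm
adduser npm wheel
apk add doas
echo 'permit persist :wheel' >> /etc/doas.d/doas.conf
rc-service openresty start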

@thisisnotfez

thisisnotfez commented Jun 26, 2023

This is the only log item I am receiving with the github-s6 image and with debug enabled.

s6-rc-compile: fatal: unable to read /etc/s6-overlay/s6-rc.d/prepare/type: No such file or directory

deftdawg added a commit to deftdawg/homeassistant-addons that referenced this issue Jul 7, 2023
@deftdawg
Contributor

deftdawg commented Jul 8, 2023

I gave up trying to get this to run as the npm user properly on v2.10.2 and v2.10.3... Instead I needed to hack the following into my home-assistant addon fork's Dockerfile to get the container to start and run certbot. Maybe this will help someone, but I don't recommend it.

# trying to start as user: npm fails, borked s6 overlay stuff; so delete the user
#  and just run as root to get going -- NOTE: running as root IS BAD
RUN  sed -i '/user npm;/d' /etc/nginx/nginx.conf 

# create the missing certbot in /opt and set include-system-site-packages true 
# so --user doesn't cause the certbot command to fail
RUN \
  cd /opt \
  && python -m venv certbot \
  && sed -i 's/include-system-site-packages = false/include-system-site-packages = true/' /opt/certbot/pyvenv.cfg

deftdawg added a commit to deftdawg/homeassistant-addons that referenced this issue Jul 20, 2023
@heisian

heisian commented Aug 20, 2023

This is how to fix it: truenas/charts#1212 (comment)

S6_STAGE2_HOOK=sed -i $d /etc/s6-overlay/s6-rc.d/prepare/30-ownership.sh

This will remove the last line that was added in version 2.10+, chown -R "$PUID:$PGID" /opt/certbot, which takes a long time on HDD pools.

Or you can just wait for 5-10mins like I did...
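
In compose form that's just an extra environment entry; note that compose interpolates $d itself, so it needs to be escaped as $$d there (a sketch - exact quoting may differ depending on how you set the variable):

    environment:
      - S6_STAGE2_HOOK=sed -i $$d /etc/s6-overlay/s6-rc.d/prepare/30-ownership.sh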

@anto294

anto294 commented Aug 20, 2023

This is how to fix it: truenas/charts#1212 (comment)

S6_STAGE2_HOOK=sed -i $d /etc/s6-overlay/s6-rc.d/prepare/30-ownership.sh
This will remove the last line that was added in version 2.10+ chown -R "$PUID:$PGID" /opt/certbot with takes a long time on HDD pools

Or you can just wait for 5-10mins like I did...

Mine won't even start after one night.
HDD pool.

Thanks for the ENV, they work :)


Issue is now considered stale. If you want to keep it open, please comment 👍

@github-actions github-actions bot added the stale label Apr 16, 2024
@Spokeek

Spokeek commented Sep 4, 2024

This is how to fix it: truenas/charts#1212 (comment)

S6_STAGE2_HOOK=sed -i $d /etc/s6-overlay/s6-rc.d/prepare/30-ownership.sh
This will remove the last line that was added in version 2.10+ chown -R "$PUID:$PGID" /opt/certbot with takes a long time on HDD pools

Or you can just wait for 5-10mins like I did...

This does indeed skip the step and allows the app to start (I tried it on the container version too).
Hopefully this can be fixed in a more proper way.

@github-actions github-actions bot removed the stale label Sep 18, 2024