First time boot takes incredibly long due to chown -R "$PUID:$PGID" /opt/certbot #3154

Open
devedse opened this issue Aug 23, 2023 · 26 comments

@devedse
Contributor

devedse commented Aug 23, 2023

Checklist

  • Have you pulled and found the error with jc21/nginx-proxy-manager:latest docker image?
    • Yes
  • Are you sure you're not using someone else's docker image?
    • Yes
  • Have you searched for similar issues (both open and closed)?
    • Yes

Describe the bug

When I start the container for the first time, it takes an incredibly long time because it is setting ownership:

nginx-proxy-manager  | ❯ Configuring npm user ...
nginx-proxy-manager  | ❯ Configuring npm group ...
nginx-proxy-manager  | ❯ Checking paths ...
nginx-proxy-manager  | ❯ Setting ownership ...

I did a bit of investigation and found out it's due to this command:

chown -R "$PUID:$PGID" /opt/certbot

This code was added in this commit:
05307aa

I ran this command myself inside the container and it took about 5 minutes to complete:
[screenshot]

To me it feels a bit strange to have to execute this command on the /opt folder, which isn't even mapped to my host system. So why not do this during the creation of the container rather than during boot?

Nginx Proxy Manager Version

2.10.4

To Reproduce
See above

Expected behavior

Boot quickly

Screenshots

N/A

Operating System

Docker inside a container on a Synology NAS

Additional context

devedse added the bug label Aug 23, 2023
devedse changed the title from "First time boot takes incredibly long due to https://github.com/NginxProxyManager/nginx-proxy-manager/commit/05307aa253c073cf94237fc96d816ec2919f4d7f" to "First time boot takes incredibly long due to chown -R "$PUID:$PGID" /opt/certbot" Aug 23, 2023

@Nightreaver

I can confirm this. Sometimes it's pretty annoying.

@panos-stavrianos

Any workaround at least? A 3-4 minute downtime just to restart npm is a lot, and it's very annoying when you change configs and have to restart to see the results.

@devedse
Contributor Author

devedse commented Oct 9, 2023

I guess you could map a volume to the root fs that persists the /opt folder

@panos-stavrianos

Thanks, it worked!

@DWRedShoes

DWRedShoes commented Nov 9, 2023

I've run into the exact same issue at startup of the Docker container: the chown command ("Setting ownership") took around 25 minutes to complete, and this happens every time I start the container.

[screenshot: Screenshot_20231109-044706]

@anantanandgupta

> > I guess you could map a volume to the root fs that persists the /opt folder
>
> Thanks, it worked!

How? Can you please explain? I can't get past the "Setting ownership" step; it simply times out in my case. I am mounting the data and letsencrypt folders on an NFS share.

@corporategoth

I too am hitting this issue on my TrueNAS system. Here is how I work around it for now.

Step 1: Create a PVC mount called /opt/certbot2 (i.e. an external mount managed by TrueNAS. This is not the same as a named mount managed by Docker; from Docker's perspective it's still a host mount, but it's as good as you get with TrueNAS).
Step 2: Start the container, then log into it via a shell (use HeavyScript or TrueNAS to get into the shell).
Step 3: Copy everything in /opt/certbot to /opt/certbot2 (cp -a /opt/certbot/* /opt/certbot2).
Step 4: Stop the container and change the PVC mount from /opt/certbot2 to /opt/certbot.
Step 5: Start the container again.

The chown should now be very quick, because the storage is external, not copy-on-write.
Of course, this is a terrible solution, because we are now overriding what the Docker image ships in /opt/certbot, so as the container updates, our /opt/certbot will not, unless you remove the PVC and repeat the steps above.
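
A possible caveat with Step 3, sketched here under assumptions: in a POSIX shell the glob `/opt/certbot/*` does not match dot-files, so any hidden entries at the top level would be left behind, whereas copying `/opt/certbot/.` takes the directory's full contents. Whether /opt/certbot actually contains dot-files is not stated in the thread. The sketch uses throwaway temp directories as hypothetical stand-ins for /opt/certbot and /opt/certbot2.

```shell
#!/bin/sh
# Temporary, hypothetical stand-ins for /opt/certbot and /opt/certbot2.
src=$(mktemp -d)
dst_glob=$(mktemp -d)
dst_dot=$(mktemp -d)

mkdir -p "$src/bin"
echo "visible" > "$src/bin/certbot"
echo "hidden"  > "$src/.hidden-file"

# Variant used in Step 3: the glob skips top-level dot-files.
cp -a "$src"/* "$dst_glob"

# Alternative: "src/." copies everything inside src, dot-files included.
cp -a "$src/." "$dst_dot"

glob_count=$(find "$dst_glob" -type f | wc -l | tr -d ' ')
dot_count=$(find "$dst_dot" -type f | wc -l | tr -d ' ')
echo "files copied with /*: $glob_count"
echo "files copied with /.: $dot_count"
```

Inside the container the second form would read `cp -a /opt/certbot/. /opt/certbot2/`.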

@Lapo-Statix

Lapo-Statix commented Feb 19, 2024

> > I guess you could map a volume to the root fs that persists the /opt folder
>
> Thanks, it worked!

@panos-stavrianos How did you do that? I'm having the same problem running it on my server; when I run it on my machine it doesn't take long. 😢

@Nightreaver

Nightreaver commented Feb 19, 2024 via email

I would guess:

  • Run the container and wait for it to start.
  • Use docker cp to copy /opt from the container to a folder on your hard disk.
  • Once this is done, stop the container.
  • Add a volume that maps the copy on your local disk to /opt inside the container.
  • Start the container again.

@Lapo-Statix

@Nightreaver Thank you!! It worked :)
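
The docker cp approach described above could be scripted roughly as follows. This is a sketch under assumptions, not a tested recipe: the container name `nginx-proxy-manager` is taken from the log prefix in the issue, `./certbot` is a hypothetical host folder, and the copy is narrowed to /opt/certbot (the directory being chowned) rather than all of /opt. By default the script only prints the commands; `run` and `DRY_RUN` are illustrative names.

```shell
#!/bin/sh
# Hypothetical defaults; adjust to your setup.
CONTAINER="${CONTAINER:-nginx-proxy-manager}"  # container name (assumed)
HOST_DIR="${HOST_DIR:-./certbot}"              # host folder for the copy (assumed)
DRY_RUN="${DRY_RUN:-1}"                        # 1 = only print the commands

# Print the command in dry-run mode, execute it otherwise.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# 1. Copy the directory out of the running container onto the host.
run docker cp "$CONTAINER:/opt/certbot" "$HOST_DIR"

# 2. Stop the container.
run docker stop "$CONTAINER"

# 3. Manual step: add a bind mount "$HOST_DIR:/opt/certbot" to your compose
#    file or docker run command, then start the container again.
```

Run it with `DRY_RUN=0` to execute the commands instead of printing them.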

@willtaylor

willtaylor commented Feb 25, 2024

I'm having the same problem with 2.11.1 via the TrueNAS catalog chart version 1.0.27

I never seem to get past the chown before TrueNAS decides that the container is unhealthy and attempts to restart it, causing an endless loop

2024-02-25 09:32:46.464991-05:00 ❯ Configuring npm user ...
2024-02-25 09:36:39.192117-05:00 ❯ Configuring npm group ...
2024-02-25 09:36:51.272227-05:00 ❯ Checking paths ...
2024-02-25 09:37:09.147531-05:00 ❯ Setting ownership ...
2024-02-25 09:58:04.848091-05:00 ❯ Configuring npm user ...

I haven't tried the proposed workaround

EDIT

I was just in the console when the container was restarted and it echoed the following message to the console:
command terminated with exit code 137
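
For context on that exit code: shells report a process killed by signal N as 128 + N, so 137 corresponds to SIGKILL (signal 9). That would be consistent with the platform forcibly killing the container after the health check fails, though the thread does not confirm the cause. A minimal sketch that reproduces the code:

```shell
#!/bin/sh
# Start a child shell that kills itself with SIGKILL (signal 9) and
# capture the status the parent shell reports for it.
status=0
sh -c 'kill -9 $$' || status=$?

# A process terminated by signal N is reported as 128 + N.
echo "exit status: $status"
```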

@dasunsrule32

dasunsrule32 commented Apr 9, 2024

Affected by this issue as well on TrueNAS SCALE 23.10.2. Same issue on the community train and installing the docker image directly as well. Takes about 10 minutes for mine to fully load.

@dasunsrule32

dasunsrule32 commented Apr 9, 2024

> I too am hitting this issue on my TrueNAS system. How I resolve this for now.
>
> Step 1: Create a PVC mount called /opt/certbot2 (ie. an external mount is managed by TrueNAS - this is not the same as a named mount managed by docker, from docker's perspective it's still a host mount, but it's as good as you get with truenas).
> Step 2: Start the container, then log into it via. shell (use heavyscript or truenas to get into the shell)
> Step 3: Copy everything in /opt/certbot to /opt/certbot2 (cp -a /opt/certbot/* /opt/certbot2)
> Step 4: Stop the container and change the PVC mount from /opt/certbot2 to /opt/certbot
> Step 5: Start the container again.
>
> It should now be very quick to chown, because the storage is external, not copy on write.
> Of course, this is a terrible solution, because we are now overriding the docker image for what is in /opt/certbot, so as the container updates, our /opt/certbot will not unless you remove the PVC and do the above steps again.

So I copied the files to my HDD pool, into the /data mount that is mounted into npm, created a folder there called certbot, and mounted that folder (/data/certbot) as a host path for /opt/certbot. It has the same effect: boots in seconds now.

@Furglitch

This actually really helped me out.
Working in Portainer, I just edited my stack compose.

First added the following

volumes:
  - ./data/opt:/opt/certbot-cp # command will generate the /certbot folder in /data/opt
command: 'cp -avr /opt/certbot /opt/certbot-cp'

waited for the app to output the command response (a ton of lines starting with '/opt/certbot'),
then changed to

volumes:
  - ./data/opt/certbot:/opt/certbot # added /certbot to host path, removed -cp from container path
# removed command setting

and restarted. npm loaded pretty much immediately, and I was able to access the admin panel.
restarted again just to make sure, and npm loaded in less than 10 seconds.

I'm still learning docker so I may not be doing it the best way, but it worked for me.
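
The first compose step relies on how cp treats a destination directory that already exists: copying /opt/certbot into the mounted /opt/certbot-cp creates /opt/certbot-cp/certbot, which is why the second step mounts ./data/opt/certbot. A minimal sketch of that behaviour, using temp directories as hypothetical stand-ins for the real paths:

```shell
#!/bin/sh
# Temporary stand-ins (hypothetical): "$root/certbot" plays /opt/certbot,
# "$root/certbot-cp" plays the already-mounted /opt/certbot-cp.
root=$(mktemp -d)
mkdir -p "$root/certbot/bin" "$root/certbot-cp"
echo "#!/bin/sh" > "$root/certbot/bin/certbot"

# Same flags as the compose command, minus -v to keep the output short.
cp -ar "$root/certbot" "$root/certbot-cp"

# Because the destination already exists, the source directory ends up
# nested inside it rather than replacing it.
if [ -f "$root/certbot-cp/certbot/bin/certbot" ]; then
  nested=yes
else
  nested=no
fi
echo "nested copy created: $nested"
```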

@dasunsrule32

Yep, same thing I did, just using docker compose.

@MichaelKirgus

This can't be a solution, only a workaround. It's VERY annoying... every update of the container or Helm forces a downtime of over an hour...

@dasunsrule32

Not a solution, just a workaround for now.

@MichaelKirgus

OK, sorry if my comment was a bit salty... let's be productive:
Is there an elegant solution for this? What about executing chown only if PUID/PGID is set, or checking whether the first file under "/data/certbot" has the right permissions and owner and skipping the chown if it does?
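
The second idea could look roughly like the sketch below: compare the owner of the top-level directory with the wanted UID/GID and run the recursive chown only on a mismatch. This is an illustration, not the project's actual startup script; `chown_if_needed` is a hypothetical helper, and it assumes a `stat` that supports `-c` (GNU coreutils or BusyBox).

```shell
#!/bin/sh
# Hypothetical helper: run the recursive chown only when the top-level
# directory is not already owned by the wanted uid:gid.
chown_if_needed() {
  dir=$1
  want="$2:$3"
  have=$(stat -c '%u:%g' "$dir")   # numeric owner and group of the directory
  if [ "$have" = "$want" ]; then
    echo "skip"
  else
    chown -R "$want" "$dir" && echo "chowned"
  fi
}

# Demo on a throwaway directory, asking for the owner it already has,
# so the expensive recursive chown is skipped.
demo=$(mktemp -d)
demo_uid=$(stat -c '%u' "$demo")
demo_gid=$(stat -c '%g' "$demo")
result=$(chown_if_needed "$demo" "$demo_uid" "$demo_gid")
echo "$result"
```

In the container the call would presumably be `chown_if_needed /opt/certbot "$PUID" "$PGID"`. Checking only the top-level entry is cheap but can miss files deeper in the tree that have a different owner.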

@nintendoeats

Did anybody ever figure out an answer to the original question of why this is required? Could we simply...remove the offending line?....

@Mathpro

Mathpro commented Sep 4, 2024

Same issue here unfortunately

@Dremor

Dremor commented Sep 6, 2024

Same here, TrueNAS 24.10-Beta.1.

10 to 15 minutes downtime, on an SSD array. I can imagine how frustrating it must be on a pure HDD array.

@Tsaukpaetra

I think instead of using a hacky workaround, perhaps the underlying issue, whatever causes certbot to complain, can be resolved? Like, if it's to make it not complain about not running as root... isn't there something with the Python stuff that lets user packages be used?

Maybe I'm just dumb but I'm baffled why this is a thing...

@LexiconCode

LexiconCode commented Sep 8, 2024

Workaround: add an Additional Environment Variable on TrueNAS 24.10-Beta.1:
S6_STAGE2_HOOK=sed -i $d /etc/s6-overlay/s6-rc.d/prepare/30-ownership.sh
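
For readers unfamiliar with the sed expression: `$d` deletes the last line of a file, so the hook presumably strips the final command from 30-ownership.sh before it runs. That this last line is the recursive chown on /opt/certbot is an assumption; the thread does not show the script. A sketch on a scratch file with made-up contents:

```shell
#!/bin/sh
# Scratch file with made-up contents standing in for 30-ownership.sh.
script=$(mktemp)
cat > "$script" <<'EOF'
chown -R "$PUID:$PGID" /data
chown -R "$PUID:$PGID" /opt/certbot
EOF

# '$' addresses the last line and 'd' deletes it. The expression is quoted
# here so this shell does not expand $d as a variable.
trimmed=$(sed '$d' "$script")
echo "$trimmed"
```

The sketch prints the result instead of using `-i`, so the scratch file itself is left unchanged.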

@Dremor

Dremor commented Sep 14, 2024

That's indeed the current (sometimes buggy) workaround, but that won't resolve the issue itself.

The issue was caused by going from a Python user package to a system package (see the commit linked in the issue), which, if I understand it correctly, can only be run as root unless you chown it back, which takes ages.
The question is: should we really chown the whole of /opt/certbot, or is it possible to do it another way, like chowning only a subset (for example only the plugin directory, and only if a plugin is installed) to reduce the time required, or, ideally, finding a better solution to the problem that the commit introducing this was trying to address?
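
One way to cut the work, offered as a sketch rather than something proposed verbatim in the thread, is to let find select only the entries whose owner or group does not already match and chown just those. The tree is still walked, but entries that are already correct are not touched, which may matter on copy-on-write storage. `chown_mismatched` and `count_mismatched` are hypothetical helper names, and the demo assumes a `stat` that supports `-c`.

```shell
#!/bin/sh
# Hypothetical helper: chown only entries whose uid or gid differs.
# -h changes a symlink itself instead of the file it points to.
chown_mismatched() {
  dir=$1; uid=$2; gid=$3
  find "$dir" \( ! -user "$uid" -o ! -group "$gid" \) \
    -exec chown -h "$uid:$gid" {} +
}

# Count entries that still differ from the wanted owner (for checking).
count_mismatched() {
  find "$1" \( ! -user "$2" -o ! -group "$3" \) | wc -l | tr -d ' '
}

# Demo on a throwaway tree that already has the wanted owner, so find
# selects nothing and chown is never invoked.
demo=$(mktemp -d)
mkdir -p "$demo/plugins"
touch "$demo/plugins/a" "$demo/b"
uid=$(id -u)
gid=$(stat -c '%g' "$demo")
chown_mismatched "$demo" "$uid" "$gid"
remaining=$(count_mismatched "$demo" "$uid" "$gid")
echo "entries still mismatched: $remaining"
```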

@skirsch

skirsch commented Sep 23, 2024

It's only 20 minutes of startup time with HDDs in a TrueNAS array. With the fix, the startup time is 5 seconds.

@dasunsrule32

This happens on plain Docker on ZFS as well.
