restart needed after autoSSL #662

Open
jrsarath opened this Issue Oct 18, 2017 · 61 comments

jrsarath commented Oct 18, 2017

I'm having an issue with AutoSSL.
Whenever a website gets an SSL certificate via AutoSSL, we need to restart Nginx; otherwise the website keeps showing the OLD SSL certificate.
It's kind of weird. We tried to update /etc/crontab: we opened that file from WHM > Engintron app and clicked Save.
Still no result. I searched old issues for an hour and didn't find any solution.

damianetienne commented Oct 18, 2017

Same here. I have the same trouble on my servers.

I have to restart nginx every time an AutoSSL certificate is installed or renewed. I thought v1.8.4 had resolved that, but this is not the case.

jrsarath commented Oct 19, 2017

Waiting for a solution...

josephsimony commented Oct 19, 2017

Strangely, I don't have, and never had, any problems with AutoSSL and Engintron on any of my servers (neither with issuing SSL certificates nor with their renewals).

Currently, servers running on:
Autossl provider: Let’s Encrypt
cPanel: 68.0.6
Engintron: 1.8.6

damianetienne commented Oct 19, 2017

In my case it's AutoSSL with Comodo certs. I haven't tried it with LE.

jrsarath commented Oct 19, 2017

@damianetienne I agree with you.
We are using Comodo certs too, and the problem exists.

Silviu-LS commented Oct 19, 2017

I have the same issue on 100+ servers. I had to create a bash script that checks whether the SSL files were changed or newly created.

An nginx reload instead of a restart works sometimes...
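
For readers wondering what such a watcher could look like, here is a minimal sketch. It is not Silviu-LS's actual script: the function names, the state-file approach, and whatever directory you pass in are assumptions for illustration. It fingerprints every file under a certificate directory and reports a change when the fingerprint differs from the one saved on the previous run.

```shell
# Hypothetical sketch: detect added, removed, or modified certificate files.
# Both arguments (certificate directory, state file) are placeholders.

# Print one fingerprint that covers every file under the given directory.
cert_fingerprint() {
    find "$1" -type f -exec sha256sum {} + 2>/dev/null | sort | sha256sum | cut -d' ' -f1
}

# Return 0 (and record the new fingerprint) when the certificates changed
# since the previous run; return 1 when nothing changed.
certs_changed() {
    local dir="$1" state="$2" current previous=""
    current="$(cert_fingerprint "$dir")"
    if [ -f "$state" ]; then
        previous="$(cat "$state")"
    fi
    if [ "$current" != "$previous" ]; then
        printf '%s\n' "$current" > "$state"
        return 0
    fi
    return 1
}
```

From cron, something like `certs_changed /path/to/certs /var/tmp/certs.state && nginx -s reload` would reload only when a certificate file actually changed. Keep the state file outside the watched directory, or it will change the fingerprint itself.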

jrsarath commented Oct 19, 2017

Well, we would love to take a look at that script.
Thanks in advance.

jrsarath commented Oct 19, 2017

Where is the author?
Hello, could the author look into this, please?
This guy has the same issue on over 100 servers, OMG!
Engintron is a great script for sure, but kindly try to resolve this, please.

Silviu-LS commented Oct 19, 2017

This is still in progress, but I have attached it.

ng.txt

jrsarath commented Oct 19, 2017

alright thank you,
and let's see if we can make any progress with your current work

Silviu-LS commented Oct 19, 2017

Unfortunately, I don't have time at the moment to perfect the code so that it works as expected (near perfect), but if you make any changes to the code, please share them back.

jrsarath commented Oct 19, 2017

I will make sure that, if we make progress with your script, it becomes available as a patch for Engintron.

brixly commented Oct 19, 2017

My comments were added to

#631 Need to restart nginx/apache everytime ssl certificate changes

Desperately in need of a solution / fix.

I know it's been said several times that it should work as is, but it just doesn't for us. Almost every day we receive numerous requests to restart nginx.

jrsarath commented Oct 19, 2017

I am sure this is not a cron problem; the cron job is running.
I'm looking for something else:
https://github.com/tgerov/engintron/blob/patch-1/nginx/utilities/https_vhosts.sh
I'm going to try this patch.

brixly commented Oct 19, 2017

What are the differences between the patch you just referenced and the included version?

I can see it does a restart - I guess this is instead of the reload?

I'm always concerned about full nginx restarts, in case a syntax error prevents nginx from starting back up.
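
One common way to reduce that risk is to test the configuration before touching the running instance. The sketch below is not taken from Engintron or from the patch under discussion; `safe_reload` and the overridable `NGINX_BIN` variable are hypothetical names. It relies only on the standard `nginx -t` (test configuration) and `nginx -s reload` (signal a reload) options.

```shell
# Hypothetical helper: reload Nginx only if the configuration test passes.
# NGINX_BIN is an assumption so that the binary can be swapped out.
safe_reload() {
    local nginx="${NGINX_BIN:-nginx}"
    if "$nginx" -t >/dev/null 2>&1; then
        # Configuration is valid: ask the running master to reload it.
        "$nginx" -s reload
    else
        echo "nginx config test failed; leaving the running instance untouched" >&2
        return 1
    fi
}
```

With a check like this in front, a broken vhost file leaves the old configuration serving traffic instead of taking the server down.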

jrsarath commented Oct 19, 2017

Yep, just a restart, while the job can be done with a reload.

brixly commented Oct 19, 2017

Can you test the patch on your system after reissuing a new SSL and let us know how it goes?

I assume the patch will work, but with the added risk of a full restart taking place rather than just a reload.

Silviu-LS commented Oct 19, 2017

I'm guessing this will not work for existing certificates, since the patch doesn't check whether the SSL file was updated (cPanel SSL or Let's Encrypt); those are the issues I have faced.

This will work for new SSL certificates, but what about the current ones, which expire after 3 months and are renewed monthly or on a custom date?

damianetienne commented Oct 19, 2017

@Silviu-LS Comodo and LE renew the certs every 3 months.

In my case, I need to restart nginx whether it is a new certificate or just a renewal of the current cert.

Silviu-LS commented Oct 19, 2017

@damianetienne exactly what I need too.

I'm still trying to fix the final script so that it covers all of this and reloads nginx only when it is really needed.

jrsarath commented Oct 19, 2017

I'm not sure whether this PHP script is actually being executed:
/etc/nginx/utilities/https_vhosts.php

Member

fevangelou commented Oct 26, 2017

@jrsarath Verify first you're not dealing with this https://github.com/engintron/engintron/wiki/SSL-certificate-changes-not-visible-in-Nginx%3F-Here's-a-possible-explanation-&-solution - also make sure it's not a case of CPANEL-15961 (bug).

brixly commented Oct 26, 2017

@fevangelou my comments were added to

#631 Need to restart nginx/apache everytime ssl certificate changes

We have tried numerous times running this manually - the process completes fine for us, however it doesn't make a difference in terms of the SSL / browser situation. They aren't working in a browser until we run...

/engintron.sh purgecache

Actually, we tend to do the following...

/scripts/rebuildhttpdconf
/engintron.sh purgecache

We run the first to ensure the apache conf is correct / latest version - we then run the purgecache to, well, clear the caches and restart those services.

Once the services have been restarted, the SSL's work beautifully in the browser, every time.

If just the /etc/nginx/utilities/https_vhosts.php script is run, it honestly makes no difference to the SSL as it appears in the browser.

Numerous times, a customer has waited 24 hours after an SSL installation and then opened a ticket to point the issue out. We follow our own steps above, and it's working again.

Member

fevangelou commented Oct 26, 2017

@brixly Nginx's purge cache option is actually kicked in by bash /usr/local/src/engintron/engintron.sh purgecache (line 18 in https_vhosts.sh). Caches will be properly flushed and Apache restarted. By issuing /scripts/rebuildhttpdconf you actually tell Apache to regenerate its httpd.conf file. There's no reason to do this. The https_vhosts.sh already kicked in EXACTLY because it detected a change in httpd.conf. Re-generating httpd.conf once more will simply force https_vhosts.sh to kick in again in 15 seconds. In other words, you're forcing your server to restart every 15 seconds. Think about it.

I have pushed a new build today which allows setting the Nginx vhost generation interval to something bigger than 15 seconds. This may come in handy for servers with hundreds of sites, where generating the Nginx vhosts may take longer than the interval for checking httpd.conf, which in turn could lead to a broken Nginx config.

Update to today's build and then open up the file /etc/nginx/utilities/https_vhosts.sh and change the value INTERVAL from 15 to say 20 or 30. Verify changes take effect using a private window in your browser.

jrsarath commented Oct 26, 2017

@fevangelou I will update and report back.
I have already verified this: https://github.com/engintron/engintron/wiki/SSL-certificate-changes-not-visible-in-Nginx%3F-Here's-a-possible-explanation-&-solution
I can also confirm that cron is running and https_vhosts.sh is being executed.

jhawkins002 commented Oct 31, 2017

Been experiencing this problem for quite some time too; commented on other threads. Given the new release we tried today removing our CRON rules that automatically restart NGINX every 5 minutes to test if the new patches solved the issue on our servers.

Unfortunately I can confirm we at least still have the problem. Tested with both 30 second and 60 second intervals.

I can add that the script seems to be kicking off as it should, because newly created/updated domains are indeed detected by https_vhosts.sh and updates are applied to the NGINX default_https.conf file. NGINX simply doesn't seem to apply the new conf until a restart is manually invoked.

I do think @fevangelou's idea that there is a timing element to this is potentially right on the money. During our test this afternoon, our logs show the following:

Test account created at 14:28:41

Tue Oct 31 14:28:41 2017:CREATE:betatest:root:engin2.betatesting.as.ua.edu:67.43.4.132:engin2betatestin

LetsEncrypt Certificate installed at 14:29:45 (zulu-adjusted time)

[2017-10-31T19:29:41Z] The website “engin2.betatesting.as.ua.edu”, owned by “engin2betatestin”, has a faulty SSL certificate (OPENSSL_VERIFY:0:18:DEPTH_ZERO_SELF_SIGNED_CERT). AutoSSL will attempt to replace this certificate.
[2017-10-31T19:29:41Z] The system will attempt to renew SSL certificates for the following websites:
[2017-10-31T19:29:41Z] engin2.betatesting.as.ua.edu (engin2.betatesting.as.ua.edu www.engin2.betatesting.as.ua.edu mail.engin2.betatesting.as.ua.edu)
[2017-10-31T19:29:45Z] The system has installed a new certificate onto “engin2betatestin”’s website “engin2.betatesting.as.ua.edu”.

Engintron sees an updated configuration in Apache and rebuilds the NGINX default_https.conf accordingly at 14:29:01 - note above that AutoSSL doesn't actually complete until 14:29:45 so the initiating element may be the account creation and not the certificate install

stat /etc/nginx/conf.d/default_https.conf
  File: ‘/etc/nginx/conf.d/default_https.conf’
  Size: 53157     	Blocks: 104        IO Block: 4096   regular file
Device: fd03h/64771d	Inode: 1044384     Links: 1
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2017-10-31 14:35:42.126000000 -0500
Modify: 2017-10-31 14:29:01.686000000 -0500
Change: 2017-10-31 14:29:01.686000000 -0500
 Birth: -

After the configuration update, /var/log/messages shows that NGINX was never restarted to pick up the updated conf, as the last restart was at 14:23:34 (a restart we manually invoked during a prior test)

Oct 31 14:23:34 galactica systemd: Started nginx - high performance web server.
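
The same comparison can be scripted as a quick check. This is only a sketch under assumptions: `modified_after` is a hypothetical helper, and using the Nginx PID file (often /var/run/nginx.pid, but verify on your system) as a stand-in for "time of last start" is an approximation. /var/log/messages or the systemd journal remain the authoritative source.

```shell
# Hypothetical helper: succeed when the first file was modified more
# recently than the second. Uses the shell's "newer than" (-nt) test.
modified_after() {
    [ "$1" -nt "$2" ]
}

# Example use (both paths are assumptions, adjust for your server):
#   if modified_after /etc/nginx/conf.d/default_https.conf /var/run/nginx.pid; then
#       echo "vhost config changed after Nginx last started"
#   fi
```

In the timeline above, the conf was modified at 14:29:01 and the last start was at 14:23:34, so a check of this kind would flag the running instance as stale.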

Hope this helps at least a little!

jrsarath commented Oct 31, 2017

Yeah, actually I forgot to report back:
the problem still persists.

pnboy commented Nov 8, 2017

It seems the problem is with /etc/nginx/utilities/https_vhosts.sh.
The line:
RUN_CHECK=$(/usr/bin/php -c /dev/null /etc/nginx/utilities/https_vhosts.php)

should probably be changed to something like:
RUN_CHECK=$(/usr/bin/php -c /dev/null /etc/nginx/utilities/https_vhosts.php > /dev/null 2>&1 ; echo $?)
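
The difference between the two forms is what the shell captures: `$(command)` captures the text the command prints, while `$?` holds its exit status. The sketch below illustrates this with a stand-in function instead of the real PHP script. The stand-in's message and its exit status of 1 are assumptions based on what is reported elsewhere in this thread, and `should_purge` is a hypothetical name for the wrapper's control check.

```shell
# Stand-in for https_vhosts.php (an assumption for this demo): it prints a
# message on stdout and reports "vhosts regenerated" through exit status 1.
fake_vhosts_check() {
    echo "HTTPS vhosts for Nginx re-created."
    return 1
}

# Form 1: $(...) captures what the command PRINTS, not its exit status.
captured_output=$(fake_vhosts_check) || true

# Form 2: discard the printed text and record the exit status instead.
# This has the same effect as the "> /dev/null 2>&1 ; echo $?" variant.
captured_status=0
fake_vhosts_check >/dev/null 2>&1 || captured_status=$?

# The control check: act only when the value is exactly "1".
should_purge() { [ "$1" = "1" ]; }

if should_purge "$captured_output"; then echo "form 1: purge"; else echo "form 1: skip"; fi
if should_purge "$captured_status"; then echo "form 2: purge"; else echo "form 2: skip"; fi
```

Under these assumptions the first form skips the purge, because the captured value is the message text, and the second form triggers it.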

jhawkins002 commented Nov 8, 2017

Can confirm @pnboy's adjustment to https_vhosts.sh appears to have fixed the issue on our servers!

brixly commented Nov 9, 2017

I have tested this to death again and have found the following...

With the above changes, running /etc/nginx/utilities/https_vhosts.sh manually from shell immediately after issuing an SSL it works perfectly.

When run from cron, though, it isn't working. When issuing a new certificate with exactly the same process, $RUN_CHECK=1 - then the cron runs and afterwards $RUN_CHECK=0

That being the case, I was expecting the purgecache to have run, however it hasn't.

Yet, replicate the exact same scenario and run https_vhosts.sh manually, it works perfectly.

Why would this be the case?

My cron entry is...

* * * * * root /etc/nginx/utilities/https_vhosts.sh >> /dev/null 2>&1

brixly commented Nov 9, 2017

Giving up slightly - I have disabled the cron entirely for now.

[root@charlie ~]# RUN_CHECK=$(/usr/bin/php -c /dev/null /etc/nginx/utilities/https_vhosts.php > /dev/null 2>&1 ; echo $?); echo $RUN_CHECK
1
[root@charlie ~]# RUN_CHECK=$(/usr/bin/php -c /dev/null /etc/nginx/utilities/https_vhosts.php > /dev/null 2>&1 ; echo $?); echo $RUN_CHECK
1
[root@charlie ~]# RUN_CHECK=$(/usr/bin/php -c /dev/null /etc/nginx/utilities/https_vhosts.php > /dev/null 2>&1 ; echo $?); echo $RUN_CHECK
0
[root@charlie ~]# RUN_CHECK=$(/usr/bin/php -c /dev/null /etc/nginx/utilities/https_vhosts.php > /dev/null 2>&1 ; echo $?); echo $RUN_CHECK
0

The value is changing to 0 without the cron actually running - shouldn't this remain '1' until the https_vhosts.sh script has run from cron?

brixly commented Nov 9, 2017

Done some further digging - Think we are getting somewhere!

#607

This made for a good read - I was struggling to get my head around the logic as sometimes the script was working, sometimes it didn't restart nginx.

I think the situation is this -

If a rebuildhttpdconf is called and within the 15 seconds (as set in define('HTTPD_CONF_LAST_CHANGED', 15)) the https_vhosts.sh script is called then it works perfectly.

If though, the rebuildhttpdconf happens, then 30 seconds later the cron kicks in, that change is no longer recognised and the purgecache is skipped entirely.

I have tried changing to 5 seconds and 30 seconds - 30 seconds works really well.
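
To make the timing argument concrete, here is a minimal sketch of a "changed within the last N seconds" check. It assumes the PHP script decides by comparing the modification time of httpd.conf against the HTTPD_CONF_LAST_CHANGED window; that is an inference from this thread, not a reading of the actual code. `changed_within` is a hypothetical name, and `stat -c %Y` is the GNU coreutils form.

```shell
# Hypothetical helper: succeed when FILE was modified within the last
# WINDOW seconds. NOW (epoch seconds) defaults to the current time and
# is only a parameter so that the timing can be reasoned about.
changed_within() {
    local file="$1" window="$2" now="${3:-$(date +%s)}" mtime
    mtime=$(stat -c %Y "$file") || return 2
    [ $(( now - mtime )) -le "$window" ]
}

# Example use (path and window are assumptions):
#   changed_within /usr/local/apache/conf/httpd.conf 30 && echo "rebuild vhosts"
```

If the config is rebuilt at time T and the check runs at T+30, a 15-second window misses the change while a 30-second window catches it, which matches the behaviour described above.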

@fevangelou - is there a reason this is set as low as 15 seconds? Would you mind explaining the logic?

Member

fevangelou commented Nov 16, 2017

I have a question here...

If my original script did not work as expected, wouldn't Engintron NOT function properly for anyone?

I hope this is clear.

What you're facing is a problem with certain conditions:

  • you have servers with hundreds of cPanel accounts on them
  • your servers are not powerful enough to hold so many accounts
  • given the above points, operations like Apache rebuilding its config take longer than on most cPanel servers with Nginx/Engintron on them and subsequently cause misbehaviour on Engintron's part while it's rebuilding the Nginx HTTPS vhosts.

The solution to the above problems is simple if you don't wish to change/upgrade your server hardware or distribute cPanel accounts to more servers. Simply raise HTTPD_CONF_LAST_CHANGED to a value higher than 30 secs. Make it 60 so that it's executed once a minute. This way, when a single site is added or removed, the change is mirrored in Nginx in 1 min tops.

For the record, the shell script https_vhosts.sh controls the process, while https_vhosts.php is the process to generate the Nginx HTTPS vhost files. When the PHP script outputs 1, it means the vhosts files have been regenerated. Then the shell script sees this value and tells Nginx to purge its cache and restart along with Apache. It's just 0 & 1 being output by the PHP script and this value is picked up by the shell script to determine whether it will purge Nginx's cache and restart both Nginx and Apache. The PHP script could output "Yes Sir" and "No Sir". It doesn't matter as long as the shell script knows what to expect as a control value in order to do what it's supposed to do.

So the fact that you're seeing something "change" magically when you output "YES" or "NO" is NOT based on that change but rather the circumstances. That's why your change fails later. And by circumstances, I switch back to my initial point made: lots of cPanel accounts or poor performing servers. Of course it's a matter of WHEN the changes occur. If you, as the server admin, perform any code changes when there are no clients using their cPanel accounts, it's natural that you'll see "YES" and "NO" working instead of 1 and 0. They are both valid of course, but you're simply testing things out when the scripts did not have any issue in the first place.

Reconsider your strategy. Either distribute cPanel accounts across more of these poorly performing servers (which would explode without Nginx either way) or simply upgrade your servers with the right components (e.g. RAM, which is dead simple to add).

fevangelou closed this Nov 16, 2017

jhawkins002 commented Nov 16, 2017

That may be true in some cases.

I can report we have documented this issue occurring on a brand new cPanel server (Xeon E3, 4 core, 16GB RAM) with no accounts other than the one we used to test Engintron and SSL. The server was also running a recent release of Engintron (i.e. the CRON format fix had been implemented). No other third party plugins or cPanel addons were installed at the time.

The log data we provided above in this particular thread (showing NGINX never restarted) was from a staging server with ~ 60 accounts. It happens to be a Xeon E3 machine with 12GB that typically runs a load of 0.06 or so.

Member

fevangelou commented Nov 16, 2017

@jhawkins002 so you set up a stock cPanel server with Engintron on it and it had issues with vhost generation? On a valid and publicly accessible domain/subdomain? Without any other plugins or modifications in cPanel?

jhawkins002 commented Nov 16, 2017

Precisely. The actual NGINX https conf appears to generate just fine (timestamp verified by stat) after the domain gets its SSL certificate, but NGINX itself never restarts - verified in /var/log/messages

Member

fevangelou commented Nov 16, 2017

Is the shell script https_vhosts.sh executable? Did you run the installation as root user?

jhawkins002 commented Nov 16, 2017

Indeed.

[root@betatesting]# ls -l /etc/nginx/utilities/https_vhosts.sh
-rwxr-xr-x 1 root root 929 Nov  8 11:55 /etc/nginx/utilities/https_vhosts.sh

Member

fevangelou commented Nov 16, 2017

brixly commented Nov 16, 2017

I can reproduce it - it's been the same for us but the blame is being pushed to the hardware.

Our situation is identical.

Member

fevangelou commented Nov 16, 2017

brixly commented Nov 16, 2017

Although not many, I do see a few more people who have commented on this thread (as well as a number of similar related threads) - they / we are all having to try and add modifications and fixes to the system

photogaff commented Nov 16, 2017

Add me to the list of people experiencing the error - The only way I could resolve my issue was to disable and re-enable Engintron - there may have been another way, but that was my initial 'fix quickly' solution because I was seeing SSL errors on all letsencrypt configured domains.

pnboy commented Nov 17, 2017

Hi @fevangelou
the main issue is in /etc/nginx/utilities/https_vhosts.sh

That script counts on getting a 1 from /etc/nginx/utilities/https_vhosts.php in the variable RUN_CHECK in order to fire the purgecache, but this will never happen, because what gets captured is the output of the script (its echo statements), not its exit status. The variable ends up with the content "HTTPS vhosts for Nginx re-created.\n"

For the script to work as intended it needs to be changed to something like:
RUN_CHECK=$(/usr/bin/php -c /dev/null /etc/nginx/utilities/https_vhosts.php > /dev/null 2>&1 ; echo $?)

so that it actually gets the return value of the command, not the output :)

Member

fevangelou commented Nov 21, 2017

@pnboy Good catch. Devised a bit differently overall, but you can see the changes in the just released v1.8.7.

nuclearsam commented Nov 28, 2017

I am facing the exact same issue. The expired certificates are cached until I restart (or clear the cache) manually. I have only a few sites (around 4) running on very powerful hardware.

Since the status of this issue is closed, does that mean it is fixed in the next version update? or do we need to make any changes manually to fix this?

Regards
Sam

brixly commented Nov 28, 2017

@nuclearsam - try and upgrade to the latest version of Engintron (1.8.7)

The new version released by @fevangelou seems to have resolved the issue for us.

Check /etc/nginx/utilities/https_vhosts.log shortly after issuing a new SSL certificate - if the log has some contents, it will show the purgecache function running.

If nothing is being output to the logs, try editing the cron entry via the Engintron interface, then re-saving it.

pnboy commented Nov 29, 2017

@fevangelou it still doesn't seem to work (for me). This is the output of RUN_CHECK:

<br /> <b>Warning</b>: date(): It is not safe to rely on the system's timezone settings. You are *required* to use the date.timezone setting or the date_default_timezone_set() function. In case you used any of those methods and you are still getting this warning, you most likely misspelled the timezone identifier. We selected the timezone 'UTC' for now, but please set date.timezone to select your timezone. in <b>/etc/nginx/utilities/https_vhosts.php</b> on line <b>40</b><br /> 1
0
0
0

brixly commented Nov 29, 2017

This isn't because of engintron - you need to set your timezone in php

pnboy commented Nov 30, 2017

The command in /etc/nginx/utilities/https_vhosts.sh
/usr/bin/php -c /dev/null -q /etc/nginx/utilities/https_vhosts.php ; echo $?
bypasses php.ini entirely (that is what -c /dev/null does), so any date.timezone set there is ignored. It therefore has to be fixed in the Engintron PHP script itself, either by suppressing the warning or by adding something similar to:

if (!ini_get('date.timezone')) {
    ini_set('date.timezone', 'UTC');
}

or by redirecting the output of the command to somewhere else
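
A further defensive option on the shell side, shown here only as a sketch and not as what Engintron does, is to ignore anything the PHP script prints before its final control value. The helper below keeps the last whitespace-separated token of the captured output and accepts it only if it is exactly 0 or 1. `extract_flag` is a hypothetical name.

```shell
# Hypothetical helper: print the trailing control value (0 or 1) from
# noisy output, or fail when the output does not end in such a value.
extract_flag() {
    printf '%s\n' "$1" | awk '
        { for (i = 1; i <= NF; i++) last = $i }   # remember the final token
        END { if (last == "0" || last == "1") print last; else exit 1 }
    '
}

# Example use with the wrapper's variable:
#   FLAG=$(extract_flag "$RUN_CHECK") || FLAG=0
```

Because it requires a standalone token, a warning that merely ends in a number (for example "on line 40") is rejected rather than misread as a flag.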

imorandinwnp commented Feb 28, 2018

Hi. Still experiencing AutoSSL being "cached" here. Only way to solve it is reloading nginx (service nginx reload).
https_vhosts.sh script does not solve the problem.
Engintron version 1.8.7

I suggest adding a service nginx reload to https_vhosts.sh script

valiant1x commented May 16, 2018

Issue still occurring in Engintron version 1.8.10, please reopen @jrsarath @brixly

damianetienne commented May 16, 2018

Indeed.

One of the comments above solves the problem, but after every update the updater seems to overwrite the changes.

Member

fevangelou commented Jul 6, 2018

@damianetienne Which one?

@imorandinwnp Nginx is reloaded when its cache is purged. Look closer ;)

fevangelou reopened this Jul 6, 2018

jhawkins002 commented Jul 6, 2018

Not sure if this adds to the discussion - but cPanel has FINALLY added official hooks for autossl events as of v.72. Hooray!

Changelog details: https://documentation.cpanel.net/display/72Docs/72+Release+Notes#id-72ReleaseNotes-NewStandardizedHooks

Member

fevangelou commented Jul 6, 2018

@jhawkins002 thanks for that. I'll have a look.
