restart needed after autoSSL #662

Closed
jrsarath opened this issue Oct 18, 2017 · 61 comments

Comments

@jrsarath

I'm having an issue with AutoSSL.
Whenever a website gets an SSL certificate via AutoSSL,
we need to restart Nginx; otherwise the website shows the OLD SSL certificate.
It's kind of weird. We tried to update /etc/crontab: opened that file from WHM > Engintron app and clicked Save.
Still no result. I searched old issues for an hour and didn't find any solution.

@damianetienne

Same here. I have the same trouble on my servers.

I have to restart nginx every time an AutoSSL certificate is installed or renewed. I thought v1.8.4 resolved that, but this is not the case.

@jrsarath
Author

Waiting for a solution...

@josephsimony

Strangely, I don't have, and haven't had, any problems with AutoSSL and Engintron on any of my servers so far (neither with SSL issuance nor with renewals).

Currently, servers running on:
AutoSSL provider: Let’s Encrypt
cPanel: 68.0.6
Engintron: 1.8.6

@damianetienne

My case is with autossl with Comodo certs. I've not tried with LE.

@jrsarath
Author

@damianetienne I agree with you.
We are using Comodo certs too,
and the problem exists.

@Silviu-LS

I have the same issue on 100+ servers. I had to create a bash script that checks whether the SSL files were changed or newly created.

An nginx reload instead of a restart works sometimes...
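
For illustration only (this is not the poster's script): a minimal sketch of that kind of check, assuming certificates are stored as `.crt` files under one directory and that a stamp file records the last check. The function names, the directory layout and the stamp file are all hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print certificate files changed since the last check.
# The directory and stamp file are illustrative parameters, not Engintron paths.
changed_certs() {
    local cert_dir="$1" stamp="$2"
    if [ ! -e "$stamp" ]; then
        # First run: every certificate counts as changed.
        find "$cert_dir" -type f -name '*.crt'
    else
        # Later runs: only files modified after the stamp file.
        find "$cert_dir" -type f -name '*.crt' -newer "$stamp"
    fi
}

# Run the reload command only when at least one certificate changed,
# then refresh the stamp so the next run starts from this point.
reload_if_certs_changed() {
    local cert_dir="$1" stamp="$2" reload_cmd="${3:-service nginx reload}"
    if [ -n "$(changed_certs "$cert_dir" "$stamp")" ]; then
        $reload_cmd && touch "$stamp"
    fi
}
```

Run from cron with a certificate directory and a stamp path of your choosing, this reloads only when something is newer than the stamp. A certificate written between the `find` and the `touch` would be missed until its next change, so treat it as a sketch rather than a hardened tool.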

@jrsarath
Author

Well, we would love to take a look at that script.
Thanks in advance.

@jrsarath
Author

Where is the author?
Hello, will the author look into this, please?
This guy has the same issue on over 100 servers, OMG!
Engintron is a great script for sure, but kindly try to resolve this.

@Silviu-LS

This is still in progress, but I have attached it.

ng.txt

@jrsarath
Author

Alright, thank you.
Let's see if we can make any progress with your current work.

@Silviu-LS

Unfortunately, I don't have time at the moment to perfect the code so that it works as expected (near perfect), but if you make any changes to the code, please share them back.

@jrsarath
Author

I will make sure that if we make progress with your script, it will be made available as a patch for Engintron.

@brixly

brixly commented Oct 19, 2017

My comments were added to

#631 Need to restart nginx/apache everytime ssl certificate changes

Desperately in need of a solution / fix.

I know it's been said several times that it should work as is, but it just doesn't for us. Almost every day we receive numerous requests to restart nginx.

@jrsarath
Author

I am sure this is not a cron job problem; the cron job is running.
I'm looking at something else:
https://github.com/tgerov/engintron/blob/patch-1/nginx/utilities/https_vhosts.sh
Going to try this patch.

@brixly

brixly commented Oct 19, 2017

What are the differences between the patch you just referenced and the included version?

I can see it does a restart - I guess this is instead of the reload?

I always feel concerned about full nginx restarts, in case a syntax error prevents nginx from starting back up.
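
One way to reduce that risk, sketched under assumptions: run Nginx's built-in configuration test (`nginx -t`) first and only signal a reload (`nginx -s reload`) when it passes. `safe_nginx_reload` and the `NGINX_BIN` override are hypothetical names, not part of Engintron or the referenced patch.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: validate the Nginx config before reloading.
# NGINX_BIN can be overridden, e.g. to point at a non-default binary path.
safe_nginx_reload() {
    local nginx_bin="${NGINX_BIN:-nginx}"
    if "$nginx_bin" -t >/dev/null 2>&1; then
        # Config parses cleanly: ask the running master to reload it.
        "$nginx_bin" -s reload
    else
        echo "nginx config test failed; leaving the running instance untouched" >&2
        return 1
    fi
}
```

With a broken config the function returns non-zero and the running Nginx keeps serving with its old configuration, which is the failure mode a blind restart does not give you.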

@jrsarath
Author

Yep, just a restart, while the job can be done with a reload.

@brixly

brixly commented Oct 19, 2017

Can you test the patch on your system after reissuing a new SSL and let us know how it goes?

I assume the patch will work, but with the added risk of a full restart taking place rather than just a reload.

@Silviu-LS

I'm guessing this will not work for existing certificates, since the patch doesn't check whether the SSL file was updated (cPanel SSL or Let's Encrypt); those are the issues I have faced.

This will work for new SSL certificates, but what about the current ones, which expire after 3 months and are renewed monthly or at a custom date?

@damianetienne

@Silviu-LS Comodo and LE renew the certs every 3 months.

In my case, I need to restart nginx whether it is a new certificate or just a renewal of the current cert.

@Silviu-LS

@damianetienne exactly what I need too.

I'm still trying to fix the final script to cover all this and reload nginx only when it is really needed.

@jrsarath
Author

jrsarath commented Oct 19, 2017

I'm not sure whether this PHP script is actually being executed:
/etc/nginx/utilities/https_vhosts.php

@fevangelou
Member

@jrsarath Verify first you're not dealing with this https://github.com/engintron/engintron/wiki/SSL-certificate-changes-not-visible-in-Nginx%3F-Here's-a-possible-explanation-&-solution - also make sure it's not a case of CPANEL-15961 (bug).

@brixly

brixly commented Oct 26, 2017

@fevangelou my comments were added to

#631 Need to restart nginx/apache everytime ssl certificate changes

We have tried numerous times running this manually - the process completes fine for us, however it doesn't make a difference in terms of the SSL / browser situation. They aren't working in a browser until we run...

/engintron.sh purgecache

Actually, we tend to do the following...

/scripts/rebuildhttpdconf
/engintron.sh purgecache

We run the first to ensure the apache conf is correct / latest version - we then run the purgecache to, well, clear the caches and restart those services.

Once the services have been restarted, the SSLs work beautifully in the browser, every time.

If just the /etc/nginx/utilities/https_vhosts.php script is run, it honestly makes no difference to the SSL as it appears in the browser.

Numerous times, a customer has waited 24 hours after an SSL installation and then opened a ticket to point the issue out. We follow our own steps above, and it's working again.

@fevangelou
Member

@brixly Nginx's purge cache option is actually kicked in by bash /usr/local/src/engintron/engintron.sh purgecache (line 18 in https_vhosts.sh). Caches will be properly flushed and Apache restarted. By issuing /scripts/rebuildhttpdconf you actually tell Apache to regenerate its httpd.conf file. There's no reason to do this. The https_vhosts.sh already kicked in EXACTLY because it detected a change in httpd.conf. Re-generating httpd.conf once more will simply force https_vhosts.sh to kick in again in 15 seconds. In other words, you're forcing your server to restart every 15 seconds. Think about it.

I have pushed a new build today which allows setting the Nginx vhost generation interval to something bigger than 15 seconds. This may come in handy for servers with hundreds of sites, where generating the Nginx vhosts may take longer than the interval for checking httpd.conf, which in turn could lead to a broken Nginx config.

Update to today's build and then open up the file /etc/nginx/utilities/https_vhosts.sh and change the value INTERVAL from 15 to say 20 or 30. Verify changes take effect using a private window in your browser.

@jrsarath
Author

@fevangelou I will update and report back.
I have already verified this: https://github.com/engintron/engintron/wiki/SSL-certificate-changes-not-visible-in-Nginx%3F-Here's-a-possible-explanation-&-solution
I can also confirm that cron is running and https_vhosts.sh is being executed.

@jhawkins002

Been experiencing this problem for quite some time too; commented on other threads. Given the new release we tried today removing our CRON rules that automatically restart NGINX every 5 minutes to test if the new patches solved the issue on our servers.

Unfortunately I can confirm we at least still have the problem. Tested with both 30 second and 60 second intervals.

I can add that the script seems to kick off as it should, because newly created/updated domains are indeed detected by https_vhosts.sh and updates are applied to the NGINX default_https.conf file. NGINX simply doesn't seem to apply the new conf until a restart is manually invoked.

I do think @fevangelou's idea that there is a timing element to this is potentially right on the money. During our test this afternoon, our logs show the following:

Test account created at 14:28:41

Tue Oct 31 14:28:41 2017:CREATE:betatest:root:engin2.betatesting.as.ua.edu:67.43.4.132:engin2betatestin

LetsEncrypt Certificate installed at 14:29:45 (zulu-adjusted time)

[2017-10-31T19:29:41Z] The website “engin2.betatesting.as.ua.edu”, owned by “engin2betatestin”, has a faulty SSL certificate (OPENSSL_VERIFY:0:18:DEPTH_ZERO_SELF_SIGNED_CERT). AutoSSL will attempt to replace this certificate.
[2017-10-31T19:29:41Z] The system will attempt to renew SSL certificates for the following websites:
[2017-10-31T19:29:41Z] engin2.betatesting.as.ua.edu (engin2.betatesting.as.ua.edu www.engin2.betatesting.as.ua.edu mail.engin2.betatesting.as.ua.edu)
[2017-10-31T19:29:45Z] The system has installed a new certificate onto “engin2betatestin”’s website “engin2.betatesting.as.ua.edu”.

Engintron sees an updated configuration in Apache and rebuilds the NGINX default_https.conf accordingly at 14:29:01 - note above that AutoSSL doesn't actually complete until 14:29:45 so the initiating element may be the account creation and not the certificate install

stat /etc/nginx/conf.d/default_https.conf
  File: ‘/etc/nginx/conf.d/default_https.conf’
  Size: 53157     	Blocks: 104        IO Block: 4096   regular file
Device: fd03h/64771d	Inode: 1044384     Links: 1
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2017-10-31 14:35:42.126000000 -0500
Modify: 2017-10-31 14:29:01.686000000 -0500
Change: 2017-10-31 14:29:01.686000000 -0500
 Birth: -

After the configuration update, /var/log/messages shows that NGINX was never restarted to pick up the updated conf, as the last restart was at 14:23:34 (a restart we manually invoked during a prior test)

Oct 31 14:23:34 galactica systemd: Started nginx - high performance web server.

Hope this helps at least a little!

@jrsarath
Author

Yeah, actually I forgot to report back:
the problem still persists.

@pnboy

pnboy commented Nov 8, 2017

It seems the problem is with /etc/nginx/utilities/https_vhosts.sh
The line:
RUN_CHECK=$(/usr/bin/php -c /dev/null /etc/nginx/utilities/https_vhosts.php)

should probably be changed to something like:
RUN_CHECK=$(/usr/bin/php -c /dev/null /etc/nginx/utilities/https_vhosts.php > /dev/null 2>&1 ; echo $?)

@jhawkins002

Can confirm @pnboy's adjustment to https_vhosts.sh appears to have fixed the issue on our servers!

@brixly

brixly commented Nov 9, 2017

Done some further digging - Think we are getting somewhere!

#607

This made for a good read - I was struggling to get my head around the logic as sometimes the script was working, sometimes it didn't restart nginx.

I think the situation is this -

If a rebuildhttpdconf is called and within the 15 seconds (as set in define('HTTPD_CONF_LAST_CHANGED', 15)) the https_vhosts.sh script is called then it works perfectly.

If though, the rebuildhttpdconf happens, then 30 seconds later the cron kicks in, that change is no longer recognised and the purgecache is skipped entirely.

I have tried changing to 5 seconds and 30 seconds - 30 seconds works really well.
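
The timing window described above can be sketched in shell. This is not the code from https_vhosts.php, whose exact comparison is not shown in this thread; it only illustrates why a check that runs later than the window after a change sees nothing. `modified_within` is a hypothetical helper and relies on GNU `stat` and `date`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if FILE was modified within the
# last WINDOW seconds. Mimics an "mtime within N seconds" style check.
modified_within() {
    local file="$1" window="$2" now mtime
    now=$(date +%s)
    mtime=$(stat -c %Y "$file") || return 2
    [ $(( now - mtime )) -le "$window" ]
}
```

With a 15-second window, a run that happens 30 seconds after the file changed returns non-zero, so nothing downstream would be triggered; with a 60-second window the same run succeeds.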

@fevangelou - is there a reason this is set as low as 15 seconds? Would you mind explaining the logic?

@fevangelou
Member

fevangelou commented Nov 16, 2017

I have a question here...

If my original script did not work as expected, wouldn't Engintron NOT function properly for anyone?

I hope this is clear.

What you're facing is a problem with certain conditions:

  • you have servers with hundreds of cPanel accounts on them
  • your servers are not as powerful to hold so many accounts
  • given the above points, operations like Apache rebuilding its config take longer than on most cPanel servers with Nginx/Engintron on them and subsequently cause misbehaviour on Engintron's part while it's rebuilding the Nginx HTTPS vhosts.

The solution to the above problems is simple if you don't wish to change/upgrade your server hardware or distribute cPanel accounts to more servers. Simply raise HTTPD_CONF_LAST_CHANGED to a value higher than 30 secs. Make it 60 so that it's executed once a minute. This way, when a single site is added or removed, the change is mirrored in Nginx in 1 min tops.

For the record, the shell script https_vhosts.sh controls the process, while https_vhosts.php is the process to generate the Nginx HTTPS vhost files. When the PHP script outputs 1, it means the vhosts files have been regenerated. Then the shell script sees this value and tells Nginx to purge its cache and restart along with Apache. It's just 0 & 1 being output by the PHP script and this value is picked up by the shell script to determine whether it will purge Nginx's cache and restart both Nginx and Apache. The PHP script could output "Yes Sir" and "No Sir". It doesn't matter as long as the shell script knows what to expect as a control value in order to do what it's supposed to do.

So the fact that you're seeing something "change" magically when you output "YES" or "NO" is NOT based on that change but rather the circumstances. That's why your change fails later. And by circumstances, I switch back to my initial point made: lots of cPanel accounts or poor performing servers. Of course it's a matter of WHEN the changes occur. If you, as the server admin, perform any code changes when there are no clients using their cPanel accounts, it's natural that you'll see "YES" and "NO" working instead of 1 and 0. They are both valid of course, but you're simply testing things out when the scripts did not have any issue in the first place.

Reconsider your strategy. Either distribute cPanel accounts across more of those poorly performing servers (which would explode without Nginx either way) or simply upgrade your servers with the right components (e.g. RAM, which is dead simple to add).

@jhawkins002

That may be true in some cases.

I can report we have documented this issue occurring on a brand new cPanel server (Xeon E3, 4 cores, 16GB RAM) with no accounts other than the one we used to test Engintron and SSL. The server was also running a recent release of Engintron (i.e. the CRON format fix had been implemented). No other third-party plugins or cPanel addons were installed at the time.

The log data we provided above in this particular thread (showing NGINX never restarted) was from a staging server with ~ 60 accounts. It happens to be a Xeon E3 machine with 12GB that typically runs a load of 0.06 or so.

@fevangelou
Member

@jhawkins002 so you set up a stock cPanel server with Engintron on it and it had issues with vhost generation? On a valid and publicly accessible domain/subdomain? Without any other plugins or modifications in cPanel?

@jhawkins002

Precisely. The actual NGINX https conf appears to generate just fine (timestamp verified by stat) after the domain gets its SSL certificate, but NGINX itself never restarts - verified in /var/log/messages

@fevangelou
Member

Is the shell script https_vhosts.sh executable? Did you run the installation as root user?

@jhawkins002

Indeed.

[root@betatesting]# ls -l /etc/nginx/utilities/https_vhosts.sh
-rwxr-xr-x 1 root root 929 Nov  8 11:55 /etc/nginx/utilities/https_vhosts.sh

@fevangelou
Member

fevangelou commented Nov 16, 2017 via email

@brixly

brixly commented Nov 16, 2017

I can reproduce it - it's been the same for us but the blame is being pushed to the hardware.

Our situation is identical.

@fevangelou
Member

fevangelou commented Nov 16, 2017 via email

@brixly

brixly commented Nov 16, 2017

Although not many, I do see a few more people who have commented on this thread (as well as a number of similar related threads) - they / we are all having to try and add modifications and fixes to the system

@photogaff

Add me to the list of people experiencing the error - The only way I could resolve my issue was to disable and re-enable Engintron - there may have been another way, but that was my initial 'fix quickly' solution because I was seeing SSL errors on all letsencrypt configured domains.

@pnboy

pnboy commented Nov 17, 2017

Hi @fevangelou
The main issue is in /etc/nginx/utilities/https_vhosts.sh

That script counts on getting a 1 in the variable RUN_CHECK, as the return value of /etc/nginx/utilities/https_vhosts.php, in order to fire the purgecache. This will never happen, because what the command substitution captures is the script's output (its echo statements), not its exit status. The variable ends up with the content "HTTPS vhosts for Nginx re-created.\n"

For the script to work as intended it needs to be changed to something like:
RUN_CHECK=$(/usr/bin/php -c /dev/null /etc/nginx/utilities/https_vhosts.php > /dev/null 2>&1 ; echo $?)

so that it actually gets the return value of the command, not the output :)
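
The difference between a command's output and its exit status can be shown with a stand-in function. `fake_generator` is hypothetical: it only mimics a script that prints the message quoted above and exits with status 1, and it is an assumption that the real PHP script behaved this way.

```shell
#!/usr/bin/env bash
# Stand-in for the generator script: prints a message, exits with status 1.
fake_generator() {
    echo "HTTPS vhosts for Nginx re-created."
    return 1
}

# Plain command substitution captures stdout; the exit status is not
# part of the captured value (and is deliberately ignored here).
captured_output=$(fake_generator) || true

# Discard stdout/stderr and echo the exit status instead, so the
# variable holds the status, as in the suggested RUN_CHECK line.
captured_status=$(fake_generator > /dev/null 2>&1; echo $?)

echo "output: $captured_output"
echo "status: $captured_status"
```

This prints `output: HTTPS vhosts for Nginx re-created.` followed by `status: 1`, so a comparison against `1` only succeeds for the second form.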

@fevangelou
Member

@pnboy Good catch. Devised a bit differently overall, but you can see the changes in the just released v1.8.7.

@sambhav-aggarwal

I am facing the exact same issue. The expired certificates are cached until I restart (or clear the cache) manually. I have only a few sites (around 4) running on very powerful hardware.

Since the status of this issue is closed, does that mean it is fixed in the next version update? or do we need to make any changes manually to fix this?

Regards
Sam

@brixly

brixly commented Nov 28, 2017

@nuclearsam - try and upgrade to the latest version of Engintron (1.8.7)

The new version released by @fevangelou seems to have resolved the issue for us.

Check /etc/nginx/utilities/https_vhosts.log shortly after issuing a new SSL certificate - if the log has some contents, it will show the purgecache function running.

If nothing is being output to the logs, try editing the cron entry via the Engintron interface, then re-saving it.

@pnboy

pnboy commented Nov 29, 2017

@fevangelou it still seems not to work (for me). This is the output of RUN_CHECK:

<br /> <b>Warning</b>: date(): It is not safe to rely on the system's timezone settings. You are *required* to use the date.timezone setting or the date_default_timezone_set() function. In case you used any of those methods and you are still getting this warning, you most likely misspelled the timezone identifier. We selected the timezone 'UTC' for now, but please set date.timezone to select your timezone. in <b>/etc/nginx/utilities/https_vhosts.php</b> on line <b>40</b><br /> 1
0
0
0

@brixly

brixly commented Nov 29, 2017

This isn't because of engintron - you need to set your timezone in php

@pnboy

pnboy commented Nov 30, 2017

the command in /etc/nginx/utilities/https_vhosts.sh
/usr/bin/php -c /dev/null -q /etc/nginx/utilities/https_vhosts.php ; echo $?
overrides the php.ini timezone definition, so it has to be fixed in that engintron php script, either by suppressing the warning or by adding something similar to:

if (!ini_get('date.timezone')) {
    ini_set('date.timezone', 'UTC');
}

or by redirecting the output of the command to somewhere else

@imorandinwnp

Hi. Still experiencing AutoSSL being "cached" here. Only way to solve it is reloading nginx (service nginx reload).
https_vhosts.sh script does not solve the problem.
Engintron version 1.8.7

I suggest adding a service nginx reload to https_vhosts.sh script

@valiant1x

Issue still occurring in Engintron version 1.8.10, please reopen @jrsarath @brixly

@damianetienne

Indeed.

One of the comments above solves the problem, but after every update the updater seems to overwrite the changes.

@fevangelou
Member

@damianetienne Which one?

@imorandinwnp Nginx is reloaded when its cache is purged. Look closer ;)

@fevangelou reopened this Jul 6, 2018

@jhawkins002

Not sure if this adds to the discussion - but cPanel has FINALLY added official hooks for autossl events as of v.72. Hooray!

Changelog details: https://documentation.cpanel.net/display/72Docs/72+Release+Notes#id-72ReleaseNotes-NewStandardizedHooks

@fevangelou
Member

@jhawkins002 thanks for that. I'll have a look.
