Increase nginx/php timeout: Drupal distribution fails to install due to " epoll_wait() reported that client prematurely closed connection" #672

Closed
budda opened this issue Feb 22, 2018 · 34 comments

@budda

budda commented Feb 22, 2018

What happened:

Attempted to install the Varbase distribution. During the "Install Site" section of the install.php process it eventually halts, before completing the 108 tasks, with the fatal error:

An AJAX HTTP request terminated abnormally.
Debugging information follows.
Path: /core/install.php?profile=varbase&langcode=en&id=1&op=do_nojs&op=do
StatusText: error
ResponseText: 
ReadyState: 0

Reviewing the logs I found:

2018/02/12 15:14:37 [info] 23#23: *163 client 127.0.0.1 closed keepalive connection
2018/02/12 15:14:38 [info] 22#22: *161 epoll_wait() reported that client prematurely closed connection, so upstream connection is closed too while sending request to upstream, client: 172.18.0.3, server: _, request: "POST /core/install.php?profile=varbase&langcode=en&id=1&op=do_nojs&op=do&_format=json HTTP/1.1", upstream: "fastcgi://unix:/run/php-fpm.sock:", host: "testvarbase.ddev.local", referrer: "http://testvarbase.ddev.local/core/install.php?profile=varbase&langcode=en&id=1&op=start"
172.18.0.3 - - [12/Feb/2018:15:14:38 +0000] "POST /core/install.php?profile=varbase&langcode=en&id=1&op=do_nojs&op=do&_format=json HTTP/1.1" 499 0 "http://testvarbase.ddev.local/core/install.php?profile=varbase&langcode=en&id=1&op=start" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.140 Safari/537.36"
2018/02/12 15:14:43 [info] 23#23: *164 client 127.0.0.1 closed keepalive connection
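
For readers following along: the 499 in the access-log line above is nginx's own status code for a request whose client closed the connection before a response was sent, which matches the epoll_wait() message. Below is a minimal Python sketch for pulling such entries out of an access log. It assumes the combined log format seen above; the function name `find_status` and the abridged sample line are illustrative, not part of ddev.

```python
import re

# Matches the request and status fields of an nginx "combined" access-log line.
LINE_RE = re.compile(r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) ')


def find_status(lines, wanted=("499", "504")):
    """Return (status, method, path) for log lines whose status is in `wanted`."""
    hits = []
    for line in lines:
        match = LINE_RE.search(line)
        if match and match.group("status") in wanted:
            hits.append(
                (match.group("status"), match.group("method"), match.group("path"))
            )
    return hits


# Sample abridged from the access-log line quoted above (referrer and
# user agent shortened for readability).
sample = [
    '172.18.0.3 - - [12/Feb/2018:15:14:38 +0000] '
    '"POST /core/install.php?profile=varbase&langcode=en&id=1&op=do_nojs&op=do&_format=json HTTP/1.1" '
    '499 0 "http://testvarbase.ddev.local/core/install.php" "Mozilla/5.0"',
]
print(find_status(sample))
```

A 504 found the same way would instead point at a proxy giving up on its upstream, which is the error reported later in this thread.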

Do services configured by ddev need longer timeouts when doing intensive things like database queries during install?

What you expected to happen:

Drupal install.php to complete the process and a finished & configured site to be available.

How to reproduce this:

Use composer create-project as directed and then run through the Drupal 8 install.php wizard.

Version:

cli:       	v0.11.0
web:       	drud/nginx-php-fpm-local:v0.9.0
db:        	drud/mysql-local-57:v0.6.3
dba:       	drud/phpmyadmin:v0.2.0
router:    	drud/ddev-router:v0.5.0
commit:    	v0.11.0
domain:    	ddev.local
build info:	Built Mon Jan 8 17:44:21 UTC 2018 circleci@prealloc-thzqktqj-e98cafb1-eaae-4090-9ae8-bbd7e80cb02d drud/golang-build-container:v0.5.0
Client:
 Version:	17.12.0-ce
 API version:	1.35
 Go version:	go1.9.2
 Git commit:	c97c6d6
 Built:	Wed Dec 27 20:03:51 2017
 OS/Arch:	darwin/amd64

Server:
 Engine:
  Version:	17.12.0-ce
  API version:	1.35 (minimum version 1.12)
  Go version:	go1.9.2
  Git commit:	c97c6d6
  Built:	Wed Dec 27 20:12:29 2017
  OS/Arch:	linux/amd64
  Experimental:	true
APIVersion: "1"
name: testvarbase
type: drupal8
docroot: docroot
php_version: "7.1"
webimage: drud/nginx-php-fpm-local:v0.9.0
dbimage: drud/mysql-local-57:v0.6.3
dbaimage: drud/phpmyadmin:v0.2.0
provider: default

Anything else we need to know:

Repeating the same install process reproduces the same fatal error.

Related source links or issues:

https://www.drupal.org/project/varbase

@budda
Author

budda commented Feb 22, 2018

I updated to the latest ddev, reset the DB data, and re-installed. This time, after about 25-30 minutes of installing, it ended with a 504 Gateway Time-out error page from NGINX.

@cyberswat
Contributor

cyberswat commented Feb 22, 2018

Thank you for the detailed issue. I believe I was able to replicate this against the 8.4.x branch of the varbase github repository. I used all default settings during the install process of the distribution.

screen shot 2018-02-22 at 10 08 34 am

➜  varbase-project git:(8.4.x) ✗ ddev version
db    	drud/mariadb-local:v0.7.1
dba   	drud/phpmyadmin:v0.2.0
router	drud/ddev-router:v0.5.0
commit	v0.13.1
domain	ddev.local
cli   	v0.13.1
web   	drud/nginx-php-fpm-local:v0.9.3
➜  varbase-project git:(8.4.x) ✗ docker --version
Docker version 17.12.0-ce, build c97c6d6

@budda
Author

budda commented Feb 22, 2018

That's the error -- but I'm getting it much earlier in the process, before "Configure site"!

@rfay
Member

rfay commented Feb 22, 2018

I've seen this. I think we need to increase the timeout in the nginx config. However, when I saw it, just reloading the URL got things all finished up.

Also, Docker 17.12 is strongly discouraged, as it has a hanging problem. 17.09 and 18.02 (edge) seem to be OK at this point, and they promise to fix 17.12.

@rfay rfay changed the title from 'Drupal distribution fails to install due to " epoll_wait() reported that client prematurely closed connection"' to 'Increase nginx/php timeout: Drupal distribution fails to install due to " epoll_wait() reported that client prematurely closed connection"' Feb 22, 2018
@budda
Author

budda commented Feb 22, 2018

I'll upgrade Docker with Brew shortly and try again.

It seems any Drupal distro is painfully slow to install with Docker/ddev.
I'm using a Macbook Air 1.4 GHz Intel Core i5 which might not help.

@rfay
Member

rfay commented Feb 22, 2018

I use a 2013 MacBook Air (1.7 GHz Intel Core i7), and it's pretty old and long in the tooth. So yeah, yours sounds a little old. Adding memory to Docker has also helped lots of people (as has being careful not to run a bunch of sites at one time).

@budda
Author

budda commented Feb 22, 2018

Mine's Early 2014 - time flies :)
I upped the Docker RAM from 2GB (default) to 4GB.

@rfay
Member

rfay commented Feb 22, 2018

Also, I'm increasing the nginx timeout waiting for php-fpm in https://github.com/drud/docker.nginx-php-fpm-local/pull/51; I will also look at the php timeout.

@rfay
Member

rfay commented Feb 22, 2018

Saw your earlier comment, apparently deleted. For edge with Homebrew and Docker I think you have to run brew cask install docker-edge, but I might be wrong.

@budda
Author

budda commented Feb 22, 2018

Yeah, I realised I needed to re-install the specific edge package to get v18.x after some knocking around (new to Docker).

The 4GB memory increase got me through to the "Assemble extra components" stage of the install process before a failure of "504 Gateway Time-out nginx/1.13.6".

@rfay
Member

rfay commented Feb 22, 2018

You can provide your own nginx config if you want, and bump the nginx timeout: https://ddev.readthedocs.io/en/latest/users/extend/customization-extendibility/#providing-custom-nginx-configuration

I'll also have a new container with increased timeout ready in a bit.

@rfay
Member

rfay commented Feb 22, 2018

I'd love to hear if putting webimage: drud/nginx-php-fpm-local:20180222_multi_cms_nginx in your .ddev/config.yaml and ddev start gets you any farther along.

Oh, but you need to upgrade your ddev to v0.13.1 please. I don't think there's any risk for you in that. brew upgrade ddev - your v0.11.0 seems SOOOO long ago :) You'll want to rm .ddev/docker.compose.yml as well.

@budda
Author

budda commented Feb 22, 2018

Tried the new webimage but it didn't start:

$ ddev start
Creating ddev-testvarbase-db ... done
Pulling web (drud/nginx-php-fpm-local:20180222_multi_cms_nginx)...
Creating ddev-testvarbase-web ... done
Creating ddev-testvarbase-web ...
Creating ddev-testvarbase-dba ...

Network ddev_default is external, skipping
Creating ddev-router ... done


Failed to start testvarbase: web service health check timed out

@rfay
Member

rfay commented Feb 22, 2018

Shoot, guess that's bleeding edge for you. I don't have that problem; it came up fine for me. You might try ddev rm --remove-data and then a ddev start again. It blows away your db, but since you're just trying to do an install anyway it shouldn't matter.

@rfay
Member

rfay commented Feb 22, 2018

Oh... and if you're now on Edge 18.02... there are some weird problems with that. You might have to go back to 17.09.

@rfay
Member

rfay commented Feb 22, 2018

I can reproduce your problem with varbase. I suspect they're doing a really long-running item in the batch stuff. I even increased the php max_execution_time and nginx timeout to 999.

My bet is that the varbase folks have seen this before and know why it is. I think they're doing something really, really long in there.

I assume you have plenty of experience with installing varbase?

@budda
Author

budda commented Feb 22, 2018

Heh, no I was just wanting to give Varbase a try and have no experience of it!

I was also trying the Acquia Lightning distro earlier, which failed too, but with a different issue, it seems.

@rfay
Member

rfay commented Feb 22, 2018

OK, well shucks - we don't have any way to know if this is a problem in ddev or somewhere else, like varbase.

If you have a comparable environment, like a local webserver/php config, you can try these things out there to see.

@rfay
Member

rfay commented Feb 22, 2018

BTW, when it hit the error I clicked "Continue to the error page" and it continued on and appeared to complete successfully.

Drupal's batch stuff is a pretty fragile mix in general.

@budda
Author

budda commented Feb 22, 2018

Just setting up geerlingguy's Drupal VM to test the Varbase distro on as a comparable test.

When I clicked continue once, it skipped all the extras from the Varbase installer. It did let me move around the Drupal install, but the site appeared unfinished.

@rickmanelius rickmanelius self-assigned this Feb 23, 2018
@rickmanelius rickmanelius added this to the v1.2.0 milestone Feb 23, 2018
@rfay
Member

rfay commented Apr 16, 2018

We haven't had enough reports of problems like this to debug it, so I'm going to close it. But anybody is free to reopen or add additional information.

@rfay rfay closed this as completed Apr 16, 2018
@ikit-claw

If you run long processes you can get this error. I get it when running the Hacked! module to check the modules on a site with 200+ modules. So this really needs to be addressed. Also, only the owner can reopen the issue.

@ikit-claw

The duration needs increasing. The output tells me how far it got into the scan: 75%.

@cweagans cweagans reopened this Apr 26, 2018
@cweagans
Contributor

I'm wondering if we should just set an absurdly high timeout (like 36000 seconds) and call it a day.

@ikit-claw

Yeah, it might be wise. Composer has a similar issue, where you can just run a command telling it memory is unlimited to get past it.

@budda
Author

budda commented Apr 26, 2018

Sorry, I've not had any time since February to revisit this and test the Varbase distro on Drupal VM to compare the issue. Good to see at least somebody else has noticed similar problems now.

@rfay
Member

rfay commented Apr 26, 2018

I think there are a couple of approaches:

  • Increasing nginx and php timeouts way high, as @cweagans says. It's not that bad a thing. The timeout normally exists to prevent webserver attacks or abusive usage, so it doesn't matter on a local environment.
  • Every user can already override both nginx and php configs, so this is possible today for anybody who wants a very long timeout.

@ikit-claw

Perhaps a little howto for those that want this?

@rfay
Member

rfay commented Apr 27, 2018

I should note that only poorly designed code goes underwater for longer than a normal page timeout, and such code won't work in most server environments. That's why php-cli has a timeout of 0 (never time out), so you can do strictly PHP, non-website work on the command line, and why big jobs are often done with drush. So does the Hacked! module have a drush interface? Because that would normally be the way to go.

The way to change the timeout:

  • Custom nginx config changing fastcgi_read_timeout 360;
  • Custom php config changing:
      max_execution_time = 360
      request_terminate_timeout = 360

Custom nginx and php configs are discussed in https://ddev.readthedocs.io/en/latest/users/extend/customization-extendibility/
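
As a rough illustration of the PHP half of the list above, the following Python sketch writes the two settings into an override file. It assumes custom PHP settings are picked up from `*.ini` files in a `.ddev/php/` directory of the project, which should be checked against the linked documentation for your ddev version. The filename `timeouts.ini` and the helper name are hypothetical.

```python
from pathlib import Path


def write_php_timeouts(project_dir, seconds=360, filename="timeouts.ini"):
    """Write the PHP timeout overrides named in the comment above.

    Assumption: ddev reads *.ini files from <project>/.ddev/php/.
    The filename is a hypothetical choice, not a ddev convention.
    """
    php_dir = Path(project_dir) / ".ddev" / "php"
    php_dir.mkdir(parents=True, exist_ok=True)
    target = php_dir / filename
    target.write_text(
        f"max_execution_time = {seconds}\n"
        f"request_terminate_timeout = {seconds}\n"
    )
    return target
```

Calling `write_php_timeouts(".", seconds=600)` from the project root would create the file under these assumptions. Custom configuration only takes effect when the container is created, so a `ddev restart` would be needed afterwards.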

Also, please make sure you don't have xdebug enabled.

The current timeout is 6 minutes. Any php process that's doing something for 6 minutes without coming up for air is not intending to populate a page :) It's probably mining bitcoin.

@budda
Author

budda commented Apr 29, 2018

"It's probably mining bitcoin."

Hah!

@mglaman
Contributor

mglaman commented May 3, 2018

EDIT: My problem was something else.

This keeps happening for me, on a project which was running fine a few days ago: my Commerce 2x dev environment.

[03:43 PM]-[mglaman@Matts-MacBook-Pro]-[~/Drupal/sites/commerce2x] 
$ ddev start
Starting environment for commerce2x... 
Using custom PHP configuration: [20-xdebug.ini] 
Custom configuration takes effect when container is created, 
usually on start, use 'ddev restart' if you're not seeing it take effect. 
Creating ddev-commerce2x-db ... done
Creating ddev-commerce2x-web ... done
Creating ddev-commerce2x-dba ... done
 
Network ddev_default is external, skipping 
Creating ddev-router ... done
 
Failed to start commerce2x: web service health check timed out 
$ ddev version
cli   	v0.17.0                        
web   	drud/nginx-php-fpm-local:v1.2.2
db    	drud/mariadb-local:v0.9.0      
dba   	drud/phpmyadmin:v0.2.0         
router	drud/ddev-router:v0.5.0        
commit	v0.17.0                        
domain	ddev.local 
$ docker version
Client:
 Version:      18.05.0-ce-rc1
 API version:  1.37
 Go version:   go1.9.5
 Git commit:   33f00ce
 Built:        Thu Apr 26 00:58:56 2018
 OS/Arch:      darwin/amd64
 Experimental: false
 Orchestrator: swarm

Server:
 Engine:
  Version:      18.05.0-ce-rc1
  API version:  1.37 (minimum version 1.12)
  Go version:   go1.10.1
  Git commit:   33f00ce
  Built:        Thu Apr 26 01:06:49 2018
  OS/Arch:      linux/amd64
  Experimental: true

@rfay
Member

rfay commented May 3, 2018

This sounds different from the OP. You're having the web service health check time out. Does ddev logs show anything interesting?

Please delete any webimage: line in your config.yaml. Probably run ddev rm, then rm -rf .ddev, then ddev config and ddev start.

@mglaman
Contributor

mglaman commented May 3, 2018

Yeah, I'm wrong. Just searching for the error brought this up. It was due to an xdebug config override.

@rfay
Member

rfay commented May 16, 2018

I think this turns out to be #844 - the ddev-router nginx timeout has to be increased.
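
One way to see why the router matters: a request passes through several layers, each with its own limit, and the first limit to expire ends the request. Here is a small Python sketch of that reasoning. The layer names and the 60-second router value are illustrative assumptions, not values taken from ddev.

```python
def effective_timeout(layers):
    """Return (layer_name, seconds) for the layer that will time out first.

    `layers` maps a layer name to its timeout in seconds; 0 means "no limit",
    following the php-cli convention mentioned earlier in the thread.
    Returns None when no layer imposes a limit.
    """
    limited = {name: secs for name, secs in layers.items() if secs > 0}
    if not limited:
        return None
    name = min(limited, key=limited.get)
    return name, limited[name]


# Illustrative values only: raising the web container's limits does not help
# while a proxy in front of it still gives up sooner.
chain = {
    "router nginx (hypothetical value)": 60,
    "web nginx fastcgi_read_timeout": 360,
    "php max_execution_time": 360,
}
print(effective_timeout(chain))
```

Under these assumed numbers the router is the bottleneck, which would be consistent with the 504 pages reported above persisting after the web container's timeouts were raised.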

@dclear dclear removed this from the v1.2.0 milestone May 17, 2018
@rfay rfay closed this as completed in 2d3bab3 May 18, 2018
@dclear dclear added this to the v0.19.0 milestone May 31, 2018
@dclear dclear removed the incubate label May 31, 2018