Publishing via POST sometimes timeouts when using multiple worker processes #509
Comments
Were there any worker crashes? I see nothing suspicious in the error log.
Nope, no crashes. I also tried the same scenario with the static nginx 1.10 + nchan 1.2.3 build available from the website, and got a slightly different outcome: in this case the channel ID didn't matter, and the failures are intermittent even when publishing to the same channel multiple times, whereas before it would either always succeed or always fail. Please note that for the initial report I used a 5 second timeout for the curl requests. If I leave that out, the request stalls until
Here are some more logs from static build + no timeout on client side:
Connected to ws://172.17.0.2:80/listen
id: 1548369334:-,[0],-,-
content-type: application/x-www-form-urlencoded
test @ 1548369334
id: 1548369394:-,[0],-,-
content-type: application/x-www-form-urlencoded
test @ 1548369334
id: 1548369394:-,[1],-,-
content-type: application/x-www-form-urlencoded
test @ 1548369334
id: 1548369396:-,[0],-,-
content-type: application/x-www-form-urlencoded
test @ 1548369396
id: 1548369398:-,[0],-,-
content-type: application/x-www-form-urlencoded
test @ 1548369398
id: 1548369458:-,[0],-,-
content-type: application/x-www-form-urlencoded
test @ 1548369398
id: 1548369458:-,[1],-,-
content-type: application/x-www-form-urlencoded
test @ 1548369398
$ for i in test test test; do curl -v -d "$i @ `date +%s`" http://172.17.0.2/pub/$i; sleep 2; done
* Trying 172.17.0.2...
* TCP_NODELAY set
* Connected to 172.17.0.2 (172.17.0.2) port 80 (#0)
> POST /pub/test HTTP/1.1
> Host: 172.17.0.2
> User-Agent: curl/7.63.0
> Accept: */*
> Content-Length: 17
> Content-Type: application/x-www-form-urlencoded
>
* upload completely sent off: 17 out of 17 bytes
* Empty reply from server
* Connection #0 to host 172.17.0.2 left intact
curl: (52) Empty reply from server
* Trying 172.17.0.2...
* TCP_NODELAY set
* Connected to 172.17.0.2 (172.17.0.2) port 80 (#0)
> POST /pub/test HTTP/1.1
> Host: 172.17.0.2
> User-Agent: curl/7.63.0
> Accept: */*
> Content-Length: 17
> Content-Type: application/x-www-form-urlencoded
>
* upload completely sent off: 17 out of 17 bytes
< HTTP/1.1 201 Created
< Server: nginx/1.10.1
< Date: Thu, 24 Jan 2019 22:36:36 GMT
< Content-Type: text/plain
< Content-Length: 101
< Connection: keep-alive
<
queued messages: 4
last requested: -1 sec. ago
active subscribers: 1
* Connection #0 to host 172.17.0.2 left intact
last message id: 1548369396:0
* Trying 172.17.0.2...
* TCP_NODELAY set
* Connected to 172.17.0.2 (172.17.0.2) port 80 (#0)
> POST /pub/test HTTP/1.1
> Host: 172.17.0.2
> User-Agent: curl/7.63.0
> Accept: */*
> Content-Length: 17
> Content-Type: application/x-www-form-urlencoded
>
* upload completely sent off: 17 out of 17 bytes
* Empty reply from server
* Connection #0 to host 172.17.0.2 left intact
curl: (52) Empty reply from server
Quick update: I'm on this, I can reproduce it, but the issue is going to take some deep digging inside Nginx guts. Will update when I have it figured out.
I'm having the same issue using nginx 1.16 and nchan 1.2.5. Any updates?
Please let me know if this is still happening in version 1.2.6.
I'm seeing this on nginx 1.17.5 and nchan 1.2.6, though I don't believe I'm using nchan_authorize_request and nchan_publisher_upstream_request. They're not in my nchan conf, anyway. Changing the worker processes to 1 seems to help.
When publishing via POST requests, it seems that some channel IDs work fine, but others do not, if the following conditions are met:
- worker_processes 2; (or more)
- nchan_authorize_request and nchan_publisher_upstream_request are both configured
When publishing to the broken channels, subscribers receive the message immediately, but the publisher's connection hangs until a timeout occurs, and when that happens, subscribers receive the same message again with a new ID. I've attached below a debug log, subscriber websocket output and a curl request log.
If I change the config to worker_processes 1; or remove either of the upstream requests, the issue disappears. The ID values themselves don't seem to matter: I originally stumbled upon this when trying to use "public" together with numeric IDs, and some numbers would consistently work, while others would consistently fail.
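For anyone trying to reproduce this, a configuration meeting the conditions above might look roughly like the sketch below. It is not the reporter's actual config: the /auth and /upstream_pub location names, the 127.0.0.1:8080 backend, and the subscriber's fixed channel ID are assumptions made for illustration. Only the /pub/<id> and /listen paths and the directive names come from this thread.

```nginx
# Two or more workers are reported as necessary to trigger the hang.
worker_processes 2;

events {
    worker_connections 1024;
}

http {
    server {
        listen 80;

        # Publisher endpoint, e.g. POST /pub/test (channel ID taken from the path).
        location ~ ^/pub/(\w+)$ {
            nchan_publisher;
            nchan_channel_id $1;
            # Both upstream hooks enabled; removing either one reportedly avoids the hang.
            nchan_authorize_request /auth;                   # hypothetical location name
            nchan_publisher_upstream_request /upstream_pub;  # hypothetical location name
        }

        # Subscriber endpoint, matching the ws://.../listen URL seen in the logs.
        location = /listen {
            nchan_subscriber;
            nchan_channel_id test;  # assumption: fixed channel for this sketch
        }

        # Authorization subrequest target (hypothetical backend address).
        location = /auth {
            internal;
            proxy_pass http://127.0.0.1:8080/auth;
            proxy_pass_request_body off;
            proxy_set_header Content-Length "";
        }

        # Publisher upstream subrequest target (hypothetical backend address).
        location = /upstream_pub {
            internal;
            proxy_pass http://127.0.0.1:8080/pub;
        }
    }
}
```

With something like this in place, switching to worker_processes 1; or commenting out either of the two upstream directives should, per the report, make the hang go away.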
nginx.log
Publisher: