This repository has been archived by the owner on Apr 21, 2023. It is now read-only.

Streaming/Chunked upstream responses are buffered when pagespeed is on #1163

Open
muralikg opened this issue Mar 25, 2016 · 10 comments

@muralikg

I have an upstream Django server that flushes HTML in chunks, which is then optimized by ngx_pagespeed. The config looks like this:

    gzip on;
    pagespeed on;
    pagespeed gzip off;
    pagespeed FileCachePath /var/cache/nginx-pagespeed;
    pagespeed HttpCacheCompressionLevel 0;    
    pagespeed RewriteLevel Passthrough;

NPS_VERSION=1.10.33.6
NGINX_VERSION=1.8.1
Nginx alone without pagespeed doesn't do any buffering.

I have tried the same with Apache mod_pagespeed: there is no buffering with Apache, and the optimized HTML is flushed immediately.

Is there any configuration that I am missing with nginx?

@jeffkaufman
Contributor

Hmm. That's not supposed to happen: we should be passing through flushes from the origin, not buffering, and we should do that even if you had all the rewriters on. (And we actually have some pretty complicated logic in our rewriters to make sure we can safely handle a flush anywhere.)

I just tested this with the echo module though, and I'm seeing exactly what you're seeing:

$ curl -sS www.jefftk.com/echoflushecho?PageSpeedFilters=
[2s pause]
<p>before flush</p>
<p>after sleep(2) and flush</p>

$ curl -sS www.jefftk.com/echoflushecho?PageSpeed=off
<p>before flush</p>
[2s pause]
<p>after sleep(2) and flush</p>

This is with:

location /echoflushecho {
    echo "<p>before flush</p>";
    echo_flush;

    echo_sleep 2;

    echo "<p>after sleep(2) and flush</p>";
}

@jeffkaufman
Contributor

I had thought we had a test for this case, but all I'm seeing is "Headers are not destroyed by a flush event".

@oschaaf
Copy link
Member

oschaaf commented Mar 25, 2016

@jeffkaufman there are some changes around flushing behavior in #936, which also contains a test we worked on for timing flushes.
(That PR also includes unrelated changes to shutdown behavior, but it may make sense to factor out just the flushing-related changes.)

@jeffkaufman
Contributor

I have a nice test for this; I'll send it out after lunch

@jeffkaufman
Contributor

jeffkaufman commented Mar 25, 2016

Actually, your test in #936 is good, and mine is too complicated.

The root of the problem here is that we never set the flush flag in nginx. Thinking about it now, we should probably set the flush flag on our output whenever it's set on the incoming buffer chain?

Looking over #936 and apache/incubator-pagespeed-mod#1066, it's not actually clear to me whether this would fix #1163 by default?

(I poked @morlovich again about reviewing apache/incubator-pagespeed-mod#1066 -- I'm sorry it's taking so long!)

@oschaaf
Member

oschaaf commented Mar 25, 2016

On flushing behaviour for the proposed change -- from https://github.com/pagespeed/ngx_pagespeed/pull/936/files#r28490250 :

When string_piece_to_buffer_chain is called by NgxBaseFetch it will always have the flush argument set to true.
By default, we will never flush at all except when we are done transforming (matching current behaviour).
We could consider switching the new follow_flushes argument to true for nps, which would make us flush whenever the upstream indicates it wants us to. We could recommend enabling flush_html for reverse-proxy setups.

The argument for making it an option is that with very chunky incoming content it might be nice to be able to not follow every flush. But I think defaulting to following flushes would be good?

@jeffkaufman
Contributor

I'm not sure we need an option not to follow upstream flushes; we've always followed them for apache with no complaints.

We could add an option not to follow them, but then either:

  • we shouldn't document it, and just keep it as an escape hatch in case there turned out to be something big we were missing, or
  • we should make it work for both apache and nginx and document it fully

@oschaaf
Member

oschaaf commented Mar 25, 2016

If Apache always behaved like that without trouble, that makes me think the option is overkill too.

But ProxyFetch does seem to handle decisions around flushing carefully, and we need a way to inform it that it should follow flushes for us in ngx_pagespeed (and the other async servers).

Perhaps remove the FollowFlushes option from the mps side in apache/incubator-pagespeed-mod#1066 and instead pass in whether ProxyFetches should always follow flushes when constructing the ProxyFetchFactory?

@morlovich
Contributor

Background:
ProxyFetch's flush handling was done for PageSpeedService, which was a proxy, so bytes were arriving over TCP. A lot of it is there to make sure data from the backend is flushed ASAP, so the first CWND gets sent out as soon as we get it, then the second one, and so on. Beyond that, it also tries to limit memory usage.

This is all already controllable via options->idle_flush_time_ms() and options->flush_buffer_limit_bytes(), and there is an overall enable switch in options->flush_html().
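For readers following along: flush_html corresponds to a documented pagespeed configuration option, so a reverse-proxy setup that wants eager flushing might look roughly like the fragment below. The directive names for the two tuning knobs are my guess at the usual option-name mapping and should be verified against the docs for your release; the byte and millisecond values are arbitrary examples.

```nginx
pagespeed on;
pagespeed FlushHtml on;                # overall enable switch (options->flush_html())
pagespeed IdleFlushTimeMs 10;          # assumed name for options->idle_flush_time_ms()
pagespeed FlushBufferLimitBytes 8192;  # assumed name for options->flush_buffer_limit_bytes()
```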


@mbunzel

mbunzel commented Apr 29, 2016

I'm seeing significantly elevated TTFB (200 ms instead of 70 ms with ?PageSpeed=off, for ~260 KB of HTML) using 1.11.33.0 on nginx 1.9.15 with fastcgi. This is a new install, so I have no previous data for comparison. In my case, enabling the inline_images filter seems to reduce TTFB by 50 ms. All details and debug log can be found here.

Any chance this is the same buffering issue (I'd expect pagespeed to at least flush the header asap) or just unavoidable parsing overhead?
