Skip to content
This repository has been archived by the owner on Jan 29, 2020. It is now read-only.

Removes Content-Length injection and output buffer flushing #270

Merged
merged 4 commits into from
Sep 13, 2017

Conversation

weierophinney
Copy link
Member

Per discussion on #251, we have two "features" in Diactoros that tend to cause problems:

  • Auto-injection of the Content-Length header causes issues when any output is emitted via the output buffer. What tends to happen is that the contents of the output buffer are emitted up to the Content-Length. As a result, this content is truncated, or, in the case that it is shorter, the original content of the response is truncated. In either situation, the results are difficult to debug.
  • By flushing the output buffer, we run into two issues: the Content-Length issues previously mentioned, but also mixed content, which typically breaks clients.

As discussed in the issue, the proper solution is a couple of BC breaks:

  • We remove the Content-Length auto-injection. If developers want this, they can now use Zend\Expressive\Helper\ContentLengthMiddleware from zend-expressive-helpers 4.1.0.
  • We no longer do anything with output buffers in either the SAPI emitters or the Server class. Emitters now check for either headers sent, or an existing output buffer with content length > 0; if either case occurs, they raise an exception, as the emitter cannot properly emit the response.

Essentially, this approach forces the developer to choose to use PSR-7 fully, or create their own emitter and/or server implementation if they want to allow mixed global/PSR-7 usage.

By removing a call to `ob_start()` within `Server::listen()`, we can
solve the issue of detecting when we have both a response and content in
the output buffer; in most cases, we will have already sent headers,
which will cause an exception to be raised; we can also check the
_current_ output buffer and, if non-empty, raise an exception.

This means we can:

- Remove the `SapiEmitterTrait::flush()` implementation, and all calls
  to it.
- Remove the `$maxBufferLevel` argument to each emitter.
- Remove tests regarding interactions of emitters with the output buffer.

This is a backwards-incompatible change. However, it fixes a rather
tricky problem that occurs currently when mixing buffered output and
response instances.
@weierophinney
Copy link
Member Author

Pinging @mindplay-dk, @stuartherbert, and @mnapoli for review.

Per zendframework#251, this patch removes Content-Length injection from SAPI stream
emitters. It also renames the `checkForPreviousOutput()` method to
`assertNoPreviousOutput()` to make it more clear that the method is a
no-op under proper usage.

The `assertNoPreviousOutput()` method also borrows an idea from the
comment to the issue, indicating we can check for (1) a non-zero output
buffer, and (2) use `ob_get_length()` to determine if it has content.
* @param int $maxBufferLength Maximum output buffering size for each iteration
*/
public function emit(ResponseInterface $response, $maxBufferLevel = null, $maxBufferLength = 8192)
public function emit(ResponseInterface $response, $maxBufferLength = 8192)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ironically, we changed the signature of this method to swap the $maxBufferLength and $maxBufferLevel arguments for 1.5.0, fixing an issue whereby 2 bytes at a time were streamed; had I resolved this issue prior to that release, we would not have needed to make that change, and then revert it here.

}

return $response;
if (ob_get_level() > 0 && ob_get_length() > 0) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mindplay-dk: note that I separated out this from the headers_sent() check; I did this to allow having discretely different exception messages.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

line 32: the PR is called 'Remove output buffer awareness', but line 32 is checking the output buffer :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically, so is calling headers_sent(). :)

I suppose I should retitle it "remove output buffer flushing"...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-named the PR to "Removes Content-Length injection and output buffer flushing".


class ServerIntegrationTest extends TestCase
{
public function testPassesBufferLevelToSapiStreamEmitter()
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test was added as part of the patch around the argument swap in SapiStreamEmitter, and thus became unnecessary once we stopped working with the output buffer.

@mwillbanks
Copy link
Contributor

This looks good to me, the BC break is unfortunate but better to break it now.

Copy link
Member

@DASPRiD DASPRiD left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't spot anything bad here, approved.

- Ensures they do not discuss output buffering.
- Adds section on `SapiStreamEmitter`.
@weierophinney weierophinney changed the title Removes Content-Length and output buffer awareness Removes Content-Length and output buffer flushing Sep 12, 2017
@weierophinney weierophinney changed the title Removes Content-Length and output buffer flushing Removes Content-Length injection and output buffer flushing Sep 12, 2017
{
if (headers_sent()) {
throw new RuntimeException('Unable to emit response; headers already sent');

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why was this removed? You won't be able to emit the status line or headers if output has already been started - the first call to header() will error. Isn't it better to have the exception, which explains why?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, never mind, you just moved it somewhere else ;-)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😄

@mindplay-dk
Copy link

mindplay-dk commented Sep 13, 2017

Wait, one thing... and maybe this was already discussed, but I'm a bit fuzzy right now... it's now never injecting the Content-Length header - not even when there's a Stream with a known length? The issue was with output buffering vs the content-length, but if there's no output in the buffer, and we know we're emitting the stream only, and it has a known size, is it any problem to emit the missing Content-Length header?

@weierophinney
Copy link
Member Author

@mindplay-dk For those cases, I introduced the ContentLengthMiddleware in the zend-expressive-helpers package. You can slipstream that into pipelines where you know you want/need to emit the Content-Length header. Alternately, you can calculate it and inject it manually into the response you return.

What we're doing here is simply saying: we won't auto-inject it.

We could continue doing that, I suppose, as we're now asserting that the output buffer has not been used before we emit anything. I'd gone with the suggestion @stuartherbert made and removed that auto-injection, as the main reason it had been present previously was due to an assumption that HTTP/1.1 servers would degrade the connection if it was missing, and he discovered that was not the case. The spec indicates that if it is not present, the server should send a Transfer-Encoding: chunked header, and most of the surveyed servers do exactly that, making it work fine.

I'd be fine with re-adding the auto-injection; my main concern is that it might still lead to issues. As an example, with the current checks, the following may still occur:

  • we may have a nested output buffer at a level greater than 1
  • the current output buffer may have no contents
  • but a parent output buffer might

In those circumstances, we might still end up with the current situation where we end up with truncated content, which would be difficult to debug. With the proposed solution, we can ask the reporter if they have enabled the ContentLengthMiddleware; if they have, we can ask them to disable it to see if the problem still exists. If it does, they now have a path for debugging; if not, it's far harder.

Thoughts?

@stuartherbert
Copy link

I did some testing on this, the results are here: #251 (comment)

Long and short of it: it'll only break non-compliant clients that haven't fully implemented HTTP/1.1 support, if you're using PHP-FPM. That's acceptable to my user base. I can appreciate it might not be acceptable for everyone.

@weierophinney weierophinney merged commit a22b273 into zendframework:develop Sep 13, 2017
weierophinney added a commit that referenced this pull request Sep 13, 2017
@weierophinney weierophinney deleted the hotfix/251a branch September 13, 2017 14:27
@mindplay-dk
Copy link

@weierophinney hmm, I think I'm actually OK with the emitter not adding the Content-Length header - I personally don't want middleware for something like this, but I think it might make more sense for us to do that in our server abstraction, because we have harder constraints; for one, we don't want people to use echo statements or output buffers at all - we want no dependency on global state, we rather want everything that gets emitted to get emitted by the emitter.

Since the emitter isn't doing the header anymore, this gives us a bit more control in this regard.

I think this is a great improvement.

One last remark: I was a bit surprised by this change being made available in a minor release. This is a breaking change - in some cases with potentially problematic issues arising from it, such as (for example) losing the progress bar on large downloads, since the client can no longer know the content length.

There were also breaking API changes to SapiEmitter::emit() and SapiStreamEmitter::emit() - the latter being potentially a serious BC break, since the $maxBufferLength argument took the position of $maxBufferLevel, which means you might accidentally set the buffer level to just a few bytes (!)

Perhaps you're not committed to SEMVER? These days I kind of assume everyone is. If not, I think maybe I need to adjust our version constraint and stop using the ^ operator?

@froschdesign
Copy link
Member

@mindplay-dk
The Zend Framework uses semantic versioning.

…losing the progress bar on large downloads, since the client can no longer know the content length.

You have to add the new middleware, this ensures backward compatibility. From the release notes:

They no longer auto-inject a Content-Length header. If you need this functionality, zendframework/zend-expressive-helpers 4.1+ provides it via Zend\Expressive\Helper\ContentLengthMiddleware.

There were also breaking API changes to SapiEmitter::emit() and SapiStreamEmitter::emit()…

Right! This is a problem, because both methods are a part of the public API and these are incompatible changes.
The arguments should marked as deprecated and not removed.

/cc @weierophinney

@froschdesign
Copy link
Member

@weierophinney
This issue has a label "BC break" and is marked with the milestone for version 1.6.0?!

@weierophinney
Copy link
Member Author

The rationale for releasing this in a new minor version, versus a major version, is that the component was fundamentally broken without the fix; pushing it to a new major version would prevent many developers from adopting the fixed code.

It's not something I do lightly, or often. However, when the only way to provide a fix is via a BC break, it's something we consider.

In this particular case, I felt the drawbacks would not affect the majority of users. As Stuart's research shows, the initial rationale for auto-injecting the Content-Length header — ensuring the server does not degrade the connection from HTTP/1.1 to HTTP/1.0 — was misguided, as the majority of web servers and web server + PHP SAPI combinations ensure that a Transfer-Encoding: chunked header is present, preserving HTTP/1.1 semantics. Further, the issues raised regarding the inability to debug properly when the Content-Length header was present an errors were raised were significant, and the fix in this patch makes it far easier to diagnose such issues. Finally, the changes to the emitters were interesting in that the second and additional arguments are not defined in the interface, and the only code that was passing any additional arguments was the Server implementation... which, with these changes, could now call the emitters with only the original, and only defined, argument.

As such, I determined that for the vast majority of users, there is no perceivable BC break. Those affected will be:

  • those who were requiring a Content-Length header to be sent; a solution is now present for those users in the form of Zend\Expressive\Helper\ContentLengthMiddleware.
  • those who were extending the various SAPI emitters and providing their own Server implementation in order to pass the additional arguments to them.

The latter is very much an advanced use case. The former is not, but is a minority use case, and, as noted addressed easily.

I stand by the decision to release as a minor release. Again, as noted, it was not a decision taken lightly, but in the interest of getting a substantial fix to as many users as possible.

@mindplay-dk
Copy link

@weierophinney thank you for the clarification. makes sense! I don't know if SEMVER really stipulates anything one way or the other for a "breaking fix" - it's not an everyday thing, I suppose. Good work :-)

@teohhanhui
Copy link

It kinda defeats the purpose of semver if the user cannot trust that you wouldn't introduce BC breaks.

pushing it to a new major version would prevent many developers from adopting the fixed code.

The user should be the one making the decision to upgrade.

However, when the only way to provide a fix is via a BC break, it's something we consider.

It could be released as a major version. The old version could be marked deprecated.

@weierophinney
Copy link
Member Author

@teohhanhui Shipping a bugfix that requires the user to change their dependency constraints to adopt is also a poor user experience.

Yes, in a perfect world, we would not ever, ever break backwards compatibility in a bugfix or minor release. Practically speaking, however, sometimes it's a necessary evil. I can count on one hand how many times I've done this in the past 10 years.

@mindplay-dk
Copy link

@weierophinney

@teohhanhui does make a valid point here - I too was surprised that this was released as a non-breaking change. Developers rely on composer update to get bug-fixes - if we roll out breaking changes against SEMVER, we're breaking the intended workflow.

Shipping a bugfix that requires the user to change their dependency constraints to adopt is also a poor user experience.

Shipping a breaking change against the users wishes - per their dependency constraints - requires the user to discover (potentially really subtle) bugs (such as a missing header) on their own, which is a much worse user experience.

Having to change your dependency constraint to adopt is SEMVER and Composer package constraints working as intended - it may not be a wonderful experience, but it works that way for good reasons.

</rant>

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants