This repository has been archived by the owner on Jan 29, 2020. It is now read-only.

Emitter performance #238

Closed
mindplay-dk opened this issue Mar 26, 2017 · 3 comments

Comments

@mindplay-dk

I'm about to start using Diactoros to process many image requests (profile photos that are only accessible to members) and have some concerns regarding performance.

I generally try to avoid dispatching PHP when delivering high volumes of assets, e.g. using a caching strategy where PHP is invoked only the first time an image is requested - for subsequent requests, the physical asset with the requested URL will be in place, and the server will serve it up without hitting PHP.

In this particular case, the images have to be protected from public access, so there's no way around hitting PHP for every image.

First, a question: how come there are two emitter implementations? Why do we have SapiEmitter as well as SapiStreamEmitter? Are there performance differences? I don't see a benchmark in the package, so it's hard to tell.

I can tell at least that SapiStreamEmitter appears to support partial (resume) requests. Another difference appears to be that SapiEmitter echoes the entire body directly to the output buffer, whereas SapiStreamEmitter echoes the body in chunks. The manual currently states that only "A single implementation is currently available", so not much help there, and there are no class-level doc-blocks describing the purpose of each implementation.

Isn't it generally more efficient to fopen('php://output') and then stream_copy_to_stream() from one stream to the other, avoiding output buffer overhead etc.?

Is there any use-case for an emitter that supports output buffering at all? I mean, the whole purpose of PSR-7 is to model the response, such that there is (theoretically) no missing information still needing to be computed by the time we've populated the response and are ready to emit, so when is output buffering useful? Why isn't it being suppressed by the emitter?

And finally, what do you think about adding support for X-Sendfile under Apache and X-Accel under NGINX? Is it something emitters could cover, or is it out of scope?

I know that's a lot of questions, but I'm trying to assess whether the simple strategy of emitting protected assets is (or can be made) feasible in the first place, if I can help in any way, or if I need to think about alternate strategies.

@Ocramius
Member

Ocramius commented Mar 29, 2017

I can't speak to the different emitter implementations, but on this particular bit:

> And finally, what do you think about adding support for X-Sendfile under Apache and X-Accel under NGINX? Is it something emitters could cover, or is it out of scope?

This is a good use-case scenario for a middleware, in my opinion (strip body completely, replace with empty body, make sure header is well-formed, yadda yadda).

@weierophinney
Member

> First, a question: how come there are two emitter implementations? Why do we have SapiEmitter as well as SapiStreamEmitter? Are there performance differences? I don't see a benchmark in the package, so it's hard to tell.

There are. As you noted, SapiEmitter relies on the string casting of Zend\Diactoros\Stream, which in turn uses stream_get_contents() to emit the full stream contents in one go. The primary issue with this approach is that the entire body is buffered in memory before being flushed to the output buffer (which is invoked via echo), and that can cause issues in memory-restricted environments.

SapiStreamEmitter uses the StreamInterface API, looping while ! $stream->eof() and issuing multiple read() statements. This approach helps in memory-restricted environments, or when serving large files, by buffering one read at a time to the output buffer. The issue with the approach, however, is that there are more method calls, which leads to some performance overhead.
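The trade-off between the two strategies can be sketched with plain PHP stream resources. This is only an illustration, not the Diactoros implementation: the function names `emitWhole()` and `emitChunked()`, the 4096-byte chunk size, and the 100,000-byte payload are all assumptions made for the example.

```php
<?php

/**
 * Emit everything at once: the full body is held in one PHP string
 * before it is echoed (similar in spirit to string-casting a stream).
 */
function emitWhole($resource): void
{
    rewind($resource);
    echo stream_get_contents($resource);
}

/**
 * Emit in chunks: at most $chunkSize bytes are held per iteration,
 * at the cost of more function calls.
 */
function emitChunked($resource, int $chunkSize = 8192): void
{
    rewind($resource);
    while (!feof($resource)) {
        $chunk = fread($resource, $chunkSize);
        if ($chunk === false || $chunk === '') {
            break; // nothing more to read
        }
        echo $chunk;
    }
}

// Hypothetical payload standing in for a response body.
$body = fopen('php://temp', 'w+b');
fwrite($body, str_repeat('x', 100000));

// Capture the output of each strategy so the two can be compared.
ob_start();
emitWhole($body);
$whole = ob_get_clean();

ob_start();
emitChunked($body, 4096);
$chunked = ob_get_clean();

printf("whole=%d bytes, chunked=%d bytes\n", strlen($whole), strlen($chunked));
```

Both functions produce the same bytes. They differ only in how much of the body is held in PHP memory at any one time.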

Honestly, you'll likely need to benchmark in your own environment with typical payloads to see which is a better fit for your needs.

> Isn't it generally more efficient to fopen('php://output') and then stream_copy_to_stream() from one stream to the other, avoiding output buffer overhead etc.?

Yes, IF the underlying StreamInterface implementation is an actual PHP stream resource. If it isn't, that approach will not work. Since we cannot know for certain what the underlying implementation supports, we have to use the StreamInterface API.
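As a rough sketch of that distinction: a fast path is only possible when the body is a native PHP stream resource, and anything else has to go through eof()/read(). The `StringBody` class and `emitBody()` function below are hypothetical and exist only for this example. They are not part of Diactoros or PSR-7.

```php
<?php

/** Hypothetical body that is NOT backed by a PHP stream resource. */
final class StringBody
{
    private $data;
    private $pos = 0;

    public function __construct(string $data)
    {
        $this->data = $data;
    }

    public function eof(): bool
    {
        return $this->pos >= strlen($this->data);
    }

    public function read(int $length): string
    {
        $chunk = substr($this->data, $this->pos, $length);
        $this->pos += strlen($chunk);
        return $chunk;
    }
}

/** Emit either a native stream resource or any object with eof()/read(). */
function emitBody($body, int $chunkSize = 8192): void
{
    if (is_resource($body)) {
        // Fast path: copy stream to stream, no userland loop.
        $out = fopen('php://output', 'wb');
        stream_copy_to_stream($body, $out);
        fclose($out);
        return;
    }

    // Generic path: works for any implementation, resource-backed or not.
    while (!$body->eof()) {
        echo $body->read($chunkSize);
    }
}

// Usage: both paths emit the same bytes.
$resource = fopen('php://temp', 'w+b');
fwrite($resource, 'abcdef');
rewind($resource);

ob_start();
emitBody($resource);
$fromResource = ob_get_clean();

ob_start();
emitBody(new StringBody('abcdef'), 4);
$fromObject = ob_get_clean();
```

Note that writes to php://output still pass through PHP's output buffering mechanism, just as echo does. That is why ob_start() can capture both paths here.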

> Is there any use-case for an emitter that supports output buffering at all? I mean, the whole purpose of PSR-7 is to model the response, such that there is (theoretically) no missing information still needing to be computed by the time we've populated the response and are ready to emit, so when is output buffering useful? Why isn't it being suppressed by the emitter?

The emitter is called by some process in your application; it doesn't wrap execution (at least, that was never the intent). In the case of Expressive, for instance, Zend\Expressive\Application, if it composes an emitter, will call it from run() once it has a response, passing the response to it. Since the emitter only acts on a response, output buffering is really something for the developer to manage within their front controller (e.g., public/index.php).

The reason it interacts with the output buffer is... well, we have to get the content back to the client somehow, and the way PHP does that is through the output buffer. If you're working in an async environment, I assume you'd need a custom emitter that is capable of writing back on the socket in which the request was transmitted.

> And finally, what do you think about adding support for ...

As @Ocramius has noted, I think those particular items would make great middleware.
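A minimal sketch of what the core of such a middleware might do, assuming the response is modelled as a plain array rather than a PSR-7 object: set the offload header, keep the content type, and send an empty body so the web server delivers the file. The function name `offloadResponse()` and the paths are hypothetical. `X-Accel-Redirect` (NGINX) and `X-Sendfile` (Apache mod_xsendfile) are the actual header names, but the server must be configured to honor them.

```php
<?php

/**
 * Build a response that delegates file delivery to the web server.
 *
 * @param string $server      'nginx' or 'apache'
 * @param string $path        internal location (NGINX) or file path (Apache)
 * @param string $contentType MIME type of the protected asset
 */
function offloadResponse(string $server, string $path, string $contentType): array
{
    $headerNames = [
        'nginx'  => 'X-Accel-Redirect',
        'apache' => 'X-Sendfile',
    ];

    if (!isset($headerNames[$server])) {
        throw new InvalidArgumentException("Unsupported server: $server");
    }

    return [
        'status'  => 200,
        'headers' => [
            'Content-Type'         => $contentType,
            $headerNames[$server]  => $path,
        ],
        'body'    => '', // body stripped: the web server streams the file itself
    ];
}

// Usage with hypothetical paths; access checks would run before this point.
$nginx  = offloadResponse('nginx', '/protected/photos/42.jpg', 'image/jpeg');
$apache = offloadResponse('apache', '/var/data/photos/42.jpg', 'image/jpeg');
```

PHP still runs for the access check, but the file bytes themselves never pass through it.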

@mindplay-dk
Author

Okay, so I benchmarked this:

<?php

header("Content-Type: image/jpeg");

while (ob_get_level() > 0) {
    ob_end_clean();
}

fpassthru(fopen(__DIR__ . "/image.jpg", "rb"));

Versus this:

<?php

use Zend\Diactoros\Response\SapiStreamEmitter;
use Zend\Diactoros\Stream;

require __DIR__ . '/vendor/autoload.php';

$body = new Stream(__DIR__ . '/image.jpg');

$response = new Zend\Diactoros\Response($body);

$response = $response
    ->withHeader("Content-Type", "image/jpeg");

$emitter = new SapiStreamEmitter();

$emitter->emit($response);

Versus the same image.jpg (a ~240KB image) served directly by NGINX without hitting PHP, which takes ~20 msec.

Results look, well, not so good for PHP, which adds around 100% overhead: ~40 msec for fpassthru(). That was the fastest of a few different methods I tried, including readfile() and stream_copy_to_stream(), not that there is much difference between any of them. Disabling the output buffers also helps, but only by a very small amount.
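For anyone wanting to compare the three functions themselves, a rough in-process timing sketch could look like the following. It is not the benchmark used for the figures in this comment: it measures only the time spent inside PHP, not full request latency, and the temp file is a hypothetical stand-in for image.jpg.

```php
<?php

// Hypothetical stand-in for image.jpg: ~240 KB of data in a temp file.
$path = tempnam(sys_get_temp_dir(), 'img');
file_put_contents($path, str_repeat("\xFF", 240 * 1024));

$methods = [
    'fpassthru' => function (string $path): void {
        $fh = fopen($path, 'rb');
        fpassthru($fh);
        fclose($fh);
    },
    'readfile' => function (string $path): void {
        readfile($path);
    },
    'stream_copy_to_stream' => function (string $path): void {
        $in  = fopen($path, 'rb');
        $out = fopen('php://output', 'wb');
        stream_copy_to_stream($in, $out);
        fclose($in);
        fclose($out);
    },
];

$results = [];
foreach ($methods as $name => $emit) {
    ob_start();                         // capture output so it can be counted
    $start = microtime(true);
    $emit($path);
    $elapsedMs = (microtime(true) - $start) * 1000;
    $bytes = strlen(ob_get_clean());
    $results[$name] = ['ms' => $elapsedMs, 'bytes' => $bytes];
}

unlink($path);

foreach ($results as $name => $r) {
    printf("%-22s %8.3f ms  %d bytes\n", $name, $r['ms'], $r['bytes']);
}
```

A single run like this is noisy. Repeating each method many times, and measuring over HTTP with a load-testing tool, would give numbers closer to what a client actually sees.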

Results look good for Diactoros, though - it adds only ~0.25% overhead per request, for a total of about 41 msec.

So the performance of Diactoros looks good! So I'm closing this issue.

I still plan on benchmarking against appserver.io and php-pm for comparison - let me know if you'd like me to post the results?
