Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HTTP client doesn't handle responses with Transfer-Encoding: chunked #104

Open
MayPaw opened this issue Jun 4, 2024 · 6 comments
Open
Assignees
Labels
bug Something isn't working enhancement New feature or request HTTP Client

Comments

@MayPaw
Copy link
Collaborator

MayPaw commented Jun 4, 2024

In the process of working on #82 (E2E tests) with @reimic we found an issue with the way how the HTTP client handles chunked streams.

Currently, the response is written with the chunk markers, as if they were normal characters. This corrupts the wp-cli.phar. Handling Transfer-Encoding: chunked is expected behaviour of HTTP/1.1 protocol.

At the moment we have two ideas how to handle this issue:

  1. Change the used HTTP protocol version to HTTP/1.0, which by default doesn't support chunks.
    Currently the function stream_http_prepare_request_bytes in async_http_streams.php prepares a request in 1.1:
$request = <<<REQUEST
GET $path HTTP/1.1 // We could change this to: GET $path HTTP/1.0
[...]
REQUEST;

We've tested this approach and it works. The downside to this would be probable performance loss, since HTTP/1.1 is supposedly more efficient. Yet neither I nor @reimic are capable of assessing the influence of such change.

  1. Alternatively, the streams could be filtered with the PHP dechunk filter when performing read or write operations. We have also tested that this can work. Yet this is somewhat problematic:
    We tried appending the filter after the stream creation, in function streams_http_open_nonblocking in async_http_streams.php and several other places. Unfortunately, due to usage of @stream_select in continuous reading and writing the response, appending stream filter while performing those operations is impossible (reference: PHP8: Uncaught ValueError: No stream arrays wered pased in StreamSelectLoop:: reactphp/event-loop#220).
    The first place, where it worked as expected, was at the very last possible moment, i.e. while copying the already closed stream in run method of WriteFileStepRunner.php.
public function run(
   	$input,
   	$progress = null
   ) { 
        [...]
   	$fp2 = fopen( $path, 'w' );
        [...]
        stream_filter_append($fp2,"dechunk",STREAM_FILTER_WRITE); // dechunking the stream
   	stream_copy_to_stream( $this->getResource( $input->data ), $fp2 );
   	fclose( $fp2 );
}

This doesn't seem like the right place to do this though, as dechunking should probably be done when handling the response in the client.

@adamziel what do you think?

@reimic reimic added bug Something isn't working blocker HTTP Client labels Jun 4, 2024
@adamziel
Copy link
Collaborator

adamziel commented Jun 5, 2024

Great issue! Let's use HTTP 1 until we have chunked encoding. Chunked encoding would have to be likely implemented here

function streams_http_response_await_bytes( $streams, $length, $timeout_microseconds = 5000000 ) {
	$read = $streams;
	if ( count( $read ) === 0 ) {
		return false;
	}
	$write  = array();
	$except = null;
	// phpcs:disable WordPress.PHP.NoSilencedErrors.Discouraged
	$ready = @stream_select( $read, $write, $except, 0, $timeout_microseconds );
	if ( $ready === false ) {
		throw new Exception( 'Could not retrieve response bytes: ' . error_get_last()['message'] );
	} elseif ( $ready <= 0 ) {
		throw new Exception( 'stream_select timed out' );
	}

	$chunks = array();
	foreach ( $read as $k => $stream ) {
		if ( PHP_VERSION_ID <= 71999 ) {
			// In PHP <= 7.1, stream_select doesn't preserve the keys of the array
			$k = array_search( $stream, $streams, true );
		}
		$chunks[ $k ] = fread( $stream, $length );
	}

	return $chunks;
}

Note we might also need to provide relevant response headers to that function (or maybe they're already available via stream context?)

@adamziel
Copy link
Collaborator

adamziel commented Jun 5, 2024

HTTP 1.0 is fine, but if you're up for a challenge and want to implement chunked encoding, here's how:

In that function above, you'd read the chunk size (is it two bytes?), associate it with the stream somehow (stream context?), and then buffer the next chunk size bytes to $chunks. You'd have to keep track of the last chunk size and bytes read for each stream because streams_http_response_await_bytes might stop in the middle of a chunk and resume later. Then you'd rinse and repeat until the end of the stream.

@reimic
Copy link
Collaborator

reimic commented Jun 5, 2024

Maybe we can have both? I'd recommend going the HTTP/1.0 route for now.

There is also a small headers issue, where the content-length might not be there and cause a Undefined array key error. That's quite easy to fix. So HTTP -> 1.0 and headers in the first PR.

Doing this first will enable me to finish the long-awaited E2E PR. For @MayPaw - with running tests, you'll have an easier job testing the custom dechunker implementation. And once it is ready in can enhance the client in a separate PR.

@MayPaw
Copy link
Collaborator Author

MayPaw commented Jun 6, 2024

I agree. We can go with HTTP/1.0 first, then @reimic can continue working on tests and I'll work on the dechunking mechanism.

@adamziel
Copy link
Collaborator

adamziel commented Jun 6, 2024

Sounds great, thank you!

MayPaw added a commit to MayPaw/blueprints-library that referenced this issue Jun 8, 2024
…-ending's replacement, add null check for 'content-length' header (WordPress#104)
@reimic
Copy link
Collaborator

reimic commented Jun 16, 2024

With #106 merged this is no longer a blocker, but an enhancement issue.

@reimic reimic added enhancement New feature or request and removed blocker labels Jun 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working enhancement New feature or request HTTP Client
Projects
None yet
Development

No branches or pull requests

3 participants