Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[HttpClient] Support file uploads by nesting resource streams in body option #49911

Merged
merged 1 commit into from Apr 20, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
1 change: 1 addition & 0 deletions src/Symfony/Component/HttpClient/CHANGELOG.md
Expand Up @@ -8,6 +8,7 @@ CHANGELOG
* Add `ServerSentEvent::getArrayData()` to get the Server-Sent Event's data decoded as an array when it's a JSON payload
* Allow array of urls as `base_uri` option value in `RetryableHttpClient` to retry on a new url each time
* Add `JsonMockResponse`, a `MockResponse` shortcut that automatically encodes the passed body to JSON and sets the content type to `application/json` by default
* Support file uploads by nesting resource streams in option "body"

6.2
---
Expand Down
127 changes: 117 additions & 10 deletions src/Symfony/Component/HttpClient/HttpClientTrait.php
Expand Up @@ -13,6 +13,9 @@

use Symfony\Component\HttpClient\Exception\InvalidArgumentException;
use Symfony\Component\HttpClient\Exception\TransportException;
use Symfony\Component\HttpClient\Response\StreamableInterface;
use Symfony\Component\HttpClient\Response\StreamWrapper;
use Symfony\Component\Mime\MimeTypes;

/**
* Provides the common logic from writing HttpClientInterface implementations.
Expand Down Expand Up @@ -94,11 +97,7 @@ private static function prepareRequest(?string $method, ?string $url, array $opt
}

if (isset($options['body'])) {
if (\is_array($options['body']) && (!isset($options['normalized_headers']['content-type'][0]) || !str_contains($options['normalized_headers']['content-type'][0], 'application/x-www-form-urlencoded'))) {
$options['normalized_headers']['content-type'] = ['Content-Type: application/x-www-form-urlencoded'];
}

$options['body'] = self::normalizeBody($options['body']);
$options['body'] = self::normalizeBody($options['body'], $options['normalized_headers']);

if (\is_string($options['body'])
&& (string) \strlen($options['body']) !== substr($h = $options['normalized_headers']['content-length'][0] ?? '', 16)
Expand Down Expand Up @@ -313,21 +312,129 @@ private static function normalizeHeaders(array $headers): array
*
* @throws InvalidArgumentException When an invalid body is passed
*/
private static function normalizeBody($body)
private static function normalizeBody($body, array &$normalizedHeaders = [])
{
if (\is_array($body)) {
array_walk_recursive($body, $caster = static function (&$v) use (&$caster) {
if (\is_object($v)) {
static $cookie;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this a static variable instead of a local variable ? Do we really need a hidden global state here ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need this to generate a random boundary without consuming too much entropy from the system.
We don't care about the hidden state because this is just a random string that is updated after use.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not against using a variable. But it could be a normal one, so that each call to prepareBody is independant.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or you should not reuse that cookie when generating the multipart boundary that is visible to the outside.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Each call is already independent from the previous one thanks to the randomization provided by the xxh128 hashing.
I don't want to call random_bytes all the time.


$streams = [];
array_walk_recursive($body, $caster = static function (&$v) use (&$caster, &$streams, &$cookie) {
if (\is_resource($v) || $v instanceof StreamableInterface) {
$cookie = hash('xxh128', $cookie ??= random_bytes(8), true);
$k = substr(strtr(base64_encode($cookie), '+/', '-_'), 0, -2);
$streams[$k] = $v instanceof StreamableInterface ? $v->toStream(false) : $v;
$v = $k;
} elseif (\is_object($v)) {
if ($vars = get_object_vars($v)) {
array_walk_recursive($vars, $caster);
$v = $vars;
} elseif (method_exists($v, '__toString')) {
} elseif ($v instanceof \Stringable) {
$v = (string) $v;
}
}
});

return http_build_query($body, '', '&');
$body = http_build_query($body, '', '&');

if ('' === $body || !$streams && !str_contains($normalizedHeaders['content-type'][0] ?? '', 'multipart/form-data')) {
if (!str_contains($normalizedHeaders['content-type'][0] ?? '', 'application/x-www-form-urlencoded')) {
$normalizedHeaders['content-type'] = ['Content-Type: application/x-www-form-urlencoded'];
}

return $body;
}

if (preg_match('{multipart/form-data; boundary=(?|"([^"\r\n]++)"|([-!#$%&\'*+.^_`|~_A-Za-z0-9]++))}', $normalizedHeaders['content-type'][0] ?? '', $boundary)) {
$boundary = $boundary[1];
} else {
$boundary = substr(strtr(base64_encode($cookie ??= random_bytes(8)), '+/', '-_'), 0, -2);
$normalizedHeaders['content-type'] = ['Content-Type: multipart/form-data; boundary='.$boundary];
}

$body = explode('&', $body);
$contentLength = 0;

foreach ($body as $i => $part) {
[$k, $v] = explode('=', $part, 2);
$part = ($i ? "\r\n" : '')."--{$boundary}\r\n";
$k = str_replace(['"', "\r", "\n"], ['%22', '%0D', '%0A'], urldecode($k)); // see WHATWG HTML living standard

if (!isset($streams[$v])) {
$part .= "Content-Disposition: form-data; name=\"{$k}\"\r\n\r\n".urldecode($v);
$contentLength += 0 <= $contentLength ? \strlen($part) : 0;
$body[$i] = [$k, $part, null];
continue;
}
$v = $streams[$v];

if (!\is_array($m = @stream_get_meta_data($v))) {
throw new TransportException(sprintf('Invalid "%s" resource found in body part "%s".', get_resource_type($v), $k));
}
if (feof($v)) {
throw new TransportException(sprintf('Uploaded stream ended for body part "%s".', $k));
}

$m += stream_context_get_options($v)['http'] ?? [];
$filename = basename($m['filename'] ?? $m['uri'] ?? 'unknown');
$filename = str_replace(['"', "\r", "\n"], ['%22', '%0D', '%0A'], $filename);
$contentType = $m['content_type'] ?? null;

if (($headers = $m['wrapper_data'] ?? []) instanceof StreamWrapper) {
$hasContentLength = false;
$headers = $headers->getResponse()->getInfo('response_headers');
} elseif ($hasContentLength = 0 < $h = fstat($v)['size'] ?? 0) {
$contentLength += 0 <= $contentLength ? $h : 0;
}

foreach (\is_array($headers) ? $headers : [] as $h) {
if (\is_string($h) && 0 === stripos($h, 'Content-Type: ')) {
$contentType ??= substr($h, 14);
} elseif (!$hasContentLength && \is_string($h) && 0 === stripos($h, 'Content-Length: ')) {
$hasContentLength = true;
$contentLength += 0 <= $contentLength ? substr($h, 16) : 0;
} elseif (\is_string($h) && 0 === stripos($h, 'Content-Encoding: ')) {
$contentLength = -1;
}
}

if (!$hasContentLength) {
$contentLength = -1;
}
if (null === $contentType && 'plainfile' === ($m['wrapper_type'] ?? null) && isset($m['uri'])) {
$mimeTypes = class_exists(MimeTypes::class) ? MimeTypes::getDefault() : false;
$contentType = $mimeTypes ? $mimeTypes->guessMimeType($m['uri']) : null;
}
$contentType ??= 'application/octet-stream';

$part .= "Content-Disposition: form-data; name=\"{$k}\"; filename=\"{$filename}\"\r\n";
$part .= "Content-Type: {$contentType}\r\n\r\n";

$contentLength += 0 <= $contentLength ? \strlen($part) : 0;
$body[$i] = [$k, $part, $v];
}

$body[++$i] = ['', "\r\n--{$boundary}--\r\n", null];

if (0 < $contentLength) {
$normalizedHeaders['content-length'] = ['Content-Length: '.($contentLength += \strlen($body[$i][1]))];
}

$body = static function ($size) use ($body) {
foreach ($body as $i => [$k, $part, $h]) {
unset($body[$i]);

yield $part;

while (null !== $h && !feof($h)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't we close the resource once we are done reading it ? Due to the async nature of the component, it would be very hard to keep the ownership of the resource in the caller code while avoiding to close it too early (it must not be closed until http-client is done with it).
Guzzle has a feature allowing to create a body from a stream, and it takes ownership of the resource for such reason.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer not. The current code allows the outer scope to rewind the stream if needed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but how do you handle closing those streams to avoid leaking them ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added unset($body[$i]); and $h = null at the end of the generator. Stream resources are auto-closed by PHP when released from memory.

if (false === $part = fread($h, $size)) {
throw new TransportException(sprintf('Error while reading uploaded stream for body part "%s".', $k));
}

yield $part;
}
}
$h = null;
};
}

if (\is_string($body)) {
Expand Down
89 changes: 89 additions & 0 deletions src/Symfony/Component/HttpClient/Tests/HttpClientTraitTest.php
Expand Up @@ -13,6 +13,7 @@

use PHPUnit\Framework\TestCase;
use Symfony\Component\HttpClient\Exception\InvalidArgumentException;
use Symfony\Component\HttpClient\HttpClient;
use Symfony\Component\HttpClient\HttpClientTrait;
use Symfony\Contracts\HttpClient\HttpClientInterface;

Expand Down Expand Up @@ -68,6 +69,94 @@ public function testPrepareRequestWithBodyIsArray()
$this->assertContains('Content-Type: application/x-www-form-urlencoded; charset=utf-8', $options['headers']);
}

public function testNormalizeBodyMultipart()
{
$file = fopen('php://memory', 'r+');
stream_context_set_option($file, ['http' => [
'filename' => 'test.txt',
'content_type' => 'text/plain',
]]);
fwrite($file, 'foobarbaz');
rewind($file);

$headers = [
'content-type' => ['Content-Type: multipart/form-data; boundary=ABCDEF'],
];
$body = [
'foo[]' => 'bar',
'bar' => [
$file,
],
];

$body = self::normalizeBody($body, $headers);

$result = '';
while ('' !== $data = $body(self::$CHUNK_SIZE)) {
$result .= $data;
}

$expected = <<<'EOF'
--ABCDEF
Content-Disposition: form-data; name="foo[]"

bar
--ABCDEF
Content-Disposition: form-data; name="bar[0]"; filename="test.txt"
Content-Type: text/plain

foobarbaz
--ABCDEF--

EOF;
$expected = str_replace("\n", "\r\n", $expected);

$this->assertSame($expected, $result);
}

/**
* @group network
*
* @dataProvider provideNormalizeBodyMultipartForwardStream
*/
public function testNormalizeBodyMultipartForwardStream($stream)
{
$body = [
'logo' => $stream,
];

$headers = [];
$body = self::normalizeBody($body, $headers);

$result = '';
while ('' !== $data = $body(self::$CHUNK_SIZE)) {
$result .= $data;
}

$this->assertSame(1, preg_match('/^Content-Type: multipart\/form-data; boundary=(?<boundary>.+)$/', $headers['content-type'][0], $matches));
$this->assertSame('Content-Length: 3086', $headers['content-length'][0]);
$this->assertSame(3086, \strlen($result));

$expected = <<<EOF
--{$matches['boundary']}
Content-Disposition: form-data; name="logo"; filename="1f44d.png"
Content-Type: image/png

%A
--{$matches['boundary']}--

EOF;
$expected = str_replace("\n", "\r\n", $expected);

$this->assertStringMatchesFormat($expected, $result);
}

public static function provideNormalizeBodyMultipartForwardStream()
{
yield 'native' => [fopen('https://github.githubassets.com/images/icons/emoji/unicode/1f44d.png', 'r')];
yield 'symfony' => [HttpClient::create()->request('GET', 'https://github.githubassets.com/images/icons/emoji/unicode/1f44d.png')->toStream()];
}

/**
* @dataProvider provideResolveUrl
*/
Expand Down