Skip to content

Extend CURLFile to support streams #4034

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 4 commits into from
Closed

Conversation

cmb69
Copy link
Member

@cmb69 cmb69 commented Apr 16, 2019

Due to former restrictions of the libcurl API, curl multipart/formdata
file uploads supported only proper files. However, as of curl 7.56.0
the new curl_mime_*() API is available (and already supported by
PHP[1]), which allows us to support arbitrary seekable streams, which
is generally desirable, and particularly resolves issues with the
transparent Unicode and long part support on Windows (see bug #77711).

Note that older curl versions are still supported, but CURLFile is
still restricted to proper files in this case.

[1] http://git.php.net/?p=php-src.git;a=commit;h=a83b68ba56714bfa06737a61af795460caa4a105

@cmb69
Copy link
Member Author

cmb69 commented Apr 16, 2019

This is a follow-up to PR #4032. /cc @weltling, @adoy.

PS: see also curl/curl#3675 (comment)

if (php_stream_stat(stream, &ssb)) {
zend_string_release_ex(string_key, 0);
return FAILURE;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be worth to support non seekable stream, like URL, which is a completely legit scenario.

Thanks.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That may not always work (see https://curl.haxx.se/libcurl/c/curl_mime_data_cb.html), but we still could fail in this case.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Passing -1 for data length should be ok in general and would spare the stat call. Not requiring stream to be seekable and setting the seek function to NULL might be still a fallback. Knowing the stream size ahead is good, but doesn't guarantee we still don't end up with a partial data as it's stream dependent.

Thanks.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The curl docs are not particularly clear regarding the datasize, but it seems passing -1 works.

Instead of passing NULL as seekfunc I'd suggest to have a callback which tries to seek, and returns CURL_SEEKFUNC_CANTSEEK otherwise. which indicates that "libcurl is free to work around the problem if possible".

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, could be probably an option, too. Thanks.

@nikic
Copy link
Member

nikic commented Apr 16, 2019

Can you please also add a test where a non-file stream is used?

@cmb69
Copy link
Member Author

cmb69 commented Apr 16, 2019

Non-statable and possibly non-seekable streams support added, as well as a test for data://

@cmb69
Copy link
Member Author

cmb69 commented Apr 23, 2019

If there are no objections, I'm going to merge this in a few days.

@nikic
Copy link
Member

nikic commented Apr 23, 2019

Care must be taken if the part is bound to a curl easy handle that is later duplicated: the arg pointer argument is also duplicated, resulting in the pointed item to be shared between the original and the copied handle. In particular, special attention should be given to the freefunc procedure code since it will be called twice with the same argument.

Do we need to do anything about this (as we expose curl_easy_duphandle)?

@cmb69
Copy link
Member Author

cmb69 commented Apr 23, 2019

@nikic Indeed, thanks! The best idea I came up with so far would be to not register a freefunc, but rather to convert the streams to zvals and to register them with the php_curl struct, and to increase the refcount when curl_copy_handle() is called, and to release them when the php_curl is destroyed. Alternative suggestions are welcome.

cmb69 added 4 commits April 26, 2019 18:42
Due to former restrictions of the libcurl API, curl multipart/formdata
file uploads supported only proper files.  However, as of curl 7.56.0
the new `curl_mime_*()` API is available (and already supported by
PHP[1]), which allows us to support arbitrary *seekable* streams, which
is generally desirable, and particularly resolves issues with the
transparent Unicode and long part support on Windows (see bug #77711).

Note that older curl versions are still supported, but CURLFile is
still restricted to proper files in this case.

[1] <http://git.php.net/?p=php-src.git;a=commit;h=a83b68ba56714bfa06737a61af795460caa4a105>
We pass `-1` as `datasize` to avoid the `stat` call, and let curl
handle non-seekable streams.
If the CURL handle is duplicated, the `freefunc` is called multiple
times, so we must take care not to close the stream prematurely.  Thus,
instead of passing a `freefunc` to `curl_mime_data_cb`, we add the
streams to a dedicated `zend_llist` which we release when the last
duplicated CURL handle is destroyed.
@cmb69 cmb69 changed the base branch from master to PHP-7.4 April 26, 2019 16:45
@cmb69
Copy link
Member Author

cmb69 commented Apr 27, 2019

I have addressed the duplication of handles issue, and retargeted to PHP-7.4. CI is green.

If there are no objections, I'll merge this on Monday.

@cmb69
Copy link
Member Author

cmb69 commented Apr 29, 2019

Applied as c68dc6b. Thanks!

@cmb69 cmb69 closed this Apr 29, 2019
@cmb69 cmb69 deleted the curl-streams branch April 29, 2019 08:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants