Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Using CURLOPT_CURLU with CURLOPT_PORT overwrites port from url... #3753

Closed
ptlomholt opened this issue Apr 9, 2019 · 6 comments

Comments

@ptlomholt
Copy link

commented Apr 9, 2019

I'm using the new CURLOPT_CURLU feature to take advantage of curl's 'accumulative', relative url capabilities, but it appears that when also using CURLOPT_PORT, curl overwrites the port portion of the url in 'my' CURLU handle set by CURLOPT_CURLU. If I later use CURLOPT_PORT passing 0L to go back to the original port, curl instead remembers the port set by the previous CURLOPT_PORT call!

This behavior is different from using the classic 'CURLOPT_URL' way of passing the url to curl, where it correctly reverts back to the port indicated in the orginal url.

I figure the problem has to do with the way curl treats its internal CURLU handle as a 'scratch' area which is fine when used with a CURLOPT_URL url, which, I assume, resets the CURLU handle before every new transaction, but with CURLOPT_CURLU it clashes with 'my' CURLU handle, which I expect curl should leave alone and treat as read-only.

The following code illustrates the problem :

#include "curl/curl.h"
#include <stdio.h>

static char error_buffer[CURL_ERROR_SIZE] = "";

int main(void)
{
    const char *url = "http://httpbin.org/get";
    char *url_after;
    CURLU *curlu = curl_url();
    CURL *curl = curl_easy_init();
    CURLcode curl_code;

    printf("using curl version %s\n", curl_version_info(CURLVERSION_NOW)->version);

    curl_url_set(curlu, CURLUPART_URL, url, CURLU_DEFAULT_SCHEME);
    curl_easy_setopt(curl, CURLOPT_CURLU, curlu);
    curl_easy_setopt(curl, CURLOPT_ERRORBUFFER, error_buffer);
    curl_easy_setopt(curl, CURLOPT_VERBOSE, 1L);
    curl_easy_setopt(curl, CURLOPT_PORT, 1234L); // should not work, that's ok
    curl_code = curl_easy_perform(curl);
    printf("failure expected, curl_easy_perform returned %ld: <%s>, <%s>\n",
                                   (long) curl_code, curl_easy_strerror(curl_code), error_buffer);

    // print url
    curl_url_get(curlu, CURLUPART_URL, &url_after, 0);
    printf("curlu now: <%s>\n", url_after);
    curl_free(url_after);

    // now reset port to go back to port 80 (not!)
    curl_easy_setopt(curl, CURLOPT_PORT, 0L);

    curl_code = curl_easy_perform(curl);
    printf("success expected, curl_easy_perform returned %ld: <%s>, <%s>\n", 
                                   (long) curl_code, curl_easy_strerror(curl_code), error_buffer);

    // print url
    curl_url_get(curlu, CURLUPART_URL, &url_after, 0);
    printf("curlu now: <%s>\n", url_after);
    curl_free(url_after);

    curl_easy_cleanup(curl);
    curl_url_cleanup(curlu);

    return 0;
}

It generates the following output, note how it appears stuck on port 1234:

using curl version 7.65.0-DEV
*   Trying 52.71.234.219...
* TCP_NODELAY set
* connect to 52.71.234.219 port 1234 failed: Connection refused
*   Trying 3.85.154.144...
* TCP_NODELAY set
* connect to 3.85.154.144 port 1234 failed: Connection refused
* Failed to connect to httpbin.org port 1234: Connection refused
* Closing connection 0
failure expected, curl_easy_perform returned 7: <Couldn't connect to
server>, <Failed to connect to httpbin.org port 1234: Connection refused>
curlu now: <http://httpbin.org:1234/get>
* Hostname httpbin.org was found in DNS cache
*   Trying 52.71.234.219...
* TCP_NODELAY set
* connect to 52.71.234.219 port 1234 failed: Connection refused
*   Trying 3.85.154.144...
* TCP_NODELAY set
* connect to 3.85.154.144 port 1234 failed: Connection refused
* Failed to connect to httpbin.org port 1234: Connection refused
* Closing connection 1
success expected, curl_easy_perform returned 7: <Couldn't connect to
server>, <Failed to connect to httpbin.org port 1234: Connection refused>
curlu now: <http://httpbin.org:1234/get>

curl/libcurl version

7.64.1 and 7.65.0-DEV (github master (as of April 8th 2019))

operating system

CentOS 7, but probably not OS specific

@jay jay added the URL label Apr 9, 2019
@jay

This comment has been minimized.

Copy link
Member

commented Apr 9, 2019

I can reproduce that here. CURLOPT_CURLU doc says "libcurl will use this handle and its contents read-only and will not change its contents."

curl/lib/url.c

Lines 3009 to 3025 in 521bbbe

static CURLcode parse_remote_port(struct Curl_easy *data,
struct connectdata *conn)
{
if(data->set.use_port && data->state.allow_port) {
/* if set, we use this instead of the port possibly given in the URL */
char portbuf[16];
CURLUcode uc;
conn->remote_port = (unsigned short)data->set.use_port;
msnprintf(portbuf, sizeof(portbuf), "%d", conn->remote_port);
uc = curl_url_set(data->state.uh, CURLUPART_PORT, portbuf, 0);
if(uc)
return CURLE_OUT_OF_MEMORY;
}
return CURLE_OK;
}

and in curl_url_set

curl/lib/urlapi.c

Lines 1363 to 1364 in 521bbbe

if(port)
u->portnum = port;

Stack
libcurld.dll!curl_url_set(Curl_URL * u, CURLUPart what, const char * part, unsigned int flags)  Line 1363	C
libcurld.dll!parse_remote_port(Curl_easy * data, connectdata * conn)  Line 2846 + 0x17 bytes	C
libcurld.dll!create_conn(Curl_easy * data, connectdata * * in_connect, bool * async)  Line 3549 + 0xd bytes	C
libcurld.dll!Curl_connect(Curl_easy * data, bool * asyncp, bool * protocol_done)  Line 4023 + 0x11 bytes	C
libcurld.dll!multi_runsingle(Curl_multi * multi, curltime now, Curl_easy * data)  Line 1368 + 0x11 bytes	C
libcurld.dll!curl_multi_perform(Curl_multi * multi, int * running_handles)  Line 2072 + 0x29 bytes	C
libcurld.dll!easy_transfer(Curl_multi * multi)  Line 625 + 0xd bytes	C
libcurld.dll!easy_perform(Curl_easy * data, bool events)  Line 719 + 0x18 bytes	C
libcurld.dll!curl_easy_perform(Curl_easy * data)  Line 738 + 0xb bytes	C

Perhaps we should copy it internally before each transfer to avoid potential problems like this.

Also as a separate issue shouldn't curl_url_set(uh, CURLUPART_PORT, "0", 0); set the port to 0?

@bagder

This comment has been minimized.

Copy link
Member

commented Apr 9, 2019

Perhaps we should copy it internally before each transfer to avoid potential problems like this.

Probably, yes. We should start with writing up a test case that reproduces this, to avoid future regressions. Then fix it. Possible we should "copy-on-write" to avoid a lot of extra copies when no "write" is necessary, but perhaps that's just complicating matters further...

Also as a separate issue shouldn't curl_url_set(uh, CURLUPART_PORT, "0", 0); set the port to 0?

Perhaps. Port zero is hardly ever actually used as a "real" port anyway so I'm not sure exactly what a user would expect such an invocation to do...

@jzakrzewski

This comment has been minimized.

Copy link
Contributor

commented Apr 9, 2019

Perhaps. Port zero is hardly ever actually used as a "real" port anyway so I'm not sure exactly what a user would expect such an invocation to do...

Possibilities:

  • 0
  • default / not set
  • random (this is what the kernel does when asked about port 0)

I lean towards "0" but the second option is quite OK too, provided that it's documented.

@bagder bagder self-assigned this Apr 11, 2019
bagder added a commit that referenced this issue Apr 11, 2019
Since a few code paths actually update that data.

Fixes #3753

Reported-by: Poul T Lomholt
@bagder

This comment has been minimized.

Copy link
Member

commented Apr 11, 2019

The kernel doesn't treat zero as random, but several APIs provided for applications do. Port number zero is actually not special-cased in TCP so it can in theory be used, but due to how many APIs are done it is next to impossible.

I'm in favor of sticking to the current behavior for this: setting CURLUPART_PORT to "0" will return an error. I'll document this and make sure it is also verified in test 1560.

bagder added a commit that referenced this issue Apr 11, 2019
Only allow well formed decimal numbers in the input.

Document that the number MUST be between 1 and 65535.

Add tests to test 1560 to verify the above.

Ref: #3753
@jzakrzewski

This comment has been minimized.

Copy link
Contributor

commented Apr 11, 2019

TIL.
And after reading your blog post I see nothing against documenting it as an error. The only downside is not being able to use port 0. Given it's weird status it's probably not going to hurt anyone that much. ;)

@bagder bagder closed this in 9a4ad1b Apr 11, 2019
@jay

This comment has been minimized.

Copy link
Member

commented Apr 11, 2019

Perhaps. Port zero is hardly ever actually used as a "real" port anyway so I'm not sure exactly what a user would expect such an invocation to do...

I meant it should set the port to 0 internally (ie clear the port) because that seems like a bug. Setting it to actual port 0 (eg 1.2.3.4:0) may be technically correct. Anyway I'm having second thoughts. I will follow up with some comments in #3762.

bagder added a commit that referenced this issue Apr 12, 2019
Only allow well formed decimal numbers in the input.

Document that the number MUST be between 1 and 65535.

Add tests to test 1560 to verify the above.

Ref: #3753
bagder added a commit that referenced this issue Apr 13, 2019
Only allow well formed decimal numbers in the input.

Document that the number MUST be between 1 and 65535.

Add tests to test 1560 to verify the above.

Ref: #3753
Closes #3762
@lock lock bot locked as resolved and limited conversation to collaborators Jul 10, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
4 participants
You can’t perform that action at this time.