URL fragment not getting removed. #21745

@tiymat

I did this

/tmp/repro.c:

#include <stdio.h>
#include <curl/curl.h>

static const char URL[] = "file:///test?test#test";

int main(void) {
  CURLU *u = curl_url();
  char *out = NULL;
  CURLUcode rc;

  rc = curl_url_set(u, CURLUPART_URL, URL, 0);
  printf("Set url to \"%s\" with rc=%d\n", URL, rc);

  rc = curl_url_set(u, CURLUPART_URL, "", 0);
  printf("Set url to empty string with rc=%d\n", rc);

  rc = curl_url_get(u, CURLUPART_URL, &out, 0);
  printf("Got url \"%s\" with rc=%d\n", out, rc);

  curl_free(out);
  curl_url_cleanup(u);

  return 0;
}

Built and run with:

> export CURL_PATH="/path/to/curl/repo"
> gcc "-I$CURL_PATH/include" /tmp/repro.c "-L$CURL_PATH/lib/.libs" "-Wl,-rpath,$CURL_PATH/lib/.libs" -lcurl -o /tmp/repro && /tmp/repro
Set url to "file:///test?test#test" with rc=0
Set url to empty string with rc=0
Got url "file:///test?test#test" with rc=0

I expected the following

Per section 5.2.2 of RFC 3986, my understanding is that the output should be file:///test?test, with no fragment.

The only place the algorithm in 5.2.2 sets the fragment is the final step, T.fragment = R.fragment. In this case the reference R is "", which has no fragment, so the target T should not have one either.
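To illustrate that step, here is a minimal sketch of what section 5.2.2 yields when the reference is empty. With R = "", the scheme, authority and query of R are undefined and its path is empty, so T inherits scheme, authority, path and query from the base, while T.fragment is taken from R and is therefore undefined. `resolve_empty_reference` is a hypothetical helper written for this illustration, not part of libcurl, and it assumes the base is a well-formed absolute URI in which the first `#` starts the fragment.

```c
#include <string.h>

/* Hypothetical helper, NOT a libcurl function.
 *
 * Resolves an empty reference ("") against `base` following the outcome of
 * RFC 3986 section 5.2.2: everything up to the fragment is inherited from
 * the base, and the fragment comes from the (empty) reference, so it is
 * dropped.
 *
 * Assumes `base` is a well-formed absolute URI, so the first '#' is the
 * fragment delimiter.
 *
 * Returns 0 on success, -1 if `out` is too small to hold the result.
 */
int resolve_empty_reference(const char *base, char *out, size_t outlen)
{
  /* length of the base up to, but not including, the first '#' */
  size_t len = strcspn(base, "#");

  if(len >= outlen)
    return -1; /* no room for the result plus the terminating zero */

  memcpy(out, base, len);
  out[len] = '\0';
  return 0;
}
```

Called with the base file:///test?test#test and a sufficiently large buffer, this writes file:///test?test, which matches the expectation stated above.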

The cause is the following logic in set_url in lib/urlapi.c, where the fragment is never cleared in the if(!uc) branch:

  if(!part_size) {
    /* a blank URL is not a valid URL unless we already have a complete one
       and this is a redirect */
    uc = curl_url_get(u, CURLUPART_URL, &oldurl, flags);
    if(!uc) {
      /* success, meaning the "" is a fine relative URL, but nothing
         changes */
      curlx_free(oldurl);
      return CURLUE_OK;
    }

It looks like commit 1583945 introduced this. The test added in that commit uses the base https://example.com (i.e. a URL without a fragment), so it does not exercise this case.

This can be fixed by clearing the fragment in the if(!uc) branch before returning CURLUE_OK and updating the tests.


Found via this WPT URL test case:

{
  "input": "",
  "base": "file:///test?test#test",
  "href": "file:///test?test",
  "protocol": "file:",
  "username": "",
  "password": "",
  "host": "",
  "hostname": "",
  "port": "",
  "pathname": "/test",
  "search": "?test",
  "hash": ""
}

curl/libcurl version

230a986

operating system

7.0.9-arch1-1
