curl_url_get for CURLUPART_HOST returns percent decoded string #14942

Closed
muttalkadavul opened this issue Sep 17, 2024 · 3 comments

@muttalkadavul

muttalkadavul commented Sep 17, 2024

I did this

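#include <iostream>
#include <string>
#include <curl/curl.h>

int main() {
    std::string input = "http://resum%c3%a9.com";
    CURLUcode rc = CURLUE_BAD_HANDLE;
    CURLU *urlh = curl_url();
    if (!urlh)
    {
        std::cout << "curl_url() failed" << std::endl;
        return 1;
    }

    std::cout << "libcurl version" << curl_version() << std::endl;

    // Parse the full URL with default (zero) flags.
    rc = curl_url_set(urlh, CURLUPART_URL, input.c_str(), 0);
    if (rc != CURLUE_OK)
    {
        std::cout << "curlucode = " << rc << std::endl;
        curl_url_cleanup(urlh);
        return 1; // stop here: the handle must not be used after cleanup
    }

    // Extract the host component, again with default flags.
    char *host = nullptr;
    rc = curl_url_get(urlh, CURLUPART_HOST, &host, 0);
    if (rc != CURLUE_OK)
    {
        std::cout << "curlucode = " << rc << std::endl;
        curl_url_cleanup(urlh);
        return 1; // host was not set, so there is nothing to print or free
    }

    std::cout << host << std::endl;
    curl_free(host); // strings returned by curl_url_get() are released with curl_free()
    curl_url_cleanup(urlh);
    return 0;
}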

I expected the following

libcurl versionlibcurl/8.10.0 OpenSSL/1.1.1f zlib/1.2.11 brotli/1.0.7 libpsl/0.21.0
resum%c3%a9.com

but the actual output was


libcurl versionlibcurl/8.10.0 OpenSSL/1.1.1f zlib/1.2.11 brotli/1.0.7 libpsl/0.21.0
resumé.com

curl/libcurl version

libcurl/8.10.0

operating system

Ubuntu 20.04

@muttalkadavul
Author

Also, if I set the host directly using curl_url_set(urlh, CURLUPART_HOST, hostname.c_str(), 0); and then call get, the hostname is not percent decoded. I get the expected result.

@Jayanta0123

Jayanta0123 commented Sep 17, 2024 via email

@bagder
Member

bagder commented Sep 17, 2024

The host name is indeed treated inconsistently with other components without it being documented.

bagder added a commit that referenced this issue Sep 17, 2024
As nothing in the documentation suggested otherwise and URL components
are by default stored and returned URL encoded.

Fixes #14942
Reported-by: Venkat Krishna R
bagder added a commit that referenced this issue Sep 17, 2024
When a full URL is set (parsed), the hostname component is stored URL
decoded (with default zero flags).

While perhaps surprising and inconsistent, the API has done this for
quite some time already and changing this now would break existing
behaviour.

Fixes #14942
Reported-by: Venkat Krishna R
@bagder bagder closed this as completed in c0a9db8 Sep 18, 2024
moritzbuhl pushed a commit to moritzbuhl/curl that referenced this issue Sep 20, 2024
When a full URL is set (parsed), the hostname component is stored URL
decoded (with default zero flags).

While perhaps surprising and inconsistent, the API has done this for
quite some time already and changing this now would break existing
behaviour.

Fixes curl#14942
Reported-by: Venkat Krishna R
Closes curl#14946