Non-ascii host/domain names are not handled properly #9

Open
sjm42 opened this issue Jun 25, 2022 · 2 comments
Comments

sjm42 commented Jun 25, 2022

For example, a URL like http://äläomista.fi/ is legal and works in a browser.

There is a catch: the web client has to translate DNS names containing non-ASCII (UTF-8) characters properly.
This example domain translates into xn--lomista-4wab.fi

More info can be found here:

https://en.wikipedia.org/wiki/Internationalized_domain_name#Example_of_IDNA_encoding
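
To make the translation concrete, here is a minimal std-only sketch of the Punycode step (RFC 3492) that turns a non-ASCII label into its xn-- form. The function names (punycode_label, to_ascii_host) are made up for this illustration and are not part of webpage or any other crate. It also skips the case mapping, normalization, and validation that full IDNA processing performs, so treat it as an explanation of the encoding rather than something to ship:

```rust
// Punycode parameters from RFC 3492, section 5.
const BASE: u32 = 36;
const T_MIN: u32 = 1;
const T_MAX: u32 = 26;
const SKEW: u32 = 38;
const DAMP: u32 = 700;
const INITIAL_BIAS: u32 = 72;
const INITIAL_N: u32 = 128;

/// Bias adaptation (RFC 3492, section 6.1).
fn adapt(mut delta: u32, num_points: u32, first_time: bool) -> u32 {
    delta /= if first_time { DAMP } else { 2 };
    delta += delta / num_points;
    let mut k = 0;
    while delta > ((BASE - T_MIN) * T_MAX) / 2 {
        delta /= BASE - T_MIN;
        k += BASE;
    }
    k + (BASE * delta) / (delta + SKEW)
}

/// Map a digit value 0..=35 to 'a'..='z' followed by '0'..='9'.
fn digit(d: u32) -> char {
    if d < 26 {
        (b'a' + d as u8) as char
    } else {
        (b'0' + (d - 26) as u8) as char
    }
}

/// Hypothetical helper: encode a single label, adding the "xn--" prefix
/// only when the label contains non-ASCII characters.
/// Assumption: no overflow checks and no IDNA mapping/normalization.
fn punycode_label(label: &str) -> String {
    if label.is_ascii() {
        return label.to_string();
    }
    let input: Vec<u32> = label.chars().map(|c| c as u32).collect();
    // Copy the basic (ASCII) code points first, in order.
    let mut out: String = label.chars().filter(|c| c.is_ascii()).collect();
    let basic = out.len() as u32;
    let mut handled = basic;
    if basic > 0 {
        out.push('-');
    }
    let (mut n, mut delta, mut bias) = (INITIAL_N, 0u32, INITIAL_BIAS);
    while (handled as usize) < input.len() {
        // Smallest code point not yet handled.
        let m = *input.iter().filter(|&&c| c >= n).min().unwrap();
        delta += (m - n) * (handled + 1);
        n = m;
        for &c in &input {
            if c < n {
                delta += 1;
            }
            if c == n {
                // Emit delta as a variable-length base-36 integer.
                let mut q = delta;
                let mut k = BASE;
                loop {
                    let t = if k <= bias {
                        T_MIN
                    } else if k >= bias + T_MAX {
                        T_MAX
                    } else {
                        k - bias
                    };
                    if q < t {
                        break;
                    }
                    out.push(digit(t + (q - t) % (BASE - t)));
                    q = (q - t) / (BASE - t);
                    k += BASE;
                }
                out.push(digit(q));
                bias = adapt(delta, handled + 1, handled == basic);
                delta = 0;
                handled += 1;
            }
        }
        delta += 1;
        n += 1;
    }
    format!("xn--{}", out)
}

/// Hypothetical helper: encode each dot-separated label of a host name.
fn to_ascii_host(host: &str) -> String {
    host.split('.').map(punycode_label).collect::<Vec<_>>().join(".")
}
```

With this sketch, to_ascii_host("äläomista.fi") gives xn--lomista-4wab.fi, the same value as above, and pure-ASCII hosts pass through unchanged.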

sjm42 changed the title from "Non-ascii hostnames are not handled properly" to "Non-ascii host/domain names are not handled properly" on Jun 25, 2022
sjm42 commented Jun 25, 2022

Okay, I was able to get away with this kind of code:

use url::Url;

if let Ok(url) = Url::parse(url_s) {
    // The parsed Url is canonical: the host has been IDNA-encoded during parsing.
    let url_c = String::from(url);
    // ... continue processing with url_c
}

Referring to: https://docs.rs/url/latest/url/struct.Url.html

Still, this is something that the webpage library could handle under the hood, I guess.
I would not mind if it did.

orottier (Owner) commented

Hi @sjm42, thanks for flagging this, and glad to hear you were able to work it out.
I'm keeping this issue open since I think this should be part of the library, but I won't be able to work on it anytime soon.
