Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor Punycode (IDNA/ACE) handling #2097

Open
tlimoncelli opened this issue Feb 27, 2023 · 4 comments
Open

Refactor Punycode (IDNA/ACE) handling #2097

tlimoncelli opened this issue Feb 27, 2023 · 4 comments

Comments

@tlimoncelli
Copy link
Contributor

IDNA is something that needs to be fixed in DNSControl. Currently it takes the string from dnsconfig.js and convert it by calling dc.Punycode(). At that point we lose the original.

A better way would be: The DomainConfig struct should store the original string from dnsconfig.js and the string converted to punycode. (Maybe call the punycode version .Name and the Unicode version .NameUnicode?) It should then use the original for displaying to the users and the punycode for everything else.

Or, if we're lazy we can just output "Domain {.NameUnicode} being converted to {.Name}" once and refer to the punycode for all other output.

@tlimoncelli
Copy link
Contributor Author

The IDNA part of dnscontrol is kind of a mess.

First, I must confess that I'm not an expert here. I had to look up what ACE-encoding it just now. Please correct me if I use any terminology incorrectly.

The way things work now: Users can put Unicode in dnsconfig.js. The providers call dc.Punycode() which converts everything (zone names, labels, targets) to ACE-encoded (i.e. xn--bcher-kva) strings. After that all code assumes everything is ASCII. The "preview/push" output is all ASCII (I think).

The problem I see are:

  • some providers call dc.Punycode() and others do not.
  • preview/push output doesn't try to convert ACE-encoded to Unicode. As a result, a user could input a domain in Unicode but then be confused that the preview/push output is ACE-encoded.

I'm not 100% sure what the best way forward is. Some thoughts:

  • It would be less confusing to users if preview/push messages output ascii when the user input was ascii, and Unicode when the user input was Unicode. (This would be a simple matter of recording (bool) if the input was Unicode, and using that to convert back during preview/push output.)
  • I think it would be better if all providers called dc.Punycode(). It would be better if the conversion was done by the system, so that each provide didn't have a choice.

@tlimoncelli
Copy link
Contributor Author

dkim1970 writes:

I would say this RFC covers the "opinions" about how IDN should be handled in applications. https://datatracker.ietf.org/doc/rfc3490/

More about IDN can be found by this author: https://datatracker.ietf.org/person/paf@paftech.se

@tlimoncelli
Copy link
Contributor Author

Thanks, @dkim1970, for those URLs. I read rfc3490 and now I feel much more confident about what we should do:

My proposal:

  • Users should be permitted to use unicode or ascii in D("") statements.
  • Internally DNSControl should processed everything as ascii (i.e. run ToASCII on all labels as early as possible after "runninig" dnsconfig.js to create the DomainConfig structure). This means that dc.Punycode() should be called by dnscontrol and not be optional for providers.
  • Any output to stdout should pass all labels through ToUNICODE for display. Optionally it should display the ACE-formatted version too. Normally apps are discouraged from exposing the ACE-formatted strings except for debugging purposes (RFC3490 Section 6.1) but I would say that dnscontrol users probably always in a debugging mindset. I propose a CLI flag with options for ACE-only, Unicode-only, or both.
  • When creating zonefiles, output should be ACE, with a comment showing the Unicode.
  • Most provider APIs require everything to be ACE-formatted. If a provider sends or received Unicode, it is up to the provider code to properly translate. This goes for ListZones() and GetZoneRecords() too.

@dkim1970
Copy link
Contributor

@tlimoncelli no worries happy to be of help.
I agree with all your point on how a IDN domain should be handle by DNSControl from a users point of view and how the providers should conform to being force to accept that dc.Punycode() is run by DNSControl.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants