Skip to content

fix: parse Coder authorities with Punycode (xn--) domain labels#930

Open
EhabY wants to merge 1 commit intomainfrom
fix/parse-authority-punycode
Open

fix: parse Coder authorities with Punycode (xn--) domain labels#930
EhabY wants to merge 1 commit intomainfrom
fix/parse-authority-punycode

Conversation

@EhabY
Copy link
Copy Markdown
Collaborator

@EhabY EhabY commented May 4, 2026

Summary

parseRemoteAuthority (src/util.ts) splits the SSH host name on --, which breaks any domain containing a Punycode label (xn--{encoded}). After the existing split, the loop merges segment pairs back while the prefix ends in .xn (the cut landed inside an xn--... label). This covers mid-domain, apex (xn--p1ai), and consecutive Punycode labels.

The .xn check is structural: Punycode's ACE prefix is IANA-reserved. It does not depend on Coder's username, workspace, or agent character rules.

The parseRemoteAuthority describe block is refactored to it.each.

Closes

Closes #929. Builds on #917, extending the same idea to the apex case.

Scope

A deployment on a non-Punycode -- label (e.g. foo--bar.example.com) is not handled. Registries block -- outside xn-- at the TLD level, leaving operator-chosen subdomains as the only path. Supporting that case would require slug-shape validation of the tail, which couples this code to Coder's name rules.

parseRemoteAuthority splits the SSH host name on "--", which breaks any
domain containing a Punycode/IDNA label (xn--{encoded}).  After the
existing split, walk the segments and merge each pair back together
while the prefix ends in ".xn" -- the cut landed inside an "xn--..."
label.  This covers mid-domain, apex, and consecutive Punycode labels.

Refactors the parseRemoteAuthority describe block to it.each.
@EhabY EhabY self-assigned this May 4, 2026
@EhabY EhabY requested a review from code-asher May 5, 2026 09:28
Comment thread src/util.ts
Comment on lines +73 to +78
// Reassemble Punycode labels (xn--...) the split broke apart: when the
// prefix ends in ".xn", the cut landed inside an "xn--..." label.
while (parts.length >= 2 && parts[0].endsWith(".xn")) {
parts.splice(0, 2, `${parts[0]}--${parts[1]}`);
}

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apparently double dashes are only disallowed as the third and fourth characters, and even then it can depend on the tld?

I imagine this is exceedingly rare, but it means a domain like https://test--ドメイン名例.jp (xn--test---8o4epknhtgo724awkl.jp, becomes a triple dash apparently) is actually valid. And also things like test--domain.com are valid.

So I think to fully fix any domain issues we might have to take more of a parser approach. Something like scan through until the last dot, then scan to the first -- that is not part of a ---, then split on everything after that index. Something like that lol

The current change still seems like an improvement though.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes this was considered but it would tie us to Coderd semantics. Like this only works because usernames cannot have dots and thus we can keep appending the segments until there is no dot (this would work on all domains even if rare). Do you prefer this approach?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

parseRemoteAuthority misparses Punycode (xn--) domains in SSH host names

2 participants