MSC2265: Proposal for mandating case folding when processing e-mail address localparts #2265

Merged
merged 10 commits into from Jun 7, 2020

@babolivier
Member

@babolivier babolivier commented Aug 30, 2019

Rendered

@babolivier babolivier changed the title Proposal for mandating lowercasing when processing e-mail address localparts MSC2265: Proposal for mandating lowercasing when processing e-mail address localparts Aug 30, 2019
@turt2live turt2live self-requested a review Aug 30, 2019
Member

@turt2live turt2live left a comment

seems sane to me


This proposal suggests changing the specification of the e-mail 3PID type in
[the Matrix spec appendices](https://matrix.org/docs/spec/appendices#pid-types)
to mandate that any e-mail address must be entirely converted to lowercase
before any processing, instead of only its domain.
Member

@KitsuneRal KitsuneRal Aug 31, 2019

I wonder how much of a complication it would be to mandate lower-case processing (such as lookup and hashing) while still storing addresses with their original case preserved.

Member

@turt2live turt2live Aug 31, 2019

we ultimately can't tell implementations how to store their data (if they want to spend the extra time converting things to uppercase they can), but the requirement for lookups being lowercase is a fairly strong argument imo
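
As a rough illustration of the split discussed here (fold for lookups, store whatever you like), a minimal Python sketch follows. `ThreepidStore`, `lookup_key`, `bind` and `lookup` are hypothetical names made up for this example, not taken from any Matrix implementation, and the sketch uses `str.casefold()` even though the thread at this point still talks about lowercasing.

```python
from typing import Dict, Optional, Tuple


class ThreepidStore:
    """Toy in-memory store (hypothetical): keyed by the folded address,
    while the value keeps the address exactly as the user typed it."""

    def __init__(self) -> None:
        self._bindings: Dict[str, Tuple[str, str]] = {}

    @staticmethod
    def lookup_key(address: str) -> str:
        # Assumption: fold the whole address, localpart included.
        return address.casefold()

    def bind(self, address: str, user_id: str) -> None:
        # Store the original spelling next to the user ID.
        self._bindings[self.lookup_key(address)] = (address, user_id)

    def lookup(self, address: str) -> Optional[Tuple[str, str]]:
        # Lookups go through the same folding as bind().
        return self._bindings.get(self.lookup_key(address))


store = ThreepidStore()
store.bind("Alice@Example.com", "@alice:example.com")
print(store.lookup("alice@EXAMPLE.COM"))
```

Only the dictionary key is folded, so how the address is stored stays an implementation choice, which is the point made above.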

babolivier and others added 6 commits Sep 2, 2019
Co-Authored-By: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>
Co-Authored-By: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>
Co-Authored-By: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>
Co-Authored-By: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>
@anoadragon453
Member

@anoadragon453 anoadragon453 commented Sep 3, 2019

Seems like general consensus overall.

@mscbot fcp merge


@mscbot
Collaborator

@mscbot mscbot commented Sep 3, 2019

Team member @anoadragon453 has proposed to merge this. The next step is review by the rest of the tagged people:

No concerns currently listed.

Once at least 75% of reviewers approve (and none object), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

Member

@richvdh richvdh left a comment

So some questions on this of the sort that arise whenever case mapping comes up:

Are you sure that lower-casing, as opposed to casefolding, is the right thing to do? Examples of the difference:

  • ß (German lower-case sharp 's', or Eszett; upper-case equivalent 'SS') case-folds to 'ss', so that 'hans.voß' matches 'HANS.VOSS'. (On the other hand, it's not entirely obvious that they should be treated the same.)
  • ς (Greek lower-case final sigma, used at the end of a word) case-folds to 'σ' (regular lower-case sigma), so that 'ΣΊΣΥΦΟΣ' matches 'σίσυφος'.

Relatedly: should we consider unicode normalisation, so that (for example) 'ê' (U+00EA, e with circumflex) is treated the same as 'ê' (U+0065 U+0302, e followed by circumflex combining character)?

Neither of the above solve the 'French problem' where (traditionally) accents are omitted on upper-case characters, so 'COTE' should be equivalent to 'côté'...
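
The distinctions raised in this comment can be checked with a short Python sketch. The helper name `fold` is hypothetical, and applying NFC normalisation before case folding is just one possible order, assumed here for illustration rather than taken from the proposal.

```python
import unicodedata


def fold(text: str) -> str:
    """NFC-normalise, then case-fold (one possible order; an assumption)."""
    return unicodedata.normalize("NFC", text).casefold()


# Lower-casing leaves ß alone, so the two spellings stay distinct ...
print("hans.voß".lower() == "HANS.VOSS".lower())
# ... while case folding maps ß to "ss".
print(fold("hans.voß") == fold("HANS.VOSS"))

# Final sigma (ς) folds to the regular lower-case sigma (σ).
print(fold("ΣΊΣΥΦΟΣ") == fold("σίσυφος"))

# Precomposed U+00EA versus "e" followed by combining circumflex U+0302.
print(fold("\u00ea") == fold("e\u0302"))

# Neither step strips accents, so the "French problem" remains.
print(fold("COTE") == fold("côté"))
```

The first and last comparisons print `False` and the other three print `True`: case folding covers the ß and sigma cases, normalisation covers the combining-character case, and neither touches accents.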


@babolivier
Member Author

@babolivier babolivier commented Sep 10, 2019

> Relatedly: should we consider unicode normalisation, so that (for example) 'ê' (U+00EA, e with circumflex) is treated the same as 'ê' (U+0065 U+0302, e followed by circumflex combining character)?

I guess it depends on whether common email providers treat both characters as the same. I'll do some investigation around that.

> Neither of the above solve the 'French problem' where (traditionally) accents are omitted on upper-case characters, so 'COTE' should be equivalent to 'côté'...

Should it, though, keeping in mind we're only looking at email addresses here? I just checked on both Gmail and Hotmail, and neither of them considers bréndan.abolivier@... to be the same as brendan.abolivier@..., and I'm not aware of any provider that does.

Otherwise, yes, casefold is probably the way to go; I'll update the proposal to reflect that.


@richvdh
Member

@richvdh richvdh commented Sep 10, 2019

tbh I wasn't aware that gmail let you use non-ascii localparts at all. Certainly being guided by the behaviour of common providers seems like a sensible idea. (also: sorry for not starting a thread.)

@babolivier
Member Author

@babolivier babolivier commented Sep 10, 2019

> tbh I wasn't aware that gmail let you use non-ascii localparts at all

Neither was I. For full context, I've tried Thunderbird (+OVH), Roundcube (+OVH), Hotmail and Gmail and only this last one accepted a non-ascii localpart in the recipient's address.


@babolivier babolivier changed the title MSC2265: Proposal for mandating lowercasing when processing e-mail address localparts MSC2265: Proposal for mandating case folding when processing e-mail address localparts Nov 13, 2019
@mscbot
Collaborator

@mscbot mscbot commented Jun 2, 2020

🔔 This is now entering its final comment period, as per the review above. 🔔


@mscbot
Collaborator

@mscbot mscbot commented Jun 7, 2020

The final comment period, with a disposition to merge, as per the review above, is now complete.


@turt2live
Member

@turt2live turt2live commented May 3, 2021

Merged 🎉



7 participants