Unicode limitation for usernames #3266

Closed
nukeador opened this issue Jul 21, 2021 · 7 comments


@nukeador

Hi there,

As part of the Community Discourse setup that @Firefishy is leading, a question about Unicode characters in usernames came up, and @mmd-osm asked me to bring it here for guidance.

Discourse limits usernames to letters and numbers, but we are apparently receiving usernames with Unicode characters through the OAuth2 connection.

We can enable Unicode usernames in Discourse, and we can also limit which Unicode characters are allowed, I suspect to avoid invisible characters and confusing icons in the UI that might trick users:

[image]

What is the current policy on Unicode characters in OSM usernames? Is there an allowed list we can use too?

Thanks!

@tomhughes
Member

We allow almost anything - most of what we don't allow is actually ASCII!

The canonical test user is @amandasaurus: https://www.openstreetmap.org/user/%E1%9A%9B%E1%9A%90%E1%9A%8B%E1%9A%90%E1%9A%85%E1%9A%87%E1%9A%90%E1%9A%9C%20%F0%9F%8F%B3%EF%B8%8F%E2%80%8D%F0%9F%8C%88
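
For reference, the percent-encoded path in that URL can be decoded to see which code points the name contains. This is a minimal Python sketch using only the standard library; the variable names are illustrative.

```python
import unicodedata
from urllib.parse import unquote

# Path segment copied from the profile URL above.
encoded = (
    "%E1%9A%9B%E1%9A%90%E1%9A%8B%E1%9A%90%E1%9A%85%E1%9A%87%E1%9A%90%E1%9A%9C"
    "%20%F0%9F%8F%B3%EF%B8%8F%E2%80%8D%F0%9F%8C%88"
)

username = unquote(encoded)  # percent-decoding defaults to UTF-8

# Print each code point with its general category and name.
for ch in username:
    print(f"U+{ord(ch):04X} {unicodedata.category(ch)} {unicodedata.name(ch, '<unnamed>')}")

# Code points in the "Cf" (format) category render as nothing on their own.
invisible = [f"U+{ord(ch):04X}" for ch in username if unicodedata.category(ch) == "Cf"]
```

The name mixes Ogham letters with an emoji sequence held together by a zero-width joiner, which is exactly the kind of invisible character a Discourse allowlist would need to account for.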

@tomhughes
Member

tomhughes commented Jul 21, 2021

The main validation that is applied to usernames, other than length constraints and disallowing leading and trailing whitespace, is https://github.com/openstreetmap/openstreetmap-website/blob/master/app/validators/characters_validator.rb, which is applied with the url_safe option.
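
As a rough illustration of the checks described above, here is a Python sketch (the real validator is Ruby). The length limits and both character sets are assumptions made for illustration, not copied from characters_validator.rb, and `username_errors` is a hypothetical helper; consult the linked file for the actual lists.

```python
import re
from typing import List

# ASSUMED character sets, for illustration only.
CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f\ufffe\uffff]")
URL_UNSAFE_CHARS = re.compile(r"[/;.,?%#]")

# ASSUMED length limits, for illustration only.
MIN_LENGTH, MAX_LENGTH = 3, 255


def username_errors(name: str, url_safe: bool = True) -> List[str]:
    """Return the list of failed checks; an empty list means the name passes."""
    errors = []
    if not (MIN_LENGTH <= len(name) <= MAX_LENGTH):
        errors.append("length")
    if name != name.strip():
        errors.append("leading or trailing whitespace")
    if CONTROL_CHARS.search(name):
        errors.append("invalid characters")
    # The URL check only runs when the url_safe option is enabled.
    if url_safe and URL_UNSAFE_CHARS.search(name):
        errors.append("url characters")
    return errors
```

Under these assumptions, nothing outside a small set of control and URL-reserved characters is rejected, so names made of Ogham letters and emoji pass, which matches the "most of what we don't allow is actually ASCII" remark.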

@nukeador
Author

Thanks @tomhughes

So I suspect we should be fine accepting any Unicode, since OSM is already taking care of the restrictions, right?

@tomhughes
Member

I think so, yes.

Will Discourse cope with a user changing their name? Is it using the numeric ID internally, and will it update its idea of the name if the OSM site name changes?

@mmd-osm
Contributor

mmd-osm commented Jul 21, 2021

Yes, I think so. I've created user ᚐᚅᚇᚐ᚜ 🏳️‍🌈᚛ᚐᚋ -, logged on to Discourse, then changed the user name to ᚐᚅᚇᚐ᚜ 🏳️‍🌈᚛ᚐᚋ11 on osm.org, and logged on again using the new name on Discourse, and this all looked fine.

@amandasaurus

It is fun to see that I'm the cause of this bug report (because the discourse software install had a problem when I tried to create an account), and used as an example of how OSM supports it all! 😂

Unicode, like human languages, is deep & complex. Every letter is “a unicode character”. If you plan to support it all, you may discover the edge cases of human writing 😈.

@mmd-osm
Contributor

mmd-osm commented Jul 21, 2021

The funny thing is, I couldn’t reproduce your Unicode issue, and nothing had changed in the meantime 🧐

So right now it’s still not clear what your issue was…
