Add "·" MIDDLEDOT (U+00B7) support #4
Note: this issue is copied from the old "twitter-text-conformance" repo.
MIDDLE DOT (U+00B7) is widely used as inner-word punctuation in Catalan, where it is a mandatory diacritical character under Catalan orthography rules. Currently Twitter doesn't allow "·" in several places, so I request that its support in Twitter be improved.
I requested this in the Twitter support forum but got no feedback, so I am requesting it here. If this is not the right place, please forward it to the Twitter L10N team.
About 1 and 3
So, please, improve U+00B7 support in Twitter.
Thanks in advance.
I found a new bug related to U+00B7 and Twitter. Please see this Tweet: https://twitter.com/unjoanqualsevol/status/469148413486194688 There are 2 valid and registered URLs
Current Unicode UAX 31 cites 00B7 and its use in hashtags
Is there any improvement or roadmap about this issue?
Apr 7, 2015
Twitter supports hashtags with middle dot (U+00B7), really good news, :)
There are some issues around middle dot support in URLs:
Expected behaviour in all 3 cases is the same as what is currently achieved with accented letters (à, ç, ñ...), i.e. autolinking working fine with L·L.
Please note that CMSs like WordPress don't escape the middle dot, and there are many words in the Catalan Wiktionary with L·L. See: http://ca.wiktionary.org/wiki/Categoria:Mots_en_catal%C3%A0_amb_eles_geminades
Just to point out one more example of autolinking URLs
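For reference, here is a minimal Python sketch of what the escaped form of such a URL segment looks like. The word `col·lecció` is only an illustrative example, and this is not how any particular CMS or Twitter builds its URLs; it just shows the percent-encoding of the UTF-8 bytes of U+00B7 using the standard library.

```python
from urllib.parse import quote, unquote

# Illustrative Catalan word containing a middle dot (U+00B7); an assumption for this example.
word = "col·lecció"

# quote() percent-encodes the UTF-8 bytes of every non-ASCII character.
escaped = quote(word)
print(escaped)

# Decoding the escaped form gives back the original word.
round_trip = unquote(escaped)
print(round_trip == word)
```

An autolinker that only sees the escaped form never meets a raw "·", which is why the problem shows up with services that emit the unescaped character.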
See following Tweet:
But Twitter's autolinking breaks on the "·" (U+00B7) character and splits the URL:
Or, properly escaped if you copy it from the address bar of a modern
I think it's funny how people and messaging products are gradually giving
But, in the case of the middle dot, I don't mind adding it. It is just a
Is there a new RFC for what chars are allowed in urls in the age of modern
On Wed, Oct 7, 2015 at 12:08 PM, Joan Montané firstname.lastname@example.org
Yeah! I know characters beyond old ASCII should be escaped but, as you point out, several web services (WordPress, Twitter...) generate URLs with such characters, so links become unusable, :(
MIDDLE DOT (U+00B7) is used as an inner-word character in Catalan. According to Unicode UAX #29 it is a MidLetter character for word boundary segmentation, so it is unlikely to be used as a URL terminator.
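To illustrate the MidLetter idea, here is a rough Python sketch. Both patterns are hypothetical and are not taken from twitter-text; the point is only that a plain `\w+` stops at "·", while a pattern that accepts the dot solely between word characters keeps the word whole and still never ends on a dot.

```python
import re

MIDDLE_DOT = "\u00b7"

# \w does not match U+00B7, so a naive pattern stops at the dot.
NAIVE = re.compile(r"#\w+")

# Hypothetical pattern: accept a middle dot only when word characters follow it,
# roughly mirroring the MidLetter behaviour (no break between letter, dot, letter).
MIDLETTER = re.compile(r"#\w+(?:" + MIDDLE_DOT + r"\w+)*")

text = "#col·lecció"
naive_match = NAIVE.findall(text)
midletter_match = MIDLETTER.findall(text)

# A trailing dot is not swallowed, so it cannot end the match.
trailing_match = MIDLETTER.findall("#paral·")

print(naive_match, midletter_match, trailing_match)
```

The same approach should carry over to URL path matching: treating "·" as valid only when flanked by letters avoids splitting L·L words without making the dot a possible last character.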