Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Accentuated letters and other non-English letters *are* letters #4

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

jesus2099
Copy link

I have noticed in this post that Official release by Abörted Hitler Cöck, 2011-12-23 triggers the AB acronym AcousticBrainz because of ö not being seen as a Letter by JavaScript (it's clearly a lowercase Letter O with trema accent).

I discovered that \b, \w and \W are only working with pure ASCII letters (English).
As JavaScript RegExpr does not know \p{L}/\P{L} syntax for Letter / non-Letter Unicode categories, I had to copy a full list of character ranges from https://github.com/slevithan/xregexp/blob/master/tools/output/categories.js#L40 .

@Freso, do you have a way to test it on Discourse?
The code is becoming way much more complex.
But international compatibility should be important…
Maybe there is a more simple way… I don't know.

@danielhollas
Copy link

If anybody wants to pick this up, take a look how it is done in the original discourse-linkify-words repo. (essentially, instead of using \b and `\w' we explicitly list the boundary characters).

@jesus2099
Copy link
Author

jesus2099 commented Apr 6, 2021

Thanks @danielhollas,

We had to modify the plugin because of #1 and #2. :)
But maybe it's a good idea to come back to this opposite approach, albeit by fixing it if still needed.

Oh wow, it seems the plugin structure changed quite much, as we come from a common/head_tag.html file to a javascripts/discourse-linkify/lib/utilities.js.es6 file now.
Is it below code?

https://github.com/discourse/discourse-linkify-words/blob/91fd7d44820e01504c548860e1a47664514e291e/javascripts/discourse-linkify/lib/utilities.js.es6#L33-L34

@danielhollas
Copy link

danielhollas commented Apr 6, 2021

@jesus2099 I believe the current version of discoure-linkify should handle the issues in #1 and #2.

Yes, I've refactored the code quite a bit to make it in line with current Discourse theme approach. Incidentally, the code is now much easier to modify for the purposes of your version. Essentially, it should suffice to just modify a couple things here:
https://github.com/discourse/discourse-linkify-words/blob/91fd7d44820e01504c548860e1a47664514e291e/javascripts/discourse-linkify/initializers/initialize-discourse-linkify.js.es6#L31

So in principle it might be worthwhile to just merge with upstream and reapply the modifications, instead of pulling individual pieces. I am happy to submit a PR for that if you'd agree with that idea.

Updating to upstream would also make this theme more maintainable for the future I think. We're currently using it on one of our forum so I am interested in that aspect as well. :-)

@jesus2099
Copy link
Author

I agree it could be great to align back to your source before re-fixing, if even still needed.
I don't know where are our specific stuff (the words we highlight) but @Freso, as you maintain this, what do you think of above proposal?

@danielhollas
Copy link

@jesus2099 the word list is maintained in the theme component settings in the Discourse Admin interface and that shouldn't break with the update.

@Freso
Copy link
Member

Freso commented Apr 7, 2021

I would be happy with a PR of a merge with the updates from the source repository of this codebase. :)

@danielhollas
Copy link

@Freso @jesus2099 Great! I'll cook up a PR. Next week is going to be busy so probably the one after that. Thanks!

@jesus2099
Copy link
Author

Apparently #6 will fix this with Unicode property escapes category \p{Letter}.
I put this in draft and will close it once I can see #6 in action.

@jesus2099 jesus2099 marked this pull request as draft February 3, 2023 18:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants