FIX: a temporary fix when CJK user tries to add a long title #7045
Discourse doesn't segment sentences into words, so for CJK text it counts a whole sentence as a single word. https://meta.discoursecn.org/t/topic/3033
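To illustrate the problem, here is a minimal Ruby sketch of a whitespace-based word-length check. The helper name `longest_word_length` is hypothetical and is not taken from the Discourse codebase; it only shows why a CJK title with no spaces looks like one very long word.

```ruby
# Hypothetical helper (not Discourse's actual code): returns the length of
# the longest whitespace-delimited token in a title.
def longest_word_length(title)
  title.split(/\s+/).map(&:length).max || 0
end

english = "How do I configure email notifications"
chinese = "如何在论坛中配置电子邮件通知功能"

# For the English title, the longest token is one ordinary word.
puts longest_word_length(english)

# For the Chinese title, there is no whitespace, so the whole title is a
# single token and its full length is reported as the "word" length.
puts longest_word_length(chinese) == chinese.length
```

Any maximum-word-length limit tuned for space-separated languages would therefore reject a perfectly normal CJK title.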
You've signed the CLA, fantasticfears. Thank you! This pull request is ready for review. |
Wouldn't it make more sense to improve the word counting in |
It's also on the client side, right? The Unicode regexp implementations in JS and Ruby differ. Plus, Chinese words are written without spaces between them.
I agree with @gschlager here. We should instead improve |
We should instead improve how TextSentinel works for CJK locales.
Ok. Some related information would be:
I think the team would prefer an implementation on character level. In that case, |
Yeah, let's not do that please. KISS
I'm fine bypassing the check for CJK.
I'm fine defining new |
Happy to improve it to work with Unicode provided we don't need a huge regexp :)
I guess we'll have to. Is there any way CJK text can be badly written that we could detect?
I see. I'll find some time to get this rolling.
To put out the immediate fire, I think we are OK to merge this, but I would bypass all word-length tests for CJK titles, as they make little sense there.
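A bypass along these lines could be sketched as follows. This is only an illustration under stated assumptions: `CJK_PATTERN`, `contains_cjk?`, and `title_words_ok?` are hypothetical names, not the code in this pull request, and the sketch relies on Ruby's Unicode script properties to keep the regexp short.

```ruby
# Hypothetical sketch, not the actual change in this PR.
# Matches any Han, Hiragana, Katakana, or Hangul character using Ruby's
# Unicode script properties, which avoids a long list of code point ranges.
CJK_PATTERN = /[\p{Han}\p{Hiragana}\p{Katakana}\p{Hangul}]/

def contains_cjk?(text)
  text.match?(CJK_PATTERN)
end

# Returns true when the title passes the word-length test.
# Titles containing CJK characters skip the test entirely, because
# whitespace-delimited "words" are not meaningful for them.
def title_words_ok?(title, max_word_length)
  return true if contains_cjk?(title)
  title.split(/\s+/).all? { |word| word.length <= max_word_length }
end

puts title_words_ok?("如何在论坛中配置电子邮件通知功能", 10)
puts title_words_ok?("short words only", 10)
puts title_words_ok?("a" * 40, 10)
```

Note that, as mentioned earlier in the thread, the client side would need an equivalent check, and JavaScript's Unicode regexp support differs from Ruby's.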
I haven't had time to implement it, but I was thinking of a different mechanism. If we improve the control flow, such methods could return a probability (or confidence) for their decision. Then, when the confidence is low, we can ask an admin to check whether the post meets the requirements. Otherwise, we could simply block the post as we do now.
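One way to read this proposal is sketched below. Everything here is an assumption made for illustration: the `Verdict` struct, the method names, the confidence values `0.2` and `0.9`, and the `0.5` threshold are all hypothetical placeholders and do not come from Discourse.

```ruby
# Hypothetical sketch of a confidence-based check; names and numbers are
# illustrative assumptions only.
Verdict = Struct.new(:acceptable, :confidence)

CJK_CHARS = /[\p{Han}\p{Hiragana}\p{Katakana}\p{Hangul}]/

# Runs the whitespace-based word-length test and attaches a confidence.
# The test is assumed to be unreliable for CJK text, so confidence is low.
def check_title_words(title, max_word_length)
  longest = title.split(/\s+/).map(&:length).max || 0
  acceptable = longest <= max_word_length
  confidence = title.match?(CJK_CHARS) ? 0.2 : 0.9 # placeholder values
  Verdict.new(acceptable, confidence)
end

# Accept passing titles; send low-confidence failures to an admin for
# review; block high-confidence failures, as happens today.
def decide(verdict, threshold = 0.5)
  return :accept if verdict.acceptable
  verdict.confidence < threshold ? :needs_review : :reject
end

puts decide(check_title_words("hello world", 10))
puts decide(check_title_words("a" * 40, 10))
puts decide(check_title_words("如何在论坛中配置电子邮件通知功能", 10))
```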