Kashida/Tatweel check still too aggressive #8228
Labels
backlog: This is not on the Weblate roadmap for now. Can be prioritized by sponsorship.
enhancement: Adding or requesting a new feature.
good first issue: Opportunity for newcoming contributors.
hacktoberfest: This is suitable for Hacktoberfest. Don’t try to spam.
help wanted: Extra attention is needed.
Describe the issue
This follows up on #6877, in which exceptions were added for certain Arabic prepositions. While it was pointed out there that Arabic has only a limited number of these, most languages written in Arabic script are not Arabic.
Use cases of tatweel/kashida which should be permitted:
Arabic-based scripts have a number of combining characters and diacritics for which the tatweel is used as a "holder" in examples and illustrations, as in ــ٘ـ to highlight the form of the ghunna marker (used in Pakistani languages) without attaching it to surrounding characters. This may come up in translating documentation or applications which have language-specific considerations. Tatweel/kashida + any combining mark should be permitted, as well as tatweel/kashida + combining mark with tatweel/kashida on either side (easier to read with some buffer space around it).
Any sequence ending in a single tatweel/kashida followed by a spacing character, punctuation, or a combining mark should be permitted. An example I came across is translating an app which has a "Mo Tu We Th Fr Sa Su" header. Weekday abbreviations are not typically used in Punjabi, but there is no space in this context to use full words. So a workaround is مـ or اتـ for example to abbreviate the weekdays in a way that is more legible than just putting the isolated form of each letter. Any number of characters should be permitted before the kashida/tatweel for this, since in many alphabets a combination of multiple characters is required to represent a single sound (for example دھ or ای).
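To make the two permitted cases above concrete, here is a small sketch that lists the Unicode code points and general categories of the examples. It uses only Python's standard `unicodedata` module; the helper name `describe` is made up for illustration and is not part of Weblate. Tatweel (U+0640) is itself classified as a modifier letter (`Lm`), while the ghunna marker (U+0658) is a nonspacing mark (`Mn`), so any check that wants to tell these contexts apart has to treat tatweel separately from ordinary letters.

```python
import unicodedata

TATWEEL = "\u0640"  # ARABIC TATWEEL (kashida)


def describe(text):
    """Return (code point, general category) for each character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.category(ch)) for ch in text]


# "Holder" use: tatweel, tatweel, ghunna mark (U+0658), tatweel
holder = TATWEEL * 2 + "\u0658" + TATWEEL
# Abbreviation use: meem (U+0645) followed by one trailing tatweel
abbreviation = "\u0645" + TATWEEL

for label, sample in (("holder", holder), ("abbreviation", abbreviation)):
    print(label, describe(sample))
```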
Really what the original kashida/tatweel check was likely trying to prevent is strings like this:
صـــفــــحـــــے
This is fair enough, but the check should be limited to [actual letter] + tatweel/kashida(s) + [actual letter] so that errors are not thrown for other contexts which have more valid use cases.
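A minimal sketch of the narrower rule proposed here, assuming "actual letter" means any character whose Unicode general category starts with `L`, other than tatweel itself. This is not Weblate's current implementation; the function names `is_letter` and `has_decorative_tatweel` are hypothetical, and the existing preposition exceptions from #6877 would still need to be handled separately.

```python
import unicodedata

TATWEEL = "\u0640"  # ARABIC TATWEEL (kashida)


def is_letter(ch):
    # Tatweel has category Lm, so it must be excluded explicitly.
    return ch != TATWEEL and unicodedata.category(ch).startswith("L")


def has_decorative_tatweel(text):
    """Return True if a run of tatweels sits directly between two letters."""
    i, n = 0, len(text)
    while i < n:
        if text[i] != TATWEEL:
            i += 1
            continue
        start = i
        # Consume the whole run of consecutive tatweels.
        while i < n and text[i] == TATWEEL:
            i += 1
        # Flag only when the run has a letter on both sides.
        if start > 0 and i < n and is_letter(text[start - 1]) and is_letter(text[i]):
            return True
    return False


# Stretched word from the issue: letter + tatweels + letter, so it is flagged.
stretched = "\u0635\u0640\u0640\u0640\u0641\u0640\u0640\u0640\u0640\u062d\u0640\u0640\u0640\u0640\u0640\u06d2"
print(has_decorative_tatweel(stretched))

# Trailing tatweel used for an abbreviation, followed by a space: not flagged.
print(has_decorative_tatweel("\u0627\u062a\u0640 "))

# Tatweel as a holder around a combining mark: not flagged.
print(has_decorative_tatweel("\u0640\u0640\u0658\u0640"))
```

Because a combining mark is not a letter under this assumption, a tatweel that directly follows a diacritic is also left alone, which matches the "holder" use case but may be more permissive than some projects want.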
Steps to reproduce the behavior
No response
Expected behavior
No response
Screenshots
No response
Exception traceback
No response
How do you run Weblate?
weblate.org service
Weblate versions
No response
Weblate deploy checks
No response
Additional context
No response