Hyphenation on words joined with hyphen in Polish #1960
Comments
Perhaps it's not too impolite to ask @jakubkaczor (who opened the above-mentioned Typst issue) on that matter?
A 2022 (LaTeX) Babel manual for Polish also mentions it: "According to Polish rules, when a break occurs at an explicit hyphen, the hyphen gets repeated at the beginning of the new line." (Of course they "fix" it by requiring the user to typeset some specific markup, with active "catcodes"... Sigh.)
It is not impolite at all to ask me. If I understood the question correctly, you wonder whether there should be any hyphenation points in the word between a hyphen if it occurs. I am not knowledgeable enough in the topic, but I can link some sources. I believe the most common package, and the one I used, for correcting hyphenation in LaTeX is
@jakubkaczor Many thanks! Indeed on p. 11 of the document you mentioned: "... allow both parts of the word to be considered ..." That answers my question. (I can't understand the 0-valued kerns in TeX, but whatever, the conclusion is the key.)
Apparently also in Czech
Thanks for looking into this @Omikhleia, and thanks for the feedback @jakubkaczor. This should be working properly in the next patch release. It might even be worth adding an example to the website to showcase this. I could then add a Turkish example too, so we have some samples of how atypical hyphenation rules are or can be handled.
Rumour has it that in Polish, when a word containing a hyphen is cut at that point, the hyphen must be repeated on the next line -- and that several typesetting systems have unaddressed enhancement requests on that topic:
Typst (issue in 2024): typst/typst#3235
OpenOffice (issue in 2006): https://bz.apache.org/ooo/show_bug.cgi?id=71679
If this typography convention for Polish is correct, then SILE too is currently wrong:
Except that it would be a fairly trivial thing to fix... Just a few lines of code, possibly:
(EDIT: Well, a bit more, see PR)
Yay! 😄
I could make a PR, but the devil is in the details. Here I stack the "-" onto the previous word, and insert a postbreak discretionary.
But we could also stack the "-" on the next word, or even wholly ignore it (of course, each time using an adequate postbreak/prebreak/replacement discretionary).
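To make the variants concrete, here is a small language-neutral sketch in Python. It is not SILE code: the `Discretionary` class and the `render` helper are hypothetical names made up for illustration, and they only model the usual prebreak/postbreak/replacement idea under the assumption that the repeated hyphen is what Polish expects.

```python
from dataclasses import dataclass


@dataclass
class Discretionary:
    """Hypothetical model of a discretionary break (not SILE's actual node type)."""
    prebreak: str = ""     # emitted at the end of the line if the break is taken
    postbreak: str = ""    # emitted at the start of the next line if the break is taken
    replacement: str = ""  # emitted in place if the break is not taken


def render(before: str, disc: Discretionary, after: str, broken: bool) -> list[str]:
    """Return the output lines for `before <disc> after`, broken or not."""
    if broken:
        return [before + disc.prebreak, disc.postbreak + after]
    return [before + disc.replacement + after]


# Variant from the comment above: the hyphen is stacked onto the previous
# word, and the discretionary only carries the repeated hyphen as postbreak.
stacked = Discretionary(postbreak="-")
print(render("biało-", stacked, "czerwony", broken=False))  # ['biało-czerwony']
print(render("biało-", stacked, "czerwony", broken=True))   # ['biało-', '-czerwony']

# Alternative: the hyphen belongs to neither word and lives entirely
# in the discretionary.
standalone = Discretionary(prebreak="-", postbreak="-", replacement="-")
print(render("biało", standalone, "czerwony", broken=True))  # ['biało-', '-czerwony']
```

Both variants produce the same visible output. They differ only in which string ("biało-" or "biało") is handed to the hyphenator, which is why the choice matters for the first word.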
The question at stake is how the first word is supposed to be hyphenated.
With the above fix, we get bia•ło-•czer•wony (I am marking the hyphenation points with • to distinguish them from the word's hyphen)
... because
SILE.typesetter:typeset(SILE.showHyphenationPoints("biało-", "pl"))
➡️ bia•ło-
But note that
SILE.showHyphenationPoints("biało", "pl")
➡️ biało (EDIT: no hyphenation point currently)
... so depending on how we do it, we can get different hyphenations...
It seems to me that bia•ło-•czer•wony might be correct, but we'd need a Polish friend to confirm the expectations 🇵🇱
EDIT: I corrected:
SILE.showHyphenationPoints("czerwony", "pl")
➡️ czer•wony (not czer•wo•ny) with default settings.
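As a rough illustration of why the choice matters, the sketch below compares the two ways of treating the hyphen, in plain Python rather than SILE. The `REPORTED` table simply records the results quoted in this issue and stands in for the real hyphenator. The `compound_points` helper is a hypothetical name, and treating the hyphen itself as a break point is an assumption taken from the proposed fix.

```python
# Hyphenation results quoted in this issue for language "pl"; this table is
# a stand-in for the real hyphenator, not a call into SILE.
REPORTED = {
    "biało-": ["bia", "ło-"],
    "biało": ["biało"],
    "czerwony": ["czer", "wony"],
}


def compound_points(first: str, second: str, hyphen_on_first: bool) -> str:
    """Join the reported fragments of both parts, marking break points with •."""
    right = "•".join(REPORTED[second])
    if hyphen_on_first:
        # The hyphen is stacked onto the first word before hyphenating it.
        return "•".join(REPORTED[first + "-"]) + "•" + right
    # The hyphen is kept out of the first word and re-added afterwards.
    return "•".join(REPORTED[first]) + "-•" + right


print(compound_points("biało", "czerwony", hyphen_on_first=True))   # bia•ło-•czer•wony
print(compound_points("biało", "czerwony", hyphen_on_first=False))  # biało-•czer•wony
```

The only difference between the two outputs is the break inside the first word, which is exactly the point that needs confirmation from a Polish speaker.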