Hyphenation only works in English #1451
Can you try with And share a full reproducible example so that I can have a look? Hyphenation is not a bookdown or R Markdown feature; it comes from LaTeX language support. As Pandoc is creating the LaTeX and handling the
You can add Thanks |
I found that the same TeX file produced by "bookdown" gives a different PDF file in TeXstudio: a correct one with hyphenation. As generating the TeX file does not seem to cause the problem, I ran pdflatex on my server machine and on my desktop machine. The server machine generated a PDF file that has no hyphenation and the desktop machine generated a PDF file that has hyphenation. On the server machine:
On the desktop machine:
The only difference is the minor-minor version and it makes such a huge difference. The full code of the bookdown project:
|
By this I meant rendering a document using
---
output: pdf_document
---
in the header instead of using bookdown. This would help build a simpler example to reproduce the issue you are seeing. |
I understand you are saying that without bookdown or R being involved you get a difference using two different One thing that could lead to a big difference is the package versions. You could also check that both your environments are using the same versions of the tools, especially those involved in hyphenation. |
Yes, without bookdown or R being involved, I get a difference using two different pdflatex versions. I have the newest version available on both machines. The server machine with RStudio Server runs Ubuntu 22.04 LTS and the desktop machine runs Ubuntu 23.10. |
This is a bummer. I am not sure what we could do. If you find what changed in the tooling and there is some TeX adjustment to make, in an included preamble for example, we'll be happy to look at it.
I would really check the package versions between both your environments, maybe by comparing output of Hopefully you'll find the issue. |
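One way to make such a comparison less error-prone is to script it. The following is only a sketch in Python, not something posted in this thread: it assumes each machine's package listing has been saved as plain text with one `name version` pair per line (on Debian or Ubuntu, `dpkg-query -W 'texlive*'` prints that shape). The function names `parse_versions` and `compare_environments` are made up for illustration.

```python
def parse_versions(listing):
    """Parse lines of the form '<name> <version>' into a dict.

    Assumption: whitespace (space or tab) separates name and version;
    lines with fewer than two fields are ignored.
    """
    versions = {}
    for line in listing.splitlines():
        parts = line.split()
        if len(parts) >= 2:
            versions[parts[0]] = parts[1]
    return versions


def compare_environments(listing_a, listing_b):
    """Report packages missing on either side and packages whose versions differ."""
    a = parse_versions(listing_a)
    b = parse_versions(listing_b)
    common = set(a) & set(b)
    return {
        "only_in_a": sorted(set(a) - set(b)),
        "only_in_b": sorted(set(b) - set(a)),
        "version_differs": sorted(name for name in common if a[name] != b[name]),
    }
```

Packages reported under `only_in_a` or `only_in_b` are probably worth checking first, since a package that is missing entirely can change the output more than a small version bump.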
In both machines:
|
Your installation is different from what I am used to. I am using TinyTeX FWIW (https://yihui.org/tinytex/). You are probably using the system TeX Live. So you can read Overall, I don't think this is an issue with bookdown, the R Markdown ecosystem or even R. This is related to how TeX Live works and has potentially changed. You can also read the documentation of Hope it helps |
@piiskop have you installed the respective Debian TeX Live packages for your language? That would be one of the Can you only create the LaTeX file and run pdflatex manually on it, and send the output? That would make it much easier to debug. |
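To check whether hyphenation patterns for a language are registered at all, one option is to look at `language.dat`, which `kpsewhich language.dat` should locate in a TeX Live installation. Below is a rough Python sketch, under the assumption that the file uses the usual layout: `%` starts a comment, each entry is `name patternfile`, and a line beginning with `=` declares a synonym for the entry above it. `parse_language_dat` and `has_patterns` are hypothetical helper names.

```python
def parse_language_dat(text):
    """Map each language name (and its synonyms) to its pattern loader file.

    Assumed format: '%' starts a comment, 'name file' defines a language,
    and '=synonym' adds another name for the most recent definition.
    """
    languages = {}
    current_file = None
    for raw in text.splitlines():
        line = raw.split("%", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if line.startswith("="):
            # Synonym for the most recently defined language, if any
            if current_file is not None:
                languages[line[1:].strip()] = current_file
            continue
        parts = line.split()
        if len(parts) >= 2:
            current_file = parts[1]
            languages[parts[0]] = current_file
    return languages


def has_patterns(text, language):
    """Return True if the language name appears as an entry or synonym."""
    return language in parse_language_dat(text)
```

If `has_patterns(contents, "estonian")` is false on one machine and true on the other, that would point to a missing language package rather than to a difference between pdflatex versions.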
It means that you are using two different TeX systems, which is a huge difference, as TeX systems can contain more or fewer packages. Compile this
If the language support is available the log should say something like this
|
It turns out that the cause of the difference was not the version difference of pdflatex. I followed the suggestion of @norbusan, which was not straightforward at first as there is no replacement like Before following the suggestion of @norbusan, I tried what @u-fischer suggested. Here is the result on my desktop machine:
I see that babel-estonian.tex is listed but no message like importing is displayed. The PDF output has hyphenated text. This is the result on my server machine:
babel-estonian.tex is listed here too, but there are Overfull messages for the strings containing letters not available in Latin. The text in the PDF output was not hyphenated. Then I came back to the suggestion of @norbusan and tried to figure out what package is available for Estonian: I followed the command by pressing T. The list of available language packages:
It has no dedicated package for Estonian as it has for Greek or Polish. However, some packages might contain useful things for Estonian. One of them is texlive-lang-all. I did not want to install that because it would obviously have installed all the possible languages in our visible universe known to TeX. I used an Internet search engine to find out what is in texlive-lang-other because, as you might know, in the series Lost there were these mysterious others. It turned out that it did not contain Estonian-others. Next, I used the same approach for texlive-lang-european, although that name, if consistent with my previous experience with computer languages, should not contain Estonian things either, because in the early days they counted Estonian among the Baltic languages although Estonian is not a Baltic language. Estonian is listed for that package. So I installed the package:
Running the same command on my desktop computer resulted in:
This difference of results gave me hope that as the package was missing on my server machine, I can now compile PDF-files with hyphenated text in Estonian on my server machine:
There were no complaints about anything being overfull, and the resulting PDF file contains hyphenated text in Estonian. bookdown now also compiles the PDF file with hyphenated text in Estonian. Before that, the text was not hyphenated, and although I am forcing alignment on both edges, some lines were longer than the rest. I do not know why TeX produces such a crap if hyphenation is not in use; however, this is another topic for another discussion.
I thank you TeX-guys for your quick replies and hints that brought me to a better-working environment! It is still confusing, though, that when so many different people and organizations are involved in developing pieces of software that must work together, there is no easy and fast way to figure out solutions for special cases, which mine pretty often are. In this case: R and its packages, RStudio Server, bookdown, knitr, pandoc, TeX and several packages of the latter. Each of them has its own, sometimes incomplete, documentation and there is no sufficient integration documentation. This is why I write down solutions for my own problems. |
You are looking only at the (shortened) terminal output. The log file contains much more information. |
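As a quick way to pull the relevant lines out of a long log file, here is a small Python sketch. It rests on two assumptions: that babel reports missing patterns with a warning containing "No hyphenation patterns were preloaded", and that overlong lines show up as log lines starting with `Overfull \hbox`. The helper name `scan_latex_log` is invented for this example; adjust the patterns if your log is worded differently.

```python
import re

# Assumed wording of babel's warning about missing hyphenation patterns.
MISSING_PATTERNS = re.compile(r"No hyphenation patterns were preloaded")
# TeX's report for lines that stick out into the margin.
OVERFULL = re.compile(r"^Overfull \\hbox", re.MULTILINE)


def scan_latex_log(log_text):
    """Count hyphenation-related symptoms in the text of a LaTeX log file."""
    return {
        "missing_patterns": len(MISSING_PATTERNS.findall(log_text)),
        "overfull_hboxes": len(OVERFULL.findall(log_text)),
    }
```

Calling `scan_latex_log(open("book.log", errors="replace").read())` on the log from each machine gives two small dictionaries that can be compared directly; `book.log` is a placeholder file name.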
I agree. |
Hi @piiskop You write a lot and complain a lot to volunteers who agreed to help you and who spend countless hundreds of hours ...
It would have been a very good idea to install all those, if you don't know which you need.
Yes, a simple search on packages.debian.org would have showed you that.
I have studied about 15 languages, including Sanskrit. I know what indo-european languages are, but we called it
The log file tells you this. TeX tries to do the best even in such bad circumstances. You call things out as "crap" without understanding even the basic functionality. Go forth and write a program that does block alignment without hyphenation that behaves better. Show me what you can do! And if you can't, don't call out things as crap.
Because they are different software programs. Many people use R/Rserver without EVER touching TeX. Many people use pandoc without EVER touching R, or TeX. Many people use TeX without ever having heard of R or pandoc. Why do you assume just because you feel that is "the way", everyone else needs to do that, too?
The problem was that you did NOT follow the advice we always give: Install ALL of TeX Live, and do not do partial installs. If you do that, you need to know what might go wrong. Enjoy. |
I have already got my solution from two of you; however, as you wanted to debate this topic further, I will respond. I am also a volunteer occasionally; once I volunteer, I do my job diligently. In software development, too many volunteers skip testing, documenting, or following a common style.
Estonian is not an Indo-European language. Estonia is part of Europe and its location has nothing to do with the languages spoken on that territory. Estonian is therefore still a language spoken in Europe; however, its inclusion in European languages is not in correlation with ISO. Estonian belongs to the Uralic languages. At the same time, Czech and Slovak belong to the Indo-European languages but they have a separate package. How is this not confusing and not worth complaining about?
You wrote that TeX does its best. How is that doing the best if some lines are longer than other lines while aligning to both edges and while these lines contain at least two words? I think that TeX can do better, not to mention do well. If the current behavior is the best one then I am pretty disappointed in TeX as it has been around for so long. Why are you using demagogy by telling me to write a program for aligning? LibreOffice Writer has no issue with that and it has not been around so long. Writing the TeX core is not my job. This is the job of TeX-core writers. I am a physics and chemistry teacher. If you want me to do something better in teaching then please tell me so. Something is crap if it is crap no matter whether I can do better or not. It is not my responsibility. I am responsible for my thoughts, words, and actions.
I do not assume that everyone has to integrate pieces of software. I do assume that since it is possible to integrate two pieces of software there must be comprehensive and easy-to-read documentation on that.
Is installing all of TeX Live rational? I mean it allegedly takes a lot of space and probably most of the functionalities would not be in use. |
That TeX doesn't do that by default has something to do with quality, not with inability.
Sure. But the same is true for all your software, including your OS and R and whatever. I didn't install the full texlive because I thought I needed all of it, but because I didn't want to waste my time thinking about which parts I would need. As long as I have the disk space it is only a few more dead bytes. |
Do you see the main difference on line 11 of the output of the first column where the text goes into the second column? Should it be so or does better quality mean longer space between the first two words and bringing the third word to the next line? |
I use RStudio Server, and when I create a bookdown project it can only hyphenate in English in PDF output. If I set the language to something else, like Estonian, then there is no hyphenation at all. The same code in TeXstudio makes hyphenation in Estonian work.
So in index.Rmd, I have:
lang: et
In preamble.tex, I have:
\babelprovide[main,import]{Estonian}
Inside somewhere I have the text:
lepingu kehtivuse ajal ja pärast lepingu lõppemist hoidma konfidentsiaalsena ja ilma Tööandja kirjaliku nõusolekuta mitte avaldama kolmandatele isikutele asutusesiseseks kasutamiseks mõeldud teavet sellele kehtestatud juurdepääsupiirangu tähtaja jooksul või muude andmete hulka, millele juurdepääs või mille kasutamine on mistahes õigusaktiga piiratud.
If I build "bookdown::pdf_book" then the result is that the given text is not hyphenated. If I change "lang" to "en" then English text will be hyphenated according to English rules.
> xfun::session_info('bookdown')