Add “real” Latin text that will hyphenate properly with language tags #13

Closed
svenper opened this issue Dec 25, 2018 · 7 comments

Comments

@svenper

svenper commented Dec 25, 2018

Maybe this is better suited to be a new package, but considering the similarity I think this would be a proper de-duplication of effort. (Maybe kantlipsum etc. qualify as well...)

As the “lorem ipsum” text is not really in the Latin language, babel or polyglossia will not hyphenate it properly, which could lead to wrong assumptions about text metrics. Instead the inspirational text of “lorem ipsum” could be used – De finibus bonorum et malorum by Cicero (“dolorem ipsum, quia dolor sit”, etc.).

I suggest a source addition like this, composed from Wikisource and other sources (one, two, three, four, five) for correcting the Greek phrases (that in themselves present some problems).

I understand these could not be the default paragraphs, which could be worked around either by:

  1. Adding the paragraphs at the end, beginning at 151, or
  2. Using a separate counter, adding a command \AddCiceroPar or similar, and using
    an optional argument either to \usepackage or to \lipsum to access Cicero’s paragraphs. This would also allow the paragraphs to be easily wrapped in otherlanguage environments.
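For illustration, the wrapping mentioned in option 2 could look roughly like the sketch below. It assumes babel with the latin option (from the babel-latin bundle) is installed, and it uses the stock \lipsum paragraphs only to show the mechanism, since the Cicero paragraphs are not part of the package at this point.

```latex
\documentclass{article}
% The last language option is the main document language; latin is loaded
% only so that its hyphenation patterns can be switched on locally.
\usepackage[latin,english]{babel}
\usepackage{lipsum}
\begin{document}
% Everything inside the environment is hyphenated with the Latin patterns.
\begin{otherlanguage}{latin}
\lipsum[1]
\end{otherlanguage}
\end{document}
```

With the pseudo-Latin default text the break points are still only approximate, which is the motivation for using real Latin text.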
@patta42
Owner

patta42 commented Dec 26, 2018

I was thinking about providing a general interface for lipsum-like packages to allow the implementation of different languages. However, an interface that lets you select a command name, let's say cicero, and then defines \cicero, \cicero*, \SetCiceroDefault and all the other commands provided by lipsum.sty was too much effort, so I skipped it. Another, much easier way would be to implement an interface that allows defining different texts that can be used by \lipsum and friends.

If you save your source addition as cicero.ltd (ltd = lipsum text definition ;-)), a very simple way to implement different texts (and languages) would be:

\documentclass{article}

\usepackage{lipsum}
\ExplSyntaxOn
\str_new:N\g_lipsum_language
\cs_generate_variant:Nn \str_if_eq:nnTF {VnTF}
\NewDocumentCommand\IfLipsumLanguage{mmm}{%
  \str_if_eq:VnTF\g_lipsum_language{#1}{#2}{#3}
}

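\NewDocumentCommand\SetLipsumText{m}{%
  % Reload only when a different text is requested.
  \IfLipsumLanguage{#1}{}{
    % Both variables are global (\g_...), so use the global setters.
    \seq_gclear:N\g_lipsum_paragraph_seq
    \file_input:n{#1.ltd}
    \str_gset:Nn\g_lipsum_language{#1}
  }
}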

\ExplSyntaxOff
\begin{document}
\lipsum[1][1]

\SetLipsumText{cicero}
\lipsum[1][1]

\end{document}

I will think about this, but I like the idea.

@patta42
Owner

patta42 commented Dec 26, 2018

However, what I don't like is the formatting commands in your text (for example \hfill, \noindent or \\). I agree that \foreignlanguage might be required (although I do not really think that this is in the scope of lipsum), but if you are developing, say, a new table environment, it will cause unexpected effects if \\ is included in the dummy text. And the only purpose of lipsum is to provide fill text for code development or examples.

@svenper
Author

svenper commented Dec 26, 2018

That works nicely! My first thought was to add something like:

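\NewDocumentCommand\cicero{ s O{1-7} o }{%
	\SetLipsumText{cicero}% the % avoids a spurious space from the line end
	% Pass the sentence range on only if it was actually given.
	\IfBooleanTF{#1}
		{\IfNoValueTF{#3}{\lipsum*[#2]}{\lipsum*[#2][#3]}}
		{\IfNoValueTF{#3}{\lipsum[#2]}{\lipsum[#2][#3]}}%
}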

but that might be slow, and would require using a lower level command for \lipsum etc.

Regarding the mess that is my paragraph 11, I agree. As it is a “verse”, it should have some indication of line breaks, which could maybe be replaced with |, or the like, or even be removed altogether.

Because the fontspec feature Ligatures=TeX can be disabled I think the file should use \textendash instead of -- and the like. Though \space after such commands could be better as simply {}.
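A small sketch of the difference, using placeholder words rather than the actual Cicero text:

```latex
\documentclass{article}
\begin{document}
% Depends on the "--" ligature being active in the current font setup:
first -- second

% Independent of ligature settings. The empty group ends the command name
% without swallowing the following space:
first \textendash{} second
\end{document}
```

Under fontspec the first line only gives an en dash while Ligatures=TeX is active, whereas the second line should give one either way.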

@patta42
Owner

patta42 commented Dec 27, 2018

Some more thoughts on this: I think I will implement this interface, since it is not much work and is easy to use. I guess I will not include the Cicero text, since the Greek (and some other sentences) will cause problems. To use it, one would need a suitable font, a set of packages and so on. I am not willing to provide support for this, I guess ;-)

However, it would be nice to have a second text bundled with lipsum as an example. I was looking at "De bello Gallico", which seems to be much easier to typeset, as well as at some German texts, which will only need umlauts (which will also cause problems: for example, the encoding of the file and the input encoding used for LaTeX will have to match).
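To make the encoding point concrete, here is a minimal sketch, assuming pdfLaTeX and files saved as UTF-8. The ASCII accent commands in the last line are one possible way to keep a text definition file independent of the input encoding; that is a suggestion, not something lipsum prescribes.

```latex
\documentclass{article}
\usepackage[utf8]{inputenc} % must match the encoding the files are saved in
\usepackage[T1]{fontenc}    % fonts with real accented glyphs
\begin{document}
% Works only if the declared input encoding matches the file encoding:
Umlauts typed directly: ä ö ü.

% Pure ASCII input, so it does not depend on the input encoding:
Umlauts via accent commands: \"a \"o \"u.
\end{document}
```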

However, it will be easy for others to add other texts to CTAN, for example Cicero's speech, and use the lipsum interface, but they will have to provide support on their own.

@svenper
Author

svenper commented Dec 28, 2018

If another text is more suitable, by all means, the interface is nice regardless. I fixed the major problems with De finibus though:

  • The Greek really isn’t that important, and transliterating it would create new hyphenation problems, so I replaced it with [\ldots].
  • Single guillemets will fail in the default encoding, so I replaced them with ( ).
  • ¶11 was made into running text.
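For context on the guillemet point: the single guillemet commands are not available in OT1, LaTeX's default font encoding, so text containing them needs an extra package. A minimal sketch with placeholder text:

```latex
\documentclass{article}
% Without this line, \guilsinglleft and \guilsinglright raise an
% "unavailable in encoding OT1" error.
\usepackage[T1]{fontenc}
\begin{document}
\guilsinglleft quoted text\guilsinglright
\end{document}
```

Replacing them with ( ) avoids that dependency, so the text compiles with a bare \usepackage{lipsum}.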

Were there any others you thought about? What about this?

@patta42
Owner

patta42 commented Dec 30, 2018

This looks better, it compiles without any additional packages! If you don't mind, I will use it for the next release of lipsum. I will of course mention your nick. If you want your real name to be mentioned, send me an email if you don't want to write it here.

@patta42
Owner

patta42 commented Jan 3, 2019

217afb1

@patta42 patta42 closed this as completed Jan 3, 2019