Boxed within floating keep sends troff in endless loop (no output) #79

Open
ljrk0 opened this issue Oct 28, 2018 · 12 comments

Comments

@ljrk0

ljrk0 commented Oct 28, 2018

Minimum example (with -ms -Tpost):

.KF
.B1
FOO
.B2
.KE

Interestingly, reversing the order does not trigger the problem (though it probably produces different output):

.B1
.KF
FOO
.KE
.B2

This does not occur with the default troff of Illumos 2018.10 or Oracle Solaris 10 1/13, but it does occur on the latest master (tested on Arch).

@reffort
Contributor

reffort commented Oct 29, 2018

I can confirm what you're seeing. It appears to be due to double backslashes \\ being converted into \e when in a diversion, which prevents the use of registers in a diversion (it also prevents the use of output line traps).

The traditional behavior was changed in commit 3f47f6... "Preserve \\ through diversions (like groff, unlike traditional roff)" dated 25 Jun 2015. Despite the description, the effect of this change is to not preserve \\ in a diversion--it converts \\ to \e--also, it does not quite work like groff. If this commit is reverted, your example works as expected.
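To make the difference concrete, here is a toy sketch in C. It is not heirloom troff code; the function name `reduce_double_backslash` and its `as_printable` flag are made up for illustration. It assumes the only thing that matters is what a `\\` pair turns into when the text is reread: a single `\` (traditional) or the printable sequence `\e` (the behavior described above for diversions).

```c
/*
 * Toy model, NOT heirloom troff code: copy `in` to `out`, replacing every
 * double backslash with either a single backslash (traditional rereading)
 * or the printable-escape sequence \e (the behavior reported in diversions).
 * `out` must have room for strlen(in) + 1 bytes; the output never grows.
 */
static void reduce_double_backslash(const char *in, char *out, int as_printable)
{
	while (*in != '\0') {
		if (in[0] == '\\' && in[1] == '\\') {
			*out++ = '\\';
			if (as_printable)
				*out++ = 'e';	/* \e: prints a backslash, starts no escape */
			in += 2;
		} else {
			*out++ = *in++;
		}
	}
	*out = '\0';
}
```

Under this model, a register reference written as `\\n(xx` becomes `\n(xx` in the traditional case, which can still be interpolated later, but `\en(xx` in the printable case, which can only ever print as literal text. That would be consistent with registers becoming unusable inside a diversion.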

@ljrk0
Author

ljrk0 commented Oct 30, 2018

@reffort Thanks for digging into this; I had no idea where to start. That definitely seems like the place to fix it. I haven't looked into the source much, but something like this might not be far off (written from my phone, so beware):

case ESC:	/* double backslash */
	if (dilev)
		i = ESC;
	else if (prdblesc)
		i = PRESC;
	else
		i = eschar;
	goto gx;

I'm really not sure whether we should simply fall through instead, though, since I don't know what gx does.

@reffort
Contributor

reffort commented Oct 31, 2018

What I use is the original troff code:

case ESC:	/* double backslash */
	i = eschar;
	goto gx;

but with a switch so I can use the other behavior if need be:

case ESC:	/* double backslash */
	if ((prdblesc || dilev) && !escesc)
		i = PRESC;
	else
		i = eschar;
	goto gx;

The variable escesc is controlled by a request .ee (escape means escape).

The document will then need to make a distinction between an escape sequence and a printable escape character \e, but that's the way troff and nroff have always been. You can get away with using \\ to get a printable backslash in the top level only.
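The switchable variant can be restated as a pure function to see which flag combinations yield which result. This is only a sketch: `MODEL_ESCHAR` and `MODEL_PRESC` are stand-in values invented here (the real `eschar` and `PRESC` in the troff source may differ), and `dilev` is assumed to be nonzero whenever a diversion is active.

```c
/* Stand-in values for illustration; not the constants used by troff. */
enum { MODEL_ESCHAR = '\\', MODEL_PRESC = 1 };

/*
 * Model of the switchable variant quoted above. The parameter names
 * mirror the variables in the thread; escesc is the flag set by .ee.
 */
static int resolve_double_backslash(int prdblesc, int dilev, int escesc)
{
	if ((prdblesc || dilev) && !escesc)
		return MODEL_PRESC;	/* printable escape */
	return MODEL_ESCHAR;		/* traditional: a real escape character */
}
```

In this model, once `escesc` is set, the result is the traditional escape character regardless of `prdblesc` or the diversion level. With `escesc` clear, the printable escape is produced in fields and in diversions, and the traditional result applies only at the top level outside of fields.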

I think the behavior was probably changed to accommodate man pages that use the non-portable groff-specific convention, but I really don't know for sure.

@ljrk0
Author

ljrk0 commented Nov 1, 2018

Hm, but isn't the current behavior wrong anyway? It is neither the 'old' behavior nor the groff behavior, as you wrote. On groff the code works as well as on 'old' troff.

@n-t-roff
Owner

n-t-roff commented Nov 1, 2018

Reverting 3f47f6f, @leonardkoenig's suggestion, and the original troff code all cause wrong output for many manpages.
How about something like

case ESC:
    if (prdblesc || (dilev && escesc))
        i = PRESC;
    else
        i = eschar;
    goto gx;

with escesc == 0 by default? That would be equal to heirloom's traditional behavior and I could set .ee in the manpage macros.

@n-t-roff
Owner

n-t-roff commented Nov 1, 2018

Using the following would fix the manpage issue while keeping compatibility:

case ESC:
    if (prdblesc || (dilev && gemu))
        i = PRESC;
    else
        i = eschar;
    goto gx;

@ljrk0
Author

ljrk0 commented Nov 1, 2018

Hm, but as far as I understand, heirloom's traditional behavior is not groff's behavior, which works for both man pages and this example code. Instead of supporting heirloom+old behavior, why not implement groff+old behavior? Or am I missing something?

@reffort
Contributor

reffort commented Nov 2, 2018

When I looked into this a while back, I got as far as realizing that groff handles escapes a different way than troff and I would have to dig into the groff code to find out how it worked, so I just took the easy way out with .ee. If the reverse meaning is adopted, perhaps it could be given a different name, because it would then mean "escape isn't necessarily escape" (maybe .eg or something).

The double backslashes do work in groff with macros in nested diversions, and groff even correctly handles the on-the-fly macro code for output line traps defined several diversion levels deep (groff does not have output line traps, of course, but the macros work).

Although groff's behavior is non-standard, I think it would be great if the double backslash could be made to work in diversions at multiple levels and with macros, because it would solve a consistency problem with C and sed about what \\ does. The flip side is that it makes document coding ambiguous to the point that writers tend to add extra backslashes everywhere one is used; this can be seen in many groff documents and man pages, and some of them are really sloppy. With the traditional behavior, the meaning is unambiguous except when the writer exploits the quirk in the top level. Gunnar fortunately deleted the sentence condoning this usage that has been in the User's Manual since at least the late 1970s.

As for gemu, if that is what takes effect with the .cp request, the implication would be that it should work in groff mode, but it doesn't appear to solve the problem in Leonard's example, which is something that would probably occur in groff documents.

What is the effect of prdblesc? The description indicates it enables the use of \\ to mean \e in fields, but it doesn't elaborate on that (the line was changed earlier than dilev). If it was done for the same reason as dilev, it seems to me it would be more consistent to have either the traditional troff behavior or the modified behavior in effect for both prdblesc and dilev. That's the reason I switched both of them with escesc.

@n-t-roff
Owner

n-t-roff commented Nov 2, 2018

Because of the endless loop issue, it seems better to make the traditional behavior the default by replacing .ee with something like .eg (that is, combine prdblesc and dilev with a flag that is 0 by default).

@reffort
Contributor

reffort commented Nov 3, 2018 via email

@n-t-roff
Owner

n-t-roff commented Nov 3, 2018

Without analyzing it: it is necessary for manpages, unfortunately. I agree that the actual issue is sloppily written manpages (and manpage generators like pod2man), but I can't change them and they work with groff.

@reffort
Contributor

reffort commented Nov 4, 2018 via email
