Extend perlpodspec (and POD parsers) to allow escaping text #18305

Open
Ionic opened this issue Nov 9, 2020 · 0 comments

Currently, there's no (good) way to escape whitespace (or other text) in a POD.

This is problematic if you have to (or rather, want to) wrap long lines, especially those containing formatting codes.

Example 1:


=pod

Refer to the L<foobar section in the
Module::Really::Really::Really::Very::Very::Long documentation|
Module::Really::Really::Really::Very::Very::Long/foobar>.

=cut

Formatters that don't support hyperlinks, like Pod::Man, won't run into issues here because they ignore hyperlinks to begin with. Others, like Pod::Simple::HTML, will be unable to generate a working reference, because the newline after the | is turned into a space and the link target becomes

 Module::Really::Really::Really::Very::Very::Long

(note the leading space character), which certainly can be classified as a bug.

The good folks at PerlMonks were nice enough to help me debug this and pointed me to Pod::Simple::BlackBox, which either (hard-codedly) replaces newlines with space characters or doesn't transform the input at all.

Apparently, modules/tools such as Pod::Man and Pod::Text are not mangling whitespace and newlines at all (which creates a different set of problems), while Pod::Simple::HTML uses the default setting, which seems to be 0/false (i.e., to squash newlines into spaces).

If it were possible to escape whitespace (and especially newlines, but I really do mean whitespace in general), this issue could be worked around.
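To make the failure mode in Example 1 concrete, here is a rough Python model of it. It is only a sketch: the helpers `squash_newlines` and `split_link` are made up for illustration and are not how Pod::Simple is implemented. The sketch assumes that every newline becomes a single space and that the link body is split at the first `|`.

```python
import re

def squash_newlines(paragraph: str) -> str:
    """Replace each newline with a single space (simplified model)."""
    return paragraph.replace("\n", " ")

def split_link(paragraph: str):
    """Return (text, target) of the first L<text|target> code,
    splitting the link body at the first '|'."""
    match = re.search(r"L<([^>]*)>", paragraph)
    text, _, target = match.group(1).partition("|")
    return text, target

pod = (
    "Refer to the L<foobar section in the\n"
    "Module::Really::Really::Really::Very::Very::Long documentation|\n"
    "Module::Really::Really::Really::Very::Very::Long/foobar>."
)

text, target = split_link(squash_newlines(pod))
# The newline right after '|' has become a leading space in the target.
print(repr(target))
```

In this model the link target starts with a space character, which is the same symptom described above for the HTML output.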

There are also some other problems with the interaction between POD parser behaviors and typesetting languages. For instance, when Pod::Man is used to generate man pages from POD, it emits {n,t}roff output, and at least GNU troff has a special quirk, documented under "Sentences": by default, it doubles the whitespace generated after a punctuation character if and only if that character is immediately followed by a newline while the paragraph continues.

Example 2:


=pod

This is a sentence. It's ending here, but the paragraph continues.
This is another sentence.

=cut

When viewed with GNU man, this will generate output like this:

This is a sentence. It's ending here, but the paragraph continues.  This is another sentence.

Admittedly, this is just a stylistic "issue" and there are probably other ways to fix it, but one workaround would likewise be to manually escape the newline in order to make the output more consistent. Let's not talk about this in THIS ticket, though; I'm just providing it as an example of inconsistencies that could be fixed manually with a proper escaping mechanism.
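For readers unfamiliar with the quirk, the spacing rule described above can be modelled in a few lines of Python. This is a deliberately simplified sketch of the described behavior, not GNU troff's actual algorithm; the function name `fill_lines` and the set of sentence-ending characters are assumptions made for illustration.

```python
# Characters treated as sentence-ending punctuation in this model (an assumption).
SENTENCE_END = (".", "?", "!")

def fill_lines(lines):
    """Join input lines into one output line: two spaces after a line that
    ends in sentence punctuation, one space after any other line."""
    out = ""
    for i, line in enumerate(lines):
        line = line.strip()
        out += line
        if i < len(lines) - 1:
            out += "  " if line.endswith(SENTENCE_END) else " "
    return out

source = [
    "This is a sentence. It's ending here, but the paragraph continues.",
    "This is another sentence.",
]
filled = fill_lines(source)
print(filled)
```

The period inside the first input line keeps its single space, while the period at the end of that line is followed by two spaces in the joined output, matching Example 2.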


Now, there is a workaround for these sorts of problems, but it isn't exactly satisfactory.

The Z<> formatting code can be abused to such an effect, but the abuse is greeted with either errors or data loss:


=pod

Refer to the L<foobar section in the
Module::Really::Really::Really::Very::Very::Long documentation|Z<
>Module::Really::Really::Really::Very::Very::Long/foobar>.

=cut

This makes pod2html terminate correctly and also generate a valid link, but leads to errors (both textual and in the return value) from pod2man, podchecker and the like.

With a bit of, err, "creative interpretation" of the whitespace requirement for formatting codes that use multiple angle brackets, we can come up with something like this:


=pod

Refer to the L<foobar section in the
Module::Really::Really::Really::Very::Very::Long documentation|Z<<
>>Module::Really::Really::Really::Very::Very::Long/foobar>.

=cut

This, however, makes podchecker (and pod2man) generate error messages such as "Unterminated L<Z< ... >> sequence at line ..." and "A non-empty Z<> at line". Evidently, even though the perlpod documentation says that Z<< and >> must be separated from their content by whitespace, newlines don't seem to satisfy this condition.

Not a big deal, we can slightly amend this:


=pod

Refer to the L<foobar section in the
Module::Really::Really::Really::Very::Very::Long documentation|Z<<
 >>Module::Really::Really::Really::Very::Very::Long/foobar>.

=cut

(I've put the space right before the closing angle brackets to make it more visible, but it would have the same effect if placed right after Z<<.)

This does silence the errors from podchecker, pod2man and perldoc and likewise doesn't lead to (direct) errors from pod2html, but it breaks the latter's output.

This is the generated HTML output for this paragraph:

<p>Refer to the <a>foobar section in the Module::Really::Really::Really::Very::Very::Long documentation</a></p>

The obvious observation is that the link is broken, because it doesn't contain any location. The not-so-obvious observation is that any data in the paragraph after the L<> code is stripped and essentially lost.

For instance, the same HTML output is generated by this input:


=pod

Refer to the L<foobar section in the
Module::Really::Really::Really::Very::Very::Long documentation|Z<<
 >>Module::Really::Really::Really::Very::Very::Long/foobar>. Additional
information can be found in other places.

=cut

Hence, neither is a universally working/usable workaround.


Therefore, I propose a new formatting code such as M<>, which optionally accepts text and just... munges it.

With that, any data, but also whitespace, could just be "eaten"/hidden away.
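As a rough illustration of how such a code might behave, here is a Python sketch. M<> does not exist in POD today, and the details are not pinned down here, so everything below is one possible reading: the helper `strip_m_codes` is hypothetical and simply deletes each M<...> together with its content before the link is parsed.

```python
import re

def strip_m_codes(paragraph: str) -> str:
    """Delete every M<...> code together with its content
    (hypothetical semantics for the proposed code)."""
    return re.sub(r"M<[^<>]*>", "", paragraph)

pod_m = (
    "Refer to the L<foobar section in the\n"
    "Module::Really::Really::Really::Very::Very::Long documentation|M<\n"
    ">Module::Really::Really::Really::Very::Very::Long/foobar>."
)

cleaned = strip_m_codes(pod_m)
# Naive link parsing: take the first L<...> and split it at the first '|'.
link_body = re.search(r"L<([^>]*)>", cleaned).group(1)
_, _, m_target = link_body.partition("|")
print(repr(m_target))
```

Under these assumptions, the newline hidden inside M<> never reaches the link target, so the long line can be wrapped without producing a leading space.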


Site configuration information for perl 5.30.3:

Configured by Gentoo at Wed Sep 16 05:49:20 CEST 2020.

Summary of my perl5 (revision 5 version 30 subversion 3) configuration:
   
  Platform:
    osname=linux
    osvers=5.5.9
    archname=x86_64-linux
    uname='linux apgunner 5.5.9 #1 smp preempt tue mar 17 04:57:12 cet 2020 x86_64 intel(r) core(tm) i7-7700k cpu @ 4.20ghz genuineintel gnulinux '
    config_args='-des -Dinstallprefix=/usr -Dinstallusrbinperl=n -Ui_xlocale -Di_ndbm -Di_gdbm -Di_db -DDEBUGGING=-g -Dinc_version_list=5.30.1/x86_64-linux 5.30.1 5.30.0/x86_64-linux 5.30.0  -Dlibpth=/usr/local/lib64 /lib64 /usr/lib64 -Dnoextensions=ODBM_File -Duseshrplib -Darchname=x86_64-linux -Dcc=x86_64-pc-linux-gnu-gcc -Dar=x86_64-pc-linux-gnu-ar -Dnm=x86_64-pc-linux-gnu-nm -Dcpp=x86_64-pc-linux-gnu-gcc -E -Dranlib=x86_64-pc-linux-gnu-ranlib -Doptimize=-march=native -mtune=native -Wall -pipe -g3 -ggdb3 -gdwarf-4 -O2 -fno-omit-frame-pointer -Dldflags=-Wl,-O1 -Wl,--as-needed -Wl,-O2 -Wl,--as-needed -Dprefix=/usr -Dsiteprefix=/usr/local -Dvendorprefix=/usr -Dscriptdir=/usr/bin -Dprivlib=/usr/lib64/perl5/5.30.3 -Darchlib=/usr/lib64/perl5/5.30.3/x86_64-linux -Dsitelib=/usr/local/lib64/perl5/5.30.3 -Dsitearch=/usr/local/lib64/perl5/5.30.3/x86_64-linux
-Dvendorlib=/usr/lib64/perl5/vendor_perl/5.30.3 -Dvendorarch=/usr/lib64/perl5/vendor_perl/5.30.3/x86_64-linux -Dman1dir=/usr/share/man/man1 -Dman3dir=/usr/share/man/man3 -Dsiteman1dir=/usr/local/man/man1 -Dsiteman3dir=/usr/local/man/man3 -Dvendorman1dir=/usr/share/man/man1 -Dvendorman3dir=/usr/share/man/man3 -Dman1ext=1 -Dman3ext=3pm -Dlibperl=libperl.so.5.30.3 -Dlocincpth=/usr/include  -Dglibpth=/lib64 /usr/lib64  -Duselargefiles -Dd_semctl_semun -Dcf_by=Gentoo -Dmyhostname=localhost -Dperladmin=root@localhost -Ud_csh -Dsh=/bin/sh -Dtargetsh=/bin/sh -Uusenm -Ui_xlocale -Di_ndbm -Di_gdbm -Di_db -DDEBUGGING=-g -Dinc_version_list=5.30.1/x86_64-linux 5.30.1 5.30.0/x86_64-linux 5.30.0  -Dlibpth=/usr/local/lib64 /lib64 /usr/lib64 -Dnoextensions=ODBM_File'
    hint=recommended
    useposix=true
    d_sigaction=define
    useithreads=undef
    usemultiplicity=undef
    use64bitint=define
    use64bitall=define
    uselongdouble=undef
    usemymalloc=n
    default_inc_excludes_dot=define
    bincompat5005=undef
  Compiler:
    cc='x86_64-pc-linux-gnu-gcc'
    ccflags ='-fwrapv -fno-strict-aliasing -pipe -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64'
    optimize='-march=native -mtune=native -Wall -pipe -g3 -ggdb3 -gdwarf-4 -O2 -fno-omit-frame-pointer'
    cppflags='-fwrapv -fno-strict-aliasing -pipe'
    ccversion=''
    gccversion='9.3.0'
    gccosandvers=''
    intsize=4
    longsize=8
    ptrsize=8
    doublesize=8
    byteorder=12345678
    doublekind=3
    d_longlong=define
    longlongsize=8
    d_longdbl=define
    longdblsize=16
    longdblkind=3
    ivtype='long'
    ivsize=8
    nvtype='double'
    nvsize=8
    Off_t='off_t'
    lseeksize=8
    alignbytes=8
    prototype=define
  Linker and Libraries:
    ld='x86_64-pc-linux-gnu-gcc'
    ldflags ='-Wl,-O1 -Wl,--as-needed -Wl,-O2 -Wl,--as-needed'
    libpth=/usr/local/lib64 /lib64 /usr/lib64 /usr/local/lib /usr/lib/gcc/x86_64-pc-linux-gnu/9.3.0/include-fixed /usr/lib /lib/../lib64 /usr/lib/../lib64 /lib
    libs=-lgdbm -ldb -ldl -lm -lcrypt -lutil -lc -lgdbm_compat
    perllibs=-ldl -lm -lcrypt -lutil -lc
    libc=libc-2.32.so
    so=so
    useshrplib=true
    libperl=libperl.so.5.30.3
    gnulibc_version='2.32'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs
    dlext=so
    d_dlsymun=undef
    ccdlflags='-Wl,-E'
    cccdlflags='-fPIC'
    lddlflags='-shared -march=native -mtune=native -Wall -pipe -g3 -ggdb3 -gdwarf-4 -O2 -fno-omit-frame-pointer -Wl,-O1 -Wl,--as-needed -Wl,-O2 -Wl,--as-needed'

Locally applied patches:
    gentoo/hints_hpux - Fix hpux hints
    gentoo/aix_soname - aix gcc detection and shared library soname support
    gentoo/EUMM-RUNPATH - https://bugs.gentoo.org/105054 cpan/ExtUtils-MakeMaker: drop $PORTAGE_TMPDIR from LD_RUN_PATH
    gentoo/config_over - Remove -rpath and append LDFLAGS to lddlflags
    gentoo/opensolaris_headers - Add headers for opensolaris
    gentoo/patchlevel - List packaged patches for perl-5.30.3-r1(#1) in patchlevel.h
    gentoo/cleanup-paths - Cleanup PATH and shrpenv
    gentoo/enc2xs - Tweak enc2xs to follow symlinks and ignore missing @INC directories.
    gentoo/darwin-cc-ld - https://bugs.gentoo.org/297751 darwin: Use $CC to link
    gentoo/cpan_definstalldirs - Provide a sensible INSTALLDIRS default for modules installed from CPAN.
    gentoo/interix - Fix interix hints
    gentoo/create_libperl_soname - https://bugs.gentoo.org/286840 Set libperl soname
    gentoo/mod_paths - Add /etc/perl to @INC
    gentoo/EUMM_perllocalpod - cpan/ExtUtils-MakeMaker: remove targets that generate perllocal.pod
    gentoo/drop_fstack_protector - https://bugs.gentoo.org/348557 Don't force -fstack-protector on everyone
    gentoo/usr_local - Configure: Don't include sources in /usr/local/ for compiling perl
    gentoo/D-SHA-CFLAGS - https://bugs.gentoo.org/506818 Do not set custom CFLAGS in cpan/Digest-SHA
    gentoo/io_socket_ip_tests - cpan/IO-Socket-IP: Disable network tests
    gentoo/tests - Fix EUMM podlocal tests
    gentoo/no-nsl-cl.patch -
    gentoo/no_porting_tests - Disable porting tests which create fun false-failures all over travis
    gentoo/pathtools_enoent - Disable PathTools tests which fails under sandboxing
    debian/cpan-missing-site-dirs - Fix CPAN::FirstTime defaults with nonexisting site dirs if a parent is writable
    debian/makemaker-pasthru - Pass LD settings through to subdirectories
    fixes/memoize_storable_nstore - [rt.cpan.org #77790] Memoize::Storable: respect 'nstore' option not respected
    fixes/podman-pipe - Better errors for man pages from standard input
    fixes/respect_umask - Respect umask during installation
    fixes/net_smtp_docs - [rt.cpan.org #36038] Document the Net::SMTP 'Port' option
    fixes/document_makemaker_ccflags - [rt.cpan.org #68613] Document that CCFLAGS should include $Config{ccflags}
    fixes/parallel-manisort.patch - Fix parallel building

---
@INC for perl 5.30.3:
    /etc/perl
    /usr/local/lib64/perl5/5.30.3/x86_64-linux
    /usr/local/lib64/perl5/5.30.3
    /usr/lib64/perl5/vendor_perl/5.30.3/x86_64-linux
    /usr/lib64/perl5/vendor_perl/5.30.3
    /usr/local/lib64/perl5/5.30.0/x86_64-linux
    /usr/local/lib64/perl5/5.30.0
    /usr/local/lib64/perl5
    /usr/lib64/perl5/vendor_perl/5.30.1/x86_64-linux
    /usr/lib64/perl5/vendor_perl/5.30.1
    /usr/lib64/perl5/vendor_perl/5.30.0/x86_64-linux
    /usr/lib64/perl5/vendor_perl/5.30.0
    /usr/lib64/perl5/vendor_perl
    /usr/lib64/perl5/5.30.3/x86_64-linux
    /usr/lib64/perl5/5.30.3

---
Environment for perl 5.30.3:
    HOME=/home/ionic
    LANG=en_US.utf8
    LANGUAGE=en_US.utf8
    LC_ADDRESS=en_US.utf8
    LC_COLLATE=en_US.utf8
    LC_CTYPE=en_US.utf8
    LC_IDENTIFICATION=en_US.utf8
    LC_MEASUREMENT=en_US.utf8
    LC_MESSAGES=en_US.utf8
    LC_MONETARY=en_US.utf8
    LC_NAME=en_US.utf8
    LC_NUMERIC=en_US.utf8
    LC_PAPER=en_US.utf8
    LC_TELEPHONE=en_US.utf8
    LC_TIME=en_US.utf8
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=/home/ionic/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/opt/bin:/usr/lib/llvm/11/bin:/usr/lib/llvm/10/bin:/opt/android-sdk-update-manager/tools:/opt/android-sdk-update-manager/platform-tools:/opt/nvidia-cg-toolkit/bin:/opt/cuda/bin
    PERL_BADLANG (unset)
    SHELL=/bin/zsh