5.11 breaks HTML::FormatExternal #102

Closed
firefart opened this issue Jul 7, 2022 · 8 comments


firefart commented Jul 7, 2022

When installing HTML::FormatExternal, the latest URI release (5.11) causes test failures during the build. If I downgrade URI to 5.10, it works again.

Here is some example output:

Logs
# HTML::FormatText::Elinks program_version $VAR1 = '0.13.2';
# $output = "123 567 9012\nabc def ghij\n";
# Colon character is ordinary in filenames: 0
# Temporary directory /tmp/e8BGskz3_e

#   Failed test 'HTML::FormatText::Elinks format_file() filename "/tmp/e8BGskz3_e/-###"'
#   at t/FormatExternal.t line 321.
#                   'Local directory
# /[1]root/[2].cpanm/[3]work/[4]1657174058.2983/[5]HTML-FormatExternal-26/
#
#  drwxr-xr-x   7 root     root          4096 Jul  7 06:08 [6]..
#  drwxr-sr-x   8 root     rt            4096 Jul  7 06:08 [7]blib
#  drwxr-sr-x   3 rt       rt            4096 Aug 29  2015 [8]debian
#  drwxr-sr-x   2 rt       rt            4096 Aug 29  2015 [9]devel
#  drwxr-sr-x   2 rt       rt            4096 Aug 29  2015 [10]examples
#  drwxr-sr-x   2 rt       rt            4096 Aug 29  2015 [11]inc
#  drwxr-sr-x   3 rt       rt            4096 Aug 29  2015 [12]lib
#  drwxr-sr-x   2 rt       rt            4096 Aug 29  2015 [13]t
#  drwxr-sr-x   2 rt       rt            4096 Aug 29  2015 [14]xt
#  drwxr-sr-x   2 rt       rt            4096 Aug 29  2015 [15]xtools
#  -rw-r--r--   1 rt       rt           35068 Jun 29  2007 [16]COPYING
#  -rw-r--r--   1 rt       rt            2294 Aug 29  2015 [17]Changes
#  -rw-r--r--   1 rt       rt            1241 Aug 29  2015 [18]MANIFEST
#  -rw-r--r--   1 rt       rt            3104 Nov 15  2013 [19]MANIFEST.SKIP
#  -rw-r--r--   1 rt       rt            1547 Aug 29  2015 [20]META.json
#  -rw-r--r--   1 rt       rt             853 Aug 29  2015 [21]META.yml
#  -rw-r--r--   1 root     rt            1582 Jul  7 06:07 [22]MYMETA.json
#  -rw-r--r--   1 root     rt             893 Jul  7 06:07 [23]MYMETA.yml
#  -rw-r--r--   1 root     rt           32579 Jul  7 06:07 [24]Makefile
#  -rwxr-xr-x   1 rt       rt            2558 Apr 23  2015 [25]Makefile.PL
#  -rw-r--r--   1 rt       rt            1684 Jun  5  2013 [26]README
#  -rw-r--r--   1 rt       rt            4390 Aug 29  2015 [27]SIGNATURE
#  -rw-r--r--   1 root     rt               0 Jul  7 06:08 [28]pm_to_blib
#
#    --------------------------------------------------------------------------
#
# References
#
#    Visible links
#    1. file:///root/
#    2. file:///root/.cpanm/
#    3. file:///root/.cpanm/work/
#    4. file:///root/.cpanm/work/1657174058.2983/
#    5. file:///root/.cpanm/work/1657174058.2983/HTML-FormatExternal-26/
#    6. file:///root/.cpanm/work/1657174058.2983/
#    7. file:///root/.cpanm/work/1657174058.2983/HTML-FormatExternal-26/blib/
#    8. file:///root/.cpanm/work/1657174058.2983/HTML-FormatExternal-26/debian/
#    9. file:///root/.cpanm/work/1657174058.2983/HTML-FormatExternal-26/devel/
#   10. file:///root/.cpanm/work/1657174058.2983/HTML-FormatExternal-26/examples/
#   11. file:///root/.cpanm/work/1657174058.2983/HTML-FormatExternal-26/inc/
#   12. file:///root/.cpanm/work/1657174058.2983/HTML-FormatExternal-26/lib/
#   13. file:///root/.cpanm/work/1657174058.2983/HTML-FormatExternal-26/t/
#   14. file:///root/.cpanm/work/1657174058.2983/HTML-FormatExternal-26/xt/
#   15. file:///root/.cpanm/work/1657174058.2983/HTML-FormatExternal-26/xtools/
#   16. file:///root/.cpanm/work/1657174058.2983/HTML-FormatExternal-26/COPYING
#   17. file:///root/.cpanm/work/1657174058.2983/HTML-FormatExternal-26/Changes
#   18. file:///root/.cpanm/work/1657174058.2983/HTML-FormatExternal-26/MANIFEST
#   19. file:///root/.cpanm/work/1657174058.2983/HTML-FormatExternal-26/MANIFEST.SKIP
#   20. file:///root/.cpanm/work/1657174058.2983/HTML-FormatExternal-26/META.json
#   21. file:///root/.cpanm/work/1657174058.2983/HTML-FormatExternal-26/META.yml
#   22. file:///root/.cpanm/work/1657174058.2983/HTML-FormatExternal-26/MYMETA.json
#   23. file:///root/.cpanm/work/1657174058.2983/HTML-FormatExternal-26/MYMETA.yml
#   24. file:///root/.cpanm/work/1657174058.2983/HTML-FormatExternal-26/Makefile
#   25. file:///root/.cpanm/work/1657174058.2983/HTML-FormatExternal-26/Makefile.PL
#   26. file:///root/.cpanm/work/1657174058.2983/HTML-FormatExternal-26/README
#   27. file:///root/.cpanm/work/1657174058.2983/HTML-FormatExternal-26/SIGNATURE
#   28. file:///root/.cpanm/work/1657174058.2983/HTML-FormatExternal-26/pm_to_blib
# '
#     doesn't match '(?^:body.*text)'

#   Failed test 'HTML::FormatText::Elinks format_file() filename "/tmp/e8BGskz3_e/%57"'
#   at t/FormatExternal.t line 321.
#                   'Local directory
# /[1]root/[2].cpanm/[3]work/[4]1657174058.2983/[5]HTML-FormatExternal-26/
#
#  drwxr-xr-x   7 root     root          4096 Jul  7 06:08 [6]..
#  drwxr-sr-x   8 root     rt            4096 Jul  7 06:08 [7]blib
#  drwxr-sr-x   3 rt       rt            4096 Aug 29  2015 [8]debian
#  drwxr-sr-x   2 rt       rt            4096 Aug 29  2015 [9]devel
#  drwxr-sr-x   2 rt       rt            4096 Aug 29  2015 [10]examples
#  drwxr-sr-x   2 rt       rt            4096 Aug 29  2015 [11]inc
#  drwxr-sr-x   3 rt       rt            4096 Aug 29  2015 [12]lib
#  drwxr-sr-x   2 rt       rt            4096 Aug 29  2015 [13]t
#  drwxr-sr-x   2 rt       rt            4096 Aug 29  2015 [14]xt
#  drwxr-sr-x   2 rt       rt            4096 Aug 29  2015 [15]xtools
#  -rw-r--r--   1 rt       rt           35068 Jun 29  2007 [16]COPYING
#  -rw-r--r--   1 rt       rt            2294 Aug 29  2015 [17]Changes
#  -rw-r--r--   1 rt       rt            1241 Aug 29  2015 [18]MANIFEST
#  -rw-r--r--   1 rt       rt            3104 Nov 15  2013 [19]MANIFEST.SKIP
#  -rw-r--r--   1 rt       rt            1547 Aug 29  2015 [20]META.json
#  -rw-r--r--   1 rt       rt             853 Aug 29  2015 [21]META.yml
#  -rw-r--r--   1 root     rt            1582 Jul  7 06:07 [22]MYMETA.json
#  -rw-r--r--   1 root     rt             893 Jul  7 06:07 [23]MYMETA.yml
#  -rw-r--r--   1 root     rt           32579 Jul  7 06:07 [24]Makefile
#  -rwxr-xr-x   1 rt       rt            2558 Apr 23  2015 [25]Makefile.PL
#  -rw-r--r--   1 rt       rt            1684 Jun  5  2013 [26]README
#  -rw-r--r--   1 rt       rt            4390 Aug 29  2015 [27]SIGNATURE
#  -rw-r--r--   1 root     rt               0 Jul  7 06:08 [28]pm_to_blib
#
#    --------------------------------------------------------------------------
#
# References
#
#    Visible links
#    1. file:///root/
#    2. file:///root/.cpanm/
#    3. file:///root/.cpanm/work/
#    4. file:///root/.cpanm/work/1657174058.2983/
#    5. file:///root/.cpanm/work/1657174058.2983/HTML-FormatExternal-26/
#    6. file:///root/.cpanm/work/1657174058.2983/
#    7. file:///root/.cpanm/work/1657174058.2983/HTML-FormatExternal-26/blib/
#    8. file:///root/.cpanm/work/1657174058.2983/HTML-FormatExternal-26/debian/
#    9. file:///root/.cpanm/work/1657174058.2983/HTML-FormatExternal-26/devel/
#   10. file:///root/.cpanm/work/1657174058.2983/HTML-FormatExternal-26/examples/
#   11. file:///root/.cpanm/work/1657174058.2983/HTML-FormatExternal-26/inc/
#   12. file:///root/.cpanm/work/1657174058.2983/HTML-FormatExternal-26/lib/
#   13. file:///root/.cpanm/work/1657174058.2983/HTML-FormatExternal-26/t/
#   14. file:///root/.cpanm/work/1657174058.2983/HTML-FormatExternal-26/xt/
#   15. file:///root/.cpanm/work/1657174058.2983/HTML-FormatExternal-26/xtools/
#   16. file:///root/.cpanm/work/1657174058.2983/HTML-FormatExternal-26/COPYING
#   17. file:///root/.cpanm/work/1657174058.2983/HTML-FormatExternal-26/Changes
#   18. file:///root/.cpanm/work/1657174058.2983/HTML-FormatExternal-26/MANIFEST
#   19. file:///root/.cpanm/work/1657174058.2983/HTML-FormatExternal-26/MANIFEST.SKIP
#   20. file:///root/.cpanm/work/1657174058.2983/HTML-FormatExternal-26/META.json
#   21. file:///root/.cpanm/work/1657174058.2983/HTML-FormatExternal-26/META.yml
#   22. file:///root/.cpanm/work/1657174058.2983/HTML-FormatExternal-26/MYMETA.json
#   23. file:///root/.cpanm/work/1657174058.2983/HTML-FormatExternal-26/MYMETA.yml
#   24. file:///root/.cpanm/work/1657174058.2983/HTML-FormatExternal-26/Makefile
#   25. file:///root/.cpanm/work/1657174058.2983/HTML-FormatExternal-26/Makefile.PL
#   26. file:///root/.cpanm/work/1657174058.2983/HTML-FormatExternal-26/README
#   27. file:///root/.cpanm/work/1657174058.2983/HTML-FormatExternal-26/SIGNATURE
#   28. file:///root/.cpanm/work/1657174058.2983/HTML-FormatExternal-26/pm_to_blib
# '
#     doesn't match '(?^:body.*text)'

As the HTML::FormatExternal package seems fairly outdated, I am also opening an issue here in case this is a bug introduced in this library.

Cross reference:
https://rt.cpan.org/Ticket/Display.html?id=143689


kryde commented Jul 7, 2022

With

use URI::file;
print URI::file->new_abs("/tmp/###"),"\n";

I expected this to print file:///tmp/%23%23%23, which no longer seems to be the case in 5.11.

I used this in FormatExternal to ensure that a weird filename is encoded in a way which is unambiguous for the program called.
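To make the expected result concrete, here is a minimal sketch in core Perl of the percent-encoding that the test relies on. The helper name escape_path_for_file_uri is hypothetical and is not part of URI or HTML::FormatExternal; it assumes a byte-string absolute path and is not how URI::file is implemented.

```perl
use strict;
use warnings;

# Hypothetical helper, for illustration only (not URI's implementation).
# Percent-encode every byte of an absolute path that is neither an
# RFC 3986 "unreserved" character nor a '/' separator.
sub escape_path_for_file_uri {
    my ($path) = @_;
    (my $escaped = $path) =~ s{([^A-Za-z0-9\-._~/])}{sprintf '%%%02X', ord $1}ge;
    return "file://$escaped";
}

print escape_path_for_file_uri('/tmp/###'), "\n";   # file:///tmp/%23%23%23
print escape_path_for_file_uri('/tmp/%57'), "\n";   # file:///tmp/%2557
```

With '#' and '%' encoded this way, the called program cannot mistake part of the filename for a fragment or for an existing escape sequence.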


oalders commented Jul 7, 2022

This would have been introduced via #100.


oalders commented Jul 7, 2022

@Perlbotics I think we need to consider which URI types should not be affected by this new behaviour?

Perlbotics commented Jul 7, 2022

Whoa. I guess it happens around line 104 in URI.pm. Let me check.
Do you need a quick fix?


oalders commented Jul 7, 2022

@Perlbotics I can revert #100 and upload a new release. That would leave time to be thorough about a fix and get some new tests in place. We could then re-introduce the changes with the tweaks. Does that sound reasonable?


Perlbotics commented Jul 7, 2022

Hi Olaf (@oalders), sounds reasonable.

Locally, I added a test case for issue 102 and ran the test suite again.
The problem is that the sub at line 100 of URI.pm does not work for schemes that have no authority part, nor for those where '#' should be escaped. The same goes for '?', I guess.
Other candidates: sftp, ftp, ldap, rsync, etc.

Quick fix for issue #102 (leaving similar problems open):
Insert at line 103: return if $_[0] =~ /^file/; #-- selective for issue#102

I'll return to this in the evening.
Sorry for the inconvenience.

Update:
The problem is a zero-length match of the authority part, as in scheme:///, scheme://#, scheme://?.
For file:/// it is worse, since the result is an empty string, which is then turned into the current directory (at least with new_abs())!

Changing line 105 from:
my $orig = $2;
to
my $orig = $2 || return;
should do the trick.
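
The zero-length match and the proposed guard can be illustrated with a small sketch. The pattern and the sub name authority_or_skip below are assumptions made for this example; the actual regex in URI.pm may differ.

```perl
use strict;
use warnings;

# Hypothetical pattern, for illustration only; the real one in URI.pm may differ.
# $1 = "scheme:" prefix, $2 = authority (may be empty because of '*'), $3 = rest.
my $authority_re = qr{^([A-Za-z][A-Za-z0-9+.\-]*:)//([^/?\#]*)(.*)$}s;

sub authority_or_skip {
    my ($uri) = @_;
    return unless $uri =~ $authority_re;
    my $orig = $2 || return;    # guard from the comment above: skip an empty authority
    return $orig;
}

for my $uri ('file:///tmp/###', 'http://[::1]/path', 'scheme://#') {
    my $auth = authority_or_skip($uri);
    printf "%-20s => %s\n", $uri, defined $auth ? "authority '$auth'" : 'skipped';
}
```

One caveat of the `|| return` idiom is that it also skips an authority that is the string "0", since that is false in Perl.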

Perlbotics pushed a commit to Perlbotics/URI that referenced this issue Jul 7, 2022
Test case added.
Skipping attempt to unescape empty authority part.

See: libwww-perl#102
Perlbotics pushed a commit to Perlbotics/URI that referenced this issue Jul 7, 2022
Perlbotics pushed a commit to Perlbotics/URI that referenced this issue Jul 8, 2022
The previous fix checked the result of the regex-match.
However, the regex-match could have avoided the situation
in the first place.

The new regex now asks for a non-zero authority part.

Details: libwww-perl#102
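
As a rough sketch of the difference described in this commit message, the two patterns below differ only in the quantifier on the authority part. Both are hypothetical stand-ins, not copied from URI.pm.

```perl
use strict;
use warnings;

# Both patterns are hypothetical stand-ins, not copied from URI.pm.
my $allows_empty   = qr{^[a-z][a-z0-9+.\-]*://([^/?\#]*)}i;   # '*' permits a zero-length authority
my $requires_chars = qr{^[a-z][a-z0-9+.\-]*://([^/?\#]+)}i;   # '+' needs at least one character

# Return 1 if $uri matches $re, 0 otherwise.
sub matches {
    my ($re, $uri) = @_;
    return $uri =~ $re ? 1 : 0;
}

for my $uri ('file:///tmp/###', 'file://host/tmp/x') {
    printf "%-20s  '*': %d  '+': %d\n", $uri,
        matches($allows_empty, $uri), matches($requires_chars, $uri);
}
```

With the '+' variant, a URI such as file:///tmp/### no longer matches at all, so no result check is needed after the match.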
oalders pushed a commit that referenced this issue Jul 10, 2022
Test case added.
Skipping attempt to unescape empty authority part.

See: #102

The previous fix checked the result of the regex-match.
However, the regex-match could have avoided the situation
in the first place.

The new regex now asks for a non-zero authority part.

Skip IPv6 handling of schemes that do not have an authority part.

Currently: data, file, ldapi, urn, sqlite, sqlite3

Fix: Fallback to pre 5.11 for specific schemes (i.e. 'mailto:').

Short test cases added for 'mailto:' URIs having
address literals (IPv4 and IPv6).

Modernized t/file.t to use Test::More instead of plain TAP.

In preparation of more future tests.

Tests added to show that domain in file:// is properly escaped.

oalders commented Jul 10, 2022

Closed via 725fbfb

@oalders oalders closed this as completed Jul 10, 2022
oalders added a commit that referenced this issue Jul 10, 2022
    - Fix an issue where i.e. 'file:///tmp/###' was not properly escaped.
      A non-existing authority part was accidentally processed.
      Details: #102
      (GH#102) (Perlbotics)
    - Reverts to previous behavior (5.10) for 'mailto:' scheme for
      escaping square brackets.

oalders commented Jul 10, 2022

5.12 has just been released to CPAN. Thanks @firefart for raising the issue!

firefart added a commit to firefart/rt-docker that referenced this issue Jul 11, 2022
clrpackages pushed a commit to clearlinux-pkgs/perl-URI that referenced this issue Jul 12, 2022
5.12      2022-07-10 23:48:50Z
    - Fix an issue where i.e. 'file:///tmp/###' was not properly escaped.
      A non-existing authority part was accidentally processed.
      Details: libwww-perl/URI#102
      (GH#102) (Perlbotics)
    - Reverts to previous behavior (5.10) for 'mailto:' scheme for
      escaping square brackets.
netbsd-srcmastr pushed a commit to NetBSD/pkgsrc that referenced this issue Apr 30, 2023
Upstream changes:
5.18      2023-04-29 16:08:14Z
    - Add a GH workflow to test LWP::Curl (GH#116) (Olaf Alders)
    - Add documentation examples for the host() and ihost() methods (GH#28)
      (Sebastian Willing)
    - Remove colon from username:password if there is no password (GH#31)
      (David E. Wheeler, Joenio Marques da Costa, Julien Fiegehenn)
    - Prefix private methods with _ in URI::_punycode (GH#47) (David E Wheeler)

5.17      2022-11-02 17:03:48Z
    - Updated RFC references in the pod documentation for URI::file (GH#117)
      (Håkon Hægland)
    - Fix SIP URI encoder/decoder (GH#118) (ryankereliuk)

5.16      2022-10-12 13:10:40Z
    - Merge the methods from URI::QueryParam into URI, so they are always
      available (GH#114) (Graham Knop)

5.15      2022-10-11 14:48:28Z
    - Teach uri_escape to accept a Regexp object as the characters to escape
      as an alternative to a character class. (GH#113) (Graham Knop)

5.14      2022-10-10 20:37:57Z
    - Fix uri_escape allowing \w style character classes in its character set
      parameter (GH#112) (Graham Knop)

5.13      2022-10-06 16:46:32Z
    - Regression test added for a previous bug (5.11) in URI::file (Perlbotics).
      file() method of URI::file can return the current working directory
      instead of the properly unescaped path. (GH#106) (Perlbotics)
    - Replace "Test" with "Test::More" (GH#107) (James Raspass)
    - Replace raw TAP printing with "Test::More" (GH#108) (James Raspass)
    - Apply perlimports to tests (GH#110) (Olaf Alders)
    - Improve escaping of unwanted characters (GH#78) (Branislav Zahradník)

5.12      2022-07-10 23:48:50Z
    - Fix an issue where i.e. 'file:///tmp/###' was not properly escaped.
      A non-existing authority part was accidentally processed.
      Details: libwww-perl/URI#102
      (GH#102) (Perlbotics)
    - Reverts to previous behavior (5.10) for 'mailto:' scheme for
      escaping square brackets.

5.11      2022-07-04 20:53:38Z
    - Fix some typos in URI::file (GH#94) (Olaf Alders)
    - Escape square brackets in path (GH#100) (Perlbotics)
    - Fix storable.t (GH#97) (Shoichi Kaji)