UTF-8 bug in Test::More and/or Test::Builder #11912

Closed
p5pRT opened this issue Jan 27, 2012 · 8 comments

@p5pRT p5pRT commented Jan 27, 2012

Migrated from rt.perl.org#109204 (status was 'rejected')

Searchable as RT109204$

@p5pRT p5pRT commented Jan 27, 2012

From tchrist@perl.com

As far as I can tell, it is not possible to use Test::More's
diag() function to reliably emit anything but pure ASCII. This is not
a documented restriction, and even if it were, it would still be unacceptable.

The problem is that Test::Builder::_new_fh() doesn't set the encoding on
the output handle. This causes a "Wide character in print" warning. Attempting
to work around that by setting PERL_UNICODE (or running under -C) just moves
the problem around. It doesn't fix it.
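
For readers unfamiliar with the warning, here is a minimal sketch of the underlying mechanism, independent of Test::Builder. It writes the same character string to two in-memory handles, one with no encoding layer and one with an explicit UTF-8 layer. The variable names and the in-memory handles are illustrative assumptions, not code from Test::Builder.

```perl
use strict;
use warnings;

# Collect warnings in an array so the effect is visible without watching STDERR.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, @_ };

# A handle with no encoding layer (":raw" given explicitly so that any
# -C / PERL_UNICODE default layers do not apply), writing into a buffer.
my $raw = '';
open my $plain, '>:raw', \$raw or die "cannot open in-memory handle: $!";
print {$plain} "\x{2026}starting";    # U+2026 HORIZONTAL ELLIPSIS
close $plain;

# The same write through a handle that declares a UTF-8 encoding layer.
my $encoded = '';
open my $layered, '>:encoding(UTF-8)', \$encoded
    or die "cannot open in-memory handle: $!";
print {$layered} "\x{2026}starting";
close $layered;

printf "warnings collected: %d\n", scalar @warnings;
printf "first warning: %s", $warnings[0] // "(none)\n";
printf "buffers hold the same bytes: %s\n", $raw eq $encoded ? "yes" : "no";
```

Only the write to the layer-less handle should warn. Perl still emits the character's internal UTF-8 bytes in that case, which is why wide characters can appear to come out correctly under -C0 while characters in the Latin-1 range (such as "æ" or "¿") come out as single, non-UTF-8 bytes.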

Here are the four possibilities:

                                   -C0     -CS
                                   ----    ----
    run directly                   fail    pass
    run w/Test::Harness runtests   fail    fail

The only way to make it work is to run *without* a test harness and with -CS
set, or for the test program to itself binmode STDOUT to utf8.

However, none of that works when run under a test harness. That means
there's no way to make it behave properly from a test harness.

Here's the simple demo test program:

  use utf8;
  use strict;
  use warnings;
  use charnames qw(:full);

  use Test::More;
  use Carp qw(cluck);

  ## Uncomment this to learn where the real problem is happening:
  ##
  ## $SIG{__WARN__} = sub { cluck "TRAPPED WARNING: @_" };

  my $sisyphus = "© 2011, Σίσυφος";

  diag("…starting Sisyphean tests");

  like $sisyphus, qr/\N{COPYRIGHT SIGN}/, "it’s got a copyright sign";
  like $sisyphus, qr/\p{sc=Greek}/, "græcum est: non potest legi";

  diag("¿Qué pasó?");

  done_testing();

This one screws up, and also produces illegal UTF-8:

  $ perl -C0 /tmp/badenc.t
  Wide character in print at /usr/local/lib/perl5/5.14.0/Test/Builder.pm line 1759.
  # …starting Sisyphean tests
  Wide character in print at /usr/local/lib/perl5/5.14.0/Test/Builder.pm line 1759.
  ok 1 - it’s got a copyright sign
  ok 2 - gr?cum est: non potest legi
  # ?Qu? pas??
  1..2

This is the only one that works:

  $ perl -CS /tmp/badenc.t
  # …starting Sisyphean tests
  ok 1 - it’s got a copyright sign
  ok 2 - græcum est: non potest legi
  # ¿Qué pasó?
  1..2

This one also screws up:

  $ perl -C0 -MTest::Harness -e 'runtests(@ARGV)' /tmp/badenc.t
  /tmp/badenc.t .. Wide character in print at /usr/local/lib/perl5/5.14.0/Test/Builder.pm line 1759.
  # …starting Sisyphean tests
  /tmp/badenc.t .. 1/? Wide character in print at /usr/local/lib/perl5/5.14.0/Test/Builder.pm line 1759.
  # ?Qu? pas??
  /tmp/badenc.t .. ok
  All tests successful.
  Files=1, Tests=2, 0 wallclock secs ( 0.02 usr 0.01 sys + 0.05 cusr 0.01 csys = 0.09 CPU)
  Result: PASS

And this doesn't help it at all -- notice the double encoding:

  $ perl -CS -MTest::Harness -e 'runtests(@ARGV)' /tmp/badenc.t
  /tmp/badenc.t .. Wide character in print at /usr/local/lib/perl5/5.14.0/Test/Builder.pm line 1759.
  # â�¦starting Sisyphean tests
  /tmp/badenc.t .. 1/? Wide character in print at /usr/local/lib/perl5/5.14.0/Test/Builder.pm line 1759.
  # ¿Qué pasó?
  /tmp/badenc.t .. ok
  All tests successful.
  Files=1, Tests=2, 0 wallclock secs ( 0.02 usr 0.01 sys + 0.05 cusr 0.01 csys = 0.09 CPU)
  Result: PASS
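
The garbled "â�¦" in the last run is what U+2026 looks like after being UTF-8 encoded twice. The following sketch reproduces that byte sequence with the core Encode module. It illustrates double encoding in general and makes no claim about which component in the harness pipeline performs the second encoding; the variable names are made up for the example.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

my $text = "\x{2026}";    # U+2026, the ellipsis that opens the diag() message

# Encoded once: the correct UTF-8 bytes.
my $once = encode('UTF-8', $text);

# Something downstream treats each byte as a separate character ...
my $misread = decode('ISO-8859-1', $once);

# ... and encodes those characters to UTF-8 again.
my $twice = encode('UTF-8', $misread);

# Format a byte string as space-separated hex for display.
sub hex_bytes { join ' ', map { sprintf '%02X', ord } split //, $_[0] }

printf "encoded once:  %s\n", hex_bytes($once);     # E2 80 A6
printf "encoded twice: %s\n", hex_bytes($twice);    # C3 A2 C2 80 C2 A6
```

Read as UTF-8, the doubly encoded bytes are U+00E2 ("â"), the control character U+0080, and U+00A6 ("¦"), which matches the three-character debris shown in the harness output.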

--tom

Summary of my perl5 (revision 5 version 14 subversion 0) configuration:

  Platform:
    osname=openbsd, osvers=4.4, archname=OpenBSD.i386-openbsd
    uname='openbsd chthon 4.4 generic#0 i386 '
    config_args='-des'
    hint=recommended, useposix=true, d_sigaction=define
    useithreads=undef, usemultiplicity=undef
    useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
    use64bitint=undef, use64bitall=undef, uselongdouble=undef
    usemymalloc=y, bincompat5005=undef
  Compiler:
    cc='cc', ccflags ='-fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include',
    optimize='-O2',
    cppflags='-fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include'
    ccversion='', gccversion='3.3.5 (propolice)', gccosandvers='openbsd4.4'
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=4, prototype=define
  Linker and Libraries:
    ld='cc', ldflags ='-Wl,-E -fstack-protector -L/usr/local/lib'
    libpth=/usr/local/lib /usr/lib
    libs=-lgdbm -lm -lutil -lc
    perllibs=-lm -lutil -lc
    libc=/usr/lib/libc.so.48.0, so=so, useshrplib=false, libperl=libperl.a
    gnulibc_version=''
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags=' '
    cccdlflags='-DPIC -fPIC ', lddlflags='-shared -fPIC -L/usr/local/lib -fstack-protector'

Characteristics of this binary (from libperl):
  Compile-time options: MYMALLOC PERL_DONT_CREATE_GVSV PERL_MALLOC_WRAP
                        PERL_PRESERVE_IVUV USE_LARGE_FILES USE_PERLIO
                        USE_PERL_ATOF
  Built under openbsd
  Compiled at Jun 11 2011 11:48:28
  %ENV:
    PERL_UNICODE="SA"
  @INC:
    /usr/local/lib/perl5/site_perl/5.14.0/OpenBSD.i386-openbsd
    /usr/local/lib/perl5/site_perl/5.14.0
    /usr/local/lib/perl5/5.14.0/OpenBSD.i386-openbsd
    /usr/local/lib/perl5/5.14.0
    /usr/local/lib/perl5/site_perl/5.12.3
    /usr/local/lib/perl5/site_perl/5.11.3
    /usr/local/lib/perl5/site_perl/5.10.1
    /usr/local/lib/perl5/site_perl/5.10.0
    /usr/local/lib/perl5/site_perl/5.8.7
    /usr/local/lib/perl5/site_perl/5.8.0
    /usr/local/lib/perl5/site_perl/5.6.0
    /usr/local/lib/perl5/site_perl/5.005
    /usr/local/lib/perl5/site_perl
    .

@p5pRT p5pRT commented Jan 27, 2012

From @Hugmeir

http://www.effectiveperlprogramming.com/blog/1226

@p5pRT p5pRT commented Jan 27, 2012

The RT System itself - Status changed from 'new' to 'open'

@p5pRT p5pRT commented Jan 27, 2012

From tchrist@perl.com

It gets even worse if you swap the like()s to unlike()s,
to make it fail. Then it can't even print out what
the original values are.

--tom

@p5pRT p5pRT commented Jan 27, 2012

From tchrist@perl.com

"Brian Fraser via RT" <perlbug-followup@perl.org> wrote
  on Fri, 27 Jan 2012 06:19:44 PST:

> http://www.effectiveperlprogramming.com/blog/1226

That just moves the bug/error around a bit. Now it seems to be in MakeMaker.

  use utf8;
  use strict;
  use warnings;
  use warnings FATAL => "utf8";
  use charnames qw(:full);

  use open qw(:std :utf8);

  use Test::More;
  use Carp qw(cluck);

  for my $meth ( qw(output failure_output) ) {
      binmode Test::More->builder->$meth(), ":utf8";
  }

  ## Uncomment this to learn where the real problem is happening:
  ##
  ## $SIG{__WARN__} = sub { cluck "TRAPPED WARNING: @_" };

  my $sisyphus = "© 2011, Σίσυφος";

  diag("…starting Sisyphean tests");

  unlike $sisyphus, qr/\N{COPYRIGHT SIGN}/, "it’s got a copyright sign";
  unlike $sisyphus, qr/\p{sc=Greek}/, "græcum est: non potest legi";

  diag("¿Qué pasó?");

  done_testing();

Which gives us:

  $ perl -CS -MTest::Harness -e 'runtests(@ARGV)' /tmp/badenc.t
  /tmp/badenc.t .. # â�¦starting Sisyphean tests
  /tmp/badenc.t .. 1/?
  # Failed test 'itâ��s got a copyright sign'
  # at /tmp/badenc.t line 24.
  # '© 2011, ΣίÏ�Ï�Ï�οÏ�'
  # matches '(?^u:\N{U+A9})'

  # Failed test 'græcum est: non potest legi'
  # at /tmp/badenc.t line 25.
  # '© 2011, ΣίÏ�Ï�Ï�οÏ�'
  # matches '(?^u:\p{sc=Greek})'
  # ¿Qué pasó?
  # Looks like you failed 2 tests of 2.
  /tmp/badenc.t .. Dubious, test returned 2 (wstat 512, 0x200)
  Failed 2/2 subtests

  Test Summary Report
  -------------------
  /tmp/badenc.t (Wstat: 512 Tests: 2 Failed: 2)
  Failed tests: 1-2
  Non-zero exit status: 2
  Files=1, Tests=2, 0 wallclock secs ( 0.03 usr 0.01 sys + 0.05 cusr 0.01 csys = 0.10 CPU)
  Result: FAIL
  Failed 1/1 test programs. 2/2 subtests failed.

If -C0 is required and expected, it should not be left up to the user to remember it.

You can argue that the default Makefile test target should add -C0 to its
options, but that just shuffles the blame off somewhere else.

I'm running with PERL_UNICODE=SA.
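
Because the outcome depends on both the -C / PERL_UNICODE setting and the layers on Test::Builder's handles, it can help to print both from inside a test file. The sketch below is one way to do that, assuming only core facilities: `${^UNICODE}`, `PerlIO::get_layers`, and the Test::Builder handle accessors `output`, `failure_output`, and `todo_output`. The test name is arbitrary.

```perl
use strict;
use warnings;
use Test::More;

# Numeric value of the -C / PERL_UNICODE flags in effect for this process.
diag(sprintf '${^UNICODE} = %d', ${^UNICODE});

# Report the PerlIO layers on each handle that Test::Builder writes to.
for my $meth (qw(output failure_output todo_output)) {
    my $fh     = Test::More->builder->$meth();
    my @layers = PerlIO::get_layers($fh, output => 1);
    diag("$meth layers: @layers");
}

pass('inspected Test::Builder handles');
done_testing();
```

Running this directly and again under a harness, with and without -CS, shows whether a "utf8" layer is present on each handle in each of the four configurations.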

--tom

@p5pRT p5pRT commented Dec 12, 2017

From zefram@fysh.org

Test::More and Test::Builder are maintained as part of the Test-Simple
CPAN distribution, so the core bug queue is the wrong place for this
issue. [perl #109204] should be closed, and an equivalent ticket opened
against Test-Simple. If you're Githubly inclined, Test-Simple's preferred
issue tracker lives there. If you're not a Github-head, though, that
issue tracker won't accept a report from you. The Test-Simple maintainer
is also responsive to tickets filed at rt.cpan.org.

-zefram

@p5pRT p5pRT commented Dec 13, 2017

From @xsawyerx

Created: Test-More/test-more#802.

I'm marking this "Rejected" to reflect that it is rejected from this queue; it now exists in the GitHub issue tracker for Test::More, at the link above.

@p5pRT p5pRT commented Dec 13, 2017

@xsawyerx - Status changed from 'open' to 'rejected'
