Bleadperl 2014-06-25T12:37:43Z breaks CFAERBER/Net-IDN-Encode-2.200.tar.gz #13956

Closed
p5pRT opened this issue Jun 25, 2014 · 8 comments

@p5pRT p5pRT commented Jun 25, 2014

Migrated from rt.perl.org#122179 (status was 'rejected')

Searchable as RT122179$

@p5pRT p5pRT commented Jun 25, 2014

From @andk

git bisect


commit 09edd81
Author: Karl Williamson <public@khwilliamson.com>
Date: Thu Feb 20 21:59:00 2014 -0700

  Use Unicode 7.0

sample fail report


http://www.cpantesters.org/cpan/report/9115cd08-f93b-11e3-bca3-b1a10a370852

perl -V


Summary of my perl5 (revision 5 version 21 subversion 1) configuration:
  Commit id: 62406c8
  Platform:
  osname=linux, osvers=3.14-1-amd64, archname=x86_64-linux-thread-multi
  uname='linux k83 3.14-1-amd64 #1 smp debian 3.14.5-1 (2014-06-05) x86_64 gnulinux '
  config_args='-Dprefix=/home/sand/src/perl/repoperls/installed-perls/perl/v5.21.1/9980 -Dmyhostname=k83 -Dinstallusrbinperl=n -Uversiononly -Dusedevel -des -Ui_db -Duseithreads -Uuselongdouble -DDEBUGGING=-g'
  hint=recommended, useposix=true, d_sigaction=define
  useithreads=define, usemultiplicity=define
  use64bitint=define, use64bitall=define, uselongdouble=undef
  usemymalloc=n, bincompat5005=undef
  Compiler:
  cc='cc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -fwrapv -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
  optimize='-O2 -g',
  cppflags='-D_REENTRANT -D_GNU_SOURCE -fwrapv -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include'
  ccversion='', gccversion='4.8.3', gccosandvers=''
  intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678
  d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
  ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
  alignbytes=8, prototype=define
  Linker and Libraries:
  ld='cc', ldflags =' -fstack-protector -L/usr/local/lib'
  libpth=/usr/local/lib /usr/lib/gcc/x86_64-linux-gnu/4.8/include-fixed /usr/include/x86_64-linux-gnu /usr/lib /lib/x86_64-linux-gnu /lib/../lib /usr/lib/x86_64-linux-gnu /usr/lib/../lib /lib
  libs=-lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lpthread -lc -lgdbm_compat
  perllibs=-lnsl -ldl -lm -lcrypt -lutil -lpthread -lc
  libc=libc-2.19.so, so=so, useshrplib=false, libperl=libperl.a
  gnulibc_version='2.19'
  Dynamic Linking:
  dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
  cccdlflags='-fPIC', lddlflags='-shared -O2 -g -L/usr/local/lib -fstack-protector'

Characteristics of this binary (from libperl):
  Compile-time options: HAS_TIMES MULTIPLICITY PERLIO_LAYERS
  PERL_DONT_CREATE_GVSV
  PERL_HASH_FUNC_ONE_AT_A_TIME_HARD
  PERL_IMPLICIT_CONTEXT PERL_MALLOC_WRAP
  PERL_NEW_COPY_ON_WRITE PERL_PRESERVE_IVUV
  PERL_USE_DEVEL USE_64_BIT_ALL USE_64_BIT_INT
  USE_ITHREADS USE_LARGE_FILES USE_LOCALE
  USE_LOCALE_COLLATE USE_LOCALE_CTYPE
  USE_LOCALE_NUMERIC USE_PERLIO USE_PERL_ATOF
  USE_REENTRANT_API
  Built under linux
  Compiled at Jun 20 2014 21:27:33
  %ENV:
  PERL5LIB=""
  PERL5OPT=""
  PERL5_CPANPLUS_IS_RUNNING="17922"
  PERL5_CPAN_IS_RUNNING="17922"
  PERL_MM_USE_DEFAULT="1"
  @INC:
  /home/sand/src/perl/repoperls/installed-perls/perl/v5.21.1/9980/lib/site_perl/5.21.1/x86_64-linux-thread-multi
  /home/sand/src/perl/repoperls/installed-perls/perl/v5.21.1/9980/lib/site_perl/5.21.1
  /home/sand/src/perl/repoperls/installed-perls/perl/v5.21.1/9980/lib/5.21.1/x86_64-linux-thread-multi
  /home/sand/src/perl/repoperls/installed-perls/perl/v5.21.1/9980/lib/5.21.1
  .
--
andreas

@p5pRT p5pRT commented Jun 25, 2014

From @khwilliamson

I'm rejecting this ticket because the flaws are in the tests.

Unicode will continue to encode characters between 0 and 0x10FFFF. These tests were assuming that certain code points were unassigned, but Unicode 7 has assigned them.

The only code points that are guaranteed never to be assigned are the noncharacters.
Code points unlikely to be assigned are the ones listed as <reserved> in NamesList.txt
http://www.unicode.org/Public/7.0.0/ucd/NamesList.txt
and things like U+03A2, which would be the uppercase of GREEK SMALL LETTER FINAL SIGMA (but that letter has no uppercase form).
--
Karl Williamson
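
The distinction between noncharacters, unassigned code points and assigned ones can be checked from Perl itself. The following is only a minimal sketch: the `classify` helper is hypothetical, and the result for any given code point depends on the Unicode version bundled with the perl that runs it (U+102F3 should come out as assigned only on a perl built with Unicode 7.0 or later).

```perl
use strict;
use warnings;

# Hypothetical helper: classify a code point using Perl's built-in
# Unicode properties. Results depend on the Unicode version bundled
# with the perl that runs this.
sub classify {
    my ($cp) = @_;
    my $ch = chr $cp;
    # Noncharacters also have General_Category=Unassigned, so test them first.
    return 'noncharacter' if $ch =~ /\p{Noncharacter_Code_Point}/;
    return 'unassigned'   if $ch =~ /\p{Unassigned}/;
    return 'assigned';
}

printf "U+%04X %s\n", $_, classify($_) for 0xFDD0, 0x03A2, 0x102F3;
```

A test that relies on a code point being unassigned will therefore change behaviour whenever perl moves to a newer Unicode release, while one that relies on a noncharacter will not.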

@p5pRT p5pRT commented Jun 25, 2014

The RT System itself - Status changed from 'new' to 'open'

@p5pRT p5pRT commented Jun 25, 2014

@khwilliamson - Status changed from 'open' to 'rejected'

@p5pRT p5pRT closed this Jun 25, 2014

@p5pRT p5pRT commented Jun 29, 2014

From CFAERBER@cpan.org

The interesting part is that the tests are the tests provided with Unicode 7.0.0.

@p5pRT p5pRT commented Jul 1, 2014

From @khwilliamson

On Sun Jun 29 03:11:12 2014, cfaerber wrote:

The interesting part is that the tests are the tests provided with
Unicode 7.0.0.

Yes, and my comments in rejecting this ticket were based on ignorance. I'm sorry. It still should have been rejected, though, as it still doesn't appear to me to be a core Perl bug. I looked in more detail at the first error in the CPAN report. It is this:

# Failed test 'to_ascii('ß。𐋳ⴌ\u1DD8') throws error P1 V6 [data/IdnaTest.txt:992]'
# at t/uts46_to_ascii-trans.t line 765.
# got​: 'ss.xn--weg506dvy5n'
# expected​: undef

I looked at that line in the .t file, and it is this:

𐋳ⴌ\x{1DD8}", %p)}, undef, "to_ascii\(\'ß\。𐋳ⴌ\\u1DD8\'\)\ throws\ error\ P1\ V6\ \[data\/IdnaTest\.txt\:992\]") or ($@ and diag($@));

(Most likely you will not have the fonts to display all of this correctly. The one character I don't have in my fonts is U+102F3, COPTIC EPACT NUMBER ONE HUNDRED, newly encoded in Unicode 7.0.) I got that far before, and just assumed the test was supposed to return undef because the code point had not been encoded before, but now is. But I was wrong. There is another reason it is supposed to be undef.

Line 992 from the Unicode 7.0 IdnaTest.txt file is this:
B; ß。𐋳ⴌ\u1DD8; [P1 V6]; [P1 V6] # ß.𐋳ⴌᷘ

(BTW, thanks for cross referencing the line number of the Unicode file in the .t test; it made this a lot easier.)

The brackets indicate that this is supposed to fail, and the codes within the brackets indicate why. I started to follow why it should fail, but it wasn't obvious without more digging than I had time for. So perhaps the test from Unicode is wrong, or the module is buggy. I see that the .t correctly gets failure with the preceding, similar tests.

--
Karl Williamson
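
To make the bracket convention concrete, here is a rough sketch of pulling the expected error codes out of a line like the one quoted from IdnaTest.txt. The `parse_idna_test_line` function and its field names are hypothetical, and the column order (type; source; toUnicode; toASCII) is an assumption based on the quoted line, not a full reading of the file format. It also ignores any rule for blank fields.

```perl
use strict;
use warnings;
use utf8;

# Hypothetical parser for one line in the style of IdnaTest.txt.
# Field names and column order are assumptions for illustration only.
sub parse_idna_test_line {
    my ($line) = @_;
    $line =~ s/\s*#.*//s;              # drop the trailing comment
    return unless $line =~ /\S/;       # blank or comment-only line
    my ($type, $source, $to_unicode, $to_ascii) =
        map { s/^\s+|\s+$//gr } split /;/, $line, -1;
    # A bracketed field such as "[P1 V6]" lists error codes, meaning
    # the conversion is expected to fail.
    my @codes = ($to_ascii // '') =~ /^\[(.*)\]$/ ? split(' ', $1) : ();
    return {
        type        => $type,
        source      => $source,
        expect_fail => (@codes ? 1 : 0),
        codes       => \@codes,
    };
}

my $t = parse_idna_test_line('B; ß。𐋳ⴌ\u1DD8; [P1 V6]; [P1 V6] # ß.𐋳ⴌᷘ');
print "expect failure: @{ $t->{codes} }\n" if $t->{expect_fail};
```

For line 992 this reports the codes P1 and V6, which is why the generated .t test expects undef from to_ascii.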

@p5pRT p5pRT commented Jul 4, 2014

From CFAERBER@cpan.org

Actually, I think your conclusion is correct and this is a bug in the test files supplied with Unicode 7.0.

The error codes P1 and V6 indicate that there is a character that is not 'valid' (and that cannot be 'mapped' in P1). However, U+102F3 is valid according to IdnaMappingTable.txt (line 5557):

102E1..102FB ; valid ; ; NV8 # 7.0 COPTIC EPACT DIGIT ONE..COPTIC EPACT NUMBER NINE HUNDRED

It's not valid in IDNA 2008, though (indicated by "NV8"). However, the tests in IdnaTest.txt are not supposed to test for that.

If I change the module to treat all characters added in Unicode 7.0 as 'invalid', the tests for Net::IDN::Encode complete without error under bleadperl (5.21.1) and earlier perls.

I have already reported the suspected error through the form at www.unicode.org but have not yet received a response.
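
The lookup described above can be sketched in a few lines of Perl. This is not how Net::IDN::Encode does it; `status_for` is a hypothetical helper, and the field layout (range; status; mapping; IDNA2008 status) is inferred from the single mapping-table line quoted in this message.

```perl
use strict;
use warnings;

# Hypothetical lookup against one line in the style of the IDNA mapping table.
# The field layout is an assumption inferred from the quoted line.
sub status_for {
    my ($cp, $line) = @_;
    $line =~ s/\s*#.*//;                       # drop the trailing comment
    my ($range, $status, $mapping, $idna2008) =
        map { s/^\s+|\s+$//gr } split /;/, $line, -1;
    my ($lo, $hi) = map { hex $_ } split /\.\./, $range;
    $hi //= $lo;                               # single code point, no range
    return unless $cp >= $lo && $cp <= $hi;    # line does not cover $cp
    return { status => $status, idna2008 => $idna2008 // '' };
}

my $entry = status_for(0x102F3,
    '102E1..102FB ; valid ; ; NV8 # 7.0 COPTIC EPACT DIGIT ONE..COPTIC EPACT NUMBER NINE HUNDRED');
print "U+102F3: $entry->{status} ($entry->{idna2008})\n";
```

The status comes back as 'valid' with the NV8 flag, so a test that expects P1 or V6 failures solely because of U+102F3 contradicts the mapping table.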

@p5pRT p5pRT commented Jul 11, 2014

From @khwilliamson

On 07/04/2014 04:39 AM, Claus Färber via RT wrote:

Actually, I think your conclusion is correct and this is a bug in the test files supplied with Unicode 7.0.

The error codes P1 and V6 indicate that there is a character that is not 'valid' (and that cannot be 'mapped' in P1). However, U+102F3 is valid according to IdnaMappingTable.txt (line 5557):

102E1..102FB ; valid ; ; NV8 # 7.0 COPTIC EPACT DIGIT ONE..COPTIC EPACT NUMBER NINE HUNDRED

It's not valid in IDNA 2008, though (indicated by "NV8"). However, the tests in IdnaTest.txt are not supposed to test for that.

If I change the module to treat all characters added in Unicode 7.0 as 'invalid', the tests for Net::IDN::Encode complete without error under bleadperl (5.21.1) and earlier perls.

I have already reported the suspected error through the form at www.unicode.org but have not yet received a response.

---
via perlbug: queue: perl5 status: rejected
https://rt-archive.perl.org/perl5/Ticket/Display.html?id=122179

Unicode has now responded, agreeing that the test file was in error, and
creating a new one. See
http://www.unicode.org/errata/#current_errata

The announcement email credits Claus with finding the problem, but
wasn't sent to the public at large. I just sent a private email to
them suggesting they send this to their public email list.
