New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BBC: Blead Breaks Cpanel::JSON::XS #19842
Comments
This appears to be a result of 7008caa, which removed the long-deprecated utf8n_to_uvuni:
I don't know when or why that function was deprecated, or what replaces it, so I don't know whether it's just down to Reini to fix this or whether we need to reconsider the removal. @khwilliamson? |
Reini should fix it. It looks like its replacement was added in 5.7.1, and is ported via ppport.h to 5.6.1 /* Instead use L</utf8_to_uvchr_buf>, or rarely, L</utf8n_to_uvchr>. This function was useful for code that wanted to handle both EBCDIC and =cut |
I have copypasted the relevant summaries above into an upstream bug report - rurban/Cpanel-JSON-XS#196 |
If we're getting failures on multiple CPAN modules with different maintainers, then perhaps we should re-think the breaking commit? Or at least try to develop a patching strategy that these different maintainers could follow. |
I don't understand the reason of the deprecation, let alone if it was correct. Am I correct to conclude that:
What is entirely unclear to me is what changed in v5.20 that would matter for this function. perl5200delta didn't give me the answer either. In either case I'm kind of surprised by this deprecation, given how little it's in the way of core maintenance. |
You are right about number 1; wrong on 2. They both operate on UTF-EBCDIC on EBCDIC platforms, and both on UTF-8 on ASCIIish ones. The difference is that uvchr knows, for example, that 65 should map to 193. (ASCII 'A' to EBCDIC 'A'). uvuni does no mapping My presumption has been that when uvchr was introduced in 5.7.1, that new code was supposed to use it, and uvuni was kept around for backwards compatibility and for specialist purposes, notably Encode. I presume it was me who deprecated uvuni, and I thought I was following along with the original intent. A module using uvuni is pretty much guaranteed to fail on EBCDIC boxes. It's no extra trouble to type 'chr' instead of 'uni' and remove that failure path. Further, most code shouldn't be using either of these, but instead, utf8_touvchr_buf (By the time that was invented, I suspect, it was realized that the uvuni wasn't the way to go, so there is no 'utf8_to_uvuni_buf' in existence.) The thinking on what to do about malformed multibyte input has evolved over the years, from doing a panic at the most extreme, to now changing it to REPLACEMENT CHARACTERs and continuing on. The code I looked it appeared to be using uvuni so that it could handle malformed input without the core crashing (but I didn't look closely). Best practice now is to use utf8_to_uvchr_buf.. I don't believe that core maintenance is the only reason to deprecate and remove functions. Another is to maximize the chances that a module will work more portably and safely,. The industry has learned various lessons about security holes with UTF-8. (and hence UTF-EBCDIC). That is why I have backported those lessons in Devel::PPPort. Even if a release has a functioning (on well-formed input) version of these functions, ppport.h overrides them with something containing the latest security fixes. Module writers shouldn't have to be burdened by the complexities of these kinds of things, as long as they follow a few principles. One of those is to use uvchr not uvuni We may actually have to revert this removal because of the unavailability of some of the maintainers of high-in-the-CPAN-river modules. I went to submit a patch to Cpanel::JSON::XS, and found the source had already been fixed. But I'd like to continue ithe removal from core for a development release or two. I believe we can afford to do so at this point in the development cycle, and it could be a wake up call for people who are actively maintaining their modules that not only is this out-moded code (since 5.7.1!), but utf8_to_uvchr_buf() is probably what you want to change to. I can write a paragraph or two summary of the issue to send to such authors.for their consideration |
Also affected: SMUELLER/Data-Dumper-Limited-0.03.tar.gz |
utf8n_to_uvuni is not mentioned in perl5371delta |
Also affected: DFARRELL/URI-Encode-XS-0.11.tar.gz |
Also affected: JKUTEJ/XML-Char-0.04.tar.gz |
Also affected: SADAHIRO/Lingua-AR-MacArabic-0.20.tar.gz |
Also affected: SADAHIRO/Unicode-Transform-0.51.tar.gz |
Also affected: SADAHIRO/Lingua-KO-MacKorean-0.20.tar.gz |
I have reverted this in f847c0b. Distributions should convert away from using this function, but the drop-in replacement remains cumbersome to use, and there really isn't yet a good API for this functionality. I will investigate and propose some replacements |
The reversion in f847c0b cleared up most, but not all, of the BBC failures reported in this ticket. While installing some other modules against blead, I noticed that there were still test failures in Data-Dumper-Limited and XML-Char. Since those failures were caused by the same commit, I have reported them both in #19897. I think this ticket is closable now. Does anyone see a reason for keeping it open? |
No one has argued for keeping this ticket open. Closing. |
This is a bug report for perl from "Carlos Guevara" carlos@carlosguevara.com,
generated with the help of perlbug 1.42 running under perl 5.37.1.
[Please describe your issue here]
BBC: Blead Breaks Cpanel::JSON::XS
Please see http://matrix.cpantesters.org/?dist=Cpanel::JSON::XS
[Please do not change anything below this line]
Flags:
category=core
severity=low
Site configuration information for perl 5.37.1:
Configured by cpan at Tue Jun 7 03:40:36 EDT 2022.
Summary of my perl5 (revision 5 version 37 subversion 1) configuration:
Commit id: c31ae9a
Platform:
osname=openbsd
osvers=7.1
archname=OpenBSD.amd64-openbsd
uname='openbsd cjg-openbsd7 7.1 generic#3 amd64 '
config_args='-des -Dprefix=
/bin/perl-blead -Dscriptdir=/bin/perl-blead/bin -Dusedevel -Duse64bitall'hint=recommended
useposix=true
d_sigaction=define
useithreads=undef
usemultiplicity=undef
use64bitint=define
use64bitall=define
uselongdouble=undef
usemymalloc=n
default_inc_excludes_dot=define
Compiler:
cc='cc'
ccflags ='-fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include'
optimize='-O2'
cppflags='-fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include'
ccversion=''
gccversion='OpenBSD Clang 13.0.0'
gccosandvers=''
intsize=4
longsize=8
ptrsize=8
doublesize=8
byteorder=12345678
doublekind=3
d_longlong=define
longlongsize=8
d_longdbl=define
longdblsize=16
longdblkind=3
ivtype='long'
ivsize=8
nvtype='double'
nvsize=8
Off_t='off_t'
lseeksize=8
alignbytes=8
prototype=define
Linker and Libraries:
ld='cc'
ldflags ='-Wl,-E -fstack-protector-strong -L/usr/local/lib'
libpth=/usr/lib/clang/13.0.0/lib /usr/lib /usr/local/lib
libs=-lpthread -lm -lutil -lc
perllibs=-lpthread -lm -lutil -lc
libc=/usr/lib/libc.so.96.1
so=so
useshrplib=false
libperl=libperl.a
gnulibc_version=''
Dynamic Linking:
dlsrc=dl_dlopen.xs
dlext=so
d_dlsymun=undef
ccdlflags=' '
cccdlflags='-DPIC -fPIC '
lddlflags='-shared -fPIC -L/usr/local/lib -fstack-protector-strong'
@inc for perl 5.37.1:
/home/cpan/bin/perl-blead/lib/site_perl/5.37.1/OpenBSD.amd64-openbsd
/home/cpan/bin/perl-blead/lib/site_perl/5.37.1
/home/cpan/bin/perl-blead/lib/5.37.1/OpenBSD.amd64-openbsd
/home/cpan/bin/perl-blead/lib/5.37.1
Environment for perl 5.37.1:
HOME=/home/cpan
LANG (unset)
LANGUAGE (unset)
LC_ALL=C
LD_LIBRARY_PATH (unset)
LOGDIR (unset)
PATH=/home/cpan/bin/perl-blead/bin:/home/cpan/bin:/home/cpan/bin:/bin:/sbin:/usr/bin:/usr/sbin:/usr/X11R6/bin:/usr/local/bin:/usr/local/sbin:/usr/games
PERL_BADLANG (unset)
SHELL=/usr/local/bin/bash
The text was updated successfully, but these errors were encountered: