Fix "Malformed UTF-8 character" warnings on old perls #16

Closed

andrewalker
As per https://rt.cpan.org/Ticket/Display.html?id=101582, the following code in Perl 5.10:

use utf8;
use Types::Standard qw/Dict Int/;
use Type::Params qw/compile/;
compile( Dict [ foo => Int ] );

Will generate the following output:

Malformed UTF-8 character (unexpected continuation byte 0xb8, with no preceding start byte) in subroutine entry at lib/Type/Params.pm line 155.
Malformed UTF-8 character (unexpected non-continuation byte 0x3b, immediately after start byte 0xd0) in subroutine entry at lib/Type/Params.pm line 155.
Malformed UTF-8 character (unexpected non-continuation byte 0x78, immediately after start byte 0xf8) in subroutine entry at lib/Type/Params.pm line 155.
Malformed UTF-8 character (unexpected non-continuation byte 0x3b, immediately after start byte 0xd0) in subroutine entry at lib/Type/Params.pm line 155.
Malformed UTF-8 character (unexpected non-continuation byte 0x3b, immediately after start byte 0xd0) in subroutine entry at lib/Type/Params.pm line 155.
Malformed UTF-8 character (unexpected continuation byte 0xb8, with no preceding start byte) in subroutine entry at lib/Type/Params.pm line 155.
...
(goes on for thousands of lines)

Example: https://travis-ci.org/andrewalker/p5-webservice-digitalocean/jobs/47282523

I have now tested and confirmed that older versions of Perl (5.8, 5.6) also fail.

After 5.10, B::perlstring() was made an alias to B::cstring(), as shown in this commit:

http://perl5.git.perl.org/perl.git/commitdiff/84556172294db864f27a4b5df6dac9127e1e7205

If we replace all instances of B::perlstring with B::cstring in the "compile" subroutine in Type::Params, the issue goes away, and all tests pass.
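
For context, here is a minimal sketch of what the two B functions return and why a code generator would call one of them. It is not the actual Type::Params code: the helper `make_constant_source` and the sample strings are hypothetical, and it assumes a recent perl. Both functions return a double-quoted literal, but perlstring additionally escapes `$` and `@` so that the literal is safe to embed in generated Perl source.

```perl
use strict;
use warnings;
use B ();

# Hypothetical helper (not from Type::Params): build the source of a sub
# that returns a fixed string, quoting the value with B::perlstring.
sub make_constant_source {
    my ($value) = @_;
    return 'sub { ' . B::perlstring($value) . ' }';
}

my $plain = 'foo';
my $sigil = 'costs $5 @ shop';

# For plain ASCII without sigils, both functions yield the same literal.
print B::perlstring($plain), "\n";
print B::cstring($plain), "\n";

# perlstring escapes "$" and "@", so evaluating the generated source
# gives back the original string instead of interpolating variables.
my $code = eval make_constant_source($sigil) or die $@;
print $code->() eq $sigil ? "round trip ok\n" : "round trip failed\n";
```

Because cstring targets C source and leaves `$` and `@` unescaped, swapping it in looks safe only where the quoted strings are known not to contain Perl sigils; that is an observation about the B functions, not a statement about how the maintainer resolved this issue.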

Please tell me if this solution is OK for you, and if I should replace all other cases of perlstring with cstring. If this is a bad idea, would you help me find another solution?

Compiling "Dict" with the utf8 pragma enabled on any Perl version older
than 5.10 causes the interpreter to emit warnings about malformed
UTF-8 characters.

The fix is to replace B::perlstring with B::cstring in the "compile"
subroutine in Type::Params, as they are aliases in recent perls.
tobyink added a commit that referenced this pull request May 18, 2017

tobyink commented May 18, 2017

I did this slightly differently, but thanks for your PR.

tobyink closed this May 18, 2017
jsonn pushed a commit to jsonn/pkgsrc that referenced this pull request Jun 4, 2017
Upstream changes:
1.002000	2017-06-01

 [ Packaging ]
 - Stable version number.

1.001_016	2017-05-30

 [ Documentation ]
 - Include page-numbers.pl example

1.001_015	2017-05-20

 [ Bug Fixes ]
 - Fix HashRef[Str]|Undef|Str parsing on Perl < 5.14.
   Fixes RT#121764.
   Aran Clary Deltac++
   Graham Knop++
   <https://rt.cpan.org/Ticket/Display.html?id=121764>

1.001_014	2017-05-19

 - Include trailing line break at the end of stringified version of some
   exceptions.
   Peter Valdemar Mørch++

1.001_013	2017-05-18	Kittiversary

 [ Bug Fixes ]
 - Fixed crazy amount of UTF-8 warnings from Type::Params on Perl 5.6.x and
   Perl 5.8.x.
   Fixes RT#101582.
   André Walker++
   <https://rt.cpan.org/Ticket/Display.html?id=101582>
   <tobyink/p5-type-tiny#16>
 - StrMatch changes in previous release broke the ability to check type
   equality between two parameterized StrMatch types under some
   circumstances. Changed how the hash key for stashing regexp references
   gets built; it is now closer to the old way. This doesn't revert the
   change in 1.001_012 where regexp checks can be inlined better, but only
   applies to those regexp references that can't easily be inlined.

1.001_012	2017-05-17

 [ BACK COMPAT ]
 - RegexpRef now accepts blessed objects if $object->isa('Regexp') returns
   true.

 [ Other ]
 - StrMatch will use Regexp::Util (if available) to inline regular
   expressions more sensibly.

1.001_011	2017-05-17

 [ Bug Fixes ]
 - Type constraints like Tuple[Int] shouldn't report they have a coercion
   if Int doesn't have a coercion.

 [ Other ]
 - Added: Types::Standard now has a CycleTuple type.

1.001_010	2017-05-16	Puppiversary

 [ Test Suite ]
 - t/00-begin.t will now work around ANDK's apparently broken XS testing
   environment.

1.001_009	2017-05-13

 - Rewrite some benchmarking scripts to use
   Benchmark::Featureset::ParamCheck.
 - Use Ref::Util::XS (if it's installed) to speed up certain type checks.

1.001_008	2017-05-10

 [ Bug Fixes ]
 - Type::Params should make sure Type::Utils is loaded before calling
   english_list().

 [ Documentation ]
 - Rearrange the examples directory in the distribution.

 [ Other ]
 - Added: Named parameter validation benchmarking script.
 - Added: Reduce scope of local $SIG{__DIE__} in Type::Registry.
   Graham Knop++

1.001_007	2017-05-04	May the fourth be with you

 [ Documentation ]
 - Comparison of Type::Params with new(ish) CPAN module
   Params::ValidationCompiler.
 - Show example of how to set defaults for parameters with Type::Params.

 [ Other ]
 - Added: Type::Params' `multisig` function now sets a variable
   `${^TYPE_PARAMS_MULTISIG}` to indicate which signature succeeded.
 - Optimization of Type::Params positional parameter checking for simple
   cases with no slurpy parameter and no coercions.
 - Optimizations for Tuple and StrMatch type constraints from
   Types::Standard.

1.001_006	2017-04-30

 - Allow Type::Tiny's `constraint` parameter to be a string of Perl code.
 - Localize $SIG{__DIE__} in Type::Registry.
   Fixes RT#100780.
   <https://rt.cpan.org/Ticket/Display.html?id=100780>

1.001_005	2017-04-19

 [ Bug Fixes ]
 - 02-api.t should check version of Moose available.
   <tobyink/p5-type-tiny#20>
 - 20-unit/Type-Utils/warnings.t should check version of Test::Warnings.
   Alexandr Ciornii++
   <tobyink/p5-type-tiny#21>
 - Fix minor typos in documentation for Types::Standard.
   Zoffix Znet++
   <tobyink/p5-type-tiny#30>
 - Fix variable name typo in documentation for Type::Params.
   Lucas Buchala++
   <tobyink/p5-type-tiny#37>

 [ Documentation ]
 - Include projected release date for Type::Tiny 1.002000 in NEWS.

 [ Test Suite ]
 - Bundle a test case for GH issue 14.
   <tobyink/p5-type-tiny#14>

 [ Other ]
 - Improved error location reporting for Moo
   Peter Valdemar Mørch++
   <tobyink/p5-type-tiny#35>
 - Updated: NumericCode now coerces from strings with whitespace in them,
   like MooseX::Types::Common::Numeric.
   Denis Ibaev++
   <tobyink/p5-type-tiny#22>

1.001_004	2017-02-06

 - Attempting ArrayRef[Int, Int] or similar now throws an exception.
   Fixes RT#105299.
   Thomas Sibley++
   <https://rt.cpan.org/Ticket/Display.html?id=105299>

1.001_003	2017-02-02

 - Updated: Merge fixes from stable Type-Tiny 1.000006.

1.001_002	2014-10-25

 [ Bug Fixes ]
 - Fix short-circuiting optimizations for parameterized HashRef, ArrayRef,
   ScalarRef, and Map type constraints.
   Fixes RT#99312.
   Marcel Timmerman++
   <https://rt.cpan.org/Ticket/Display.html?id=99312>
 - Inlined version of Types::Standard::Int should check that the value is
   not a reference.

 [ Test Suite ]
 - Fix annoying warning message in test suite with recent versions of
   Exporter::Tiny.

 [ Other ]
 - Make equals/is_a_type_of/is_subtype_of/is_supertype_of in
   Type::Tiny::Union work more like Moose::Meta::TypeConstraint::Union.

1.001_001	2014-09-19

 - Lazy-load Text::Balanced in Type::Parser. (Many parses don't even need
   it.)
 - Lazy-load Type::Tiny::Union in Type::Params.
 - Updated: Prefer Sub::Util over Sub::Name. (The former is smaller.)

1.001_000	2014-09-07

 [ Bug Fixes ]
 - Fix for Type::Registry::DWIM.
   Fixes RT#98458.
   Marcel Montes++
   <https://rt.cpan.org/Ticket/Display.html?id=98458>
 - Fix issues with coercions and native attribute traits with some oldish
   versions of Moose on oldish versions of Perl.
   Fixes RT#98159.
   Peter Flanigan++
   <https://rt.cpan.org/Ticket/Display.html?id=98159>

 [ Documentation ]
 - Updated NEWS file.
 - Updated TODO file.
 - Updates to Type::Tiny::Manual::UsingWithMoose,
   Type::Tiny::Manual::UsingWithMoo, and
   Type::Tiny::Manual::UsingWithMouse.

 [ Test Suite ]
 - Make some of the test case skip_all bits more ambitious; test older
   versions of Moose and Moo than we were testing before.

 [ Other ]
 - Added: Type::Params now provides `compile_named` and `validate_named`
   functions which do the same thing as `compile` and `validate` but are
   better for named arguments.
 - Updated: If Sub::Name is unavailable, but the shiny new core Sub::Util
   is available, then use it instead.
 - Updated: Want Type::Tiny::XS 0.011.
 - `Type::Utils::dwim_type` now allows more control over fallback
   behaviours.