find_encoding returns Internal encoding #157

Closed
Rikkuru opened this issue May 14, 2021 · 3 comments


Rikkuru commented May 14, 2021

Hi, I hit the error "Unknown encoding 'Internal'" when calling from_to with the name of the encoding object returned by find_encoding.
Here is an example:

perl -e 'use strict; use warnings; use Encode; my $enc = Encode::find_encoding("Unicode"); Encode::from_to("abc", $enc->name, "utf-8");'

I think the Internal encoding object should not be returned if it can't be used.
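For anyone who needs to guard against this on an affected Encode version, here is a minimal sketch of a defensive check. It relies only on Encode::find_encoding and the encoding object's name method; the helper name name_round_trips is hypothetical and not part of Encode. It reports whether the name an encoding object gives for itself can be looked up again, which is what from_to needs.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper (not part of Encode): true only if $name resolves
# to an encoding object AND that object's own name resolves as well.
sub name_round_trips {
    my ($name) = @_;
    my $enc = Encode::find_encoding($name) or return 0;
    return defined Encode::find_encoding( $enc->name ) ? 1 : 0;
}

for my $name ( "utf-8", "Unicode" ) {
    printf "%s: %s\n", $name,
        name_round_trips($name) ? "usable by name" : "not usable by name";
}
```

"Unicode" should come out as not usable either way: on affected versions the object's name ("Internal") cannot be looked up, and on versions where the name is not registered at all find_encoding returns nothing for it.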

pali (Contributor) commented May 14, 2021

It is caused by the following code, which defines the Unicode encoding via the Encode::Internal package.

p5-encode/Encode.pm

Lines 176 to 217 in 67f59dd

if ($ON_EBCDIC) {
    package Encode::UTF_EBCDIC;
    use parent 'Encode::Encoding';
    my $obj = bless { Name => "UTF_EBCDIC" } => "Encode::UTF_EBCDIC";
    Encode::define_encoding($obj, 'Unicode');
    sub decode {
        my ( undef, $str, $chk ) = @_;
        my $res = '';
        for ( my $i = 0 ; $i < length($str) ; $i++ ) {
            $res .=
              chr(
                utf8::unicode_to_native( ord( substr( $str, $i, 1 ) ) )
              );
        }
        $_[1] = '' if $chk;
        return $res;
    }
    sub encode {
        my ( undef, $str, $chk ) = @_;
        my $res = '';
        for ( my $i = 0 ; $i < length($str) ; $i++ ) {
            $res .=
              chr(
                utf8::native_to_unicode( ord( substr( $str, $i, 1 ) ) )
              );
        }
        $_[1] = '' if $chk;
        return $res;
    }
} else {
    package Encode::Internal;
    use parent 'Encode::Encoding';
    my $obj = bless { Name => "Internal" } => "Encode::Internal";
    Encode::define_encoding($obj, 'Unicode');
    sub decode {
        my ( undef, $str, $chk ) = @_;
        utf8::upgrade($str);
        $_[1] = '' if $chk;
        return $str;
    }
    *encode = \&decode;
}

And I'm not sure what the purpose of that code is, because it contains the following implementation of the encode and decode methods:

p5-encode/Encode.pm

Lines 210 to 216 in 67f59dd

    sub decode {
        my ( undef, $str, $chk ) = @_;
        utf8::upgrade($str);
        $_[1] = '' if $chk;
        return $str;
    }
    *encode = \&decode;

@dankogai would probably know more about why it is needed...
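To illustrate why the quoted methods look like a pass-through, here is a small sketch. It copies the body of the quoted decode into a plain function (the name internal_passthrough is hypothetical, and the $chk handling is left out) and checks that utf8::upgrade changes only how the string is stored internally, not which characters it contains.

```perl
use strict;
use warnings;

# Stand-alone copy of the quoted method body as a plain function.
# Hypothetical name; the $chk handling from the original is omitted.
sub internal_passthrough {
    my ($str) = @_;
    utf8::upgrade($str);    # switches internal storage to UTF-8; characters stay the same
    return $str;
}

my $in  = "caf\xE9";                    # four characters, last one is U+00E9
my $out = internal_passthrough($in);

print "same characters: ", ( $out eq $in         ? "yes" : "no" ), "\n";
print "same length: ",     ( length($out) == 4   ? "yes" : "no" ), "\n";
print "UTF8 flag set: ",   ( utf8::is_utf8($out) ? "yes" : "no" ), "\n";
```

Because encode is aliased to the same sub, both directions return the input characters unchanged under this reading.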

dankogai (Owner) commented

Even I cannot recall what this code was for, but since it sits alongside UTF_EBCDIC, it existed BEFORE I took over as maintainer from the late Nick Ing-Simmons. Luckily, virtually no one used Unicode as an encoding name until this report. Since Unicode is not the name of an encoding, this code should be removed.

dankogai added a commit that referenced this issue May 14, 2021
dankogai (Owner) commented

Done. 06591f7
Unicode is no longer a valid encoding name, as it should be.
