New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Encode with strictly valid UTF-8 output once before emitting #37

Closed
wants to merge 1 commit into
base: master
from

Conversation

Projects
None yet
2 participants
@zakame

zakame commented Oct 9, 2018

Prevent double-encoding with UTF-8 especially for META.* files.

Also read input through strictly valid UTF-8 representation.

Possibly fixes #36, needs further testing (hence WIP).

Encode with strictly valid UTF-8 output once before emitting
Prevent double-encoding with UTF-8 especially for META.* files.

Also read input through strictly valid UTF-8 representation.
@Leont

This comment has been minimized.

Show comment
Hide comment
@Leont

Leont Oct 9, 2018

Owner

App::ModuleBuildTiny::dist::get_file returns binary data, but the write_text in regenerate expects text.

I would expect that this bug can be fixed simply by changing that write_text into a write_binary, can you verify this?

Owner

Leont commented Oct 9, 2018

App::ModuleBuildTiny::dist::get_file returns binary data, but the write_text in regenerate expects text.

I would expect that this bug can be fixed simply by changing that write_text into a write_binary, can you verify this?

@zakame

This comment has been minimized.

Show comment
Hide comment
@zakame

zakame Oct 10, 2018

Let's check. Reverting this commit here and replacing it with this change:

diff --git a/lib/App/ModuleBuildTiny.pm b/lib/App/ModuleBuildTiny.pm
index 4c8345e..9f8bf4a 100644
--- a/lib/App/ModuleBuildTiny.pm
+++ b/lib/App/ModuleBuildTiny.pm
@@ -200,7 +200,7 @@ my %actions = (
 
 		my $dist = App::ModuleBuildTiny::Dist->new(regenerate => \%files);
 		for my $filename ($dist->files) {
-			write_text($filename, $dist->get_file($filename)) if $dist->is_generated($filename);
+			write_binary($filename, $dist->get_file($filename)) if $dist->is_generated($filename);
 		}
 		return 0;
 	},
app-modulebuildtiny $ cpanm --reinstall -l ../Hijk/local .
--> Working on .
Configuring App-ModuleBuildTiny-0.023 ... OK
Building and testing App-ModuleBuildTiny-0.023 ... OK
Successfully installed App-ModuleBuildTiny-0.023
1 distribution installed
app-modulebuildtiny $ cd ../Hijk
Hijk $ perl -Ilocal/lib/perl5 local/bin/mbtiny regenerate
Hijk $ head META.yml
head META.yml
---
abstract: 'Fast & minimal low-level HTTP client'
author:
  - 'Kang-min Liu <gugod@gugod.org>'
  - "Ã\x86var Arnfjörð Bjarmason <avar@cpan.org>"
  - 'Borislav Nikolov <jack@sofialondonmoskva.com>'
  - 'Damian Gryski <damian@gryski.com>'
build_requires:
  HTTP::Server::Simple::PSGI: '0'
  Net::Ping: '2.41'
Hijk $ head META.json
head META.json
{
   "abstract" : "Fast & minimal low-level HTTP client",
   "author" : [
      "Kang-min Liu <gugod@gugod.org>",
      "�var Arnfjörð Bjarmason <avar@cpan.org>",
      "Borislav Nikolov <jack@sofialondonmoskva.com>",
      "Damian Gryski <damian@gryski.com>"
   ],
   "dynamic_config" : 0,
   "generated_by" : "App::ModuleBuildTiny version 0.023",
Hijk $ file META.*
META.json: UTF-8 Unicode text
META.yml:  UTF-8 Unicode text

zakame commented Oct 10, 2018

Let's check. Reverting this commit here and replacing it with this change:

diff --git a/lib/App/ModuleBuildTiny.pm b/lib/App/ModuleBuildTiny.pm
index 4c8345e..9f8bf4a 100644
--- a/lib/App/ModuleBuildTiny.pm
+++ b/lib/App/ModuleBuildTiny.pm
@@ -200,7 +200,7 @@ my %actions = (
 
 		my $dist = App::ModuleBuildTiny::Dist->new(regenerate => \%files);
 		for my $filename ($dist->files) {
-			write_text($filename, $dist->get_file($filename)) if $dist->is_generated($filename);
+			write_binary($filename, $dist->get_file($filename)) if $dist->is_generated($filename);
 		}
 		return 0;
 	},
app-modulebuildtiny $ cpanm --reinstall -l ../Hijk/local .
--> Working on .
Configuring App-ModuleBuildTiny-0.023 ... OK
Building and testing App-ModuleBuildTiny-0.023 ... OK
Successfully installed App-ModuleBuildTiny-0.023
1 distribution installed
app-modulebuildtiny $ cd ../Hijk
Hijk $ perl -Ilocal/lib/perl5 local/bin/mbtiny regenerate
Hijk $ head META.yml
head META.yml
---
abstract: 'Fast & minimal low-level HTTP client'
author:
  - 'Kang-min Liu <gugod@gugod.org>'
  - "Ã\x86var Arnfjörð Bjarmason <avar@cpan.org>"
  - 'Borislav Nikolov <jack@sofialondonmoskva.com>'
  - 'Damian Gryski <damian@gryski.com>'
build_requires:
  HTTP::Server::Simple::PSGI: '0'
  Net::Ping: '2.41'
Hijk $ head META.json
head META.json
{
   "abstract" : "Fast & minimal low-level HTTP client",
   "author" : [
      "Kang-min Liu <gugod@gugod.org>",
      "�var Arnfjörð Bjarmason <avar@cpan.org>",
      "Borislav Nikolov <jack@sofialondonmoskva.com>",
      "Damian Gryski <damian@gryski.com>"
   ],
   "dynamic_config" : 0,
   "generated_by" : "App::ModuleBuildTiny version 0.023",
Hijk $ file META.*
META.json: UTF-8 Unicode text
META.yml:  UTF-8 Unicode text
@Leont

This comment has been minimized.

Show comment
Hide comment
@Leont

Leont Oct 10, 2018

Owner

Well that does look like progress , now it's only double encoded instead of triple-encoded.

The other half of this problem appears to be that Module::Metadata doesn't honor =encoding utf8, given me utf-8 encoded data instead of decoded data. I guess I'll have to decode that myself :-/.

Owner

Leont commented Oct 10, 2018

Well that does look like progress , now it's only double encoded instead of triple-encoded.

The other half of this problem appears to be that Module::Metadata doesn't honor =encoding utf8, given me utf-8 encoded data instead of decoded data. I guess I'll have to decode that myself :-/.

@Leont

This comment has been minimized.

Show comment
Hide comment
@Leont

Leont Oct 11, 2018

Owner

I just pushed an encoding branch that should fix this issue.

Owner

Leont commented Oct 11, 2018

I just pushed an encoding branch that should fix this issue.

@Leont

This comment has been minimized.

Show comment
Hide comment
@Leont

Leont Oct 12, 2018

Owner

Released :-)

Owner

Leont commented Oct 12, 2018

Released :-)

@Leont Leont closed this Oct 12, 2018

@zakame zakame deleted the zakame:fix-double-encoding branch Oct 13, 2018

@zakame

This comment has been minimized.

Show comment
Hide comment
@zakame

zakame Oct 13, 2018

Nice, thanks! 👍

zakame commented Oct 13, 2018

Nice, thanks! 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment