[Release] ArgumentError: invalid byte sequence in US-ASCII when signing binaries for RC0 #41238

Closed
raulcd opened this issue Apr 16, 2024 · 6 comments

Comments

@raulcd
Member

raulcd commented Apr 16, 2024

Describe the bug, including details regarding any error messages, version, and platform.

I hit an error while signing binaries locally with:

$ dev/release/05-binary-upload.sh 16.0.0 0

I am not entirely sure if it's due to my locale settings or due to my GPG key containing non-ASCII characters.

perl: warning: Falling back to the standard locale ("C").
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
	LANGUAGE = (unset),
	LC_ALL = (unset),
	LC_ADDRESS = "es_ES.UTF-8",
	LC_NAME = "es_ES.UTF-8",
	LC_MONETARY = "es_ES.UTF-8",
	LC_PAPER = "es_ES.UTF-8",
	LC_IDENTIFICATION = "es_ES.UTF-8",
	LC_TELEPHONE = "es_ES.UTF-8",
	LC_MEASUREMENT = "es_ES.UTF-8",
	LC_CTYPE = "C",
	LC_TIME = "es_ES.UTF-8",
	LC_NUMERIC = "es_ES.UTF-8",
	LANG = "C"
    are supported and installed on your system.
perl: warning: Falling back to the standard locale ("C").
Signing: almalinux-9 aarch64 -  10.2% [6/59] 00:00:02 00:00:20  2/s
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
	LANGUAGE = (unset),
	LC_ALL = (unset),
	LC_ADDRESS = "es_ES.UTF-8",
	LC_NAME = "es_ES.UTF-8",
	LC_MONETARY = "es_ES.UTF-8",
	LC_PAPER = "es_ES.UTF-8",
	LC_IDENTIFICATION = "es_ES.UTF-8",
	LC_TELEPHONE = "es_ES.UTF-8",
	LC_MEASUREMENT = "es_ES.UTF-8",
	LC_CTYPE = "C",
	LC_TIME = "es_ES.UTF-8",
	LC_NUMERIC = "es_ES.UTF-8",
	LANG = "C"
    are supported and installed on your system.
perl: warning: Falling back to the standard locale ("C").
rake aborted!
ArgumentError: invalid byte sequence in US-ASCII
/host/binary-task.rb:881:in `==='
/host/binary-task.rb:881:in `block in valid_sign?'
/host/binary-task.rb:867:in `pipe'
/host/binary-task.rb:867:in `valid_sign?'
/host/binary-task.rb:887:in `sign'
/host/binary-task.rb:930:in `block in sign_dir'
/host/binary-task.rb:929:in `each'
/host/binary-task.rb:929:in `sign_dir'
/host/binary-task.rb:1733:in `block (5 levels) in define_yum_rc_tasks'
/host/binary-task.rb:1729:in `glob'
/host/binary-task.rb:1729:in `glob'
/host/binary-task.rb:1729:in `block (4 levels) in define_yum_rc_tasks'
/host/binary-task.rb:1721:in `each'
/host/binary-task.rb:1721:in `block (3 levels) in define_yum_rc_tasks'
/usr/share/rubygems-integration/all/gems/rake-13.0.3/exe/rake:27:in `<top (required)>'
Tasks: TOP => yum:rc => yum:rc:update
(See full trace by running task with --trace)
Connection to 127.0.0.1 closed.
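
The failure in the trace can be reproduced in isolation. This is a minimal sketch, not the code in `binary-task.rb`: the sample `gpg` output line, the pattern, and the helper name `good_signature?` are made up for illustration. It assumes only that a regular expression is matched against text that holds UTF-8 bytes while Ruby treats it as US-ASCII, which is what the default external encoding becomes when `LC_CTYPE` resolves to `C`.

```ruby
# Hypothetical line of gpg output; the uid contains a non-ASCII character.
utf8_line = "gpg: Good signature from \"Raúl Example\""

# Simulate reading the same bytes under LC_CTYPE=C, where Ruby tags
# external input as US-ASCII.
ascii_line = utf8_line.dup.force_encoding(Encoding::US_ASCII)

# Hypothetical helper. Regexp#=== is what `case ... when /.../` calls.
def good_signature?(line)
  /Good signature/ === line
end

puts good_signature?(utf8_line) # valid UTF-8 matches without error

begin
  good_signature?(ascii_line)
rescue ArgumentError => e
  # The bytes of "ú" are not valid US-ASCII, so the match itself raises.
  puts "#{e.class}: #{e.message}"
end
```

The second call raises before any matching happens, because the string's bytes are invalid for the encoding it is tagged with. That would be consistent with the error surfacing at the `===` frame in the trace above.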

Component(s)

Release

kou added a commit to kou/arrow that referenced this issue Apr 16, 2024
… binary

We may have non ASCII characters in the process. For example, PGP uid
may include non ASCII characters.
@raulcd
Member Author

raulcd commented Apr 16, 2024

I tried:

diff --git a/dev/release/binary/runner.sh b/dev/release/binary/runner.sh
index 465d60d..d92d1cd 100755
--- a/dev/release/binary/runner.sh
+++ b/dev/release/binary/runner.sh
@@ -19,7 +19,7 @@
 
 set -u
 
-export LANG=C
+export LANG=C.UTF-8
 
 target_dir=/host/binary/tmp
 original_owner=$(stat --format=%u ${target_dir})

and got the same error:

perl: warning: Falling back to a fallback locale ("C.UTF-8").
Signing: almalinux-9 aarch64 -  10.2% [6/59] 00:00:02 00:00:20  2/s
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
	LANGUAGE = (unset),
	LC_ALL = (unset),
	LC_ADDRESS = "es_ES.UTF-8",
	LC_NAME = "es_ES.UTF-8",
	LC_MONETARY = "es_ES.UTF-8",
	LC_PAPER = "es_ES.UTF-8",
	LC_IDENTIFICATION = "es_ES.UTF-8",
	LC_TELEPHONE = "es_ES.UTF-8",
	LC_MEASUREMENT = "es_ES.UTF-8",
	LC_CTYPE = "C",
	LC_TIME = "es_ES.UTF-8",
	LC_NUMERIC = "es_ES.UTF-8",
	LANG = "C.UTF-8"
    are supported and installed on your system.
perl: warning: Falling back to a fallback locale ("C.UTF-8").
rake aborted!
ArgumentError: invalid byte sequence in US-ASCII
/host/binary-task.rb:881:in `==='

@raulcd
Member Author

raulcd commented Apr 16, 2024

@kou feel free to sign and upload yourself to trigger the binary verification so I can send the vote tomorrow. I am going to sleep now :)

@kou
Member

kou commented Apr 16, 2024

OK! I'll do it!

@kou
Member

kou commented Apr 17, 2024

Ah, you define not only LANG but also LC_* explicitly. Our script should override them too. (LC_CTYPE is important in this case.)
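
To illustrate why changing only `LANG` had no effect, here is a small sketch of the POSIX lookup order for a locale category: `LC_ALL` first, then the category's own variable, then `LANG`. The helper `effective_locale` is hypothetical and works on a plain hash rather than the real process environment. The fallback to `"C"` when nothing is set is an assumption about the typical default.

```ruby
# Resolve the locale a category would use, following the POSIX order:
# LC_ALL, then the category variable itself, then LANG.
# Empty values count as unset.
def effective_locale(env, category)
  ["LC_ALL", category, "LANG"].each do |name|
    value = env[name]
    return value unless value.nil? || value.empty?
  end
  "C" # assumed default when none of the variables is set
end

# Subset of the environment from the second log, after the LANG patch.
patched = {
  "LANG" => "C.UTF-8",
  "LC_CTYPE" => "C",
  "LC_TIME" => "es_ES.UTF-8",
}

puts effective_locale(patched, "LC_CTYPE")    # LC_CTYPE is set, so LANG is ignored
puts effective_locale(patched, "LC_MESSAGES") # not set, so LANG applies
```

With `LC_CTYPE = "C"` still set, the character encoding category stays at `C` no matter what `LANG` says, which matches the unchanged error after the patch.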

kou added a commit that referenced this issue Apr 17, 2024
#41242)

### Rationale for this change

We may have non ASCII characters in the process. For example, PGP uid may include non ASCII characters.

### What changes are included in this PR?

Use `LANG=C.UTF-8` and `LC_*=C.UTF-8` to use UTF-8 as the default encoding.

### Are these changes tested?

Yes. I used this for 16.0.0 RC0.

### Are there any user-facing changes?

No.
* GitHub Issue: #41238

Authored-by: Sutou Kouhei <kou@clear-code.com>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>
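
The change itself lives in the release shell scripts and is not shown in this thread. As a rough illustration of the idea only, the hypothetical helper `with_utf8_locale` below takes an environment hash and points `LANG` and every `LC_*` variable at `C.UTF-8`, leaving unrelated variables alone. The sample environment is an assumption based on the logs above.

```ruby
# Return a copy of env where LANG, LC_CTYPE, and every LC_* variable that
# is already set use the given UTF-8 locale. Other variables are untouched.
def with_utf8_locale(env, locale = "C.UTF-8")
  overridden = env.to_h do |name, value|
    [name, name.start_with?("LC_") ? locale : value]
  end
  # Set these two even if they were missing from the input.
  overridden.merge("LANG" => locale, "LC_CTYPE" => locale)
end

# Assumed subset of the environment seen in the report.
reported = {
  "LANG" => "C",
  "LC_CTYPE" => "C",
  "LC_TIME" => "es_ES.UTF-8",
  "HOME" => "/root",
}

fixed = with_utf8_locale(reported)
fixed.each { |name, value| puts "#{name}=#{value}" }
```

A hash like `fixed` could be passed as the first argument to `Kernel#system` or `Kernel#spawn` so that a child process sees a consistent UTF-8 locale.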
@kou
Member

kou commented Apr 17, 2024

Issue resolved by pull request 41242
#41242

kou added this to the 17.0.0 milestone Apr 17, 2024
kou closed this as completed Apr 17, 2024
@kou
Member

kou commented Apr 17, 2024

> sign and upload yourself to trigger the binary verification

Done!
#41235 (comment)

raulcd modified the milestones: 17.0.0, 16.1.0 Apr 19, 2024
raulcd pushed a commit that referenced this issue Apr 29, 2024
#41242)
tolleybot pushed a commit to tmct/arrow that referenced this issue May 2, 2024
… binary (apache#41242)
tolleybot pushed a commit to tmct/arrow that referenced this issue May 4, 2024
… binary (apache#41242)
rok pushed a commit to tmct/arrow that referenced this issue May 8, 2024
… binary (apache#41242)
rok pushed a commit to tmct/arrow that referenced this issue May 8, 2024
… binary (apache#41242)
vibhatha pushed a commit to vibhatha/arrow that referenced this issue May 25, 2024
… binary (apache#41242)