mauke commented Oct 11, 2025

-CA (also known as PERL_UNICODE=A) tells perl to treat the command-line arguments as UTF-8. Until now this applied only to the elements of @ARGV, not to the global variables implicitly created by -s. It makes little sense for half of a command line to be treated as UTF-8, so this patch makes -CA apply to everything.

Fixes #23377.


  • This set of changes requires a perldelta entry, and it is included.

mauke added 3 commits October 11, 2025 23:27
- move "bug id" comment where it was originally placed: before the
  "#!perl -s" test
- add GH ticket number and short description
- remove 2-arg open

tonycoz left a comment

This seems like a reasonable change, though I wonder whether we should handle normalization forms (a much bigger issue that is out of scope for this change).

Grinnz commented Oct 12, 2025

I think if we are to try to improve -C, the first step should be a second switch (or an additional flag to -C) that properly uses :encoding(UTF-8) layers. But this seems like a fine enhancement to -CA.

mauke merged commit 87a16f6 into Perl:blead Oct 13, 2025
34 checks passed
mauke deleted the fix-23377-decode-argv-s-switches branch October 13, 2025 04:24

Linked issue: Values from command-line switch -s are not in UTF-8
