Enable named characters via \N{CHARNAME} #8

Merged
merged 1 commit into from

2 participants

@sartak

From http://stackoverflow.com/questions/6162484/why-does-modern-perl-avoid-utf-8-by-default/6163129#6163129 (the big 7)

  1. ☑ decode @ARGV as UTF‑8 strings, and set the encoding of all three of stdin, stdout, and stderr to UTF‑8
  2. □ unicode strings ( #2 )
  3. □ enable unicode warnings ( #1 )
  4. ☑ declare that this source unit is encoded as UTF‑8
  5. ☑ declare that anything that opens a filehandle within this lexical scope but not elsewhere is to assume that that stream is encoded in UTF‑8 unless you tell it otherwise
  6. □ enable named characters via \N{CHARNAME} ( #8 ;) )
  7. ☑ if you have a DATA handle, you must explicitly set its encoding

I don't see any harm in loading the charnames module for users, since it doesn't export any functions, nor does it take particularly long to load.
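
For reference, a minimal sketch of what loading charnames buys the user, assuming the same `:full` and `:short` tags this PR passes along. It is a standalone illustration, not code from this branch. On Perl 5.16 and later `\N{...}` loads charnames automatically, so the explicit `use` only matters on older Perls.

```perl
use strict;
use warnings;
use charnames ':full', ':short';   # same two tags the PR hands to charnames::import

binmode STDOUT, ':encoding(UTF-8)';

my $full  = "\N{GREEK SMALL LETTER SIGMA}";   # :full  => official Unicode name
my $short = "\N{Greek:sigma}";                # :short => script:letter form

printf "full  form: %s (U+%04X)\n", $full,  ord $full;
printf "short form: %s (U+%04X)\n", $short, ord $short;

# charnames exports nothing: its functions stay behind the charnames:: prefix.
my $exported = main->can('vianame') ? 1 : 0;
print "vianame exported into main: ", ($exported ? "yes" : "no"), "\n";
```

Both forms produce the same character, and `vianame` has to be called as `charnames::vianame`, which is why loading the module on the user's behalf does not pollute their namespace.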

@doherty doherty merged commit d7564ff into doherty:master
Commits on Dec 19, 2011
  1. @sartak
Showing with 23 additions and 2 deletions.
  1. +8 −2 lib/utf8/all.pm
  2. +15 −0 t/charnames.t
10 lib/utf8/all.pm
@@ -18,8 +18,10 @@ use 5.010; # state
L<utf8> allows you to write your Perl encoded in UTF-8. That means UTF-8
strings, variable names, and regular expressions. C<utf8::all> goes further, and
makes C<@ARGV> encoded in UTF-8, and filehandles are opened with UTF-8 encoding
-turned on by default (including STDIN, STDOUT, STDERR). If you I<don't> want
-UTF-8 for a particular filehandle, you'll have to set C<binmode $filehandle>.
+turned on by default (including STDIN, STDOUT, STDERR), and charnames are
+imported so C<\N{...}> sequences can be used to compile Unicode characters based
+on names. If you I<don't> want UTF-8 for a particular filehandle, you'll have to
+set C<binmode $filehandle>.
The pragma is lexically-scoped, so you can do the following if you had some
reason to:
@@ -38,6 +40,7 @@ reason to:
=cut
use Encode ();
+use charnames ();
use parent 'utf8';
use parent 'open';
@@ -52,6 +55,9 @@ sub import {
# utf8 by default on filehandles
open::import($class, ':encoding(UTF-8)', ':std');
+ # charnames (\N{...})
+ charnames::import($class, ':full', ':short');
+
# utf8 in @ARGV
state $have_encoded_argv = 0;
_encode_argv() unless $have_encoded_argv++;
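
A rough single-file sketch of the technique in the hunk above: a wrapper pragma calling `charnames`' import from its own `import`, so that the caller's lexical scope gets `\N{...}`. The package name `My::Unicode` is hypothetical and stands in for `utf8::all`, and the `BEGIN` block simulates a `use` line. On Perl 5.16+ `\N{...}` would work even without the wrapper, so this only makes a visible difference on older Perls.

```perl
use strict;
use warnings;

# Hypothetical wrapper pragma, named My::Unicode for illustration only.
package My::Unicode;
use charnames ();            # load the module without enabling it here

sub import {
    # import() runs while the caller is still being compiled, so the hints
    # charnames stores in %^H land in the caller's lexical scope.
    charnames->import(':full', ':short');
}

package main;

my $sigma;
{
    BEGIN { My::Unicode->import }   # what "use My::Unicode;" would do
    $sigma = "\N{GREEK SMALL LETTER SIGMA}";
}
printf "U+%04X\n", ord $sigma;
```

Because the hints are lexically scoped, the effect ends at the closing brace of the block that triggered the import.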
15 t/charnames.t
@@ -0,0 +1,15 @@
+#!perl
+# Test that utf8::all imports charnames for \N
+
+use utf8::all;
+use Test::More tests => 3;
+
+is_deeply "\N{GREEK SMALL LETTER SIGMA} is called sigma.",
+ "σ is called sigma.";
+
+is_deeply "\N{LATIN CAPITAL LETTER E WITH VERTICAL LINE BELOW}",
+ "E̩";
+
+is_deeply charnames::vianame("GOTHIC LETTER AHSA"),
+ 66352;
+
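
As a side note on the third test, here is a small sketch of the name-to-code-point round trip using `charnames::vianame` and `charnames::viacode`. It is a standalone illustration rather than part of the test file.

```perl
use strict;
use warnings;
use charnames ':full';

# vianame maps an official character name to its code point (an integer);
# viacode maps a code point back to the official name.
my $code = charnames::vianame('GOTHIC LETTER AHSA');
my $name = charnames::viacode($code);

printf "%s is U+%04X (decimal %d)\n", $name, $code, $code;
# prints "GOTHIC LETTER AHSA is U+10330 (decimal 66352)"
```

The decimal value 66352 in the test is simply 0x10330 written in base 10.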