mateu edited this page Apr 7, 2011 · 12 revisions

UTF-8 Manipulation

Enable utf8 pragma

To manipulate UTF-8 strings, enable the utf8 pragma in every script that contains UTF-8 string literals.

use Mojolicious::Lite;

use utf8;

my $name = "おおつか たろう";

This is a basic convention in Perl. Remember to save the script as UTF-8.
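As a rough illustration of what the pragma changes, the sketch below compares the same text as a character string and as UTF-8 bytes. It uses only core Perl and the Encode module; the variable names are just for illustration.

```perl
use strict;
use warnings;
use utf8;                      # string literals in this file are decoded as UTF-8
use Encode qw(encode);

my $name  = "おおつか たろう";       # character string (Perl internal string)
my $bytes = encode('UTF-8', $name);  # the same text as a UTF-8 byte string

my $char_count = length $name;   # length in characters
my $byte_count = length $bytes;  # length in bytes, larger for non-ASCII text

print "$char_count characters, $byte_count bytes\n";
```

Without "use utf8", the literal would be treated as raw bytes, and functions such as length would count bytes instead of characters.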

Request

In Mojolicious, all strings contained in the request, such as parameter values, are converted to Perl internal strings.

# The value of the "foo" parameter is a Perl internal string
my $foo = $self->req->param('foo');

If you save it to a data store such as an RDBMS, you must encode it to a byte string with encode() from the Encode module.

use Encode 'encode';
$foo = encode('UTF-8', $foo);
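The reverse step applies when reading the value back. The following is a minimal sketch of the round trip, assuming the stored bytes come back unchanged; $stored and $loaded are hypothetical names standing in for whatever your storage layer returns.

```perl
use strict;
use warnings;
use utf8;
use Encode qw(encode decode);

my $foo = 'すずき';                      # stands in for a decoded request parameter

my $stored = encode('UTF-8', $foo);      # byte string, ready to write to storage
my $loaded = decode('UTF-8', $stored);   # character string again after reading back
```

Encoding on the way out and decoding on the way in keeps every string inside the application in Perl's internal form.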

Usually, if your DBD driver can convert between Perl internal strings and byte strings itself, you can use that feature instead of encoding by hand.

# SQLite
my $dbh = DBI->connect($data_source, undef, undef, {sqlite_unicode => 1});

# MySQL
my $dbh = DBI->connect($data_source, $user, $password, {mysql_enable_utf8 => 1});

It is best to set this option when connecting to the database rather than afterwards.

Rendering

When rendering HTML, Perl internal strings are automatically converted to UTF-8 byte strings. It is also good practice to specify the character set in the HTML head with a meta tag that uses the "http-equiv" attribute.

get '/' => 'index';
app->start;

__DATA__

@@ index.html.ep
<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
    <title>タイトル</title>
  </head>
  <body>
    コンテンツ
  </body>
</html>

JSON configuration file

When you read configuration from a JSON file with the json_config plugin, the data is converted from UTF-8 byte strings to Perl internal strings, so remember to save the configuration file as UTF-8.

# Load JSON configuration file
plugin 'json_config';
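The same byte-to-character conversion can be sketched outside the plugin. The example below uses the core JSON::PP module as a stand-in (an assumption for illustration, not the parser the plugin itself uses), and the JSON text is a made-up configuration.

```perl
use strict;
use warnings;
use utf8;
use Encode qw(encode);
use JSON::PP ();

# Hypothetical file content: JSON text as raw UTF-8 bytes, as it would be read from disk
my $json_bytes = encode('UTF-8', '{"title":"タイトル"}');

# ->utf8 tells JSON::PP that the input is UTF-8 bytes that need decoding
my $config = JSON::PP->new->utf8->decode($json_bytes);

my $title_length = length $config->{title};   # counted in characters, not bytes
```

If the file were saved in another encoding, the decoding step would fail or produce garbled characters, which is why the file must be UTF-8.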

JSON Rendering

When you render JSON data, the data is converted from Perl internal strings to UTF-8 byte strings, so the strings it contains must be Perl internal strings, not bytes that are already UTF-8 encoded.

# JSON rendering
$self->render_json($data);
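To see why the input should not be encoded beforehand, here is a small sketch using the core JSON::PP module (an assumption for illustration; it is not the encoder Mojolicious uses internally). It compares encoding a character string with encoding a string that was already turned into bytes by hand.

```perl
use strict;
use warnings;
use utf8;
use Encode qw(encode);
use JSON::PP ();

my $json = JSON::PP->new->utf8;   # encodes characters to UTF-8 bytes
my $data = {name => 'すずき'};

# Correct: pass character strings and let the encoder produce UTF-8
my $good = $json->encode($data);

# Mistake: encoding by hand first means the bytes get encoded a second time
my $double = $json->encode({name => encode('UTF-8', $data->{name})});

my $round_trip = $json->decode($good)->{name};     # the original text
my $mangled    = $json->decode($double)->{name};   # garbled text
```

The double-encoded output is longer than the correct one and no longer decodes to the original name.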

Testing

In a test script, enable the utf8 pragma and save the script as UTF-8.

use Test::More tests => 3;
use Test::Mojo;

use utf8;

my $t = Test::Mojo->new(...);

To include non-ASCII text in the query string of a URL, percent-encode it as UTF-8 bytes with url_escape() from Mojo::ByteStream. b() is a shortcut for Mojo::ByteStream->new.

# Test GET request: escape only the parameter value, not the whole URL
use Mojo::ByteStream 'b';
my $name = b('すずき')->encode('UTF-8')->url_escape->to_string;
$t->get_ok("/foo?name=$name")
  ->status_is(200);
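For reference, the escaping step can be sketched with core Perl only. escape_query_value below is a hypothetical helper, not part of Mojolicious; it encodes the value to UTF-8 and percent-encodes every byte outside the unreserved set.

```perl
use strict;
use warnings;
use utf8;
use Encode qw(encode);

# Hypothetical helper, not part of Mojolicious
sub escape_query_value {
  my ($chars) = @_;
  my $bytes = encode('UTF-8', $chars);   # characters to UTF-8 bytes
  # Percent-encode everything except unreserved characters
  $bytes =~ s/([^A-Za-z0-9\-._~])/sprintf '%%%02X', ord $1/ge;
  return $bytes;
}

my $url = '/foo?name=' . escape_query_value('すずき');
# $url is now '/foo?name=%E3%81%99%E3%81%9A%E3%81%8D'
```

Only the parameter value is escaped here. Escaping the whole URL would also encode the "/", "?", and "=" characters that give the URL its structure.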

To post form data in a test, specify the character set as the second argument. All parameter names and values are then converted from Perl internal strings to byte strings.

# Test POST request
$t->post_form_ok('/foo', 'UTF-8' => {name => 'すずき'})
  ->status_is(200);