Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Can't perform unicode operations in Safe compartment #10188

Closed
p5pRT opened this issue Feb 19, 2010 · 8 comments
Closed

Can't perform unicode operations in Safe compartment #10188

p5pRT opened this issue Feb 19, 2010 · 8 comments

Comments

@p5pRT
Copy link

@p5pRT p5pRT commented Feb 19, 2010

Migrated from rt.perl.org#72942 (status was 'resolved')

Searchable as RT72942$

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Feb 19, 2010

From @timbunce

Basic operations on unicode strings don't work inside a Safe compartment​:

perl -MSafe -MOpcode=full_opset -e '$a=Safe->new; $a->permit(full_opset()); $a->reval(q{ pack("U",0xC4) =~ /\\xE4/i; 1 }) or die $@​ '
Undefined subroutine Safe​::Root0​::utf8​::SWASHNEW called at (eval 5) line 1

Recent (unrelated) changes to Safe have exposed this problem in PostgreSQL
PL/Perl (a high-profile user of Safe). See the pgsql-bugs message
http​://archives.postgresql.org/pgsql-bugs/2010-02/msg00163.php
and the following thread.

The thread includes an outline of what I've learnt from my investigation into
the issue.

I developed a patch to Safe that works around the issue by only sharing
utf8​::SWASHNEW *iff* utf8 is loaded (unconditional sharing doesn't work,
see thread).

On reflection though, I think it would be better for Safe to simply *always*
require utf8 (and ensure utf8_heavy.pl is run) so it can always share
utf8​::SWASHNEW. That's not only cleaner because it hides the implementation
details of perl unicode, but it also fixes the PostgreSQL problem.

I'll followup this post with a patch that does that.

Perl Info

Flags:
    category=library
    severity=high

Site configuration information for perl 5.10.1:

Configured by timbo at Fri Feb  5 17:36:03 GMT 2010.

Summary of my perl5 (revision 5 version 10 subversion 1) configuration:
  Commit id: 5348debf9fd57fc15c26529386769684fab96e57
  Platform:
    osname=darwin, osvers=10.2.0, archname=darwin-thread-multi-2level
    uname='darwin timac.local 10.2.0 darwin kernel version 10.2.0: tue nov 3 10:37:10 pst 2009; root:xnu-1486.2.11~1release_i386 i386 '
    config_args='-des -Doptimize=-g -DEBUGGING=both -Accflags=-DDEBUG_LEAKING_SCALARS -Dusethreads -Dusemultiplicity -Duseshrplib -Dusedevel -Uversiononly -Dprefix=/Users/timbo/pg/perl5101'
    hint=recommended, useposix=true, d_sigaction=define
    useithreads=define, usemultiplicity=define
    useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
    use64bitint=undef, use64bitall=undef, uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='cc', ccflags ='-fno-common -DPERL_DARWIN -no-cpp-precomp -DDEBUG_LEAKING_SCALARS -DDEBUGGING -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include -I/opt/local/include',
    optimize='-g',
    cppflags='-no-cpp-precomp -fno-common -DPERL_DARWIN -no-cpp-precomp -DDEBUG_LEAKING_SCALARS -DDEBUGGING -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include -I/opt/local/include'
    ccversion='', gccversion='4.2.1 (Apple Inc. build 5646)', gccosandvers=''
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=8, prototype=define
  Linker and Libraries:
    ld='env MACOSX_DEPLOYMENT_TARGET=10.3 cc', ldflags =' -fstack-protector -L/usr/local/lib -L/opt/local/lib'
    libpth=/usr/local/lib /opt/local/lib /usr/lib
    libs=-lgdbm -ldbm -ldl -lm -lutil -lc
    perllibs=-ldl -lm -lutil -lc
    libc=/usr/lib/libc.dylib, so=dylib, useshrplib=true, libperl=libperl.dylib
    gnulibc_version=''
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=bundle, d_dlsymun=undef, ccdlflags=' '
    cccdlflags=' ', lddlflags=' -bundle -undefined dynamic_lookup -L/usr/local/lib -L/opt/local/lib -fstack-protector'

Locally applied patches:
    


@INC for perl 5.10.1:
    /Users/timbo/pg/perl5101/lib/5.10.1/darwin-thread-multi-2level
    /Users/timbo/pg/perl5101/lib/5.10.1
    /Users/timbo/pg/perl5101/lib/site_perl/5.10.1/darwin-thread-multi-2level
    /Users/timbo/pg/perl5101/lib/site_perl/5.10.1
    .


Environment for perl 5.10.1:
    DYLD_LIBRARY_PATH=/usr/local/jogl-2.0-macosx-universal/lib::/opt/local/lib/mysql5/mysql:/Users/timbo/perl6/parrot/blib/lib
    HOME=/Users/timbo
    LANG=en_IE.UTF-8
    LANGUAGE (unset)
    LC_ALL=en_IE.UTF-8
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=/usr/local/pgsql/bin:/Users/timbo/pg/perl5101/bin:/Users/timbo/pg/perl5100/bin:/Users/timbo/pg/perl598/bin:/Users/timbo/pg/perl5088/bin:/usr/local/bin:/opt/local/bin:/usr/bin:/bin:/opt/local/sbin:/usr/sbin:/sbin:/Users/timbo/bin
    PERLCRITIC=/Users/timbo/.setdev/perlcriticrc
    PERLTIDY=/Users/timbo/.setdev/perltidyrc
    PERL_BADLANG (unset)
    SHELL=/bin/bash

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Feb 19, 2010

From @timbunce

On Fri, Feb 19, 2010 at 05​:25​:33AM -0800, Tim Bunce wrote​:

Basic operations on unicode strings don't work inside a Safe compartment​:

perl -MSafe -MOpcode=full_opset -e '$a=Safe->new; $a->permit(full_opset()); $a->reval(q{ pack("U",0xC4) =~ /\\xE4/i; 1 }) or die $@​ '
Undefined subroutine Safe​::Root0​::utf8​::SWASHNEW called at (eval 5) line 1

On reflection though, I think it would be better for Safe to simply *always*
require utf8 (and ensure utf8_heavy.pl is run) so it can always share
utf8​::SWASHNEW. That's not only cleaner because it hides the implementation
details of perl unicode, but it also fixes the PostgreSQL problem.

I'll followup this post with a patch that does that.

Here's the patch.

Tim.

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Feb 19, 2010

From @timbunce

Safe-2.22a-swashnew1.patch
diff --git a/Safe.pm b/Safe.pm
index 7453f24..c0a1d3d 100644
--- a/Safe.pm
+++ b/Safe.pm
@@ -41,6 +41,23 @@ use Opcode 1.01, qw(
 
 *ops_to_opset = \&opset;   # Temporary alias for old Penguins
 
+# Regular expressions and other unicode-aware code may need to call
+# utf8->SWASHNEW (via perl's utf8.c).  That will fail unless we share the
+# SWASHNEW method.
+# Sadly we can't just add utf8::SWASHNEW to $default_share because perl's
+# utf8.c code does a fetchmethod on SWASHNEW to check if utf8.pm is loaded,
+# and sharing makes it look like the method exists.
+# The simplest and most robust fix is to ensure the utf8 module is loaded when
+# Safe is loaded. Then we can add utf8::SWASHNEW to $default_share.
+require utf8;
+# we must ensure that utf8_heavy.pl, where SWASHNEW is defined, is loaded
+# but without depending on knowledge of that implementation detail.
+# This code (//i on a unicode string) ensures utf8 is fully loaded
+# and also loads the ToFold SWASH.
+# (Swashes are cached internally by perl in PL_utf8_* variables
+# independent of being inside/outside of Safe. So once loaded they can be)
+do { my $unicode = pack('U',0xC4).'1a'; $unicode =~ /\xE4/i; };
+# now we can safely include utf8::SWASHNEW in $default_share defined below.
 
 my $default_root  = 0;
 # share *_ and functions defined in universal.c
@@ -60,6 +77,7 @@ my $default_share = [qw[
     &utf8::downgrade
     &utf8::native_to_unicode
     &utf8::unicode_to_native
+    &utf8::SWASHNEW
     $version::VERSION
     $version::CLASS
     $version::STRICT
@@ -130,6 +148,7 @@ sub new {
     # the whole glob *_ rather than $_ and @_ separately, otherwise
     # @_ in non default packages within the compartment don't work.
     $obj->share_from('main', $default_share);
+
     Opcode::_safe_pkg_prep($obj->{Root}) if($Opcode::VERSION > 1.04);
     return $obj;
 }
diff --git a/t/safeutf8.t b/t/safeutf8.t
new file mode 100644
index 0000000..28441da
--- /dev/null
+++ b/t/safeutf8.t
@@ -0,0 +1,46 @@
+#!perl -w
+$|=1;
+BEGIN {
+    require Config; import Config;
+    if ($Config{'extensions'} !~ /\bOpcode\b/ && $Config{'osname'} ne 'VMS') {
+        print "1..0\n";
+        exit 0;
+    }
+}
+
+use Test::More tests => 7;
+
+use Safe 1.00;
+use Opcode qw(full_opset);
+
+pass;
+
+my $safe = Safe->new('PLPerl');
+$safe->permit(qw(pack));
+
+# Expression that triggers require utf8 and call to SWASHNEW.
+# Fails with "Undefined subroutine PLPerl::utf8::SWASHNEW called"
+# if SWASHNEW is not shared, else returns true if unicode logic is working.
+my $trigger = q{ my $a = pack('U',0xC4); $a =~ /\\xE4/i };
+
+ok $safe->reval( $trigger ), 'trigger expression should return true';
+is $@, '', 'trigger expression should not die';
+
+# return a closure
+my $sub = $safe->reval(q{sub { warn pack('U',0xC4) }});
+
+# define code outside Safe that'll be triggered from inside
+my @warns;
+$SIG{__WARN__} = sub {
+    my $msg = shift;
+    # this regex requires a different SWASH digit data for \d)
+    # than the one used above and by the trigger code in Safe.pm
+    $msg =~ s/\(eval \d+\)/XXX/i; # uses IsDigit SWASH
+    push @warns, $msg;
+};
+
+is eval { $sub->() }, 1, 'warn should return 1';
+is $@, '', '__WARN__ hook should not die';
+is @warns, 1, 'should only be 1 warning';
+like $warns[0], qr/at XXX line/, 'warning should have been edited';
+
@p5pRT
Copy link
Author

@p5pRT p5pRT commented Feb 20, 2010

From badalex@gmail.com

On Fri, Feb 19, 2010 at 13​:48, Tim Bunce <Tim.Bunce@​pobox.com> wrote​:

On Fri, Feb 19, 2010 at 05​:25​:33AM -0800, Tim Bunce wrote​:

Basic operations on unicode strings don't work inside a Safe compartment​:

perl -MSafe -MOpcode=full_opset -e '$a=Safe->new; $a->permit(full_opset()); $a->reval(q{ pack("U",0xC4) =~ /\\xE4/i; 1 }) or die $@​ '
Undefined subroutine Safe​::Root0​::utf8​::SWASHNEW called at (eval 5) line 1

On reflection though, I think it would be better for Safe to simply *always*
require utf8 (and ensure utf8_heavy.pl is run) so it can always share
utf8​::SWASHNEW. That's not only cleaner because it hides the implementation
details of perl unicode, but it also fixes the PostgreSQL problem.

I'll followup this post with a patch that does that.

Here's the patch.

Tested on x86_64 with 5.10.1 and 5.8.9 and postgres 8.3 and 8.4.

BTW 5.8.9 with Safe 2.16 fails with the below. I was surprised at
first as that version is before the closure changes. But those tests
don't exercise closures, just the utf8 in safe part.
ok 1
not ok 2 - trigger expression should return true
# Failed test 'trigger expression should return true'
# at t/safeutf8.t line 26.
not ok 3 - trigger expression should not die
# Failed test 'trigger expression should not die'
# at t/safeutf8.t line 27.
# got​: ''require' trapped by operation mask at (eval 4) line 1.
# '
# expected​: ''
ok 4 - warn should return 1
ok 5 - __WARN__ hook should not die
ok 6 - should only be 1 warning
ok 7 - warning should have been edited
# Looks like you failed 2 tests of 7.

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Feb 20, 2010

The RT System itself - Status changed from 'new' to 'open'

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Feb 22, 2010

From @rgs

I've applied the patch and released 2.23 to CPAN, thanks. This is not
yet merged to bleadperl due to code freeze, but I think that can be made
safely.

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Feb 22, 2010

From [Unknown Contact. See original ticket]

I've applied the patch and released 2.23 to CPAN, thanks. This is not
yet merged to bleadperl due to code freeze, but I think that can be made
safely.

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Feb 22, 2010

@rgs - Status changed from 'open' to 'resolved'

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
1 participant