Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Safety for -i option #15216

Closed
p5pRT opened this issue Mar 6, 2016 · 38 comments
Labels

Comments

@p5pRT
Copy link
Collaborator

@p5pRT p5pRT commented Mar 6, 2016

Migrated from rt.perl.org#127663 (status was 'resolved')

Searchable as RT127663$

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented Mar 6, 2016

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented Mar 6, 2016

From @jiangyy

Regards,
Yanyan Jiang 蒋炎岩
Institute of Computer Software,
Dept. of Computer Science, Nanjing University

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented Mar 6, 2016

From @jkeenan

On Sun Mar 06 02​:32​:46 2016, jiangyy@​outlook.com wrote​:

Regards,
Yanyan Jiang 蒋炎岩
Institute of Computer Software,
Dept. of Computer Science, Nanjing University

Since the bug report was attached with a file extension which RT reports as a binary file, the report may not be visible. I am re-attaching as a plain-text file.

Thank you very much.

--
James E Keenan (jkeenan@​cpan.org)

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented Mar 6, 2016

From @jkeenan

To​: perlbug@​perl.org
Subject​: Safety for -i option
From​: jiangyy@​outlook.com
Message-Id​: <5.22.1_10199_1457258711@​ubuntuvm>
Reply-To​: jiangyy@​outlook.com

This is a bug report for perl from jiangyy@​outlook.com,
generated with the help of perlbug 1.40 running under perl 5.22.1.


Like sed, perl can be used with -i to change files in-place.

However, our tool discovered that the saving procedure is not as
safe as sed. The system call trace (from 5.22.1)​:

  open("file.txt", O_RDONLY|O_LARGEFILE) = 3
  _llseek(3, 0, [0], SEEK_CUR) = 0
  unlink("file.txt") = 0
  open("file.txt", O_WRONLY|O_CREAT|O_EXCL|O_LARGEFILE, 0600) = 4
  read(3, ...)
  read(3, ...)
  write(4, ...)
  ...

If the program terminates in between, the file-system runs out of
space (when the replaced text is longer) or the system crashes, the
contents may lost (the worst case, completely gone due to the unlink).

sed uses a temporary file to get the job and rename it. But it seems
difficult to work considering portability. Many infrastructures (e.g.,
gtk and qt) provide portable solution, but seems not to apply with perl.

Thank you for your attention!



Flags​:
  category=core
  severity=medium


Site configuration information for perl 5.22.1​:

Configured by jyy at Sun Mar 6 04​:34​:47 EST 2016.

Summary of my perl5 (revision 5 version 22 subversion 1) configuration​:
 
  Platform​:
  osname=linux, osvers=4.2.0-27-generic, archname=i686-linux
  uname='linux ubuntuvm 4.2.0-27-generic #32~14.04.1-ubuntu smp fri jan 22 15​:32​:27 utc 2016 i686 i686 i686 gnulinux '
  config_args='-ds -e'
  hint=recommended, useposix=true, d_sigaction=define
  useithreads=undef, usemultiplicity=undef
  use64bitint=undef, use64bitall=undef, uselongdouble=undef
  usemymalloc=n, bincompat5005=undef
  Compiler​:
  cc='cc', ccflags ='-fwrapv -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
  optimize='-O2',
  cppflags='-fwrapv -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include'
  ccversion='', gccversion='4.8.4', gccosandvers=''
  intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234, doublekind=3
  d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12, longdblkind=3
  ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
  alignbytes=4, prototype=define
  Linker and Libraries​:
  ld='cc', ldflags =' -fstack-protector -L/usr/local/lib'
  libpth=/usr/local/lib /usr/lib/gcc/i686-linux-gnu/4.8/include-fixed /usr/include/i386-linux-gnu /usr/lib /lib/i386-linux-gnu /lib/../lib /usr/lib/i386-linux-gnu /usr/lib/../lib /lib
  libs=-lpthread -lnsl -ldl -lm -lcrypt -lutil -lc
  perllibs=-lpthread -lnsl -ldl -lm -lcrypt -lutil -lc
  libc=libc-2.19.so, so=so, useshrplib=false, libperl=libperl.a
  gnulibc_version='2.19'
  Dynamic Linking​:
  dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
  cccdlflags='-fPIC', lddlflags='-shared -O2 -L/usr/local/lib -fstack-protector'


@​INC for perl 5.22.1​:
  /usr/local/lib/perl5/site_perl/5.22.1/i686-linux
  /usr/local/lib/perl5/site_perl/5.22.1
  /usr/local/lib/perl5/5.22.1/i686-linux
  /usr/local/lib/perl5/5.22.1
  .


Environment for perl 5.22.1​:
  HOME=/home/jyy
  LANG=en_US.UTF-8
  LANGUAGE (unset)
  LD_LIBRARY_PATH (unset)
  LOGDIR (unset)
  PATH=/usr/local/sbin​:/usr/local/bin​:/usr/sbin​:/usr/bin​:/sbin​:/bin​:/usr/games​:/usr/local/games
  PERL_BADLANG (unset)
  SHELL=/bin/bash

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented Mar 6, 2016

From [Unknown Contact. See original ticket]

On Sun Mar 06 02​:32​:46 2016, jiangyy@​outlook.com wrote​:

Regards,
Yanyan Jiang 蒋炎岩
Institute of Computer Software,
Dept. of Computer Science, Nanjing University

Since the bug report was attached with a file extension which RT reports as a binary file, the report may not be visible. I am re-attaching as a plain-text file.

Thank you very much.

--
James E Keenan (jkeenan@​cpan.org)

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented Mar 6, 2016

From @jkeenan

On Sun Mar 06 02​:32​:46 2016, jiangyy@​outlook.com wrote​:

Regards,
Yanyan Jiang 蒋炎岩
Institute of Computer Software,
Dept. of Computer Science, Nanjing University

From original report​:
#####
Like sed, perl can be used with -i to change files in-place. However, our tool discovered that the saving procedure is not as safe as sed. The system call trace (from 5.22.1)​:

open("file.txt", O_RDONLY|O_LARGEFILE) = 3
_llseek(3, 0, [0], SEEK_CUR) = 0
unlink("file.txt") = 0

open("file.txt", O_WRONLY|O_CREAT|O_EXCL|O_LARGEFILE, 0600) = 4
read(3, ...)
read(3, ...)
write(4, ...) ...
#####

Can you supply us with​: (a) the list of commands you invoked at the command-line to get these results; (b) some idea of the size of the file in question relative to the size of memory?

Thank you very much.

--
James E Keenan (jkeenan@​cpan.org)

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented Mar 6, 2016

The RT System itself - Status changed from 'new' to 'open'

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented Mar 6, 2016

From @jhi

Can you supply us with​: (a) the list of commands you invoked at the
command-line to get these results; (b) some idea of the size of the
file in question relative to the size of memory?

Thank you very much.

Also​:

(c) You said​: "sed uses a temporary file to get the job and rename it. But it seems
difficult to work considering portability. Many infrastructures (e.g.,
gtk and qt) provide portable solution, but seems not to apply with perl."

Can you elaborate on the portable solutions that gtk and qt provide?

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented Mar 6, 2016

From @mauke

On Sun Mar 06 05​:58​:12 2016, jkeenan wrote​:

On Sun Mar 06 02​:32​:46 2016, jiangyy@​outlook.com wrote​:

Regards,
Yanyan Jiang 蒋炎岩
Institute of Computer Software,
Dept. of Computer Science, Nanjing University

From original report​:
#####
Like sed, perl can be used with -i to change files in-place. However,
our tool discovered that the saving procedure is not as safe as sed.
The system call trace (from 5.22.1)​:

open("file.txt", O_RDONLY|O_LARGEFILE) = 3
_llseek(3, 0, [0], SEEK_CUR) = 0
unlink("file.txt") = 0

open("file.txt", O_WRONLY|O_CREAT|O_EXCL|O_LARGEFILE, 0600) = 4
read(3, ...)
read(3, ...)
write(4, ...) ...
#####

Can you supply us with​: (a) the list of commands you invoked at the
command-line to get these results; (b) some idea of the size of the
file in question relative to the size of memory?

Here's my attempt​:

% echo hi > tmp.txt
% strace -o strace.log perl -i -pe '' tmp.txt

Excerpt from strace.log​:

open("tmp.txt", O_RDONLY|O_LARGEFILE) = 3
ioctl(3, TCGETS, 0xbfdd070c) = -1 ENOTTY (Inappropriate ioctl for device)
_llseek(3, 0, [0], SEEK_CUR) = 0
fstat64(3, {st_mode=S_IFREG|0644, st_size=3, ...}) = 0
fcntl64(3, F_SETFD, FD_CLOEXEC) = 0
unlink("tmp.txt") = 0
open("tmp.txt", O_WRONLY|O_CREAT|O_EXCL|O_LARGEFILE, 0600) = 4
ioctl(4, TCGETS, 0xbfdd070c) = -1 ENOTTY (Inappropriate ioctl for device)
_llseek(4, 0, [0], SEEK_CUR) = 0
fstat64(4, {st_mode=S_IFREG|0600, st_size=0, ...}) = 0
fcntl64(4, F_SETFD, FD_CLOEXEC) = 0
fstat64(4, {st_mode=S_IFREG|0600, st_size=0, ...}) = 0
fchmod(4, 0100644) = 0
read(3, "hi\n", 8192) = 3
read(3, "", 8192) = 0
write(4, "hi\n", 3) = 3
close(4) = 0
close(3) = 0

So, the file is tiny in this case (not sure why that matters?). Perl opens the input file (fd #3), unlinks it, then opens the same name again (fd #4), then streams data from fd 3 to fd 4.

If perl dies after the unlink() but before it is done writing to fd 4 and closing it, you get a truncated (or completely missing) output file.

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented Mar 6, 2016

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented Mar 6, 2016

From @jhi

On Sun Mar 06 07​:15​:04 2016, jhi wrote​:

Can you supply us with​: (a) the list of commands you invoked at the
command-line to get these results; (b) some idea of the size of the
file in question relative to the size of memory?

Thank you very much.

Also​:

(c) You said​: "sed uses a temporary file to get the job and rename it.
But it seems
difficult to work considering portability. Many infrastructures (e.g.,
gtk and qt) provide portable solution, but seems not to apply with
perl."

Can you elaborate on the portable solutions that gtk and qt provide?

I got the below email from jiangyy@​outlook.com​:

-- cut here --

Hi Jarkko,

My reply of bug #127663 is not appearing in the bug tracking system (I just replied the mail, sending to perlbug-followup@​perl.org with subject “Re​: [perl #127663] Safety for -i option”, and I have no idea why that does not work). I listed the comments below. Maybe you can post it.


Sed just uses rename() to replace the file with a temporary one, seems it is assuming a POSIX runtime, and this is POSIX safe. Gtk provides g_file_replace(), and Qt provides QSaveFile. Both are portable. We extensively tested these two implementations, and they are both safe in handling file overwrite.

We believe that perl is an extremely portable software, and semantics of rename() may be different on other platforms, and this shall be handled with care (though I’m not an expert on portability).

Regards,
Yanyan Jiang 蒋炎岩
Institute of Computer Software,
Dept. of Computer Science, Nanjing University

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented Mar 7, 2016

From @jiangyy

Can you supply us with​: (a) the list of commands you invoked at the command-line to get these results; (b) some idea of the size of the file in question relative to the size of memory?

For perl, I just used a simple case of in-place text replacement​:

  perl5.22.1 -i -pe 's/old/new/g’ file.txt

I get the system-call trace via

  strace COMMAND

The file is small (just kilobytes). If the program terminates just after unlink(), the file is gone. I simulated this process by killing it immediately after unlink(), and the file is indeed gone. If the file contents are huge, the overwrite itself can cause inconsistency (the first half is updated, the second half is old, and there are some corruptions in the middle).

Can you elaborate on the portable solutions that gtk and qt provide?

Sed just uses rename() to replace the file with a temporary one, seems it is assuming a POSIX runtime, and this is POSIX safe. Gtk provides g_file_replace(), and Qt provides QSaveFile. Both are portable. We extensively tested these two implementations, and they are both safe in handling file overwrite.

We believe that perl is an extremely portable software, and semantics of rename() may be different on other platforms, and this shall be handled with care (though I’m not an expert on portability).

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented Mar 8, 2016

From @tonycoz

On Sun Mar 06 02​:32​:46 2016, jiangyy@​outlook.com wrote​:

the file-system runs out of space (when the replaced text is longer)

It isn't necessary for the replaced text to be longer.

We're unlinking the file, but keeping a file handle open to it. On a POSIX system the file will continue to take space until the file handle is closed.

Tony

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented Apr 5, 2016

From @iabyn

On Mon, Mar 07, 2016 at 04​:10​:03PM -0800, Tony Cook via RT wrote​:

On Sun Mar 06 02​:32​:46 2016, jiangyy@​outlook.com wrote​:

the file-system runs out of space (when the replaced text is longer)

It isn't necessary for the replaced text to be longer.

We're unlinking the file, but keeping a file handle open to it. On a
POSIX system the file will continue to take space until the file handle
is closed.

For anyone following this ticket, a simple demonstration of why -i is
currently unsafe​:

Here foo gets completely truncated​:

  $ echo "hello" > foo; ./perl -i -pe'die' foo
  Died at -e line 1, <> line 1.
  $ ls -l foo
  -rw-rw-r--. 1 davem davem 0 Apr 5 15​:49 foo
  $
 
and here foo gets partially truncated​:

  $ perl -le'print "a" x 80 for 1..10_000' > foo
  $ ls -l foo
  -rw-rw-r--. 1 davem davem 810000 Apr 5 15​:51 foo
  $ ./perl -i -pe'die if $. == 9_900' foo
  Died at -e line 1, <> line 9900.
  $ ls -l foo
  -rw-rw-r--. 1 davem davem 801819 Apr 5 15​:52 foo
  $

--
It's not that I'm afraid to die, I just don't want to be there when it
happens.
  -- Woody Allen

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented Jun 21, 2016

From @tonycoz

On Tue Apr 05 07​:55​:14 2016, davem wrote​:

On Mon, Mar 07, 2016 at 04​:10​:03PM -0800, Tony Cook via RT wrote​:

On Sun Mar 06 02​:32​:46 2016, jiangyy@​outlook.com wrote​:

the file-system runs out of space (when the replaced text is longer)

It isn't necessary for the replaced text to be longer.

We're unlinking the file, but keeping a file handle open to it. On a
POSIX system the file will continue to take space until the file handle
is closed.

For anyone following this ticket, a simple demonstration of why -i is
currently unsafe​:

Here foo gets completely truncated​:

$ echo "hello" > foo; \./perl \-i \-pe'die' foo
Died at \-e line 1\, \<> line 1\.
$ ls \-l foo
\-rw\-rw\-r\-\-\. 1 davem davem 0 Apr  5 15&#8203;:49 foo
$ 

and here foo gets partially truncated​:

$ perl \-le'print "a" x 80 for 1\.\.10\_000' > foo
$ ls \-l foo
\-rw\-rw\-r\-\-\. 1 davem davem 810000 Apr  5 15&#8203;:51 foo
$ \./perl \-i \-pe'die if $\. == 9\_900' foo
Died at \-e line 1\, \<> line 9900\.
$ ls \-l foo
\-rw\-rw\-r\-\-\. 1 davem davem 801819 Apr  5 15&#8203;:52 foo
$

One problem I have with this example is I'm not sure die should be treated as a failure case.

Should a similar case where exit() is called instead of die() revert any edits?

If not, I don't see a reliable mechanism to distinguish the two.

For the standard -n or -p generated loop it's fine because the user can expect iterating to the next in-place file will close the old ARGVOUT and do whatever extra cleanup is needed to replace the input file with the output (nothing currently, a rename for my working branch), but what if the user "last"s out of the inplace loop for some reason?

Tony

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented Dec 6, 2016

From @tonycoz

The attached patch attempts to fix this issue.

It also fixes an issue with nested in-place editing, where the inner
in-place edit could overwrite the permissions referenced for the outer
edit, which are used to restore set[gu]id flags on the output file.

Tony

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented Dec 6, 2016

From @tonycoz

in-place-edit.patch
From ecdd0c8dc1cc35cdace9f67e5e08f7822e12813c Mon Sep 17 00:00:00 2001
From: Tony Cook <tony@develop-help.com>
Date: Wed, 18 May 2016 15:03:14 +1000
Subject: (perl #127663) create a separate random souce for internal use

and use it to initialize hash randomization and to innoculate against
quadratic behaviour in pp_sort
---
 embedvar.h | 1 +
 intrpvar.h | 8 ++++++++
 perl.c     | 2 ++
 pp_sort.c  | 2 +-
 util.c     | 4 +---
 util.h     | 6 ++++++
 6 files changed, 19 insertions(+), 4 deletions(-)

diff --git a/embedvar.h b/embedvar.h
index c413932..7588807 100644
--- a/embedvar.h
+++ b/embedvar.h
@@ -173,6 +173,7 @@
 #define PL_incgv		(vTHX->Iincgv)
 #define PL_initav		(vTHX->Iinitav)
 #define PL_inplace		(vTHX->Iinplace)
+#define PL_internal_random_state	(vTHX->Iinternal_random_state)
 #define PL_isarev		(vTHX->Iisarev)
 #define PL_known_layers		(vTHX->Iknown_layers)
 #define PL_last_in_gv		(vTHX->Ilast_in_gv)
diff --git a/intrpvar.h b/intrpvar.h
index 1aa94f7..532a458 100644
--- a/intrpvar.h
+++ b/intrpvar.h
@@ -810,6 +810,14 @@ PERLVAR(I, random_state, PL_RANDOM_STATE_TYPE)
 
 PERLVARI(I, dump_re_max_len, STRLEN, 0)
 
+/* For internal uses of randomness, this ensures the sequence of
+ * random numbers returned by rand() isn't modified by perl's internal
+ * use of randomness.
+ * This is important if the user has called srand() with a seed.
+ */
+
+PERLVAR(I, internal_random_state, PL_RANDOM_STATE_TYPE)
+
 /* If you are adding a U8 or U16, check to see if there are 'Space' comments
  * above on where there are gaps which currently will be structure padding.  */
 
diff --git a/perl.c b/perl.c
index 3a647f7..dd67d4e 100644
--- a/perl.c
+++ b/perl.c
@@ -261,6 +261,8 @@ perl_construct(pTHXx)
 
     init_constants();
 
+    Perl_drand48_init_r(&PL_internal_random_state, seed());
+
     SvREADONLY_on(&PL_sv_placeholder);
     SvREFCNT(&PL_sv_placeholder) = SvREFCNT_IMMORTAL;
 
diff --git a/pp_sort.c b/pp_sort.c
index 68e65f9..7aa44eb 100644
--- a/pp_sort.c
+++ b/pp_sort.c
@@ -787,7 +787,7 @@ S_qsortsvu(pTHX_ SV ** array, size_t num_elts, SVCOMPARE_t compare)
       size_t n;
       SV ** const q = array;
       for (n = num_elts; n > 1; ) {
-         const size_t j = (size_t)(n-- * Drand01());
+         const size_t j = (size_t)(n-- * Perl_internal_drand48());
          temp = q[j];
          q[j] = q[n];
          q[n] = temp;
diff --git a/util.c b/util.c
index 02c84c8..ef13e8b 100644
--- a/util.c
+++ b/util.c
@@ -4757,10 +4757,8 @@ Perl_get_hash_seed(pTHX_ unsigned char * const seed_buffer)
     else
 #endif
     {
-        (void)seedDrand01((Rand_seed_t)seed());
-
         for( i = 0; i < PERL_HASH_SEED_BYTES; i++ ) {
-            seed_buffer[i] = (unsigned char)(Drand01() * (U8_MAX+1));
+            seed_buffer[i] = (unsigned char)(Perl_internal_drand48() * (U8_MAX+1));
         }
     }
 #ifdef USE_PERL_PERTURB_KEYS
diff --git a/util.h b/util.h
index 8f4171b..c71eefd 100644
--- a/util.h
+++ b/util.h
@@ -85,6 +85,12 @@ typedef struct PERL_DRAND48_T perl_drand48_t;
 #define Perl_drand48_init(seed) (Perl_drand48_init_r(&PL_random_state, (seed)))
 #define Perl_drand48() (Perl_drand48_r(&PL_random_state))
 
+#ifdef PERL_CORE
+/* uses a different source of randomness to avoid interfering with the results
+ * of rand() */
+#define Perl_internal_drand48() (Perl_drand48_r(&PL_internal_random_state))
+#endif
+
 #ifdef USE_C_BACKTRACE
 
 typedef struct {
-- 
2.1.4


From ee5c68b6dd0d9330e7040edef06854278d098766 Mon Sep 17 00:00:00 2001
From: Tony Cook <tony@develop-help.com>
Date: Thu, 4 Aug 2016 14:30:13 +1000
Subject: (perl #127663) add our own mkstemp() implementation

Needed to generate temp files for safer in-place editing.

Not based on any particular implementation, the BSD implementations
tend to be wrappers around a megafunction that also does a few variations
of mkstemp() and mkdtemp(), which we don't need (yet.)

One implementation I found, part of the heimdal crypto library, was
simpler, but horrible.
---
 embed.fnc |  4 ++++
 proto.h   |  5 +++++
 util.c    | 34 ++++++++++++++++++++++++++++++++++
 util.h    |  4 ++++
 4 files changed, 47 insertions(+)

diff --git a/embed.fnc b/embed.fnc
index e03c4d2..e96d686 100644
--- a/embed.fnc
+++ b/embed.fnc
@@ -2953,6 +2953,10 @@ Apnod	|Size_t	|my_strlcat	|NULLOK char *dst|NULLOK const char *src|Size_t size
 Apnod	|Size_t |my_strlcpy     |NULLOK char *dst|NULLOK const char *src|Size_t size
 #endif
 
+#ifndef HAS_MKSTEMP
+pno	|int	|my_mkstemp	|NN char *templte
+#endif
+
 Apdn	|bool	|isinfnan	|NV nv
 p	|bool	|isinfnansv	|NN SV *sv
 
diff --git a/proto.h b/proto.h
index b760924..d7e38ea 100644
--- a/proto.h
+++ b/proto.h
@@ -3765,6 +3765,11 @@ STATIC int	S_dooneliner(pTHX_ const char *cmd, const char *filename)
 
 #  endif
 #endif
+#if !defined(HAS_MKSTEMP)
+PERL_CALLCONV int	Perl_my_mkstemp(char *templte);
+#define PERL_ARGS_ASSERT_MY_MKSTEMP	\
+	assert(templte)
+#endif
 #if !defined(HAS_RENAME)
 PERL_CALLCONV I32	Perl_same_dirent(pTHX_ const char* a, const char* b);
 #define PERL_ARGS_ASSERT_SAME_DIRENT	\
diff --git a/util.c b/util.c
index ef13e8b..88105e0 100644
--- a/util.c
+++ b/util.c
@@ -5866,6 +5866,40 @@ Perl_my_dirfd(DIR * dir) {
 #endif 
 }
 
+#ifndef HAS_MKSTEMP
+
+#define TEMP_FILE_CH "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvxyz0123456789"
+#define TEMP_FILE_CH_COUNT (sizeof(TEMP_FILE_CH)-1)
+
+int
+Perl_my_mkstemp(char *templte) {
+    dTHX;
+    STRLEN len = strlen(templte);
+    int fd;
+    int attempts = 0;
+
+    PERL_ARGS_ASSERT_MY_MKSTEMP;
+
+    if (len < 6 ||
+        templte[len-1] != 'X' || templte[len-2] != 'X' || templte[len-3] != 'X' ||
+        templte[len-4] != 'X' || templte[len-5] != 'X' || templte[len-6] != 'X') {
+        errno = EINVAL;
+        return -1;
+    }
+
+    do {
+        int i;
+        for (i = 1; i <= 6; ++i) {
+            templte[len-i] = TEMP_FILE_CH[(int)(Perl_internal_drand48() * TEMP_FILE_CH_COUNT)];
+        }
+        fd = PerlLIO_open3(templte, O_RDWR | O_CREAT | O_EXCL, 0600);
+    } while (fd == -1 && errno == EEXIST && ++attempts <= 100);
+
+    return fd;
+}
+
+#endif
+
 REGEXP *
 Perl_get_re_arg(pTHX_ SV *sv) {
 
diff --git a/util.h b/util.h
index c71eefd..4ca3441 100644
--- a/util.h
+++ b/util.h
@@ -242,6 +242,10 @@ means arg not present, 1 is empty string/null byte */
             ((char *) memmem(big, bigend - big, little, lend - little))
 #endif
 
+#if defined(HAS_MKSTEMP) && defined(PERL_CORE)
+#   define Perl_my_mkstemp(templte) mkstemp(templte)
+#endif
+
 /*
  * ex: set ts=8 sts=4 sw=4 et:
  */
-- 
2.1.4


From 69e7365dce884a5ff5f99ae62fdba82e5f430da2 Mon Sep 17 00:00:00 2001
From: Tony Cook <tony@develop-help.com>
Date: Thu, 4 Aug 2016 14:34:21 +1000
Subject: (perl #127663) only test renaming directories with rename()
 available.

Perl's rename implementation falls back to link() to rename when
rename() isn't available, which is either disallowed or dangerous.
---
 t/io/fs.t | 27 ++++++++++++++++-----------
 1 file changed, 16 insertions(+), 11 deletions(-)

diff --git a/t/io/fs.t b/t/io/fs.t
index b6754d6..09eede1 100644
--- a/t/io/fs.t
+++ b/t/io/fs.t
@@ -468,18 +468,23 @@ SKIP: {
     chdir $wd || die "Can't cd back to $wd";
 }
 
-# check if rename() works on directories
-if ($^O eq 'VMS') {
-    # must have delete access to rename a directory
-    `set file $tmpdir.dir/protection=o:d`;
-    ok(rename("$tmpdir.dir", "$tmpdir1.dir"), "rename on directories") ||
-      print "# errno: $!\n";
-}
-else {
-    ok(rename($tmpdir, $tmpdir1), "rename on directories");
-}
+SKIP:
+{
+    $Config{d_rename}
+      or skip "Cannot rename directories with link()", 2;
+    # check if rename() works on directories
+    if ($^O eq 'VMS') {
+        # must have delete access to rename a directory
+        `set file $tmpdir.dir/protection=o:d`;
+        ok(rename("$tmpdir.dir", "$tmpdir1.dir"), "rename on directories") ||
+          print "# errno: $!\n";
+    }
+    else {
+        ok(rename($tmpdir, $tmpdir1), "rename on directories");
+    }
 
-ok(-d $tmpdir1, "rename on directories working");
+    ok(-d $tmpdir1, "rename on directories working");
+}
 
 {
     # Change 26011: Re: A surprising segfault
-- 
2.1.4


From d9946fe662551bd64b9d49a9bb88e3606040390f Mon Sep 17 00:00:00 2001
From: Tony Cook <tony@develop-help.com>
Date: Thu, 4 Aug 2016 14:34:35 +1000
Subject: (perl #127663) safer in-place editing

Previously in-place editing opened the file then immediately
*replaced* the file, so if an error occurs while writing the output,
such as running out of space, the content of the original file is lost.

This changes in-place editing to write to a work file which is renamed
over the original only once the output file is successfully closed.

It also fixes an issue with setting setuid/setgid file modes for
recursive in-place editing.

The implementation (beyond some TODO issues below) has at least one
problem - if the user code changes directory between the file open and
the close then the final clean-up stage is going to fail if the input
name wasn't an absolute path.

This might be fixable, but on some systems it may put the perl process
in a difficult to recover from position - if the system doesn't
implement getcwd() perl may change directory out of the original and not
have a way to return to it.
---
 doio.c           | 303 ++++++++++++++++++++++++++++++++++++++++---------------
 embed.fnc        |   1 +
 embed.h          |   1 +
 mg.c             |  36 +++++++
 pod/perldiag.pod |   9 +-
 proto.h          |   3 +
 6 files changed, 268 insertions(+), 85 deletions(-)

diff --git a/doio.c b/doio.c
index 67966b5..e8680db 100644
--- a/doio.c
+++ b/doio.c
@@ -805,6 +805,91 @@ S_openn_cleanup(pTHX_ GV *gv, IO *io, PerlIO *fp, char *mode, const char *oname,
     return FALSE;
 }
 
+/* Open a temp file in the same directory as an original name.
+*/
+
+static bool
+S_openindirtemp(pTHX_ GV *gv, SV *orig_name, SV *temp_out_name) {
+    int fd;
+    PerlIO *fp;
+    const char *p = SvPV_nolen(orig_name);
+    const char *sep;
+
+    /* look for the last directory separator */
+    sep = strrchr(p, '/');
+
+#ifdef DOSISH
+    {
+        const char *sep2;
+        if ((sep2 = strrchr(sep ? sep : p, '\\')))
+            sep = sep2;
+    }
+#endif
+#ifdef VMS
+    if (!sep) {
+        const char *openp = strchr(p, '[');
+        if (openp)
+            sep = strchr(openp, ']');
+        else {
+            sep = strchr(p, ':');
+        }
+    }
+#endif
+    if (sep) {
+        sv_setpvn(temp_out_name, p, sep - p + 1);
+        sv_catpvs(temp_out_name, "XXXXXXXX");
+    }
+    else
+        sv_setpvs(temp_out_name, "XXXXXXXX");
+
+    fd = Perl_my_mkstemp(SvPVX(temp_out_name));
+
+    if (fd < 0)
+        return FALSE;
+
+    fp = PerlIO_fdopen(fd, "w+");
+    if (!fp)
+        return FALSE;
+
+    return do_openn(gv, "+>&", 3, 0, 0, 0, fp, NULL, 0);
+}
+
+static int
+S_argvout_free(pTHX_ SV *sv, MAGIC *mg) {
+    SV **temp_psv;
+
+    PERL_UNUSED_ARG(sv);
+
+    /* note this can be entered once the file has been
+       successfully deleted too */
+    assert(mg->mg_obj && SvTYPE(mg->mg_obj) == SVt_PVAV);
+    temp_psv = av_fetch((AV*)mg->mg_obj, 1, FALSE);
+    if (temp_psv && *temp_psv && SvOK(*temp_psv)) {
+        UNLINK(SvPVX(*temp_psv));
+    }
+
+    return 0;
+}
+
+/* Magic of this type has an AV containing the following:
+   0: name of the backup file (if any)
+   1: name of the temp output file
+   2: name of the original file
+   3: file mode of the original file
+ */
+
+static const MGVTBL argvout_vtbl =
+    {
+        NULL, /* svt_get */
+        NULL, /* svt_set */
+        NULL, /* svt_len */
+        NULL, /* svt_clear */
+        S_argvout_free, /* svt_free */
+        NULL, /* svt_copy */
+        NULL, /* svt_dup */
+        NULL  /* svt_local */
+    };
+
 PerlIO *
 Perl_nextargv(pTHX_ GV *gv, bool nomagicopen)
 {
@@ -826,15 +911,14 @@ Perl_nextargv(pTHX_ GV *gv, bool nomagicopen)
 				    SvREFCNT_inc_simple_NN(PL_defoutgv));
 	}
     }
-    if (PL_filemode & (S_ISUID|S_ISGID)) {
-	PerlIO_flush(IoIFP(GvIOn(PL_argvoutgv)));  /* chmod must follow last write */
-#ifdef HAS_FCHMOD
-	if (PL_lastfd != -1)
-	    (void)fchmod(PL_lastfd,PL_filemode);
-#else
-	(void)PerlLIO_chmod(PL_oldname,PL_filemode);
-#endif
+
+    {
+        IO * const io = GvIOp(PL_argvoutgv);
+        if (io && IoIFP(io) && old_out_name) {
+            do_close(PL_argvoutgv, FALSE);
+        }
     }
+
     PL_lastfd = -1;
     PL_filemode = 0;
     if (!GvAV(gv))
@@ -857,13 +941,6 @@ Perl_nextargv(pTHX_ GV *gv, bool nomagicopen)
             }
         }
         else {
-            {
-                IO * const io = GvIOp(PL_argvoutgv);
-                if (io && IoIFP(io) && old_out_name && !io_close(io, PL_argvoutgv, FALSE, FALSE)) {
-                    Perl_croak(aTHX_ "Failed to close in-place edit file %"
-                               SVf ": %s\n", old_out_name, Strerror(errno));
-                }
-            }
             /* This very long block ends with return IoIFP(GvIOp(gv));
                Both this block and the block above fall through on open
                failure to the warning code, and then the while loop above tries
@@ -875,6 +952,8 @@ Perl_nextargv(pTHX_ GV *gv, bool nomagicopen)
 #endif
                 Uid_t fileuid;
                 Gid_t filegid;
+                AV *magic_av = NULL;
+                SV *temp_name_sv = NULL;
 
 		TAINT_PROPER("inplace open");
 		if (oldlen == 1 && *PL_oldname == '-') {
@@ -896,6 +975,7 @@ Perl_nextargv(pTHX_ GV *gv, bool nomagicopen)
 		    do_close(gv,FALSE);
 		    continue;
 		}
+                magic_av = newAV();
 		if (*PL_inplace && strNE(PL_inplace, "*")) {
 		    const char *star = strchr(PL_inplace, '*');
 		    if (star) {
@@ -925,71 +1005,33 @@ Perl_nextargv(pTHX_ GV *gv, bool nomagicopen)
 					 "Can't do inplace edit: %"
                                          SVf " would not be unique",
 					 SVfARG(sv));
-			do_close(gv,FALSE);
-			continue;
-		    }
-#endif
-#ifdef HAS_RENAME
-#if !defined(DOSISH) && !defined(__CYGWIN__)
-		    if (PerlLIO_rename(PL_oldname,SvPVX_const(sv)) < 0) {
-			Perl_ck_warner_d(aTHX_ packWARN(WARN_INPLACE),
-					 "Can't rename %s to %" SVf
-                                         ": %s, skipping file",
-					 PL_oldname, SVfARG(sv),
-                                         Strerror(errno));
-			do_close(gv,FALSE);
-			continue;
-		    }
-#else
-		    do_close(gv,FALSE);
-		    (void)PerlLIO_unlink(SvPVX_const(sv));
-		    (void)PerlLIO_rename(PL_oldname,SvPVX_const(sv));
-		    do_open_raw(gv, SvPVX_const(sv), SvCUR(sv), O_RDONLY, 0);
-#endif /* DOSISH */
-#else
-		    (void)UNLINK(SvPVX_const(sv));
-		    if (link(PL_oldname,SvPVX_const(sv)) < 0) {
-			Perl_ck_warner_d(aTHX_ packWARN(WARN_INPLACE),
-					 "Can't rename %s to %" SVf ": %s, skipping file",
-					 PL_oldname, SVfARG(sv), Strerror(errno) );
-			do_close(gv,FALSE);
-			continue;
+                        goto cleanup_argv;
 		    }
-		    (void)UNLINK(PL_oldname);
-#endif
-		}
-		else {
-#if !defined(DOSISH) && !defined(__amigaos4__)
-#  ifndef VMS  /* Don't delete; use automatic file versioning */
-		    if (UNLINK(PL_oldname) < 0) {
-			Perl_ck_warner_d(aTHX_ packWARN(WARN_INPLACE),
-					 "Can't remove %s: %s, skipping file",
-					 PL_oldname, Strerror(errno) );
-			do_close(gv,FALSE);
-			continue;
-		    }
-#  endif
-#else
-		    Perl_croak(aTHX_ "Can't do inplace edit without backup");
 #endif
+                    av_store(magic_av, 0, newSVsv(sv));
 		}
 
 		sv_setpvn(sv,PL_oldname,oldlen);
 		SETERRNO(0,0);		/* in case sprintf set errno */
-		if (!Perl_do_open_raw(aTHX_ PL_argvoutgv, SvPVX_const(sv),
-                                      SvCUR(sv),
-#ifdef VMS
-                                      O_WRONLY|O_CREAT|O_TRUNC, 0
-#else
-                                      O_WRONLY|O_CREAT|OPEN_EXCL, 0600
-#endif
-                        )) {
-		    Perl_ck_warner_d(aTHX_ packWARN(WARN_INPLACE), "Can't do inplace edit on %s: %s",
+                temp_name_sv = newSV(0);
+                if (!S_openindirtemp(aTHX_ PL_argvoutgv, GvSV(gv), temp_name_sv)) {
+                    SvREFCNT_dec(temp_name_sv);
+                    /* diag_listed_as: Can't do inplace edit on %s: %s */
+                    Perl_ck_warner_d(aTHX_ packWARN(WARN_INPLACE), "Can't do inplace edit on %s: Cannot make temp name: %s",
 				     PL_oldname, Strerror(errno) );
-		    do_close(gv,FALSE);
-		    continue;
+#ifndef FLEXFILENAMES
+                cleanup_argv:
+#endif
+                    do_close(gv,FALSE);
+                    SvREFCNT_dec(magic_av);
+                    continue;
 		}
+                av_store(magic_av, 1, temp_name_sv);
+                av_store(magic_av, 2, newSVsv(sv));
+                av_store(magic_av, 3, newSVuv(PL_filemode));
 		setdefout(PL_argvoutgv);
+                sv_magicext((SV*)GvIOp(PL_argvoutgv), (SV*)magic_av, PERL_MAGIC_uvar, &argvout_vtbl, NULL, 0);
+                SvREFCNT_dec(magic_av);
 		PL_lastfd = PerlIO_fileno(IoIFP(GvIOp(PL_argvoutgv)));
                 if (PL_lastfd >= 0) {
                     (void)PerlLIO_fstat(PL_lastfd,&statbuf);
@@ -1030,17 +1072,6 @@ Perl_nextargv(pTHX_ GV *gv, bool nomagicopen)
     if (io && (IoFLAGS(io) & IOf_ARGV))
 	IoFLAGS(io) |= IOf_START;
     if (PL_inplace) {
-        if (old_out_name) {
-            IO * const io = GvIOp(PL_argvoutgv);
-            if (io && IoIFP(io) && !io_close(io, PL_argvoutgv, FALSE, FALSE)) {
-                Perl_croak(aTHX_ "Failed to close in-place edit file %" SVf ": %s\n",
-                           old_out_name, Strerror(errno));
-            }
-        }
-        else {
-            /* maybe this is no longer wanted */
-            (void)do_close(PL_argvoutgv,FALSE);
-        }
 	if (io && (IoFLAGS(io) & IOf_ARGV)
 	    && PL_argvout_stack && AvFILLp(PL_argvout_stack) >= 0)
 	{
@@ -1060,6 +1091,7 @@ Perl_do_close(pTHX_ GV *gv, bool not_implicit)
 {
     bool retval;
     IO *io;
+    MAGIC *mg;
 
     if (!gv)
 	gv = PL_argvgv;
@@ -1076,7 +1108,112 @@ Perl_do_close(pTHX_ GV *gv, bool not_implicit)
 	}
 	return FALSE;
     }
-    retval = io_close(io, NULL, not_implicit, FALSE);
+    if ((mg = mg_findext((SV*)io, PERL_MAGIC_uvar, &argvout_vtbl))
+        && mg->mg_obj) {
+        /* handle to an in-place edit work file */
+        SV **back_psv = av_fetch((AV*)mg->mg_obj, 0, FALSE);
+        SV **temp_psv = av_fetch((AV*)mg->mg_obj, 1, FALSE);
+        /* PL_oldname may have been modified by a nested ARGV use at this point */
+        SV **orig_psv = av_fetch((AV*)mg->mg_obj, 2, FALSE);
+        SV **mode_psv = av_fetch((AV*)mg->mg_obj, 3, FALSE);
+        UV mode;
+        int fd;
+
+        const char *orig_pv;
+
+        assert(temp_psv && *temp_psv);
+        assert(orig_psv && *orig_psv);
+        assert(mode_psv && *mode_psv);
+
+        orig_pv = SvPVX(*orig_psv);
+
+        mode = SvUV(*mode_psv);
+
+        if ((mode & (S_ISUID|S_ISGID)) != 0
+            && (fd = PerlIO_fileno(IoIFP(io))) >= 0) {
+            (void)PerlIO_flush(IoIFP(io));
+#ifdef HAS_FCHMOD
+            (void)fchmod(fd, mode);
+#else
+            (void)PerlLIO_chmod(orig_pv, mode);
+#endif
+        }
+
+        retval = io_close(io, NULL, not_implicit, FALSE);
+
+        if (retval) {
+#if defined(DOSISH) || defined(__CYGWIN__)
+            if (PL_argvgv && GvIOp(PL_argvgv)
+                && IoIFP(GvIOp(PL_argvgv))
+                && (IoFLAGS(GvIOp(PL_argvgv)) & (IOf_ARGV|IOf_START)) == IOf_ARGV) {
+                do_close(PL_argvgv, FALSE);
+            }
+#endif
+            if (back_psv && *back_psv) {
+#if defined(HAS_LINK) && !defined(DOSISH) && !defined(__CYGWIN__) && defined(HAS_RENAME)
+                if (link(orig_pv, SvPVX(*back_psv)) < 0)
+#endif
+                {
+#ifdef HAS_RENAME
+                    if (PerlLIO_rename(orig_pv, SvPVX(*back_psv)) < 0) {
+                        if (!not_implicit) {
+                            Perl_croak(aTHX_ "Can't rename %s to %s: %s, skipping file",
+                                       SvPVX(*orig_psv), SvPVX(*back_psv), Strerror(errno));
+                        }
+                        /* should we warn here? */
+                        goto abort_inplace;
+                    }
+#else
+                    (void)UNLINK(SvPVX(*back_psv));
+                    if (link(orig_pv, SvPVX(*back_psv))) {
+                        if (!not_implicit) {
+                            Perl_croak(aTHX_ "Can't rename %s to %s: %s, skipping file",
+                                       SvPVX(*orig_psv), SvPVX(*back_psv), Strerror(errno));
+                        }
+                        goto abort_inplace;
+                    }
+                    /* we need to use link() to get the temp into place too, and linK()
+                       fails if the new link name exists */
+                    (void)UNLINK(orig_pv);
+#endif
+                }
+            }
+#if defined(DOSISH) || defined(__CYGWIN__) || !defined(HAS_RENAME)
+            else {
+                UNLINK(orig_pv);
+            }
+#endif
+            if (
+#ifdef HAS_RENAME
+                PerlLIO_rename(SvPVX(*temp_psv), orig_pv) < 0
+#else
+                link(SvPVX(*temp_psv), orig_pv) < 0
+#endif
+                ) {
+                if (!not_implicit) {
+                    Perl_croak(aTHX_ "Can't rename in-place work file '%s' to '%s': %s\n",
+                               SvPVX(*temp_psv), SvPVX(*orig_psv), Strerror(errno));
+                }
+            abort_inplace:
+                UNLINK(SvPVX_const(*temp_psv));
+                retval = FALSE;
+            }
+#ifndef HAS_RENAME
+            UNLINK(SvPVX(*temp_psv));
+#endif
+        }
+        else {
+            UNLINK(SvPVX_const(*temp_psv));
+            if (!not_implicit) {
+                Perl_croak(aTHX_ "Failed to close in-place work file %s: %s",
+                           SvPVX(*temp_psv), Strerror(errno));
+            }
+        }
+        mg_freeext((SV*)io, PERL_MAGIC_uvar, &argvout_vtbl);
+    }
+    else {
+        retval = io_close(io, NULL, not_implicit, FALSE);
+    }
     if (not_implicit) {
 	IoLINES(io) = 0;
 	IoPAGE(io) = 0;
diff --git a/embed.fnc b/embed.fnc
index e96d686..209746a 100644
--- a/embed.fnc
+++ b/embed.fnc
@@ -989,6 +989,7 @@ ApdRn	|MAGIC*	|mg_findext	|NULLOK const SV* sv|int type|NULLOK const MGVTBL *vtb
 EXpR	|MAGIC*	|mg_find_mglob	|NN SV* sv
 Apd	|int	|mg_free	|NN SV* sv
 Apd	|void	|mg_free_type	|NN SV* sv|int how
+pd	|void	|mg_freeext	|NN SV* sv|int how|NULLOK const MGVTBL *vtbl
 Apd	|int	|mg_get		|NN SV* sv
 ApdD	|U32	|mg_length	|NN SV* sv
 Apdn	|void	|mg_magical	|NN SV* sv
diff --git a/embed.h b/embed.h
index 6061d55..5c7ed00 100644
--- a/embed.h
+++ b/embed.h
@@ -1331,6 +1331,7 @@
 #define magic_setvec(a,b)	Perl_magic_setvec(aTHX_ a,b)
 #define magic_sizepack(a,b)	Perl_magic_sizepack(aTHX_ a,b)
 #define magic_wipepack(a,b)	Perl_magic_wipepack(aTHX_ a,b)
+#define mg_freeext(a,b,c)	Perl_mg_freeext(aTHX_ a,b,c)
 #define mg_localize(a,b,c)	Perl_mg_localize(aTHX_ a,b,c)
 #define mode_from_discipline(a,b)	Perl_mode_from_discipline(aTHX_ a,b)
 #define mro_isa_changed_in(a)	Perl_mro_isa_changed_in(aTHX_ a)
diff --git a/mg.c b/mg.c
index cbabcc6..8068f7f 100644
--- a/mg.c
+++ b/mg.c
@@ -607,6 +607,42 @@ Perl_mg_free_type(pTHX_ SV *sv, int how)
     mg_magical(sv);
 }
 
+/*
+=for mg_freeext
+
+Remove any magic of type C<how> using virtual table C<vtable> from the
+SV C<sv>.  See L</sv_magic>.
+
+C<mg_freeext(sv, how, NULL)> is equivalent to C<mg_free_type(sv, how)>.
+
+=cut
+*/
+
+void
+Perl_mg_freeext(pTHX_ SV *sv, int how, const MGVTBL *vtbl)
+{
+    MAGIC *mg, *prevmg, *moremg;
+    PERL_ARGS_ASSERT_MG_FREEEXT;
+    for (prevmg = NULL, mg = SvMAGIC(sv); mg; prevmg = mg, mg = moremg) {
+	MAGIC *newhead;
+	moremg = mg->mg_moremagic;
+	if (mg->mg_type == how && (vtbl == NULL || mg->mg_virtual == vtbl)) {
+	    /* temporarily move to the head of the magic chain, in case
+	       custom free code relies on this historical aspect of mg_free */
+	    if (prevmg) {
+		prevmg->mg_moremagic = moremg;
+		mg->mg_moremagic = SvMAGIC(sv);
+		SvMAGIC_set(sv, mg);
+	    }
+	    newhead = mg->mg_moremagic;
+	    mg_free_struct(sv, mg);
+	    SvMAGIC_set(sv, newhead);
+	    mg = prevmg;
+	}
+    }
+    mg_magical(sv);
+}
+
 #include <signal.h>
 
 U32
diff --git a/pod/perldiag.pod b/pod/perldiag.pod
index c0a717c..ab66f77 100644
--- a/pod/perldiag.pod
+++ b/pod/perldiag.pod
@@ -1307,9 +1307,14 @@ the modified file.  The file was left unmodified.
 
 =item Can't rename %s to %s: %s, skipping file
 
-(S inplace) The rename done by the B<-i> switch failed for some reason,
+(F) The rename done by the B<-i> switch failed for some reason,
 probably because you don't have write permission to the directory.
 
+=item Can't rename in-place work file '%s' to '%s': %s
+
+(F) When closed implicitly, the temporary file for in-place editing
+couldn't be renamed to the original filename.
+
 =item Can't reopen input pipe (name: %s) in binary mode
 
 (P) An error peculiar to VMS.  Perl thought stdin was a pipe, and tried
@@ -2287,7 +2292,7 @@ Check the #! line, or manually feed your script into Perl yourself.
 CHECK, INIT, or END subroutine.  Processing of the remainder of the
 queue of such routines has been prematurely ended.
 
-=item Failed to close in-place edit file %s: %s
+=item Failed to close in-place work file %s: %s
 
 (F) Closing an output file from in-place editing, as with the C<-i>
 command-line switch, failed.
diff --git a/proto.h b/proto.h
index d7e38ea..1560f9e 100644
--- a/proto.h
+++ b/proto.h
@@ -1928,6 +1928,9 @@ PERL_CALLCONV int	Perl_mg_free(pTHX_ SV* sv);
 PERL_CALLCONV void	Perl_mg_free_type(pTHX_ SV* sv, int how);
 #define PERL_ARGS_ASSERT_MG_FREE_TYPE	\
 	assert(sv)
+PERL_CALLCONV void	Perl_mg_freeext(pTHX_ SV* sv, int how, const MGVTBL *vtbl);
+#define PERL_ARGS_ASSERT_MG_FREEEXT	\
+	assert(sv)
 PERL_CALLCONV int	Perl_mg_get(pTHX_ SV* sv);
 #define PERL_ARGS_ASSERT_MG_GET	\
 	assert(sv)
-- 
2.1.4


From ed2ebe0d9083dc9f009bd987333ccd216d4d3c4c Mon Sep 17 00:00:00 2001
From: Tony Cook <tony@develop-help.com>
Date: Thu, 19 May 2016 15:22:32 +1000
Subject: (perl #127663) all platforms no longer require a backup file

Platforms that disallow deleting an open file, like Win32, Cygwin,
previously required a backup extension (defaulted for Cygwin), but
since we now write to a work file that's no longer necessary (but
might still be desirable.)
---
 perl.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/perl.c b/perl.c
index dd67d4e..6ff0e43 100644
--- a/perl.c
+++ b/perl.c
@@ -3338,12 +3338,6 @@ Perl_moreswitches(pTHX_ const char *s)
 
     case 'i':
 	Safefree(PL_inplace);
-#if defined(__CYGWIN__) /* do backup extension automagically */
-	if (*(s+1) == '\0') {
-	PL_inplace = savepvs(".bak");
-	return s+1;
-	}
-#endif /* __CYGWIN__ */
 	{
 	    const char * const start = ++s;
 	    while (*s && !isSPACE(*s))
-- 
2.1.4


From 075043d3289c352bc2b0575cd6e2496ca4fb5c8d Mon Sep 17 00:00:00 2001
From: Tony Cook <tony@develop-help.com>
Date: Tue, 24 May 2016 09:06:18 +1000
Subject: (perl #127663) add more in-place edit tests

test that setuid is preserved with nested in-place editing, which
fails previously.
---
 t/io/nargv.t     | 24 +++++++++++++++++++++++-
 t/run/switches.t | 51 ++++++++++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 73 insertions(+), 2 deletions(-)

diff --git a/t/io/nargv.t b/t/io/nargv.t
index f0eee30..598ceed 100644
--- a/t/io/nargv.t
+++ b/t/io/nargv.t
@@ -6,7 +6,7 @@ BEGIN {
     set_up_inc('../lib');
 }
 
-print "1..5\n";
+print "1..6\n";
 
 my $j = 1;
 for $i ( 1,2,5,4,3 ) {
@@ -43,6 +43,28 @@ while (<>) {
     show();
 }
 
+# test setuid is preserved (and hopefully setgid)
+#
+# With nested in-place editing PL_oldname and PL_filemode would
+# be overwritten by the values for the last file in the nested
+# loop.  This is now all stored as magic in *ARGVOUT{IO}
+$^I = "";
+@ARGV = mkfiles(1..3);
+my $sidfile = $ARGV[1];
+chmod(04600, $sidfile);
+my $mode = (stat $ARGV[1])[2];
+$n = 0;
+while (<>) {
+    print STDOUT "#final \@ARGV: [@ARGV]\n";
+    if ($n++ == 1) {
+	other();
+    }
+    print;
+}
+my $newmode = (stat $sidfile)[2];
+printf "# before %#o after %#o\n", $mode, $newmode;
+print +($mode == $newmode ? "" : "not "). "ok 6 # check setuid mode preserved\n";
+
 sub show {
     #warn "$ARGV: $_";
     s/^not //;
diff --git a/t/run/switches.t b/t/run/switches.t
index b61be56..5291436 100644
--- a/t/run/switches.t
+++ b/t/run/switches.t
@@ -12,7 +12,7 @@ BEGIN {
 
 BEGIN { require "./test.pl";  require "./loc_tools.pl"; }
 
-plan(tests => 115);
+plan(tests => 120);
 
 use Config;
 
@@ -400,6 +400,55 @@ __EOF__
         args     => ['file'],
     );
     is($out2, "", "no warning when files given");
+
+    open my $f, ">", "file" or die "$0: failed to create 'file': $!";
+    print $f "foo\nbar\n";
+    close $f;
+
+    # a backup extension is no longer required on any platform
+    my $out3 = runperl(
+        switches => [ '-i', '-p' ],
+        prog => 's/foo/quux/',
+        stderr => 1,
+        args => [ 'file' ],
+    );
+    is($out3, "", "no warnings/errors without backup extension");
+    open $f, "<", "file" or die "$0: cannot open 'file': $!";
+    chomp(my @out4 = <$f>);
+    close $f;
+    is(join(":", @out4), "quux:bar", "correct output without backup extension");
+
+    # test that path parsing is correct
+    -d "inplacetmp" or mkdir("inplacetmp")
+      or die "Cannot mkdir 'inplacetmp': $!";
+    require File::Spec;
+    my $work = File::Spec->catfile("inplacetmp", "foo");
+    open $f, ">", $work or die "Cannot create $work: $!";
+    print $f "foo\nbar\n";
+    close $f;
+
+    my $out4 = runperl
+      (
+       switches => [ "-i", "-p" ],
+       prog => 's/foo/bar/',
+       stderr => 1,
+       args => [ $work ],
+      );
+    is ($out4, "", "no errors or warnings");
+    open $f, "<", $work or die "Cannot open $work: $!";
+    chomp(my @file4 = <$f>);
+    close $f;
+    is(join(":", @file4), "bar:bar", "check output");
+
+    unlink $work;
+
+    # we now use temp files for in-place editing, make sure we didn't leave
+    # any behind in the above test
+    opendir my $d, "inplacetmp" or die "Cannot opendir inplacetmp: $!";
+    my @names = grep !/^\.\.?$/, readdir $d;
+    closedir $d;
+    is(scalar(@names), 0, "no extra files")
+      or diag "Found @names, expected none";
 }
 
 # Tests for -E
-- 
2.1.4


From 58d3a7f96ab9d396a44133bdebf225239879a02e Mon Sep 17 00:00:00 2001
From: Tony Cook <tony@develop-help.com>
Date: Wed, 3 Aug 2016 14:43:59 +1000
Subject: (perl #127663) discard any output if not closed properly

It can be closed by either iterating to the next file, or by
an explicit close(ARGVOUT);
---
 doio.c | 19 ++++++++++++++-----
 1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/doio.c b/doio.c
index e8680db..f45f1c1 100644
--- a/doio.c
+++ b/doio.c
@@ -855,17 +855,26 @@ S_openindirtemp(pTHX_ GV *gv, SV *orig_name, SV *temp_out_name) {
 }
 
 static int
-S_argvout_free(pTHX_ SV *sv, MAGIC *mg) {
+S_argvout_free(pTHX_ SV *io, MAGIC *mg) {
     SV **temp_psv;
 
-    PERL_UNUSED_ARG(sv);
+    PERL_UNUSED_ARG(io);
 
     /* note this can be entered once the file has been
        successfully deleted too */
     assert(mg->mg_obj && SvTYPE(mg->mg_obj) == SVt_PVAV);
-    temp_psv = av_fetch((AV*)mg->mg_obj, 1, FALSE);
-    if (temp_psv && *temp_psv && SvOK(*temp_psv)) {
-        UNLINK(SvPVX(*temp_psv));
+    assert(IoTYPE(io) != IoTYPE_PIPE);
+
+    if (IoIFP(io)) {
+        /* if we get here the file hasn't been closed explicitly by the
+           user and hadn't been closed implicitly by nextargv(), so
+           abandon the edit */
+        PerlIO *iop = IoIFP(io);
+        (void)PerlIO_close(iop);
+        IoIFP(io) = IoOFP(io) = NULL;
+        temp_psv = av_fetch((AV*)mg->mg_obj, 1, FALSE);
+        assert(temp_psv && *temp_psv && SvPOK(*temp_psv));
+        (void)UNLINK(SvPVX(*temp_psv));
     }
 
     return 0;
-- 
2.1.4


From 8e328fc443869085a370f7de98d5c41b4848ee49 Mon Sep 17 00:00:00 2001
From: Tony Cook <tony@develop-help.com>
Date: Tue, 2 Aug 2016 16:05:22 +1000
Subject: (perl #127663) test that die/exit leave the original file

---
 t/run/switches.t | 30 +++++++++++++++++++++++++++---
 1 file changed, 27 insertions(+), 3 deletions(-)

diff --git a/t/run/switches.t b/t/run/switches.t
index 5291436..86c9dcb 100644
--- a/t/run/switches.t
+++ b/t/run/switches.t
@@ -12,7 +12,7 @@ BEGIN {
 
 BEGIN { require "./test.pl";  require "./loc_tools.pl"; }
 
-plan(tests => 120);
+plan(tests => 124);
 
 use Config;
 
@@ -355,11 +355,12 @@ for (qw( e f x E S V )) {
     sub do_i_unlink { unlink_all("file", "file.bak") }
 
     open(FILE, ">file") or die "$0: Failed to create 'file': $!";
-    print FILE <<__EOF__;
+    my $yada = <<__EOF__;
 foo yada dada
 bada foo bing
 king kong foo
 __EOF__
+    print FILE $yada;
     close FILE;
 
     END { do_i_unlink() }
@@ -418,11 +419,32 @@ __EOF__
     close $f;
     is(join(":", @out4), "quux:bar", "correct output without backup extension");
 
-    # test that path parsing is correct
     -d "inplacetmp" or mkdir("inplacetmp")
       or die "Cannot mkdir 'inplacetmp': $!";
     require File::Spec;
     my $work = File::Spec->catfile("inplacetmp", "foo");
+
+    # exit or die should leave original content in file
+    for my $inplace (qw/-i -i.bak/) {
+        for my $prog (qw/die exit/) {
+            open my $fh, ">", $work or die "$0: failed to open '$work': $!";
+            print $fh $yada;
+            close $fh or die "Failed to close: $!";
+            my $out = runperl (
+               switches => [ $inplace, '-n' ],
+               prog => "print q(foo\n); $prog",
+               stderr => 1,
+               args => [ $work ],
+            );
+            open my $in, "<", $work or die "$0: failed to open '$work': $!";
+            my $data = do { local $/; <$in> };
+            close $in;
+            is ($data, $yada, "check original content still in file");
+            unlink $work;
+        }
+    }
+
+    # test that path parsing is correct
     open $f, ">", $work or die "Cannot create $work: $!";
     print $f "foo\nbar\n";
     close $f;
@@ -449,6 +471,8 @@ __EOF__
     closedir $d;
     is(scalar(@names), 0, "no extra files")
       or diag "Found @names, expected none";
+
+    rmdir "inplacetmp";
 }
 
 # Tests for -E
-- 
2.1.4

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented Dec 6, 2016

From @ppisar

On 2016-12-06, Tony Cook via RT <perlbug-followup@​perl.org> wrote​:

The attached patch attempts to fix this issue.

Thank you tackling this problem.

The implementation (beyond some TODO issues below) has at least one
problem - if the user code changes directory between the file open and
the close then the final clean-up stage is going to fail if the input
name wasn't an absolute path.

This might be fixable, but on some systems it may put the perl process
in a difficult to recover from position - if the system doesn't
implement getcwd() perl may change directory out of the original and not
have a way to return to it.

POSIX.1-2008 has renameat() that allows to specify a file by an opened
directory descriptor and a relative file name. I know it won't help you
if you need to deal with systems without getcwd(), but still it could
avoid some races when people replace directories with files the perl
works on.

-- Petr

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented Dec 6, 2016

From @demerphq

On 6 December 2016 at 06​:05, Tony Cook via RT <perlbug-followup@​perl.org> wrote​:

The attached patch attempts to fix this issue.

It also fixes an issue with nested in-place editing, where the inner
in-place edit could overwrite the permissions referenced for the outer
edit, which are used to restore set[gu]id flags on the output file.

Tony

---
via perlbug​: queue​: perl5 status​: open
https://rt-archive.perl.org/perl5/Ticket/Display.html?id=127663

From ecdd0c8 Mon Sep 17 00​:00​:00 2001
From​: Tony Cook <tony@​develop-help.com>
Date​: Wed, 18 May 2016 15​:03​:14 +1000
Subject​: (perl #127663) create a separate random souce for internal use

and use it to initialize hash randomization and to innoculate against
quadratic behaviour in pp_sort
---
embedvar.h | 1 +
intrpvar.h | 8 ++++++++
perl.c | 2 ++
pp_sort.c | 2 +-
util.c | 4 +---
util.h | 6 ++++++
6 files changed, 19 insertions(+), 4 deletions(-)

diff --git a/embedvar.h b/embedvar.h
index c413932..7588807 100644
--- a/embedvar.h
+++ b/embedvar.h
@​@​ -173,6 +173,7 @​@​
#define PL_incgv (vTHX->Iincgv)
#define PL_initav (vTHX->Iinitav)
#define PL_inplace (vTHX->Iinplace)
+#define PL_internal_random_state (vTHX->Iinternal_random_state)
#define PL_isarev (vTHX->Iisarev)
#define PL_known_layers (vTHX->Iknown_layers)
#define PL_last_in_gv (vTHX->Ilast_in_gv)
diff --git a/intrpvar.h b/intrpvar.h
index 1aa94f7..532a458 100644
--- a/intrpvar.h
+++ b/intrpvar.h
@​@​ -810,6 +810,14 @​@​ PERLVAR(I, random_state, PL_RANDOM_STATE_TYPE)

PERLVARI(I, dump_re_max_len, STRLEN, 0)

+/* For internal uses of randomness, this ensures the sequence of
+ * random numbers returned by rand() isn't modified by perl's internal
+ * use of randomness.
+ * This is important if the user has called srand() with a seed.
+ */
+
+PERLVAR(I, internal_random_state, PL_RANDOM_STATE_TYPE)
+
/* If you are adding a U8 or U16, check to see if there are 'Space' comments
* above on where there are gaps which currently will be structure padding. */

diff --git a/perl.c b/perl.c
index 3a647f7..dd67d4e 100644
--- a/perl.c
+++ b/perl.c
@​@​ -261,6 +261,8 @​@​ perl_construct(pTHXx)

 init\_constants\(\);

+ Perl_drand48_init_r(&PL_internal_random_state, seed());
+
SvREADONLY_on(&PL_sv_placeholder);
SvREFCNT(&PL_sv_placeholder) = SvREFCNT_IMMORTAL;

diff --git a/pp_sort.c b/pp_sort.c
index 68e65f9..7aa44eb 100644
--- a/pp_sort.c
+++ b/pp_sort.c
@​@​ -787,7 +787,7 @​@​ S_qsortsvu(pTHX_ SV ** array, size_t num_elts, SVCOMPARE_t compare)
size_t n;
SV ** const q = array;
for (n = num_elts; n > 1; ) {
- const size_t j = (size_t)(n-- * Drand01());
+ const size_t j = (size_t)(n-- * Perl_internal_drand48());
temp = q[j];
q[j] = q[n];
q[n] = temp;
diff --git a/util.c b/util.c
index 02c84c8..ef13e8b 100644
--- a/util.c
+++ b/util.c
@​@​ -4757,10 +4757,8 @​@​ Perl_get_hash_seed(pTHX_ unsigned char * const seed_buffer)
else
#endif
{
- (void)seedDrand01((Rand_seed_t)seed());
-
for( i = 0; i < PERL_HASH_SEED_BYTES; i++ ) {
- seed_buffer[i] = (unsigned char)(Drand01() * (U8_MAX+1));
+ seed_buffer[i] = (unsigned char)(Perl_internal_drand48() * (U8_MAX+1));
}
}
#ifdef USE_PERL_PERTURB_KEYS
diff --git a/util.h b/util.h
index 8f4171b..c71eefd 100644
--- a/util.h
+++ b/util.h
@​@​ -85,6 +85,12 @​@​ typedef struct PERL_DRAND48_T perl_drand48_t;
#define Perl_drand48_init(seed) (Perl_drand48_init_r(&PL_random_state, (seed)))
#define Perl_drand48() (Perl_drand48_r(&PL_random_state))

+#ifdef PERL_CORE
+/* uses a different source of randomness to avoid interfering with the results
+ * of rand() */
+#define Perl_internal_drand48() (Perl_drand48_r(&PL_internal_random_state))
+#endif
+
#ifdef USE_C_BACKTRACE

typedef struct {
--
2.1.4

I like this patch a lot. I can think of other uses of the new
Perl_internal_drand48() too.

I do have one hazy question. Is it right to do this in qsort()? I
wonder if a user might expect qsort() to be deterministic under
srand(). We have ways of overriding the randomness in the hash seed,
so should we not have a way to override or control the randomness in
something like qsort()? Perhaps we should have a way to set the seed
for the PL_internal_random_state from the env like we do for the hash
seed.

Yves

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented Dec 6, 2016

From @tonycoz

On Tue, 06 Dec 2016 00​:49​:20 -0800, ppisar wrote​:

On 2016-12-06, Tony Cook via RT <perlbug-followup@​perl.org> wrote​:

The attached patch attempts to fix this issue.

Thank you tackling this problem.

The implementation (beyond some TODO issues below) has at least one
problem - if the user code changes directory between the file open and
the close then the final clean-up stage is going to fail if the input
name wasn't an absolute path.

This might be fixable, but on some systems it may put the perl process
in a difficult to recover from position - if the system doesn't
implement getcwd() perl may change directory out of the original and not
have a way to return to it.

POSIX.1-2008 has renameat() that allows to specify a file by an opened
directory descriptor and a relative file name. I know it won't help you
if you need to deal with systems without getcwd(), but still it could
avoid some races when people replace directories with files the perl
works on.

getcwd() isn't enough to fix the possible issues.

With the current implementation (ie. without the patch in this ticket)
the current directory can be renamed, have its parent's permissions changed
to prevent access or even be removed, with the inode staying live because
it's the current directory.

Using the *at() functions (along with dirfd()) can fix this, but getcwd()
isn't enough.

I think it's valuable to implement, but it adds another variation to test,
so I've left it for now.

Tony

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented Dec 7, 2016

From @tonycoz

On Tue, 06 Dec 2016 02​:10​:28 -0800, demerphq wrote​:

On 6 December 2016 at 06​:05, Tony Cook via RT <perlbug-
followup@​perl.org> wrote​:

The attached patch attempts to fix this issue.

It also fixes an issue with nested in-place editing, where the inner
in-place edit could overwrite the permissions referenced for the
outer
edit, which are used to restore set[gu]id flags on the output file.

Tony

---
via perlbug​: queue​: perl5 status​: open
https://rt-archive.perl.org/perl5/Ticket/Display.html?id=127663

From ecdd0c8 Mon Sep 17 00​:00​:00
2001
From​: Tony Cook <tony@​develop-help.com>
Date​: Wed, 18 May 2016 15​:03​:14 +1000
Subject​: (perl #127663) create a separate random souce for internal
use

and use it to initialize hash randomization and to innoculate against
quadratic behaviour in pp_sort
---
embedvar.h | 1 +
intrpvar.h | 8 ++++++++
perl.c | 2 ++
pp_sort.c | 2 +-
util.c | 4 +---
util.h | 6 ++++++
6 files changed, 19 insertions(+), 4 deletions(-)

diff --git a/embedvar.h b/embedvar.h
index c413932..7588807 100644
--- a/embedvar.h
+++ b/embedvar.h
@​@​ -173,6 +173,7 @​@​
#define PL_incgv (vTHX->Iincgv)
#define PL_initav (vTHX->Iinitav)
#define PL_inplace (vTHX->Iinplace)
+#define PL_internal_random_state (vTHX-

Iinternal_random_state)
#define PL_isarev (vTHX->Iisarev)
#define PL_known_layers (vTHX->Iknown_layers)
#define PL_last_in_gv (vTHX->Ilast_in_gv)
diff --git a/intrpvar.h b/intrpvar.h
index 1aa94f7..532a458 100644
--- a/intrpvar.h
+++ b/intrpvar.h
@​@​ -810,6 +810,14 @​@​ PERLVAR(I, random_state, PL_RANDOM_STATE_TYPE)

PERLVARI(I, dump_re_max_len, STRLEN, 0)

+/* For internal uses of randomness, this ensures the sequence of
+ * random numbers returned by rand() isn't modified by perl's
internal
+ * use of randomness.
+ * This is important if the user has called srand() with a seed.
+ */
+
+PERLVAR(I, internal_random_state, PL_RANDOM_STATE_TYPE)
+
/* If you are adding a U8 or U16, check to see if there are 'Space'
comments
* above on where there are gaps which currently will be structure
padding. */

diff --git a/perl.c b/perl.c
index 3a647f7..dd67d4e 100644
--- a/perl.c
+++ b/perl.c
@​@​ -261,6 +261,8 @​@​ perl_construct(pTHXx)

init_constants();

+ Perl_drand48_init_r(&PL_internal_random_state, seed());
+
SvREADONLY_on(&PL_sv_placeholder);
SvREFCNT(&PL_sv_placeholder) = SvREFCNT_IMMORTAL;

diff --git a/pp_sort.c b/pp_sort.c
index 68e65f9..7aa44eb 100644
--- a/pp_sort.c
+++ b/pp_sort.c
@​@​ -787,7 +787,7 @​@​ S_qsortsvu(pTHX_ SV ** array, size_t num_elts,
SVCOMPARE_t compare)
size_t n;
SV ** const q = array;
for (n = num_elts; n > 1; ) {
- const size_t j = (size_t)(n-- * Drand01());
+ const size_t j = (size_t)(n-- * Perl_internal_drand48());
temp = q[j];
q[j] = q[n];
q[n] = temp;
diff --git a/util.c b/util.c
index 02c84c8..ef13e8b 100644
--- a/util.c
+++ b/util.c
@​@​ -4757,10 +4757,8 @​@​ Perl_get_hash_seed(pTHX_ unsigned char * const
seed_buffer)
else
#endif
{
- (void)seedDrand01((Rand_seed_t)seed());
-
for( i = 0; i < PERL_HASH_SEED_BYTES; i++ ) {
- seed_buffer[i] = (unsigned char)(Drand01() *
(U8_MAX+1));
+ seed_buffer[i] = (unsigned char)(Perl_internal_drand48()
* (U8_MAX+1));
}
}
#ifdef USE_PERL_PERTURB_KEYS
diff --git a/util.h b/util.h
index 8f4171b..c71eefd 100644
--- a/util.h
+++ b/util.h
@​@​ -85,6 +85,12 @​@​ typedef struct PERL_DRAND48_T perl_drand48_t;
#define Perl_drand48_init(seed)
(Perl_drand48_init_r(&PL_random_state, (seed)))
#define Perl_drand48() (Perl_drand48_r(&PL_random_state))

+#ifdef PERL_CORE
+/* uses a different source of randomness to avoid interfering with
the results
+ * of rand() */
+#define Perl_internal_drand48()
(Perl_drand48_r(&PL_internal_random_state))
+#endif
+
#ifdef USE_C_BACKTRACE

typedef struct {
--
2.1.4

I like this patch a lot. I can think of other uses of the new
Perl_internal_drand48() too.

Part of the impetus for adding it was your suggestion in #115928.

I do have one hazy question. Is it right to do this in qsort()? I
wonder if a user might expect qsort() to be deterministic under
srand(). We have ways of overriding the randomness in the hash seed,
so should we not have a way to override or control the randomness in
something like qsort()? Perhaps we should have a way to set the seed
for the PL_internal_random_state from the env like we do for the hash
seed.

The use of randomness in qsort() is sufficiently internal that I don't
see much point to providing a runtime mechanism like srand() to control
the internal randomness.

If a user does want such randomness they can do something like​:

  srand($some_number); # or not
  @​sorted = map $_->[0],
  sort { $a->[0] cmp $b->[1] || $a->[1] <=> $b->[1] }
  map [ $_, rand ], @​input;

An environment variable is suitable though, per the attached patch.

I noticed there doesn't seem to be a way to build perl to have hash seed
randomization but disable the PERL_HASH_SEED environment variable. Is that
deliberate?

Tony

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented Dec 7, 2016

From @tonycoz

0001-perl-127663-provide-limited-control-for-the-internal.patch
From 9eb4256cbc54e7d68ce05ebc227afe254f2876db Mon Sep 17 00:00:00 2001
From: Tony Cook <tony@develop-help.com>
Date: Wed, 7 Dec 2016 14:38:06 +1100
Subject: [PATCH] (perl #127663) provide limited control for the internal
 drand48()

perl can be built without PERL_INTERNAL_SEED support to reduce
it's attack surface.
---
 INSTALL         |  6 ++++++
 perl.c          | 29 +++++++++++++++++++++++++++++
 pod/perlrun.pod | 12 ++++++++++++
 3 files changed, 47 insertions(+)

diff --git a/INSTALL b/INSTALL
index 158b382..7267eb6 100644
--- a/INSTALL
+++ b/INSTALL
@@ -2685,6 +2685,12 @@ F<mathoms.c> will not be compiled in. Those functions are no longer used
 by perl itself; for source compatibility reasons, though, they weren't
 completely removed.
 
+=head2 C<-DNO_PERL_INTERNAL_SEED>
+X<PERL_INTERNAL_SEED>
+
+If you configure perl with C<-Accflags=-DNO_PERL_INTERNAL_SEED>, perl
+will ignore the C<PERL_INTERNAL_SEED> enviroment variable.
+
 =head1 DOCUMENTATION
 
 Read the manual entries before running perl.  The main documentation
diff --git a/perl.c b/perl.c
index 6ff0e43..16dc2b6 100644
--- a/perl.c
+++ b/perl.c
@@ -261,7 +261,21 @@ perl_construct(pTHXx)
 
     init_constants();
 
+#ifdef NO_PERL_INTERNAL_SEED
     Perl_drand48_init_r(&PL_internal_random_state, seed());
+#else
+    {
+        UV seed;
+        const char *env_pv;
+        if (PerlProc_getuid() != PerlProc_geteuid() ||
+            PerlProc_getgid() != PerlProc_getegid() ||
+            !(env_pv = PerlEnv_getenv("PERL_INTERNAL_SEED")) ||
+            grok_number(env_pv, strlen(env_pv), &seed) != IS_NUMBER_IN_UV) {
+            seed = seed();
+        }
+        Perl_drand48_init_r(&PL_internal_random_state, (U32)seed);
+    }
+#endif
 
     SvREADONLY_on(&PL_sv_placeholder);
     SvREFCNT(&PL_sv_placeholder) = SvREFCNT_IMMORTAL;
@@ -2159,6 +2173,21 @@ S_parse_body(pTHX_ char **env, XSINIT_t xsinit)
     }
     }
 
+#ifndef NO_PERL_INTERNAL_SEED
+    /* If we're not set[ug]id, we might have honored
+       PERL_INTERNAL_SEED in perl_construct().
+       At this point command-line options have been parsed, so if
+       we're now tainting and not set[ug]id re-seed.
+       This could possibly be wasteful if PERL_INTERNAL_SEED is invalid,
+       but avoids duplicating the logic from perl_construct().
+    */
+    if (PL_tainting &&
+        PerlProc_getuid() == PerlProc_geteuid() &&
+        PerlProc_getgid() == PerlProc_getegid()) {
+        Perl_drand48_init_r(&PL_internal_random_state, seed());
+    }
+#endif
+
     /* Set $^X early so that it can be used for relocatable paths in @INC  */
     /* and for SITELIB_EXP in USE_SITECUSTOMIZE                            */
     assert (!TAINT_get);
diff --git a/pod/perlrun.pod b/pod/perlrun.pod
index 9d59a6a..d92c899 100644
--- a/pod/perlrun.pod
+++ b/pod/perlrun.pod
@@ -1384,6 +1384,18 @@ X<SYS$LOGIN>
 
 Used if chdir has no argument and HOME and LOGDIR are not set.
 
+=item PERL_INTERNAL_SEED
+X<PERL_INTERNAL_SEED>
+
+Set to a non-negative integer to seed the random number generator used
+internally by perl for a variety of purposes.
+
+Ignored if perl is run setuid or setgid.  Used only for some limited
+startup randomization (hash keys) if C<-T> or C<-t> perl is started
+with tainting enabled.
+
+Perl may be built to ignore this variable.
+
 =back
 
 Perl also has environment variables that control how Perl handles data
-- 
2.1.4

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented Dec 7, 2016

From @demerphq

On 7 December 2016 at 04​:43, Tony Cook via RT <perlbug-followup@​perl.org> wrote​:

On Tue, 06 Dec 2016 02​:10​:28 -0800, demerphq wrote​:

On 6 December 2016 at 06​:05, Tony Cook via RT <perlbug-
followup@​perl.org> wrote​:

The attached patch attempts to fix this issue.

It also fixes an issue with nested in-place editing, where the inner
in-place edit could overwrite the permissions referenced for the
outer
edit, which are used to restore set[gu]id flags on the output file.

Tony

---
via perlbug​: queue​: perl5 status​: open
https://rt-archive.perl.org/perl5/Ticket/Display.html?id=127663

From ecdd0c8 Mon Sep 17 00​:00​:00
2001
From​: Tony Cook <tony@​develop-help.com>
Date​: Wed, 18 May 2016 15​:03​:14 +1000
Subject​: (perl #127663) create a separate random souce for internal
use

and use it to initialize hash randomization and to innoculate against
quadratic behaviour in pp_sort
---
embedvar.h | 1 +
intrpvar.h | 8 ++++++++
perl.c | 2 ++
pp_sort.c | 2 +-
util.c | 4 +---
util.h | 6 ++++++
6 files changed, 19 insertions(+), 4 deletions(-)

diff --git a/embedvar.h b/embedvar.h
index c413932..7588807 100644
--- a/embedvar.h
+++ b/embedvar.h
@​@​ -173,6 +173,7 @​@​
#define PL_incgv (vTHX->Iincgv)
#define PL_initav (vTHX->Iinitav)
#define PL_inplace (vTHX->Iinplace)
+#define PL_internal_random_state (vTHX-

Iinternal_random_state)
#define PL_isarev (vTHX->Iisarev)
#define PL_known_layers (vTHX->Iknown_layers)
#define PL_last_in_gv (vTHX->Ilast_in_gv)
diff --git a/intrpvar.h b/intrpvar.h
index 1aa94f7..532a458 100644
--- a/intrpvar.h
+++ b/intrpvar.h
@​@​ -810,6 +810,14 @​@​ PERLVAR(I, random_state, PL_RANDOM_STATE_TYPE)

PERLVARI(I, dump_re_max_len, STRLEN, 0)

+/* For internal uses of randomness, this ensures the sequence of
+ * random numbers returned by rand() isn't modified by perl's
internal
+ * use of randomness.
+ * This is important if the user has called srand() with a seed.
+ */
+
+PERLVAR(I, internal_random_state, PL_RANDOM_STATE_TYPE)
+
/* If you are adding a U8 or U16, check to see if there are 'Space'
comments
* above on where there are gaps which currently will be structure
padding. */

diff --git a/perl.c b/perl.c
index 3a647f7..dd67d4e 100644
--- a/perl.c
+++ b/perl.c
@​@​ -261,6 +261,8 @​@​ perl_construct(pTHXx)

init_constants();

+ Perl_drand48_init_r(&PL_internal_random_state, seed());
+
SvREADONLY_on(&PL_sv_placeholder);
SvREFCNT(&PL_sv_placeholder) = SvREFCNT_IMMORTAL;

diff --git a/pp_sort.c b/pp_sort.c
index 68e65f9..7aa44eb 100644
--- a/pp_sort.c
+++ b/pp_sort.c
@​@​ -787,7 +787,7 @​@​ S_qsortsvu(pTHX_ SV ** array, size_t num_elts,
SVCOMPARE_t compare)
size_t n;
SV ** const q = array;
for (n = num_elts; n > 1; ) {
- const size_t j = (size_t)(n-- * Drand01());
+ const size_t j = (size_t)(n-- * Perl_internal_drand48());
temp = q[j];
q[j] = q[n];
q[n] = temp;
diff --git a/util.c b/util.c
index 02c84c8..ef13e8b 100644
--- a/util.c
+++ b/util.c
@​@​ -4757,10 +4757,8 @​@​ Perl_get_hash_seed(pTHX_ unsigned char * const
seed_buffer)
else
#endif
{
- (void)seedDrand01((Rand_seed_t)seed());
-
for( i = 0; i < PERL_HASH_SEED_BYTES; i++ ) {
- seed_buffer[i] = (unsigned char)(Drand01() *
(U8_MAX+1));
+ seed_buffer[i] = (unsigned char)(Perl_internal_drand48()
* (U8_MAX+1));
}
}
#ifdef USE_PERL_PERTURB_KEYS
diff --git a/util.h b/util.h
index 8f4171b..c71eefd 100644
--- a/util.h
+++ b/util.h
@​@​ -85,6 +85,12 @​@​ typedef struct PERL_DRAND48_T perl_drand48_t;
#define Perl_drand48_init(seed)
(Perl_drand48_init_r(&PL_random_state, (seed)))
#define Perl_drand48() (Perl_drand48_r(&PL_random_state))

+#ifdef PERL_CORE
+/* uses a different source of randomness to avoid interfering with
the results
+ * of rand() */
+#define Perl_internal_drand48()
(Perl_drand48_r(&PL_internal_random_state))
+#endif
+
#ifdef USE_C_BACKTRACE

typedef struct {
--
2.1.4

I like this patch a lot. I can think of other uses of the new
Perl_internal_drand48() too.

Part of the impetus for adding it was your suggestion in #115928.

Oh, cool. Thanks. :-)

I do have one hazy question. Is it right to do this in qsort()? I
wonder if a user might expect qsort() to be deterministic under
srand(). We have ways of overriding the randomness in the hash seed,
so should we not have a way to override or control the randomness in
something like qsort()? Perhaps we should have a way to set the seed
for the PL_internal_random_state from the env like we do for the hash
seed.

The use of randomness in qsort() is sufficiently internal that I don't
see much point to providing a runtime mechanism like srand() to control
the internal randomness.

Oh sorry, I didn't mean runtime ala srand, I meant at-startup ala
PERL_HASH_SEED.

If a user does want such randomness they can do something like​:

srand($some_number); # or not
@​sorted = map $_->[0],
sort { $a->[0] cmp $b->[1] || $a->[1] <=> $b->[1] }
map [ $_, rand ], @​input;

Sure. But I was more thinking of being able to run perl in a mode
where you can exactly replicate its behavior as a whole.

An environment variable is suitable though, per the attached patch.

Yes I like.

I noticed there doesn't seem to be a way to build perl to have hash seed
randomization but disable the PERL_HASH_SEED environment variable. Is that
deliberate?

No, an oversight. We should probably have build options to disable
setting the seed from the env, and also probably one to disable
PERL_HASH_SEED_DEBUG. I think at least one vendor hides some or all
data from that, we might as well make that easy for all.

Anyway I like the patch(es) with one slightly bikeshedding comment
that maybe PERL_INTERNAL_SEED should be PERL_INTERNAL_RAND_SEED, or
something? As is, it is a bit ambiguous what the SEED is for.

Anyway, nice stuff. Thanks for doing this.
yves

Tony

---
via perlbug​: queue​: perl5 status​: open
https://rt-archive.perl.org/perl5/Ticket/Display.html?id=127663

From 9eb4256cbc54e7d68ce05ebc227afe254f2876db Mon Sep 17 00​:00​:00 2001
From​: Tony Cook <tony@​develop-help.com>
Date​: Wed, 7 Dec 2016 14​:38​:06 +1100
Subject​: [PATCH] (perl #127663) provide limited control for the internal
drand48()

perl can be built without PERL_INTERNAL_SEED support to reduce
it's attack surface.
---
INSTALL | 6 ++++++
perl.c | 29 +++++++++++++++++++++++++++++
pod/perlrun.pod | 12 ++++++++++++
3 files changed, 47 insertions(+)

diff --git a/INSTALL b/INSTALL
index 158b382..7267eb6 100644
--- a/INSTALL
+++ b/INSTALL
@​@​ -2685,6 +2685,12 @​@​ F<mathoms.c> will not be compiled in. Those functions are no longer used
by perl itself; for source compatibility reasons, though, they weren't
completely removed.

+=head2 C<-DNO_PERL_INTERNAL_SEED>
+X<PERL_INTERNAL_SEED>
+
+If you configure perl with C<-Accflags=-DNO_PERL_INTERNAL_SEED>, perl
+will ignore the C<PERL_INTERNAL_SEED> enviroment variable.
+
=head1 DOCUMENTATION

Read the manual entries before running perl. The main documentation
diff --git a/perl.c b/perl.c
index 6ff0e43..16dc2b6 100644
--- a/perl.c
+++ b/perl.c
@​@​ -261,7 +261,21 @​@​ perl_construct(pTHXx)

 init\_constants\(\);

+#ifdef NO_PERL_INTERNAL_SEED
Perl_drand48_init_r(&PL_internal_random_state, seed());
+#else
+ {
+ UV seed;
+ const char *env_pv;
+ if (PerlProc_getuid() != PerlProc_geteuid() ||
+ PerlProc_getgid() != PerlProc_getegid() ||
+ !(env_pv = PerlEnv_getenv("PERL_INTERNAL_SEED")) ||
+ grok_number(env_pv, strlen(env_pv), &seed) != IS_NUMBER_IN_UV) {
+ seed = seed();
+ }
+ Perl_drand48_init_r(&PL_internal_random_state, (U32)seed);
+ }
+#endif

 SvREADONLY\_on\(&PL\_sv\_placeholder\);
 SvREFCNT\(&PL\_sv\_placeholder\) = SvREFCNT\_IMMORTAL;

@​@​ -2159,6 +2173,21 @​@​ S_parse_body(pTHX_ char **env, XSINIT_t xsinit)
}
}

+#ifndef NO_PERL_INTERNAL_SEED
+ /* If we're not set[ug]id, we might have honored
+ PERL_INTERNAL_SEED in perl_construct().
+ At this point command-line options have been parsed, so if
+ we're now tainting and not set[ug]id re-seed.
+ This could possibly be wasteful if PERL_INTERNAL_SEED is invalid,
+ but avoids duplicating the logic from perl_construct().
+ */
+ if (PL_tainting &&
+ PerlProc_getuid() == PerlProc_geteuid() &&
+ PerlProc_getgid() == PerlProc_getegid()) {
+ Perl_drand48_init_r(&PL_internal_random_state, seed());
+ }
+#endif
+
/* Set $^X early so that it can be used for relocatable paths in @​INC */
/* and for SITELIB_EXP in USE_SITECUSTOMIZE */
assert (!TAINT_get);
diff --git a/pod/perlrun.pod b/pod/perlrun.pod
index 9d59a6a..d92c899 100644
--- a/pod/perlrun.pod
+++ b/pod/perlrun.pod
@​@​ -1384,6 +1384,18 @​@​ X<SYS$LOGIN>

Used if chdir has no argument and HOME and LOGDIR are not set.

+=item PERL_INTERNAL_SEED
+X<PERL_INTERNAL_SEED>
+
+Set to a non-negative integer to seed the random number generator used
+internally by perl for a variety of purposes.
+
+Ignored if perl is run setuid or setgid. Used only for some limited
+startup randomization (hash keys) if C<-T> or C<-t> perl is started
+with tainting enabled.
+
+Perl may be built to ignore this variable.
+
=back

Perl also has environment variables that control how Perl handles data
--
2.1.4

--
perl -Mre=debug -e "/just|another|perl|hacker/"

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented Dec 8, 2016

From @tonycoz

On Wed, 07 Dec 2016 01​:01​:51 -0800, demerphq wrote​:

No, an oversight. We should probably have build options to disable
setting the seed from the env, and also probably one to disable
PERL_HASH_SEED_DEBUG. I think at least one vendor hides some or all
data from that, we might as well make that easy for all.

Patch attached.

Anyway I like the patch(es) with one slightly bikeshedding comment
that maybe PERL_INTERNAL_SEED should be PERL_INTERNAL_RAND_SEED, or
something? As is, it is a bit ambiguous what the SEED is for.

Modified patch attached.

Tony

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented Dec 8, 2016

From @tonycoz

0001-add-build-options-to-disable-the-PERL_HASH-and-PERL_.patch
From 8a3a5768f9c76a0c300645855725ea8553c4b1a3 Mon Sep 17 00:00:00 2001
From: Tony Cook <tony@develop-help.com>
Date: Thu, 8 Dec 2016 14:14:11 +1100
Subject: add build options to disable the PERL_HASH* and PERL_PERTURB_KEYS env
 vars

These variables either control or reveal information used in perl's
hash implementation that a careful user may not want controlled or
exposed.
---
 INSTALL        |   7 +++
 perl.c         |   4 +-
 t/run/runenv.t | 145 +++++++++++++++++++++++++++++++--------------------------
 util.c         |  12 +++--
 4 files changed, 97 insertions(+), 71 deletions(-)

diff --git a/INSTALL b/INSTALL
index 158b382..a5b1d48 100644
--- a/INSTALL
+++ b/INSTALL
@@ -423,6 +423,13 @@ See L<perlrun/PERL_HASH_SEED> and L<perlrun/PERL_PERTURB_KEYS> for
 details on the environment variables, and L<perlsec/Algorithmic
 Complexity Attacks> for further security details.
 
+The C<PERL_HASH_SEED> and PERL_PERTURB_KEYS> environment variables can
+be disabled by building configuring perl with
+C<-Accflags=-DNO_PERL_HASH_ENV>.
+
+The C<PERL_HASH_SEED_DEBUG> environment variable can be disabled by
+configuring perl with C<-Accflags=-DNO_PERL_HASH_SEED_DEBUG>.
+
 =head3 SOCKS
 
 Perl can be configured to be 'socksified', that is, to use the SOCKS
diff --git a/perl.c b/perl.c
index 3a647f7..0d12759 100644
--- a/perl.c
+++ b/perl.c
@@ -1535,7 +1535,7 @@ perl_parse(pTHXx_ XSINIT_t xsinit, int argc, char **argv, char **env)
 #ifndef MULTIPLICITY
     PERL_UNUSED_ARG(my_perl);
 #endif
-#if defined(USE_HASH_SEED) || defined(USE_HASH_SEED_EXPLICIT) || defined(USE_HASH_SEED_DEBUG)
+#if (defined(USE_HASH_SEED) || defined(USE_HASH_SEED_EXPLICIT) || defined(USE_HASH_SEED_DEBUG)) && !defined(NO_PERL_HASH_SEED_DEBUG)
     {
         const char * const s = PerlEnv_getenv("PERL_HASH_SEED_DEBUG");
 
@@ -1554,7 +1554,7 @@ perl_parse(pTHXx_ XSINIT_t xsinit, int argc, char **argv, char **env)
             PerlIO_printf(Perl_debug_log, "\n");
         }
     }
-#endif /* #if defined(USE_HASH_SEED) || defined(USE_HASH_SEED_EXPLICIT) */
+#endif /* #if (defined(USE_HASH_SEED) || defined(USE_HASH_SEED_EXPLICIT) ... */
 
 #ifdef __amigaos4__
     {
diff --git a/t/run/runenv.t b/t/run/runenv.t
index 6f235d2..611e012 100644
--- a/t/run/runenv.t
+++ b/t/run/runenv.t
@@ -204,74 +204,87 @@ try({PERL5LIB => "foo",
     '',
     '');
 
-try({PERL_HASH_SEED_DEBUG => 1},
-    ['-e','1'],
-    '',
-    qr/HASH_FUNCTION =/);
-
-try({PERL_HASH_SEED_DEBUG => 1},
-    ['-e','1'],
-    '',
-    qr/HASH_SEED =/);
-
-# special case, seed "0" implies disabled hash key traversal randomization
-try({PERL_HASH_SEED_DEBUG => 1, PERL_HASH_SEED => "0"},
-    ['-e','1'],
-    '',
-    qr/PERTURB_KEYS = 0/);
-
-# check that setting it to a different value with the same logical value
-# triggers the normal "deterministic mode".
-try({PERL_HASH_SEED_DEBUG => 1, PERL_HASH_SEED => "0x0"},
-    ['-e','1'],
-    '',
-    qr/PERTURB_KEYS = 2/);
-
-try({PERL_HASH_SEED_DEBUG => 1, PERL_PERTURB_KEYS => "0"},
-    ['-e','1'],
-    '',
-    qr/PERTURB_KEYS = 0/);
-
-try({PERL_HASH_SEED_DEBUG => 1, PERL_PERTURB_KEYS => "1"},
-    ['-e','1'],
-    '',
-    qr/PERTURB_KEYS = 1/);
-
-try({PERL_HASH_SEED_DEBUG => 1, PERL_PERTURB_KEYS => "2"},
-    ['-e','1'],
-    '',
-    qr/PERTURB_KEYS = 2/);
-
-try({PERL_HASH_SEED_DEBUG => 1, PERL_HASH_SEED => "12345678"},
-    ['-e','1'],
-    '',
-    qr/HASH_SEED = 0x12345678/);
-
-try({PERL_HASH_SEED_DEBUG => 1, PERL_HASH_SEED => "12"},
-    ['-e','1'],
-    '',
-    qr/HASH_SEED = 0x12000000/);
+SKIP:
+{
+    skip "NO_PERL_HASH_SEED_DEBUG set", 4
+      if $Config{ccflags} =~ /-DNO_PERL_HASH_SEED_DEBUG\b/;
+
+    try({PERL_HASH_SEED_DEBUG => 1},
+        ['-e','1'],
+        '',
+        qr/HASH_FUNCTION =/);
+
+    try({PERL_HASH_SEED_DEBUG => 1},
+        ['-e','1'],
+        '',
+        qr/HASH_SEED =/);
+}
 
-try({PERL_HASH_SEED_DEBUG => 1, PERL_HASH_SEED => "123456789"},
-    ['-e','1'],
-    '',
-    qr/HASH_SEED = 0x12345678/);
-
-# Test that PERL_PERTURB_KEYS works as expected.  We check that we get the same
-# results if we use PERL_PERTURB_KEYS = 0 or 2 and we reuse the seed from previous run.
-my @print_keys = ( '-e', '@_{"A".."Z"}=(); print keys %_');
-for my $mode ( 0,1, 2 ) { # disabled and deterministic respectively
-    my %base_opts = ( PERL_PERTURB_KEYS => $mode, PERL_HASH_SEED_DEBUG => 1 ),
-    my ($out, $err) = runperl_and_capture( { %base_opts }, [ @print_keys ]);
-    if ($err=~/HASH_SEED = (0x[a-f0-9]+)/) {
-        my $seed = $1;
-        my($out2, $err2) = runperl_and_capture( { %base_opts, PERL_HASH_SEED => $seed }, [ @print_keys ]);
-        if ( $mode == 1 ) {
-            isnt ($out,$out2,"PERL_PERTURB_KEYS = $mode results in different key order with the same key");
-        } else {
-            is ($out,$out2,"PERL_PERTURB_KEYS = $mode allows one to recreate a random hash");
+SKIP:
+{
+    skip "NO_PERL_HASH_ENV or NO_PERL_HASH_SEED_DEBUG set", 16
+      if $Config{ccflags} =~ /-DNO_PERL_HASH_ENV\b/ ||
+         $Config{ccflags} =~ /-DNO_PERL_HASH_SEED_DEBUG\b/;
+
+    # special case, seed "0" implies disabled hash key traversal randomization
+    try({PERL_HASH_SEED_DEBUG => 1, PERL_HASH_SEED => "0"},
+        ['-e','1'],
+        '',
+        qr/PERTURB_KEYS = 0/);
+
+    # check that setting it to a different value with the same logical value
+    # triggers the normal "deterministic mode".
+    try({PERL_HASH_SEED_DEBUG => 1, PERL_HASH_SEED => "0x0"},
+        ['-e','1'],
+        '',
+        qr/PERTURB_KEYS = 2/);
+
+    try({PERL_HASH_SEED_DEBUG => 1, PERL_PERTURB_KEYS => "0"},
+        ['-e','1'],
+        '',
+        qr/PERTURB_KEYS = 0/);
+
+    try({PERL_HASH_SEED_DEBUG => 1, PERL_PERTURB_KEYS => "1"},
+        ['-e','1'],
+        '',
+        qr/PERTURB_KEYS = 1/);
+
+    try({PERL_HASH_SEED_DEBUG => 1, PERL_PERTURB_KEYS => "2"},
+        ['-e','1'],
+        '',
+        qr/PERTURB_KEYS = 2/);
+
+    try({PERL_HASH_SEED_DEBUG => 1, PERL_HASH_SEED => "12345678"},
+        ['-e','1'],
+        '',
+        qr/HASH_SEED = 0x12345678/);
+
+    try({PERL_HASH_SEED_DEBUG => 1, PERL_HASH_SEED => "12"},
+        ['-e','1'],
+        '',
+        qr/HASH_SEED = 0x12000000/);
+
+    try({PERL_HASH_SEED_DEBUG => 1, PERL_HASH_SEED => "123456789"},
+        ['-e','1'],
+        '',
+        qr/HASH_SEED = 0x12345678/);
+
+    # Test that PERL_PERTURB_KEYS works as expected.  We check that we get the same
+    # results if we use PERL_PERTURB_KEYS = 0 or 2 and we reuse the seed from previous run.
+    my @print_keys = ( '-e', '@_{"A".."Z"}=(); print keys %_');
+    for my $mode ( 0,1, 2 ) { # disabled and deterministic respectively
+        my %base_opts = ( PERL_PERTURB_KEYS => $mode, PERL_HASH_SEED_DEBUG => 1 ),
+          my ($out, $err) = runperl_and_capture( { %base_opts }, [ @print_keys ]);
+        if ($err=~/HASH_SEED = (0x[a-f0-9]+)/) {
+            my $seed = $1;
+            my($out2, $err2) = runperl_and_capture( { %base_opts, PERL_HASH_SEED => $seed }, [ @print_keys ]);
+            if ( $mode == 1 ) {
+                isnt ($out,$out2,"PERL_PERTURB_KEYS = $mode results in different key order with the same key");
+            } else {
+                is ($out,$out2,"PERL_PERTURB_KEYS = $mode allows one to recreate a random hash");
+            }
+            is ($err,$err2,"Got the same debug output when we set PERL_HASH_SEED and PERL_PERTURB_KEYS");
         }
-        is ($err,$err2,"Got the same debug output when we set PERL_HASH_SEED and PERL_PERTURB_KEYS");
     }
 }
 
diff --git a/util.c b/util.c
index 02c84c8..a1306c6 100644
--- a/util.c
+++ b/util.c
@@ -4712,20 +4712,23 @@ Perl_seed(pTHX)
 void
 Perl_get_hash_seed(pTHX_ unsigned char * const seed_buffer)
 {
+#ifndef NO_PERL_HASH_ENV
     const char *env_pv;
+#endif
     unsigned long i;
 
     PERL_ARGS_ASSERT_GET_HASH_SEED;
 
+#ifndef NO_PERL_HASH_ENV
     env_pv= PerlEnv_getenv("PERL_HASH_SEED");
 
     if ( env_pv )
-#ifndef USE_HASH_SEED_EXPLICIT
+#  ifndef USE_HASH_SEED_EXPLICIT
     {
         /* ignore leading spaces */
         while (isSPACE(*env_pv))
             env_pv++;
-#ifdef USE_PERL_PERTURB_KEYS
+#    ifdef USE_PERL_PERTURB_KEYS
         /* if they set it to "0" we disable key traversal randomization completely */
         if (strEQ(env_pv,"0")) {
             PL_hash_rand_bits_enabled= 0;
@@ -4733,7 +4736,7 @@ Perl_get_hash_seed(pTHX_ unsigned char * const seed_buffer)
             /* otherwise switch to deterministic mode */
             PL_hash_rand_bits_enabled= 2;
         }
-#endif
+#    endif
         /* ignore a leading 0x... if it is there */
         if (env_pv[0] == '0' && env_pv[1] == 'x')
             env_pv += 2;
@@ -4755,6 +4758,7 @@ Perl_get_hash_seed(pTHX_ unsigned char * const seed_buffer)
         /* should we warn about insufficient hex? */
     }
     else
+#  endif
 #endif
     {
         (void)seedDrand01((Rand_seed_t)seed());
@@ -4774,6 +4778,7 @@ Perl_get_hash_seed(pTHX_ unsigned char * const seed_buffer)
             PL_hash_rand_bits = ROTL_UV(PL_hash_rand_bits,8);
         }
     }
+#  ifndef NO_PERL_HASH_ENV
     env_pv= PerlEnv_getenv("PERL_PERTURB_KEYS");
     if (env_pv) {
         if (strEQ(env_pv,"0") || strEQ(env_pv,"NO")) {
@@ -4786,6 +4791,7 @@ Perl_get_hash_seed(pTHX_ unsigned char * const seed_buffer)
             Perl_warn(aTHX_ "perl: warning: strange setting in '$ENV{PERL_PERTURB_KEYS}': '%s'\n", env_pv);
         }
     }
+#  endif
 #endif
 }
 
-- 
2.1.4

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented Dec 8, 2016

From @tonycoz

0001-perl-127663-provide-limited-control-for-the-internal.patch
From 5bd0a3f37a303e2f4b2d8add4c4beb64a7a363a2 Mon Sep 17 00:00:00 2001
From: Tony Cook <tony@develop-help.com>
Date: Thu, 8 Dec 2016 09:38:55 +1100
Subject: (perl #127663) provide limited control for the internal drand48()

perl can be built without PERL_INTERNAL_RAND_SEED support to reduce
it's attack surface.
---
 INSTALL         |  6 ++++++
 perl.c          | 29 +++++++++++++++++++++++++++++
 pod/perlrun.pod | 12 ++++++++++++
 3 files changed, 47 insertions(+)

diff --git a/INSTALL b/INSTALL
index 158b382..7220911 100644
--- a/INSTALL
+++ b/INSTALL
@@ -2685,6 +2685,12 @@ F<mathoms.c> will not be compiled in. Those functions are no longer used
 by perl itself; for source compatibility reasons, though, they weren't
 completely removed.
 
+=head2 C<-DNO_PERL_INTERNAL_RAND_SEED>
+X<PERL_INTERNAL_RAND_SEED>
+
+If you configure perl with C<-Accflags=-DNO_PERL_INTERNAL_RAND_SEED>,
+perl will ignore the C<PERL_INTERNAL_RAND_SEED> enviroment variable.
+
 =head1 DOCUMENTATION
 
 Read the manual entries before running perl.  The main documentation
diff --git a/perl.c b/perl.c
index 6ff0e43..b5be5e2 100644
--- a/perl.c
+++ b/perl.c
@@ -261,7 +261,21 @@ perl_construct(pTHXx)
 
     init_constants();
 
+#ifdef NO_PERL_INTERNAL_RAND_SEED
     Perl_drand48_init_r(&PL_internal_random_state, seed());
+#else
+    {
+        UV seed;
+        const char *env_pv;
+        if (PerlProc_getuid() != PerlProc_geteuid() ||
+            PerlProc_getgid() != PerlProc_getegid() ||
+            !(env_pv = PerlEnv_getenv("PERL_INTERNAL_RAND_SEED")) ||
+            grok_number(env_pv, strlen(env_pv), &seed) != IS_NUMBER_IN_UV) {
+            seed = seed();
+        }
+        Perl_drand48_init_r(&PL_internal_random_state, (U32)seed);
+    }
+#endif
 
     SvREADONLY_on(&PL_sv_placeholder);
     SvREFCNT(&PL_sv_placeholder) = SvREFCNT_IMMORTAL;
@@ -2159,6 +2173,21 @@ S_parse_body(pTHX_ char **env, XSINIT_t xsinit)
     }
     }
 
+#ifndef NO_PERL_INTERNAL_RAND_SEED
+    /* If we're not set[ug]id, we might have honored
+       PERL_INTERNAL_RAND_SEED in perl_construct().
+       At this point command-line options have been parsed, so if
+       we're now tainting and not set[ug]id re-seed.
+       This could possibly be wasteful if PERL_INTERNAL_RAND_SEED is invalid,
+       but avoids duplicating the logic from perl_construct().
+    */
+    if (PL_tainting &&
+        PerlProc_getuid() == PerlProc_geteuid() &&
+        PerlProc_getgid() == PerlProc_getegid()) {
+        Perl_drand48_init_r(&PL_internal_random_state, seed());
+    }
+#endif
+
     /* Set $^X early so that it can be used for relocatable paths in @INC  */
     /* and for SITELIB_EXP in USE_SITECUSTOMIZE                            */
     assert (!TAINT_get);
diff --git a/pod/perlrun.pod b/pod/perlrun.pod
index 9d59a6a..7382aad 100644
--- a/pod/perlrun.pod
+++ b/pod/perlrun.pod
@@ -1384,6 +1384,18 @@ X<SYS$LOGIN>
 
 Used if chdir has no argument and HOME and LOGDIR are not set.
 
+=item PERL_INTERNAL_RAND_SEED
+X<PERL_INTERNAL_RAND_SEED>
+
+Set to a non-negative integer to seed the random number generator used
+internally by perl for a variety of purposes.
+
+Ignored if perl is run setuid or setgid.  Used only for some limited
+startup randomization (hash keys) if C<-T> or C<-t> perl is started
+with tainting enabled.
+
+Perl may be built to ignore this variable.
+
 =back
 
 Perl also has environment variables that control how Perl handles data
-- 
2.1.4

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented Jan 9, 2017

From @tonycoz

On Wed, 07 Dec 2016 19​:26​:42 -0800, tonyc wrote​:

On Wed, 07 Dec 2016 01​:01​:51 -0800, demerphq wrote​:

No, an oversight. We should probably have build options to disable
setting the seed from the env, and also probably one to disable
PERL_HASH_SEED_DEBUG. I think at least one vendor hides some or all
data from that, we might as well make that easy for all.

Patch attached.

Applied as 95309d6.

Tony

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented Jan 12, 2017

From @tonycoz

On Tue, 06 Dec 2016 15​:25​:18 -0800, tonyc wrote​:

Using the *at() functions (along with dirfd()) can fix this, but getcwd()
isn't enough.

I think it's valuable to implement, but it adds another variation to test,
so I've left it for now.

Here's an updated patch set, this includes a number of enhancements​:

- support for using renameat() etc on platforms that support them to avoid problems with changing directory in the inplace edit loop.

- use symbolic contants for the AV kept in magic

- provide some limited control over the internal rand() per the earlier discussion

and some fixes​:

- don't do the close processing in child threads, since this could result in multiple renames of the work file to the output file (one of which would fail) and avoid double-closedir()ing the DIR for the *at() version of the code

- don't do the close processing in child processes, to avoid double-renaming as above.

- add some cleanup for the tests

Tony

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented Jan 12, 2017

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented Sep 4, 2017

From @tonycoz

On Thu, 12 Jan 2017 15​:55​:24 -0800, tonyc wrote​:

On Tue, 06 Dec 2016 15​:25​:18 -0800, tonyc wrote​:

Using the *at() functions (along with dirfd()) can fix this, but
getcwd()
isn't enough.

I think it's valuable to implement, but it adds another variation to
test,
so I've left it for now.

Here's an updated patch set, this includes a number of enhancements​:

- support for using renameat() etc on platforms that support them to
avoid problems with changing directory in the inplace edit loop.

- use symbolic contants for the AV kept in magic

- provide some limited control over the internal rand() per the
earlier discussion

and some fixes​:

- don't do the close processing in child threads, since this could
result in multiple renames of the work file to the output file (one of
which would fail) and avoid double-closedir()ing the DIR for the *at()
version of the code

- don't do the close processing in child processes, to avoid double-
renaming as above.

- add some cleanup for the tests

This, hopefully final, patch set also​:

- if the *at() functions aren't available, and the names are relative, fail early if the current directory has changed.

I plan to apply this in a week or so unless someone objects.

Tony

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented Sep 4, 2017

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented Sep 11, 2017

From @tonycoz

On Sun, 03 Sep 2017 17​:50​:02 -0700, tonyc wrote​:

On Thu, 12 Jan 2017 15​:55​:24 -0800, tonyc wrote​:

On Tue, 06 Dec 2016 15​:25​:18 -0800, tonyc wrote​:

Using the *at() functions (along with dirfd()) can fix this, but
getcwd()
isn't enough.

I think it's valuable to implement, but it adds another variation
to
test,
so I've left it for now.

Here's an updated patch set, this includes a number of enhancements​:

- support for using renameat() etc on platforms that support them to
avoid problems with changing directory in the inplace edit loop.

- use symbolic contants for the AV kept in magic

- provide some limited control over the internal rand() per the
earlier discussion

and some fixes​:

- don't do the close processing in child threads, since this could
result in multiple renames of the work file to the output file (one
of
which would fail) and avoid double-closedir()ing the DIR for the
*at()
version of the code

- don't do the close processing in child processes, to avoid double-
renaming as above.

- add some cleanup for the tests

This, hopefully final, patch set also​:

- if the *at() functions aren't available, and the names are relative,
fail early if the current directory has changed.

I plan to apply this in a week or so unless someone objects.

Applied as merge commit 9c6681c.

Leaving this open a bit for any breakages.

Tony

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented Sep 12, 2017

From @tonycoz

On Sun, 10 Sep 2017 22​:43​:05 -0700, tonyc wrote​:

Applied as merge commit 9c6681c.

Leaving this open a bit for any breakages.

Reported as broken on FreeBSD​:

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=222258

possibly due to a FreeBSD bug, but from the description and what I understand of the FreeBSD code this might occur in a container where rename() would work.

Tony

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented Sep 18, 2017

From @tonycoz

On Tue, 12 Sep 2017 16​:56​:55 -0700, tonyc wrote​:

On Sun, 10 Sep 2017 22​:43​:05 -0700, tonyc wrote​:

Applied as merge commit 9c6681c.

Leaving this open a bit for any breakages.

Reported as broken on FreeBSD​:

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=222258

possibly due to a FreeBSD bug, but from the description and what I
understand of the FreeBSD code this might occur in a container where
rename() would work.

Added a workaround in 84dbe61.

Tony

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented Nov 19, 2018

From @jkeenan

On Mon, 18 Sep 2017 01​:26​:24 GMT, tonyc wrote​:

On Tue, 12 Sep 2017 16​:56​:55 -0700, tonyc wrote​:

On Sun, 10 Sep 2017 22​:43​:05 -0700, tonyc wrote​:

Applied as merge commit 9c6681c.

Leaving this open a bit for any breakages.

Reported as broken on FreeBSD​:

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=222258

possibly due to a FreeBSD bug, but from the description and what I
understand of the FreeBSD code this might occur in a container where
rename() would work.

Added a workaround in 84dbe61.

Tony

TonyC​: Is this ticket closable?

Thank you very much.

--
James E Keenan (jkeenan@​cpan.org)

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented Nov 19, 2018

@tonycoz - Status changed from 'open' to 'pending release'

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented May 22, 2019

From @khwilliamson

Thank you for filing this report. You have helped make Perl better.

With the release today of Perl 5.30.0, this and 160 other issues have been
resolved.

Perl 5.30.0 may be downloaded via​:
https://metacpan.org/release/XSAWYERX/perl-5.30.0

If you find that the problem persists, feel free to reopen this ticket.

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented May 22, 2019

@khwilliamson - Status changed from 'pending release' to 'resolved'

@p5pRT p5pRT closed this May 22, 2019
@p5pRT p5pRT added the Severity Low label Oct 19, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
1 participant
You can’t perform that action at this time.