-
Notifications
You must be signed in to change notification settings - Fork 567
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Storable loses information on large strings #15238
Comments
From @jkeenan[Originally reported in http://www.nntp.perl.org/group/perl.perl5.porters/2016/03/msg235167.html -- jkeenan] Hi, My Perl version is v5.20.1 for x86_64-Linux. When I used Sortable to store For example: foreach my $i(750615880 .. 75061589){ Thanks, |
From @jkeenanOn Fri Mar 18 13:05:34 2016, jkeenan wrote:
Does anyone on list have a machine with enough memory to explore this bug report? Thank you very much. -- |
The RT System itself - Status changed from 'new' to 'open' |
From @andk
> Does anyone on list have a machine with enough memory to explore this bug report? > Thank you very much. First I could not reproduce what the OP describes, I only could observe # perl -e ' On my machine I get no SEGV when I pass 536870910 to the script. Reproduced with several old and new perls. This is with 5.23.9: Program terminated with signal SIGSEGV, Segmentation fault. -- |
From @tonycozOn Fri Mar 18 13:05:34 2016, jkeenan wrote:
Storable uses I32 and unsigned long internally for lengths, including both using it for the length of the scalar being stored and some intermediate lengths when calculating the new work buffer size for freeze()/thaw(). The first of the attached patches simply rejects over-large scalars. The second handles scalars for which the size is too large for I32 as new Storable tags that an older Storable will reject. Tony |
From @tonycoz0001-perl-127743-simple-stupid-prevent-storing-very-large.patchFrom fe1a27444c2dc30d49f7407163a6b037f920e2be Mon Sep 17 00:00:00 2001
From: Tony Cook <tony@develop-help.com>
Date: Wed, 13 Apr 2016 08:07:58 +0200
Subject: (perl #127743) simple stupid: prevent storing very large strings
The length of a very large string overflows the 32-bit length stored
in the output.
---
dist/Storable/Storable.pm | 2 +-
dist/Storable/Storable.xs | 5 +++++
2 files changed, 6 insertions(+), 1 deletion(-)
diff --git a/dist/Storable/Storable.pm b/dist/Storable/Storable.pm
index c8f6db1..e928401 100644
--- a/dist/Storable/Storable.pm
+++ b/dist/Storable/Storable.pm
@@ -22,7 +22,7 @@ package Storable; @ISA = qw(Exporter);
use vars qw($canonical $forgive_me $VERSION);
-$VERSION = '2.56';
+$VERSION = '2.57';
BEGIN {
if (eval { local $SIG{__DIE__}; require Log::Agent; 1 }) {
diff --git a/dist/Storable/Storable.xs b/dist/Storable/Storable.xs
index 83cd001..49e425a 100644
--- a/dist/Storable/Storable.xs
+++ b/dist/Storable/Storable.xs
@@ -2242,6 +2242,11 @@ static int store_scalar(pTHX_ stcxt_t *cxt, SV *sv)
len, SX_VSTRING, SX_LVSTRING);
}
#endif
+#if Size_t_size > 4
+ if (len > I32_MAX) {
+ CROAK(("String too long for Storable"));
+ }
+#endif
wlen = (I32) len; /* WLEN via STORE_SCALAR expects I32 */
if (SvUTF8 (sv))
--
2.1.4
|
From @tonycoz0002-perl-127743-handle-2GB-scalars-on-64-bit-platforms.patchFrom f0ca707438b56331b963a0cbcd1e7d1ba053a969 Mon Sep 17 00:00:00 2001
From: Tony Cook <tony@develop-help.com>
Date: Mon, 18 Apr 2016 06:52:28 +0200
Subject: (perl #127743) handle 2GB+ scalars on 64-bit platforms
---
MANIFEST | 1 +
dist/Storable/Storable.xs | 143 +++++++++++++++++++++++++++++++++++++++++-----
dist/Storable/t/large.t | 68 ++++++++++++++++++++++
dist/Storable/t/malice.t | 4 +-
4 files changed, 200 insertions(+), 16 deletions(-)
create mode 100644 dist/Storable/t/large.t
diff --git a/MANIFEST b/MANIFEST
index 3da3119..589194b 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -3381,6 +3381,7 @@ dist/Storable/t/HAS_OVERLOAD.pm For auto-requiring of mdoules for overload
dist/Storable/t/integer.t See if Storable works
dist/Storable/t/interwork56.t Test compatibility kludge for 64bit data under 5.6.x
dist/Storable/t/just_plain_nasty.t See if Storable works
+dist/Storable/t/large.t See if Storable handles large scalars.
dist/Storable/t/leaks.t See if Storable leaks (skips in core)
dist/Storable/t/lock.t See if Storable works
dist/Storable/t/make_56_interwork.pl Make test data for interwork56.t
diff --git a/dist/Storable/Storable.xs b/dist/Storable/Storable.xs
index 49e425a..802fcf9 100644
--- a/dist/Storable/Storable.xs
+++ b/dist/Storable/Storable.xs
@@ -46,6 +46,10 @@
#endif /* PERLIO_IS_STDIO */
#endif /* USE_PERLIO */
+#ifndef SSize_t_MAX
+#define SSize_t_MAX (SSize_t)(~(Size_t)0 >> 1)
+#endif
+
/*
* Earlier versions of perl might be used, we can't assume they have the latest!
*/
@@ -157,7 +161,9 @@
#define SX_VSTRING C(29) /* vstring forthcoming (small) */
#define SX_LVSTRING C(30) /* vstring forthcoming (large) */
#define SX_SVUNDEF_ELEM C(31) /* array element set to &PL_sv_undef */
-#define SX_ERROR C(32) /* Error */
+#define SX_VLSCALAR C(32) /* scalar longer than I32_MAX */
+#define SX_VLUTF8STR C(33) /* UTF-8 string longer than I32_MAX */
+#define SX_ERROR C(34) /* Error */
/*
* Those are only used to retrieve "old" pre-0.6 binary images.
@@ -527,11 +533,11 @@ static stcxt_t *Context_ptr = NULL;
#define MMASK (MGROW - 1)
#define round_mgrow(x) \
- ((unsigned long) (((unsigned long) (x) + MMASK) & ~MMASK))
+ ((Size_t) (((Size_t) (x) + MMASK) & ~MMASK))
#define trunc_int(x) \
- ((unsigned long) ((unsigned long) (x) & ~(sizeof(int)-1)))
+ ((Size_t) ((Size_t) (x) & ~(sizeof(int)-1)))
#define int_aligned(x) \
- ((unsigned long) (x) == trunc_int(x))
+ ((Size_t) (x) == trunc_int(x))
#define MBUF_INIT(x) \
STMT_START { \
@@ -590,8 +596,8 @@ static stcxt_t *Context_ptr = NULL;
#define MBUF_XTEND(x) \
STMT_START { \
- int nsz = (int) round_mgrow((x)+msiz); \
- int offset = mptr - mbase; \
+ STRLEN nsz = round_mgrow((x)+msiz); \
+ STRLEN offset = mptr - mbase; \
ASSERT(!cxt->membuf_ro, ("mbase is not read-only")); \
TRACEME(("** extending mbase from %d to %d bytes (wants %d new)", \
msiz, nsz, (x))); \
@@ -1236,6 +1242,8 @@ static SV *retrieve_weakoverloaded(pTHX_ stcxt_t *cxt, const char *cname);
static SV *retrieve_vstring(pTHX_ stcxt_t *cxt, const char *cname);
static SV *retrieve_lvstring(pTHX_ stcxt_t *cxt, const char *cname);
static SV *retrieve_svundef_elem(pTHX_ stcxt_t *cxt, const char *cname);
+static SV *retrieve_vlscalar(pTHX_ stcxt_t *cxt, const char *cname);
+static SV *retrieve_vlutf8str(pTHX_ stcxt_t *cxt, const char *cname);
static const sv_retrieve_t sv_retrieve[] = {
0, /* SX_OBJECT -- entry unused dynamically */
@@ -1270,6 +1278,8 @@ static const sv_retrieve_t sv_retrieve[] = {
(sv_retrieve_t)retrieve_vstring, /* SX_VSTRING */
(sv_retrieve_t)retrieve_lvstring, /* SX_LVSTRING */
(sv_retrieve_t)retrieve_svundef_elem, /* SX_SVUNDEF_ELEM */
+ (sv_retrieve_t)retrieve_vlscalar, /* SX_VLSCALAR */
+ (sv_retrieve_t)retrieve_vlutf8str, /* SX_VLUTF8STR */
(sv_retrieve_t)retrieve_other, /* SX_ERROR */
};
@@ -2222,8 +2232,6 @@ static int store_scalar(pTHX_ stcxt_t *cxt, SV *sv)
#ifdef SvVOK
MAGIC *mg;
#endif
- I32 wlen; /* For 64-bit machines */
-
string_readlen:
pv = SvPV(sv, len);
@@ -2244,15 +2252,30 @@ static int store_scalar(pTHX_ stcxt_t *cxt, SV *sv)
#endif
#if Size_t_size > 4
if (len > I32_MAX) {
- CROAK(("String too long for Storable"));
+ STRLEN wlen = len;
+ unsigned char smark = SvUTF8(sv) ? SX_VLUTF8STR : SX_VLSCALAR;
+
+ PUTMARK(smark);
+ if (cxt->netorder && BYTEORDER == 0x87654321) {
+ ((U32*)wlen)[0] = htonl(len >> 32);
+ ((U32*)wlen)[1] = htonl(len & 0xffffffff);
+ }
+ else {
+ wlen = len;
+ }
+ WRITE(&wlen, sizeof(wlen));
+ WRITE(pv, len);
}
+ else
#endif
- wlen = (I32) len; /* WLEN via STORE_SCALAR expects I32 */
- if (SvUTF8 (sv))
- STORE_UTF8STR(pv, wlen);
- else
- STORE_SCALAR(pv, wlen);
+ {
+ I32 wlen = (I32) len; /* WLEN via STORE_SCALAR expects I32 */
+ if (SvUTF8 (sv))
+ STORE_UTF8STR(pv, wlen);
+ else
+ STORE_SCALAR(pv, wlen);
+ }
TRACEME(("ok (scalar 0x%"UVxf" '%s', length = %"IVdf")",
PTR2UV(sv), SvPVX(sv), (IV)len));
} else
@@ -4990,6 +5013,69 @@ static SV *retrieve_scalar(pTHX_ stcxt_t *cxt, const char *cname)
}
/*
+ * retrieve_lscalar
+ *
+ * Retrieve defined long (string) scalar.
+ *
+ * Layout is SX_LSCALAR <length> <data>, with SX_LSCALAR already read.
+ * The scalar is "long" in that <length> is larger than LG_SCALAR so it
+ * was not stored on a single byte.
+ */
+static SV *retrieve_vlscalar(pTHX_ stcxt_t *cxt, const char *cname)
+{
+#if Size_t_size > 4
+ STRLEN len;
+ SV *sv;
+ HV *stash;
+
+ ASSERT(Size_t_size == 8, ("Size_t isn't 8 bytes, code needs re-work"));
+ if (cxt->netorder && BYTEORDER == 0x87654321) {
+ STRLEN rlen;
+ READ(&rlen, sizeof(rlen));
+ len = ((STRLEN)((U32*)&rlen)[0] << 32) | ((U32*)&rlen)[1];
+ }
+ else {
+ READ(&len, sizeof(len));
+ }
+ TRACEME(("retrieve_vlscalar (#%d), len = %"IVdf, cxt->tagnum, (IV) len));
+ if (len < I32_MAX || len > SSize_t_MAX) {
+ CROAK(("Size of very large scalar out of range"));
+ }
+
+ /*
+ * Allocate an empty scalar of the suitable length.
+ */
+
+ sv = NEWSV(10002, len);
+ stash = cname ? gv_stashpv(cname, GV_ADD) : 0;
+ SEEN_NN(sv, stash, 0); /* Associate this new scalar with tag "tagnum" */
+
+ /*
+ * WARNING: duplicates parts of sv_setpv and breaks SV data encapsulation.
+ *
+ * Now, for efficiency reasons, read data directly inside the SV buffer,
+ * and perform the SV final settings directly by duplicating the final
+ * work done by sv_setpv. Since we're going to allocate lots of scalars
+ * this way, it's worth the hassle and risk.
+ */
+
+ SAFEREAD(SvPVX(sv), len, sv);
+ SvCUR_set(sv, len); /* Record C string length */
+ *SvEND(sv) = '\0'; /* Ensure it's null terminated anyway */
+ (void) SvPOK_only(sv); /* Validate string pointer */
+ if (cxt->s_tainted) /* Is input source tainted? */
+ SvTAINT(sv); /* External data cannot be trusted */
+
+ TRACEME(("large scalar len %"IVdf" '%s'", (IV) len, SvPVX(sv)));
+ TRACEME(("ok (retrieve_lscalar at 0x%"UVxf")", PTR2UV(sv)));
+
+ return sv;
+#else
+ CROAK(("Reading a 2GB or larger string on a 32-bit platform"))'
+#endif
+}
+
+/*
* retrieve_utf8str
*
* Like retrieve_scalar(), but tag result as utf8.
@@ -5047,6 +5133,35 @@ static SV *retrieve_lutf8str(pTHX_ stcxt_t *cxt, const char *cname)
}
/*
+ * retrieve_vlutf8str
+ *
+ * Like retrieve_vlscalar(), but tag result as utf8.
+ * If we're retrieving UTF8 data in a non-UTF8 perl, croaks.
+ */
+static SV *retrieve_vlutf8str(pTHX_ stcxt_t *cxt, const char *cname)
+{
+ SV *sv;
+
+ TRACEME(("retrieve_vlutf8str"));
+
+ sv = retrieve_vlscalar(aTHX_ cxt, cname);
+ if (sv) {
+#ifdef HAS_UTF8_SCALARS
+ SvUTF8_on(sv);
+#else
+ if (cxt->use_bytes < 0)
+ cxt->use_bytes
+ = (SvTRUE(perl_get_sv("Storable::drop_utf8", GV_ADD))
+ ? 1 : 0);
+ if (cxt->use_bytes == 0)
+ UTF8_CROAK();
+#endif
+ }
+
+ return sv;
+}
+
+/*
* retrieve_vstring
*
* Retrieve a vstring, and then retrieve the stringy scalar following it,
diff --git a/dist/Storable/t/large.t b/dist/Storable/t/large.t
new file mode 100644
index 0000000..f9b2055
--- /dev/null
+++ b/dist/Storable/t/large.t
@@ -0,0 +1,68 @@
+#!./perl -w
+
+BEGIN {
+ unshift @INC, 't';
+ unshift @INC, 't/compat' if $] < 5.006002;
+ require Config; import Config;
+ if ($ENV{PERL_CORE} and $Config{'extensions'} !~ /\bStorable\b/) {
+ print "1..0 # Skip: Storable was not built\n";
+ exit 0;
+ }
+ require 'st-dump.pl';
+}
+
+use Storable qw(freeze thaw nfreeze thaw);
+
+use Test::More;
+
+# memory usage checked with top
+$ENV{PERL_TEST_MEMORY} && $ENV{PERL_TEST_MEMORY} >= 8
+ or plan skip_all => "Need 8GB for this test";
+$Config{ptrsize} >= 8
+ or plan skip_all => "Need 64-bit pointers for this test";
+
+plan tests => 4;
+
+# we might have a lot of RAM, but maybe not so much disk space, so we
+# can only test freeze()/thaw().
+
+my $x = "x"; # avoid constant folding the large x op
+my $data = [ $x x 0x88000000 ]; # 2GB RAM (and a wee bit)
+
+{
+ my $frozen = freeze($data); # another 2GB RAM
+ my $thawed = thaw($frozen); # another 2GB RAM
+ # and add a bit more in case the following
+ is_deeply($thawed, $data,
+ "check in and out match");
+ undef $frozen;
+ undef $thawed;
+}
+
+{
+ my $frozen = nfreeze($data);
+ my $thawed = thaw($frozen);
+ is_deeply($thawed, $data, "check in and out match (netorder)");
+ undef $frozen;
+ undef $thawed;
+}
+
+$x->[0] .= chr(0x100);
+
+{
+ my $frozen = freeze($data);
+ my $thawed = thaw($frozen);
+ is_deeply($thawed, $data,
+ "check in and out match (utf8)");
+ undef $frozen;
+ undef $thawed;
+}
+
+{
+ my $frozen = nfreeze($data);
+ my $thawed = thaw($frozen);
+ is_deeply($thawed, $data, "check in and out match (utf8,netorder)");
+ undef $frozen;
+ undef $thawed;
+}
+
diff --git a/dist/Storable/t/malice.t b/dist/Storable/t/malice.t
index 867a0d7..1358139 100644
--- a/dist/Storable/t/malice.t
+++ b/dist/Storable/t/malice.t
@@ -208,7 +208,7 @@ sub test_things {
$where = $file_magic + $network_magic;
}
- # Just the header and a tag 255. As 31 is currently the highest tag, this
+ # Just the header and a tag 255. As 33 is currently the highest tag, this
# is "unexpected"
$copy = substr ($contents, 0, $where) . chr 255;
@@ -228,7 +228,7 @@ sub test_things {
# local $Storable::DEBUGME = 1;
# This is the delayed croak
test_corrupt ($copy, $sub,
- "/^Storable binary image v$header->{major}.$minor6 contains data of type 255. This Storable is v$header->{major}.$minor and can only handle data types up to 31/",
+ "/^Storable binary image v$header->{major}.$minor6 contains data of type 255. This Storable is v$header->{major}.$minor and can only handle data types up to 33/",
"bogus tag, minor plus 4");
# And check again that this croak is not delayed:
{
--
2.1.4
|
From @xsawyerxOn Sun, 17 Apr 2016 22:21:07 -0700, tonyc wrote:
Does anyone object merging this? |
From @tonycozOn Sun, Feb 12, 2017 at 01:56:09PM -0800, Sawyer X via RT wrote:
At the time I was put off merging this because of: http://blogs.perl.org/users/rurban/2016/04/storable-security-problems-and-overlarge-data.html (I don't recall him sending me the fixes.) The second patch uses an op code the cperl changed skipped for some cperl allocates a different op as a prefix op to mark the next op as So the issue for the second patch is file compatibility with cperl - are Tony [1] the format of the storable binary data [2] cperl seems to treat SX_ERROR as a specific error code rather than |
From @tonycozOn Sun, 12 Feb 2017 15:44:19 -0800, tonyc wrote:
I ended up merging much of the cperl Storable changes to blead, which was released as part of perl 5.28.0 and to CPAN as Storable 3.08. Along with several other patches that fixed this issue and several others. Closing. Tony |
@tonycoz - Status changed from 'open' to 'resolved' |
Migrated from rt.perl.org#127743 (status was 'resolved')
Searchable as RT127743$
The text was updated successfully, but these errors were encountered: