Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Restrict capacity of plists to 2^28 resp. 2^60 (revision of #2039) #2064

Merged
merged 3 commits into from
Jan 8, 2018

Conversation

fingolfin
Copy link
Member

This is a rebased version of PR #2039, with the fixup commit squashed, and a new size restriction assert added to NEW_PLIST. As such, it addresses parts of issue #1479 (but does not fully resolve it; at least one overflow bug in GASMAN needs to be fixed, as well a bug in one of the GNU libc memmove implementations).

I also provided a new, extensive commit message for the second commit, and since that now makes the majority of the work in that commit, took the liberty of changing authorship of that commit to myself. Here is the new commit message:

By adding bounds checks to GrowPlist and NEW_PLIST, we restrict the capacity
of plain lists to the maximum value of an immediate integer, i.e. 2^28 on a
32 bit system and 2^60 on a 64 bit system. The check in GrowPlist triggers
an error, as it can be triggered by the user. The check in NEW_PLIST is a
run-time assertion, and should never be triggered by user actions, only by
buggy kernel code.

This restriction fixes overflows and other problems, which can lead to
crashes, corrupt data or nonsense computations.

It poses no actual limitation in practice, for the following reasons:

First off, many other places already effectively limited the length of a
plist they can interact with to the maximum value of an immediate integer.
E.g. such limitations exist for sublist access via l{poss}, EmptyPlist(),
ASS_PLIST_DEFAULT, and more.

Secondly, with this change, the effective size (in bytes) of a plist of
this maximal length would be 2^30 resp. 2^63. The latter certainly poses no
actual limitation. The former corresponds to 1 GB. Conceivably, GAP could
support slightly larger bags on a 32bit system (the GASMAN heap can grow up
to 3GB if the host system supports it). However, there is not much you can
do with GAP if most of its heap is filled with a single gigantic plist, so
this restriction seems acceptable.

@fingolfin fingolfin added kind: bug Issues describing general bugs, and PRs fixing them topic: kernel labels Jan 4, 2018
@fingolfin fingolfin added this to the GAP 4.9.1 milestone Jan 4, 2018
src/plist.c Outdated
if (!UINT_FITS_IN_INTOBJ(need)) {
ErrorMayQuit("GrowPlist: List size too large", 0, 0);
}

/* find out how large the plain list should become */
good = 5 * (SIZE_OBJ(list)/sizeof(Obj)-1) / 4 + 4;

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would check the overflow here:

    if (! UINT_FITS_IN_INTOBJ(good))
        good = (((UInt)1) << NR_SMALL_INT_BITS) - 1;

src/plist.c Outdated
/* find out how large the plain list should become */
good = 5 * (SIZE_OBJ(list)/sizeof(Obj)-1) / 4 + 4;

/* but maybe we need more */
if ( need < good ) { plen = good; }
else { plen = need; }
if (need < good && UINT_FITS_IN_INTOBJ(good))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then this can be simplified to

    if (need < good)

In your version adding and appending to lists of length > 4/5 * maximal length becomes very inefficient.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent point, thanks, I'll update this PR.

@frankluebeck
Copy link
Member

I always thought that the length of plists is restricted to 2^28-1 (resp. 2^60-1). I agree that for the reasons given above this is not a serious restriction. I guess a lot of code does actually assume that the length of a plist fits into an immediate integer. Therefore I consider this pull request as a sensible bug fix.

Maybe this restriction should be mentioned somewhere in the manual?

A possible test on 32 bit systems could be

gap> l:=0*[1..QuoInt(2^28*6,7)];;
gap> Append(l,0+[1..100]);
gap> m:=MemoryUsage(l);;
gap> m>2^30+16;
false

(Or use SIZE_OBJ after it is fixed to work with large objects.)

Side remark:

(the GASMAN heap can grow up to 3GB if the host system supports it)

On my 64 bit systems I can even call 32-bit gap with gap -m 3.9g.

@fingolfin
Copy link
Member Author

I'll look into addressing all of that. I'll also replace SIZE_OBJ by an alias for SHALLOW_SIZE (which already is correct and doesn't crash for immediate values as input)

That is, use ObjInt_UInt instead of INTOBJ_INT to convert the object size
into a GAP integer object, and handle T_INT and T_FFE inputs explicitly.

Also replace the kernel function SHALLOW_SIZE with a synonym pointing to
SIZE_OBJ. While the former name may be less misleading (a user might expect
SIZE_OBJ to include the sizes of subobjects), the latter name matches the
name of the kernel function, just like with TNUM_OBJ and TNAM_OBJ, which
simplifies life for kernel developers who want to use these UNDOCUMENTED,
INTERNAL and potentially DANGEROUS APIs. As such, clarity of names is not a
primary concern.
Also convert as much code as possible to use the new constants,
and tweak some additional places using NR_SMALL_INT_BITS:

* FindCommonField used it in a bounds check, but not quite right; use
  INT_INTOBJ_MAX instead
* FuncGASMAN_STATS was simplified by using ObjInt_Int
* ProdIntObj and PowObjInt needlessly used NR_SMALL_INT_BITS+1, changed
  to avoid the +1
* ModInt and RemInt included a check 'i <= (1L<<NR_SMALL_INT_BITS)' for
  an immediate value i, which of course was always true, so removed it
By adding bounds checks to GrowPlist and NEW_PLIST, we restrict the capacity
of plain lists to the maximum value of an immediate integer, i.e. 2^28 on a
32 bit system and 2^60 on a 64 bit system. The check in GrowPlist triggers
an error, as it can be triggered by the user. The check in NEW_PLIST is a
run-time assertion, and should never be triggered by user actions, only by
buggy kernel code.

This restriction fixes overflows and other problems, which can lead to
crashes, corrupt data or nonsense computations.

It poses no actual limitation in practice, for the following reasons:

First off, many other places already effectively limited the length of a
plist they can interact with to the maximum value of an immediate integer.
E.g. such limitations exist for sublist access via l{poss}, EmptyPlist(),
ASS_PLIST_DEFAULT, and more.

Secondly, with this change,  the effective size (in bytes) of a plist of this
maximal length would be 2^30 resp. 2^63. The latter certainly poses no actual
limitation. The former corresponds to 1 GB. While GAP in principle supports
slightly larger bags on 32 bit systems (the GASMAN heap can grow up to almost
4GB if the host system supports it),, there is not much you can do with GAP
if most of its heap is filled with a gigantic plist, so this restriction
seems acceptable.
@fingolfin
Copy link
Member Author

I have now addressed most of the points @frankluebeck brought up, I hope, plus some more:

  • check in GrowPlist adjusted as suggested by Frank;
  • fixed SIZE_OBJ to return sane values for large objects, and to not crash when called on an immediate object it (and while at it, merged it with SHALLOW_OBJ, which now is an obsolete synonym)
  • replaced INT_FITS_IN_INTOBJ and UINT_FITS_IN_INTOBJ (which I found ungainly) by constants for the min/max allowed values; I also made a lot of code use them; I find the resulting checks easier to understand;

I also looked into adding the length restriction to the manual, but couldn't find a good place. We already mention a similar restriction for ranges (were the two ends must be immediate integers). But I couldn't find any section dedicated to plain lists / "internally represented lists", although they are mentioned on and off. Perhaps such a section should be added? Or perhaps it's there and I just missed it?

I also noticed that section 21.2 "Basic Operations for Lists" starts out saying "The basic operations for lists are ..." and then provides a set of points which, curiously enough, does not cover creation of lists. I found no place in the whole manual where this is described (I focused on checking chapter 21, as well as chapter 4, "The Programming Language")

UInt bound = 1UL << NR_SMALL_INT_BITS;

if (i < bound) {
if (i <= INT_INTOBJ_MAX) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is relying on how C handles signed / unsigned comparisons. I can't remember his they come out, and would have a small preference for assigning INT_INTOBJ_MAX to a UInt, just in case future people worry.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I briefly thought about adding an UINT_INTOBJ_MAX, but the C standard is pretty clear on this, and always has been. In a (simplified) nutshell, a signed and unsigned value are compared by casting the signed value to unsigned, which is always what we want for these bounds checks.

In short, INT_INTOBJ_MAX and UINT_INTOBJ_MAX would always do exactly the same thing, except when accidentally using UINT_INTOBJ_MAX to bound check a signed quantity... As such, UINT_INTOBJ_MAX could be misused by accident. I don't see a way to misuse INT_INTOBJ_MAX by accident.

@ChrisJefferson ChrisJefferson merged commit 4bf6638 into gap-system:stable-4.9 Jan 8, 2018
@fingolfin fingolfin deleted the mh/fix-list-length branch January 11, 2018 10:51
@olexandr-konovalov
Copy link
Member

@fingolfin I will omit this from release notes: as #2039 sats, it ensures that GrowPlist checks the size we are growing the list to is a valid list size, and stops us then overflowing in ResizeBag later. This is an internal functionality.

@olexandr-konovalov olexandr-konovalov added the release notes: not needed PRs introducing changes that are wholly irrelevant to the release notes label Jan 29, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind: bug Issues describing general bugs, and PRs fixing them release notes: not needed PRs introducing changes that are wholly irrelevant to the release notes topic: kernel
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants