Crash: marshal.loads SIGSEGV on self-referencing TYPE_TUPLE with FLAG_REF #148653

@mjbommar

Bug report

Bug description:

Summary

marshal.loads() deterministically segfaults on a 16-byte structural
payload containing a TYPE_TUPLE | FLAG_REF whose elements include a
TYPE_REF back to the partial tuple itself.

The root cause is that R_REF() registers the tuple in p->refs
before its slots are populated. A nested TYPE_REF back-reference
then yields the partial tuple to a hashing site (PySet_Add), and
tuplehash calls PyObject_Hash(NULL) on the unfilled slot.

TYPE_FROZENSET, TYPE_CODE, and TYPE_SLICE already use the correct
two-phase pattern (r_ref_reserve / r_ref_insert) that avoids this.
TYPE_TUPLE, TYPE_LIST, TYPE_DICT, and TYPE_SET do not.
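To illustrate the difference between eager registration and the two-phase pattern, here is a toy Python model of a reference table. It is only a sketch: `RefTable`, `load_tuple_two_phase`, and the item-loader callables are hypothetical names invented for this example and do not mirror the actual C code in marshal.c. A private sentinel stands in for the placeholder that the real implementation stores in p->refs.

```python
class RefTable:
    """Toy stand-in for the p->refs list (hypothetical, not CPython code)."""

    _PLACEHOLDER = object()  # marks a reserved but not yet filled slot

    def __init__(self):
        self._refs = []

    def reserve(self):
        # Phase 1: claim an index before the container is built.
        self._refs.append(self._PLACEHOLDER)
        return len(self._refs) - 1

    def insert(self, idx, obj):
        # Phase 2: publish the fully built object under the reserved index.
        self._refs[idx] = obj
        return obj

    def lookup(self, idx):
        # Analogue of a TYPE_REF read: refuse slots that are still reserved.
        if not 0 <= idx < len(self._refs):
            raise ValueError("bad marshal data (invalid reference)")
        obj = self._refs[idx]
        if obj is self._PLACEHOLDER:
            raise ValueError("bad marshal data (invalid reference)")
        return obj


def load_tuple_two_phase(refs, item_loaders):
    """Build a tuple whose items are produced by callables taking the table."""
    idx = refs.reserve()
    items = tuple(load(refs) for load in item_loaders)
    return refs.insert(idx, items)


# A well-formed tuple loads normally.
ok = load_tuple_two_phase(RefTable(), [lambda r: 1, lambda r: 2])

# A self-reference (item 1 is a set containing ref 0) is rejected cleanly,
# because the slot is still a placeholder while the items are being read.
try:
    load_tuple_two_phase(RefTable(), [lambda r: None, lambda r: {r.lookup(0)}])
    error = None
except ValueError as exc:
    error = str(exc)

print(ok, error)
```

In this model, no reader can ever obtain a partially built tuple, which is the property the eager R_REF() path lacks.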

Originally filed as GHSA-m7gv-g5p9-9qqq. PSRT assessed this as outside
the security threat model, since marshal.loads is documented as not secure
against malicious data. I am converting it to a public bug per their guidance.

Reproducer

import marshal
marshal.loads(b'\xa8\x02\x00\x00\x00N<\x01\x00\x00\x00r\x00\x00\x00\x00')

Byte stream:

  • \xa8 = TYPE_TUPLE | FLAG_REF
  • \x02\x00\x00\x00 = n = 2
  • N = item[0] = TYPE_NONE
  • <\x01\x00\x00\x00 = item[1] = TYPE_SET, n = 1
  • r\x00\x00\x00\x00 = element = TYPE_REF(0), the partial outer tuple
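The byte stream above can be rebuilt field by field, which makes it easier to see where the back-reference sits. This is a minimal sketch: the constant values are the single-byte type codes implied by the breakdown, and the `le32` helper is a name introduced here for illustration. The snippet only builds the bytes and does not call marshal.loads, so it is safe to run on an affected interpreter.

```python
import struct

# Type codes as they appear in the byte stream above.
TYPE_TUPLE = ord('(')
TYPE_NONE = ord('N')
TYPE_SET = ord('<')
TYPE_REF = ord('r')
FLAG_REF = 0x80  # high bit: register this object in the reference table


def le32(n):
    """Encode n as a little-endian 32-bit integer (hypothetical helper)."""
    return struct.pack('<i', n)


payload = b''.join([
    bytes([TYPE_TUPLE | FLAG_REF]),  # \xa8: tuple, registered as ref 0
    le32(2),                         # tuple length n = 2
    bytes([TYPE_NONE]),              # item[0] = None
    bytes([TYPE_SET]), le32(1),      # item[1] = set with one element
    bytes([TYPE_REF]), le32(0),      # element = reference to ref 0
])

print(len(payload), payload)
```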

Exit code 139 (SIGSEGV). faulthandler stack:

PySet_Add -> set_add_entry -> PyObject_Hash -> tuplehash -> NULL deref
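To observe the crash without taking down the current interpreter, the reproducer can be run in a child process. The following is a sketch under stated assumptions: `run_reproducer` and `CHILD_SCRIPT` are names made up for this example. On POSIX, `subprocess` reports a child killed by a signal as a negative return code (for example -11 for SIGSEGV), whereas a shell would show 128 + 11 = 139. On an interpreter that raises an exception instead of crashing, the child exits with a nonzero code and a traceback on stderr.

```python
import subprocess
import sys

PAYLOAD = b'\xa8\x02\x00\x00\x00N<\x01\x00\x00\x00r\x00\x00\x00\x00'

# The child enables faulthandler so a crash prints a stack to stderr.
CHILD_SCRIPT = (
    "import faulthandler, marshal, sys\n"
    "faulthandler.enable()\n"
    "marshal.loads(bytes.fromhex(sys.argv[1]))\n"
)


def run_reproducer(payload=PAYLOAD):
    """Load the payload in a separate interpreter; return (returncode, stderr)."""
    proc = subprocess.run(
        [sys.executable, "-c", CHILD_SCRIPT, payload.hex()],
        capture_output=True,
        text=True,
        timeout=60,
    )
    return proc.returncode, proc.stderr


returncode, stderr = run_reproducer()
print("child return code:", returncode)
print(stderr)
```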

Affected versions

Crashes on every version I tested: 3.9, 3.10, 3.11, 3.12, 3.13, 3.14.

Behavioral change after fix

With the two-phase pattern, the Py_None placeholder in p->refs
is detected by the existing TYPE_REF handler at marshal.c:1675:

if (v == Py_None) {
    PyErr_SetString(PyExc_ValueError, "bad marshal data (invalid reference)");
    break;
}

So the fix changes the behavior from SIGSEGV to
ValueError: bad marshal data (invalid reference).

Suggested fix

A fix with regression tests for tuple, list, set, and dict
self-reference payloads is at #148652.

CPython versions tested on:

3.14

Operating systems tested on:

Linux

Labels: interpreter-core (Objects, Python, Grammar, and Parser dirs), type-bug (An unexpected behavior, bug, or error)
