gh-113626: Add allow_code parameter in marshal functions #113648

Merged 6 commits on Jan 16, 2024
41 changes: 31 additions & 10 deletions Doc/library/marshal.rst
@@ -23,7 +23,11 @@ transfer of Python objects through RPC calls, see the modules :mod:`pickle` and
:mod:`shelve`. The :mod:`marshal` module exists mainly to support reading and
writing the "pseudo-compiled" code for Python modules of :file:`.pyc` files.
Therefore, the Python maintainers reserve the right to modify the marshal format
in backward incompatible ways should the need arise. If you're serializing and
in backward incompatible ways should the need arise.
The format of code objects is not compatible between Python versions,
even if the version of the format is the same.
De-serializing a code object in the incorrect Python version has undefined behavior.
If you're serializing and
de-serializing Python objects, use the :mod:`pickle` module instead -- the
performance is comparable, version independence is guaranteed, and pickle
supports a substantially wider range of objects than marshal.
@@ -40,7 +44,8 @@ Not all Python object types are supported; in general, only objects whose value
is independent from a particular invocation of Python can be written and read by
this module. The following types are supported: booleans, integers, floating
point numbers, complex numbers, strings, bytes, bytearrays, tuples, lists, sets,
frozensets, dictionaries, and code objects, where it should be understood that
frozensets, dictionaries, and code objects (if *allow_code* is true),
where it should be understood that
tuples, lists, sets, frozensets and dictionaries are only supported as long as
the values contained therein are themselves supported. The
singletons :const:`None`, :const:`Ellipsis` and :exc:`StopIteration` can also be
@@ -54,27 +59,32 @@ bytes-like objects.
The module defines these functions:


.. function:: dump(value, file[, version])
.. function:: dump(value, file, version=version, /, *, allow_code=True)

Write the value on the open file. The value must be a supported type. The
file must be a writeable :term:`binary file`.

If the value has (or contains an object that has) an unsupported type, a
:exc:`ValueError` exception is raised --- but garbage data will also be written
to the file. The object will not be properly read back by :func:`load`.
:ref:`Code objects <code-objects>` are only supported if *allow_code* is true.

The *version* argument indicates the data format that ``dump`` should use
(see below).

.. audit-event:: marshal.dumps value,version marshal.dump

.. versionchanged:: 3.13
Added the *allow_code* parameter.
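
As an illustration of the new signature, the following sketch round-trips plain data through an in-memory file with code objects disallowed. It assumes Python 3.13 or later for the *allow_code* keyword; the ``sys.version_info`` guard and the names ``data``, ``buf``, ``kwargs`` and ``restored`` exist only for this example and are not part of the patch.

```python
import io
import marshal
import sys

# Plain data made only of supported, non-code types.
data = {"a": [({0},)]}

# Assumption: allow_code is available from Python 3.13 onward.
# Older interpreters reject the keyword, so it is passed conditionally.
kwargs = {"allow_code": False} if sys.version_info >= (3, 13) else {}

buf = io.BytesIO()                      # stands in for a writable binary file
marshal.dump(data, buf, **kwargs)       # write with code objects disallowed
buf.seek(0)
restored = marshal.load(buf, **kwargs)  # read back under the same restriction
```

Passing the same setting on both the writing and the reading side keeps the restriction symmetric, which is how the added test exercises it.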

.. function:: load(file)

.. function:: load(file, /, *, allow_code=True)

Read one value from the open file and return it. If no valid value is read
(e.g. because the data has a different Python version's incompatible marshal
format), raise :exc:`EOFError`, :exc:`ValueError` or :exc:`TypeError`. The
file must be a readable :term:`binary file`.
format), raise :exc:`EOFError`, :exc:`ValueError` or :exc:`TypeError`.
:ref:`Code objects <code-objects>` are only supported if *allow_code* is true.
The file must be a readable :term:`binary file`.

.. audit-event:: marshal.load "" marshal.load

@@ -88,24 +98,32 @@ The module defines these functions:
This call used to raise a ``code.__new__`` audit event for each code object. Now
it raises a single ``marshal.load`` event for the entire load operation.

.. versionchanged:: 3.13
Added the *allow_code* parameter.


.. function:: dumps(value[, version])
.. function:: dumps(value, version=version, /, *, allow_code=True)

Return the bytes object that would be written to a file by ``dump(value, file)``. The
value must be a supported type. Raise a :exc:`ValueError` exception if value
has (or contains an object that has) an unsupported type.
:ref:`Code objects <code-objects>` are only supported if *allow_code* is true.

The *version* argument indicates the data format that ``dumps`` should use
(see below).

.. audit-event:: marshal.dumps value,version marshal.dump

.. versionchanged:: 3.13
Added the *allow_code* parameter.
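
A minimal sketch of the serializing side, assuming Python 3.13 or later: a code object nested inside a container should make ``dumps`` raise :exc:`ValueError` when ``allow_code=False``, as the added test expects. The names ``code``, ``data``, ``blocked`` and ``plain`` are illustrative only.

```python
import marshal
import sys

# Build a small code object and nest it inside supported containers.
code = compile("x = 1", "<example>", "exec")
data = {"a": [(code, 0)]}

blocked = None  # stays None on interpreters that lack allow_code
if sys.version_info >= (3, 13):
    try:
        marshal.dumps(data, allow_code=False)
        blocked = False
    except ValueError:
        # Nested code objects are rejected, not only top-level ones.
        blocked = True

# Data without code objects serializes as usual.
plain = marshal.dumps({"a": [(0,)]})
```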

.. function:: loads(bytes)

.. function:: loads(bytes, /, *, allow_code=True)

Convert the :term:`bytes-like object` to a value. If no valid value is found, raise
:exc:`EOFError`, :exc:`ValueError` or :exc:`TypeError`. Extra bytes in the
input are ignored.
:exc:`EOFError`, :exc:`ValueError` or :exc:`TypeError`.
:ref:`Code objects <code-objects>` are only supported if *allow_code* is true.
Extra bytes in the input are ignored.

.. audit-event:: marshal.loads bytes marshal.load

@@ -114,6 +132,9 @@ The module defines these functions:
This call used to raise a ``code.__new__`` audit event for each code object. Now
it raises a single ``marshal.loads`` event for the entire load operation.

.. versionchanged:: 3.13
Added the *allow_code* parameter.
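
The reading side can be sketched the same way, again assuming Python 3.13 or later: a payload that contains a code object loads normally by default, while ``loads(..., allow_code=False)`` should raise :exc:`ValueError`. The names ``payload`` and ``rejected`` are hypothetical.

```python
import marshal
import sys

code = compile("x = 1", "<example>", "exec")
payload = marshal.dumps(code)  # allow_code defaults to True

rejected = None  # stays None on interpreters that lack allow_code
if sys.version_info >= (3, 13):
    try:
        marshal.loads(payload, allow_code=False)
        rejected = False
    except ValueError:
        # The payload holds a code object, so loading is refused.
        rejected = True

# With the default setting the code object is restored as before.
restored_code = marshal.loads(payload)
```

This is the case that matters when the bytes may come from a different Python version, since the format of code objects is not compatible between versions.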


In addition, the following constants are defined:

8 changes: 8 additions & 0 deletions Doc/whatsnew/3.13.rst
@@ -247,6 +247,14 @@ ipaddress
* Add the :attr:`ipaddress.IPv4Address.ipv6_mapped` property, which returns the IPv4-mapped IPv6 address.
(Contributed by Charles Machalow in :gh:`109466`.)

marshal
-------

* Add the *allow_code* parameter in module functions.
Passing ``allow_code=False`` prevents serialization and de-serialization of
code objects which are incompatible between Python versions.
(Contributed by Serhiy Storchaka in :gh:`113626`.)

mmap
----

1 change: 1 addition & 0 deletions Include/internal/pycore_global_objects_fini_generated.h

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

1 change: 1 addition & 0 deletions Include/internal/pycore_global_strings.h
@@ -276,6 +276,7 @@ struct _Py_global_strings {
STRUCT_FOR_ID(after_in_child)
STRUCT_FOR_ID(after_in_parent)
STRUCT_FOR_ID(aggregate_class)
STRUCT_FOR_ID(allow_code)
STRUCT_FOR_ID(append)
STRUCT_FOR_ID(argdefs)
STRUCT_FOR_ID(arguments)
1 change: 1 addition & 0 deletions Include/internal/pycore_runtime_init_generated.h

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

3 changes: 3 additions & 0 deletions Include/internal/pycore_unicodeobject_generated.h

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

26 changes: 26 additions & 0 deletions Lib/test/test_marshal.py
@@ -129,6 +129,32 @@ def test_different_filenames(self):
self.assertEqual(co1.co_filename, "f1")
self.assertEqual(co2.co_filename, "f2")

def test_no_allow_code(self):
data = {'a': [({0},)]}
dump = marshal.dumps(data, allow_code=False)
self.assertEqual(marshal.loads(dump, allow_code=False), data)

f = io.BytesIO()
marshal.dump(data, f, allow_code=False)
f.seek(0)
self.assertEqual(marshal.load(f, allow_code=False), data)

co = ExceptionTestCase.test_exceptions.__code__
data = {'a': [({co, 0},)]}
dump = marshal.dumps(data, allow_code=True)
self.assertEqual(marshal.loads(dump, allow_code=True), data)
with self.assertRaises(ValueError):
marshal.dumps(data, allow_code=False)
with self.assertRaises(ValueError):
marshal.loads(dump, allow_code=False)

marshal.dump(data, io.BytesIO(), allow_code=True)
self.assertEqual(marshal.load(io.BytesIO(dump), allow_code=True), data)
with self.assertRaises(ValueError):
marshal.dump(data, io.BytesIO(), allow_code=False)
with self.assertRaises(ValueError):
marshal.load(io.BytesIO(dump), allow_code=False)

@requires_debug_ranges()
def test_minimal_linetable_with_no_debug_ranges(self):
# Make sure when demarshalling objects with `-X no_debug_ranges`
@@ -0,0 +1,3 @@
Add support for the *allow_code* argument in the :mod:`marshal` module.
Passing ``allow_code=False`` prevents serialization and de-serialization of
code objects, which are incompatible between Python versions.