Skip to content

Commit

Permalink
00397: PEP 706, CVE-2007-4559: Filter API for tarfile.extractall
Browse files Browse the repository at this point in the history
Add API for allowing checks on the content of tar files, allowing callers to mitigate
directory traversal (CVE-2007-4559) and related issues.

Python 3.12 will warn if this API is not used.
Python 3.14 will fail if it's not used.

Backport from python#102950

Change document: https://peps.python.org/pep-0706/
  • Loading branch information
encukou authored and hroncok committed Mar 7, 2024
1 parent e7ecd65 commit 34ba358
Show file tree
Hide file tree
Showing 9 changed files with 1,790 additions and 78 deletions.
33 changes: 27 additions & 6 deletions Doc/library/shutil.rst
Expand Up @@ -537,7 +537,7 @@ provided. They rely on the :mod:`zipfile` and :mod:`tarfile` modules.
Remove the archive format *name* from the list of supported formats.


.. function:: unpack_archive(filename[, extract_dir[, format]])
.. function:: unpack_archive(filename[, extract_dir[, format[, filter]]])

Unpack an archive. *filename* is the full path of the archive.

Expand All @@ -551,6 +551,24 @@ provided. They rely on the :mod:`zipfile` and :mod:`tarfile` modules.
registered for that extension. In case none is found,
a :exc:`ValueError` is raised.

The keyword-only *filter* argument, which was added in Python 3.6.16,
is passed to the underlying unpacking function.
For zip files, *filter* is not accepted.
For tar files, it is recommended to set it to ``'data'``,
unless using features specific to tar and UNIX-like filesystems.
(See :ref:`tarfile-extraction-filter` for details.)
The ``'data'`` filter will become the default for tar files
in Python 3.14.

.. warning::

Never extract archives from untrusted sources without prior inspection.
It is possible that files are created outside of the path specified in
the *extract_dir* argument, e.g. members that have absolute filenames
starting with "/" or filenames with two dots "..".

.. versionchanged:: 3.6.16
Added the *filter* argument.

.. function:: register_unpack_format(name, extensions, function[, extra_args[, description]])

Expand All @@ -559,11 +577,14 @@ provided. They rely on the :mod:`zipfile` and :mod:`tarfile` modules.
``.zip`` for Zip files.

*function* is the callable that will be used to unpack archives. The
callable will receive the path of the archive, followed by the directory
the archive must be extracted to.

When provided, *extra_args* is a sequence of ``(name, value)`` tuples that
will be passed as keywords arguments to the callable.
callable will receive:

- the path of the archive, as a positional argument;
- the directory the archive must be extracted to, as a positional argument;
- possibly a *filter* keyword argument, if it was given to
:func:`unpack_archive`;
- additional keyword arguments, specified by *extra_args* as a sequence
of ``(name, value)`` tuples.

*description* can be provided to describe the format, and will be returned
by the :func:`get_unpack_formats` function.
Expand Down

0 comments on commit 34ba358

Please sign in to comment.