Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix PEP706 warning #3319

Closed
pmoravec opened this issue Jul 26, 2023 · 6 comments
Closed

Fix PEP706 warning #3319

pmoravec opened this issue Jul 26, 2023 · 6 comments

Comments

@pmoravec
Copy link
Contributor

pmoravec commented Jul 26, 2023

Since Python 3.12, a new warning will be raised in sos/cleaner/archives/__init__.py : archive.extractall(path). The call will emit a warning by default. See https://peps.python.org/pep-0706/ for background / rationale.

To prevent that, we can add something like this before the call:

archive.extraction_filter = getattr(tarfile, 'data_filter',
                                       (lambda member, path: member))

This is compatible with unpatched versions of Python, and for patched Python it is equivalent to the call: archive.extractall(path, filter='data').

The 'data' filter above attempts a "safe" extraction, intended for pure data archives. For example:

  • prevents extracting outside the target directory, and to absolute paths (by raising an exception)
  • prevents symlinks pointing outside the target directory, and to absolute paths
  • adjusts permissions (for the owner, only the executable bit is honored)

See PEP 706 for details: https://peps.python.org/pep-0706/#filters

If we trust the tarball, we should use 'fully_trusted_filter' (or filter='fully_trusted') instead. That will preserve the existing behavior.

I feel we shall stick on the current behaviour (which still means a code change but with fully_trusted_filter attr option) since the extracted archive might(?) break some assumptions (i.e. the symlinks?) imposed to the safe extraction in some use cases - I am afraid using data_filter could end up in raising uncaught warnings in some use cases (or should we be more strict and fix issues behind those use cases, rather?)

@pmoravec
Copy link
Contributor Author

pmoravec commented Aug 8, 2023

The more I read about the data filter and think about it, the more I think we should use it. Since an extraction of an "unsafe file" (say, extracting outside directory, or extracting symlinks pointing outside the directory) would mean a bug is sos how it prepared the tarball.

Therefore I would go for data filter rather than for fully_trusted filter - I will prepare a PR for that within a day or two.

Any objections or thoughts, @TurboTurtle , @arif-ali or @mhradile ?

@mhradile
Copy link
Contributor

mhradile commented Aug 8, 2023

No objections. Definitely for closing that security hole.

@TurboTurtle
Copy link
Member

No objections here.

pmoravec added a commit to pmoravec/sos that referenced this issue Aug 8, 2023
Since Python 3.12, archive.extractall would require setting an
extraction filter otherwise an exception is raised. See PEP 706 for
more.

Since sos extracts just tarballs that should be compliant with safe
extraction (no extracted file outside the target directory etc.), we
should stick to the data filter.

Relevant: sosreport#3319
Closes: sosreport#3330

Signed-off-by: Pavel Moravec <pmoravec@redhat.com>
@arif-ali
Copy link
Member

arif-ali commented Aug 8, 2023

makes sense

pmoravec added a commit to pmoravec/sos that referenced this issue Aug 10, 2023
Since Python 3.12, archive.extractall would require setting an
extraction filter otherwise an exception is raised. See PEP 706 for
more.

Since Python 3.10 and 3.11 can raise false alarms as exceptions (see sosreport#3330
for examples), we must use the legacy fully_trusted filter.

Relevant: sosreport#3319
Closes: sosreport#3330

Signed-off-by: Pavel Moravec <pmoravec@redhat.com>
TurboTurtle pushed a commit that referenced this issue Aug 15, 2023
Since Python 3.12, archive.extractall would require setting an
extraction filter otherwise an exception is raised. See PEP 706 for
more.

Since Python 3.10 and 3.11 can raise false alarms as exceptions (see #3330
for examples), we must use the legacy fully_trusted filter.

Relevant: #3319
Closes: #3330

Signed-off-by: Pavel Moravec <pmoravec@redhat.com>
@jjansky1
Copy link
Contributor

Can we close this in favor of PR #3330 ?

@TurboTurtle
Copy link
Member

Indeed we can.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants