-
-
Notifications
You must be signed in to change notification settings - Fork 31.7k
[security] CVE-2021-3426: Information disclosure via pydoc -p: /getfile?key=path allows to read arbitrary file on the filesystem #87154
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Hello Python security, Running Steps to Reproduce:
Actual results: Expected results: Additional info: Python notebook works around this by providing a token that is required to access the notepad. Depending on the system being able to read arbitrary files can allow to impersonate my, by e.g. stealing my ssh-key (if it is non-encrypted) I've originally reported this to security@python.org but I was asked to open a public issue here. |
Nice find! I am able to reproduce it too in many Python releases. I see differnt ways we can fix it: # Using a random secret generated at startup time Used any way, like by providing an hmac on getfile urls to ensure they are signed with the server secret. Con: getfile URLS won't work from a run to another run (the secret should be random and changed at every start), and can't be shared (do someone share them in the first place?) # Allowlist according to sys.path In getfile implementation, we can check if the asked path is under a path from sys.path. Con: If someone have ~/ in sys.path, we still can access all its home, or if someone start it using # Allowlist populated while generating links Idea is: each time the server generates a getfile link, the target is added to an allowlist. Each time a getfile link is requested, if the file is not in the allowlist, request is denied. Con: Refreshing a page won't work after a server restart (thus having an empty allowlist). # fnmatch allowlist We could allow only Con: Secrets stored in ----------------- My personal preference goes for the allowlist populated while generating links. |
Downstream Fedora issue: https://bugzilla.redhat.com/show_bug.cgi?id=1917807 |
Looking at the _url_handler() code in pydoc.py, this was clearly not written with web server standards in mind. None of the handlers apply security checks on the user input and there are most likely several other vulnerabilities in there to be found. It's not just the getfile query allowing reading arbitrary files. The user may well have code in his or her Python installation which is not meant to be published to other users on the same server. I'd suggest to print a big warning on the console, explaining that the web server will potentially make all content accessible by the user visible to anyone else on the same server. Perhaps adding some extra check to the html_getfile() handler would be good as well, making sure that the path is on sys.path and maps to a Python file (there could be non-Python file resources in package dirs as well). Alternatively, perhaps the whole getfile logic could be removed and the web server just provide the path to the source file (as file:// link), so that the user can easily open it, but needs access permissions for this to be successful. |
I searched for "pydoc by Ka-Ping Yee" in Google and only found two online pydoc services:
The first one runs on Python 2.7 which doesn't have the getfile feature (added in Python 3.6), the second one is a static website. => there is no public vulnerable website: good! I don't think that pydoc is commonly used to run a server on the Internet. |
An option is also to remove the whole getfile feature. It was added in bpo-2001 by: commit 7bb30b7
|
The getfile feature is used to display the source code of a Python module. For example, on the difflib documentation, there a link to difflib.py. If you click, a webpage displays the content of the file. I suggest to remove the whole feature. I don't think that it's so useful, compared to the vulnerability. |
I created a PR to remove the getfile function - now it just places the hyperlinked file path there but clicking on it won't render the file contents. Personally I agree with Marc-Andre Lemburg's comments on how _url_handler probably has other vulnerabilities somewhere. But I don't really see an easy solution other than removing the web server altogether. It uses http.server, which has a disclaimer on the docs page saying it isn't recommended for production. Someone looking hard enough can probably find a few more vulnerabilities in http.server itself rather than just pydoc. I think the "Allowlist populated while generating links" suggested by Julien is pretty clever. I thought about file: // approach too - it's probably the most secure. But it would require a lot of change (and also generating all the .py files to .html initially). Maybe I'll make a PR exploring the other approaches if the current one isn't favorable. Thanks for your time. |
I dislike this idea. If they are vulnerabilities, they should be fixed. Users usually have no idea what to do when seeing such warning. |
On 22.01.2021 01:28, STINNER Victor wrote:
The problem is that neither the docs nor the help text in the command While the getfile API endpoint can be used to view non-Python files In a corporate environment this can easily result in data leaks Fixing just getfile or replacing those links with file:// ones will Python's http.server at least warns about this in the docs: My guess is that pydoc -p really is just intended to be useful """ That would make it harder to guess the base URL and limit |
Why not limit the serving to sys.path? |
Fidget-Spinner wrote on the PR:
pydoc shows global constant values in the doc. So yes, if you find a settings.py of a Django project, you can discover secrets. I'm working on bpo-42955 "Add sys.module_names: list of stdlib module names (Python and extension modules)". One option would be to restrict pydoc to stdlib modules by defaults, and ask to opt-in for discovery of any module installed on the system (sys.path). |
I would be fine with a warning in the pydoc documentation, but I dislike warnings display on the command line. When I see such warning, I feel that the machine considers that I'm dumb and I have no idea of what I am doing. If it's unsafe, can we make it safe by default? |
I have updated the PR to do the following:
Now it says:
>>> python.exe -m pydoc -b
Server ready at http://localhost:52035/Y1YzOyEbitE9BB_dtH0YXbMgGXbcg3ytXLpvpg8P7GEM3z1hlCkTXgxaojtAordVqs2s6oHZHPMbXqq9mXq_wbJCVW8jnHrgQeYE5hFUQuI/ FWIW, it seems that Jupyter notebook server deals with the same problems in a similar manner: https://jupyter-notebook.readthedocs.io/en/stable/security.html#security-in-the-jupyter-notebook-server I removed the warning message in the PR because I think this is secure enough. |
PR 24337 uses different approach. It keeps compatibility, but checks that the argument is a file path of the source of one of modules (using the same algorithm as /search). |
While this approach solves the getfile problem, I don't think this will solve the other problem of pydoc leaking secrets stored in python files: Quoting from Marc-Andre Lemburg's message:
Quoting from Victor's messages:
Ultimately, the problem seems to be that .py files (other than those in the stdlib) may contain sensitive info, which pydoc can read. |
Resolution of this issue is blocking 3.7.x and 3.6.x security releases and threatens to block upcoming maintenance releases. |
While this vulnerability is bad, it only impacts very few users who run pydoc server. I suggest to not hold the incoming Python release (remove the "release blocker" priority) just for this one. If it's fixed before: great! But IMO it can wait for another Python release. |
I agree. |
Todd Cullum from Red Hat Security team: "I don't have an account on Python's tracker, would you mind forwarding to upstream on my behalf that this is not only locally exploitable, but it can be exploited by actors on the adjacent network as well because 6a396c9 was introduced in Python 3.7.0 alpha 1. I just used the -n option and got to read some of my own files using my cell phone on the WiFi. It does require the port to be unblocked by firewall though." |
This is now CVE-2021-3426. |
I created https://python-security.readthedocs.io/vuln/pydoc-getfile.html to track this vulnerability. The is no CVE section yet since the CVE is currently only *RESERVED*. |
Fedora downstream issue: https://bugzilla.redhat.com/show_bug.cgi?id=1937476 |
FWIW, I don't think we should even have a server feature in pydoc... |
The "pydoc -p port" command only listen on the local link ("localhost") by default, even if it's possible to listen on another IPv4 address using -n HOSTNAME command line option. While the "getfile" feature is convenient when the pydoc server is accessed from a different machine, I don't think that it's worth it, compared to the security risks and the complexity of PR 24285 and PR 24337 fixes. I propose to simply remove the "getfile" feature with PR 25015, but keep links using file:// scheme. So we delegate the security to the web browser. If the web browser is allowed to read sensitive data of a local Python file, it's not our problem: pydoc doesn't make things worse. |
Curious, what is actually being called out as a vulnerability here? The fact that pydoc doesn't enforce authentication? Seems this works the same as any other service/application works (when accessing the application/service, you are granted the permissions of the account running that application/service when it interacts with the operating system). I can think of dozens of applications and protocols that would be victim to this same finding. For example, running python -m http.server 80 . from a home directory does the same thing (still works even on version 3.11) does that mean that python3.11 is broken? |
It's a matter of meeting expectations for users of a tool so that they know when they're choosing to expose themselves to risks. Python's standard library pydoc's former built in web server could reasonably have been expected not to serve other data, as that isn't what the user asked for or would ever assume it would allow. Thus it was reasonable to consider it doing so a vulnerability. |
thanks for the explanation, that does make a bit more sense. Personally I am not familiar with pydoc, so when reading the vulnerability through the first time it sounded no different than if nginx, apache or iis was misconfigured, or running a tftp, ftp, or smb share that disclosed sensitive documents or for that matter any other production application that is used for file sharing. I appreciate the explanation and will chalk it up to my ignorance to pydoc and how its used. |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: