How to reproduce CVE-2021-3572? #10042

Closed
frenzymadness opened this issue Jun 7, 2021 · 11 comments
Labels
type: question User question

Comments

@frenzymadness
Contributor

I'm trying to understand the CVE-2021-3572 which was fixed by #9827 and create a test (reproducer) for it. The CVE is reserved but without further details yet: https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2021-3572

I think that if an attacker can gain access to a git repository from which a user installs a package, there is no need to create special tags to hijack a commit-based pin on the user's side.

Let me demonstrate that this also works with the latest version of pip, without any special Unicode characters in tags.

  • Let's say I want to install a specific version of a Python package so I use a command like this: pip install git+https://gitlab.cee.redhat.com/lbalhar/marshalparser.git@b5f61d773ca28f7afb2bb5623d142ee27e3daecf. Because the sha hash points to the specific commit in the git repository, I always have the same version installed.
  • Now, an attacker gains access to the repository the user is installing the package from. The attacker does not need to do anything special to make the user install a different version. The only thing the attacker needs to do is to create a new tag named b5f61d773ca28f7afb2bb5623d142ee27e3daecf.
  • Now, if the user uses exactly the same command, the installed version would be completely different.

Full log:

# User

$ pip list 
Package    Version
---------- -------
pip        21.1.2
setuptools 56.2.0
wheel      0.36.2

$ pip install git+https://gitlab.cee.redhat.com/lbalhar/marshalparser.git@b5f61d773ca28f7afb2bb5623d142ee27e3daecf
…
Successfully installed marshalparser-0.1.1

$ pip list
Package       Version
------------- -------
marshalparser 0.1.1
pip           21.1.2
setuptools    56.2.0
wheel         0.36.2

# Attacker tags the latest commit to force user to update

$ git tag b5f61d773ca28f7afb2bb5623d142ee27e3daecf
$ git push origin master --tags

# User again

$ pip install git+https://gitlab.cee.redhat.com/lbalhar/marshalparser.git@b5f61d773ca28f7afb2bb5623d142ee27e3daecf
…
Successfully installed marshalparser-0.2.6

$ pip list
Package       Version
------------- -------
marshalparser 0.2.6
pip           21.1.2
setuptools    56.2.0
wheel         0.36.2

Could somebody please explain why an attacker would need to use special Unicode linebreak characters in a tag name to hijack the commit-based requirement if it can be done in the simple way described above?

The truth is that these simple steps do not work for me on GitHub, because GitHub does not allow branches or tags to have names that look like commit hashes:

remote: error: GH002: Sorry, branch or tag names consisting of 40 hex characters are not allowed.

So, is the fix in the pip needed for Github, where you cannot use the described attack in the simple way?

@pradyunsg pradyunsg changed the title from "Test for CVE-2021-3572" to "How to reproduce CVE-2021-3572?" Jun 7, 2021
@pradyunsg
Member

If the tag you're trying to hijack is AAAA, to instead provide the user with the revision BBBB, you'd need to do something like:

git checkout BBBB
git tag "$(printf 'AAAA\u2028a\u2000a/AAAA')"
git push origin "$(printf 'AAAA\u2028a\u2000a/AAAA')"

With this, when pip is trying to install from AAAA, it will instead misparse that tag and install using the revision BBBB.
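
The misparse described above can be illustrated with a small Python sketch. This is a simplified model and not pip's actual source: the function names `parse_naive` and `parse_strict`, and the placeholder hashes, are made up for illustration, and the stricter parser is only an assumption about the general shape of a fix. It relies on two documented behaviours of Python strings: `str.splitlines()` treats U+2028 as a line boundary, and `str.split()` with no arguments splits on any Unicode whitespace, including U+2000.

```python
# Simplified sketch of the misparse; hypothetical names, not pip's source.
AAAA = "a" * 40  # stand-in for the commit hash the user pinned
BBBB = "b" * 40  # stand-in for the attacker's commit

# One line of `git show-ref AAAA` output for the crafted tag.
crafted_tag = f"{AAAA}\u2028a\u2000a/{AAAA}"
show_ref_output = f"{BBBB} refs/tags/{crafted_tag}\n"


def parse_naive(output):
    """Split on any Unicode line break, then on any Unicode whitespace."""
    refs = {}
    for line in output.strip().splitlines():
        sha, ref = line.split()
        refs[ref] = sha
    return refs


def parse_strict(output):
    """Split only on '\\n', then only on the first ASCII space."""
    refs = {}
    for line in output.strip().split("\n"):
        line = line.rstrip("\r")
        if not line:
            continue
        sha, ref = line.split(" ", 1)
        refs[ref] = sha
    return refs


naive = parse_naive(show_ref_output)
strict = parse_strict(show_ref_output)

# The naive parser believes a tag literally named AAAA points at BBBB.
print(naive.get(f"refs/tags/{AAAA}") == BBBB)
# The strict parser only sees the full, odd-looking tag name.
print(f"refs/tags/{AAAA}" in strict)
```

In the naive parser, U+2028 cuts the single output line in two, so the first half reads as an ordinary `<sha> refs/tags/AAAA` entry. The `a\u2000a/` part of the tag name makes the second half split into exactly two fields as well, so no parsing error is raised.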

@pradyunsg pradyunsg added the type: question User question label Jun 7, 2021
@pradyunsg
Member

pradyunsg commented Jun 7, 2021

So, is the fix in the pip needed for Github, where you cannot use the described attack in the simple way?

Yea, basically.

It is fixing a logical mistake in the parsing of the tags, which could be used to inject malicious tags on repositories in a manner that's specific to pip, and is non-trivial for repository hosting services to defend against (if they choose to do so).
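
To make the hosting-side point concrete, here is a minimal sketch of a rule like the one suggested by the GH002 message. It is an assumption for illustration only: `rejected_by_hex_rule` is a hypothetical name, and the real server-side check may differ. The sketch shows that a rule rejecting names of exactly 40 hex characters does not catch the crafted tag, because that tag is longer and contains non-hex characters.

```python
import re

# Hypothetical rule modelled on the GH002 message: reject ref names
# that consist of exactly 40 hexadecimal characters.
FULL_SHA = re.compile(r"[0-9a-fA-F]{40}")


def rejected_by_hex_rule(name):
    """Return True if the hypothetical rule would reject this ref name."""
    return FULL_SHA.fullmatch(name) is not None


pinned = "b5f61d773ca28f7afb2bb5623d142ee27e3daecf"
crafted = f"{pinned}\u2028a\u2000a/{pinned}"

print(rejected_by_hex_rule(pinned))   # the plain 40-hex tag name
print(rejected_by_hex_rule(crafted))  # the crafted tag name
```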

@frenzymadness
Contributor Author

Thanks for the explanation! Now, I see that the first sentence in the PR confused me:

Previously, maliciously formatted tags could be used to hijack a
commit-based pin.

I've created a repository we can use to test the vulnerability: https://github.com/frenzymadness/CVE-2021-3572 There is a short description about the repository and the vulnerability. If you want to, use it in your own tests.

@frenzymadness
Contributor Author

From my point of view, this issue can be closed now.

@frenzymadness
Contributor Author

By the way, to exploit the vulnerability in older pip (like 9.0.3), the tag has to look different: the old version uses splitlines() but also split(" ", 1), so the tag has to have a Unicode linebreak at the end and nothing more, because an ASCII space is not allowed. But this does not work for newer pip versions. I'll keep the repository set up for newer pip rather than the old ones.

@frenzymadness
Contributor Author

To add more info to my latest comment: older pip (<10) uses a different approach to get the output of git show-ref. It takes all the lines from git show-ref and parses them, while newer pip versions call git show-ref <revision_name> for the specific revision.

That means that older pip versions are even more vulnerable, because all you need to do is create a tag with a blank character at the end of its name. When pip parses the output, the two tags look the same, because the blank character is stripped while processing the command's output. As a result, the one appearing later in the output is the one that ends up in the final mapping between tags and hashes.
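
The collision described above can be sketched as follows. This is a simplified model of the behaviour as described in this thread, not the pip 9 source: `build_ref_map`, the tag name GOOD, and the placeholder hashes are all hypothetical. It relies on `str.rstrip()` with no arguments removing any trailing Unicode whitespace, which includes U+2028.

```python
GOOD_SHA = "a" * 40  # stand-in for the legitimate commit
EVIL_SHA = "b" * 40  # stand-in for the attacker's commit

# Two lines of full `git show-ref` output: the real tag, then a tag whose
# name differs only by a trailing U+2028.
lines = [
    f"{GOOD_SHA} refs/tags/GOOD",
    f"{EVIL_SHA} refs/tags/GOOD\u2028",
]


def build_ref_map(output_lines):
    """Map ref name -> sha after stripping trailing whitespace per line."""
    refs = {}
    for line in output_lines:
        sha, ref = line.rstrip().split(" ", 1)
        refs[ref] = sha  # a later entry silently overwrites an earlier one
    return refs


refs = build_ref_map(lines)
print(refs["refs/tags/GOOD"] == EVIL_SHA)
print(len(refs))
```

After stripping, both lines carry the same ref name, so the mapping keeps a single entry and the later line wins.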

@skazi0

skazi0 commented Jun 8, 2021

@frenzymadness I'm trying to patch v9.0.1 for this CVE and the problem is even more complicated there. Because pip.utils.call_subprocess() already calls rstrip() on stdout lines, the captured output of git show-ref no longer contains any trailing newlines (including the malicious ones from the Unicode set).
For v10+ this is not a problem, as git show-ref GOOD matches from the end and will not list entries that have only Unicode whitespace after GOOD.
It looks like the best (and only?) solution for v9 is to port the git show-ref <ref> behavior from later versions, as the full ref mapping is not used for anything other than checking a single rev (correct me if I'm wrong).
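
As a rough illustration of why `git show-ref GOOD` behaves this way: the git documentation states that patterns are matched from the end of the full ref name and only complete path components are matched. The sketch below models that rule in Python; `show_ref_matches` is a hypothetical helper written for this example, not part of git or pip.

```python
def show_ref_matches(refname, pattern):
    """Model of show-ref matching: the pattern must equal the full ref name
    or be a suffix of it that starts at a '/' component boundary."""
    return refname == pattern or refname.endswith("/" + pattern)


# The ordinary tag is listed.
print(show_ref_matches("refs/tags/GOOD", "GOOD"))
# A tag with only a trailing U+2028 is not listed for the pattern GOOD.
print(show_ref_matches("refs/tags/GOOD\u2028", "GOOD"))
# A tag whose last path component is GOOD is listed again.
print(show_ref_matches("refs/tags/GOOD\u2028a\u2000a/GOOD", "GOOD"))
```

This is consistent with the two tag shapes discussed in this thread: the short form with only a trailing linebreak targets the old full-listing approach, while the longer form ending in `/GOOD` is needed for it to show up in the output of `git show-ref GOOD`.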

@frenzymadness
Contributor Author

@skazi0 I've discovered the same. I was thinking about making the rstrip in call_subprocess optional, but your solution seems much more robust. Do you have any estimate of how hard it would be to use the new approach? I'd say that functional tests should uncover any possible issues.

@skazi0

skazi0 commented Jun 9, 2021

@frenzymadness This is my patch: https://github.com/skazi0/CVE-2021-3572/blob/master/CVE-2021-3572-v9.0.1.patch
I forked your repo and set it up with v9 tags. With the above fix, both GOOD\u2028a\u2000a/GOOD and GOOD\u2028 should work as expected.

@frenzymadness
Contributor Author

Good job. Your patch looks very good to me, and it seems it is not that complex, so I'll also have to fix the ancient pip versions in our RPM packages.

@pradyunsg
Member

Closing since there’s nothing actionable for pip’s maintainers here, but I’m glad you’ve been able to reproduce this and come up with backports!

I’ll add a test for this when I’m able to find the time to do so.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 27, 2021