Add package vulnerability reporting API #9552

dekkagaijin · 2021-05-20T00:09:58Z

PyPI today does not store information about known vulnerabilities. We’d like to build the necessary changes such that known vulnerabilities can be associated with package releases. The ultimate goal is that users of “pip” will be automatically warned if their dependencies are vulnerable.

See: #9407, https://discuss.python.org/t/proposing-a-community-maintained-database-of-pypi-package-vulnerabilities/8374

Signed-off-by: Jake Sanders <jsand@google.com>

dekkagaijin · 2021-05-20T00:13:00Z

/cc @di @oliverchang

dekkagaijin · 2021-05-20T03:59:24Z

Tests are passing, but <100% coverage. Please direct hatred at the changelist

ewjoachim · 2021-05-20T13:37:49Z

Woo, I'm seeing a lot of interactions with the Token Disclosure code I contributed :)
If I may suggest a few general things:

* In the token disclosure code, there are 2 places where we take an arbitrary JSON payload and extract data from it (the meta API and the disclosure payload), while trying very hard to check that it has the expected format. There aren't (yet) many Warehouse APIs that accept JSON input payloads, so I had to do a little bit of work there, but if we're going toward refactoring and/or imitating this code, I think it would be much easier to integrate a validation lib such as https://pypi.org/project/jsonschema/ (we could also go with Pydantic, Marshmallow or others like that, but they're also quite oriented toward serialization and we're only interested in deserialization here). This may make your work much easier, and if you think it's worth it, I can try to make a first PR or draft to do that on the token disclosure side.
* It really wasn't easy to test the full token disclosure scenario (PyPI has many unit tests but fewer integration tests, and it's not always easy to write those). I made the "notgithub" service (see Notgithub: a service to simulate GitHub token scanning #9269) exactly to ease that, and I'll be glad if you want to extend it to be capable of sending vulnerabilities, given that the format seems to be quite close. As of today, "notgithub"'s only known use case is Warehouse, so I don't have a problem if we add some coupling.
* Finally, around this part of the code, there are also discussions about integrating GitLab secrets detection (Gitlab secrets detection #9280), but the format is not yet defined. If we move this code, let's make sure we make implementing future token scanning inputs easier rather than harder (I know it's not very actionable advice, but in case of doubt during implementation regarding reusability, I hope this can provide some guidance).

dekkagaijin · 2021-06-02T01:01:43Z

/assign @di

dekkagaijin · 2021-06-02T01:07:23Z

@di This is in a reviewable state, at least for a first pass. I'll be adding more unit tests for 100% coverage of what exists, and modeling retrieve_public_key_payload and extract_public_keys after what exists for the token leak API.

di

Preliminary review

warehouse/integrations/vulnerabilities/utils.py

warehouse/packaging/models.py

warehouse/integrations/vulnerabilities/utils.py

di · 2021-06-02T22:00:03Z

warehouse/packaging/models.py

+    vulnerabilities = orm.relationship(
+        Vulnerability,
+        backref="releases",
+        secondary=lambda: release_vulnerabilities,
+        passive_deletes=True,
+    )
+


Should probably define this on the Vulnerability model instead.

Would we not want both?

di · 2021-06-02T22:00:31Z

warehouse/integrations/vulnerabilities/utils.py

+class VulnerabilityReport:
+    def __init__(
+        self,
+        project: str,
+        versions: List[str],
+        vulnerability_id: str,
+        advisory_link: str,
+        aliases: List[str],
+    ):


I think we can probably condense the Vulnerability database model and this class into a single class. The class can also live at warehouse/integrations/vulnerabilities/models.py instead, and be imported into warehouse.packaging.models.

As in, do the model representation, report validating, and DB interaction in a single class?
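
One way to picture the single-class idea discussed here is a minimal sketch in which the report type validates its own input. The field names come from the `VulnerabilityReport` constructor in the diff above; the `from_payload` helper, the `InvalidVulnerabilityReport` exception, and the assumption that the JSON keys mirror the constructor arguments are all hypothetical, and database interaction is left out entirely.

```python
from dataclasses import dataclass
from typing import Any, List


class InvalidVulnerabilityReport(Exception):
    """Raised when a payload lacks the expected shape (hypothetical name)."""


@dataclass(frozen=True)
class VulnerabilityReport:
    # Field names mirror the constructor shown in the diff.
    project: str
    versions: List[str]
    vulnerability_id: str
    advisory_link: str
    aliases: List[str]

    @classmethod
    def from_payload(cls, payload: Any) -> "VulnerabilityReport":
        """Validate a decoded JSON object and build a report from it.

        Assumption: the payload keys match the field names one-to-one.
        """
        if not isinstance(payload, dict):
            raise InvalidVulnerabilityReport("payload must be a JSON object")

        def string(key: str) -> str:
            value = payload.get(key)
            if not isinstance(value, str) or not value:
                raise InvalidVulnerabilityReport(
                    f"{key!r} must be a non-empty string"
                )
            return value

        def string_list(key: str) -> List[str]:
            value = payload.get(key)
            if not isinstance(value, list) or not all(
                isinstance(item, str) for item in value
            ):
                raise InvalidVulnerabilityReport(
                    f"{key!r} must be a list of strings"
                )
            return list(value)

        return cls(
            project=string("project"),
            versions=string_list("versions"),
            vulnerability_id=string("vulnerability_id"),
            advisory_link=string("advisory_link"),
            aliases=string_list("aliases"),
        )
```

Keeping validation in a classmethod means a malformed report fails before any object exists, which is the property a schema library such as jsonschema would provide declaratively.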

dekkagaijin · 2021-06-18T03:41:19Z

@di I'm still finishing up the last few test cases, but please TAL

warehouse/integrations/vulnerabilities/__init__.py

warehouse/integrations/vulnerabilities/osv/__init__.py

warehouse/integrations/vulnerabilities/utils.py

ewjoachim · 2021-07-02T20:09:08Z

tests/unit/integration/vulnerabilities/osv/test_package.py

+
+    def test_retrieve_public_key_payload_json_error(self):
+        response = pretend.stub(
+            text="Still a non-json teapot",


(Is it bad that I laughed at this given this is a copy of a joke I made :D ) (👍)(🍵)

ewjoachim · 2021-07-02T20:26:26Z

(Finished this round of review. I forgot to make a general comment. I appreciate that there was a lot of code inspired from my PR, but if we need to do this a 3rd time, I'm afraid that will start to make quite a lot of duplication. This is by no means a way to block this PR (https://en.wikipedia.org/wiki/Rule_of_three_(computer_programming)) but we can think already about how we can avoid duplicating that much code next time)
(Great job 👍 👍 👍 🎉 )

warehouse/integrations/vulnerabilities/utils.py

dekkagaijin · 2021-07-14T18:53:12Z

I would still like to see tagging implemented before merge.

Tagging?

ewdurbin · 2021-07-14T18:58:38Z

I would still like to see tagging implemented before merge.

Tagging?

see the comment that is in reply to #9552 (comment)

dekkagaijin · 2021-07-15T23:25:13Z

I would still like to see tagging implemented before merge.

done

ewdurbin

one request (the timing metric tag) and some suggestions. none of it strictly critical for merge.

ewdurbin · 2021-07-21T19:39:03Z

warehouse/migrations/versions/1dbb95161e5a_add_vulnerability_and_release_.py

+revision = "1dbb95161e5a"
+down_revision = "08ccc59d9857"
+
+# Note: It is VERY important to ensure that a migration does not lock for a


comment(s) from template can be removed (not critical)

ewdurbin · 2021-07-21T19:39:23Z

warehouse/migrations/versions/1dbb95161e5a_add_vulnerability_and_release_.py

+
+
+def downgrade():
+    # ### commands auto generated by Alembic - please adjust! ###


comment(s) from template can be removed (not critical)

ewdurbin · 2021-07-21T19:39:35Z

warehouse/migrations/versions/1dbb95161e5a_add_vulnerability_and_release_.py

+
+
+def upgrade():
+    # ### commands auto generated by Alembic - please adjust! ###


comment(s) from template can be removed (not critical)

warehouse/integrations/vulnerabilities/utils.py

ewdurbin · 2021-07-21T19:50:30Z

warehouse/integrations/vulnerabilities/utils.py

+def analyze_vulnerability(request, vulnerability_report, origin, metrics):
+    metrics.increment("warehouse.vulnerabilities.received", tags=[f"origin:{origin}"])
+    try:
+        _analyze_vulnerability(


let's wrap this call in `with metrics.timed('warehouse.vulnerabilities.analysis', tags=[f"origin:{origin}"]):`

this will emit timing metrics so we can keep tabs on this programmatically.

done, good idea
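
The suggested instrumentation can be sketched roughly as follows. `RecordingMetrics` is a hypothetical stand-in that only records calls and is not the Warehouse metrics service; the body of `_analyze_vulnerability` is a placeholder, and the `request` parameter from the real signature is omitted for brevity. Only the metric names and the `origin` tag come from the thread.

```python
import time
from contextlib import contextmanager


class RecordingMetrics:
    """Hypothetical in-memory metrics client used for illustration only."""

    def __init__(self):
        self.counters = []
        self.timings = []

    def increment(self, name, tags=None):
        self.counters.append((name, tuple(tags or ())))

    @contextmanager
    def timed(self, name, tags=None):
        start = time.perf_counter()
        try:
            yield
        finally:
            # Record the duration even if the wrapped block raises.
            elapsed = time.perf_counter() - start
            self.timings.append((name, tuple(tags or ()), elapsed))


def _analyze_vulnerability(vulnerability_report):
    # Placeholder for the real analysis; returns the number of versions.
    return len(vulnerability_report["versions"])


def analyze_vulnerability(vulnerability_report, origin, metrics):
    tags = [f"origin:{origin}"]
    metrics.increment("warehouse.vulnerabilities.received", tags=tags)
    # Time only the analysis itself, tagged by origin rather than namespaced.
    with metrics.timed("warehouse.vulnerabilities.analysis", tags=tags):
        return _analyze_vulnerability(vulnerability_report)
```

Passing the origin as a tag keeps a single metric name per event, so results from different reporters can be compared or aggregated without defining a new metric for each source.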

* Add package vulnerability reporting API

Signed-off-by: Jake Sanders <jsand@google.com>

* genericize vuln report handling, split osv into a subpackage

Signed-off-by: Jake Sanders <jsand@google.com>

* use kwargs instead of positionals for VulnerabilityReportRequest constructor

Signed-off-by: Jake Sanders <jsand@google.com>

* use tags rather than namespacing metrics by origin

* also load all releases when loading vulnerability records

* add description for vuln report model columns

* remove comments from vulnerability migration

Signed-off-by: Jake Sanders <jsand@google.com>

* instrument `_analyze_vulnerability`

Signed-off-by: Jake Sanders <jsand@google.com>

Co-authored-by: Dustin Ingram <di@users.noreply.github.com>

dekkagaijin force-pushed the vulnz-api branch 7 times, most recently from 683ff3d to 3f01152 Compare June 1, 2021 23:11

di reviewed Jun 2, 2021

dekkagaijin force-pushed the vulnz-api branch 3 times, most recently from e912f0b to 1f71f43 Compare June 14, 2021 22:51

dekkagaijin force-pushed the vulnz-api branch 8 times, most recently from 14e463d to 2eb2332 Compare June 18, 2021 00:59

dekkagaijin force-pushed the vulnz-api branch 3 times, most recently from f10a17f to 2e7d51a Compare June 19, 2021 00:17

dekkagaijin changed the title ~~[WIP] Add package vulnerability API~~ Add package vulnerability API Jun 19, 2021

dekkagaijin force-pushed the vulnz-api branch from 2e7d51a to 8effb89 Compare June 19, 2021 00:22

oliverchang mentioned this pull request Jul 2, 2021

Add a cloud function for posting vulnerabilities to PyPI. google/osv.dev#184

Merged

ewjoachim reviewed Jul 2, 2021

dekkagaijin force-pushed the vulnz-api branch 2 times, most recently from 76e70ff to 91ae897 Compare July 12, 2021 21:23

ewdurbin requested changes Jul 14, 2021

dekkagaijin force-pushed the vulnz-api branch from cdae3bf to e0a3bd3 Compare July 14, 2021 18:43

dekkagaijin force-pushed the vulnz-api branch from e0a3bd3 to ff416cb Compare July 14, 2021 19:07

Jake Sanders added 3 commits July 14, 2021 12:25

Add package vulnerability reporting API

f594390

Signed-off-by: Jake Sanders <jsand@google.com>

genericize vuln report handling, split osv into a subpackage

551e794

Signed-off-by: Jake Sanders <jsand@google.com>

use kwargs instead of positionals for VulnerabilityReportRequest cons…

6c2518d

…tructor Signed-off-by: Jake Sanders <jsand@google.com>

dekkagaijin force-pushed the vulnz-api branch from ff416cb to 6c2518d Compare July 14, 2021 19:25

Jake Sanders added 2 commits July 15, 2021 15:26

use tags rather than namespacing metrics by origin

e3e662a

also load all releases when loading vulnerability records

a43834f

add description for vuln report model columns

f1df4e3

dekkagaijin force-pushed the vulnz-api branch from 13f7c1a to f1df4e3 Compare July 15, 2021 23:58

dekkagaijin requested a review from ewdurbin July 20, 2021 18:30

ewdurbin approved these changes Jul 21, 2021

Jake Sanders added 2 commits July 22, 2021 17:00

remove comments from vulnerability migration

d11a997

Signed-off-by: Jake Sanders <jsand@google.com>

instrument _analyze_vulnerability

266857c

Signed-off-by: Jake Sanders <jsand@google.com>

dekkagaijin force-pushed the vulnz-api branch from f8246c8 to 266857c Compare July 23, 2021 00:27

Merge branch 'main' into vulnz-api

e182cdb

di merged commit 3c54782 into pypi:main Jul 26, 2021

dekkagaijin deleted the vulnz-api branch July 26, 2021 15:02

di mentioned this pull request Mar 23, 2022

Generic interface for associating package releases with vulnerabilities #9407

Open
