2.5.0 regression: `SortedSet` assumes comparability of members, but `Vulnerability` model is not comparable #245

woodruffw · 2022-06-10T14:25:43Z

Hi there! Thanks a ton for this library.

We currently use it to generate SBOMs in pip-audit, and I noticed an interesting regression upon upgrading to 2.5.0: it looks like Component.add_vulnerability attempts to add the underlying Vulnerability model to a SortedSet, which in turn fails because Vulnerability doesn't appear to implement the standard comparison operators (e.g. __lt__).

Here's the failing code on our side, which worked in 2.4.0:

        for (dep, vulns) in result.items():
            if dep.is_skipped():
                continue
            dep = cast(service.ResolvedDependency, dep)

            c = Component(name=dep.name, version=str(dep.version))
            for vuln in vulns:
                c.add_vulnerability(
                    Vulnerability(
                        id=vuln.id,
                        description=vuln.description,
                        recommendation="Upgrade",
                    )
                )

            self._components.append(c)

and the failing CI tests on 2.5.0: https://github.com/trailofbits/pip-audit/runs/6832431942?check_suite_focus=true

In my estimation, this looks like a bug/regression, rather than a SemVer breakage -- the Vulnerability model also comes from CycloneDX, so it probably should have been made comparable at the same time that comparability was assumed by introducing SortedSet.
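
For readers unfamiliar with the failure mode, here is a minimal sketch of it. It uses the built-in `sorted()` as a stand-in for `SortedSet`, on the assumption that both need to order their members with `<`, and `Vuln` is a hypothetical class, not the library's `Vulnerability` model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vuln:
    """Hypothetical stand-in: hashable and equality-comparable, but no __lt__."""
    id: str


vulns = {Vuln("VULN-2"), Vuln("VULN-1")}

# A plain set only needs __hash__ and __eq__, so building it works.
assert len(vulns) == 2

# Ordering the members needs __lt__, which Vuln does not define.
error = None
try:
    sorted(vulns)
except TypeError as exc:
    error = str(exc)
    print(error)
```

This would explain why the same calling code can work against a plain set and fail once the container becomes a sorted one.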

xref pypa/pip-audit#292

woodruffw · 2022-06-10T14:38:47Z

If I'm right, the only fix needed here is to add a Vulnerability.__lt__ implementation. I can look into that now.

Edit: Looking at the model more closely, it's not clear to me what the sorting semantics should be -- nearly every field is optional, and many are lists (which means that comparing between instances might entail comparing overlapping ranges, which is both unlikely to be semantically coherent and has bad worst-case performance).

madpah · 2022-06-10T14:52:46Z

@woodruffw - thanks for reporting this - let me take a look. In the mean time, are you able to pin your usage to >= 2.4.0?

woodruffw · 2022-06-10T14:53:56Z

Thanks for looking into it!

In the mean time, are you able to pin your usage to >= 2.4.0?

Yep, and I've confirmed that doing so avoids the bug.

RodneyRichardson · 2022-06-10T16:12:57Z

Ah! I'm not sure how I missed that. I thought I'd covered all classes that are used in a SortedSet.

If I'm right, the only fix needed here is to add a Vulnerability.__lt__ implementation. I can look into that now.

It looks like VulnerabilityCredits and OrganizationalEntity also need __lt__. And test_sort functions to the corresponding TestModelXXX classes.

Edit: Looking at the model more closely, it's not clear to me what the sorting semantics should be -- nearly every field is optional, and many are lists (which means that comparing between instances might entail comparing overlapping ranges, which is both unlikely to be semantically coherent and has bad worst-case performance).

I would suggest sort order for Vulnerability: id, description, detail, source, created, published
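
A rough sketch of what such a field-ordered comparator could look like, assuming missing (None) fields should sort after present ones. The `Vuln` class and the `_sort_key` helper are hypothetical, only the first three suggested fields are included for brevity, and this is not the code that was merged into the library.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


def _sort_key(*fields: Optional[str]) -> Tuple[Tuple[bool, str], ...]:
    """Map each optional field to (is_missing, value) so None is never
    compared with a str; missing values sort after present ones."""
    return tuple((f is None, f or "") for f in fields)


@dataclass(frozen=True)
class Vuln:
    """Hypothetical, simplified model; not the library's Vulnerability class."""
    id: Optional[str] = None
    description: Optional[str] = None
    detail: Optional[str] = None

    def __lt__(self, other: object) -> bool:
        if not isinstance(other, Vuln):
            return NotImplemented
        return (_sort_key(self.id, self.description, self.detail)
                < _sort_key(other.id, other.description, other.detail))


items = [Vuln(id="B"), Vuln(), Vuln(id="A", description="x"), Vuln(id="A")]
ordered = sorted(items)
print([(v.id, v.description) for v in ordered])
```

Comparing a fixed tuple of scalar fields sidesteps the list-valued fields entirely, which avoids the worst-case cost raised earlier in the thread.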

jkowalleck · 2022-06-10T16:23:10Z

Regarding a comparator on the model, here are my 2ct:
It does not matter how you implement the comparison, as long as it is deterministic.
Just define an algorithm, implement it, and add a unit test so it does not change unintentionally over time.
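
A sketch of such a test, assuming a `test_sort`-style method as mentioned above. `Item` is a hypothetical model whose ordering comes from `dataclass(order=True)`; the point is only that every input permutation must sort to one pinned order.

```python
import itertools
import unittest
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class Item:
    """Hypothetical comparable model; order=True derives __lt__ from the fields."""
    id: str
    description: str = ""


class TestItemSort(unittest.TestCase):
    def test_sort(self) -> None:
        expected = [Item("a"), Item("a", "x"), Item("b")]
        # Every input permutation must sort to the same, fixed order.
        for perm in itertools.permutations(expected):
            self.assertEqual(sorted(perm), expected)


# Run the test case programmatically so the snippet does not exit the interpreter.
result = unittest.TextTestRunner().run(
    unittest.defaultTestLoader.loadTestsFromTestCase(TestItemSort)
)
```

If the comparison algorithm is later changed, the pinned `expected` list makes the change show up as a test failure rather than as silently reordered output.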

RodneyRichardson · 2022-06-10T16:35:37Z

I've added a partial fix (#246), that should work in this particular situation (when credits are not set). I'm afraid I don't have time to work on this in the next week.

woodruffw · 2022-06-10T17:22:48Z

Thanks @RodneyRichardson!

FWIW, I'd suggest yanking the current release (2.5.0) from PyPI -- IMO the regression here violates SemVer's minor version rules, so users with semantic ranges might experience breakage.

jkowalleck · 2022-06-11T06:06:52Z

Thanks @RodneyRichardson!

FWIW, I'd suggest yanking the current release (2.5.0) from PyPI -- IMO the regression here violates SemVer's minor version rules, so users with semantic ranges might experience breakage.

my 2ct: I do not see sorted lists as a breaking change but as a backwards-compatible feature. No method signature (API) was changed in a backwards-incompatible way. Some return types did change, in a backwards-compatible manner; the returned content did not change in its meaning.
The contents of sets are order-independent: the order of this data never mattered, no matter whether a component is listed first or last, as long as it is in the list (set).
Even typing should be fine according to grantjenks/python-sortedcontainers#107 (comment)

What is your POV? Are there breaking changes for you?

woodruffw · 2022-06-11T14:49:33Z

To clarify: the breaking change I meant wasn’t the new APIs, but the fact that the current APIs stopped working when upgrading from 2.4.0 to 2.5.0 (specifically, the Vulnerability model couldn’t be used with the other APIs that it was intended to be used with). In particular, pip-audit assumes the stability of the API in the 2.x series, so users who try to install it and use SBOM generation currently can’t do so. Without a yank, we’d have to explicitly carve out 2.5.0 in the range of supported versions.
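
To illustrate what such a carve-out means, here is a small sketch of a range check that excludes one release, roughly what a requirement specifier like `>=2.0.0,<3.0.0,!=2.5.0` expresses. The 2.0.0 lower bound and the helper names are assumptions for illustration, not pip-audit's actual constraint.

```python
from typing import Tuple

BROKEN = (2, 5, 0)  # the release discussed in this thread


def parse(version: str) -> Tuple[int, ...]:
    """Parse a plain 'X.Y.Z' version; pre-release and local tags are not handled."""
    return tuple(int(part) for part in version.split("."))


def is_supported(version: str) -> bool:
    """Accept the 2.x series (assumed lower bound 2.0.0) except the broken release."""
    v = parse(version)
    return (2, 0, 0) <= v < (3, 0, 0) and v != BROKEN


for candidate in ("2.4.0", "2.5.0", "2.5.1", "3.0.0"):
    print(candidate, is_supported(candidate))
```

A yank would make this unnecessary for downstream projects, since installers would skip the yanked release unless it is requested explicitly.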

woodruffw · 2022-06-11T15:01:17Z

Just saw your edit 🙂 -- the specific breakage we saw in pip-audit is the code in the initial comment, which works in 2.4.0 but crashes in 2.5.0. As a result, any user who installs pip-audit and uses CycloneDX for SBOM generation will currently see an error, since we had an open SemVer range that allows 2.5.0.

RodneyRichardson · 2022-06-11T18:28:16Z

FWIW, I'd suggest yanking the current release (2.5.0) from PyPI

I agree - I don't think it's a breaking change in the API, but I think it's a broken release that should be recalled. Sorry I don't have time to add __lt__ operators on the other two classes to fix this fully.

jkowalleck · 2022-06-11T21:56:47Z

reviewed all classes that are pumped into a SortedSet - all seemed to have a proper __lt__() now, after 2.5.1 was released.
yanking 2.5.0 - still in discussion.

woodruffw · 2022-06-13T14:02:36Z

Thanks a ton for the patch release!

I don't want to be a nag about the yank, but IMO the sooner the better 🙂 -- most users are unlikely to hit it now that there's a patch version, but those who do (and have similar codepaths to pip-audit) are likely to experience similar problems.

RodneyRichardson · 2022-06-15T05:05:44Z

reviewed all classes that are pumped into a SortedSet - all seemed to have a proper __lt__() now, after 2.5.1 was released

@jkowalleck It looked like VulnerabilityCredits and OrganizationalEntity also need __lt__

jkowalleck · 2022-06-15T14:11:39Z

reviewed all classes that are pumped into a SortedSet - all seemed to have a proper __lt__() now, after 2.5.1 was released

@jkowalleck It looked like VulnerabilityCredits and OrganizationalEntity also need __lt__

what am i missing, @RodneyRichardson ?
I do not see any usage of VulnerabilityCredits in a SortedSet.
OrganizationalEntity actually was used in a sorted set and was lacking the __lt__() - you are right. Missed this one in previous code reviews.

however, will come up with PR to add the method to both, soon.

RodneyRichardson · 2022-06-17T14:51:58Z

what am i missing, @RodneyRichardson ? I do not see any usage of VulnerabilityCredits in a SortedSet.

Ah - I didn't have access to the code when I wrote that - I think I was planning to have credits as part of the sort criteria for the Vulnerability (which would mean it needs a comparator) but I didn't. I guess we should check for use in the ComparableTuple too (but the tests should pick that up).

jkowalleck · 2022-06-18T05:41:45Z

I will leave everything as is, for now. Additional bug reports will show further need for tests or improvements.

Therefore, i will close this issue, for now.

See you next time, when we open the gates of this issue again, and continue our discussions :-)

woodruffw mentioned this issue Jun 10, 2022

pyproject: add missing toml deps pypa/pip-audit#292

Merged

RodneyRichardson added a commit to RodneyRichardson/cyclonedx-python-lib that referenced this issue Jun 10, 2022

Make Vulnerability sortable

645cb09

Partial fix for CycloneDX#245. Signed-off-by: Rodney Richardson <rodney.richardson@cambridgeconsultants.com>

RodneyRichardson mentioned this issue Jun 10, 2022

fix: add missing Vulnerability comparator for sorting #246

Merged

jkowalleck pushed a commit that referenced this issue Jun 10, 2022

fix: add missing Vulnerability comparator for sorting (#246)

c3f3d0d

Partial fix for #245. Signed-off-by: Rodney Richardson <rodney.richardson@cambridgeconsultants.com>

jkowalleck mentioned this issue Jun 15, 2022

fix: add expected lower-than comparators for OrganizationalEntity and VulnerabilityCredits #248

Merged

jkowalleck closed this as completed Jun 18, 2022
