fix: don't report vulnerabilities multiple times under different aliases #61
Not super happy about this one as it feels really inefficient to be constantly doing the same bunch of loops over and over, but I can't think of a more efficient way of doing this - the main issue is that not everyone lists all their aliases, so we just have to check them from every angle...
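For anyone reading along, here is a rough sketch of what "checking from every angle" could look like. It is not the code in this PR: `Vuln`, `contains`, `isAliasOf` and `unique` are hypothetical names, the IDs are made up, and treating a shared alias as a match is an assumption on my part.

```go
package main

import "fmt"

// Vuln is a hypothetical, pared-down vulnerability record (not the real type).
type Vuln struct {
	ID      string
	Aliases []string
}

// contains reports whether s appears in list.
func contains(list []string, s string) bool {
	for _, v := range list {
		if v == s {
			return true
		}
	}
	return false
}

// isAliasOf checks both directions, since either entry may have left the
// other out of its alias list, and also treats a shared alias as a match.
func isAliasOf(a, b Vuln) bool {
	if a.ID == b.ID || contains(a.Aliases, b.ID) || contains(b.Aliases, a.ID) {
		return true
	}
	for _, alias := range a.Aliases {
		if contains(b.Aliases, alias) {
			return true
		}
	}
	return false
}

// unique keeps the first entry seen for each vulnerability (first-in-wins).
// The nested loops are the repeated work the description complains about.
func unique(vulns []Vuln) []Vuln {
	var kept []Vuln
	for _, v := range vulns {
		duplicate := false
		for _, k := range kept {
			if isAliasOf(k, v) {
				duplicate = true
				break
			}
		}
		if !duplicate {
			kept = append(kept, v)
		}
	}
	return kept
}

func main() {
	// Made-up IDs: the GHSA entry does not list the PYSEC one, but the
	// PYSEC entry lists the GHSA one, so the reverse check catches it.
	vulns := []Vuln{
		{ID: "GHSA-aaaa-bbbb-cccc"},
		{ID: "PYSEC-0000-1", Aliases: []string{"GHSA-aaaa-bbbb-cccc"}},
		{ID: "PYSEC-0000-2"},
	}
	for _, v := range unique(vulns) {
		fmt.Println(v.ID)
	}
}
```

The comparison is quadratic in the number of vulnerabilities reported for a package, which should be small in practice.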
This messed with my brain a bunch, so I've made `Include` public for now to be able to have an explicit test, but I will replace that once I've had time to look into writing tests for the database (which require running a test HTTP server to provide the db archive).

Overall this is primarily "just" for the PyPI ecosystem, as their database has both GHSA and PYSEC vulnerabilities, with the GHSA ones having been pulled from PYSEC in the first place; however, what they are doing is completely valid, so we should have this logic because any database could do the same.
Oh, and currently this is first-in-wins, which means GHSAs "win" over PYSEC since entries are listed alphabetically. I expect GHSA entries to always be more complete, so it would be nice at some point to have them explicitly preferred as part of this logic (though that would also add more expense, since we'd have to recompute the splice to replace the entry, maybe?)
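One possible shape for that preference, as a sketch only: `Entry`, `has`, `sameVuln` and `dedupePreferring` are hypothetical names, the IDs are made up, and it assumes the kept entries live in a slice, so the preferred entry can overwrite the earlier one by index instead of re-splicing.

```go
package main

import (
	"fmt"
	"strings"
)

// Entry is a hypothetical, pared-down vulnerability record.
type Entry struct {
	ID      string
	Aliases []string
}

// has reports whether s appears in list.
func has(list []string, s string) bool {
	for _, v := range list {
		if v == s {
			return true
		}
	}
	return false
}

// sameVuln checks the ID and alias lists in both directions.
func sameVuln(a, b Entry) bool {
	return a.ID == b.ID || has(a.Aliases, b.ID) || has(b.Aliases, a.ID)
}

// dedupePreferring keeps one entry per vulnerability. If a duplicate whose ID
// has the preferred prefix arrives after a non-preferred one, it overwrites
// the kept entry in place; otherwise the first entry seen wins.
func dedupePreferring(entries []Entry, prefix string) []Entry {
	var kept []Entry
	for _, e := range entries {
		idx := -1
		for i, k := range kept {
			if sameVuln(k, e) {
				idx = i
				break
			}
		}
		switch {
		case idx == -1:
			kept = append(kept, e)
		case strings.HasPrefix(e.ID, prefix) && !strings.HasPrefix(kept[idx].ID, prefix):
			kept[idx] = e
		}
	}
	return kept
}

func main() {
	// Made-up IDs, deliberately ordered so the PYSEC entry is seen first.
	entries := []Entry{
		{ID: "PYSEC-0000-1", Aliases: []string{"GHSA-aaaa-bbbb-cccc"}},
		{ID: "GHSA-aaaa-bbbb-cccc"},
	}
	fmt.Println(dedupePreferring(entries, "GHSA-")[0].ID) // prints "GHSA-aaaa-bbbb-cccc"
}
```

Overwriting by index keeps the extra cost to a couple of prefix checks per duplicate, though whether that fits the real data structure is something I haven't checked.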
(btw while I'm talking about this as being inefficient, it's relative - I expect this is a lot faster than it actually looks, because Go 🤷)