@xmo-odoo
Contributor

@xmo-odoo xmo-odoo commented Nov 25, 2025

Type
✨ New feature

SARIF is a unified file format used to exchange information between static analysis tools (like pylint) and various types of formatters, meta-runners, broadcasters / alert systems, ...

This implementation is ad-hoc, and non-validating.

Spec v GitHub

Turns out GitHub does not implement all of SARIF (which makes sense) and also requires a bunch of properties which the spec considers optional. The official SARIF validator (linked to by both OASIS and GitHub) was used to validate the output of the reporter, ensuring that all the GitHub requirements it flags are fulfilled, and fixing some of the validator's pet issues.

As of now the following issues are left unaddressed:

  • Azure requires run.automationDetails; looking at the spec, I don't think it makes sense for the reporter to inject that, it's more up to the CI
  • the validator wants a run.versionControlProvenance, same as above
  • the validator wants rule names in PascalCase, lol
  • the validator wants templated result messages, but without pylint providing the args as part of the Message that's a bit of a chore
  • the validator wants region to include a snippet (the flagged content)
  • the validator wants physicalLocation to have a contextRegion (most likely with a snippet)

On URIs

The reporter makes use of URIs for artifacts (~files). Per "guidance on the use of artifactLocation objects", uri should capture the deterministic part of the artifact location and uriBaseId should capture the non-deterministic part. However, as far as I can tell, pylint has no requirement for (and no clean way to require) consistent resolution roots: path is just relative to the cwd, and there is no requirement to have project-level files to use pylint. This makes the use of relative URIs dodgy, but absolute URIs are pretty much always broken for the purpose of interchange, so they're not really any better.

As a side-note, GitHub asserts:

> While this [nb: originalUriBaseIds] is not required by GitHub for the code scanning results to be displayed correctly, it is required to produce a valid SARIF output when using relative URI references.

However, per 3.4.4 this is incorrect: the uriBaseId can be resolved through end-user configuration, originalUriBaseIds, external information (e.g. envvars), or heuristics.

It would be nice to document the "relative root" via originalUriBaseIds (which may be omitted for that purpose per 3.14.14), but per the above, claiming a consistent project root is dodgy.
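To make the trade-off concrete, here is a minimal sketch of a run whose single result uses a relative `uri` plus a `uriBaseId`, with `originalUriBaseIds` left optional. The `make_run` helper, the `SRCROOT` base id, and the example root are hypothetical and are not taken from this PR's reporter.

```python
from __future__ import annotations

import json


def make_run(relative_path: str, line: int, root_uri: str | None = None) -> dict:
    """Build a tiny SARIF run with one result (illustrative only)."""
    run: dict = {
        "tool": {"driver": {"name": "pylint"}},
        "results": [
            {
                "ruleId": "C0114",
                "message": {"text": "Missing module docstring"},
                "locations": [
                    {
                        "physicalLocation": {
                            # Deterministic part in `uri`, symbolic root in `uriBaseId`.
                            # "SRCROOT" is an arbitrary, hypothetical name.
                            "artifactLocation": {
                                "uri": relative_path,
                                "uriBaseId": "SRCROOT",
                            },
                            "region": {"startLine": line},
                        }
                    }
                ],
            }
        ],
    }
    if root_uri is not None:
        # Optional: document what the base id pointed at when the log was
        # produced. Consumers may also resolve it by other means.
        run["originalUriBaseIds"] = {"SRCROOT": {"uri": root_uri}}
    return run


if __name__ == "__main__":
    log = {"version": "2.1.0", "runs": [make_run("pkg/mod.py", 1)]}
    print(json.dumps(log, indent=2))
```

Passing `root_uri="file:///home/user/project/"` (a made-up path) would record the root, which is exactly the claim that is hard to make reliably when the path is only relative to the cwd.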

We could resolve known project files (e.g. pyproject.toml, tox.ini, etc...) in order to find a consistent root (project root, repo root, ...) and set / use that for relative URIs but that's a lot of additional complexity which I'm not sure is warranted at least for a first version.
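For reference, such a lookup could be sketched roughly as below. This is not part of the PR: `find_project_root`, `relative_uri`, and the marker tuple are hypothetical names, and only `pyproject.toml` and `tox.ini` come from the paragraph above.

```python
from __future__ import annotations

from pathlib import Path
from urllib.parse import quote

# Hypothetical marker list; extend as needed.
PROJECT_MARKERS = ("pyproject.toml", "tox.ini")


def find_project_root(
    start: Path, markers: tuple[str, ...] = PROJECT_MARKERS
) -> Path | None:
    """Return the closest ancestor of ``start`` (or ``start`` itself)
    that contains one of the marker files, or None if there is none."""
    start = start.resolve()
    for candidate in (start, *start.parents):
        if any((candidate / name).is_file() for name in markers):
            return candidate
    return None


def relative_uri(path: Path, root: Path) -> str:
    """Express ``path`` as a percent-encoded relative reference under ``root``.

    Raises ValueError if ``path`` is not located under ``root``.
    """
    return quote(path.resolve().relative_to(root.resolve()).as_posix())
```

When no marker is found the function returns `None`, which is the case the paragraph above worries about: the reporter would still need a fallback such as the cwd.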

Fixes #5493

@Pierre-Sassoulas Pierre-Sassoulas added the Enhancement ✨ Improvement to a component label Nov 25, 2025
@Pierre-Sassoulas Pierre-Sassoulas added this to the 4.1.0 milestone Nov 25, 2025
Member

@Pierre-Sassoulas Pierre-Sassoulas left a comment

Thank you, this is great.

I'm not a SARIF expert so I have some code placement comments, and I hope someone more knowledgeable in SARIF will have an opinion about the actual content.

@codecov

codecov bot commented Nov 25, 2025

Codecov Report

❌ Patch coverage is 97.93814% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 95.99%. Comparing base (7588243) to head (ebf822c).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| pylint/reporters/sarif_reporter.py | 97.91% | 2 Missing ⚠️ |

❌ Your patch check has failed because the patch coverage (97.93%) is below the target coverage (100.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files

@@           Coverage Diff           @@
##             main   #10759   +/-   ##
=======================================
  Coverage   95.98%   95.99%           
=======================================
  Files         176      177    +1     
  Lines       19560    19657   +97     
=======================================
+ Hits        18775    18870   +95     
- Misses        785      787    +2     

| Files with missing lines | Coverage Δ |
|---|---|
| pylint/lint/base_options.py | 100.00% <ø> (ø) |
| pylint/reporters/__init__.py | 100.00% <100.00%> (ø) |
| pylint/reporters/sarif_reporter.py | 97.91% <97.91%> (ø) |

@xmo-odoo
Contributor Author

xmo-odoo commented Nov 25, 2025

> I'm not a SARIF expert so I have some code placement comments, and I hope someone more knowledgeable in SARIF will have an opinion about the actual content.

Maybe @nvuillam is still active since they're the original requestor.

@xmo-odoo xmo-odoo force-pushed the sarif branch 3 times, most recently from d2110f6 to 940ce59 Compare November 25, 2025 15:09
@nvuillam

As long as the SARIF validator validates the SARIF output, it's good :)

@xmo-odoo xmo-odoo force-pushed the sarif branch 3 times, most recently from 997fd0c to e64b059 Compare November 26, 2025 13:27
SARIF is a unified file format used to exchange information between
static analysis tools (like pylint) and various types of formatters,
meta-runners, broadcasters / alert systems, ...

This implementation is ad-hoc, and non-validating.

Spec v GitHub
-------------

Turns out GitHub does not implement all of SARIF (which makes
sense) and also requires a bunch of properties which the spec
considers optional. The [official SARIF validator][] (linked to by
both OASIS and GitHub) was used to validate the output of the
reporter, ensuring that all the GitHub requirements it flags are
fulfilled, and fixing *some* of the validator's pet issues.

As of now the following issues are left unaddressed:

- Azure requires `run.automationDetails`; looking at the spec, I don't
  think it makes sense for the reporter to inject that, it's more up
  to the CI
- the validator wants a `run.versionControlProvenance`, same as above
- the validator wants rule names in PascalCase, lol
- the validator wants templated result messages, but without pylint
  providing the args as part of the `Message` that's a bit of a chore
- the validator wants `region` to include a snippet (the flagged
  content)
- the validator wants `physicalLocation` to have a `contextRegion`
  (most likely with a snippet)

On URIs
-------

The reporter makes use of URIs for artifacts (~files). Per ["guidance
on the use of artifactLocation objects"][3.4.7], `uri` *should*
capture the deterministic part of the artifact location and
`uriBaseId` *should* capture the non-deterministic part. However, as
far as I can tell, pylint has no requirement for (and no clean way to
require) consistent resolution roots: `path` is just relative to the
cwd, and there is no requirement to have project-level files to use
pylint. This makes the use of relative URIs dodgy, but absolute URIs
are pretty much always broken for the purpose of *interchange*, so
they're not really any better.

As a side-note, GitHub [asserts][relative-uri-guidance]:

> While this [nb: `originalUriBaseIds`] is not required by GitHub for
> the code scanning results to be displayed correctly, it is required
> to produce a valid SARIF output when using relative URI references.

However, per [3.4.4][] this is incorrect: the `uriBaseId` can be
resolved through end-user configuration, `originalUriBaseIds`,
external information (e.g. envvars), or heuristics.

It would be nice to document the "relative root" via
`originalUriBaseIds` (which may be omitted for that purpose per
[3.14.14][]), but per the above, claiming a consistent project root
is dodgy.

We *could* resolve known project files (e.g. pyproject.toml, tox.ini,
etc...) in order to find a consistent root (project root, repo root,
...) and set / use that for relative URIs but that's a lot of
additional complexity which I'm not sure is warranted at least for a
first version.

Fixes pylint-dev#5493

[3.4.4]: https://docs.oasis-open.org/sarif/sarif/v2.1.0/csprd01/sarif-v2.1.0-csprd01.html#_Toc10540869
[3.4.7]:
https://docs.oasis-open.org/sarif/sarif/v2.1.0/csprd01/sarif-v2.1.0-csprd01.html#_Toc10540872
[3.14.14]: https://docs.oasis-open.org/sarif/sarif/v2.1.0/csprd01/sarif-v2.1.0-csprd01.html#_Toc10540936
[relative-uri-guidance]:
https://docs.github.com/en/code-security/code-scanning/integrating-with-code-scanning/sarif-support-for-code-scanning#relative-uri-guidance-for-sarif-producers
[official SARIF validator]: https://sarifweb.azurewebsites.net/
@github-actions
Contributor

🤖 According to the primer, this change has no effect on the checked open source code. 🤖🎉

This comment was generated for commit ebf822c

@xmo-odoo
Contributor Author

Not entirely sure how to fix the coverage issue, as it's mostly for the conversion of Windows paths to relative URIs.
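As an illustration only (this is not the PR's code), a conversion of this kind can be written against `pathlib.PureWindowsPath`, which parses Windows-style paths on any platform and so can be exercised outside a Windows job. The function name `windows_path_to_relative_uri` is hypothetical.

```python
from pathlib import PureWindowsPath
from urllib.parse import quote


def windows_path_to_relative_uri(path: str) -> str:
    """Turn a relative Windows path into a relative URI reference.

    Backslashes become forward slashes and characters that are not
    allowed in a URI path (e.g. spaces) are percent-encoded.
    """
    pure = PureWindowsPath(path)
    # Paths with a drive ("C:") or a root ("\\") cannot be expressed
    # as a reference relative to some base.
    if pure.drive or pure.root:
        raise ValueError(f"not a relative path: {path!r}")
    return quote(pure.as_posix())
```

For example, `windows_path_to_relative_uri(r"pkg\sub dir\mod.py")` gives `pkg/sub%20dir/mod.py`.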

Member

@Pierre-Sassoulas Pierre-Sassoulas left a comment

Implementation LGTM. It seems we don't need to cover everything as long as we add SARIF validation in the test suite?

@xmo-odoo
Contributor Author

> Implementation LGTM. It seems we don't need to cover everything as long as we add SARIF validation in the test suite?

Not sure it's that useful: the type specs should be the useful (to pylint) subset of the JSON schema (modulo a rule or two which are painful to type), while the "official" validator, with its additional suggestions and service-specific requirements, doesn't seem to be available as anything but a webservice of sorts?

@Pierre-Sassoulas
Member

pylint maintainers will probably have to deal with bug fixes on this, so we need something in place to make sure it does not regress. We do have Windows-specific CI jobs though, so we can cover the Windows-specific path.

Labels

Enhancement ✨ Improvement to a component

Development

Successfully merging this pull request may close these issues.

SARIF output format for pylint ?

3 participants