Skip to content

upload-sarif: relative paths beginning with ./ are not handled correctly #2829

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
woodruffw opened this issue Mar 26, 2025 · 2 comments
Open

Comments

@woodruffw
Copy link

Hello!

I'm reporting what I believe to be a bug in GitHub's SARIF consumption. This bug is probably happening somewhere deeper in GitHub's SARIF consumption machinery but upload-sarif is the main user-facing entrypoint for that machinery, so I'm filing the report here. Please let me know if a better discussion venue exists and I'd be happy to continue this elsewhere 🙂

Description

GitHub's documentation says the following about paths in SARIF inputs:

Code scanning interprets results that are reported with relative paths as relative to the root of the repository analyzed. If a result contains an absolute URI, the URI is converted to a relative URI. The relative URI can then be matched against a file committed to the repository.

Ref: https://docs.github.com/en/code-security/code-scanning/integrating-with-code-scanning/sarif-support-for-code-scanning#specifying-the-root-for-source-files

This, combined with the fact that SARIF stipulates RFC 3986 for artifactLocation URIs suggests that relative paths should be able to include . and .., as these path components are well-defined in RFC 3986.

However, if a SARIF file contains a relative URI like ./foo/bar, GitHub appears to fail to resolve that URL to a path in the repository being scanned. This results in suboptimal finding presentation:

  1. Findings are not rendered on PRs, since GitHub doesn't think the finding URLs match repository paths;
  2. Findings are rendered under "Code scanning alerts" but show "Preview unavailable" instead of a proper synopsis/extraction of the source file.

Here is a screenshot of the above behavior, demonstrating the "Preview unavailable" behavior because the path (./tests/integration/test-data/issue-612-repro/action.yml) starts with a ./:

Image

For contrast, here's a similar finding rendering correctly because it doesn't begin with ./:

Image

Expected behavior

I expect GitHub's SARIF ingestion to handle URIs that begin with (or contain) relative references, since SARIF stipulates RFC 3986 for URIs and RFC 3986 permits relative references. More generally, I believe many code-scanning tools produce relative references (like ./foo) by default and would benefit from not having to specialize their handling for GitHub's SARIF consumer.

Actual behavior

Relative URIs like foo/bar work correctly (resolving relative to the repository root), while relative URIs like ./foo/bar do not work correctly.

Workarounds

The primary workarounds here are:

  1. "Normalize" relative URIs from ./foo/bar to foo/bar. This is possible, but non-trivial in the general case (e.g. ./foo/bar/../baz), and requires more pre-processing on the generated SARIF than the specification stipulates.
  2. Switch entirely to absolute URIs, and use SARIF's features (like invocations[0].workingDirectory.uri) to help GitHub ingest and transform paths into appropriate relative URIs. I've had limited success making this work: the SARIF ingestor appears to be very fickle about absolute paths.

Overall, I think both of these workarounds are non-ideal, and add additional burden to SARIF producers to pre-process their inputs/offer a GitHub specific "quirks mode" beyond what SARIF stipulates. I think it would be fantastic if GitHub could instead support these kinds of relative paths!

Additional context

I ran into this behavior within zizmor, which supports SARIF as an output format and encourages people to use it when integrating with GitHub. Some original issue context: woodruffw/zizmor#604, woodruffw/zizmor#571.

Additionally, I've observed that GitHub's behavior around equivalent relative paths (e.g. foo/bar and ./foo/bar) is somewhat mixed: ./foo/bar doesn't produce a preview, but results for foo/bar and ./foo/bar are deduplicated against each other. That makes me think that some degree of normalization/equivalence checking is happening in GitHub's SARIF processing, just not at the point needed to handle previews correctly.

Please let me know if there's any other information I can provide!

CC @kommendorkapten

@cannist
Copy link
Contributor

cannist commented Apr 2, 2025

Thank you for the thorough analysis and information. I have looked at this a bit. Indeed the ingestion does not handle any dot segments at the moment. And we're also not resolving anything when showing paths in the UI.

but results for foo/bar and ./foo/bar are deduplicated against each other.

From the code I don't think that is the case and I cannot reproduce this. We establish the canonical filepath we're for a result using once during ingestion and then that is used everywhere. Do you have an example where two results that differ in their filepath because of dot segments get deduplicated into the same alert?

Regarding fixing this: ideally we would do do dot-segment elimination during ingestion. That would be my preferred solution. The downside is that this would change what we call the stable alert identifier so if we only did that then alerts with dot segments that were processed before the "fix" would not match up anymore with the same result being processed after. There are ways around that but it will be complicated.

Simpler might be to update the display code in the UI but that would then not solve alert matching between different filepath representations.

I cannot just quickly fix this but will discuss it in our team.

@woodruffw
Copy link
Author

Hi @cannist, thank you for your response!

From the code I don't think that is the case and I cannot reproduce this. We establish the canonical filepath we're for a result using once during ingestion and then that is used everywhere. Do you have an example where two results that differ in their filepath because of dot segments get deduplicated into the same alert?

Ah, you're right -- I completely misread my own "Code scanning alerts" panel 🤦. This part is a red herring, then.

I cannot just quickly fix this but will discuss it in our team.

Thank you! I can totally appreciate that the fix here is nontrivial; I appreciate any thought/effort your team puts into it 🙂

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants