Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DEFAULT_WORKSPACE is prefixed to SARIF-output field uri but should not be. #2006

Closed
janderssonse opened this issue Oct 25, 2022 · 8 comments · Fixed by #2383
Closed

DEFAULT_WORKSPACE is prefixed to SARIF-output field uri but should not be. #2006

janderssonse opened this issue Oct 25, 2022 · 8 comments · Fixed by #2383
Labels
bug Something isn't working nostale This issue or pull request is not stale, keep it open

Comments

@janderssonse
Copy link
Contributor

Describe the bug
DEFAULT_WORKSPACE is prefixed to uri in the SARIF-output, in my case, breaks SARIF-Viewer/VSCode, but would be the same for any alike setup.

To Reproduce
Example:

Run the megalinter container, here standing in the root of the repo you want to check:

  1. podman run --volume $(pwd):/repo DEFAULT_WORKSPACE=/repo -e LOG_LEVEL=INFO oxsecurity/megalinter:v6.13.0
  2. Open up your fav editor, for example VSCode, (with SARIF-Vieever extension installed)
  3. Open a sarif file from VSCode.(from megalinter-reports/sarif
  4. While rules are shown, notice VSCode cant find the path to the error. Look in the SARIF-FILE, see that all "uri"s have the DEFAULT_WORKSPACE prefixed to them., EG /repo/src/main etc.
  5. Removing the prefixed DEFAULT_WORKSPACE so we have expected src/main. Now it works!

Expected behavior
DEFAULT_WORKSPACE should not be prefixed to SARIF-output. This variable is something that most likely only exists during runtime generation, and makes the SARIF-output closely tied to that run.

@janderssonse janderssonse added the bug Something isn't working label Oct 25, 2022
@nvuillam
Copy link
Member

There is already a replacement in generated SARIF for some values like /tmp/lint or /github/workspace

The case with custom value is indeed not taken in account, we can arrange that ,:)

Meanwhile if you use /tmp/lint it will be ok :)

As some linters directly generate a SARIF file, content of SARIF folder may not be post-processed, you may directly use the unique SARIF file generated at report root :)

@github-actions
Copy link
Contributor

This issue has been automatically marked as stale because it has not had recent activity.
It will be closed in 14 days if no further activity occurs.
Thank you for your contributions.

If you think this issue should stay open, please remove the O: stale 🤖 label or comment on the issue.

@github-actions github-actions bot added the O: stale 🤖 This issue or pull request is stale, it will be closed if there is no activity label Nov 25, 2022
@janderssonse
Copy link
Contributor Author

I will give this one a shot when I have the time. I had a look at the code, and I think this one could be implemented in the method fix_sarif_physical_location by just looking if the uri field starts with the same as DEFAULT_WORKSPACE, (if set) and if so, remove that prefix.

@github-actions github-actions bot removed the O: stale 🤖 This issue or pull request is stale, it will be closed if there is no activity label Nov 29, 2022
@janderssonse
Copy link
Contributor Author

A small PR is now here #2119.
It is to be considerad as a partial fix.

That will fix the problem for the aggregated sarifreport.

However, to fully fix the problem, I suggest moving the whole run of the sarif-fixing logic in SarifReporter-class, to be run on the individiual sarif outputs. Ie to move the fix phase so it is done just somewhere around Linter.py, row 899, with a call to a "normallize_sarif" which is basically the same logit that currently is happing in SarifReporter.py for cleaning up.

The benefits would be that both the indiividual sarif output files for each linteer and the end aggregate report would be useful!

If that sounds like a good idea, i could start creating an pr for adjusting an moving the sarif report cleanup method to that earlier phase instead.

@janderssonse
Copy link
Contributor Author

@nvuillam Working on small pr for this, where the sarif-fix/normalization are done earlier in the phase (as above described, in practice just moves the logic call.) We'll see if you like it and if I get to finish it, and so on, but a small practical question here: currently the files are saved under the sarif-files folder under report folder. Would prefer that each of those is overwritten with the fixed/normalized version OR that the fixed version is also added with, for example a .normalized.sarif prefix? I.e

Would you prefer as now:
BASH_SHELLCHECK.sarif
JAVA_PMD.sarif.

where these two are "normalized" files

OR

BASH_SHELLCHECK.sarif
BASH_SHELLCHECK.normalized.sarif
JAVA_PMD.sarif
JAVA_PMD.normalized.sarif

ie doubles the number of files, but makes any bug-hunting easier in the future (if something is normalized to much :)

@nvuillam
Copy link
Member

nvuillam commented Dec 21, 2022

@janderssonse I'd prefer option 3 :)
Keep only one sarif file by linter, do not normalize initial sarif files by default (better perfs), but normalize them if there is a config var SARIF_REPORTER_NORMALIZE_LINTERS_OUTPUT set to true ? :)

@github-actions
Copy link
Contributor

This issue has been automatically marked as stale because it has not had recent activity.
It will be closed in 14 days if no further activity occurs.
Thank you for your contributions.

If you think this issue should stay open, please remove the O: stale 🤖 label or comment on the issue.

@github-actions github-actions bot added the O: stale 🤖 This issue or pull request is stale, it will be closed if there is no activity label Jan 21, 2023
@janderssonse
Copy link
Contributor Author

janderssonse commented Feb 1, 2023

Hi don't stale me yet. I did an working feature take a few weeks ago, I only need to get the time free to add as configurable, and wrap it up.

@nvuillam nvuillam removed the O: stale 🤖 This issue or pull request is stale, it will be closed if there is no activity label Feb 1, 2023
@Kurt-von-Laven Kurt-von-Laven added the nostale This issue or pull request is not stale, keep it open label Feb 3, 2023
janderssonse added a commit to janderssonse/megalinter that referenced this issue Feb 21, 2023
This PR is a suggestion for solving the use case
of needing to remove the DEFAULT_WORKSPACE from
the out put in the generated SARIF output.
(oxsecurity#2006).

It moves the SARIF logic to an earlier phase, to be handled
before the aggregate SARIF generation.
It replaces the prefix if the flag
SARIF_REPORTER_NORMALIZE_LINTERS_OUTPUT: true is set

Implementation is done by line parsing and replacing,
as a node traversal solution quickly grew due to
the many places in the sarif out put the uri can be found
(metris, relatedLocations, and so on), and the code
is much simpler this way to maintain.

Improvements and suggestions:
Could dumps and resulting json string be used in
a reliable way to line parse an json file? I didn't
find a good way.
Should the option be renamed to
SARIF_REPORTER_DISABLE_DEFAULT_WORKSPACE_IN_OUTPUT
or alike. As the pre existing normalization still happens?
(We don't change that pre existing behaviour in this PR,
only the DEFAULT_WORKSPACE prefix part).

Signed-off-by: Josef Andersson <josef.andersson@gmail.com>
janderssonse added a commit to janderssonse/megalinter that referenced this issue Feb 21, 2023
This PR is a suggestion for solving the use case
of needing to remove the DEFAULT_WORKSPACE from
the out put in the generated SARIF output.
(oxsecurity#2006).

It moves the SARIF logic to an earlier phase, to be handled
before the aggregate SARIF generation.
It replaces the prefix if the flag
SARIF_REPORTER_NORMALIZE_LINTERS_OUTPUT: true is set
(default: true).

Implementation is done by line parsing and replacing,
as a node traversal solution quickly grew due to
the many places in the sarif out put the uri can be found
(metris, relatedLocations, and so on), and the code
is much simpler this way to maintain.

Improvements and suggestions:
Could dumps and resulting json string be used in
a reliable way to line parse an json file? I didn't
find a good way.
Should the option be renamed to
SARIF_REPORTER_DISABLE_DEFAULT_WORKSPACE_IN_OUTPUT
or alike. As the pre existing normalization still happens?
(We don't change that pre existing behaviour in this PR,
only the DEFAULT_WORKSPACE prefix part).

Signed-off-by: Josef Andersson <josef.andersson@gmail.com>
janderssonse added a commit to janderssonse/megalinter that referenced this issue Apr 7, 2023
This PR is a suggestion for solving the use case
of needing to remove the DEFAULT_WORKSPACE from
the out put in the generated SARIF output.
(oxsecurity#2006).

It moves the SARIF logic to an earlier phase, to be handled
before the aggregate SARIF generation.
It replaces the prefix if the flag
SARIF_REPORTER_NORMALIZE_LINTERS_OUTPUT: true is set
(default: true).

Implementation is done by line parsing and replacing,
as a node traversal solution quickly grew due to
the many places in the sarif out put the uri can be found
(metris, relatedLocations, and so on), and the code
is much simpler this way to maintain.

Improvements and suggestions:
Could dumps and resulting json string be used in
a reliable way to line parse an json file? I didn't
find a good way.
Should the option be renamed to
SARIF_REPORTER_DISABLE_DEFAULT_WORKSPACE_IN_OUTPUT
or alike. As the pre existing normalization still happens?
(We don't change that pre existing behaviour in this PR,
only the DEFAULT_WORKSPACE prefix part).

Signed-off-by: Josef Andersson <josef.andersson@gmail.com>
janderssonse added a commit to janderssonse/megalinter that referenced this issue Apr 9, 2023
This PR is a suggestion for solving the use case
of needing to remove the DEFAULT_WORKSPACE from
the out put in the generated SARIF output.
(oxsecurity#2006).

It moves the SARIF logic to an earlier phase, to be handled
before the aggregate SARIF generation.
It replaces the prefix if the flag
SARIF_REPORTER_NORMALIZE_LINTERS_OUTPUT: true is set
(default: true).

Implementation is done by line parsing and replacing,
as a node traversal solution quickly grew due to
the many places in the sarif out put the uri can be found
(metris, relatedLocations, and so on), and the code
is much simpler this way to maintain.

Improvements and suggestions:
Could dumps and resulting json string be used in
a reliable way to line parse an json file? I didn't
find a good way.
Should the option be renamed to
SARIF_REPORTER_DISABLE_DEFAULT_WORKSPACE_IN_OUTPUT
or alike. As the pre existing normalization still happens?
(We don't change that pre existing behaviour in this PR,
only the DEFAULT_WORKSPACE prefix part).

Signed-off-by: Josef Andersson <josef.andersson@gmail.com>
nvuillam pushed a commit that referenced this issue Apr 11, 2023
This PR is a suggestion for solving the use case
of needing to remove the DEFAULT_WORKSPACE from
the out put in the generated SARIF output.
(#2006).

It moves the SARIF logic to an earlier phase, to be handled
before the aggregate SARIF generation.
It replaces the prefix if the flag
SARIF_REPORTER_NORMALIZE_LINTERS_OUTPUT: true is set
(default: true).

Implementation is done by line parsing and replacing,
as a node traversal solution quickly grew due to
the many places in the sarif out put the uri can be found
(metris, relatedLocations, and so on), and the code
is much simpler this way to maintain.

Improvements and suggestions:
Could dumps and resulting json string be used in
a reliable way to line parse an json file? I didn't
find a good way.
Should the option be renamed to
SARIF_REPORTER_DISABLE_DEFAULT_WORKSPACE_IN_OUTPUT
or alike. As the pre existing normalization still happens?
(We don't change that pre existing behaviour in this PR,
only the DEFAULT_WORKSPACE prefix part).

Signed-off-by: Josef Andersson <josef.andersson@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working nostale This issue or pull request is not stale, keep it open
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants