Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FEATURE: Add --normalize-for-ghas argument to the rewrite command #2581

Merged
merged 12 commits into from
Jan 18, 2023

Conversation

shaopeng-gh
Copy link
Collaborator

This adds the function to rewrite a SARIF file to meet requirement of GitHub / GHAS.

}

node.Results = errors;
}
Copy link
Collaborator Author

@shaopeng-gh shaopeng-gh Dec 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please also see the test file "NonErrorResults.sarif'.
This does not seem right, why do we want to remove warning results?
#Closed

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps GitHub only used to process errors? Have you tested this in their system?

Copy link
Collaborator Author

@shaopeng-gh shaopeng-gh Dec 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is from the current SDK repo:
image

and also those linters works I had been using warning to upload to GHAS, shows similar to above.

@@ -16,5 +16,33 @@ public class RewriteOptions : SingleFileOptionsBase
Default = false,
HelpText = "Sort results in the final output file.")]
public bool SortResults { get; set; }

[Option(
Copy link
Collaborator Author

@shaopeng-gh shaopeng-gh Dec 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Option(

All these options are copied from other file with existing name and HelpText. #Closed

this.run.Tool.Driver.GlobalMessageStrings?.TryGetValue(node.Id, out formatString) == true)
{
node.Text = node.Arguments?.Count > 0
? string.Format(CultureInfo.CurrentCulture, formatString.Text, node.Arguments.ToArray())

Check warning

Code scanning / CodeQL

Dereferenced variable may be null

Variable [formatString](1) may be null at this access as suggested by [this](2) null check.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like a false alarm to me

}
}
],
"artifacts": [
Copy link
Collaborator Author

@shaopeng-gh shaopeng-gh Dec 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

artifacts

3 things changed,
after rewrite:

  1. baseuri is created
  2. messages are flatten
  3. artifacts are removed #Closed

{
if (node.Text == null)
{
ReportingDescriptor rule = this.ruleIndex != -1 ? this.run.Tool.Driver.Rules[this.ruleIndex] : null;
Copy link

@KalleOlaviNiemitalo KalleOlaviNiemitalo Dec 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't fully implement SARIF-2.1.0 §3.11.7 Message string lookup. I think this won't find messages of tool extensions. #Resolved

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you are correct, created an issue to track.

Comment on lines 151 to 153
node.Text = node.Arguments?.Count > 0
? rule.Format(node.Id, node.Arguments)
: formatString?.Text;
Copy link

@KalleOlaviNiemitalo KalleOlaviNiemitalo Dec 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the result of string.Format still contains substrings like "{0}", should this double the braces, in order to prevent the SARIF consumer from treating those as placeholders and attempting to expand them again? Also, should this reset node.Arguments to null? #Resolved

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the result is put in .Text.
.Text is for plain text, consumer should not try to expand any placeholders in it.

should this reset node.Arguments to null ---- currently, .text is added side by side with .id and .arguments.
The information is duplicated, the benefit is we can un-flatten if needed by simply remove the .text field.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.Text is for plain text, consumer should not try to expand any placeholders in it.

That's not how I read the standard.

So the text property of a message object may contain placeholders. Does the SARIF consumer in GitHub Advanced Security deviate from the standard in this respect?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With more reading I think you are right! The Text can contain placeholder again!

Within both plain text and formatted message strings, the characters “{” and “}” SHALL be represented by the character sequences “{{” and “}}” respectively. ----- so, I think I will replace { with {{ and } with }}, let me know if sounds good to you.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will replace { with {{ and } with }} -- this is done and updated the test case.

this.run.Tool.Driver.GlobalMessageStrings?.TryGetValue(node.Id, out formatString) == true)
{
node.Text = node.Arguments?.Count > 0
? string.Format(CultureInfo.CurrentCulture, formatString.Text, node.Arguments.ToArray())

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like a false alarm to me

@@ -161,6 +138,32 @@ public override Result VisitResult(Result node)
return base.VisitResult(node);
}

public override Message VisitMessage(Message node)
{
if (node.Text == null)
Copy link

@KalleOlaviNiemitalo KalleOlaviNiemitalo Dec 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this need the same for node.Markdown? #Closed

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is form GitHub, I think GitHub requires .Text only

@@ -12,6 +12,7 @@
* BUGFIX: Eliminate `IOException` and `DirectoryNotFoundException` exceptions thrown by `merge` command when splitting by rule (due to invalid file characters in rule ids). [#2513](https://github.com/microsoft/sarif-sdk/pull/2513)
* BUGFIX: Fix classes inside NotYetAutoGenerated folder missing `virtual` keyword for public methods and properties, by regenerate and manually sync the changes. [#2537](https://github.com/microsoft/sarif-sdk/pull/2537)
* FEATURE: Allow external set of `MaxFileSizeInKilobytes`, which will allow SDK users to change the value. (Default value is 1024) [#2578](https://github.com/microsoft/sarif-sdk/pull/2578)
* FEATURE: Add `--normalize-for-github` argument to the `rewrite` command to get GitHub supported SARIF format. [#2581](https://github.com/microsoft/sarif-sdk/pull/2581)
Copy link
Member

@michaelcfanning michaelcfanning Dec 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good release note item but let me add a bit more detail, try this:

  • FEATURE: Add --normalize-for-github argument to the rewrite command to ensure rewritten SARIF is compatible with GitHub Advanced Security ingestion requirements. #2581 #Closed

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks, fixed

this.artifacts = node.Artifacts;
this.threadFlowLocations = node.ThreadFlowLocations;

if (node.Results != null)
{
int errorsCount = 0;
Copy link
Member

@michaelcfanning michaelcfanning Dec 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

errorsCount

I wouldn't delete this until you confirm that GH can accept warnings. Also, if this visitor was correct, the validator should also fire on any warnings in a log file (warning that they will be stripped). #Closed

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes i think it supports warning, please see my other reply with screenshot.

@@ -144,6 +119,8 @@

public override Result VisitResult(Result node)
{
this.ruleIndex = node.RuleIndex;

if (node.Fingerprints != null)
{
// GitHub appears to require that fingerprints be emitted to the
Copy link
Member

@michaelcfanning michaelcfanning Dec 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s to require

We have a message string look-up helper that you should use. As Kalle notes, this is a pretty complex operation.

OK, I see you grabbed this code from the insert optional data visitor. We shouldn't copy code like this, minimally, it should be in a common helper somewhere (you could have created a static in the other visitor, for example).

Let me keep looking for a better helper, I could swear we have one. #Closed

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, nope. It appears that we don't. So the correct course here is to please create a common helper for both this code and the insert optional data visitor to call. Second, create an issue for the problem that Kalle notes: 'Message creation logic does not properly search tool extensions for format strings.'

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

got it, will work on this.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done both add helper method and create an issue
#2582

@@ -64,6 +65,25 @@ public int Run(RewriteOptions options)
reformattedLog = new SortingVisitor().VisitSarifLog(reformattedLog);
}

if (options.NormalizeForGitHub)
{
if ((reformattedLog.Runs?.Any(r => r.Artifacts?.Any(a => a.Location?.Uri?.IsAbsoluteUri == true) == true) == true) ||
Copy link
Member

@michaelcfanning michaelcfanning Dec 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reformattedLog

So I don't like these two lines of code which will traverse the log file twice.

Can't we simply invoke the RebaseUriVisitor in cases when those args are non-null.

And then make the GitHubIngestionVisitor throw new arg exception on encountering the any absolute uri as you detect here.
#Closed

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

made the suggested change.
Though we can not simply throw in GitHubIngestionVisitor, because it is a same method VisitArtifactLocation to visit both of it in result and also the originalUriBaseIds,
we allow one and not the other to be absolute.
Currently I does not check and throw, let me know your instruction.

@lgtm-com
Copy link

lgtm-com bot commented Dec 7, 2022

This pull request fixes 1 alert when merging b17469e into 3fa7f43 - view on LGTM.com

fixed alerts:

  • 1 for Dereferenced variable may be null

Heads-up: LGTM.com's PR analysis will be disabled on the 5th of December, and LGTM.com will be shut down ⏻ completely on the 16th of December 2022. Please enable GitHub code scanning, which uses the same CodeQL engine ⚙️ that powers LGTM.com. For more information, please check out our post on the GitHub blog.

src/Sarif/Visitors/VisitorHelper.cs Fixed Show fixed Hide fixed
src/Sarif/Visitors/VisitorHelper.cs Fixed Show fixed Hide fixed
@@ -16,5 +16,33 @@ public class RewriteOptions : SingleFileOptionsBase
Default = false,
HelpText = "Sort results in the final output file.")]
public bool SortResults { get; set; }

[Option(
"normalize-for-github",
Copy link
Collaborator

@yongyan-gh yongyan-gh Jan 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"normalize-for-github"

Question: the Convert command has the same normalize-for-github parameter, can we use Convert command to achieve the same? #Closed

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Convert means 'convert from some other non-SARIF format to SARIF.' This operation is a SARIF->SARIF transformation, not a 'conversion'.

One thing, we could use 'normalize-for-ghas' instead as a more precise designation (though 'ghas' may be more opaque to users.

I'm fine with either choice.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use 'normalize-for-ghas' instead as a more precise designation
---- sounds good, will change it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use 'normalize-for-ghas' instead as a more precise designation
---- done.

Copy link
Collaborator

@yongyan-gh yongyan-gh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@michaelcfanning michaelcfanning changed the title FEATURE: Add --normalize-for-github argument to the rewrite command FEATURE: Add --normalize-for-ghas argument to the rewrite command Jan 18, 2023
@@ -28,6 +28,8 @@
* FEATURE : Allow initialization of file regions cache in `InsertOptionalDataVisitor` (previously initialized exclusively from `FileRegionsCache.Instance`).
* FEATURE : Provide 'RuleScanTime` trace and emitted timing data. Provide `ScanExecution` trace with no utilization.
* FEATURE : Populate associated rule data in `LogToolNotification` as called from `SarifLogger`. [#2604](https://github.com/microsoft/sarif-sdk/pull/2604)
* FEATURE : Add `--normalize-for-ghas` argument to the `rewrite` command to ensure rewritten SARIF is compatible with GitHub Advanced Security (GHAS) ingestion requirements. [#2581](https://github.com/microsoft/sarif-sdk/pull/2581)
* BREAKING: Rename `--normalize-for-github` argument to `--normalize-for-ghas` for `convert` command. [#2581](https://github.com/microsoft/sarif-sdk/pull/2581)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BREAKING

move this to top

@@ -161,6 +137,15 @@
return base.VisitResult(node);
}

public override Message VisitMessage(Message node)
{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return

add a blank line.

@@ -144,6 +118,8 @@

public override Result VisitResult(Result node)
{
this.ruleIndex = node.RuleIndex;

if (node.Fingerprints != null)
{
// GitHub appears to require that fingerprints be emitted to the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

VisitMessage

add unit test for result with rule id only, no index

{
public static class VisitorHelper
{
public static void FlattenMessage(Message node, int ruleIndex, Run run)
Copy link
Member

@michaelcfanning michaelcfanning Jan 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

node

name this message #Resolved

@@ -24,8 +24,8 @@ public class ConvertOptions : SingleFileOptionsBase
public string PluginAssemblyPath { get; set; }

[Option(
"normalize-for-github",
HelpText = "Normalize converted output to conform to GitHub Advanced Security code scanning ingestion requirements.")]
public bool NormalizeForGitHub { get; set; }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NormalizeForGitHub

leave this and mark it obsolete??

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's go and remove it, per hulon.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added it back as obsolete, per the last decision from Michael:
image

@@ -16,5 +16,33 @@ public class RewriteOptions : SingleFileOptionsBase
Default = false,
HelpText = "Sort results in the final output file.")]
public bool SortResults { get; set; }

[Option(
"normalize-for-ghas",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"normalize-for-ghas"

@shaopeng-gh Sorry if I missed this: this is a required argument?

@@ -28,6 +28,8 @@
* FEATURE : Allow initialization of file regions cache in `InsertOptionalDataVisitor` (previously initialized exclusively from `FileRegionsCache.Instance`).
* FEATURE : Provide 'RuleScanTime` trace and emitted timing data. Provide `ScanExecution` trace with no utilization.
* FEATURE : Populate associated rule data in `LogToolNotification` as called from `SarifLogger`. [#2604](https://github.com/microsoft/sarif-sdk/pull/2604)
* FEATURE : Add `--normalize-for-ghas` argument to the `rewrite` command to ensure rewritten SARIF is compatible with GitHub Advanced Security (GHAS) ingestion requirements. [#2581](https://github.com/microsoft/sarif-sdk/pull/2581)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • FEATURE : Add --normalize-for-ghas argument to the rewrite command to ensure rewritten SARIF is compatible with GitHub Advanced Security (GHAS) ingestion requirements.

@shaopeng-gh Can you add some more details on what these requirements are? That would make the PR self-contained, otherwise it is a little opaque to me why we are making these changes.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

per Yong in the PR meeting, it is from the task
"Binskim action send log file to GHAS".

@@ -16,5 +16,33 @@ public class RewriteOptions : SingleFileOptionsBase
Default = false,
HelpText = "Sort results in the final output file.")]
public bool SortResults { get; set; }

[Option(
"normalize-for-ghas",
Copy link
Member

@michaelcfanning michaelcfanning Jan 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

normalize-for-ghas

don't delete the previous option, mark it as obsolete and make sure it continues to work. update the release notes. #Resolved

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

Default = false,
HelpText = "All relative uris will be rebased to the base-path-value if true. Default is false."
)]
public bool RebaseRelativeUris { get; set; }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RebaseRelativeUris

as a next step, you need to make sure rewrite always honors this command in all cases.

Copy link
Member

@michaelcfanning michaelcfanning left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@michaelcfanning michaelcfanning merged commit afc9681 into main Jan 18, 2023
@michaelcfanning michaelcfanning deleted the users/shaopeng-gh/ghas branch January 18, 2023 21:12
This was referenced Feb 21, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants