-
Notifications
You must be signed in to change notification settings - Fork 87
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix bug: merge
command generates multiple runs
#2513
Fix bug: merge
command generates multiple runs
#2513
Conversation
@@ -117,7 +121,7 @@ public int Run(MergeOptions mergeOptions) | |||
mergedLog.SchemaUri = mergedLog.Version.ConvertToSchemaUri(); | |||
|
|||
FileSystem.DirectoryCreateDirectory(outputDirectory); | |||
outputFilePath = Path.Combine(outputDirectory, GetOutputFileName(_options, key)); | |||
outputFilePath = Path.Combine(outputDirectory, ReplaceInvalidCharInFileName(GetOutputFileName(_options, key), ".")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If PerRule split strategy is specified, it creates a SarifLog file per rule, the file name is {ruleId}_merged.sarif.
If the ruleId is a sub rule Id, e.g. SEC105/001
, it contains invalid file name char '/' and will cause saving the log file failed.
This is the fix for this issue. #ByDesign
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where is the unit test for this issue? Did I miss it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dedicated unit tests added. pls review.
@@ -179,7 +183,14 @@ private async Task<bool> MergeSarifLogsAsync() | |||
}; | |||
} | |||
|
|||
key = CreateRuleKey(result.RuleId, run); | |||
if (_options.MergeRuns) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Acccording to merge-runs
argument description
[Option(
"merge-runs",
HelpText = "Merge runs of the same tool + verion combination (requires " +
"eliding run-specific details such as invocations data.")]
public bool MergeRuns { get; set; }
If merge-runs
is false (default), the merged log runs are aggregated by key ruleId+tool name+tool version
.
if it is set to true, the merged log runs will be aggregated by key tool name+tool version
. This means all results produced by same tool and same version will be merged into 1 single run. #ByDesign
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, your understanding is correct. Do we really have the term verion
burned into our help? If so, can you please correct this to version
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed typo.
Does the tool rewrite JSON Pointers in In reply to: 1199064853 |
src/ReleaseHistory.md
Outdated
@@ -1,5 +1,9 @@ | |||
# SARIF Package Release History (SDK, Driver, Converters, and Multitool) | |||
|
|||
## Unreleased | |||
|
|||
* BUGFIX: Fix issue `merge` command generates SARIF with multiple runs [#2488](https://github.com/microsoft/sarif-sdk/pull/2488) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This description does not clearly describe the issue and the fix. @marmegh or @EasyRhinoMSFT can you please help Yong with this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated pls review. cc @VCOLAWTON
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, let's look at what we're fixing here.
The problem: the merge command was generating a sarif with a run per rule instead of a run per tool.
The description here does not capture that. We need to tell the customers what behavior they may have been experiencing that is now changed. Specific first, concise second.
"Fixed merge
command producing runs per rule to the expected runs per tool."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the suggestions. Updated both, pls review.
src/ReleaseHistory.md
Outdated
@@ -1,5 +1,9 @@ | |||
# SARIF Package Release History (SDK, Driver, Converters, and Multitool) | |||
|
|||
## Unreleased | |||
|
|||
* BUGFIX: Fix issue `merge` command generates SARIF with multiple runs [#2488](https://github.com/microsoft/sarif-sdk/pull/2488) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Has someone built/run your tests and examined code coverage? Do you have any information for us on the completeness of code & data coverage for your tests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I ran code coverage analysis in VS. All code paths I made change were covered. In term of MergeCommand
coverage, it was 84%, added couple test cases its now 94%
@@ -307,6 +318,11 @@ | |||
|
|||
return prefix ?? string.Empty; | |||
} | |||
|
|||
private static string ReplaceInvalidCharInFileName(string fileName, string replacement) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dedicated unit tests added pls review
@@ -31,6 +32,9 @@ | |||
private readonly Dictionary<string, HashSet<Result>> _ruleIdToResultsMap; | |||
private readonly Dictionary<string, RunMergingVisitor> _ruleIdToMergeVisitorsMap; | |||
|
|||
private static readonly string s_invalidFileNameCharRegexString = $"[{Regex.Escape(new string(Path.GetInvalidFileNameChars()))}]"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, moved the function to PathExtensions changed it to an extension method of string.
} | ||
} | ||
|
||
private static Run CreateTestRun(int numberOfResult, string toolName = null, bool createSubRule = false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved the RandomSarifLogGenerator from Test.UnitTests.Sarif to Test.Utilities.Sarif so can re-use its functionalities.
public void MergeCommand_WhenSplitPerRule_LogShouldAggregateByRuleToolVersion() | ||
{ | ||
// 2 logs 2 runs, same tool version, 6 rules | ||
SarifLog sarifLog1 = new SarifLog { Runs = new[] { CreateTestRun(6, null, true) } }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed. Its no enforced. In VS there are messages (not warning or erro) in error list showing this kind of issue.
Um, this is an excellent question. We support multiple varieties of intra-logfile link, one of which is a simply reference to a location id (which I think should generally be stable as I believe it's associated with an id). The support for the SARIF schema, I also think, is mostly non-existent. And so I'd guess we have no code whatsoever to assist with this. Yong, we should create an issue for this? In reply to: 1199064853 |
…/github.com/microsoft/sarif-sdk into users/yongyan-gh/resultmatching-splitlogbug
Right, the In reply to: 1200469594 |
@@ -1,5 +1,10 @@ | |||
# SARIF Package Release History (SDK, Driver, Converters, and Multitool) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you send me a few sentences that explain what this document is used for? I can refine the language for you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need an overview, please see comment. Thanks, Yong! @yongyan-gh
src/ReleaseHistory.md
Outdated
## Unreleased | ||
|
||
* BUGFIX: Fix the issue `merge` command generates SARIF with multiple runs even specified `--merge-runs` option [#2488](https://github.com/microsoft/sarif-sdk/pull/2488) | ||
* BUGFIX: Fix `System.IO.IOException` and `System.IO.DirectoryNotFoundException` when merge Sarif files contains invalid file name characters in the rule id with `--split PerRule` option. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see any other references to these exceptions in this change? Was this broken out into one of your other PRs? Or was this included and not documented in a closed PR? #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This error was not reported by any issue, but found during my testing for merge
command. My other 2 PRs are for match-result-forward
command. Its a straightforward fix, so I think it can be fixed along with the multiple runs issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you point me to the specific test implemented to prevent regression?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure, in this test `MergeCommandUnitTests.MergeCommand_WhenSplitPerRule_LogShouldAggregateByRuleToolVersion', it creates sarif with id contains invalid char '/', e.g. 'TESTRULE/001', and the fix replaced the invalid char and generate valid file name e.g. 'TESTRULE.001_merged.sarif'
src/Sarif.Driver/PathExtensions.cs
Outdated
@@ -19,6 +20,8 @@ namespace Microsoft.CodeAnalysis.Sarif.Driver | |||
/// <summary>Path extension functions.</summary> | |||
public static class PathExtensions | |||
{ | |||
private static readonly Regex s_invalidFileNameCharRegex = new Regex($"[{Regex.Escape(new string(Path.GetInvalidFileNameChars()))}]", RegexOptions.Compiled | RegexOptions.CultureInvariant); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not an actual issue,
just to note that with current way we will possibly generate different name in Linux and Windows,
and a name without any invalid char (win+Linux) would be
working in both os, easier to share.
the other way of cause is to hard code the combination of them.
each has its pros and cons, just to call it out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to resolve the conflicts. In reply to: 1218848244 In reply to: 1218848244 |
resolved In reply to: 1218848244 |
@@ -13,7 +13,7 @@ | |||
/// <summary> | |||
/// Generate random sarif logs for testing. | |||
/// </summary> | |||
internal static class RandomSarifLogGenerator | |||
public static class RandomSarifLogGenerator |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
src/ReleaseHistory.md
Outdated
@@ -1,5 +1,10 @@ | |||
# SARIF Package Release History (SDK, Driver, Converters, and Multitool) | |||
|
|||
## Unreleased | |||
|
|||
* BUGFIX: Fixed `merge` command producing runs per rule to the expected runs per tool. [#2488](https://github.com/microsoft/sarif-sdk/pull/2488) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated as suggested
src/ReleaseHistory.md
Outdated
## Unreleased | ||
|
||
* BUGFIX: Fixed `merge` command producing runs per rule to the expected runs per tool. [#2488](https://github.com/microsoft/sarif-sdk/pull/2488) | ||
* BUGFIX: Fixed `System.IO.IOException` or `System.IO.DirectoryNotFoundException` when `merge` command producing per rule files. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated the note as suggested
outputFilePath = Path.Combine(outputDirectory, GetOutputFileName(_options, key)); | ||
outputFilePath = Path.Combine( | ||
outputDirectory, | ||
GetOutputFileName(_options, key).ReplaceInvalidCharInFileName(".")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
src/ReleaseHistory.md
Outdated
@@ -1,5 +1,10 @@ | |||
# SARIF Package Release History (SDK, Driver, Converters, and Multitool) | |||
|
|||
## Unreleased | |||
|
|||
* BUGFIX: Fixed `merge` command producing runs per rule to the expected runs per tool. [#2488](https://github.com/microsoft/sarif-sdk/pull/2488) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
src/ReleaseHistory.md
Outdated
## Unreleased | ||
|
||
* BUGFIX: Fixed `merge` command producing runs per rule to the expected runs per tool. [#2488](https://github.com/microsoft/sarif-sdk/pull/2488) | ||
* BUGFIX: Fixed `System.IO.IOException` or `System.IO.DirectoryNotFoundException` when `merge` command producing per rule files. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
outputHelper.WriteLine($"TestName: {nameof(PathExtensionsTests)} has random seed {randomSeed}"); | ||
} | ||
|
||
[Theory] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed to Fact pattern
VerifyReplaceInvalidCharInFileName(fileName, replacement, expectFileName); | ||
} | ||
|
||
[Theory] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed to [Fact] test pattern
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Description
Fixes for below issues:
merge command generates SARIF with multiple runs ([SARIF Multitool] merge command generates SARIF with multiple runs #2488)
By default, the results are grouped into multiple runs per rule id + tool version.
When
merge-runs
argument is set, group the results by tool version, this will place all results of runs produced by same tool same version into a single run.Another issue it fixes is
System.IO.IOException
/System.IO.DirectoryNotFoundException
exception when executemerge
command with--split PerRule
option, and the Sarif files to be merged contains rule Id has invalid character for a file name. E.g. a sub rule id "SEC105/001".With
--split PerRule
option,merge
command creates separate Sarif files for each rule id. E.g. "SEC105/001_merged.sarif". The character "/" in the rule Id is an invalid file name character, it will cause the error to create the merged file.