API/other improvements for improved in-memory analysis. #2639

michaelcfanning · 2023-03-13T16:13:30Z

No description provided.

michaelcfanning · 2023-03-13T16:21:11Z

src/Sarif/SarifExtensionMethods.cs

@@ -15,8 +15,18 @@

 namespace Microsoft.CodeAnalysis.Sarif
 {
-    public static class ExtensionMethods
+    public static class SarifExtensions


SarifExtensions

For publicly exposed extension-method classes, it's good to qualify the type name to avoid collisions with the generic name 'ExtensionMethods'.

michaelcfanning · 2023-03-13T16:28:43Z

src/Sarif/Writers/SarifLogger.cs

@@ -101,11 +100,6 @@ public class SarifLogger : BaseLogger, IDisposable, IAnalysisLogger
            RuleToIndexMap = new Dictionary<ReportingDescriptor, int>(ReportingDescriptor.ValueComparer);
            ExtensionGuidToIndexMap = new Dictionary<Guid, int>();

-            if (dataToInsert.HasFlag(OptionallyEmittedData.Hashes))


dataToInsert

I'm deleting this aggressive hashing of targets data on initialization for now. First, we previously had at least three mechanisms for hashing: this one, the hash delegate, and on-demand hashing via the insert data visitor. This particular mechanism always held all this data in memory, as opposed to the file regions cache mechanism, which only holds the previous 100 files hashed...
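To illustrate the memory trade-off described above, here is a minimal sketch of a bounded cache that keeps hashes for only the most recently used files. `BoundedHashCache`, its least-recently-used eviction policy, and the `computeHash` delegate are assumptions made for illustration; the SDK's file regions cache may be implemented differently.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical bounded cache; not the SDK's FileRegionsCache.
public sealed class BoundedHashCache
{
    private readonly int _capacity;
    private readonly Func<string, string> _computeHash;
    private readonly Dictionary<string, LinkedListNode<KeyValuePair<string, string>>> _entries =
        new Dictionary<string, LinkedListNode<KeyValuePair<string, string>>>();
    private readonly LinkedList<KeyValuePair<string, string>> _recency =
        new LinkedList<KeyValuePair<string, string>>();

    public BoundedHashCache(Func<string, string> computeHash, int capacity = 100)
    {
        _computeHash = computeHash;
        _capacity = capacity;
    }

    public int Count => _entries.Count;

    public string GetHash(string path)
    {
        if (_entries.TryGetValue(path, out LinkedListNode<KeyValuePair<string, string>> node))
        {
            // Cache hit: move the entry to the front (most recently used).
            _recency.Remove(node);
            _recency.AddFirst(node);
            return node.Value.Value;
        }

        // Cache miss: compute on demand and remember the result.
        string hash = _computeHash(path);
        _entries[path] = _recency.AddFirst(new KeyValuePair<string, string>(path, hash));

        if (_entries.Count > _capacity)
        {
            // Evict the least recently used entry so memory stays bounded.
            LinkedListNode<KeyValuePair<string, string>> oldest = _recency.Last;
            _recency.RemoveLast();
            _entries.Remove(oldest.Value.Key);
        }

        return hash;
    }
}
```

Unlike hashing every target up front, a cache like this never holds more than `capacity` entries, at the cost of recomputing a hash if an evicted file is requested again.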

michaelcfanning · 2023-03-13T16:29:46Z

src/Sarif/Writers/SarifLogger.cs

@@ -144,7 +139,8 @@ public class SarifLogger : BaseLogger, IDisposable, IAnalysisLogger
                }

            }
-            else if (_run.Tool.Driver?.Rules != null)
+
+            if (_run.Tool.Driver?.Rules != null)


if

Previously, this code insisted that all rules exist on the tool driver OR on the extensions property. In theory, there could be both. Unit test hole here.
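A minimal sketch of the difference, using hypothetical stand-in types (`SketchTool`, `SketchComponent`, `RuleCollector`) rather than the SDK's object model: with a plain `if`, rules on the driver are still collected when extensions are also present.

```csharp
using System.Collections.Generic;

// Hypothetical stand-ins for the tool, driver and extension objects.
public sealed class SketchComponent
{
    public List<string> RuleIds { get; set; }
}

public sealed class SketchTool
{
    public SketchComponent Driver { get; set; }
    public List<SketchComponent> Extensions { get; set; }
}

public static class RuleCollector
{
    public static List<string> CollectRuleIds(SketchTool tool)
    {
        var ruleIds = new List<string>();

        if (tool.Extensions != null)
        {
            foreach (SketchComponent extension in tool.Extensions)
            {
                if (extension.RuleIds != null)
                {
                    ruleIds.AddRange(extension.RuleIds);
                }
            }
        }

        // A plain 'if' rather than 'else if': driver rules are collected
        // even when the tool also defines extensions.
        if (tool.Driver?.RuleIds != null)
        {
            ruleIds.AddRange(tool.Driver.RuleIds);
        }

        return ruleIds;
    }
}
```

A unit test covering the gap would construct a tool that has rules in both places and check that none are dropped.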

src/Test.UnitTests.Sarif.Driver/Sdk/AnalyzeCommandBaseTests.cs

-                            sb.AppendLine($"\t{trace} : did not observe term 'elapsed' in rule timing notifications.");
-                        }
+                    // We expected timing data for every rule.
+                    if (executionNotifications.Count != expectedNotificationsCount)


src/Sarif.Driver/Sdk/MultithreadedAnalyzeCommandBase.cs


        public virtual FileFormat ConfigurationFormat => FileFormat.Json;

        protected MultithreadedAnalyzeCommandBase(IFileSystem fileSystem = null)
        {
-            // TBD can we zap this?
+            Tool ??= Tool.CreateFromAssemblyData();


src/Sarif/FileRegionsCache.cs

                }
            }
            catch (IOException) { }
+            catch (SecurityException) { }


src/Sarif/FileRegionsCache.cs

                }
            }
            catch (IOException) { }
+            catch (SecurityException) { }
+            catch (UnauthorizedAccessException) { }


src/Test.UnitTests.Sarif.Driver/Sdk/AnalyzeCommandBaseTests.cs

michaelcfanning · 2023-03-13T16:31:07Z

src/Sarif/Writers/SarifLogger.cs

@@ -197,12 +192,13 @@

                foreach (string target in analysisTargets)
                {
-                    Uri uri = new Uri(UriHelper.MakeValidUri(target), UriKind.RelativeOrAbsolute);
+                    string uriText = UriHelper.MakeValidUri(target);


MakeValidUri

We call this helper inconsistently and really should call it everywhere.

michaelcfanning · 2023-03-13T16:32:09Z

src/Sarif/Writers/SarifLogger.cs

@@ -262,10 +258,6 @@
            _run.Invocations.Add(invocation);
        }

-        public Func<Uri, HashData> ComputeHashData { get; set; }


ComputeHashData

These mechanisms were deleted in favor of a shared file regions cache.

michaelcfanning · 2023-03-13T17:15:53Z

src/Sarif.Driver/Sdk/AnalyzeOptionsBase.cs

            HelpText = "A semicolon delimited list to filter output of scan results to one or more failure levels. Valid values: Error, Warning and Note.")]
        public IEnumerable<FailureLevel> Level { get; set; }

-        private FailureLevelSet failureLevels;


FailureLevelSet

Providing defaults in the options classes is problematic because it interferes with the context object's default value functionality (which is much richer and closer to the analysis model). In the options classes, we should actually return null in all cases where the property isn't explicit on the command line, and otherwise reflect the command line precisely. That's the model. There's more clean-up to do.

michaelcfanning · 2023-03-13T17:17:15Z

src/Sarif.Driver/Sdk/CommandBase.cs

-                    {
-                        normalizedSpecifier = uri.LocalPath;
-                    }
+                    normalizedSpecifier = uri.GetFileName();


GetFileName

Here's another clean-up item: we need to call the GetFilePath and GetFileName helpers consistently across the code base (to properly process things like relative URLs).
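For context on why relative URLs matter here: `Uri.LocalPath` throws `InvalidOperationException` when the URI is relative, so callers need a helper that handles both cases. The sketch below is a hypothetical illustration (`UriSketch.FileNameOf` is not the SDK's `GetFileName` implementation) and assumes the original string of a relative URI is a usable path.

```csharp
using System;
using System.IO;

public static class UriSketch
{
    // Hypothetical helper for illustration only.
    public static string FileNameOf(Uri uri)
    {
        // LocalPath is only valid for absolute URIs; for relative ones,
        // fall back to the text the URI was created from.
        string path = uri.IsAbsoluteUri ? uri.LocalPath : uri.OriginalString;
        return Path.GetFileName(path);
    }
}
```

Calling `FileNameOf` with `new Uri("src/a.cs", UriKind.Relative)` goes through the fallback branch instead of throwing.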

michaelcfanning · 2023-03-13T17:18:03Z

src/Sarif.Driver/Sdk/MultithreadedAnalyzeCommandBase.cs

@@ -38,13 +38,13 @@

        public static bool RaiseUnhandledExceptionInDriverCode { get; set; }

-        protected virtual Tool Tool { get; set; }
+        public virtual Tool Tool { get; set; }


public

This property needs to be public in order to interact with it properly in scenarios where users proactively create a SARIF logger (and where skimmers may be loaded as extensions).

michaelcfanning · 2023-03-13T17:20:14Z

src/Sarif.Driver/Sdk/MultithreadedAnalyzeCommandBase.cs

@@ -236,24 +236,27 @@
            context.OutputFilePath = options.OutputFilePath;
            context.AutomationGuid = options.AutomationGuid;
            context.BaselineFilePath = options.BaselineFilePath;
-            context.Traces = InitializeStringSet(options.Trace);
+            context.Traces = options.Trace != null ? InitializeStringSet(options.Trace) : context.Traces;


Trace

This pattern needs to be pushed throughout this code.

If options return null consistently, then we can override an existing config object with updated values but ONLY if the user has expressed an override on the command-line.
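A minimal sketch of the pattern, assuming hypothetical `SketchOptions` and `SketchContext` types that stand in for the real options and context classes: a null option means "not specified on the command line", so the value already on the context survives.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins; not the SDK's options or context classes.
public sealed class SketchOptions
{
    // Null means the user did not pass the argument.
    public IEnumerable<string> Trace { get; set; }
    public string OutputFilePath { get; set; }
}

public sealed class SketchContext
{
    public ISet<string> Traces { get; set; } = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
    public string OutputFilePath { get; set; }
}

public static class ContextMerger
{
    // Override a context value only when the user supplied one explicitly.
    public static void Apply(SketchOptions options, SketchContext context)
    {
        context.Traces = options.Trace != null
            ? new HashSet<string>(options.Trace, StringComparer.OrdinalIgnoreCase)
            : context.Traces;

        // The null-coalescing operator expresses the same rule more compactly.
        context.OutputFilePath = options.OutputFilePath ?? context.OutputFilePath;
    }
}
```

If the options class substituted its own default instead of returning null, the merge could not tell a user override from a default and would always clobber the configured value.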

michaelcfanning · 2023-03-13T17:21:40Z

src/Test.UnitTests.Sarif/Writers/SarifLoggerTests.cs

@@ -511,7 +511,7 @@ public void SarifLogger_ScrapesFilesFromResult()
            {
                using (var sarifLogger = new SarifLogger(textWriter,
                                                         analysisTargets: null,
-                                                         dataToInsert: OptionallyEmittedData.Hashes,
+                                                         dataToInsert: OptionallyEmittedData.None,


None

Interesting bug! Because we didn't have sufficient handling for the dataToInsert property, we missed a case where this data was requested but not available. The fix now prompts a request for the missing data. Since that's not the purpose of this test, I just dropped the request for hashes data.

michaelcfanning · 2023-03-13T17:23:13Z

src/Test.UnitTests.Sarif.Driver/SarifHelpers.cs

-
-using Xunit;
-
-namespace Microsoft.CodeAnalysis.Sarif.Driver


Microsoft

Turns out that the test code doesn't make much use of this delegate action injection mechanism, so I just ripped it out.

Moving forward we should be updating tests to use the new improved analysis model (which is designed to allow solid testing using the core analyze command mechanism, just as the actual client tool does). So this sort of testing model should wither away, rather than being built on.

API/other improvements for improved in-memory analysis.

f072848

michaelcfanning commented Mar 13, 2023

github-advanced-security bot found potential problems Mar 13, 2023

michaelcfanning commented Mar 13, 2023

michaelcfanning added 4 commits March 13, 2023 11:01

PR feedback.

b5be9e3

Fix multitool test. All tests pass.

da6bc6c

Formatting and test fix.

4e449cd

Merge branch 'main' into in-memory-analysis

adb47ad

michaelcfanning marked this pull request as ready for review March 13, 2023 23:39

michaelcfanning requested review from EasyRhinoMSFT, marmegh and cfaucon as code owners March 13, 2023 23:39

michaelcfanning added 6 commits March 13, 2023 16:55

Change default for kind and levels args.

b4d5bc1

Merge branch 'in-memory-analysis' of https://github.com/microsoft/sarif-sdk into in-memory-analysis

5524f40

Fix path retrieval from URI in rule scan time trace.

620b3a5

Change ienumerable arg population to convert empty enumerable to null.

a9766d1

Remove vertical line.

fcac271

Reformatting.

1ab69bd

michaelcfanning merged commit 98d2d25 into main Mar 14, 2023

michaelcfanning deleted the in-memory-analysis branch March 14, 2023 15:27

michaelcfanning mentioned this pull request Apr 12, 2023

Remove options from per-target context creation. #2655

Merged
