-
Notifications
You must be signed in to change notification settings - Fork 261
Performance enhancements to validation and reference resolution #241
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Instead of creating a property InComponents, can we change the signature of Walk for Walk(OpenApiParameter parameter) to Walk(OpenApiParameter parameter, bool isInComponents) and pass the boolean value in the caller? It seems a bit strange that OpenApiVisitorBase needs to keep a state. #Closed |
@@ -170,7 +169,15 @@ private void Validate<T>(T item) | |||
private void Validate(object item, Type type) | |||
{ | |||
if (item == null) return; // Required fields should be checked by higher level objects | |||
var rules = _ruleSet.Where(r => r.ElementType == type); | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit, per style guideline and consistency, please convert this to
if ( item == null )
{
return;
} #Closed
// We create a new instance of ValidationRuleSet per call as a safeguard | ||
// against unintentional modification of the private _defaultRuleSet. | ||
return new ValidationRuleSet(); | ||
} | ||
/// <summary> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit, newline #Closed
@@ -42,6 +56,12 @@ public static ValidationRuleSet GetDefaultRuleSet() | |||
return new ValidationRuleSet(_defaultRuleSet); | |||
} | |||
|
|||
public static ValidationRuleSet GetEmptyRuleSet() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
public [](start = 8, length = 6)
nit, documentation #Closed
@@ -90,11 +90,14 @@ public OpenApiDocument Read(Stream input, out OpenApiDiagnostic diagnostic) | |||
} | |||
|
|||
// Validate the document | |||
var errors = document.Validate(_settings.RuleSet); | |||
foreach (var item in errors) | |||
if (_settings.RuleSet != null && _settings.RuleSet.Rules.Count > 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
&& _settings.RuleSet.Rules.Count > 0 [](start = 42, length = 36)
just checking. Is this constraint needed? #Closed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's there for performance. Otherwise we walk the entire graph and do no validation.
In reply to: 182166250 [](ancestors = 182166250)
@@ -176,7 +177,8 @@ private void ResolveObject<T>(T entity, Action<T> assign) where T : class, IOpen | |||
|
|||
if (IsUnresolvedReference(entity)) | |||
{ | |||
assign(ResolveReference<T>(entity.Reference)); | |||
var x = ResolveReference<T>(entity.Reference); | |||
assign(x); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why this change? #Closed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll revert. I did it temporarily whilst hunting down what was taking so long.
In reply to: 182166389 [](ancestors = 182166389)
using System; | ||
using System.Collections.Generic; | ||
using System.Linq; | ||
using System.Text; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit, copyright #Closed
}; | ||
|
||
// Act | ||
var errors = document.Validate(new ValidationRuleSet() { new AllwaysFailRule<OpenApiSchema>()}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AllwaysFailRule [](start = 73, length = 15)
nit, Always typo #Closed
|
||
} | ||
|
||
public class AllwaysFailRule<P> : ValidationRule<P> where P: IOpenApiElement |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
P [](start = 33, length = 1)
Not sure what P stands for. Maybe just use the standard T ? #Closed
[sharedSchema.Reference.Id] = sharedSchema | ||
} | ||
}; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we also add this schema to an operation like above to test that scenario as well? #Closed
AcceptsReturn="True" | ||
FontFamily="Consolas" | ||
VerticalContentAlignment="Stretch" /> | ||
</DockPanel> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mind posting a screenshot? #Closed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still don't see the screenshot, but I think it's alright.
In reply to: 182167917 [](ancestors = 182167917)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I put it in the PR description. I wasn't sure where else to put it.
In reply to: 184121152 [](ancestors = 184121152,182167917)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
|
||
namespace Microsoft.OpenApi.Workbench | ||
{ | ||
public class StatsVisitor : OpenApiVisitorBase |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
public [](start = 4, length = 6)
nit, document public class/properties #Closed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🕐
I considered passing it as a parameter, but as Components can hold many different types, it would require passing that value almost everywhere. Also, we already hold state while walking. We hold the Path in VisitorBase and we hold stacks for tracking loops in the the Walker. It does make me wonder why we have state in both. Maybe the state can be moved into the walker instead of being in the Visitor. In reply to: 382076651 [](ancestors = 382076651) |
I was able to move the InComponents state into the walker. We still have the Path Stack in the visitor and the ValidationVisitor holds the list of errors detected. So, I don't think we can make the Visitor completely stateless, but we can at least keep it to a minimum. In reply to: 382087733 [](ancestors = 382087733,382076651) |
@@ -37,6 +38,9 @@ public void Walk(OpenApiDocument doc) | |||
{ | |||
return; | |||
} | |||
_schemaLoop.Clear(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit, newline #Closed
Sure. This looks okay to me. In reply to: 384155854 [](ancestors = 384155854,382087733,382076651) |
} | ||
|
||
[Fact] | ||
public void UnResolvedReferencedSchemaShouldNotBeValidated() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
UnResolvedReferencedSchemaShouldNotBeValidated [](start = 20, length = 46)
nit, casing of unresolved incorrect. also to distinguish this from the test below,
UnresolvedReferencedSchemaInComponentsShouldNotBeValidated #Closed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed the casing issue. One tests is ReferencedSchema, the other is SchemaReference
In reply to: 184120408 [](ancestors = 184120408)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, ok. That makes sense. The other one has a typo though ("Referenced" instead of "Reference")
In reply to: 184126122 [](ancestors = 184126122,184120408)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Referenced" was intentional, but Reference works too. Changed it.
In reply to: 184126692 [](ancestors = 184126692,184126122,184120408)
} | ||
|
||
[Fact] | ||
public void UnResolvedSchemaReferencedShouldNotBeValidated() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
UnResolvedSchemaReferencedShouldNotBeValidated [](start = 20, length = 46)
UnresolvedReferenceSchemaInOperationShouldNotBeValidated #Closed
nit, "Validate" doesn't seem like the right name for this method. This actually does all the parsing, validating, and collecting the stats. Maybe focus on the core part of it - which is reading, and call this ReadDocuemnt or ParseDocument? #Closed Refers to: src/Microsoft.OpenApi.Workbench/MainModel.cs:166 in 7e51f03. [](commit_id = 7e51f03, deletion_comment = False) |
|
||
namespace Microsoft.OpenApi.Workbench | ||
{ | ||
internal class StatsVisitor : OpenApiVisitorBase |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
internal class StatsVisitor : OpenApiVisitorBase [](start = 3, length = 49)
I'm not quite sure if this is supposed to be in the workbench. Even if it's a very basic functionality, it seems to me that we can still add it to the core Microsoft.OpenApi as a service provided. Is there a reason you specifically put this visitor here? #Closed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I put it here as a quick and dirty experimental statistics tool. I don't want to commit to something in the core library until we actually spend a bit of time designing it.
In reply to: 184122675 [](ancestors = 184122675)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
} | ||
|
||
[Fact] | ||
public void UnresolvedReferenceSchemaShouldNotBeValidated() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
UnresolvedReferenceSchemaShouldNotBeValidated [](start = 20, length = 45)
I was thinking of the other way round. "Referenced" here and "Reference" below. Sorry I wasn't clear :(
The primary problem resolved in this PR is the walker, which is used for both reference resolution and validation, was walking objects multiple times when they were $ref'd. This has been resolved in the OpenApiWalker by adding a guard that does not walk the object if it has a non-null Reference and we are currently not walking through the Components section.
For the large 30MB, million line MS Graph OpenAPI, the time has gone from > 20mins, to ~ 10 secs on my Surface Pro 4.
I also added some statistics to the workbench to get an overview of the parsed file. e.g. for the Graph API file it produces:
Path Items: 6783
Operations: 10609
Parameters: 23392
Request Bodies: 3815
Responses: 10609
Links: 132
Callbacks: 0
Schemas: 336995
Also updated Workbench to allow taking input from a file.