Skip to content

Performance enhancements to validation and reference resolution #241

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Apr 25, 2018

Conversation

darrelmiller
Copy link
Member

@darrelmiller darrelmiller commented Apr 15, 2018

The primary problem resolved in this PR is the walker, which is used for both reference resolution and validation, was walking objects multiple times when they were $ref'd. This has been resolved in the OpenApiWalker by adding a guard that does not walk the object if it has a non-null Reference and we are currently not walking through the Components section.

For the large 30MB, million line MS Graph OpenAPI, the time has gone from > 20mins, to ~ 10 secs on my Surface Pro 4.

I also added some statistics to the workbench to get an overview of the parsed file. e.g. for the Graph API file it produces:

Path Items: 6783
Operations: 10609
Parameters: 23392
Request Bodies: 3815
Responses: 10609
Links: 132
Callbacks: 0
Schemas: 336995

Also updated Workbench to allow taking input from a file.

image

@PerthCharern
Copy link
Contributor

PerthCharern commented Apr 17, 2018

Instead of creating a property InComponents, can we change the signature of Walk for Walk(OpenApiParameter parameter) to Walk(OpenApiParameter parameter, bool isInComponents) and pass the boolean value in the caller?

It seems a bit strange that OpenApiVisitorBase needs to keep a state. #Closed

@@ -170,7 +169,15 @@ private void Validate<T>(T item)
private void Validate(object item, Type type)
{
if (item == null) return; // Required fields should be checked by higher level objects
var rules = _ruleSet.Where(r => r.ElementType == type);

Copy link
Contributor

@PerthCharern PerthCharern Apr 17, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, per style guideline and consistency, please convert this to

if ( item == null )
{
return;
} #Closed

// We create a new instance of ValidationRuleSet per call as a safeguard
// against unintentional modification of the private _defaultRuleSet.
return new ValidationRuleSet();
}
/// <summary>
Copy link
Contributor

@PerthCharern PerthCharern Apr 17, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, newline #Closed

@@ -42,6 +56,12 @@ public static ValidationRuleSet GetDefaultRuleSet()
return new ValidationRuleSet(_defaultRuleSet);
}

public static ValidationRuleSet GetEmptyRuleSet()
Copy link
Contributor

@PerthCharern PerthCharern Apr 17, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

public [](start = 8, length = 6)

nit, documentation #Closed

@@ -90,11 +90,14 @@ public OpenApiDocument Read(Stream input, out OpenApiDiagnostic diagnostic)
}

// Validate the document
var errors = document.Validate(_settings.RuleSet);
foreach (var item in errors)
if (_settings.RuleSet != null && _settings.RuleSet.Rules.Count > 0)
Copy link
Contributor

@PerthCharern PerthCharern Apr 17, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

&& _settings.RuleSet.Rules.Count > 0 [](start = 42, length = 36)

just checking. Is this constraint needed? #Closed

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's there for performance. Otherwise we walk the entire graph and do no validation.


In reply to: 182166250 [](ancestors = 182166250)

@@ -176,7 +177,8 @@ private void ResolveObject<T>(T entity, Action<T> assign) where T : class, IOpen

if (IsUnresolvedReference(entity))
{
assign(ResolveReference<T>(entity.Reference));
var x = ResolveReference<T>(entity.Reference);
assign(x);
Copy link
Contributor

@PerthCharern PerthCharern Apr 17, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this change? #Closed

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll revert. I did it temporarily whilst hunting down what was taking so long.


In reply to: 182166389 [](ancestors = 182166389)

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
Copy link
Contributor

@PerthCharern PerthCharern Apr 17, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, copyright #Closed

};

// Act
var errors = document.Validate(new ValidationRuleSet() { new AllwaysFailRule<OpenApiSchema>()});
Copy link
Contributor

@PerthCharern PerthCharern Apr 17, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AllwaysFailRule [](start = 73, length = 15)

nit, Always typo #Closed


}

public class AllwaysFailRule<P> : ValidationRule<P> where P: IOpenApiElement
Copy link
Contributor

@PerthCharern PerthCharern Apr 17, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P [](start = 33, length = 1)

Not sure what P stands for. Maybe just use the standard T ? #Closed

[sharedSchema.Reference.Id] = sharedSchema
}
};

Copy link
Contributor

@PerthCharern PerthCharern Apr 17, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we also add this schema to an operation like above to test that scenario as well? #Closed

AcceptsReturn="True"
FontFamily="Consolas"
VerticalContentAlignment="Stretch" />
</DockPanel>
Copy link
Contributor

@PerthCharern PerthCharern Apr 17, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mind posting a screenshot? #Closed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still don't see the screenshot, but I think it's alright.


In reply to: 182167917 [](ancestors = 182167917)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I put it in the PR description. I wasn't sure where else to put it.


In reply to: 184121152 [](ancestors = 184121152,182167917)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Saw it now. Thanks!


In reply to: 184128635 [](ancestors = 184128635,184121152,182167917)


namespace Microsoft.OpenApi.Workbench
{
public class StatsVisitor : OpenApiVisitorBase
Copy link
Contributor

@PerthCharern PerthCharern Apr 17, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

public [](start = 4, length = 6)

nit, document public class/properties #Closed

Copy link
Contributor

@PerthCharern PerthCharern left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🕐

@darrelmiller
Copy link
Member Author

I considered passing it as a parameter, but as Components can hold many different types, it would require passing that value almost everywhere. Also, we already hold state while walking. We hold the Path in VisitorBase and we hold stacks for tracking loops in the the Walker. It does make me wonder why we have state in both. Maybe the state can be moved into the walker instead of being in the Visitor.


In reply to: 382076651 [](ancestors = 382076651)

@darrelmiller
Copy link
Member Author

I was able to move the InComponents state into the walker. We still have the Path Stack in the visitor and the ValidationVisitor holds the list of errors detected. So, I don't think we can make the Visitor completely stateless, but we can at least keep it to a minimum.


In reply to: 382087733 [](ancestors = 382087733,382076651)

@@ -37,6 +38,9 @@ public void Walk(OpenApiDocument doc)
{
return;
}
_schemaLoop.Clear();
Copy link
Contributor

@PerthCharern PerthCharern Apr 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, newline #Closed

@PerthCharern
Copy link
Contributor

Sure. This looks okay to me.


In reply to: 384155854 [](ancestors = 384155854,382087733,382076651)

}

[Fact]
public void UnResolvedReferencedSchemaShouldNotBeValidated()
Copy link
Contributor

@PerthCharern PerthCharern Apr 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UnResolvedReferencedSchemaShouldNotBeValidated [](start = 20, length = 46)

nit, casing of unresolved incorrect. also to distinguish this from the test below,

UnresolvedReferencedSchemaInComponentsShouldNotBeValidated #Closed

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed the casing issue. One tests is ReferencedSchema, the other is SchemaReference


In reply to: 184120408 [](ancestors = 184120408)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, ok. That makes sense. The other one has a typo though ("Referenced" instead of "Reference")


In reply to: 184126122 [](ancestors = 184126122,184120408)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Referenced" was intentional, but Reference works too. Changed it.


In reply to: 184126692 [](ancestors = 184126692,184126122,184120408)

}

[Fact]
public void UnResolvedSchemaReferencedShouldNotBeValidated()
Copy link
Contributor

@PerthCharern PerthCharern Apr 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UnResolvedSchemaReferencedShouldNotBeValidated [](start = 20, length = 46)

UnresolvedReferenceSchemaInOperationShouldNotBeValidated #Closed

@PerthCharern
Copy link
Contributor

PerthCharern commented Apr 25, 2018

    internal void Validate()

nit, "Validate" doesn't seem like the right name for this method. This actually does all the parsing, validating, and collecting the stats.

Maybe focus on the core part of it - which is reading, and call this ReadDocuemnt or ParseDocument? #Closed


Refers to: src/Microsoft.OpenApi.Workbench/MainModel.cs:166 in 7e51f03. [](commit_id = 7e51f03, deletion_comment = False)

@PerthCharern
Copy link
Contributor

PerthCharern commented Apr 25, 2018

using System;

nit, copyright #Closed


Refers to: src/Microsoft.OpenApi.Workbench/StatsVisitor.cs:1 in 7e51f03. [](commit_id = 7e51f03, deletion_comment = False)


namespace Microsoft.OpenApi.Workbench
{
internal class StatsVisitor : OpenApiVisitorBase
Copy link
Contributor

@PerthCharern PerthCharern Apr 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

internal class StatsVisitor : OpenApiVisitorBase [](start = 3, length = 49)

I'm not quite sure if this is supposed to be in the workbench. Even if it's a very basic functionality, it seems to me that we can still add it to the core Microsoft.OpenApi as a service provided. Is there a reason you specifically put this visitor here? #Closed

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I put it here as a quick and dirty experimental statistics tool. I don't want to commit to something in the core library until we actually spend a bit of time designing it.


In reply to: 184122675 [](ancestors = 184122675)

Copy link
Contributor

@PerthCharern PerthCharern left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@darrelmiller
Copy link
Member Author

    internal void Validate()

I renamed it ParseDocument. It's called Convert on the UI but I'm not a fan of that either.


In reply to: 384346044 [](ancestors = 384346044)


Refers to: src/Microsoft.OpenApi.Workbench/MainModel.cs:166 in 7e51f03. [](commit_id = 7e51f03, deletion_comment = False)

}

[Fact]
public void UnresolvedReferenceSchemaShouldNotBeValidated()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UnresolvedReferenceSchemaShouldNotBeValidated [](start = 20, length = 45)

I was thinking of the other way round. "Referenced" here and "Reference" below. Sorry I wasn't clear :(

@darrelmiller darrelmiller merged commit ef6eba9 into master Apr 25, 2018
@darrelmiller darrelmiller deleted the dm/perf branch January 4, 2020 20:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants