Improve performance by removing sync-over-async by generating sync methods using Zomp.SyncMethodGenerator #2136

virzak · 2023-08-08T11:34:08Z

The main advantage of this approach is performance.

The performance is increased by 26%, 21%, 22%, 28%, 30% and 7% for the following benchmarks in ValidationBenchmark.

useAsync flag

BenchmarkDotNet=v0.12.1, OS=Windows 10.0.22621
AMD Ryzen 9 5900HS with Radeon Graphics, 1 CPU, 16 logical and 8 physical cores
.NET Core SDK=7.0.400-preview.23330.10
[Host] : .NET Core 6.0.20 (CoreCLR 6.0.2023.32017, CoreFX 6.0.2023.32017), X64 RyuJIT
DefaultJob : .NET Core 6.0.20 (CoreCLR 6.0.2023.32017, CoreFX 6.0.2023.32017), X64 RyuJIT

Method	DataSet	Mean	Error	StdDev	Gen 0	Gen 1	Gen 2	Allocated
FailFast	HalfErrors	452.61 ms	19.804 ms	57.456 ms	31000.0000	-	-	249.08 MB
Validate	HalfErrors	613.89 ms	23.585 ms	68.425 ms	45000.0000	-	-	362.54 MB
FailFast	ManyErrors	23.21 ms	0.981 ms	2.766 ms	2562.5000	-	-	20.5 MB
Validate	ManyErrors	996.51 ms	41.464 ms	120.952 ms	55000.0000	3000.0000	-	444.06 MB
FailFast	NoErrors	865.26 ms	29.054 ms	83.360 ms	44000.0000	-	-	355.72 MB
Validate	NoErrors	678.55 ms	42.455 ms	119.745 ms	44000.0000	-	-	355.72 MB

SyncMethodGenerator

BenchmarkDotNet=v0.12.1, OS=Windows 10.0.22621
AMD Ryzen 9 5900HS with Radeon Graphics, 1 CPU, 16 logical and 8 physical cores
.NET Core SDK=7.0.400-preview.23330.10
[Host] : .NET Core 6.0.20 (CoreCLR 6.0.2023.32017, CoreFX 6.0.2023.32017), X64 RyuJIT
DefaultJob : .NET Core 6.0.20 (CoreCLR 6.0.2023.32017, CoreFX 6.0.2023.32017), X64 RyuJIT

Method	DataSet	Mean	Error	StdDev	Median	Gen 0	Gen 1	Gen 2	Allocated
FailFast	HalfErrors	333.36 ms	19.547 ms	57.329 ms	322.04 ms	31000.0000	-	-	249.08 MB
Validate	HalfErrors	484.47 ms	23.499 ms	68.547 ms	475.97 ms	45000.0000	-	-	362.54 MB
FailFast	ManyErrors	18.55 ms	0.754 ms	2.150 ms	17.90 ms	2562.5000	-	-	20.5 MB
Validate	ManyErrors	713.78 ms	36.968 ms	108.420 ms	693.34 ms	55000.0000	3000.0000	-	444.06 MB
FailFast	NoErrors	603.87 ms	25.970 ms	75.756 ms	602.52 ms	44000.0000	-	-	355.72 MB
Validate	NoErrors	632.17 ms	39.229 ms	115.053 ms	626.33 ms	44000.0000	-	-	355.72 MB

Other than that, there is less lines of code and no "acceptable" sync-over-async, which still generates a state machine. It was also throwing me off seeing "Async" (eg ValidateAsync and ValidateInternalAsync) in the stack trace when I wasn't doing anything asynchronously.

The only downside is that the DLL grew from 458 KB to 466 KB, but that's less than 2%.

JeremySkinner · 2023-08-08T12:10:12Z

Thanks for the suggestion - this is an interesting idea. My initial reaction was to say that this isn't something I'm keen on, but thinking about it further I'm open to have a discussion about it but I have a few concerns:

Makes it much less obvious what's happening, I prefer the explicitness of the current approach.
I want to avoid forcing consumers to take on additional dependency. Does this work with PrivateAssets=all in the package reference?
Build fails currently with the error Zomp.SyncMethodGenerator.dll' references version '4.6.0.0' of the compiler, which is newer than the currently running version '4.4.0.0'.. FluentValidation compilation is done with the .NET 7 SDK (7.0.100), targeting following TFMs: netstandard2.0;netstandard2.1;net5.0;net6.0;net7.0 (although netstandard2.0 and net5.0 will be removed in the next major release). I see that's been fixed now

virzak · 2023-08-08T13:01:23Z

In reverse order: library users won't know the difference other than performance improvement and perhaps stack trace. Looking at the new package using NuGet Package Explorer is this:

It seems that no source generator needs to add PrivateAssets=all

About the code being less obvious. Since library users won't know the difference, the question is how less obvious is it for contributors?

You would be correct to say that any source generator is less obvious, because sometimes you can't see a method / property unless you F12 from Visual Studio. Previous contributors might wonder what is CreateSyncVersion attribute.

On the other hand, this sync-over-async approach also had to have a paragraph with explanation of why it is acceptable and any new contributor who knows that Async shouldn't be used in synchronized methods will have a learning curve. Which is the steeper learning curve? Also the end users who see 'Async' in the stack trace might be confused. Btw, that's the exact reason for me submitting this PR.

IMO, while code clarity and obviousness can be argued either way, the boost in performance tips the scale for this change.

I'd be curious to know what was the most confusing part for you in this approach, so perhaps it could be remedied.

For example one idea I had (which isn't yet implemented):

Instead of:

#if SYNC_ONLY
    throw new AsyncValidatorInvokedSynchronouslyException();
#endif

The syntax could be rewritten as

[ThrowWhenInvokedSynchronously<AsyncValidatorInvokedSynchronouslyException>]
internal Func<ValidationContext<T>, CancellationToken, Task<bool>> AsyncCondition => _asyncCondition;

JeremySkinner · 2023-08-08T13:32:19Z

hi @virzak thanks for getting back to me

Looking at the new package using NuGet Package Explorer is this:

What does the list of dependent packages look like?

On the other hand, this sync-over-async approach also had to have a paragraph with explanation of why it is acceptable and any new contributor who knows that Async shouldn't be used in synchronized methods will have a learning curve.

Yep all good points, it's a tradeoff, although personally I prefer having the explicitness of being able to see the method with an explanatory comment, and think this is more contributor friendly.

Also the end users who see 'Async' in the stack trace might be confused. Btw, that's the exact reason for me submitting this PR.

That could be partially solved by dropping the async suffix from the method name if its causing confusion. I'm not precious about using the async suffix for internal methods if it clears up ambiguity.

IMO, while code clarity and obviousness can be argued either way, the boost in performance tips the scale for this change.

At the end of the day I think it'll be a case of whether the performance boost balances out my discomfort or not. I used to maintain 2 explicit code paths (sync & async), although this technically performed better than the current approach, it was a massive pain to maintain and the minor performance degredation was balanced out by other performance savings made elsewhere. I'll be the one working on the codebase and maintaining this approach, so I need to be completely comfortable with it, so it needs to be something I'm comfortable with.

I'd be curious to know what was the most confusing part for you in this approach, so perhaps it could be remedied.

Generally I don't like not being able to see the method implementation, although that's a broader issue with source generators as a whole, and why I don't generally like to use them as I want to actually see the code that I'm compiling. I'm also not fond of the conditional SYNC_ONLY symbol; if the method is now purely async then I wouldn't expect to see any sync code paths in there at all. The attribute as an alternative approach is interesting, although again the more rewriting it's doing behind the scenes the less comfortable I am with it.

Also how battle tested is this in production? It looks pretty new, and for a library like FluentValidation with such a large userbase over such a long period of time I try to prioritise stability and am very cautious about taking a dependency on anything that's not been extensively battle tested.

virzak · 2023-08-08T16:12:28Z

hi @virzak thanks for getting back to me

Looking at the new package using NuGet Package Explorer is this:

What does the list of dependent packages look like?

FluentValidation only references these ones. The Zomp.SyncMethodGenerator source generator does not become a dependency.

System.Collections, Version=7.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a
System.Collections.Concurrent, Version=7.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a
System.Linq, Version=7.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a
System.Linq.Expressions, Version=7.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a
System.Runtime, Version=7.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a
System.Text.RegularExpressions, Version=7.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a
System.Threading, Version=7.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a

On the other hand, this sync-over-async approach also had to have a paragraph with explanation of why it is acceptable and any new contributor who knows that Async shouldn't be used in synchronized methods will have a learning curve.

Yep all good points, it's a tradeoff, although personally I prefer having the explicitness of being able to see the method with an explanatory comment, and think this is more contributor friendly.

Not sure if this is something you're aware of, but you can explicitly see the synchronized method. Generated files are created under dependencies > analysers

And as you edit the asynchronous one, the other gets updated right away:

Also the end users who see 'Async' in the stack trace might be confused. Btw, that's the exact reason for me submitting this PR.

That could be partially solved by dropping the async suffix from the method name if its causing confusion. I'm not precious about using the async suffix for internal methods if it clears up ambiguity.

Async suffix convention could be adhered to with the source generator. That way the stack trace shows Async for asynchronous calls.

IMO, while code clarity and obviousness can be argued either way, the boost in performance tips the scale for this change.

At the end of the day I think it'll be a case of whether the performance boost balances out my discomfort or not. I used to maintain 2 explicit code paths (sync & async), although this technically performed better than the current approach, it was a massive pain to maintain and the minor performance degredation was balanced out by other performance savings made elsewhere. I'll be the one working on the codebase and maintaining this approach, so I need to be completely comfortable with it, so it needs to be something I'm comfortable with.

The proposed approach is much closer to your original solution, except that the massive pain is no longer there. With you having the two editor windows side by side, you're making changes, while ensuring correctness as well.

I'd be curious to know what was the most confusing part for you in this approach, so perhaps it could be remedied.

Generally I don't like not being able to see the method implementation, although that's a broader issue with source generators as a whole, and why I don't generally like to use them as I want to actually see the code that I'm compiling. I'm also not fond of the conditional SYNC_ONLY symbol; if the method is now purely async then I wouldn't expect to see any sync code paths in there at all. The attribute as an alternative approach is interesting, although again the more rewriting it's doing behind the scenes the less comfortable I am with it.

So not seeing the compiled code has been addressed above. SYNC_ONLY symbol was a solution to where a slight variation between sync and async versions occur. It is kinda all encompassing solution, but for FluentValidation an attribute would make more sense and I'll be happy to implement it.

Also how battle tested is this in production? It looks pretty new, and for a library like FluentValidation with such a large userbase over such a long period of time I try to prioritise stability and am very cautious about taking a dependency on anything that's not been extensively battle tested.

Like all 3rd party source generators, this package is pretty new. However, it has been battle tested in my client project (closed source) for a bit less than a year. It has been a great investment, since performance is crucial there. Other than how it is used in FluentValidation, there's a lot of IAsyncEnumerable rewrites into IEnumerable.

It has nearly 100% code coverage and was recently integrated into atldotnet. Also it seems like this source generator appeals mostly to people who aren't aware of the sync-over-async approach. For example here is the discussion with the maintainer of atldotnet.

Totally understand the reluctance of integrating a library with 17 stars into a library with 8.3k stars, but this is merely a source generator, so of all 17 star projects this would probably be the safest. You also have my commitment to resolve any future issues that might occur.

JeremySkinner · 2023-08-08T16:23:12Z

Thanks for getting back to me, I’ll have a further think about it and do some experimenting too and I’ll give you a shout if I have any questions. If we go down this route then it wouldn’t be outside of a major version release anyway which would be a while away which gives me some time to fully consider any implications.

FluentValidation only references these ones.

That’s a list of assembly references. I was referring to the list of dependent package references and whether it shows up in there. Within the generated package they’re listed inside the nuspec file inside the dependencies element.

Not sure if this is something you're aware of, but you can explicitly see the synchronized method

Do you know if that’s a Visual Studio feature or if it works in Rider too? (I don’t use windows/VS)

virzak · 2023-08-08T17:55:00Z

You're right. As it is right now, dependencies include it. Not sure about the ramifications but will look into it.

<dependencies>
  <group targetFramework="net5.0">
    <dependency id="Zomp.SyncMethodGenerator" version="1.1.2" exclude="Build,Analyzers" />
  </group>
  <group targetFramework="net6.0">
    <dependency id="Zomp.SyncMethodGenerator" version="1.1.2" exclude="Build,Analyzers" />
  </group>
  <group targetFramework="net7.0">
    <dependency id="Zomp.SyncMethodGenerator" version="1.1.2" exclude="Build,Analyzers" />
  </group>
  <group targetFramework=".NETStandard2.0">
    <dependency id="System.Threading.Tasks.Extensions" version="4.5.2" exclude="Build,Analyzers" />
    <dependency id="Zomp.SyncMethodGenerator" version="1.1.2" exclude="Build,Analyzers" />
  </group>
  <group targetFramework=".NETStandard2.1">
    <dependency id="Zomp.SyncMethodGenerator" version="1.1.2" exclude="Build,Analyzers" />
  </group>
</dependencies>

Rider supports source generators, but seems a touch slower than Visual Studio.

As I typed new code in the async version, the read-only sync version updated after a slight delay.

Visual Studio Code does not have that support at this time, but I filed an issue.

virzak · 2023-08-08T18:35:02Z

One other consideration regarding this is: what's the ratio of Validate to ValidateAsync in a usual project? The heavier it is towards Validate, the more impactful this change will be. In my main project only Validate is being used.

JeremySkinner · 2023-08-08T18:37:57Z

My general recommendation is to use ValidateAsync as it ensures that all features are usable, and I consider the synchronous API to be legacy. My ideal would be to eventually deprecate and remove the synchronous overloads entirely, but I'm not sure that would ever actually be possible due to its prevalence (plus asp.net's validation pipeline is still synchronous, hence the recommendation not to use it anymore as per #1959)

cremor · 2023-08-11T16:04:16Z

You're right. As it is right now, dependencies include it. Not sure about the ramifications but will look into it.

@virzak Zomp.SyncMethodGenerator should be flagged as a developmentDependency. If you then re-add it to FluentValidation VS should create correct attributes in the <PackageReference> tag to not include it as a dependency.

My general recommendation is to use ValidateAsync as it ensures that all features are usable, and I consider the synchronous API to be legacy.

@JeremySkinner This statement, and seeing the sync-over-async code in this PR, surprised me. I always thought that having a validator that is async would be a rare case. Therefore I always preferred the sync Validate methods because I thought that this will prevent running the unnecessary async state machine code. Seems like my assumption was wrong 😓 (And therefore I support this PR 😃 )

From a code style point of view I'd also prefer using sync Validate calls by default. That way I would immediately notice that something "special" happens in a validator if it is called with ValidateAsync.

I've even already thought about how I will handle the case when the Roslyn Analyzer "CA1849: Call async methods when in an async method" is (hopefully) fixed in .NET 8 and will finally warn about async overloads that take a CancellationToken as an optional parameter. My current plan is to suppress those warnings for ValidateAsync because of the reasons listed above.

Do you know if that’s a Visual Studio feature or if it works in Rider too? (I don’t use windows/VS)

With VS you can also use "Go to definition" or "Find all references" on a partial class declaration and it will show the generated code as one of the found results. I don't use Rider, but maybe it does this too?

JeremySkinner · 2023-08-11T16:36:15Z

hi @cremor thanks for the feedback

This statement, and seeing the sync-over-async code in this PR, surprised me

Just to be clear, this isn't "sync over async" in the way that this term is usually used. There is no risk of deadlocks, and ValueTask is optimised for synchronous completion. This approach has made the library significantly more maintainable and has eliminated several bugs that came about due to subtle divergences in the previous implementation with separate sync & async methods. From my point of view as the sole person maintaining this library, the use of the state machine is an acceptable tradeoff for the increased maintainability. From a performance perspective, my own analysis doesn't show this to be an issue, and if it had I wouldn't have switched to this approach. Also this technique is used by Microsoft internally and if it's good enough for them it's good enough for me.

The use of source generators to solve this an alternative way is interesting, but at present I have too many concerns with this approach. I will re-evaluate when I get round to working on v12.0 again.

JeremySkinner · 2024-02-11T15:03:01Z

hi @virzak I'm revisiting this as I'm starting to think about the 12.0 release again. I've rebased your PR into the 12.x branch, but it doesn't currently build as it looks like the generated code forcibly enables nullable with a #nullable enable. Is there a way to turn this off?

virzak · 2024-02-12T05:19:56Z

Hi @JeremySkinner,

All instances of CreateSyncVersion include OmitNullableDirective. Will be able to do that automatically once dotnet/roslyn#49555 is resolved.

More on that here.
zompinc/sync-method-generator#39

JeremySkinner · 2024-02-12T09:12:19Z

@virzak thanks! Build is passing now, I'll continue to review this as part of the 12.0 development

…thods using Zomp.SyncMethodGenerator

…s="All"

virzak mentioned this pull request Aug 8, 2023

Consider using source generators for regular expressions #2138

Closed

JeremySkinner added the Status: On Hold label Aug 11, 2023

virzak mentioned this pull request Aug 11, 2023

Mark the package as DevelopmentDependency zompinc/sync-method-generator#11

Closed

JeremySkinner force-pushed the sync-method-generator branch from 0da66fc to ab626ba Compare February 11, 2024 15:01

JeremySkinner changed the base branch from main to 12.x-dev February 12, 2024 09:11

virzak added 4 commits April 23, 2024 13:37

Improve performance by removing sync-over-async by generating sync me…

4112575

…thods using Zomp.SyncMethodGenerator

Use stable latest .NET release to build

76e7c87

Remove the source generator from dependencies by setting PrivateAsset…

6e3aff5

…s="All"

Explicitly omit "#nullable enable" generation

19edb2e

JeremySkinner force-pushed the sync-method-generator branch from 7b38b62 to 19edb2e Compare April 23, 2024 12:39

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve performance by removing sync-over-async by generating sync methods using Zomp.SyncMethodGenerator #2136

Improve performance by removing sync-over-async by generating sync methods using Zomp.SyncMethodGenerator #2136

virzak commented Aug 8, 2023 •

edited

JeremySkinner commented Aug 8, 2023 •

edited

virzak commented Aug 8, 2023

JeremySkinner commented Aug 8, 2023

virzak commented Aug 8, 2023

JeremySkinner commented Aug 8, 2023 •

edited

virzak commented Aug 8, 2023

virzak commented Aug 8, 2023

JeremySkinner commented Aug 8, 2023 •

edited

cremor commented Aug 11, 2023 •

edited

JeremySkinner commented Aug 11, 2023

JeremySkinner commented Feb 11, 2024

virzak commented Feb 12, 2024

JeremySkinner commented Feb 12, 2024

Improve performance by removing sync-over-async by generating sync methods using Zomp.SyncMethodGenerator #2136

Are you sure you want to change the base?

Improve performance by removing sync-over-async by generating sync methods using Zomp.SyncMethodGenerator #2136

Conversation

virzak commented Aug 8, 2023 • edited

useAsync flag

SyncMethodGenerator

JeremySkinner commented Aug 8, 2023 • edited

virzak commented Aug 8, 2023

JeremySkinner commented Aug 8, 2023

virzak commented Aug 8, 2023

JeremySkinner commented Aug 8, 2023 • edited

virzak commented Aug 8, 2023

virzak commented Aug 8, 2023

JeremySkinner commented Aug 8, 2023 • edited

cremor commented Aug 11, 2023 • edited

JeremySkinner commented Aug 11, 2023

JeremySkinner commented Feb 11, 2024

virzak commented Feb 12, 2024

JeremySkinner commented Feb 12, 2024

virzak commented Aug 8, 2023 •

edited

JeremySkinner commented Aug 8, 2023 •

edited

JeremySkinner commented Aug 8, 2023 •

edited

JeremySkinner commented Aug 8, 2023 •

edited

cremor commented Aug 11, 2023 •

edited