Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactoring Include infrastructure to store expression info. #83

Merged
merged 4 commits into from
Feb 17, 2021

Conversation

fiseni
Copy link
Collaborator

@fiseni fiseni commented Feb 14, 2021

This PR addresses a few limitations we have with Include in specifications. Since we're not able to store second level expressions in this form Expression<Func<TPreviousProperty, TProperty>> (read more here), we're extracting the property name and concatenating it to the navigation path. This in turn limits us on what expressions we can use for Include. We can use only "property selector" expressions. That's almost 90% of the use cases, but anyhow, we're limiting the full potential of the EF queries, and all advancements that they can introduce in the future.

In order to mitigate this, we simply have to store this information in some acceptable form, and then during the evaluation should be able to restore the initial expression. A new type IncludeExpressionInfo is introduced, which will hold all information to effectively describe an expression. Then, the new extension methods to IQueryable will utilize this info to call Include or ThenInclude EF methods by reflection.

By implementing this we will be able to support the new features of EF Core 5, e.g. filtered includes. So, this addresses issue #67 too.

Breaking changes

There are no breaking changes in the usage, the builders have the same API, and any written specification today will still work. All integration tests passed without any modification. Anyhow the ISpecification includes changes, no longer holds a collection of IIncludeAggregator, instead, it holds a collection of IncludeExpressionInfo. Anyone that accessed the specification properties directly, have written their own evaluators, or have written various extensions, then this change represents a breaking change.

Performance considerations

Even though we don't utilize reflection heavily, still, it's being used. I don't have performance tests and can't quantify exactly the implications. If anyone can work on that would be great. I don't expect some huge performance issues, but it's worth noting the possible implications.

Implementation

There are changes in the internal infrastructure, but the core logic of this PR is the extensions to IQueryable

public static class IncludeExtensions
{
    public static IQueryable<T> Include<T>(this IQueryable<T> source, IncludeExpressionInfo info)
    {
        _ = info ?? throw new ArgumentNullException(nameof(info));

        var queryExpr = Expression.Call(
            typeof(EntityFrameworkQueryableExtensions),
            "Include",
            new Type[] {
                info.EntityType,
                info.PropertyType
            },
            source.Expression,
            info.LambdaExpression
            );

        return source.Provider.CreateQuery<T>(queryExpr);
    }
    public static IQueryable<T> ThenInclude<T>(this IQueryable<T> source, IncludeExpressionInfo info)
    {
        _ = info ?? throw new ArgumentNullException(nameof(info));
        _ = info.PreviousPropertyType ?? throw new ArgumentNullException(nameof(info.PreviousPropertyType));

        var queryExpr = Expression.Call(
            typeof(EntityFrameworkQueryableExtensions),
            "ThenInclude",
            new Type[] {
                info.EntityType,
                info.PreviousPropertyType,
                info.PropertyType
            },
            source.Expression,
            info.LambdaExpression
            );

        return source.Provider.CreateQuery<T>(queryExpr);
    }
}

@fiseni fiseni requested a review from ardalis February 14, 2021 14:19
@ardalis
Copy link
Owner

ardalis commented Feb 17, 2021

@fiseni this looks good to me.

@fiseni
Copy link
Collaborator Author

fiseni commented Feb 17, 2021

Ok, I'm merging this one.

@fiseni fiseni merged commit 1a420b7 into ardalis:master Feb 17, 2021
@dotnetgeek
Copy link

That are pretty good news. Is there a plan for creating a new version / package that will be published to nuget?

@fiseni
Copy link
Collaborator Author

fiseni commented Feb 18, 2021

We'll publish a new major version soon. We have some more features and changes that we want to introduce (see #84).

@nelsonprsousa
Copy link

nelsonprsousa commented Feb 20, 2021

@fiseni is it possible to test this? I don't know - and I think it is not possible - how we can target master using NuGet.

If this is not possible, maybe we can already create an alpha/beta version?

cc @ardalis

@ardalis
Copy link
Owner

ardalis commented Feb 20, 2021

@fiseni is it possible to test this? I don't know - and I think it is not possible - how we can target master using NuGet.

If this is not possible, maybe we can already create an alpha/beta version?

cc @ardalis

You can build locally to test. Just make a regular project reference or pack and put in a local nuget folder. We aren't publishing prerelease versions of the library at this time. Maybe we could but usually the time between adding enough features for a release and publishing them as a new major version is under a week or two so I'm not sure it would be worth it.

@nelsonprsousa
Copy link

@fiseni is it possible to test this? I don't know - and I think it is not possible - how we can target master using NuGet.
If this is not possible, maybe we can already create an alpha/beta version?
cc @ardalis

You can build locally to test. Just make a regular project reference or pack and put in a local nuget folder. We aren't publishing prerelease versions of the library at this time. Maybe we could but usually the time between adding enough features for a release and publishing them as a new major version is under a week or two so I'm not sure it would be worth it.

I see. Gonna download and use a local nuget folder, as you suggested.

Thank you

@dotnetgeek
Copy link

I did a local build and used it for my scenario, my test are look really good. It works as expected.

@nelsonprsousa
Copy link

I did a local build and used it for my scenario, my test are look really good. It works as expected.

Did you targeted EF 5?

@fiseni
Copy link
Collaborator Author

fiseni commented Feb 21, 2021

Ok, I did some performance tests. Tested the evaluation of Include, given an expression or string.
In order to see the difference, I did testing while using EF directly, and using the specifications (which includes the evaluation on EF side).

Evaluations count: 1.000

Method Mean Error StdDev Median Gen 0 Gen 1 Gen 2 Allocated
EFIncludeExpression 5.037 ms 0.2655 ms 0.7704 ms 5.238 ms - - - 1937.5 KB
EFIncludeString 1.844 ms 0.0658 ms 0.1909 ms 1.873 ms - - - 789.06 KB
SpecIncludeExpression 6.021 ms 0.2738 ms 0.8029 ms 5.808 ms - - - 3140.63 KB
SpecIncludeString 2.731 ms 0.0535 ms 0.1068 ms 2.695 ms - - - 1210.94 KB

Evaluations count: 1.000.000

Method Mean Error StdDev Median Gen 0 Gen 1 Gen 2 Allocated
EFIncludeExpression 3.606 s 0.0721 s 0.1521 s 3.540 s 316000.0000 - - 1892.09 MB
EFIncludeString 1.446 s 0.0052 s 0.0041 s 1.446 s 128000.0000 - - 770.57 MB
SpecIncludeExpression 5.187 s 0.0439 s 0.0411 s 5.185 s 512000.0000 1000.0000 - 3067.02 MB
SpecIncludeString 1.926 s 0.0103 s 0.0096 s 1.921 s 197000.0000 - - 1182.56 MB

Summary

First of all, using the include feature by passing string is always faster, regardless if you're using EF directly or utilizing specifications.
If we analyze further, at first glance we see that using include expressions with specifications is almost 45% slower for 1M evaluations. But, I think reasoning in terms of percentage gives a wrong conclusion here. It's better to talk of absolute values. For example, if you have 1.000.000 requests (which is a quite high number of requests), using an include expression with specifications, will add an overhead of ~1.5 seconds. The rest remains the same.

Simplified, if I run a web application on this same laptop, serving 1M web requests might take almost a minute (~ 15K req/sec). If I use specifications, then it will be a minute plus a second.

@fiseni
Copy link
Collaborator Author

fiseni commented Feb 21, 2021

And here is the actual test, if anyone is interested.

[MemoryDiagnoser]
public class SpecIncludeBenchmark
{
    private readonly int max = 1000000;
    private readonly SpecificationEvaluator evaluator = SpecificationEvaluator.Default;
    private readonly Specification<Store> specInclude = new StoreIncludeProductsSpec();
    private readonly Specification<Store> specIncludeString = new StoreIncludeProductsAsStringSpec();

    private readonly IQueryable<Store> Stores;

    public SpecIncludeBenchmark()
    {
        Stores = new BenchmarkDbContext().Stores.AsQueryable();
    }

    [Benchmark]
    public void EFIncludeExpression()
    {
        for (int i = 0; i < max; i++)
        {
            _ = Stores.Include(x => x.Products);
        }
    }

    [Benchmark]
    public void EFIncludeString()
    {
        for (int i = 0; i < max; i++)
        {
            _ = Stores.Include(nameof(Store.Products));
        }
    }

    [Benchmark]
    public void SpecIncludeExpression()
    {
        for (int i = 0; i < max; i++)
        {
            _ = evaluator.GetQuery(Stores, specInclude);
        }
    }

    [Benchmark]
    public void SpecIncludeString()
    {
        for (int i = 0; i < max; i++)
        {
            _ = evaluator.GetQuery(Stores, specIncludeString);
        }
    }
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants