Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Translate Count(x => p(x)) more efficiently if the SUM(IIF(P, 1, 0)) technique applies #2651

Closed
GSPP opened this issue Jul 17, 2015 · 2 comments

Comments

@GSPP
Copy link

GSPP commented Jul 17, 2015

Count(x => somePredicate) is being translated as a subquery at the moment (EF6). If somePredicate is merely a scalar computation over x (and really whatever else is in scope at that point) EF can use a much better SQL pattern:

SUM(IIF(somePredicate, 1, 0))

(Or anything equivalent).

The same ideas applies to conditional sums:

grouping.Where(x => p(x)).Sum(x => s(x))

Should become:

grouping.Sum(x => p(x) ? s(x) : 0)

Another rewrite:

grouping.Select(x => f(x)).Sum(x => s(x))

Should become:

grouping.Sum(x => s(f(x)))

I don't see how those rewrites could ever do harm. They should be applies successively until everything has been moved into the sum or count. The rewrites should be added to make conditional counts and sum, even over computed values, a lot more efficient.

Test query:

                var results = (from o in db.Orders
                               group o by o.CustomerID into g
                               select new
                               {
                                   CustomerID = g.Key,
                                   Count = g.Count(x => x.ID >= 0),
                               }).ToList();

SQL:

SELECT 
[Distinct1].[CustomerID] AS [CustomerID], 
(SELECT 
    COUNT(1) AS [A1]
    FROM [dbo].[Orders] AS [Extent2]
    WHERE ([Distinct1].[CustomerID] = [Extent2].[CustomerID]) AND ([Extent2].[ID] >= 0)) AS [C1]
FROM ( SELECT DISTINCT 
    [Extent1].[CustomerID] AS [CustomerID]
    FROM [dbo].[Orders] AS [Extent1]
)  AS [Distinct1]

Model:

public class Customer
{
    public int ID { get; set; }
    public string Name { get; set; }
    public bool? HasFreeShipping { get; set; }

    public virtual IList<Order> Orders { get; set; }
}

public class Order
{
    public int ID { get; set; }
    public int CustomerID { get; set; }
    public Customer Customer { get; set; }
    public DateTime CreateDateTime { get; set; }
}
@rowanmiller
Copy link
Contributor

Hey,

This issue relates to EF6, which is managed in a separate repo (currently on CodePlex, but we are considering moving it to GitHub). We have taken a note on #2341 to consider this scenario and the SQL you suggested when we work on GroupBy translation in EF7.

Regarding this scenario in EF6, give that we are going to improve this in EF7 and the relative complexity/risk of adjusting the query pipeline in EF6 we aren't planning to address this in EF6. We would consider accepting a pull request though if someone outside our team wanted to contribute the change.

~Rowan

@GSPP
Copy link
Author

GSPP commented Jul 24, 2015

Rowan,

if this is going into EF7 that would be very nice. I do not especially care about EF6 anyway, this is just the version I used for testing this. I'll be on the latest version long-term.

@ajcvickers ajcvickers reopened this Oct 16, 2022
@ajcvickers ajcvickers closed this as not planned Won't fix, can't repro, duplicate, stale Oct 16, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants