Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Joining between a GroupBy and a Select translates to inefficient SQL #2639

Closed
GSPP opened this issue Jul 16, 2015 · 1 comment
Closed

Joining between a GroupBy and a Select translates to inefficient SQL #2639

GSPP opened this issue Jul 16, 2015 · 1 comment

Comments

@GSPP
Copy link

GSPP commented Jul 16, 2015

Query:

            var results = (from o in db.Orders
                           group o by o.CustomerID into g
                           join c in db.Customers on g.Key equals c.ID
                           select new
                           {
                               CustomerID = c.ID,
                               Count = g.Count(),
                           }).ToList();

SQL:

SELECT 
[Distinct1].[CustomerID] AS [CustomerID], 
[Extent2].[ID] AS [ID], 
(SELECT 
    COUNT(1) AS [A1]
    FROM [dbo].[Orders] AS [Extent3]
    WHERE [Distinct1].[CustomerID] = [Extent3].[CustomerID]) AS [C1]
FROM   (SELECT DISTINCT 
    [Extent1].[CustomerID] AS [CustomerID]
    FROM [dbo].[Orders] AS [Extent1] ) AS [Distinct1]
INNER JOIN [dbo].[Customers] AS [Extent2] ON [Distinct1].[CustomerID] = [Extent2].[ID]

Model:

public class Customer
{
    public int ID { get; set; }
    public string Name { get; set; }
    public bool? HasFreeShipping { get; set; }

    public virtual IList<Order> Orders { get; set; }
}

public class Order
{
    public int ID { get; set; }
    public int CustomerID { get; set; }
    public Customer Customer { get; set; }
    public DateTime CreateDateTime { get; set; }
}

Note, that the GroupBy is being computed twice: Once from the real GroupBy that's in the query and then as a computed subquery column. This seems unnecessary. EF seems to be unable to move the aggregation in the Select back into the GROUP BY.

The following test case reproduces this as well:

            var results = (from o in db.Orders
                           group o by o.CustomerID into g
                           let someVal = g.Key + 1 //unused column
                           select new
                           {
                               CustomerID = g.Key,
                               Count = g.Count(),
                               someVal //you can comment this in or out. fails both times
                           }).ToList();

SELECT 
[Distinct1].[CustomerID] AS [CustomerID], 
(SELECT 
    COUNT(1) AS [A1]
    FROM [dbo].[Orders] AS [Extent2]
    WHERE [Distinct1].[CustomerID] = [Extent2].[CustomerID]) AS [C1]
FROM ( SELECT DISTINCT 
    [Extent1].[CustomerID] AS [CustomerID]
    FROM [dbo].[Orders] AS [Extent1]
)  AS [Distinct1]

This seems like a significant problem because failing to move all aggregations into the GROUP BY can lead to exploding query runtime.

@rowanmiller
Copy link
Contributor

Hey,

This issue relates to EF6, which is managed in a separate repo (currently on CodePlex, but we are considering moving it to GitHub). We have taken a note on #2341 to consider this scenario and the SQL you suggested when we work on GroupBy translation in EF7.

Regarding this scenario in EF6, give that we are going to improve this in EF7 and the relative complexity/risk of adjusting the query pipeline in EF6 we aren't planning to address this in EF6. We would consider accepting a pull request though if someone outside our team wanted to contribute the change.

~Rowan

@ajcvickers ajcvickers reopened this Oct 16, 2022
@ajcvickers ajcvickers closed this as not planned Won't fix, can't repro, duplicate, stale Oct 16, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants