Skip to content

Relational: Support translating GroupBy() to SQL #2341

@divega

Description

@divega

Updated as of 3/14/2018 based on implementation introduced

LINQ's GroupBy() operators can sometimes be translated to SQL's GROUP BY clauses, in particular when aggregate functions are applied in the projection.

Scope for 2.1 release

Our current intention is for the scope of the work in 2.1 to improve how LINQ's GroupBy is evaluated in this particular scenario:

Grouping on a simple expression that references a single column or multiple columns (using an anonymous type) of the query root and then projecting any aggregate operation from (Count, Sum, Average, Min, Max) and/or any individual property that is part of the grouping key (this will be translated to GROUP BY in SQL)

What is supported in 2.1.0-preview2

Apart from what is supported in 2.1.0-preview1 (details below), we have also added some more patterns

  • Grouping by constant/variables
  • Grouping by properties of reference navigations
  • Ordering or filtering on grouping key or aggregate
  • Aggregate being DTO/nominal type

Examples

// Grouping by constant or a variable from closure
db.Orders.GroupBy(o => 2).Select(g => g.Count());
var a = 5;
db.Orders.GroupBy(o => a).Select(g => g.Count());

// Grouping by scalar properties from reference navigations
db.Orders.GroupBy(o => o.Customer.City).Select(g => g.Count());

// Ordering by Key/Aggregate after GroupBy
db.Orders.GroupBy(o => o.CustomerID).OrderBy(o => o.Count()).Select(g => new { g.Key, Count = g.Count() });

// Filtering on Key/Aggregate after GroupBy (Translates to Having clause in SQL)
db.Orders.GroupBy(o => o.CustomerID).Where(o => o.Count() > 0).Select(g => new { g.Key, Count = g.Count() });

// Projecting aggregate into nominal type
db.Orders.GroupBy(o => o.CustomerID).Select(g => new CustomerCountInfo { g.Key, Count = g.Count() });

And a few bugfixes - #11218 #11157 #11176

What is supported in 2.1.0-preview1

// Grouping by single column projecting aggregate or key
db.Orders.GroupBy(o => o.CustomerId).Select(g => g.Count());
db.Orders.GroupBy(o => o.CustomerId).Select(g => new { CustomerId = g.Key, Count = g.Count() });

// Grouping by multiple columns (using anonymous type) projecting aggregate or key or nested key
db.Orders.GroupBy(o => new { o.CustomerId, o.OrderDate }).Select(g => g.Sum(o => o.Cost));
db.Orders.GroupBy(o => new { o.CustomerId, o.OrderDate })
  .Select(g => new { CustomerId = g.Key.CustomerId, Sum = g.Sum(o => o.Cost) });
db.Orders.GroupBy(o => new { o.CustomerId, o.OrderDate })
  .Select(g => new { Key = g.Key, Sum = g.Sum(o => o.Cost) }); // Where Key will be anonymous object

// Grouping after complex query root
(from o in db.Orders.Where(o => o.OrderID < 10400).OrderBy(o => o.OrderDate).Take(100)
    join c in db.Customers.Where(c => c.CustomerID != "DRACD" && c.CustomerID != "FOLKO").OrderBy(c => c.City).Skip(10).Take(50)
        on o.CustomerID equals c.CustomerID
    group o by c.CustomerID)
.Select(g => new { g.Key, Count = g.Average(o => o.OrderID) });

db.Orders.OrderBy(o => o.OrderID)
    .Skip(80)
    .Take(500)
    .GroupBy(o => o.CustomerID)
    .Select(g => g.Max(o => o.OrderID));

// All above examples have group of entity types after GroupBy

// Selecting Group of anonymous types containing multiple columns
db.Orders.GroupBy(o => o.CustomerId, new {o.OrderDate, o.Price}).Select(g => g.Sum(t => t.Price));

Scenarios that we are not planning to improve in the 2.1 release

1. Grouping on constants (available in 2.1.0-preview2)
2. Grouping on an entity (e.g. a reference navigation property)
3. Projecting non-aggregate scalar subqueries after grouping, e.g. FirstOrDefault()
4. Making groups of multiple entityTypes using anonymous types.
5. Using Key/Aggregate values after GroupBy in joins (#10012)

All the scenarios above present different variations depending on what happens after the GroupBy, e.g. is there an aggregate function or not, is the key mentioned in the projection or not, etc. These scenarios will still result in client evaluation.

We would appreciate if customers that care about EF Core supporting any of those scenarios that are scoped out from 2.1 to create individual issues for them, up-vote them, and keep the discussion there.

Metadata

Metadata

Assignees

Labels

No labels
No labels

Projects

No projects

Milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions