Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How to perform multiple aggregates without a group by clause? #27117

Open
bachratyg opened this issue Jan 5, 2022 · 5 comments
Open

How to perform multiple aggregates without a group by clause? #27117

bachratyg opened this issue Jan 5, 2022 · 5 comments

Comments

@bachratyg
Copy link

Consider the following query:

from t in db.Table
group (int?)t.Value by 1 into tg
select new
{
	A = tg.Max(),
	B = tg.Count()
	C = tg.Sum()
}

EFC5 translation:

SELECT MAX(m.Value) AS A, COUNT(m.Value) AS B, COALESCE(SUM(m.Value), 0) AS C
FROM Table AS m

EFC6 translation:

SELECT MAX(t.Value) AS A, COUNT(*) AS B, COALESCE(SUM(m.Value), 0) AS C
FROM 
(
	SELECT m.Value, 1 AS Key
	FROM Table AS m
) AS t
GROUP BY t.Key

When the table (i.e. source being aggregated) is empty the EFC5 query returns (NULL, 0), the EFC6 query returns no rows.
Considering that EFC6 behavior matches the in-memory Enumerable behavior I guess this breaking change is intentional. However I would still need a mechanism to aggregate (without a GROUP BY clause) that always returns exactly 1 row whether the source is empty or not. How can I express this in LINQ?

This

from t in db.Table.DefaultIfEmpty()
group (int?)...

returns bad data: (NULL, 1)

This

(from t in ...
).DefaultIfEmpty()

throws an exception:

System.InvalidOperationException: Nullable object must have a value.
   at lambda_method97(Closure , QueryContext , DbDataReader , ResultContext , SingleQueryResultCoordinator )
   at Microsoft.EntityFrameworkCore.Query.Internal.SingleQueryingEnumerable`1.Enumerator.MoveNext()
   at System.Collections.Generic.List`1..ctor(IEnumerable`1 collection)
   at System.Linq.Enumerable.ToList[TSource](IEnumerable`1 source)
   at Program.Main()

Include your code

<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFramework>net6.0</TargetFramework>
  </PropertyGroup>
  <ItemGroup>
    <PackageReference Include="Microsoft.EntityFrameworkCore.SqlServer" Version="6.0.1" />
  </ItemGroup>
</Project>
class Program
{
	public class Table
	{
		public int Id { get; set; }
		public int? Value { get; set; }
	}
	public class TestDb : DbContext
	{
		public DbSet<Table> Table { get; set; }
		protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
		{
			optionsBuilder.UseSqlServer("data source=dummy");
		}
	}
	static void Main()
	{
		using var db = new TestDb();
		var q = from t in db.Table
		group (int?)t.Value by 1 into tg
		select new
		{
			A = tg.Max(),
			B = tg.Count(),
			C = tg.Sum(),
		};
		var items = q.ToList();
	}
}

Include provider and version information

EF Core version: 6.0.1
Database provider: Microsoft.EntityFrameworkCore.SqlServer 6.0.1
Target framework: .NET 6.0
Operating system: Windows 10 21H1
IDE: Visual Studio 2022 17.0.4

@bachratyg
Copy link
Author

Found a possible workaround

For the case when the grouping should be a standalone query:

from t in db.Table.Where(...).DefaultIfEmpty()
group t by t != null into g
select new
{
	Count = g.Count(a => a != null),
	Sum = g.Sum(a => a.Value),
}

translates to

SELECT COUNT(CASE
    WHEN [t].[Id] IS NOT NULL THEN 1
END) AS [Count], COALESCE(SUM([t].[Value]), 0) AS [Sum]
FROM (
    SELECT NULL AS [empty]
) AS [e]
LEFT JOIN [Table] AS [t] ON 1 = 1
GROUP BY CASE
    WHEN [t].[Id] IS NOT NULL THEN CAST(1 AS bit)
    ELSE CAST(0 AS bit)
END

Ugly, but semantically correct and without significant perf impact.

For the case when a grouped subquery is part of a larger query a simpler result is possible

from a in db.SomeQueryOrADummyTableWithASingleRow()
from b in from t in db.Table
          // Need an outer ref here. This can be a trivial condition e.g. a.Id == a.Id, in this case a WHERE clause is not generated
          // The outer ref can also be a nav property on 'a'
          .Where(t => t.Id > a.Id)
          .DefaultIfEmpty()
          group t by t != null into g
          select new
          {
              Count = g.Count(a => a != null),
              Sum = g.Sum(a => a.Value),
          }
select new
{
	b.Count,
	b.Sum,
}

translates to:

SELECT [t0].[Count], [t0].[Sum]
FROM [SomeQueryOrADummyTableWithASingleRow] AS [t]
OUTER APPLY (
    SELECT COUNT(1) AS [Count], COALESCE(SUM([t1].[Value]), 0) AS [Sum]
    FROM [Table] AS [t1]
    WHERE [t1].[Id] > [t].[Id]
) AS [t0]

Nice and clean, though I feel like I'm exploiting some unintended behavior here that may change without prior notice.

@Liero
Copy link

Liero commented Jun 3, 2023

I too have this question and according to stackoverflow and MSDN anwers others too.

@dfbolin
Copy link

dfbolin commented Jan 23, 2024

I have this issue as well, and I'm hoping there is a better answer now that EF Core 8 is out.

@bachratyg I was using a similar approach in EF Core 6 because it works well for joining multiple aggregates and cleanly handles an empty set (if no rows then 0 is returned for the Count, etc). As you suspected this behavior has changed after updating to EF Core 8.

The generated sql looks like this:

SELECT [t1].[Count], [t1].[Sum]
FROM [SomeQueryOrADummyTableWithASingleRow] AS [s]
OUTER APPLY (
    SELECT COUNT(1) AS [Count], COALESCE(SUM([t0].[Value]), 0) AS [Sum]
    FROM (
        SELECT [t].[Value], CAST(1 AS bit) AS [Key]
        FROM [Table] AS [t]
        WHERE [t].[Id] > [s].[Id]
    ) AS [t0]
    GROUP BY [t0].[Key]
) AS [t1]

The GROUP BY changes the results for an empty set. If no rows in the Table match the condition, then the Count and Sum are null, instead of 0. I'm stuck trying to rewrite queries with this syntax in EF Core 8.

@bachratyg
Copy link
Author

@dfbolin the outer ref hack indeed does not work anymore. You could use the standalone syntax as the subquery in the composite case, though it may not work as expected for your particular query. I've filed #32909 to address this.

@rstarkov
Copy link

rstarkov commented Apr 9, 2024

#13515 is related.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants