Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Aggregates on multiple child tables produces invalid SQL #27163

Closed
jmussett opened this issue Jan 11, 2022 · 0 comments
Closed

Aggregates on multiple child tables produces invalid SQL #27163

jmussett opened this issue Jan 11, 2022 · 0 comments
Assignees
Labels
area-query closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. customer-reported Servicing-approved type-bug
Milestone

Comments

@jmussett
Copy link

jmussett commented Jan 11, 2022

Doing a .Select on multiple aggregates of different child tables after a .GroupBy produces invalid SQL.

Steps to reproduce

<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFramework>net6.0</TargetFramework>
  </PropertyGroup>
  <ItemGroup>
    <PackageReference Include="Microsoft.EntityFrameworkCore.SqlServer" Version="6.0.1" />
  </ItemGroup>
</Project>
using Microsoft.EntityFrameworkCore;
using System.Linq;

class Program
{
	public class Parent
	{
		public int Id { get; set; }
		public Child1 Child1 { get; set; }
		public Child2 Child2 { get; set; }
	}

	public class Child1
	{
		public int Id { get; set; }
		public string Value1 { get; set; }
	}

	public class Child2
	{
		public int Id { get; set; }
		public string Value2 { get; set; }
	}

	public class TestDb : DbContext
	{
		public DbSet<Parent> Table { get; set; }
		protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
		{
			optionsBuilder.UseSqlServer("data source=dummy");
		}
	}
	static void Main()
	{
		using var db = new TestDb();

		var q = db.Table
			.GroupBy(x => new { })
			.Select(g => new
			{
				Test1 = g
					.Select(x => x.Child1.Value1)
					.Distinct()
					.Count(),
				Test2 = g
					.Select(x => x.Child2.Value2)
					.Distinct()
					.Count()
			});

		var sql = q.ToQueryString();
	}
}

Expected translation (Approximation):

SELECT COUNT(DISTINCT ([c].[Value1])) AS [Test1], COUNT(DISTINCT ([c2].[Value2])) AS [Test2]
FROM (
    SELECT [t].[Child1Id], [t].[Child2Id], 1 AS [Key]
    FROM [Table] AS [t]
) AS [t0]
LEFT JOIN [Child1] AS [c] ON [t0].[Child1Id] = [c].[Id]
LEFT JOIN [Child2] AS [c2] ON [t0].[Child2Id] = [c2].[Id]
GROUP BY [t0].[Key]

Actual translation:

SELECT COUNT(DISTINCT ([c].[Value1])) AS [Test1], COUNT(DISTINCT ([c].[Value2])) AS [Test2]
FROM (
    SELECT [t].[Child1Id], 1 AS [Key]
    FROM [Table] AS [t]
) AS [t0]
LEFT JOIN [Child1] AS [c] ON [t0].[Child1Id] = [c].[Id]
GROUP BY [t0].[Key]

Provider and version information

EF Core version: 6.0.1
Database provider: Microsoft.EntityFrameworkCore.SqlServer 6.0.1
Target framework: .NET 6.0
Operating system: Windows 10 Pro
IDE: Visual Studio 2022 17.0.1

smitpatel added a commit that referenced this issue Jan 12, 2022
Earlier we only matched alias and skipped if alias matched. This could happen if the table name starts with same character.
Added logic to unwrap the join table and match exact table name/schema to identify if it is parent table
This may not be the best fix but it avoids different table matching and ends up generating invalid SQL in normal scenario.
The additional tables from navigation are going to be joinExpression only. If we find a shape which is not what we expect we fallback to previous behavior.
The inner table of joinExpression may not always be table (think of a case where there is query filter probably). Though fixing for a complicated case may cause instability in running queries. Deferring for that till customer reports. In most cases running into this bug will generate invalid SQL exception

Resolves #27163
@smitpatel smitpatel added closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. Servicing-consider labels Jan 12, 2022
@smitpatel smitpatel added this to the 6.0.x milestone Jan 12, 2022
smitpatel added a commit that referenced this issue Jan 14, 2022
Earlier we only matched alias and skipped if alias matched. This could happen if the table name starts with same character.
Even if we do deeper match unwinding joins, we cannot match if the join is to subquery (which can be generated when target of navigation has query filter).
Solution:
When applying group by on SelectExpression, remember the original table count. Once Groupby has been applied we cannot add more joins to SelectExpression other than group by aggregate term lifting.
During lifting,
- If aliases of original tables in parent and subquery doesn't match then we assume subquery is non-grouping element rooted and we don't lift.
- If parent have lifted additional joins, one of them being a subquery join, then we abort lifting if the subquery contains a join to lift which is a subquery.
- If we are allowed to join after first 2 checks then,
  - We copy over owned entity in initial tables
  - We try to match additional joins after initial if they are table joins and joining to same table, in which case we don't need to join them again.
  - We copy over all other joins.

Resolves #27163
@smitpatel smitpatel modified the milestones: 6.0.x, 6.0.3 Jan 26, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-query closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. customer-reported Servicing-approved type-bug
Projects
None yet
Development

No branches or pull requests

3 participants