Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't repeat joins when using multiple Selects in a custom projection on the same entity #33608

Open
samusaran opened this issue Apr 24, 2024 · 1 comment

Comments

@samusaran
Copy link

samusaran commented Apr 24, 2024

Current situation

Currently I have an EF Query built like this:

        var query = GetOffersQuery(identities, loggedUser)
            .AsNoTracking();

        var dtoQuery = query.Select(x => new OfferDTO
            {
                Status = x.Status,
                CalculationDate = x.CalculationDate,
                CreatedDate = x.CreatedDate,
                ProductNames = x.Components.Select(c => c.ProductName),
                Quotations = x.Components.Select(c => c.Header.SalesDoc),
                BatteryTypes = x.Components.Select(c => c.BatteryType),
                ApproverName = x.Approver!.DisplayName,
                CustomerName = x.Customer.Name,
                Id = x.Id,
                SalesmanName = x.Salesman!.DisplayName,
                UploaderName = x.Uploader!.DisplayName,
                OpportunityNumber = x.OpportunityNumber,
                CountryCode = x.CountryCode
            })
            .AsSingleQuery();

Notice that ProductNames, Quotations and BatteryTypes all refers to the Components entity. The Components entity is alread Included inside GetOffersQuery.

This results in the following SQL Server query being generated:

      SELECT [t].[Status], [t].[CalculationDate], [t].[CreatedDate], [t].[Id], [u].[CountryCode], [u].[UserPrincipalName], [u0].[CountryCode], [u0].[UserPrincipalName], [u1].[CountryCode], [u1].[UserPrincipalName], [o1].[ProductName], [o1].[Id], [o2].[Header_SalesDoc], [o2].[Id], [o3].[BatteryType], [o3].[Id], [u].[DisplayName], [t].[Customer_Name], [u0].[DisplayName], [u1].[DisplayName], [t].[OpportunityNumber], [t].[CountryCode]
      FROM (
          SELECT TOP(@__p_1) [o].[Id], [o].[ApprovedBy], [o].[CalculationDate], [o].[CountryCode], [o].[CreatedBy], [o].[CreatedDate], [o].[OpportunityNumber], [o].[SalesEmployee], [o].[Status], [o].[Customer_Name]
          FROM [OfferHeader] AS [o]
          WHERE ([o].[CountryCode] IN (...) OR EXISTS (
              SELECT 1
              FROM [OfferGrantedUsers] AS [o0]
              WHERE [o0].[OfferId] = [o].[Id] AND [o0].[UserPrincipalName] = @__loggedUser_0)) AND [o].[Deleted] = CAST(0 AS bit)
          ORDER BY [o].[CalculationDate] DESC, [o].[Id] DESC
      ) AS [t]
      LEFT JOIN [UserHierarchyRoles] AS [u] ON [t].[CountryCode] = [u].[CountryCode] AND [t].[ApprovedBy] = [u].[UserPrincipalName]
      INNER JOIN [UserHierarchyRoles] AS [u0] ON [t].[CountryCode] = [u0].[CountryCode] AND [t].[SalesEmployee] = [u0].[UserPrincipalName]
      INNER JOIN [UserHierarchyRoles] AS [u1] ON [t].[CountryCode] = [u1].[CountryCode] AND [t].[CreatedBy] = [u1].[UserPrincipalName]
      LEFT JOIN [OfferComponent] AS [o1] ON [t].[Id] = [o1].[OfferId]
      LEFT JOIN [OfferComponent] AS [o2] ON [t].[Id] = [o2].[OfferId]
      LEFT JOIN [OfferComponent] AS [o3] ON [t].[Id] = [o3].[OfferId]
      ORDER BY [t].[CalculationDate] DESC, [t].[Id] DESC, [u].[CountryCode], [u].[UserPrincipalName], [u0].[CountryCode], [u0].[UserPrincipalName], [u1].[CountryCode], [u1].[UserPrincipalName], [o1].[Id], [o2].[Id]

Notice that OfferComponent table gets added once for every Select. In fact, if I remove one or more Selects inside the projection, the number of joins adjust accordingly.

Expected result

I would expect EF (or SqlClient) to only join the table once, since all 3 Selects are based on the same entity. The query I expect is the following:

      SELECT [t].[Status], [t].[CalculationDate], [t].[CreatedDate], [t].[Id], [u].[CountryCode], [u].[UserPrincipalName], [u0].[CountryCode], [u0].[UserPrincipalName], [u1].[CountryCode], [u1].[UserPrincipalName], [o1].[ProductName], [o1].[Id], [o1].[Header_SalesDoc], [o1].[BatteryType], [u].[DisplayName], [t].[Customer_Name], [u0].[DisplayName], [u1].[DisplayName], [t].[OpportunityNumber], [t].[CountryCode]
      FROM (
          SELECT TOP(@__p_1) [o].[Id], [o].[ApprovedBy], [o].[CalculationDate], [o].[CountryCode], [o].[CreatedBy], [o].[CreatedDate], [o].[OpportunityNumber], [o].[SalesEmployee], [o].[Status], [o].[Customer_Name]
          FROM [OfferHeader] AS [o]
          WHERE ([o].[CountryCode] IN (...) OR EXISTS (
              SELECT 1
              FROM [OfferGrantedUsers] AS [o0]
              WHERE [o0].[OfferId] = [o].[Id] AND [o0].[UserPrincipalName] = @__loggedUser_0)) AND [o].[Deleted] = CAST(0 AS bit)
          ORDER BY [o].[CalculationDate] DESC, [o].[Id] DESC
      ) AS [t]
      LEFT JOIN [UserHierarchyRoles] AS [u] ON [t].[CountryCode] = [u].[CountryCode] AND [t].[ApprovedBy] = [u].[UserPrincipalName]
      INNER JOIN [UserHierarchyRoles] AS [u0] ON [t].[CountryCode] = [u0].[CountryCode] AND [t].[SalesEmployee] = [u0].[UserPrincipalName]
      INNER JOIN [UserHierarchyRoles] AS [u1] ON [t].[CountryCode] = [u1].[CountryCode] AND [t].[CreatedBy] = [u1].[UserPrincipalName]
      LEFT JOIN [OfferComponent] AS [o1] ON [t].[Id] = [o1].[OfferId]
      ORDER BY [t].[CalculationDate] DESC, [t].[Id] DESC, [u].[CountryCode], [u].[UserPrincipalName], [u0].[CountryCode], [u0].[UserPrincipalName], [u1].[CountryCode], [u1].[UserPrincipalName], [o1].[Id]

This can benefit by reducing the number of reduntant joins.

I know that one way to fix it would be to project the 3 properties as a single DTO with 3 properties, to enumerate Components once, but this changes the output object definition (and in my case the objective of the DTO projection is exactly to have a flat object)

All tests have been done on:

  • EF Core 8.0.4
  • Microsoft.Data.SqlClient 5.1.5
@roji
Copy link
Member

roji commented May 2, 2024

Makes sense, we should indeed improve this; putting in the backlog for now. We may want to do this after getting rid of the nav expansion step, as this is likely to involve various work there.

As a minimal repro, for a LINQ query such as the following:

_ = await context.Blogs
    .Include(b => b.Posts)
    .Select(b => new
    {
        PostIds = b.Posts.Select(p => p.Id),
        PostTitles = b.Posts.Select(p => p.Title),
        PostRatings = b.Posts.Select(p => p.Rating)
    })
    .ToListAsync();

... we get the following SQL:

SELECT [b].[Id], [p].[Id], [p0].[Title], [p0].[Id], [p1].[Rating], [p1].[Id]
FROM [Blogs] AS [b]
LEFT JOIN [Post] AS [p] ON [b].[Id] = [p].[BlogId]
LEFT JOIN [Post] AS [p0] ON [b].[Id] = [p0].[BlogId]
LEFT JOIN [Post] AS [p1] ON [b].[Id] = [p1].[BlogId]
ORDER BY [b].[Id], [p].[Id], [p0].[Id]
Full minimal repro
await using var context = new BlogContext();
await context.Database.EnsureDeletedAsync();
await context.Database.EnsureCreatedAsync();

_ = await context.Blogs
    .Include(b => b.Posts)
    .Select(b => new
    {
        PostIds = b.Posts.Select(p => p.Id),
        PostTitles = b.Posts.Select(p => p.Title),
        PostRatings = b.Posts.Select(p => p.Rating)
    })
    .ToListAsync();

public class BlogContext : DbContext
{
    public DbSet<Blog> Blogs { get; set; }

    protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
        => optionsBuilder
            .UseSqlServer("Server=localhost;Database=test;User=SA;Password=Abcd5678;Connect Timeout=60;ConnectRetryCount=0;Encrypt=false")
            .LogTo(Console.WriteLine, LogLevel.Information)
            .EnableSensitiveDataLogging();
}

public class Blog
{
    public int Id { get; set; }
    public string Name { get; set; }

    public List<Post> Posts { get; set; }
}

public class Post
{
    public int Id { get; set; }
    public string Title { get; set; }
    public int Rating { get; set; }

    public Blog Blog { get; set; }
}

@roji roji added this to the Backlog milestone May 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants