Remove reference joins in split queries #29182

Open
stevendarby opened this issue Sep 22, 2022 · 5 comments

stevendarby commented Sep 22, 2022

If I include both reference navigations and collection navigations in a query and execute it with AsSplitQuery, the joins to the reference navigations are repeated in every collection query. This doesn't appear to be necessary and can slow the query down significantly when there are multiple reference navigations.

Repro:

using System;
using Microsoft.EntityFrameworkCore;
using Microsoft.Extensions.Logging;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

{
    using var context = new MyContext();

    if (context.Database.EnsureCreated())
    {
        var blogType = new BlogType { Type = "Development" };
        var blog = new Blog { Name = "EF", BlogType = blogType };
        var post = new Post { Title = "Split Query", Blog = blog };

        context.AddRange(blogType, blog, post);
        context.SaveChanges();
    }
}
{
    using var context = new MyContext();
    var result = context.Blogs
        .Include(x => x.BlogType)
        .Include(x => x.Posts)
        .AsSplitQuery()
        .ToList();
}

public class MyContext : DbContext
{
    public DbSet<Blog> Blogs { get; set; }

    protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
        => optionsBuilder
            .UseSqlServer("Server=.;Database=Split;Trusted_Connection=True;Encrypt=False")
            .LogTo(Console.WriteLine, LogLevel.Information);
}

public class Blog
{
    public int Id { get; set; }
    public string Name { get; set; }
    public int BlogTypeId { get; set; }
    public BlogType BlogType { get; set; }
    public ICollection<Post> Posts { get; set; }
}

public class Post
{
    public int Id { get; set; }
    public string Title { get; set; }
    public int BlogId { get; set; }
    public Blog Blog { get; set; }
}

public class BlogType
{
    public int Id { get; set; }
    public string Type { get; set; }
    public ICollection<Blog> Blogs { get; set; }
}

Produces this SQL:

SELECT [b].[Id], [b].[BlogTypeId], [b].[Name], [b0].[Id], [b0].[Type]
FROM [Blogs] AS [b]
INNER JOIN [BlogType] AS [b0] ON [b].[BlogTypeId] = [b0].[Id]
ORDER BY [b].[Id], [b0].[Id]

SELECT [p].[Id], [p].[BlogId], [p].[Title], [b].[Id], [b0].[Id]
FROM [Blogs] AS [b]
INNER JOIN [BlogType] AS [b0] ON [b].[BlogTypeId] = [b0].[Id]
INNER JOIN [Post] AS [p] ON [b].[Id] = [p].[BlogId]
ORDER BY [b].[Id], [b0].[Id]

I believe the second query could simply be:

SELECT [p].[Id], [p].[BlogId], [p].[Title], [b].[Id]
FROM [Blogs] AS [b]
INNER JOIN [Post] AS [p] ON [b].[Id] = [p].[BlogId]
ORDER BY [b].[Id]

Fewer joins, fewer selected columns, and a shorter ORDER BY would improve the query plan. I don't think the BlogType ID is required to match the Posts up to the Blog.
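
To illustrate the claim that the parent key alone is enough, here is a minimal in-memory sketch. It is not EF Core's actual materializer; the row tuples, the sample data, and the `stitched` result are hypothetical stand-ins for the two result sets of a split query. It only shows that post rows carrying `BlogId` can be attached to their blog without any `BlogType` column.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical parent rows, standing in for the first split query's result set.
var blogRows = new List<(int Id, string Name, int BlogTypeId)>
{
    (1, "EF", 10),
    (2, "Roslyn", 10),
};

// Hypothetical child rows, shaped like the simplified second query:
// post columns plus the parent key, with no BlogType columns at all.
var postRows = new List<(int Id, string Title, int BlogId)>
{
    (100, "Split Query", 1),
    (101, "Single Query", 1),
    (102, "Analyzers", 2),
};

// Group the child rows by parent key only.
var postsByBlog = postRows.ToLookup(p => p.BlogId);

// Attach each blog's posts using nothing but Blog.Id.
var stitched = blogRows
    .Select(b => (Blog: b.Name, Posts: postsByBlog[b.Id].Select(p => p.Title).ToList()))
    .ToList();

foreach (var (blog, posts) in stitched)
    Console.WriteLine($"{blog}: {string.Join(", ", posts)}");
```

In this sketch the `BlogTypeId` value is carried on the parent row but never consulted when matching, which mirrors the point made above about the second query.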

Note that this example uses only a single reference navigation. When several are included, the joins to all of them are repeated in each collection query, and the performance impact becomes significant.

The same behaviour also occurs with non-entity projections in split queries.

stevendarby changed the title from "Unnecessary joins in split queries" to "Unnecessary joins in split queries causing performance issues" on Sep 22, 2022
@stevendarby

Here's a quick example with a couple more reference navigations and an extra collection navigation, to show how the problem grows. The reference navigations are also optional in this example.

SELECT [b].[Id], [b].[AccountId], [b].[BlogTypeId], [b].[Name], [b].[OwnerId], [b0].[Id], [b0].[Type], [o].[Id], [o].[Name], [a].[Id], [a].[Name]
FROM [Blogs] AS [b]
LEFT JOIN [BlogType] AS [b0] ON [b].[BlogTypeId] = [b0].[Id]
LEFT JOIN [Owner] AS [o] ON [b].[OwnerId] = [o].[Id]
LEFT JOIN [Account] AS [a] ON [b].[AccountId] = [a].[Id]
ORDER BY [b].[Id], [b0].[Id], [o].[Id], [a].[Id]

SELECT [p].[Id], [p].[BlogId], [p].[Title], [b].[Id], [b0].[Id], [o].[Id], [a].[Id]
FROM [Blogs] AS [b]
LEFT JOIN [BlogType] AS [b0] ON [b].[BlogTypeId] = [b0].[Id]
LEFT JOIN [Owner] AS [o] ON [b].[OwnerId] = [o].[Id]
LEFT JOIN [Account] AS [a] ON [b].[AccountId] = [a].[Id]
INNER JOIN [Post] AS [p] ON [b].[Id] = [p].[BlogId]
ORDER BY [b].[Id], [b0].[Id], [o].[Id], [a].[Id]

SELECT [s].[Id], [s].[BlogId], [s].[Name], [b].[Id], [b0].[Id], [o].[Id], [a].[Id]
FROM [Blogs] AS [b]
LEFT JOIN [BlogType] AS [b0] ON [b].[BlogTypeId] = [b0].[Id]
LEFT JOIN [Owner] AS [o] ON [b].[OwnerId] = [o].[Id]
LEFT JOIN [Account] AS [a] ON [b].[AccountId] = [a].[Id]
INNER JOIN [Subscriber] AS [s] ON [b].[Id] = [s].[BlogId]
ORDER BY [b].[Id], [b0].[Id], [o].[Id], [a].[Id]

I believe this could be simplified to:

SELECT [b].[Id], [b].[AccountId], [b].[BlogTypeId], [b].[Name], [b].[OwnerId], [b0].[Id], [b0].[Type], [o].[Id], [o].[Name], [a].[Id], [a].[Name]
FROM [Blogs] AS [b]
LEFT JOIN [BlogType] AS [b0] ON [b].[BlogTypeId] = [b0].[Id]
LEFT JOIN [Owner] AS [o] ON [b].[OwnerId] = [o].[Id]
LEFT JOIN [Account] AS [a] ON [b].[AccountId] = [a].[Id]
ORDER BY [b].[Id], [b0].[Id], [o].[Id], [a].[Id]

SELECT [p].[Id], [p].[BlogId], [p].[Title], [b].[Id]
FROM [Blogs] AS [b]
INNER JOIN [Post] AS [p] ON [b].[Id] = [p].[BlogId]
ORDER BY [b].[Id]

SELECT [s].[Id], [s].[BlogId], [s].[Name], [b].[Id]
FROM [Blogs] AS [b]
INNER JOIN [Subscriber] AS [s] ON [b].[Id] = [s].[BlogId]
ORDER BY [b].[Id]
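
As a rough way to see how the join count grows, the arithmetic behind the two sets of queries above can be sketched as follows. The helper names `CurrentJoins` and `ProposedJoins` are hypothetical, and the formulas assume the simple shape shown here: one root entity, reference navigations included directly on it, and one join per collection.

```csharp
using System;

// Joins across all split queries as currently generated: the first query joins
// every reference, and each collection query repeats those joins plus one join
// to its own collection table.
int CurrentJoins(int references, int collections) => references + collections * (references + 1);

// Joins if each collection query joined only its own collection table.
int ProposedJoins(int references, int collections) => references + collections;

// The example above: 3 reference navigations (BlogType, Owner, Account)
// and 2 collection navigations (Post, Subscriber).
Console.WriteLine(CurrentJoins(3, 2));   // 11
Console.WriteLine(ProposedJoins(3, 2));  // 5
```

Counting the JOIN lines in the SQL above gives the same totals: 3 + 4 + 4 = 11 today versus 3 + 1 + 1 = 5 in the simplified form.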

ajcvickers added this to the Backlog milestone on Sep 27, 2022
stevendarby changed the title from "Unnecessary joins in split queries causing performance issues" to "Remove reference joins in split queries" on Oct 19, 2022

stevendarby commented Oct 19, 2022

Thanks for accepting this into the backlog. I believe this has come up at least once before and was closed, perhaps due to misunderstanding: #25731

Bringing query filters into it muddied the waters but I'm fairly sure the core issue there was about the unnecessary repetition of reference joins in each split query.

@stevendarby

Here's another that was possibly trying to get at this issue but struggled to describe it clearly: #24420

Hoping this can be considered for 8.0 👀

@stevendarby

FYI, while looking for duplicates of something else I found this one, which I hadn't seen before. It was closed because of the move to single query in 3.0: #12022


itsmegopi commented Feb 6, 2024

Is there any update on this, @stevendarby?
