Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Includes are unpredictably ignored #12446

Closed
slimeygecko opened this issue Jun 21, 2018 · 19 comments
Closed

Includes are unpredictably ignored #12446

slimeygecko opened this issue Jun 21, 2018 · 19 comments
Labels
closed-no-further-action The issue is closed and no further action is planned. customer-reported

Comments

@slimeygecko
Copy link

@HalosGhost and I have been working on a project together and have found it increasingly difficult to guarantee some of the navigational properties in our queries. When I first read the documentation on ignoring includes (https://docs.microsoft.com/en-us/ef/core/querying/related-data#ignored-includes), I thought that only it was applicable when lazy loading was being used. I was quite surprised to learn that it wasn't.

It's unreasonable to say that .include() can be ignored just because it isn't referenced before it is ToListed.

I searched past issues but couldn't find the reasoning behind why .include()s can be ignored. Can you either shed some light on this or make a way to explicitly force the includes to be used? Without being sure that a navigational property is eagerly loaded, it is very hard to keep our project running correctly.

Steps to reproduce

Exception / Warning message:
> The Include operation for navigation '[x].Distributor.LevelTypeLookup' is unnecessary and was ignored because the navigation is not reachable in the final query results. See https://go.microsoft.com/fwlink/?linkid=850303 for more information.

By running the code below, the exception/warning above is logged during runtime, and the LevelTypeLookup navigational property on the Distributor is null.

var tree = _services.GetService<AdHocService>().GetTreeNodes("123456")
    .Select(x => x.Distributor)
    .ToList();

var level = tree.FirstOrDefault()?.LevelTypeLookup?.Name;

Here's the resulting query:

SELECT [x.Distributor].[Id], [x.Distributor].[CompanyName], [x.Distributor].[ContractTypeLookupId], [x.Distributor].[CreatedBy], [x.Distributor].[CreatedOn], [x.Distributor].[Cycle], [x.Distributor].[EnrollmentTypeLookupId], [x.Distributor].[FamilyTypeLookupId], [x.Distributor].[IsBusiness], [x.Distributor].[LevelTypeLookupId], [x.Distributor].[ModifiedBy], [x.Distributor].[ModifiedOn], [x.Distributor].[SponsorId], [x.Distributor].[SpousalAgreement], [x.Distributor].[StartDate], [x.Distributor].[StatusTypeLookupId]
FROM (
    select * from dbo.GetTreeNodes(@p0, @p1)
) AS [x]
LEFT JOIN [Distributor] AS [x.Distributor] ON [x].[DistributorId] = [x.Distributor].[Id]

I was surprised to find that when the .ToList() is moved before the .Select(), then the LevelTypeLookup property is retrieved and accessible.

 var tree = _services.GetService<AdHocService>().GetTreeNodes("123456")
+     .ToList()
+     .Select(x => x.Distributor);
-     .Select(x => x.Distributor)
-     .ToList();

Method in my repository:

public IQueryable<TreeNode> GetTreeNodes(string distributorId, DateTime? asOf = null) =>
    _repo.TreeNodes
        .Include(x => x.Distributor)
        .ThenInclude(x => x.LevelTypeLookup)
        .FromSql("select * from dbo.GetTreeNodes({0}, {1})", distributorId, asOf ?? DateTime.Now);

The models:

    public class TreeNode
    {
        public string DistributorId { get; set; }
        public string ParentId { get; set; }
        public int NumberOfChildren { get; set; }

        public virtual Distributor Distributor { get; set; }
    }

    public class Distributor
    {
        public string Id { get; set; }
        public int LevelTypeLookupId { get; set; }
        public string CompanyName { get; set; }

        [ForeignKey("LevelTypeLookupId")]
        public virtual LevelTypeLookup LevelTypeLookup { get; set; }
    }

    public class LevelTypeLookup
    {
        public int Id { get; set; }
        public string Abbreviation { get; set; }
        public string Name { get; set; }
    }   

within the DbContext:

        ...
        public virtual DbQuery<TreeNode> TreeNodes { get; set; }

        protected override void OnModelCreating(ModelBuilder modelBuilder)
        {
            base.OnModelCreating(modelBuilder);
            new TreeNodeConfig(modelBuilder);
        }

TreeNodeConfig.cs:

    public class TreeNodeConfig
    {
        public TreeNodeConfig(ModelBuilder modelBuilder)
        {
            modelBuilder.Query<TreeNode>()
                .Property(e => e.DistributorId)
                .HasColumnName("DistributorId");

            modelBuilder.Query<TreeNode>()
                .Property(e => e.ParentId)
                .HasColumnName("ManagerId");

            modelBuilder.Query<TreeNode>()
                .Property(e => e.NumberOfChildren)
                .HasColumnName("NumberOfChildren");
        }
    }

Further technical details

EF Core version: 2.1.0
Database Provider: Microsoft.EntityFrameworkCore.SqlServer
Operating system: Windows 10
IDE: Visual Studio 2017 15.7.4

@smitpatel
Copy link
Member

You applied Include on TreeNode entity type and you are not projecting out TreeNode in your final projection. You are selecting Distribution. Hence include got ignored. (When you do ToList without any Select it implies .Select(e => e) which is equivalent of projecting out TreeNode)

@slimeygecko
Copy link
Author

Why is this intended behavior?

@smitpatel
Copy link
Member

You specify include on entity type you want. To include navigation on distribution above, do Select before Include and it will load nav prop.

@HalosGhost
Copy link

If that is the case, why allow chained ThenIncludes? Unless projecting to the deepest property, all of those would be thrown away regardless, no?

I guess what we're balking at is that eliding a side-effect (especially one as fundamental as "fetch this data”) is hugely surprising.

@ajcvickers
Copy link
Member

@HalosGhost The semantics of Include are, "when Include is used on an IQueryable<T>, then when returning entities of type T, also bring back related entities as specified in the Include calls."

If the query does not return entities of type T, then Include does not apply.

@slimeygecko
Copy link
Author

Isn't this a huge pitfall for developers (with many devs who probably have used Entity Framework in the past and are now moving to EFCore)? I thought Entity Framework always respected Includes (but I could be wrong).

@ajcvickers
Copy link
Member

@slimeygecko EF6 and all previous versions work in basically the same way.

@HalosGhost
Copy link

I would also be interested in knowing why this is the case.

It does not, for example, always avoid fetching extra data (for example, if you use a Where clause before the Select which operates on a given navigational property, that property will still need to be fetched for the evaluation of that clause, it will just have its data thrown away after the Selection).

What's the rationale behind this behavior?

@ajcvickers
Copy link
Member

@HalosGhost Depends what you mean by "fetched". If you project out the navigation, then the data will be returned. If the navigation is used in the projection as part of the query, then it may not result in the data for that navigation being returned. They are really two different things.

@HalosGhost
Copy link

I think we may have misunderstood one another there; but I'm not worried about how it works anymore. I'm interested in why it was designed to work this way rather than just actually loading the requested data.

Can you shed some light on why it was designed to work this way?

@vsopko
Copy link

vsopko commented Jun 25, 2018

@ajcvickers, could you please clarify correct usage of projections and navigations based on your many to many implementations found here? Your are making joined navigation properties private, then include them via fluent api Include() string method, and everything works fine until we are not trying to use this with any projections, because in this case we will fall into the situation with ignored includes, explained
in docs
where navigation will never be loaded due they are private and cannot be used enywhere in projection. Is there any workaroud or am i missing something?

@ajcvickers
Copy link
Member

@vsopko I'm not able to fully understand what you are asking. I don't see any real connection between the many-to-many implementation and ignored Includes. Making a navigation private has no bearing on whether or not an Include will be ignored.

@HalosGhost I'm not sure on the reasoning for why it works this way. I will follow up on that.

@HalosGhost
Copy link

I appreciate it. I certainly find this behavior very surprising (as I said above, to ignore an explicitly-requested side-effect is not at all what I would expect).

@ajcvickers
Copy link
Member

@HalosGhost and others. I'm going to try to explain this based on our discussion. Consider this model:

public class Blog
{
    [DatabaseGenerated(DatabaseGeneratedOption.None)]
    public int Id { get; set; }

    public virtual ICollection<Post> OldPosts { get; set; }
    public virtual ICollection<Post> FeaturedPosts { get; set; }
}

public class Post
{
    [DatabaseGenerated(DatabaseGeneratedOption.None)]
    public int Id { get; set; }

    public Author Author { get; set; }
}

public class Author
{
    [DatabaseGenerated(DatabaseGeneratedOption.None)]
    public int Id { get; set; }
}

Notice that there are two ways to get from a blog to a post--the OldPosts relationship and the FeaturedPosts relationship. Let's populate this with model with some data:

using (var context = new BloggingContext())
{
    context.Database.EnsureDeleted();
    context.Database.EnsureCreated();

    var post1 = new Post { Id = 1, Author = new Author { Id = 1 } };
    var post2 = new Post { Id = 2, Author = new Author { Id = 2 } };
    var post3 = new Post { Id = 3, Author = new Author { Id = 3 } };
    var post4 = new Post { Id = 4, Author = new Author { Id = 4 } };

    context.Add(
        new Blog
        {
            Id = 1,
            OldPosts = new List<Post> { post1, post2 },
            FeaturedPosts = new List<Post> { post3, post4 }
        });
        
    context.SaveChanges();
}

Notice that some posts are related to blogs through one relationship, some through the other. Now lets do a simple query with Include:

using (var context = new BloggingContext())
{
    var blog = context.Blogs
        .Include(e => e.FeaturedPosts).ThenInclude(e => e.Author)
        .Include(e => e.OldPosts)
        .First();

    foreach (var post in blog.OldPosts)
    {
        Console.WriteLine($"Old post: {post.Id} Author: {post.Author?.Id}");
    }

    foreach (var post in blog.FeaturedPosts)
    {
        Console.WriteLine($"Featured post: {post.Id} Author: {post.Author?.Id}");
    }
}

The output from this on my machine is:

Old post: 1 Author:
Old post: 2 Author:
Featured post: 3 Author: 3
Featured post: 4 Author: 4

Notice that both posts accessed through the FeaturedPosts relationship have the author included. This is because the Include for FeaturedPosts also has a ThenInclude for Author: .Include(e => e.FeaturedPosts).ThenInclude(e => e.Author)

However, neither post accessed only through the OldPosts relationship have the author included, since the Include for this relationship does not have an associated ThenInclude for Author: .Include(e => e.OldPosts)

Hopefully all that is quite clear. 😄

So what happens if we change the shape of the query with a projection? Consider for example:

using (var context = new BloggingContext())
{
    var oldPosts = context.Blogs
        .Include(e => e.FeaturedPosts).ThenInclude(e => e.Author)
        .Include(e => e.OldPosts)
        .SelectMany(e => e.OldPosts)
        .ToList();

    var featuredPosts = context.Blogs
        .Include(e => e.FeaturedPosts).ThenInclude(e => e.Author)
        .Include(e => e.OldPosts)
        .SelectMany(e => e.FeaturedPosts)
        .ToList();

    foreach (var post in oldPosts)
    {
        Console.WriteLine($"Post: {post.Id} Author: {post.Author?.Id}");
    }

    foreach (var post in featuredPosts)
    {
        Console.WriteLine($"Featured post: {post.Id} Author: {post.Author?.Id}");
    }
}

As described above, the Include is dropped, so the output on my machine is:

Old post: 1 Author:
Old post: 2 Author:
Featured post: 3 Author: 
Featured post: 4 Author: 

That is, no authors have been loaded because the Include was ignored.

Now, a simplistic suggestion for not ignoring Include is to say something like, "If an entity type in the final projection is also contained in some Include path, then use the remainder of that Include path." Doing so in this case would mean that since Post is in the final projection, and Post is in one of the Include paths, then Author should be included in the results because it exists as a ThenInclude on Post.

However, there are two issues with this. First, if the type is used in multiple Include paths, then which one should be used? Ignoring that for the moment, and looking at the output if we did this:

Old post: 1 Author: 1
Old post: 2 Author: 2
Featured post: 3 Author: 3
Featured post: 4 Author: 4

Why, in this case, are authors being Included for old posts? This is wrong--there was never a ThenInclude for the OldPosts relationship, so they should not have Author included. This is the second and more significant issue.

But what if we use information about the projection to drive this? For example, looking again at this:

var oldPosts = context.Blogs
    .Include(e => e.FeaturedPosts).ThenInclude(e => e.Author)
    .Include(e => e.OldPosts)
    .SelectMany(e => e.OldPosts)
    .ToList();

we are selecting OldPosts, in the projection, so we could use this to determine which Include path to use--in this case, the one that does not also include Author. Likewise, for this query:

var featuredPosts = context.Blogs
    .Include(e => e.FeaturedPosts).ThenInclude(e => e.Author)
    .Include(e => e.OldPosts)
    .SelectMany(e => e.FeaturedPosts)
    .ToList();

we would use the other Include path, which would include Author.

This would likely work for all cases except for entirely uncorrelated types in the projection. However, the implementation is very complicated, and as the length of this post shows, the mental model for figuring out what is going on is also complicated.

This is why, rather than attempting to implement something like this, we believe a better approach is as described in #2953. That is, if a relationship should always be included when an entity of a given type is loaded, regardless of where it came from, then that should be specified in a different API that allows such rules to be expressed independently of the shape of the query.

@HalosGhost
Copy link

@ajcvickers, thank you for the long and detailed reasoning and explanation. This is yet another point of significant disappointment I've run into with Entity Framework, but I recognize that this (along with many other pieces) is not necessarily a simple problem.

Consider this my vote to close.

@smitpatel
Copy link
Member

@ntimmerman - Please file a new issue with a repro code which demonstrate the issue you are seeing.

@ntimmerman
Copy link

@ntimmerman - Please file a new issue with a repro code which demonstrate the issue you are seeing.

The problem was with my data. I had some unexpected junk data causing my problem. Sorry for the trouble.

@jon-peel
Copy link

Please help, I really can't figure this out.

 var query = _db.Orders
                .Include(o => o.Items)
                .OrderBy(o => o.OrderDate)
                .ToArray();

In this, Items is an empty collection for each order.

var tt = _db.Orders
                .Select(x => new {x.Member, x.Items.Count})
                .ToArray();

This shows the counts correctly.

Please tell me what is wrong with the first code.

@ajcvickers
Copy link
Member

@Thorocaine Please open a new issue and include a small, runnable project/solution or complete code listing and we will investigate.

@ajcvickers ajcvickers reopened this Oct 16, 2022
@ajcvickers ajcvickers closed this as not planned Won't fix, can't repro, duplicate, stale Oct 16, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
closed-no-further-action The issue is closed and no further action is planned. customer-reported
Projects
None yet
Development

No branches or pull requests

7 participants