Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Query: [Nav Prop Translation] Optional navigation translates to INNER JOIN instead of LEFT OUTER JOIN #3186

Closed
kennywangjin opened this issue Sep 22, 2015 · 20 comments
Assignees
Labels
closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. type-bug
Milestone

Comments

@kennywangjin
Copy link

I have model:

public Class Item

{

    public int Id {get; set;}

    public int? ParentId {get; set;}

    public virtual Item Parent {get;set;}

}

when the query dbContenxt.Items.Where(item=>item.ParentId==null || item.Parent.ParentId==null)

it returns just the items matching the second filter: item.Parent.ParentId==null, the first filter doesn't work!
if I remove the second filter and just use the first one, then I could get the items matching the first filter.

Is it a bug or any misunderstanding of the framework?

@rowanmiller rowanmiller added this to the Investigation milestone Sep 29, 2015
@maumar
Copy link
Contributor

maumar commented Sep 29, 2015

This is a legitimate bug. The problem is that we create the following sql:

SELECT [i].[Id], [i].[Name], [i].[ParentId]
FROM [Item] AS [i]
INNER JOIN [Item] AS [i.Parent] ON [i].[ParentId] = [i.Parent].[Id]
WHERE [i].[ParentId] IS NULL OR [i.Parent].[ParentId] IS NULL

where the correct sql would be:

SELECT [i].[Id], [i].[Name], [i].[ParentId]
FROM [Item] AS [i]
LEFT JOIN [Item] AS [i.Parent] ON [i].[ParentId] = [i.Parent].[Id]
WHERE [i].[ParentId] IS NULL OR [i.Parent].[ParentId] IS NULL

(LEFT JOIN instead of INNER JOIN)

@cda-1
Copy link

cda-1 commented Nov 26, 2015

Hey @maumar , I just want to be sure you know that the inner join problem crops up for all cases of projection on an optional navigation property, not just OP's case where it's in a Where clause. It seems you are aware but I wanted to be 100% sure and offer another example:

    return await Properties
        //.OrderBy(p => p.Id)
        //.Skip(20)
        .Select(p => new PropertyDto
        {
            Id = p.Id,
            ClientId = p.ClientId,
            ClientName = p.Client.Name,
            Name = p.Name,
            Address1 = p.Address1,
            Address2 = p.Address2,
            City = p.City,
            State = p.State,
            Zip = p.Zip,
            Phone = p.Phone,
            Fax = p.Fax,
            LocalPropertyId = p.LocalPropertyId,
            EmailAddresses = p.EmailAddresses,
            OwnerName = p.OwnerName,
            SendFileClient = p.SendFileClient,
            Units = p.Units,
            AgencyId = p.AgencyId,
            AgencyName = p.Agency.Name,
            VpEmployeeId = p.VpEmployeeId,
            VpEmployeeName = p.VpEmployee.LastName + ", " + p.VpEmployee.FirstName,
            RpmEmployeeId = p.RpmEmployeeId,
            RpmEmployeeName = p.RpmEmployee.LastName + ", " + p.RpmEmployee.FirstName
        })
        .ToListAsync();

In this example with my Property entity, ClientId is int (All properties have a client), but AgencyId, VpEmployeeId, and RpmEmployeeId are int? (Properties have optionally one Agency, optionally one Vice President, optionally one Regional Property Manager).

The code generated is

    SELECT [p].[Id], [p].[ClientId], [p.Client].[Name], [p].[Name], [p].[Address1], [p].[Address2], [p].[City], [p].[State], [p].[Zip], [p].[Phone], [p].[Fax], [p].[LocalPropertyId], [p].[EmailAddresses], [p].[OwnerName], [p].[SendFileClient], [p].[Units], [p].[AgencyId], [p.Agency].[Name], [p].[VpEmployeeId], ([p.VpEmployee].[LastName] + ', ') + [p.VpEmployee].[FirstName], [p].[RpmEmployeeId], ([p.RpmEmployee].[LastName] + ', ') + [p.RpmEmployee].[FirstName]
    FROM [Property] AS [p]
    INNER JOIN [Employee] AS [p.RpmEmployee] ON [p].[RpmEmployeeId] = [p.RpmEmployee].[Id]
    INNER JOIN [Employee] AS [p.VpEmployee] ON [p].[VpEmployeeId] = [p.VpEmployee].[Id]
    INNER JOIN [Agency] AS [p.Agency] ON [p].[AgencyId] = [p.Agency].[Id]
    INNER JOIN [Client] AS [p.Client] ON [p].[ClientId] = [p.Client].[Id]

But should be:

    SELECT [p].[Id], [p].[ClientId], [p.Client].[Name], [p].[Name], [p].[Address1], [p].[Address2], [p].[City], [p].[State], [p].[Zip], [p].[Phone], [p].[Fax], [p].[LocalPropertyId], [p].[EmailAddresses], [p].[OwnerName], [p].[SendFileClient], [p].[Units], [p].[AgencyId], [p.Agency].[Name], [p].[VpEmployeeId], ([p.VpEmployee].[LastName] + ', ') + [p.VpEmployee].[FirstName], [p].[RpmEmployeeId], ([p.RpmEmployee].[LastName] + ', ') + [p.RpmEmployee].[FirstName]
    FROM [Property] AS [p]
    LEFT JOIN [Employee] AS [p.RpmEmployee] ON [p].[RpmEmployeeId] = [p.RpmEmployee].[Id]
    LEFT JOIN [Employee] AS [p.VpEmployee] ON [p].[VpEmployeeId] = [p.VpEmployee].[Id]
    LEFT JOIN [Agency] AS [p.Agency] ON [p].[AgencyId] = [p.Agency].[Id]
    INNER JOIN [Client] AS [p.Client] ON [p].[ClientId] = [p.Client].[Id]

It seems to happen any time an optional navigation property is accessed.

Also, and I think it was already reported, but if you uncomment my skip/take lines, an exception is thrown: "ArgumentNullException: Value cannot be null. Parameter name: provider"

This means in order to paginate, I have to do my projection before skip, and I wonder if that is less optimal when the projection is complex.

I hope this helps and thank you guys for working so hard on all this!

@KyryloLehonkov
Copy link

I faced with the same issue in my project

@maumar
Copy link
Contributor

maumar commented Nov 30, 2015

@cda-1 yes we are aware of this as well. The problem is in the part of the code that translates navigation properties into joins, which is the same regardless of the operator.

The way it currently works is that we translate navigation property like:

from c in Customer
select c.Order.Id

into:

from c in Customer
join o in Order on c.Id equals o.CustomerId
select o.Name

and in this form the query gets passed to our relational pipeline for translation to SQL. Problem is that in Linq (to objects) the concept of LOJ doesn't exist on its own.

Proposed fix is to represent optional navigation using SelectMany-GroupJoin-DefaultIfEmpty combination and then collapse this pattern into LOJ in our relational pipeline. Problem is that this creates much more complex queries (mainly due to introduction of subqueries) and currently breaks for majority of non-trivial cases. Those bugs need to be addressed before we can fix the problem with navigation property expansion.

We do recognize this as a high priority bug, as it potentially returns incorrect results.

@cda-1
Copy link

cda-1 commented Nov 30, 2015

Thanks for the explanation! I didn't even consider that navigation property access would have to be translated to Linq-to-objects first. This does seem pretty tricky (but I already think your relational pipeline is magic).

@maumar
Copy link
Contributor

maumar commented Nov 30, 2015

@cda-1 We do this so our In-Memory provider works as well. We try to create a common tree (based on relinq), that all providers can consume (and further process to suit their needs). The better that common tree, the less code duplication in the providers themselves.

@cda-1
Copy link

cda-1 commented Nov 30, 2015

Makes perfect sense!

maumar added a commit that referenced this issue Dec 4, 2015
- Adding (disabled) tests for #3186

- Fixing bug in query that manifests when running InMemory - when creating joins, where key selectors have different types (e.g. int and int?) we were making incorrect casting at times which would cause exceptions if one of the values was null. Fix is to always convert to nullable type.

- Moving Navigation to foreign key access optimization to relational as it causes nullability problems in relational. Problem is that this translation can change the type of the result, e.g. customer.Detail.Id (int), would get transformed into customer.DetailId (int?).
We perform compensation and cast the result to non-nullable type, to preserve the correct type. This doesn't work on InMemory (trying to fit null value into non-nullable type).
maumar added a commit that referenced this issue Dec 4, 2015
- Adding (disabled) tests for #3186

- Fixing bug in query that manifests when running InMemory - when creating joins, where key selectors have different types (e.g. int and int?) we were making incorrect casting at times which would cause exceptions if one of the values was null. Fix is to always convert to nullable type.

- Moving Navigation to foreign key access optimization to relational as it causes nullability problems in relational. Problem is that this translation can change the type of the result, e.g. customer.Detail.Id (int), would get transformed into customer.DetailId (int?).
We perform compensation and cast the result to non-nullable type, to preserve the correct type. This doesn't work on InMemory (trying to fit null value into non-nullable type).
@maumar maumar mentioned this issue Dec 6, 2015
maumar added a commit that referenced this issue Dec 7, 2015
- Adding (disabled) tests for #3186

- Fixing bug in query that manifests when running InMemory - when creating joins, where key selectors have different types (e.g. int and int?) we were making incorrect casting at times which would cause exceptions if one of the values was null. Fix is to always convert to nullable type.

- Moving Navigation to foreign key access optimization to relational as it causes nullability problems in relational. Problem is that this translation can change the type of the result, e.g. customer.Detail.Id (int), would get transformed into customer.DetailId (int?).
We perform compensation and cast the result to non-nullable type, to preserve the correct type. This doesn't work on InMemory (trying to fit null value into non-nullable type).
maumar added a commit that referenced this issue Dec 7, 2015
- Adding (disabled) tests for #3186

- Fixing bug in query that manifests when running InMemory - when creating joins, where key selectors have different types (e.g. int and int?) we were making incorrect casting at times which would cause exceptions if one of the values was null. Fix is to always convert to nullable type.

- Moving Navigation to foreign key access optimization to relational as it causes nullability problems in relational. Problem is that this translation can change the type of the result, e.g. customer.Detail.Id (int), would get transformed into customer.DetailId (int?).
We perform compensation and cast the result to non-nullable type, to preserve the correct type. This doesn't work on InMemory (trying to fit null value into non-nullable type).
@rowanmiller rowanmiller modified the milestones: 7.0.0-rc2, 7.0.0 Dec 9, 2015
maumar added a commit that referenced this issue Dec 9, 2015
- Adding (disabled) tests for #3186

- Fixing bug in query that manifests when running InMemory - when creating joins, where key selectors have different types (e.g. int and int?) we were making incorrect casting at times which would cause exceptions if one of the values was null. Fix is to always convert to nullable type.

- Fixing bug in nav prop optimization (which was changing order.Customer.Id into order.CustomerId - before we used to do this for nav props which were only one level, non-composite, and which access PK on the other side of the navigatio (turning this into FK call). However we should not be doing that for optional navigations. In that case the initial Expression type is different than type after optimization (e.g. optional FK being nullable int, and PK being int). We could try to compensate for this in some cases (join, binary expression etc), but for other cases it's problematic (e.g. when projecting into anonymous type). Fix is to reduce the scope of the optimization, and only apply it for required navigations.
maumar added a commit that referenced this issue Dec 9, 2015
- Adding (disabled) tests for #3186

- Fixing bug in query that manifests when running InMemory - when creating joins, where key selectors have different types (e.g. int and int?) we were making incorrect casting at times which would cause exceptions if one of the values was null. Fix is to always convert to nullable type.

- Fixing bug in nav prop optimization (which was changing order.Customer.Id into order.CustomerId - before we used to do this for nav props which were only one level, non-composite, and which access PK on the other side of the navigatio (turning this into FK call). However we should not be doing that for optional navigations. In that case the initial Expression type is different than type after optimization (e.g. optional FK being nullable int, and PK being int). We could try to compensate for this in some cases (join, binary expression etc), but for other cases it's problematic (e.g. when projecting into anonymous type). Fix is to reduce the scope of the optimization, and only apply it for required navigations.

CR: Andrew
maumar added a commit that referenced this issue Dec 9, 2015
- Adding (disabled) tests for #3186

- Fixing bug in query that manifests when running InMemory - when creating joins, where key selectors have different types (e.g. int and int?) we were making incorrect casting at times which would cause exceptions if one of the values was null. Fix is to always convert to nullable type.

- Fixing bug in nav prop optimization (which was changing order.Customer.Id into order.CustomerId - before we used to do this for nav props which were only one level, non-composite, and which access PK on the other side of the navigatio (turning this into FK call). However we should not be doing that for optional navigations. In that case the initial Expression type is different than type after optimization (e.g. optional FK being nullable int, and PK being int). We could try to compensate for this in some cases (join, binary expression etc), but for other cases it's problematic (e.g. when projecting into anonymous type). Fix is to reduce the scope of the optimization, and only apply it for required navigations.

CR: Andrew
@jemiller0
Copy link

I'm running into the same problem. It happens for counts as well. All the JOINs in my case should be LEFT JOINs. The first thing I noticed is that counts were significantly slower than in EF 6. I think this is because count is a lot faster with LEFT JOINs. I tried putting a null check in to see if that might make a difference since it's a little weird to dereference the values like nulls don't matter since that would fail in LINQ to Objects. It still used a INNER JOIN although the query was made more complicated. I really hope this can be fixed without creating a bunch of nested JOINs. EF 6 is really terrible about that. I was pretty stoked that EF 7 looked greatly improved in that regard until I ran across this problem. The queries are much easier to read, and I'm assuming easier for the back end database to compile and optimize. Please figure out how to get this working without sub-queries.

Also, I thought EF 7 loaded the data differently where it would execute separate queries to pull in the foreign tables rather than doing big multiplies/joins. I'm guessing that doesn't work for projections? It seems like it would work better if it did.

        var q = from b in libraryContext.Bibs
                select new
                {
                    b.Id,
                    b.OclcNumber,
                    b.Title,
                    b.Author,
                    b.PublicationDate,
                    Language = b.Language.Name,
                    //Language = b.Language != null ? b.Language.Name : null,
                    b.LanguageId,
                    Format = b.Format.Name,
                    b.FormatId,
                    BibStatus = b.BibStatus.Name,
                    b.BibStatusId,
                    b.BibStatusLastWriteTime,
                    b.BibStatusLastWriteUserName,
                    b.IsPublic,
                    b.Type,
                    b.Level,
                    b.CreationTime,
                    b.CreationUserName,
                    b.LastWriteTime,
                    b.LastWriteUserName
                };

        BibsRadGrid.VirtualItemCount = q.Count();

SELECT COUNT(*)
FROM [Bibs] AS [b]
INNER JOIN [BibStatuses] AS [b.BibStatus] ON [b].[BibStatusId] = [b.BibStatus].[Id]
INNER JOIN [Formats] AS [b.Format] ON [b].[FormatId] = [b.Format].[Id]
INNER JOIN [Languages] AS [b.Language] ON [b].[LanguageId] = [b.Language].[Id]

        q = q.Skip(BibsRadGrid.PageSize * BibsRadGrid.CurrentPageIndex).Take(BibsRadGrid.PageSize);
        BibsRadGrid.DataSource = q.ToArray();

SELECT [b].[Id], [b].[OclcNumber], [b].[Title], [b].[Author], [b].[PublicationDate], [b.Language].[Name], [b].[LanguageId], [b.Format].[Name], [b].[FormatId], [b.BibStatus].[Name], [b].[BibStatusId], [b].[BibStatusLastWriteTime], [b].[BibStatusLastWriteUserName], [b].[IsPublic], [b].[Type], [b].[Level], [b].[CreationTime], [b].[CreationUserName], [b].[LastWriteTime], [b].[LastWriteUserName]
FROM [Bibs] AS [b]
INNER JOIN [BibStatuses] AS [b.BibStatus] ON [b].[BibStatusId] = [b.BibStatus].[Id]
INNER JOIN [Formats] AS [b.Format] ON [b].[FormatId] = [b.Format].[Id]
INNER JOIN [Languages] AS [b.Language] ON [b].[LanguageId] = [b.Language].[Id]
ORDER BY [b].[LastWriteTime] DESC
OFFSET 0 ROWS FETCH NEXT 100 ROWS ONLY

@jemiller0
Copy link

Regarding counts, it seems like even if the foreign key properties were required, it should still use LEFT rather than INNER JOINs. Otherwise, counts are a lot slower. I think this is a very common use case. For example, if you have a UI grid with paging. You always need to know the number of rows in the result set, excluding paging limits. This is a problem that I've ran into in the past with other databases like MySQL that have poor count performance compared to SQL Server. I hope this is a use case that is given priority.

@jemiller0
Copy link

I'm wondering if interceptors are supported in EF 7? I want to say this was a feature that was supposed to be improved over EF 6. Maybe one could use that to temporarily work around this problem? I.e. do a text replace of "INNER JOIN" with "LEFT JOIN".

@mrahhal
Copy link

mrahhal commented Jan 4, 2016

Any update on this? Is this going to make it soon?
I started using EF7 in a production project without realizing that LOJ are not supported and right now this is really causing me all sorts of awkward hacks when I want to query a model with an optional relationship. I'm also heavily depending on the InMemoryProvider in my tests, so this turned out to be a real pain.

@rowanmiller
Copy link
Contributor

@mrahhal this one is on the list of things to be fixed in RC2 (our next release)

@jemiller0
Copy link

There is no way this one couldn't be fixed in RTM. It's too blatant a bug. I suspect EF 7 is labeled RC to go along with other products such as ASP.NET 5. Otherwise, it doesn't seem to even be RC quality. There are major bugs still in it like this one that are very obvious if you try to do anything with it. Maybe there are just a few bugs like this that need fixing. But, it's definitely not usable at this point. Using this in production is crazy.

@ErikEJ
Copy link
Contributor

ErikEJ commented Jan 4, 2016

@jemiller0 Hold your horses - RC2 is before RTM

@jemiller0
Copy link

I know, I'm saying that an RC is supposed to be a candidate for release. One would expect major bugs to be gone by that time. This is more like beta quality.

@rowanmiller
Copy link
Contributor

@jemiller0 - no disagreement from me 😄... as you speculated, we are part of a bigger release this time around.

@jemiller0
Copy link

I just wanted to say thanks for all the developer's hard work though. Although it's not quite there yet, I think it will be a great product when it's released. I just hope once this gets fixed, the SQL doesn't have a bunch of extra nested joins like EF 6 did.

@mrahhal
Copy link

mrahhal commented Jan 4, 2016

@jemiller0 I completely agree. I just wish I knew about this before choosing to go with EF7. I hope this will be resolved as soon as possible (no need to wait for rc2, anyone can depend on a pre-release package during development).

@smitpatel smitpatel changed the title strange behavior for query Optional navigation translates to INNER JOIN instead of LEFT OUTER JOIN Jan 20, 2016
@rowanmiller rowanmiller changed the title Optional navigation translates to INNER JOIN instead of LEFT OUTER JOIN Query: [Nav Prop Translation] Optional navigation translates to INNER JOIN instead of LEFT OUTER JOIN Jan 25, 2016
@maumar
Copy link
Contributor

maumar commented Feb 19, 2016

Basic support have been added in c4e4447
There are still several outstanding issues, listed in the pull request description (most prominently Include not working with optional navigations and queries being client-evaluated more often)

@maumar maumar closed this as completed Feb 19, 2016
@cda-1
Copy link

cda-1 commented Feb 22, 2016

Though I haven't tried it yet, I'm thrilled with this. Thanks, @maumar!

@ajcvickers ajcvickers added the closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. label Oct 15, 2022
@ajcvickers ajcvickers modified the milestones: 1.0.0-rc2, 1.0.0 Oct 15, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. type-bug
Projects
None yet
Development

No branches or pull requests

9 participants