Skip to content

WHERE conditions are duplicated when nested query is reused in LINQ #31646

@miegir

Description

@miegir

The following program prints a certain SQL query to the console:

using Microsoft.EntityFrameworkCore;

using var db = new AppDbContext();

var query =
    from campaign in db.Campaigns
    where campaign.Id == 1
    select new
    {
        campaign.Id,
        Counts = new Dictionary<int, int>((
        from customer in db.Customers
        let histories = campaign.Histories.Where(h => h.CustomerId == customer.Id) // reused subquery
        where histories.Any()
        group customer.Id by histories.Select(h => (int?)h.Result).FirstOrDefault() ?? 0 into g
        select new KeyValuePair<int, int>(g.Key, g.Distinct().Count())).ToList()),
    };

Console.WriteLine(query.ToQueryString());

class Customer
{
    public int Id { get; set; }
}

class Campaign
{
    public int Id { get; set; }
    public List<CampaignHistory> Histories { get; set; } = new();
}

class CampaignHistory
{
    public int Id { get; set; }
    public int CustomerId { get; set; }
    public int Result { get; set; }
}

class AppDbContext : DbContext
{
    public DbSet<Customer> Customers { get; set; }
    public DbSet<Campaign> Campaigns { get; set; }

    protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
    {
        optionsBuilder.UseNpgsql();
    }
}

This program outputs the following SQL (without the comment of course):

SELECT c."Id", t0."Key", t0.c
FROM "Campaigns" AS c
LEFT JOIN LATERAL (
    SELECT t."Key", count(DISTINCT t."Id")::int AS c
    FROM (
        SELECT c0."Id", COALESCE((
            SELECT c1."Result"
            FROM "CampaignHistory" AS c1
            -- conditions are duplicated here:
            WHERE c."Id" = c1."CampaignId" AND c1."CustomerId" = c0."Id" AND c."Id" = c1."CampaignId" AND c1."CustomerId" = c0."Id"
            LIMIT 1), 0) AS "Key"
        FROM "Customers" AS c0
        WHERE EXISTS (
            SELECT 1
            FROM "CampaignHistory" AS c2
            WHERE c."Id" = c2."CampaignId" AND c2."CustomerId" = c0."Id")
    ) AS t
    GROUP BY t."Key"
) AS t0 ON TRUE
WHERE c."Id" = 1
ORDER BY c."Id"

The SQL is generally correct, but the conditions in the WHERE clause are duplicated for "CampaignHistory" AS c1 while are not duplicated for "CampaignHistory" AS c2. There is no such duplication in the original LINQ query. This seems like a minor bug in the EF.

Include provider and version information

EF Core version: 7.0.4
Database provider: Npgsql.EntityFrameworkCore.PostgreSQL (this problem can be reproduced with any other provider including Microsoft.EntityFrameworkCore.SqlServer)
Target framework: .NET 7.0
Operating system: Windows 11
IDE: Visual Studio 2022 17.7.3

Metadata

Metadata

Assignees

No one assigned

    Projects

    No projects

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions