Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Union/Concat throwing exception when using includes/joins #30771

Closed
CorsairRO opened this issue Apr 26, 2023 · 10 comments
Closed

Union/Concat throwing exception when using includes/joins #30771

CorsairRO opened this issue Apr 26, 2023 · 10 comments

Comments

@CorsairRO
Copy link

CorsairRO commented Apr 26, 2023

File a bug

EF Core cannot translate to sql a simple UNION / UNION ALL ?
Projection is to the SAME class that is NOT a model one.
If not using Include or Join then it translates correctly.

Include your code

I have a class like this and would like to select in a single query from 2 different DbSets

public MyProjectionClass {
       public int Id {get;set;}
       public string Name {get;set;}
}

The following Code works fine ( removed async and await for simplicity )

var aList = ctx.MyDbSet1
            .Select(
                  x=> new MyProjectionClass{
                        Id = x.DbSet1Id
                  }
            )
            .Concat(
                        ctx.MyDbSet2
                                  .Select(
                                          x=> new MyProjectionClass{
                                                       Id = x.DbSet2Id
                                          }
                                   )
            ).ToList()          

Generates correctly:

SELECT t1.DbSet1Id FROM DbSet1
UNION ALL
SELECT t2.DbSet2Id FROM DbSet2

Adding ANY include stops working throwing exception with unable to use set operator?

var aList = ctx.MyDbSet1.Include(x=>x.OtherTable1)
            .Select(
                  x=> new MyProjectionClass{
                        Id = x.DbSet1Id,
                        Name = x.OtherTable1.Name
                  }
            )
            .Concat(
                        ctx.MyDbSet2.Include(x=>x.OtherTable2)
                                  .Select(
                                          x=> new MyProjectionClass{
                                                       Id = x.DbSet2Id
                                                       Name = x.OtherTable2.Name
                                          }
                                   )
            ).ToList()          

Expected Result is:

SELECT t1.DbSet1Id, lk1.Name FROM DbSet1 JOIN OtherTable1 AS lk1 ON t1.DbSet1Id = lk1.DbSet1Id
UNION ALL
SELECT t2.DbSet2Id, lk1.Name FROM DbSet2 JOIN OtherTable2 AS lk2 ON t2.DbSet2Id = lk2.DbSet2Id

Include stack traces

If i dont use my OWN class for Both projections, i could understand that SQL Server cannot handle the UNIONs.
But if i use my own class that HAS the same properties then why does it complain?

Please help me with some ideas, as i have Plenty of code in edmx old version that calls SP to return unioned rows but i would like to use EF Core and Linq to discontinue them as they just do a select from a UNION select FROM b

System.InvalidOperationException : Unable to translate set operations when both sides don't assign values to the same properties in the nominal type. Please make sure that the same properties are included on both sides, and consider assigning default values if a property doesn't require a specific value.

Include verbose output

Include provider and version information

EF Core version: 7.0.5
Database provider: Microsoft.EntityFrameworkCore.SqlServer
Target framework: NET 7.0
Operating system: Windows 11
IDE: Visual Studio 2022 17.5

@roji
Copy link
Member

roji commented Apr 26, 2023

@CorsairRO can you please post the code which doesn't work so we can see what you're doing? Please include a minimal model as well to make sure we have a runnable code sample.

@CorsairRO
Copy link
Author

CorsairRO commented Apr 26, 2023

OK, i am not very familiar with code first models and migrations as i work with existing DB, so please let me know if the example is ok as it is.

Basically the problem appears with ANY include that i have in any branches of the UNION.
If i would expect the DB to automatically do that operation, i would expect the error message.
But if i am specifying explicitly to BOTH branches to return the same class ( OrderInfo ) i dont understand why it crashes insted of generating the UNION ALL between the 2 branches with tens of joins if i want, because all what it needs to do is to project only my columns from my Projection class, and not trying to union other properties from other classes found in the heirarchy.

Basically i tried Union and Concat using the same strategy with having a predefined projection class and it quite find them useless if they dont support this scenario.

i mean, why i would like to UNION only the 2 tables?
It is Much easier to run 2 queries then and union them clientside.
But i was expecting EF Core to help me to be performant and flexible to get rid of legacy stored procedures while moving from old EF.

using Microsoft.EntityFrameworkCore;
using System.ComponentModel.DataAnnotations;

using var ctx = new MyContext( );

// UNION ALL works fine - No Includes or Joins in method syntax
var working = await
	ctx.Orders		
		.Select( x =>
			new OrderInfo {
				Id = x.OrderId				
			}
		)
		.Concat(
			ctx.SalesOrders				
				.Select( x =>
					new OrderInfo {
						Id = x.SalesOrderId						
					}
				)
		)
		.ToListAsync( );

// Union All - Crashing - Any Include in any branch, even if i specify explicitly same projection class in both
var fail = await
	ctx.Orders
		.Include( x => x.Current_Account )
		.Select( x =>
			new OrderInfo {
				Id = x.OrderId ,
				AccountName = x.Current_Account.VendorName
			}
		)
		.Concat(
			ctx.SalesOrders
				.Include(x=>x.Current_Account)
				.Select( x =>
					new OrderInfo {
						Id = x.SalesOrderId ,
						AccountName = x.Current_Account.VendorName
					}
				)
		)
		.ToListAsync( );

public class MyContext : DbContext {

	public DbSet<Order> Orders { get; set; }

	public DbSet<SalesOrder> SalesOrders { get; set; }

	public DbSet<Account> Accounts { get; set; }

	protected override void OnModelCreating( ModelBuilder modelBuilder ) {
		base.OnModelCreating( modelBuilder );

		modelBuilder.Entity<Order>( )
			.HasOne( x => x.Current_Account )
			.WithMany( x => x.Orders )
			.HasForeignKey( x => x.AccountId );

		modelBuilder.Entity<SalesOrder>( )
			.HasOne( x => x.Current_Account )
			.WithMany( x => x.SalesOrders )
			.HasForeignKey( x => x.AccountId );

	}

}

/// <summary>
/// My Projection Class that i expect in the UNION result of the selects
/// </summary>
public class OrderInfo {

	public int Id { get; set; }

	public string AccountName { get; set; }

}

public class Order {

	[Key]
	public int OrderId { get; set; }

	public int AccountId { get; set; }

	public virtual Account Current_Account { get; set; }

}

public class SalesOrder {

	[Key]
	public int SalesOrderId { get; set; }

	public int AccountId { get; set; }

	public virtual Account Current_Account { get; set; }

}

public class Account {

	[Key]
	public int AccountId { get; set; }

	public string VendorName { get; set; }

	public List<Order> Orders { get; set; } = new( );

	public List<SalesOrder> SalesOrders { get; set; } = new( );

}

@ajcvickers
Copy link
Member

@CorsairRO I am not able to reproduce this. I put your code into a runnable application and it does not crash. The code and output is below.

using var ctx = new MyContext( );

ctx.Database.EnsureDeleted();
ctx.Database.EnsureCreated();

// UNION ALL works fine - No Includes or Joins in method syntax
var working = await
    ctx.Orders
        .Select(
            x =>
                new OrderInfo { Id = x.OrderId }
        )
        .Concat(
            ctx.SalesOrders
                .Select(
                    x =>
                        new OrderInfo { Id = x.SalesOrderId }
                )
        )
        .ToListAsync();

// Union All - Crashing - Any Include in any branch, even if i specify explicitly same projection class in both
var fail = await
    ctx.Orders
        .Include(x => x.Current_Account)
        .Select(
            x =>
                new OrderInfo { Id = x.OrderId, AccountName = x.Current_Account.VendorName }
        )
        .Concat(
            ctx.SalesOrders
                .Include(x => x.Current_Account)
                .Select(
                    x =>
                        new OrderInfo { Id = x.SalesOrderId, AccountName = x.Current_Account.VendorName }
                )
        )
        .ToListAsync();

public class MyContext : DbContext
{
    public DbSet<Order> Orders { get; set; }
    public DbSet<SalesOrder> SalesOrders { get; set; }
    public DbSet<Account> Accounts { get; set; }

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        modelBuilder.Entity<Order>()
            .HasOne(x => x.Current_Account)
            .WithMany(x => x.Orders)
            .HasForeignKey(x => x.AccountId);

        modelBuilder.Entity<SalesOrder>()
            .HasOne(x => x.Current_Account)
            .WithMany(x => x.SalesOrders)
            .HasForeignKey(x => x.AccountId);
    }

    protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
    {
        optionsBuilder
            .UseSqlServer(@"Data Source=(LocalDb)\MSSQLLocalDB;Database=AllTogetherNow")
            .LogTo(Console.WriteLine, LogLevel.Information)
            .EnableSensitiveDataLogging();
    }
}

public class OrderInfo
{
	public int Id { get; set; }
	public string AccountName { get; set; }
}

public class Order
{
	public int OrderId { get; set; }
	public int AccountId { get; set; }
	public virtual Account Current_Account { get; set; }
}

public class SalesOrder
{
	public int SalesOrderId { get; set; }
	public int AccountId { get; set; }
	public virtual Account Current_Account { get; set; }
}

public class Account
{
	public int AccountId { get; set; }
	public string VendorName { get; set; }
	public List<Order> Orders { get; set; } = new( );
	public List<SalesOrder> SalesOrders { get; set; } = new( );
}
warn: 4/26/2023 19:34:12.651 CoreEventId.SensitiveDataLoggingEnabledWarning[10400] (Microsoft.EntityFrameworkCore.Infrastructure)
      Sensitive data logging is enabled. Log entries and exception messages may include sensitive application data; this mode should only be enabled during development.
info: 4/26/2023 19:34:13.096 RelationalEventId.CommandExecuted[20101] (Microsoft.EntityFrameworkCore.Database.Command)
      Executed DbCommand (19ms) [Parameters=[], CommandType='Text', CommandTimeout='30']
      SELECT 1
info: 4/26/2023 19:34:13.145 RelationalEventId.CommandExecuted[20101] (Microsoft.EntityFrameworkCore.Database.Command)
      Executed DbCommand (4ms) [Parameters=[], CommandType='Text', CommandTimeout='60']
      IF SERVERPROPERTY('EngineEdition') <> 5
      BEGIN
          ALTER DATABASE [AllTogetherNow] SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
      END;
info: 4/26/2023 19:34:13.158 RelationalEventId.CommandExecuted[20101] (Microsoft.EntityFrameworkCore.Database.Command)
      Executed DbCommand (12ms) [Parameters=[], CommandType='Text', CommandTimeout='60']
      DROP DATABASE [AllTogetherNow];
info: 4/26/2023 19:34:13.339 RelationalEventId.CommandExecuted[20101] (Microsoft.EntityFrameworkCore.Database.Command)
      Executed DbCommand (159ms) [Parameters=[], CommandType='Text', CommandTimeout='60']
      CREATE DATABASE [AllTogetherNow];
info: 4/26/2023 19:34:13.370 RelationalEventId.CommandExecuted[20101] (Microsoft.EntityFrameworkCore.Database.Command)
      Executed DbCommand (30ms) [Parameters=[], CommandType='Text', CommandTimeout='60']
      IF SERVERPROPERTY('EngineEdition') <> 5
      BEGIN
          ALTER DATABASE [AllTogetherNow] SET READ_COMMITTED_SNAPSHOT ON;
      END;
info: 4/26/2023 19:34:13.373 RelationalEventId.CommandExecuted[20101] (Microsoft.EntityFrameworkCore.Database.Command)
      Executed DbCommand (0ms) [Parameters=[], CommandType='Text', CommandTimeout='30']
      SELECT 1
info: 4/26/2023 19:34:13.459 RelationalEventId.CommandExecuted[20101] (Microsoft.EntityFrameworkCore.Database.Command)
      Executed DbCommand (6ms) [Parameters=[], CommandType='Text', CommandTimeout='30']
      CREATE TABLE [Accounts] (
          [AccountId] int NOT NULL IDENTITY,
          [VendorName] nvarchar(max) NOT NULL,
          CONSTRAINT [PK_Accounts] PRIMARY KEY ([AccountId])
      );
info: 4/26/2023 19:34:13.461 RelationalEventId.CommandExecuted[20101] (Microsoft.EntityFrameworkCore.Database.Command)
      Executed DbCommand (1ms) [Parameters=[], CommandType='Text', CommandTimeout='30']
      CREATE TABLE [Orders] (
          [OrderId] int NOT NULL IDENTITY,
          [AccountId] int NOT NULL,
          CONSTRAINT [PK_Orders] PRIMARY KEY ([OrderId]),
          CONSTRAINT [FK_Orders_Accounts_AccountId] FOREIGN KEY ([AccountId]) REFERENCES [Accounts] ([AccountId]) ON DELETE CASCADE
      );
info: 4/26/2023 19:34:13.462 RelationalEventId.CommandExecuted[20101] (Microsoft.EntityFrameworkCore.Database.Command)
      Executed DbCommand (1ms) [Parameters=[], CommandType='Text', CommandTimeout='30']
      CREATE TABLE [SalesOrders] (
          [SalesOrderId] int NOT NULL IDENTITY,
          [AccountId] int NOT NULL,
          CONSTRAINT [PK_SalesOrders] PRIMARY KEY ([SalesOrderId]),
          CONSTRAINT [FK_SalesOrders_Accounts_AccountId] FOREIGN KEY ([AccountId]) REFERENCES [Accounts] ([AccountId]) ON DELETE CASCADE
      );
info: 4/26/2023 19:34:13.463 RelationalEventId.CommandExecuted[20101] (Microsoft.EntityFrameworkCore.Database.Command)
      Executed DbCommand (0ms) [Parameters=[], CommandType='Text', CommandTimeout='30']
      CREATE INDEX [IX_Orders_AccountId] ON [Orders] ([AccountId]);
info: 4/26/2023 19:34:13.464 RelationalEventId.CommandExecuted[20101] (Microsoft.EntityFrameworkCore.Database.Command)
      Executed DbCommand (0ms) [Parameters=[], CommandType='Text', CommandTimeout='30']
      CREATE INDEX [IX_SalesOrders_AccountId] ON [SalesOrders] ([AccountId]);
info: 4/26/2023 19:34:13.709 RelationalEventId.CommandExecuted[20101] (Microsoft.EntityFrameworkCore.Database.Command)
      Executed DbCommand (17ms) [Parameters=[], CommandType='Text', CommandTimeout='30']
      SELECT [o].[OrderId] AS [Id]
      FROM [Orders] AS [o]
      UNION ALL
      SELECT [s].[SalesOrderId] AS [Id]
      FROM [SalesOrders] AS [s]
info: 4/26/2023 19:34:13.786 RelationalEventId.CommandExecuted[20101] (Microsoft.EntityFrameworkCore.Database.Command)
      Executed DbCommand (4ms) [Parameters=[], CommandType='Text', CommandTimeout='30']
      SELECT [o].[OrderId] AS [Id], [a].[VendorName] AS [AccountName]
      FROM [Orders] AS [o]
      INNER JOIN [Accounts] AS [a] ON [o].[AccountId] = [a].[AccountId]
      UNION ALL
      SELECT [s].[SalesOrderId] AS [Id], [a0].[VendorName] AS [AccountName]
      FROM [SalesOrders] AS [s]
      INNER JOIN [Accounts] AS [a0] ON [s].[AccountId] = [a0].[AccountId]

@CorsairRO
Copy link
Author

CorsairRO commented Apr 26, 2023 via email

@CorsairRO
Copy link
Author

CorsairRO commented Apr 26, 2023 via email

@CorsairRO
Copy link
Author

CorsairRO commented Apr 27, 2023 via email

@roji
Copy link
Member

roji commented Apr 27, 2023

That sounds like #19129, can you please try the latest 8.0 preview and see if that solves the issue for you?

All works fine in your/mine tests with string columns for example with no length restriction, literally being NVARCHAR(MAX).
Of course this is a VERY bad DB Design, and our DBA has set MaxLengths to 99.99 percent of the string columns in DB.

I'm certainly not aware of a reason to systematically always avoid NVARCHAR(MAX), you may want to revisit that idea.

@CorsairRO
Copy link
Author

CorsairRO commented Apr 27, 2023 via email

@roji
Copy link
Member

roji commented Apr 27, 2023

Indexes on such columns are very slow and adds a lot of extra memory consumption on the Server

Indexes over nvarchar(max) aren't slow; they're not supported at all. If you need to index a text column, that's indeed a good reason to use nvarchar(x) over nvarchar(max).

All varchar/nvarchar (x) will be used directly from the memory.
All varchar/nvarchar(MAX) are used from PAGED on disk memory.

I don't think you're suggesting that SQL Server caches all varchar/nvarchar(x) data in memory - that definitely wouldn't be possible for large databases. There are indeed some performance implications of using varchar/nvarchar(max), but whether that's actually relevant depends on your specific application. In any case, I won't go into a long discussion on this - I'd just advise carefully understanding exactly why you think this is the case rather than just systematically doing it this way.

Regardless, I'll go ahead and close this as a duplicate of #19129,

@roji
Copy link
Member

roji commented Apr 27, 2023

Duplicate of #19129

@roji roji marked this as a duplicate of #19129 Apr 27, 2023
@roji roji closed this as not planned Won't fix, can't repro, duplicate, stale Apr 27, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants