Join GitHub today
GitHub is home to over 28 million developers working together to host and review code, manage projects, and build software together.Sign up
Strategy for inserting rows with table-values parameters #2484
With SQL Server you can used table-values parameters to perform bulk DML very quickly and elegantly. Unfortunately, this is tedious:
Here is a proposal for how EF could pull off all of this transparently for inserts and deletes in SaveChanges:
Generate a table type programmatically for all tables that require it. 3 issues:
I think all of this would work for deletes as well. Updates are trickier because there is a great variety of columns that might change or not change. Maybe EF can use a single type for all updates and simply ignore some columns for some updates.
Who says that EF is not suitable for bulk inserts? Right now that might be the case but it does not have to be so. This would be awesome.
The point of this ticket is to present a viable plan for how this could be implemented in EF. Some inspiration for the dev team.
@GSPP I agree with @ErikEJ, this sounds like a great PR. Currently EF Core supports batching, so things are already factored in a way that multiple operations of the same type are grouped and submitted to the server together, and that could make a TVP-based strategy easier to plug in. I would expect TVPs to have a perf advantage over what we do, but I don't know how much.
@ErikEJ I understand why you might think that but measurements turn out to now support this view.
I have meant this TVP feature to be used with "bulky" inserts (100-1M rows). It is not primarily targeted at OLTP inserts (dozens of rows) but the perf benefits are visible there as well. I can only encourage you to try it because the gains can be staggering.
I'd like to quote this piece:
Do not underestimate the difference that a better plan can do. Per-index maintenance in bigger batches can be significantly faster than issuing new queries and round-trips all the time. Even without indexes, imagine a single-row insert (small row, no indexes). 90% of the execution time is per-statement and per-batch overhead. The useful work (the per-row work) is small. With TVPs the per-statement and per-batch overheads exist once for many rows (granted, they will be a little higher but not by much).
I will not be able to issue a PR for this but I hope that this ticket might spark one!
referenced this issue
Jan 14, 2017
I really like the idea of adding support for TVP's! As a smaller change short term, it would be nice if there was a strategy for compressing the parameters using the current process. For example, if there are 2100 parameters getting passed, but 90% of them are just null, maybe we should just pass null as one parameter and reuse it.
For example, currently an insert might look like this:
exec sp_executesql N'SET NOCOUNT ON; INSERT INTO [dbo].[MyTable] (Column1, Column2, Column3, Column4, Columen5) VALUES (@p0, @p1, @p2, @p3, @p4), (@p5, @p6, @p7, @p8, @p9); ',N'@p0 int,@p1 nvarchar(50),@p2 nvarchar(50),@p3 nvarchar(50),@p4 nvarchar(50), @p5 int,@p6 nvarchar(50),@p7 nvarchar(50),@p8 nvarchar(50),@p9 nvarchar(50)', @p0=1,@p1=NULL,@p2=NULL,@p3=NULL,@p4=NULL, @p5=1,@p6=NULL,@p7=NULL,@p8=NULL,@p9=NULL
But with "compression" it would be:
exec sp_executesql N'SET NOCOUNT ON; INSERT INTO [dbo].[MyTable] (Column1, Column2, Column3, Column4, Columen5) VALUES (@p0, @p1, @p1, @p1, @p1), (@p0, @p1, @p1, @p1, @p1); ',N'@p0 int,@p1 nvarchar(50)', @p0=1,@p1=NULL
For bulk inserts with many fields with the same values, this could make the save operation happen much faster as we could add more records per batch.