'System.OutOfMemoryException' was thrown on a Large Table #19

Closed

deyshin opened this issue Aug 12, 2016 · 3 comments

deyshin commented Aug 12, 2016

The script has been very helpful to my project. Thank you very much.

I have run into one problem - When I run the script on a very large table, I get the following error:

Exception of type 'System.OutOfMemoryException' was thrown.

The particular table has 4 columns - 2 Int32 and 2 String - with about 1.7 million rows of data. Is this something we can solve, or is it an inherent system limitation?

dnlnln (Owner) commented Mar 28, 2019

The tool was originally designed for synchronising smaller data sets between environments via source control, so typically this meant that a few thousand rows of static data would be the max. Others have used it for slightly different purposes, such as for performing simple ETL operations between databases.

To allow the tool to be used for datasets of this size, with the data fully synchronised (i.e. all rows in the target inserted/updated/deleted to reflect the source data set), a fundamentally different approach is needed to merge the data in a reliable and efficient way.

The @include_values bit parameter, added recently (see #37, #47 & #34), might be a better option for working with larger datasets. When set to 0, it omits the VALUES clause containing all the data, in favour of simply joining onto @table_name at merge time. This would allow you to populate the source table before running the merge, in whichever way is appropriate for your dataset (e.g. a single BULK INSERT statement containing all the data, or batches of INSERT statements containing thousands of rows at a time).

For example:

EXEC [AdventureWorks2017].dbo.sp_generate_merge
  @table_name = 'CurrencyRate', 
  @schema = 'Sales',
  @include_values = 0

This generates the following statement:

MERGE INTO [Sales].[CurrencyRate] AS [Target]
USING [Sales].[CurrencyRate] AS [Source]
ON ([Target].[CurrencyRateID] = [Source].[CurrencyRateID])
WHEN MATCHED AND (
	NULLIF([Source].[CurrencyRateDate], [Target].[CurrencyRateDate]) IS NOT NULL OR NULLIF([Target].[CurrencyRateDate], [Source].[CurrencyRateDate]) IS NOT NULL OR 
	NULLIF([Source].[FromCurrencyCode], [Target].[FromCurrencyCode]) IS NOT NULL OR NULLIF([Target].[FromCurrencyCode], [Source].[FromCurrencyCode]) IS NOT NULL OR 
	NULLIF([Source].[ToCurrencyCode], [Target].[ToCurrencyCode]) IS NOT NULL OR NULLIF([Target].[ToCurrencyCode], [Source].[ToCurrencyCode]) IS NOT NULL OR 
	NULLIF([Source].[AverageRate], [Target].[AverageRate]) IS NOT NULL OR NULLIF([Target].[AverageRate], [Source].[AverageRate]) IS NOT NULL OR 
	NULLIF([Source].[EndOfDayRate], [Target].[EndOfDayRate]) IS NOT NULL OR NULLIF([Target].[EndOfDayRate], [Source].[EndOfDayRate]) IS NOT NULL OR 
	NULLIF([Source].[ModifiedDate], [Target].[ModifiedDate]) IS NOT NULL OR NULLIF([Target].[ModifiedDate], [Source].[ModifiedDate]) IS NOT NULL) THEN
 UPDATE SET
  [CurrencyRateDate] = [Source].[CurrencyRateDate], 
  [FromCurrencyCode] = [Source].[FromCurrencyCode], 
  [ToCurrencyCode] = [Source].[ToCurrencyCode], 
  [AverageRate] = [Source].[AverageRate], 
  [EndOfDayRate] = [Source].[EndOfDayRate], 
  [ModifiedDate] = [Source].[ModifiedDate]
WHEN NOT MATCHED BY TARGET THEN
 INSERT([CurrencyRateID],[CurrencyRateDate],[FromCurrencyCode],[ToCurrencyCode],[AverageRate],[EndOfDayRate],[ModifiedDate])
 VALUES([Source].[CurrencyRateID],[Source].[CurrencyRateDate],[Source].[FromCurrencyCode],[Source].[ToCurrencyCode],[Source].[AverageRate],[Source].[EndOfDayRate],[Source].[ModifiedDate])
WHEN NOT MATCHED BY SOURCE THEN 
 DELETE
;

The downside is that you'd need to manually edit the generated statement to use the pre-populated table object (rather than the original table object), so that the USING clause reads something like this:

USING [#currencyRateBulkInserted] AS [Source]
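
For illustration, a minimal sketch of that pre-population step, assuming a BULK INSERT from a flat file (the temp table name, file path and options below are placeholders, not output of the tool):

-- Stage the source data before running the edited MERGE
CREATE TABLE #currencyRateBulkInserted (
  [CurrencyRateID] INT NOT NULL PRIMARY KEY,
  [CurrencyRateDate] DATETIME NOT NULL,
  [FromCurrencyCode] NCHAR(3) NOT NULL,
  [ToCurrencyCode] NCHAR(3) NOT NULL,
  [AverageRate] MONEY NOT NULL,
  [EndOfDayRate] MONEY NOT NULL,
  [ModifiedDate] DATETIME NOT NULL
);

-- Load the full dataset in one pass instead of embedding it in a VALUES clause
BULK INSERT #currencyRateBulkInserted
FROM 'C:\data\CurrencyRate.csv' -- placeholder path
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', FIRSTROW = 2);

-- ...then execute the generated MERGE with its USING clause edited as above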

I'm curious to know whether this kind of general approach would suit your needs. If so, there's always the possibility of adding a new parameter to override the source table name.

JV-Gh (Contributor) commented Jul 19, 2021

"The downside is that you'd need to manually edit the generated statement to use the pre-populated table object (rather than the original table object), "

A way to make it more dynamic would be to add an additional parameter that takes a custom table name as a string (e.g. @custom_source_name = '#currencyRateBulkInserted'); the procedure would then use this parameter's value instead of the original table name whenever it is NOT NULL (a validation check could be added at the top of the merge to throw an error if an object with that name doesn't exist). Just a thought off the top of my head - it's possible that I might be missing something.
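
A purely illustrative sketch of that idea (none of these names exist in sp_generate_merge today; the fragment is written as it might appear inside the procedure body):

-- Hypothetical @custom_source_name parameter: use it in the USING clause when supplied
IF @custom_source_name IS NOT NULL
   AND OBJECT_ID(@custom_source_name) IS NULL
   AND OBJECT_ID('tempdb..' + @custom_source_name) IS NULL
BEGIN
  RAISERROR('The object named in @custom_source_name does not exist', 16, 1)
  RETURN -1 -- abort before generating the MERGE
END

-- Fall back to the original table when no override is given (@source_name is also hypothetical)
SET @source_name = COALESCE(@custom_source_name, QUOTENAME(@schema) + '.' + QUOTENAME(@table_name))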

dnlnln (Owner) commented Jul 31, 2023

An update on this: @EitanBlumin has very helpfully implemented a new parameter that allows you to split the source rows into multiple MERGE statements. To use it, specify @max_rows_per_batch=1000 (or whatever batch size you need) and be sure to also include the @delete_if_not_matched=0 param.
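
For example, something along these lines should generate one MERGE per batch of up to 1,000 rows (the batch size and table are just placeholders):

EXEC [AdventureWorks2017].dbo.sp_generate_merge
  @table_name = 'CurrencyRate',
  @schema = 'Sales',
  @max_rows_per_batch = 1000,
  @delete_if_not_matched = 0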

This should avoid the out-of-memory exception. If it recurs in spite of this, please comment here and I will re-open the issue.

dnlnln closed this as completed Jul 31, 2023