PostgreSQLBulkCopy performance degradation for method BulkCopyType.ProviderSpecific #4445

Closed
AndreyShipunov opened this issue Mar 9, 2024 · 0 comments · Fixed by #4459
Labels
status: has-pr There is active PR for issue type: bug

Comments

@AndreyShipunov
Contributor

Good afternoon!

After the switch to .NET 8, several bugs were fixed in the ProviderSpecificCopySyncImpl method: https://github.com/linq2db/linq2db/blob/master/Source/LinqToDB/DataProvider/PostgreSQL/PostgreSQLBulkCopy.cs#L139

The fixes added these lines:
https://github.com/linq2db/linq2db/blob/master/Source/LinqToDB/DataProvider/PostgreSQL/PostgreSQLBulkCopy.cs#L167
https://github.com/linq2db/linq2db/blob/master/Source/LinqToDB/DataProvider/PostgreSQL/PostgreSQLBulkCopy.cs#L170

Both lines call the GetNativeType method, which is extremely slow, and they call it inside the loop, once for every column of every row.

In the line https://github.com/linq2db/linq2db/blob/master/Source/LinqToDB/DataProvider/PostgreSQL/PostgreSQLBulkCopy.cs#L170 you can swap the order of the checks so that the cheap type test runs first and GetNativeType is called less often:
if (value is DateTimeOffset dto && _provider.GetNativeType(dataType.DbType) == NpgsqlProviderAdapter.NpgsqlDbType.TimeTZ)
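The effect of the swapped order can be seen in a small stand-alone sketch. `SlowLookup` is a hypothetical stand-in for GetNativeType (it is not linq2db code) that only counts its calls. Because `&&` short-circuits, it runs only for values that really are DateTimeOffset:

```csharp
using System;

public static class CheckOrderDemo
{
    public static int LookupCalls;

    // Hypothetical stand-in for the expensive GetNativeType call.
    static bool SlowLookup()
    {
        LookupCalls++;
        return true;
    }

    // Cheap type test first: SlowLookup is skipped unless value is a DateTimeOffset.
    public static bool NeedsTimeTzHandling(object value)
        => value is DateTimeOffset && SlowLookup();

    // Runs the check over all values and returns how many slow lookups happened.
    public static int CountLookups(object[] values)
    {
        LookupCalls = 0;
        foreach (var value in values)
            NeedsTimeTzHandling(value);
        return LookupCalls;
    }
}
```

For a row with a single DateTimeOffset among several other values, `CountLookups` reports one slow call rather than one per value.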

The NormalizeTimeStamp method also calls `GetNativeType`.

At the moment, inserting 10,000 rows with 15 columns (four of which require calling NormalizeTimeStamp) via BulkCopyType.MultipleRows is about 3 times faster than via BulkCopyType.ProviderSpecific. As a result, BulkCopyType.ProviderSpecific cannot be used in production: it is always slower.

BulkCopyType.MultipleRows spends a lot of time and memory generating SQL strings, while BulkCopyType.ProviderSpecific spends 99% of its time calling GetNativeType 200,000 times.

A good solution would be to compute the GetNativeType results once, before the loops, because the column type does not change from row to row.
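A minimal sketch of that idea, not the actual linq2db fix: `SlowNativeType` is a hypothetical stand-in for GetNativeType, and the column and row shapes are assumptions made for illustration. The native type is resolved once per column, and the row loop only reads the cached array:

```csharp
using System;
using System.Collections.Generic;

public static class HoistDemo
{
    public static int LookupCalls;

    // Hypothetical stand-in for GetNativeType: maps a declared type name to a native type name.
    static string SlowNativeType(string dbType)
    {
        LookupCalls++;
        return dbType.ToUpperInvariant();
    }

    // Resolves the native type once per column, then reuses it for every row.
    public static List<string> WriteRows(string[] columnDbTypes, IReadOnlyList<object[]> rows)
    {
        LookupCalls = 0;

        var nativeTypes = new string[columnDbTypes.Length];
        for (var i = 0; i < columnDbTypes.Length; i++)
            nativeTypes[i] = SlowNativeType(columnDbTypes[i]);

        var written = new List<string>();
        foreach (var row in rows)
            for (var i = 0; i < row.Length; i++)
                written.Add(nativeTypes[i] + ":" + row[i]); // cached type, no lookup per value

        return written;
    }
}
```

With this shape the number of lookups depends only on the column count: for 15 columns it stays at 15 whether 10 or 10,000 rows are written.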
