Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve bulk import performance #1539

Merged

Conversation

Ptosiek
Copy link
Contributor

@Ptosiek Ptosiek commented Feb 9, 2023

Problem
When transactions are supported, import_data_inner will create a subtransaction for each row.
In bulk mode, no db operations are made anyway so there's no need to for the extra db queries of creating and releasing a save point.

Solution

Do not create save point for each row when bulk mode is enabled.

Acceptance Criteria
Sample bulk_import results (on postgres):

before the changes:
do_create()
Time 6.004
Memory 2.21875
Max 82.12890625 Min 79.91015625

do_update()
Time 5.205
Memory 3.9375
Max 103.12890625 Min 99.19140625

do_delete()
Time 3.548
Memory 0.01171875
Max 105.78125 Min 105.76953125

after the changes:
do_create()
Time 2.2
Memory 2.890625
Max 79.8125 Min 76.921875

do_update()
Time 1.616
Memory 7.921875
Max 101.4609375 Min 93.5390625

do_delete()
Time 0.9456
Memory 0.00390625
Max 104.59765625 Min 104.59375

Significant time decrease. The memory delta is increased, but as you can see with added Max/Min in the output, the overall memory used has not.

@Ptosiek Ptosiek force-pushed the improve-bulk-with-transaction branch from 2b7ce84 to 516eb61 Compare February 9, 2023 13:07
@coveralls
Copy link

coveralls commented Feb 9, 2023

Coverage Status

Coverage: 100.0%. Remained the same when pulling 962eb45 on Ptosiek:improve-bulk-with-transaction into 490c737 on django-import-export:main.

Copy link
Contributor

@matthewhegarty matthewhegarty left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great - thanks for picking this up. I made a couple of minor comments.

import_export/resources.py Outdated Show resolved Hide resolved
import_export/resources.py Show resolved Hide resolved
@Ptosiek
Copy link
Contributor Author

Ptosiek commented Feb 9, 2023

@matthewhegarty
Added a test and deprecated raise_errors (not sure if the cleanest way but it works)

Copy link
Contributor

@matthewhegarty matthewhegarty left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding these changes. I had one comment about adding a test for the deprecation warning.

import_export/resources.py Show resolved Hide resolved
add deprecation test
@matthewhegarty
Copy link
Contributor

All good - thanks so much for submitting this change.

@matthewhegarty matthewhegarty merged commit 299f371 into django-import-export:main Feb 10, 2023
@Ptosiek Ptosiek deleted the improve-bulk-with-transaction branch February 13, 2023 10:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants