[1.8.x] Fixed #23372 -- Improved test suite execution time on MSSQL. #5438

Closed

timgraham
Member

Executing loaddata for each app in Django's test suite is very
expensive on MSSQL due to the re-enabling of constraint checks
that happen each time, even if no fixtures need to be loaded.

django-mssql should set os.environ['BULK_LOAD_INITIAL_DATA'] = '1'
for its test suite. This improves the speed by ~20% (2.5 hours).

https://code.djangoproject.com/ticket/23372

@akaariai
Member

If the problem is the enabling and disabling of constraints during fixture loading, could we automatically detect migrations that are known not to contain cycles? At least cases like "one model, no self-referential foreign key" are easy.
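
To make the "no cycles" idea concrete, here is a minimal sketch, assuming the foreign-key relations between the models involved are available as a plain mapping. The `has_fk_cycle` helper and the model names are hypothetical and not part of Django; the sketch only shows that recognising a cycle-free set of models is cheap.

```python
def has_fk_cycle(fk_graph):
    """Return True if the foreign-key graph contains a cycle.

    fk_graph maps a model name to the set of model names it references.
    A self-referential foreign key counts as a cycle.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    state = {}

    def visit(node):
        state[node] = GREY
        for target in fk_graph.get(node, ()):
            colour = state.get(target, WHITE)
            if colour == GREY:
                # We reached a model that is still on the current path.
                return True
            if colour == WHITE and visit(target):
                return True
        state[node] = BLACK
        return False

    return any(state.get(n, WHITE) == WHITE and visit(n) for n in fk_graph)


# Hypothetical graphs: the "easy" case and a self-referential one.
easy_case = {"Author": set()}
self_referential = {"Category": {"Category"}}
```

With these inputs, `has_fk_cycle(easy_case)` is false and `has_fk_cycle(self_referential)` is true, so only the second would need the constraint re-check under this assumption.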

@akaariai
Member

The profile generated by python -m cProfile -s cumulative ./runtests.py --parallel=1 > prof.txt could help a lot.
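
For anyone who prefers to profile a single call rather than the whole run, the same cumulative-time report can be produced in-process with the standard library. This is only a sketch: `profile_call` and `slow_sum` are hypothetical stand-ins, not part of Django or its test runner.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, **kwargs):
    """Run func under cProfile and return (result, report text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Same ordering as "-s cumulative" on the command line; keep the top 10 rows.
    stats.sort_stats("cumulative").print_stats(10)
    return result, buffer.getvalue()


def slow_sum(n):
    # Stand-in workload; hypothetical, not part of Django's test suite.
    return sum(i * i for i in range(n))


result, report = profile_call(slow_sum, 1000)
```

The `report` string can be written to a file and shared in the same way as `prof.txt`.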

@manfre
Contributor

manfre commented Oct 21, 2015

I'm running the profiling against stable/1.8.x and my results will likely be available to share in a day.

@shaib
Member

shaib commented Oct 21, 2015

I've seen this problem with South migrations; the project did not use any fixtures, and South's "migrate" had a "--no-data" flag, so that's how we solved it then.

Doesn't the current PR change the semantics, though? Sets of fixtures that could not be loaded before this change become loadable after it, because they're now loaded in the same transaction? On the other hand, loading big fixtures is notoriously memory-hungry, and this change combines fixtures into a single load, so I suspect that for some users, things which worked before may now fail with out-of-memory errors -- unless I'm missing something basic.

@timgraham
Member Author

I don't plan to put much more effort into this if it has problems. This speeds up the test suite on MSSQL by 2.5 hours (to 5.25 hours) in Michael's test, but obviously if there are problems with it, we'll have to try something else. If there is some way to limit it so that it only activates when running the Django test suite, this might be enough.

if self.app_label:
    # Passing app_label as a set is a temporary hack for Django 1.8
    # to speedup Django's test suite on MSSQL.
    if isinstance(self.app_label, set):
Member Author

Maybe adding "and os.environ.get('BULK_LOAD_INITIAL_DATA')" to this condition is a practical solution?
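
A sketch of how that combined check might behave, pulled out into a standalone function so it can be run on its own. `use_bulk_load` and its `environ` parameter are hypothetical names for illustration and are not taken from the patch.

```python
import os


def use_bulk_load(app_label, environ=None):
    """Hypothetical helper: decide whether to take the bulk-loading path.

    The path is taken only when app_label is a set AND the
    BULK_LOAD_INITIAL_DATA environment variable is set to a non-empty value.
    """
    environ = os.environ if environ is None else environ
    return isinstance(app_label, set) and bool(environ.get("BULK_LOAD_INITIAL_DATA"))


# Environment values are always strings, so the variable is set to "1", not 1.
enabled = use_bulk_load({"auth", "sites"}, {"BULK_LOAD_INITIAL_DATA": "1"})
disabled = use_bulk_load({"auth", "sites"}, {})
```

Note that with a plain truthiness test any non-empty string, including "0", enables the path.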

@timgraham
Member Author

@manfre, I adjusted this to require setting an environment variable, so it won't have any effect if not running the django-mssql test suite. Please double check that it still works.

@akaariai
Member

akaariai commented Dec 9, 2015

See #5801 for a possible alternate approach. I don't have MSSQL available, so can't test the actual performance.

@akaariai
Member

I've done a bit of investigation, and the reason for the extremely bad runtime on 1.8 seems to be a couple of initial_data fixtures in test applications. Due to the way Django loads fixture files, these are loaded for each transactional test case. On MSSQL this means that constraints are checked for each table for each transactional test case.

On master the initial_data fixtures are gone, and on my laptop the runtime of the test suite is around an hour. With some fixes to fixture loading, the runtime can be pushed down to half an hour, which I guess is already in the tolerable range. Parallel testing can improve this even further.

If we want to do something about this for 1.8, I think the approach in #5801 is most promising. We can make the change to 1.8 in a conservative way - that is, a backend can opt-in to the tables argument for enabling and disabling of constraints. If the backend doesn't need the tables argument, nothing is changed. This is a bit more work, but worth it to make sure 1.8 remains extremely stable.
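
Roughly, the opt-in could look like the sketch below. This is an illustration of the idea only, not the code in #5801: the class names, the `supports_constraint_tables` flag and the `reenable_constraints` helper are all hypothetical.

```python
class DefaultBackend:
    # Hypothetical flag; a backend leaves it False unless it opts in.
    supports_constraint_tables = False

    def __init__(self):
        self.calls = []

    def enable_constraint_checking(self):
        # Existing behaviour: no tables argument, nothing changes.
        self.calls.append(("enable", None))


class PerTableBackend(DefaultBackend):
    # This backend opts in and re-checks only the tables it is given.
    supports_constraint_tables = True

    def enable_constraint_checking(self, tables=None):
        self.calls.append(("enable", tuple(sorted(tables)) if tables else None))


def reenable_constraints(backend, touched_tables):
    """Pass the tables argument only to backends that opted in."""
    if backend.supports_constraint_tables:
        backend.enable_constraint_checking(tables=touched_tables)
    else:
        backend.enable_constraint_checking()


default_backend = DefaultBackend()
per_table_backend = PerTableBackend()
reenable_constraints(default_backend, {"app_book", "app_author"})
reenable_constraints(per_table_backend, {"app_book", "app_author"})
```

A backend that does not opt in is called exactly as before, which is what keeps the change conservative.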

@manfre
Contributor

manfre commented Dec 29, 2015

@akaariai thanks for the investigative work. Sub-hour test suite time would be an amazing improvement over the current situation. I took a brief look at #5801 and I agree that would be a good backwards compatible approach for 1.8.

@timgraham
Member Author

Okay, I'll close this then.

@timgraham timgraham closed this Dec 29, 2015
@timgraham timgraham deleted the 23372 branch December 30, 2015 22:43