Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve test collection speed #375

Merged
merged 46 commits into from
Jul 28, 2021
Merged

Conversation

lukasturcani
Copy link
Owner

@lukasturcani lukasturcani commented Jul 28, 2021

Related Issues: #370
Requested Reviewers: @andrewtarzia
Note for Reviewers: If you accept the review request add a 馃憤 to this post

The speed at which tests are collected has been dramatically improved. Previously, running

pytest --collect-only

would take approximately 12.84 s. It is now 3.07 s. This is extremely important because collection time is spent every time pytest is run, even if only a subset of tests are run. For example

pytest -k get_id

would have spent 12.84 s collecting all the tests and then about 1 s to actually run the tests. This is an incredibly frustrating developer experience because it makes running tests very unresponsive. Now running the same command would take 4.07 s.
Importantly the total time spent running tests is basically unchanged, going from 48.24 s previously to 48.78 s now.

Another important change is that test will no longer break in the collection phase, or at least, very few of them will. The issue with tests breaking during the collection phase is that if this happens, pytest does not provide very good error information.

Both the collection time and prevention of breakage during the collection change was achieved by deferring the creation of test cases to after the collection phase. For example, a fixture such as this

@pytest.fixture(
    params=(
        CaseData(...),
        CaseData(...),
    ),
)
def case_data(request):
    return request.param

has been changed to this

@pytest.fixture(
    params=(
        lambda: CaseData(...),
        lambda: CaseData(...)
    ),
)
def case_data(request):
    return request.param()

The difference between these two pieces of code is that in the original version the CaseData instance is created when the module is being read. However, in the new version, the CaseData instance is created only once the fixture is being used by the test. This means the collection phase of pytest does not actually create any instances any more, the test execution phase creates them now.

The other oft-repeated change is turning module-level variables into functions

some_var = Something()

into functions

def _get_some_var() -> Something:
    return Seomthing()

Again, this means that instances are not created when the module is being read. This change however does mean that the variable can no longer be re-used as a new instance will be created every time _get_some_var() is called. For most cases, I expect this will make no practical difference. In cases where I did think a performance impact may be incurred I used fixtures

@pytest.fixture(scope='session')
def _get_some_var() -> Something:
    return Something()

These changes constitute the vast majority of the changes, some other notable changes are:

  • pytest.ini: Added this file and explicitly specified the directory where the tests are located. This means that running pytest will only search the tests directory rather than all other other directories too when looking for tests, giving a small reduction in test collection time.
  • tests/molecular/molecules/.../fixtures/building_blocks.py: I added this file to factor out the common building blocks used across multiple modules.

@andrewtarzia
Copy link
Collaborator

My only comment is that you should delete all .npy and rebuild all (git commit should only be delete 2, add 2), because I think you added 2 instead of replacing 2.

@lukasturcani
Copy link
Owner Author

lukasturcani commented Jul 28, 2021

My only comment is that you should delete all .npy and rebuild all (git commit should only be delete 2, add 2), because I think you added 2 instead of replacing 2.

I'm not following. Could you spell it out a little more for me please?

@andrewtarzia
Copy link
Collaborator

You changed two test names in the cage fixtures, which lead to two new .npy files. But you did not delete the .npy files from the old name. Right?

@lukasturcani
Copy link
Owner Author

I did, that's why the the files are listed as renamed.

@andrewtarzia
Copy link
Collaborator

Oops misread!

@lukasturcani lukasturcani merged commit f363470 into master Jul 28, 2021
@lukasturcani lukasturcani deleted the lukas/improve-collection-speed branch July 28, 2021 21:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants