
[MNT] remove coverage reporting and pytest-cov from PR CI and setup.cfg #6363

Merged
merged 5 commits into main from remove-pytest-cov on Jun 16, 2024

Conversation

@fkiraly (Collaborator) commented Apr 29, 2024

This PR removes the generation of coverage reports and the installation and use of pytest-cov from standard CI. It also removes the (unreliable) coverage badge from the README.

Reasons:

@fkiraly added labels on Apr 29, 2024: maintenance (Continuous integration, unit testing & package distribution), module:tests (test framework functionality - only framework, excl specific tests)
@fkiraly (Collaborator, Author) commented May 1, 2024

Anecdotal, but it looks like this leads to substantial runtime improvements.

Before: [screenshot of CI job runtimes]

After: [screenshot of CI job runtimes]

@yarnabrina (Member) commented:

@fkiraly I want to be cautious about this. Can we test these:

  1. What happens if, instead of removing coverage completely, we only skip the XML and HTML reports? I believe a base report (.coverage) is always generated, and those two are produced separately.
  2. If it is removed completely (assuming only for CI runs in PRs), how does the coverage report appear in the README (after merge to main)? If it shows missing or 0% or similar, that will be misleading. To test, maybe you can edit the README link in this branch without actually merging.

My caution is mainly because it's highly counter-intuitive to me that coverage would affect timing this much. It's more than a 3-4x difference in your screenshots, and if that were the general effect of pytest-cov, users would have detected it long ago. It's very popular and standard, so I am really wondering whether we are missing something else (though I don't have any alternative ideas yet).
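(For reference, a minimal sketch of the distinction being discussed, using standard pytest-cov flags; the target paths are illustrative, not sktime's actual CI invocation.)

```bash
# coverage data collection only: writes the .coverage data file and a terminal summary
pytest --cov=sktime sktime/tests/

# additionally render XML and HTML reports (the comparatively expensive part)
pytest --cov=sktime --cov-report=xml --cov-report=html sktime/tests/
```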

@fkiraly
Copy link
Collaborator Author

fkiraly commented May 1, 2024

What happens if, instead of removing coverage completely, we only skip the XML and HTML reports? I believe a base report (.coverage) is always generated, and those two are produced separately.

According to the profiler, these parts do indeed create the overhead.

How would you turn these off separately? Can you help? Would that mean removing the --cov-report html flags etc., but not --cov?

how does the coverage report appear in the README

This PR also removes the badge from the readme, because it is misleading anyway, with or without this PR.

We should find a way to display genuine coverage in the README. I would consider that a separate issue (namely, #5090), which would then include adding the correct coverage display to the README.

@fkiraly (Collaborator, Author) commented May 1, 2024

so I am really wondering whether we are missing something else (though I don't have any alternative ideas yet).

So am I.
A wild guess is that we have some runaway import chains in the style of #6355, which are causing the long runtimes.

Or perhaps cause and effect are hard to disentangle in general?

@yarnabrina (Member) commented:

How would you turn these off separately? Can you help? Would that mean removing the --cov-report html flags etc., but not --cov?

Yes, that only. Let's see what happens.

This PR also removes the badge from the readme, because it is misleading anyway, with or without this PR.

I think if the README shows 0% or similar, it may give potential users/contributors the negative impression that this framework is untested (e.g., I know I would feel the same about a new tool).

@fkiraly (Collaborator, Author) commented May 2, 2024

OK - I've added it back in the test-nosoftdeps-full job now, and in pyproject.toml, to see what happens.
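(For illustration only, a hedged sketch of what "keep data collection, drop report rendering" could look like in a pytest configuration; whether the options live in pyproject.toml or setup.cfg, and the exact flags sktime uses, may differ from this.)

```toml
# hypothetical pytest section in pyproject.toml
[tool.pytest.ini_options]
# --cov alone collects coverage data (.coverage) and prints a terminal summary;
# the xml/html report flags are intentionally omitted
addopts = "--cov=sktime"
```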

@yarnabrina (Member) commented:

Only 5 jobs got triggered, not a single testing job! How did it run everything earlier?

@fkiraly (Collaborator, Author) commented May 2, 2024

Only 5 jobs got triggered, not a single testing job! How did it run everything earlier?

I see - I think I understand the reason for the difference.

Previously, pytest-cov was removed from pyproject.toml, and that triggered "test all". Now that we have added it back, pyproject.toml is no longer modified, so there is no trigger for testing anything anymore.
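(As a purely illustrative sketch of this kind of change-based gating, not sktime's actual run_test_for_class utility; the function and file names below are made up.)

```python
def should_run_tests_for(cls, changed_files: list[str]) -> bool:
    """Illustrative change-based test gating, not sktime's real implementation.

    Runs everything if packaging/config files changed; otherwise runs tests
    only if the module defining cls, or its tests folder, was touched.
    """
    # a change to packaging/config files triggers the full test suite
    if any(f in ("pyproject.toml", "setup.cfg") for f in changed_files):
        return True
    # otherwise, run only if the defining module or a sibling tests/ folder changed
    module_path = cls.__module__.replace(".", "/") + ".py"
    tests_dir = module_path.rsplit("/", 1)[0] + "/tests/"
    return any(f == module_path or f.startswith(tests_dir) for f in changed_files)
```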

@yarnabrina (Member) commented:

https://github.com/sktime/sktime/actions/runs/8940323391

I triggered a manual test all workflow on this branch for debugging.

@fkiraly (Collaborator, Author) commented May 3, 2024

Thanks.

Something is taking hours again - how do we find out which estimator it gets stuck on?

@yarnabrina (Member) commented:

I am not aware of a better solution than going into verbose mode.
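(A few standard pytest options that help pin down a hanging or slow test; paths are illustrative, and --timeout requires the separate pytest-timeout plugin.)

```bash
# verbose mode: one line per test, so the last line printed before a hang points at the culprit
pytest -v sktime/forecasting/

# after a completed run, report the 25 slowest tests
pytest --durations=25 sktime/forecasting/

# with pytest-timeout installed, fail (and name) any test running longer than 10 minutes
pytest --timeout=600 sktime/forecasting/
```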

By the way, have you seen the failures? It seems every single "other" run failed with this:

FAILED sktime/tests/tests/test_test_utils.py::test_run_test_for_class - AssertionError: assert 'True_run_always' in ['True_pyproject_change', 'True_changed_class', 'True_changed_tests']

@fkiraly (Collaborator, Author) commented May 3, 2024

By the way, have you seen the failures? It seems every single "other" run failed with this:

Thanks for pointing this out - this is a bug in a test I added to make sure we test the run_test_for_class utility.

The bug surfaces only in the test_all workflow, which has a certain combination of conditions. Fix here: #6383

@yarnabrina (Member) commented:

I checked the other jobs, and so far there is no timeout failure. Only one module job failed, and it's for forecasting:

FAILED sktime/forecasting/model_evaluation/tests/test_evaluate.py::test_evaluate_common_configs[backend8-scoring1-refit-1-10-fh5-ExpandingWindowSplitter] - OverflowError: Python int too large to convert to C long

Ref. https://github.com/sktime/sktime/actions/runs/8940323391/job/24558260007#step:3:6594

Any idea whether it's sporadic? We'll probably know from the random seed diagnostic.

(FYI @benHeid )

@fkiraly (Collaborator, Author) commented May 3, 2024

FAILED sktime/forecasting/model_evaluation/tests/test_evaluate.py::test_evaluate_common_configs[backend8-scoring1-refit-1-10-fh5-ExpandingWindowSplitter] - OverflowError: Python int too large to convert to C long

This is definitely a new one - I have not seen it before.

However, there have been failures in test_evaluate_common_configs in ancient times, but those seem unrelated? See #1194.

We'll probably know from the random seed diagnostic.

Probably not, as that diagnostic adds random seeds only in TestAllForecasters, and test_evaluate_common_configs lives elsewhere.

@fkiraly changed the title from "[MNT] remove coverage reporting and pytest-cov from PR CI" to "[MNT] remove coverage reporting and pytest-cov from PR CI and setup.cfg" on Jun 10, 2024
@Abhay-Lejith (Contributor) commented:

@fkiraly @yarnabrina, mentioning this here as it might be related.
Until now I was not able to make breakpoints work in my Python debugger in VS Code.
I've managed to fix it by essentially disabling pytest-cov (or something to that effect) by modifying my launch.json and adding the following field:
"env": {"PYTEST_ADDOPTS": "--no-cov"}
It appears that pytest-cov somehow interferes with the debugger. More on this in the "Note" section here: https://code.visualstudio.com/docs/python/testing#_pytest-configuration-settings
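(A minimal sketch of the launch.json change described above; everything except the "env" entry is a generic VS Code debug configuration and may differ from your setup.)

```jsonc
// .vscode/launch.json (illustrative)
{
    "version": "0.2.0",
    "configurations": [
        {
            "name": "Debug tests without coverage",
            "type": "debugpy",
            "request": "launch",
            "purpose": ["debug-test"],
            "console": "integratedTerminal",
            "justMyCode": false,
            // disable pytest-cov so breakpoints are not bypassed
            "env": {"PYTEST_ADDOPTS": "--no-cov"}
        }
    ]
}
```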

@yarnabrina (Member) commented:

@fkiraly I think you use VS Code with the integrated debugging? Did you ever face the issue @Abhay-Lejith mentioned?

I definitely face a lot of issues with our current setup.cfg, so I have a local patch to ignore that file altogether (essentially doing what @Abhay-Lejith has done with VS Code settings), so I never faced this issue myself. If this is indeed an issue, as the documentation seems to suggest, I expect it to be a very common thing, and I am wondering why no one ever reported it before? 😕

@fkiraly (Collaborator, Author) commented Jun 10, 2024

@fkiraly I think you use VS Code with the integrated debugging? Did you ever face the issue @Abhay-Lejith mentioned?

Yes, the GUI integrated breakpoint debugging never worked, so I ended up adopting a more manual workflow and also ignoring the file in practice.

I applaud @Abhay-Lejith for having finally identified the reason.

I expect it to be a very common thing, and I am wondering why no one ever reported it before?

Perhaps groupthink bias, i.e., everyone assumes it works for everyone else and that they would be considered stupid if they raised it in public? Nothing could be further from the truth, but sometimes that is how the mind works.

Shall we remove the flags then, since they seem to cause problems systematically?

@fkiraly marked this pull request as ready for review on June 10, 2024 22:57
@yarnabrina (Member) left a review comment:


If we are disabling coverage everywhere, should we drop it from the test dependencies too?

@fkiraly (Collaborator, Author) commented Jun 11, 2024

We should probably have some replacement plan in mind, e.g., where and when we run coverage. On the full test run?
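(One possible shape for such a replacement plan, as a hedged sketch only: run coverage in a main-branch or scheduled workflow instead of per-PR. Workflow names, the install step, and the upload step are illustrative, not sktime's actual CI.)

```yaml
# illustrative workflow: coverage only on merges to main plus a weekly run
name: coverage
on:
  push:
    branches: [main]
  schedule:
    - cron: "0 3 * * 0"  # weekly, Sunday 03:00 UTC
jobs:
  coverage:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      # install the package plus pytest-cov; dependency extras omitted for brevity
      - run: python -m pip install -e . pytest pytest-cov
      - run: python -m pytest --cov=sktime --cov-report=xml
      # upload the XML report; the upload target here is a placeholder choice
      - uses: codecov/codecov-action@v4
        with:
          files: coverage.xml
```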

@fkiraly merged commit bd2f0e3 into main on Jun 16, 2024
177 of 183 checks passed
@fkiraly deleted the remove-pytest-cov branch on June 16, 2024 11:27