[BUG] logic problem with new CI - coverage of TestAllEstimators and TestAllObjects incorrect #6352

Closed
fkiraly opened this issue Apr 28, 2024 · 16 comments · Fixed by #6353
Labels: bug (Something isn't working), module:tests (test framework functionality - only framework, excl specific tests)

Comments

@fkiraly
Collaborator

fkiraly commented Apr 28, 2024

I think there is a bug in the new CI. The coverage is not affected since "old CI" is still running.

The issue occurs when, say, a forecaster changes.

The new CI will then trigger tests in sktime.forecasting, but not in sktime.tests, which contains TestAllEstimators.
So, we run the forecaster-specific tests for the changed forecaster, but not the generic object and estimator tests.

FYI @yarnabrina, is there a way to fix this?
The "one-liner" to "run all tests for estimator X" is check_estimator - however, it was intentionally avoided as an entry point in the original setup, since it distributes less well across workers via pytest-xdist.
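For context, this is the convenience entry point referred to above; a minimal usage sketch, assuming NaiveForecaster as the changed estimator (argument names may differ slightly across sktime versions):

from sktime.forecasting.naive import NaiveForecaster
from sktime.utils.estimator_checks import check_estimator

# runs the generic test suite (TestAllObjects, TestAllEstimators, forecaster tests)
# for this single estimator, in one process - no distribution across pytest-xdist workers
results = check_estimator(NaiveForecaster, raise_exceptions=False)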

@fkiraly added the labels bug (Something isn't working) and module:tests (test framework functionality - only framework, excl specific tests) on Apr 28, 2024
@fkiraly
Collaborator Author

fkiraly commented Apr 28, 2024

The "opposite case" is where there are changes in "other" modules, as well as in a forecaster. The TestAllEstimators test for that forecaster then runs in the "other" test job, not in the "module/forecasters" bucket:
https://github.com/sktime/sktime/actions/runs/8861405811/job/24333294008?pr=6348

@yarnabrina
Member

I just commented on Discord as well - this job is not coming from the "other" case in the new CI. It is coming from the old CI, which relies on the Makefile; specifically, this line is called to run the tests:

run: make test_without_datasets
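
As a hypothetical approximation (not the actual Makefile content), that target expands to a plain pytest call over the package, excluding the dataset module:

# hypothetical approximation of the Makefile target, for illustration only
test_without_datasets:
	python -m pytest sktime --ignore sktime/datasets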

The new CI jobs are instead invoked by this line:

run: >-
python3
-m pytest
sktime
--ignore sktime/base
--ignore sktime/datasets
--ignore sktime/alignment
--ignore sktime/annotation
--ignore sktime/classification
--ignore sktime/clustering
--ignore sktime/forecasting
--ignore sktime/networks
--ignore sktime/param_est
--ignore sktime/regression
--ignore sktime/transformations
--matrixdesign True
--only_changed_modules True
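
For readers unfamiliar with the flags above: --matrixdesign and --only_changed_modules are not built-in pytest options but custom ones registered in sktime's conftest.py. A minimal sketch of how such flags are typically wired up (the helper _is_in_changed_module and the defaults are illustrative, not sktime's actual implementation):

# conftest.py - minimal sketch of registering custom pytest flags
def pytest_addoption(parser):
    parser.addoption("--matrixdesign", default=False, help="split tests across matrix jobs")
    parser.addoption("--only_changed_modules", default=False, help="restrict collection to changed modules")


def _is_in_changed_module(item):
    # hypothetical placeholder: real logic would compare the test's module against the PR diff
    return True


def pytest_collection_modifyitems(config, items):
    if config.getoption("--only_changed_modules"):
        # keep only tests whose module was touched in the PR
        items[:] = [item for item in items if _is_in_changed_module(item)]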

@fkiraly
Collaborator Author

fkiraly commented Apr 28, 2024

Sorry, wrong link:
https://github.com/sktime/sktime/actions/runs/8861405811/job/24334091449?pr=6348

This is coming from "other" in new CI, screenshot:
[screenshot of the CI job overview]

@fkiraly
Collaborator Author

fkiraly commented Apr 28, 2024

A solution could be to also run the sktime/tests module in all per-module test jobs; that would at least ensure that the estimator/object tests are run for changed forecasters.

We cannot turn them off in "other", as estimators may also live in "other" modules. But the overlap will only occur if an "other" estimator and a "module" estimator are changed in the same PR, so the incidence is probably low.
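
Concretely, each per-module job would then invoke something like the following (a sketch of the intended change, mirroring the flags of the existing jobs; the exact diff is in the linked PR):

python3 -m pytest sktime/forecasting sktime/tests --matrixdesign True --only_changed_modules True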

@yarnabrina
Member

I've created PR #6353, please take a look. But based on sample runs on my laptop, this is going to make the new CI significantly slower, even the test collection part. For example, the before and after times for forecasting with test collection only (--co flag) are 21.85 seconds (43669 tests) vs 314.01 seconds (79603 tests).

@fkiraly
Collaborator Author

fkiraly commented Apr 28, 2024

Only test collection?

Before we merge, can we diagnose where this is coming from?

Collection should certainly not take so long.

@fkiraly
Collaborator Author

fkiraly commented Apr 28, 2024

I just checked construction times with the code below. It looks like construction is most likely not the culprit.
Most estimators construct instantaneously; some need tens of milliseconds, which is probably too long (but does not explain the collection times). It could of course be that I am missing something in a soft dependency I do not have installed.

from sktime.utils.validation._dependencies import _check_estimator_deps
from sktime.registry import all_estimators

# all registered estimator classes, without names
ests = all_estimators(return_names=False)

def _measure_init_time(cls, params=None):
    """Time a single construction of cls with params, ignoring failures."""
    from time import time

    start = time()
    try:
        cls(**params)
    except Exception:
        pass
    end = time()
    return end - start

times = []

# measure construction time for every test parameter set of every estimator
# whose soft dependencies are available in the current environment
for est in ests:
    if _check_estimator_deps(est, severity="none"):
        params = est.get_test_params()
        if not isinstance(params, list):
            params = [params]
        for param in params:
            times.append((est, _measure_init_time(est, param)))
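
The slowest constructors can then be listed with a small addition, e.g.:

# sort by construction time, slowest first, and print the top 10
times.sort(key=lambda t: t[1], reverse=True)
for est, secs in times[:10]:
    print(f"{est.__name__}: {secs:.3f} s")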

@yarnabrina
Member

Just in case it's device- or OS-specific, what are your times for the cases reported above? I used these commands:

# pre-PR
python -m pytest sktime/forecasting --co

# post-PR
python -m pytest sktime/forecasting sktime/tests --co

@yarnabrina
Member

> I just checked construction times with the code below. It looks like construction is most likely not the culprit.
> Most estimators construct instantaneously; some need tens of milliseconds, which is probably too long (but does not explain the collection times). It could of course be that I am missing something in a soft dependency I do not have installed.

I don't think construction or soft dependencies affect the two commands I shared above, though I am not 100% confident.

@fkiraly
Collaborator Author

fkiraly commented Apr 28, 2024

I ran the first command on main, Python 3.11, Windows - I cancelled it after it had been running for 10 minutes.

That is very odd. Last time we measured test collection time, it was 15 sec (minimal soft deps) or 1 min (many extras).

Update:

  • 1st command: 661 secs (11 min)
  • 2nd command: ran >20 min, terminated

my show_versions:

Python dependencies:
          pip: 23.3.1
       sktime: 0.28.1
      sklearn: 1.3.2
       skbase: 0.7.6
        numpy: 1.26.2
        scipy: 1.11.4
       pandas: 2.1.4
   matplotlib: 3.8.3
       joblib: 1.3.2
        numba: 0.59.1
  statsmodels: 0.14.0
     pmdarima: 2.0.4
statsforecast: None
      tsfresh: 0.20.2
      tslearn: None
        torch: None
   tensorflow: None
tensorflow_probability: None

@fkiraly
Collaborator Author

fkiraly commented Apr 28, 2024

What are your timings?

I also note how strange this is - isn't 600s the standard timeout for pytest collection?

@fkiraly
Collaborator Author

fkiraly commented Apr 28, 2024

I think we are getting closer - I updated issue #6344 with the problem description. I will now run a profiler on the test collection, which takes too long.
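
For reference, one way to profile collection only, using the standard library cProfile (the output file name collect.prof is arbitrary):

# profile test collection only, then print the 30 most expensive calls by cumulative time
python -m cProfile -o collect.prof -m pytest sktime/forecasting --co -q
python -c "import pstats; pstats.Stats('collect.prof').sort_stats('cumulative').print_stats(30)"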

@yarnabrina
Member

> For example, the before and after times for forecasting with test collection only (--co flag) are 21.85 seconds (43669 tests) vs 314.01 seconds (79603 tests).

These are my times. But I wonder why mine are so much faster than yours!

Idea: I don't have pytest-xdist or other plugins installed, so I disabled those in the configuration (setup.cfg). Can you try that once?
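
For a quick comparison without editing setup.cfg, plugins can also be disabled per invocation with pytest's -p no:<plugin> flag, e.g. for pytest-xdist:

python -m pytest sktime/forecasting --co -q -p no:xdist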

@yarnabrina
Member

> I also note how strange this is - isn't 600s the standard timeout for pytest collection?

Yes, it's configured in setup.cfg. My guess is that it applies to test execution, not to collection - I don't know how to check that.
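
For reference, assuming the timeout comes from the pytest-timeout plugin, the setting in setup.cfg looks roughly like this (the exact options in sktime's setup.cfg may differ); pytest-timeout documents its timeout as applying per test, which would be consistent with collection not being covered:

[tool:pytest]
addopts =
    --timeout 600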

@fkiraly
Collaborator Author

fkiraly commented Apr 28, 2024

That is very odd. 20 sec is more in line with the times from this older issue: #4900

@fkiraly
Collaborator Author

fkiraly commented Apr 28, 2024

Should we move the test times discussion to #6344?

yarnabrina added a commit that referenced this issue May 9, 2024
Fixes #6352

This PR adds coverage of `sktime/tests` for the per-module CI workflows.