[BUG] Fixing dockerized tests #5426
Conversation
It seems that you made the exact same change I suggested, but I currently don't have a way to test whether this will solve the docker issues you had earlier (I don't have docker set up on my personal laptop). The changes look okay and I'll approve, subject to someone testing them on docker images.
I did run them, but it takes too long.

Edit: I ran the test using all resources on my laptop; it ran fine till about 15%, then workers started crashing.
I don't think I'm aware of how these images are expected to be used. Current CI takes well over 3 hours (e.g. https://github.com/sktime/sktime/actions/runs/6443020075). Depending on your laptop's specifications and how much resource you allocated to docker, crashing is probably not very surprising.

(By the way, trying in CI was a nice idea, but it would probably need some extra rights that you won't have in sktime GHA. You may try in your fork, though it still needs some more tinkering; I never tried, so I don't know for sure. However, a small piece of feedback: building on every push in all branches and on every PR seems a bit too frequent.)

@fkiraly @achieveordie can you tell how these docker images are intended to be used by users? Is it just for the purpose of giving new developers a working environment with everything pre-installed, or are these released and maintained in any way? In the former case, if @kurayami07734 checks basic usage and one of us confirms, I think that's enough. In the latter case, if this gets published on dockerhub or similar, then we need to find a better approach to test updates in docker, unless we assume CI testing on
@yarnabrina sktime/classification/tests/test_sklearn_compatability.py::test_sklearn_composite_classifiers[composite_classifier2-data_args1]
/usr/local/lib/python3.8/site-packages/sklearn/calibration.py:300: FutureWarning: `base_estimator` was renamed to `estimator` in version 1.2 and will be removed in 1.4.
warnings.warn(
sktime/benchmarking/tests/test_forecasting.py::test_forecastingbenchmark[expected_results_df1-scorers1]
sktime/benchmarking/tests/test_forecasting.py::test_forecastingbenchmark[expected_results_df0-scorers0]
/usr/src/sktime/sktime/benchmarking/forecasting.py:38: DeprecationWarning: Starting v0.25.0 model_evaluation.evaluate module will rearrange all metric columns to the left of its output result DataFrame. Please use loc references when addressing the columns. You can safely ignore this warning if you don't use evaluate function directly.
scores_df = evaluate(forecaster=estimator, y=y, cv=cv_splitter, scoring=scorers)
-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
---------- coverage: platform linux, python 3.8.16-final-0 -----------
Coverage HTML written to dir htmlcov
Coverage XML written to file coverage.xml
============================================================================================================ slowest 20 durations =============================================================================================================
128.77s call sktime/forecasting/tests/test_all_forecasters.py::TestAllForecasters::test_update_predict_predicted_index[BATS-1-y:2cols-update_params=True-1-fh=1]
92.56s call sktime/forecasting/tests/test_all_forecasters.py::TestAllForecasters::test_update_predict_predicted_index[BATS-1-y:2cols-update_params=True-1-fh=[2 5]]
88.56s call sktime/forecasting/tests/test_all_forecasters.py::TestAllForecasters::test_update_predict_predicted_index[BATS-1-y:2cols-update_params=True-5-fh=1]
83.41s call sktime/forecasting/tests/test_all_forecasters.py::TestAllForecasters::test_hierarchical_with_exogeneous[BATS-1-y:2cols]
60.23s call sktime/forecasting/tests/test_all_forecasters.py::TestAllForecasters::test_update_predict_predicted_index[BATS-1-y:1cols-update_params=True-1-fh=1]
52.18s call sktime/forecasting/tests/test_all_forecasters.py::TestAllForecasters::test_update_predict_predicted_index[BATS-1-y:2cols-update_params=True-5-fh=[2 5]]
51.10s call sktime/classification/early_classification/tests/test_teaser.py::test_teaser_full_length
50.75s call sktime/datasets/tests/test_single_problem_loaders.py::test_load_UEA
45.11s call sktime/forecasting/tests/test_all_forecasters.py::TestAllForecasters::test_hierarchical_with_exogeneous[BATS-1-y:1cols]
41.03s call sktime/forecasting/tests/test_all_forecasters.py::TestAllForecasters::test_update_predict_predicted_index[BATS-1-y:1cols-update_params=True-5-fh=1]
41.00s call sktime/forecasting/tests/test_all_forecasters.py::TestAllForecasters::test_update_predict_predicted_index[BATS-1-y:1cols-update_params=True-1-fh=[2 5]]
32.82s call sktime/classification/hybrid/tests/test_hivecote_v2.py::test_contracted_hivecote_v2
31.96s call sktime/clustering/tests/test_k_shapes.py::test_kshapes
30.92s call sktime/transformations/series/tests/test_clear_sky.py::test_clearsky_trafo_vals
28.90s call sktime/tests/test_all_estimators.py::TestAllEstimators::test_non_state_changing_method_contract[StatsForecastMSTL-2-ForecasterFitPredictUnivariateWithX-get_fitted_params]
25.82s call sktime/utils/tests/test_check_estimator.py::test_check_estimator_does_not_raise[ForecastKnownValues]
25.79s call sktime/utils/tests/test_check_estimator.py::test_check_estimator_passed[ForecastKnownValues]
25.62s call sktime/forecasting/tests/test_all_forecasters.py::TestAllForecasters::test_predict_time_index[StatsForecastAutoCES-1-y:2cols-fh=[2 5]-datetime-datetime-False]
25.30s call sktime/forecasting/model_evaluation/tests/test_evaluate.py::test_evaluate_bigger_X[AutoARIMA]
24.22s call sktime/forecasting/tests/test_conformal.py::test_conformal_with_hierarchical
=========================================================================================================== short test summary info ===========================================================================================================
FAILED sktime/datasets/tests/test_single_problem_loaders.py::test_check_link_downloadable[m1_quarterly_dataset] - urllib.error.HTTPError: HTTP Error 500: INTERNAL SERVER ERROR
====================================================================== 1 failed, 38254 passed, 874 skipped, 13 xfailed, 10 xpassed, 2625 warnings in 10640.72s (2:57:20) ======================================================================
make: *** [Makefile:38: full_test] Error 1
make: *** [Makefile:104: dockertest] Error 2
I won't worry much about the HTTP error; it has been happening for a few days in the CI. But if you are running full tests, I wonder why 874 tests are skipped. Isn't that counterintuitive? Do you see any patterns in the skipped tests? (Also, the number of tests appears to be far less than what I expected. I was expecting around 84463 tests, which I got by running
Just in case you missed this, tagging once more @fkiraly @achieveordie.
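As an aside on how a collected-test count like that can be obtained without actually running the suite: `pytest --collect-only -q` prints one test id per line. The snippet below demonstrates this on a throwaway test file (the file and path are made up for illustration; in the sktime repo you would point pytest at `sktime/` with whatever `-m`/`-k` filters the Makefile's `full_test` target uses):

```shell
# Create a small demo test file outside any real project.
mkdir -p /tmp/collect_demo
cat > /tmp/collect_demo/test_demo.py <<'EOF'
def test_a():
    assert True

def test_b():
    assert True
EOF

# --collect-only lists tests without executing them; -q prints one
# "file::test" id per line, so counting lines containing '::' counts tests.
pytest --collect-only -q /tmp/collect_demo/test_demo.py | grep -c '::'
# prints: 2
```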
=========================== short test summary info ============================
SKIPPED [1] sktime/classification/early_classification/tests/test_all_early_classifiers.py: got empty parameter set ['estimator_instance'], function test_multivariate_input_exception at /usr/src/sktime/sktime/classification/early_classification/tests/test_all_early_classifiers.py:36
SKIPPED [1] sktime/classification/early_classification/tests/test_all_early_classifiers.py: got empty parameter set ['estimator_instance', 'scenario'], function test_classifier_output at /usr/src/sktime/sktime/classification/early_classification/tests/test_all_early_classifiers.py:41
SKIPPED [1] sktime/tests/test_softdeps.py:209: got empty parameter set ['estimator'], function test_python_error at /usr/src/sktime/sktime/tests/test_softdeps.py:208
SKIPPED [1] sktime/proba/tests/test_proba_basic.py:11: skip test if required soft dependency is not available
SKIPPED [2] sktime/utils/tests/test_mlflow_sktime_model_export.py:392: skip test if required soft dependency not available
SKIPPED [4] sktime/utils/tests/test_mlflow_sktime_model_export.py:357: skip test if required soft dependency not available
SKIPPED [4] sktime/utils/tests/test_mlflow_sktime_model_export.py:415: skip test if required soft dependency not available
SKIPPED [2] sktime/utils/tests/test_mlflow_sktime_model_export.py:119: skip test if required soft dependency not available
SKIPPED [1] sktime/utils/tests/test_mlflow_sktime_model_export.py:302: skip test if required soft dependency not available
SKIPPED [4] sktime/utils/tests/test_mlflow_sktime_model_export.py:489: skip test if required soft dependency not available
SKIPPED [1] sktime/utils/tests/test_mlflow_sktime_model_export.py:463: skip test if required soft dependency not available
SKIPPED [1] sktime/utils/tests/test_mlflow_sktime_model_export.py:695: skip test if required soft dependency not available
SKIPPED [1] sktime/utils/tests/test_mlflow_sktime_model_export.py:588: skip test if required soft dependency not available
SKIPPED [1] sktime/utils/tests/test_mlflow_sktime_model_export.py:650: skip test if required soft dependency not available
SKIPPED [2] sktime/utils/tests/test_mlflow_sktime_model_export.py:96: skip test if required soft dependency not available
SKIPPED [1] sktime/utils/tests/test_mlflow_sktime_model_export.py:532: skip test if required soft dependency not available
SKIPPED [1] sktime/utils/tests/test_mlflow_sktime_model_export.py:671: skip test if required soft dependency not available
SKIPPED [1] sktime/utils/tests/test_mlflow_sktime_model_export.py:225: skip test if required soft dependency not available
SKIPPED [1] sktime/utils/tests/test_mlflow_sktime_model_export.py:630: skip test if required soft dependency not available
SKIPPED [1] sktime/utils/tests/test_mlflow_sktime_model_export.py:564: skip test if required soft dependency not available
SKIPPED [1] sktime/utils/tests/test_mlflow_sktime_model_export.py:171: skip test if required soft dependency not available
SKIPPED [1] sktime/utils/tests/test_mlflow_sktime_model_export.py:609: skip test if required soft dependency not available
SKIPPED [1] sktime/utils/tests/test_mlflow_sktime_model_export.py:279: skip test if required soft dependency not available
SKIPPED [696] sktime/tests/test_all_estimators.py:1432: hangs on mac and unix remote tests
SKIPPED [72] sktime/forecasting/model_selection/tests/test_tune.py:246: run test only if softdeps are present and incrementally (if requested)
SKIPPED [1] sktime/forecasting/model_selection/tests/test_tune.py:286: run test only if softdeps are present and incrementally (if requested)
SKIPPED [54] sktime/forecasting/base/tests/test_fh.py:71: steps and fh_type are incompatible
SKIPPED [15] ../../local/lib/python3.8/site-packages/_pytest/doctest.py:456: all tests skipped by +SKIP option
SKIPPED [1] sktime/dists_kernels/tests/test_all_dist_kernels.py: got empty parameter set ['estimator_instance', 'scenario'], function test_pairwise_transformers_tabular at /usr/src/sktime/sktime/dists_kernels/tests/test_all_dist_kernels.py:29
=========== 38255 passed, 874 skipped, 13 xfailed, 10 xpassed, 115 warnings in 12703.16s (3:31:43) ===========
Apologies for having missed the tags. I was not involved during this development, but from my perspective it seems like this is meant to act as a developer environment rather than something an end user would use.
@kurayami07734 It is very surprising to me to see this many skips caused by missing soft dependencies, if you have installed all the soft dependencies as part of the docker build. The significant mismatch in the number of tests is very odd as well. However, if it's just for development and not for end users directly, I'm satisfied with all the changes as of now. That's because a lot is coming from
Fixes #5352
Changes
What should a reviewer concentrate their feedback on?
I was thinking that to have similar functionality to `only_changed_modules` we could do a `diff` on a `curl` from the `origin/main`
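A rough sketch of that idea, assuming a plain `git diff` against `origin/main` rather than `curl` (the file paths below are illustrative, not taken from a real diff):

```shell
# In a real run this list would come from:
#   git fetch origin main && git diff --name-only origin/main -- 'sktime/**/*.py'
# Here it is hard-coded so the mapping step can be shown on its own.
changed_files="sktime/forecasting/base/_fh.py
sktime/classification/kernel_based/_rocket_classifier.py
sktime/forecasting/trend.py"

# Map each changed source file to its containing module directory
# (strip the final path component) and de-duplicate, so each affected
# module's tests would be run at most once.
echo "$changed_files" | sed 's|/[^/]*$||' | sort -u
# prints:
# sktime/classification/kernel_based
# sktime/forecasting
# sktime/forecasting/base
```

The resulting directories could then be passed to `pytest` so that only tests for the changed modules run, which is one plausible way to approximate `only_changed_modules` behaviour.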