[BUG] dealing with sklearn 1.2 deprecation warnings and solving the first part #2190

Merged
merged 2 commits into from Mar 10, 2022

hmtbgc
Contributor

@hmtbgc hmtbgc commented Mar 10, 2022

Reference Issues/PRs

Fixes the first part of #2143: replacing float variable (column) names with str in the pandas.DataFrame that is passed to scikit-learn.

What does this implement/fix? Explain your changes.

I traced the warning to summarize.py, where quantiles are represented as "float" while the other features are represented as "str"; this mix of types is the root cause of the warning. I therefore convert these float names to strings.
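
A minimal sketch of the idea behind the fix, not the actual code in summarize.py. The helper name `stringify_feature_names` and the example labels are assumptions made for illustration only:

```python
def stringify_feature_names(names):
    """Return the given column labels as a list of str.

    Hypothetical helper for illustration; not the code from summarize.py.
    """
    return [name if isinstance(name, str) else str(name) for name in names]


# Illustrative mixed labels: named summaries (str) plus quantiles (float).
mixed = ["mean", "std", 0.25, 0.5, 0.75]
clean = stringify_feature_names(mixed)
print(clean)  # ['mean', 'std', '0.25', '0.5', '0.75']
```

With a pandas.DataFrame `df`, the same effect could be had with `df.columns = stringify_feature_names(df.columns)` or `df.columns = df.columns.astype(str)`.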

Does your contribution introduce a new dependency? If yes, which one?

No.

What should a reviewer concentrate their feedback on?

All unit tests in the 'classification' directory pass. Here is the output:

======================================== test session starts ========================================
platform linux -- Python 3.8.12, pytest-7.0.1, pluggy-1.0.0
rootdir: /root/sktime_dev/sktime, configfile: setup.cfg
plugins: cov-3.0.0, xdist-2.5.0, forked-1.4.0
collected 1 item                                                                                    

classification/feature_based/_summary_classifier.py .                                         [100%]

========================================= 1 passed in 4.14s ======================================                       


(sktime-dev) root@x86_64-conda-linux-gnu:sktime/sktime ‹main*›# pytest classification/feature_based/tests/test_summary_classifier.py
================================================== test session starts ===================================================
platform linux -- Python 3.8.12, pytest-7.0.1, pluggy-1.0.0
rootdir: /root/sktime_dev/sktime, configfile: setup.cfg
plugins: cov-3.0.0, xdist-2.5.0, forked-1.4.0
collected 2 items                                                                                                        

classification/feature_based/tests/test_summary_classifier.py ..                                                   [100%]

=================================================== 2 passed in 4.80s ====================================================
(sktime-dev) root@x86_64-conda-linux-gnu:sktime/sktime ‹main*›# pytest classification/tests/test_all_classifiers.py
================================================== test session starts ===================================================
platform linux -- Python 3.8.12, pytest-7.0.1, pluggy-1.0.0
rootdir: /root/sktime_dev/sktime, configfile: setup.cfg
plugins: cov-3.0.0, xdist-2.5.0, forked-1.4.0
collected 90 items                                                                                                       

classification/tests/test_all_classifiers.py ..................................................................... [ 76%]
.....................                                                                                              [100%]

================================================== 90 passed in 44.68s ===================================================

For tests/test_all_estimators.py, here is the output:

==================================================== warnings summary ====================================================
sktime/tests/test_all_estimators.py: 14 warnings
  /root/miniconda/envs/sktime-dev/lib/python3.8/site-packages/scipy/stats/morestats.py:897: RuntimeWarning: invalid value encountered in log
    logdata = np.log(data)

sktime/tests/test_all_estimators.py: 14 warnings
  /root/miniconda/envs/sktime-dev/lib/python3.8/site-packages/scipy/stats/morestats.py:906: RuntimeWarning: invalid value encountered in power
    variance = np.var(data**lmb / lmb, axis=0)

sktime/tests/test_all_estimators.py: 64 warnings
  /root/sktime_dev/sktime/sktime/performance_metrics/forecasting/_functions.py:1543: FutureWarning: In the percentage error metric functions the default argument symmetric=True is changing to symmetric=False in v0.12.0.
    warn(

sktime/tests/test_all_estimators.py: 16 warnings
  /root/miniconda/envs/sktime-dev/lib/python3.8/site-packages/sklearn/decomposition/_pca.py:499: RuntimeWarning: invalid value encountered in true_divide
    explained_variance_ = (S ** 2) / (n_samples - 1)

sktime/tests/test_all_estimators.py::TestAllEstimators::test_fit_updates_state[Prophet-ForecasterFitPredictUnivariateNoX]
  /root/miniconda/envs/sktime-dev/lib/python3.8/site-packages/pyximport/pyximport.py:51: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses
    import imp

sktime/tests/test_all_estimators.py: 28 warnings
  /root/sktime_dev/sktime/sktime/transformations/series/boxcox.py:377: RuntimeWarning: invalid value encountered in power
    x_ratio = x_std / x_mean ** (1 - lmb)

sktime/tests/test_all_estimators.py::TestAllEstimators::test_fit_updates_state[TimeSeriesKMeans-ClustererFitPredict]
  /root/miniconda/envs/sktime-dev/lib/python3.8/site-packages/tslearn/utils/utils.py:156: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
  Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
    def to_time_series_dataset(dataset, dtype=numpy.float):

sktime/tests/test_all_estimators.py::TestAllEstimators::test_fit_updates_state[TimeSeriesKMeans-ClustererFitPredict]
  /root/miniconda/envs/sktime-dev/lib/python3.8/site-packages/tslearn/utils/cast.py:15: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
  Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
    def to_sklearn_dataset(dataset, dtype=numpy.float, return_dim=False):

sktime/tests/test_all_estimators.py::TestAllEstimators::test_fit_updates_state[TimeSeriesKMeans-ClustererFitPredict]
  /root/miniconda/envs/sktime-dev/lib/python3.8/site-packages/tslearn/metrics/utils.py:9: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
  Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
    compute_diagonal=True, dtype=numpy.float, *args, **kwargs):

sktime/tests/test_all_estimators.py::TestAllEstimators::test_fit_updates_state[TimeSeriesKMeans-ClustererFitPredict]
  /root/miniconda/envs/sktime-dev/lib/python3.8/site-packages/tslearn/metrics/sax.py:2: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
  Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
    from .cysax import cydist_sax

sktime/tests/test_all_estimators.py::TestAllEstimators::test_fit_updates_state[TimeSeriesKMeans-ClustererFitPredict]
  /root/miniconda/envs/sktime-dev/lib/python3.8/site-packages/tslearn/metrics/sax.py:2: DeprecationWarning: `np.int` is a deprecated alias for the builtin `int`. To silence this warning, use `int` by itself. Doing this will not modify any behavior and is safe. When replacing `np.int`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.
  Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
    from .cysax import cydist_sax

sktime/tests/test_all_estimators.py::TestAllEstimators::test_fit_updates_state[TimeSeriesKMeans-ClustererFitPredict]
  /root/miniconda/envs/sktime-dev/lib/python3.8/site-packages/tslearn/metrics/__init__.py:24: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
  Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
    from .cycc import cdist_normalized_cc, y_shifted_sbd_vec

sktime/tests/test_all_estimators.py: 50137 warnings
  /root/miniconda/envs/sktime-dev/lib/python3.8/site-packages/tslearn/utils/utils.py:149: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
  Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
    if ts_out.dtype != numpy.float:

sktime/tests/test_all_estimators.py::TestAllEstimators::test_fit_idempotent[LogTransformer-TransformerFitTransformPanelUnivariateWithClassY]
sktime/tests/test_all_estimators.py::TestAllEstimators::test_methods_do_not_change_state[LogTransformer-TransformerFitTransformPanelUnivariateWithClassY]
sktime/tests/test_all_estimators.py::TestAllEstimators::test_methods_have_no_side_effects[LogTransformer-TransformerFitTransformPanelUnivariateWithClassY]
sktime/tests/test_all_estimators.py::TestAllEstimators::test_persistence_via_pickle[LogTransformer-TransformerFitTransformPanelUnivariateWithClassY]
  /root/sktime_dev/sktime/sktime/transformations/series/boxcox.py:250: RuntimeWarning: invalid value encountered in log
    Xt = np.log(X)

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
============================= 3679 passed, 243 skipped, 50284 warnings in 402.83s (0:06:42) ==============================

The output no longer contains the warning "FutureWarning: Feature names only support names that are all strings. Got feature names with dtypes: ['float', 'str']. An error will be raised in 1.2."
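
Rather than reading the warnings summary by eye, the absence of this warning could also be asserted in a test. The following is only a sketch using the standard `warnings` module; `check_names` is a hypothetical stand-in for the library-side check, and `feature_name_warnings` is an assumed helper name, neither of which exists in sktime or scikit-learn:

```python
import warnings

# Message prefix taken from the warning quoted above.
FEATURE_NAME_MSG = "Feature names only support names that are all strings"


def check_names(names):
    """Hypothetical stand-in for the library check: warn on mixed str/non-str labels."""
    has_str = any(isinstance(n, str) for n in names)
    all_str = all(isinstance(n, str) for n in names)
    if has_str and not all_str:
        warnings.warn(FEATURE_NAME_MSG, FutureWarning)


def feature_name_warnings(func, *args):
    """Call func(*args) and return the feature-name FutureWarnings it emitted."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # record every warning, even repeats
        func(*args)
    return [
        w
        for w in caught
        if issubclass(w.category, FutureWarning)
        and FEATURE_NAME_MSG in str(w.message)
    ]


before = feature_name_warnings(check_names, ["mean", 0.25])
after = feature_name_warnings(check_names, ["mean", "0.25"])
print(len(before), len(after))  # 1 0
```

In a real test, `check_names` would be replaced by a call that fits the estimator under test.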

Any other comments?

The remaining warnings listed above still need to be addressed.

PR checklist

For all contributions
  • I've added myself to the list of contributors.
  • Optionally, I've updated sktime's CODEOWNERS to receive notifications about future changes to these files.
  • I've added unit tests and made sure they pass locally.
For new estimators
  • I've added the estimator to the online documentation.
  • I've updated the existing example notebooks or provided a new one to showcase how my estimator works.

Contributor

@TonyBagnall TonyBagnall left a comment

looks good to me, thanks.

@TonyBagnall TonyBagnall merged commit 297588c into sktime:main Mar 10, 2022
@lmmentel lmmentel added the bugfix Fixes a known bug or removes unintended behavior label Mar 21, 2022