Disregard NaNs in TimeSeriesScalerMeanVariance #175

eliwoods · 2020-01-07T02:58:32Z

If a time series containing any NaN values is passed to TimeSeriesScalerMeanVariance.fit_transform(), the transformed time series comes back as all NaN.

E.g.:

In [1]: from tslearn.preprocessing import TimeSeriesScalerMeanVariance

In [2]: from numpy import nan

In [3]: TimeSeriesScalerMeanVariance().fit_transform([0, nan, 6])
Out[3]:
array([[[nan],
        [nan],
        [nan]]])

This could be fixed by using numpy.nanmean() and numpy.nanstd() in TimeSeriesScalerMeanVariance.transform instead of numpy.mean() and numpy.std(). It would also bring tslearn in line with scikit-learn's handling of NaNs in preprocessing: scikit-learn/scikit-learn#10404
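To make the proposal concrete, here is a minimal sketch of NaN-aware mean-variance scaling. It is not the actual tslearn code: the function name `nan_mean_variance_scale` is hypothetical, and the sketch assumes the data is laid out as (n_ts, sz, d) with time on axis 1, as in the output shown above.

```python
import numpy as np


def nan_mean_variance_scale(X, mu=0.0, std=1.0):
    """Hypothetical helper: scale each series to mean `mu` and standard
    deviation `std`, ignoring NaN entries when computing the statistics."""
    X = np.asarray(X, dtype=float)
    # Assumption: X has shape (n_ts, sz, d), so axis 1 is the time axis.
    mean_t = np.nanmean(X, axis=1, keepdims=True)
    std_t = np.nanstd(X, axis=1, keepdims=True)
    # Constant series have zero std; leave them unscaled instead of dividing by 0.
    std_t[std_t == 0.0] = 1.0
    return (X - mean_t) / std_t * std + mu


X = np.array([0.0, np.nan, 6.0]).reshape(1, 3, 1)
scaled = nan_mean_variance_scale(X)
print(scaled[0, :, 0])  # the NaN stays NaN; the other values are scaled
```

With this approach the NaN entries remain NaN in the output, but they no longer contaminate the values computed for the rest of the series.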

MAINT Include test folder in wheels

MAINT More fixes to include test/ folder in sdist

Use np.nanmean and np.nanstd instead of np.mean and np.std

codecov-io · 2020-01-07T03:11:46Z

Codecov Report

Merging #175 into dev will decrease coverage by 0.28%.
The diff coverage is 76.19%.

@@            Coverage Diff             @@
##              dev     #175      +/-   ##
==========================================
- Coverage   93.97%   93.69%   -0.29%     
==========================================
  Files          22       22              
  Lines        3039     2585     -454     
==========================================
- Hits         2856     2422     -434     
+ Misses        183      163      -20

| Impacted Files | Coverage Δ | |
|---|---|---|
| tslearn/__init__.py | `75% <100%> (ø)` | ⬆️ |
| tslearn/preprocessing.py | `91.93% <100%> (ø)` | ⬆️ |
| tslearn/tests/test_shapelets.py | `100% <100%> (ø)` | ⬆️ |
| tslearn/tests/test_estimators.py | `89.28% <58.33%> (-5.59%)` | ⬇️ |
| tslearn/tests/sklearn_patches.py | `88.6% <0%> (-2.85%)` | ⬇️ |
| tslearn/shapelets.py | `91.2% <0%> (-2.58%)` | ⬇️ |
| tslearn/tests/test_metrics.py | `100% <0%> (ø)` | ⬆️ |
| tslearn/tests/test_barycenters.py | `100% <0%> (ø)` | ⬆️ |
| tslearn/svm.py | `98.43% <0%> (ø)` | ⬆️ |

... and 6 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 40cf0c9...2996607.

rtavenar · 2020-01-07T08:14:27Z

Hi @eliwoods

Thanks for the suggestion, this makes a lot of sense.

I would suggest the following improvements:

- Do the same for TimeSeriesScalerMinMax
- Add doctest examples to make the behaviour explicit for end-users
- Change the destination branch to dev such that your work can be included in the next tslearn version to come

Once all this is done, one last thing would be to document your contribution in CHANGELOG.md.
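For illustration, the TimeSeriesScalerMinMax counterpart could follow the same pattern with numpy.nanmin() and numpy.nanmax(). The sketch below is an assumption about what such a change might look like, not the code that was merged: `nan_min_max_scale` and its `value_range` parameter are hypothetical names, and time is again assumed to be on axis 1.

```python
import numpy as np


def nan_min_max_scale(X, value_range=(0.0, 1.0)):
    """Hypothetical helper: rescale each series into `value_range`,
    ignoring NaN entries when computing the minimum and maximum."""
    X = np.asarray(X, dtype=float)
    low, high = value_range
    # Assumption: X has shape (n_ts, sz, d), so axis 1 is the time axis.
    min_t = np.nanmin(X, axis=1, keepdims=True)
    max_t = np.nanmax(X, axis=1, keepdims=True)
    span = max_t - min_t
    # Constant series have zero span; avoid dividing by 0.
    span[span == 0.0] = 1.0
    return (X - min_t) / span * (high - low) + low


X = np.array([0.0, np.nan, 6.0]).reshape(1, 3, 1)
scaled_mm = nan_min_max_scale(X)
print(scaled_mm[0, :, 0])  # the NaN stays NaN; the other values land in [0, 1]
```

A doctest for the real scaler could use the same `[0, nan, 6]` input so that the NaN-preserving behaviour is visible to end-users.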

pep8speaks · 2020-01-07T21:53:26Z

Hello @eliwoods! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

In the file tslearn/preprocessing.py:

Line 114:80: E501 line too long (87 > 79 characters)

rtavenar and others added 25 commits September 12, 2019 22:02

Update variablelength.rst

affe939

Update variablelength.rst

95a3d3c

Install requirements for PyPI changed

ccca697

Merge branch 'master' of https://github.com/rtavenar/tslearn

eaaf444

Create issue templates

5f90e28

Update issue templates

c0a335a

Update CONTRIBUTING.md

6ff911a

Create CODE_OF_CONDUCT.md

c67438b

C++ build tools details

80b6b5a

Merge branch 'master' of https://github.com/rtavenar/tslearn

6ae9112

Include tests folder in wheels

3675e18

Fix imports

447b9a3

Skip shapelets tests if keras / tensorflow is not installed

b391b2c

Version 0.2.4

2e82ae2

Merge pull request tslearn-team#154 from rth/include-tests-in-wheels

33b27e9

MAINT Include test folder in wheels

Update CHANGELOG.md

6742485

Update shapelets.py

bd7f6e6

Update shapelets.py

b840e5b

Update requirements_rtd.txt

8c0875b

Update requirements_rtd.txt

1254f7e

Update requirements.txt

39ad33b

More fixes to include test/ folder in sdist

86f297f

Version 0.2.5

b7499aa

Merge pull request tslearn-team#159 from rth/more-fixes-setup

1721482

MAINT More fixes to include test/ folder in sdist

Disregard NaNs in TimeSeriesScalerMeanVariance

0d2d05b

Use np.nanmean and np.nanstd instead of np.mean and np.std

eliwoods added 2 commits January 7, 2020 13:15

disregard nans in TimeSeriesScalerMinMax

01f394e

Update docstrings

2996607

eliwoods changed the base branch from master to dev January 7, 2020 21:51

eliwoods mentioned this pull request Jan 7, 2020

[MRG] Disregard NaNs in preprocessing #177

Merged

eliwoods closed this Jan 7, 2020
