Unable to handle "nans" #66

Jakobhenningjensen · 2018-12-11T12:18:21Z

Description

When the dataset contains NaNs, fitting seems to fail. When using auto.arima in R, these are handled/omitted.

Steps/Code to Reproduce

Expected Results

Omit NaNs

Actual Results

Raises a "ValueError"

Versions

Windows-7-6.1.7601-SP1
Python 3.6.7 |Anaconda custom (64-bit)| (default, Oct 28 2018, 19:44:12) [MSC v.1915 64 bit (AMD64)]
pmdarima 1.0.0
NumPy 1.15.4
SciPy 1.1.0
Scikit-Learn 0.20.0
Statsmodels 0.9.0

tgsmith61591 · 2018-12-11T12:51:59Z

First off, thanks for the well-formatted issue. I understand the frustration around NaNs; R handles them implicitly (albeit in an almost black-box fashion), while most Python libraries make you, the programmer, address them first.

The error comes from the call to scikit-learn's check_array, which does not tolerate NaNs. The best strategy for now is to impute them yourself, and if scikit-learn moves towards handling NaNs, it might make sense for us to as well.
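As a rough illustration of "impute them yourself", here is a minimal sketch that fills NaNs by linear interpolation using only NumPy. The helper name `fill_nans_linear` and the sample series are hypothetical, and linear interpolation is just one possible choice; whether it suits a given series (versus dropping the points or using a seasonal-aware method) is a judgment call.

```python
import numpy as np


def fill_nans_linear(y):
    """Return a float copy of y with NaNs filled by linear interpolation.

    Hypothetical helper for illustration; not part of pmdarima.
    """
    y = np.asarray(y, dtype=float).copy()
    missing = np.isnan(y)
    if missing.all():
        raise ValueError("series contains only NaNs")
    idx = np.arange(y.size)
    # np.interp interpolates between observed points; leading or trailing
    # NaNs are filled with the nearest observed value.
    y[missing] = np.interp(idx[missing], idx[~missing], y[~missing])
    return y


# Made-up example series with gaps.
series = [1.0, np.nan, 3.0, 4.0, np.nan, np.nan, 7.0]
clean = fill_nans_linear(series)
```

The cleaned array contains no NaNs, so it should get past the check_array validation and can be handed to `pmdarima.auto_arima(clean)`.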

tgsmith61591 · 2018-12-17T23:32:04Z

Closing since this is currently a "wontfix", but I will leave the backlog tag in case scikit-learn decides to support this behavior later.

tgsmith61591 added the backlog label Dec 11, 2018

tgsmith61591 closed this as completed Dec 17, 2018
