Unable to handle "nans" #66

Closed
Jakobhenningjensen opened this issue Dec 11, 2018 · 2 comments

@Jakobhenningjensen

Description

When the dataset contains NaNs, it seems to fail. When using auto.arima in R, these are handled/omitted.

Steps/Code to Reproduce

Expected Results

Omit NaNs

Actual Results

Raises a "ValueError"

Versions

Windows-7-6.1.7601-SP1
Python 3.6.7 |Anaconda custom (64-bit)| (default, Oct 28 2018, 19:44:12) [MSC v.1915 64 bit (AMD64)]
pmdarima 1.0.0
NumPy 1.15.4
SciPy 1.1.0
Scikit-Learn 0.20.0
Statsmodels 0.9.0

@tgsmith61591
Member

First off, thanks for the well-formatted issue. I understand the frustration around NaNs; R handles them implicitly (albeit in an almost black-box fashion), while most Python libraries make you, the programmer, address them first.

The error comes from the call to scikit-learn's check_array, which does not tolerate NaNs. The best strategy for now is to impute them yourself; if scikit-learn moves towards handling NaNs, it might make sense for us to as well.
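
For anyone landing here, a minimal sketch of what "impute them yourself" could look like before handing the series to the model. This is only an illustration, not something pmdarima ships: the helper names `drop_nans` and `interpolate_nans` are hypothetical, and linear interpolation is just one possible imputation strategy, assumed here for simplicity.

```python
import math


def drop_nans(values):
    """Return the observed values only.

    Note: this changes the spacing between observations in a time series.
    """
    return [float(v) for v in values if not math.isnan(v)]


def interpolate_nans(values):
    """Return a copy of `values` with NaNs filled by linear interpolation.

    Leading and trailing NaNs take the nearest observed value.
    (Hypothetical helper for illustration; not part of pmdarima.)
    """
    filled = [float(v) for v in values]
    # Indices of the observations that are actually present.
    known = [i for i, v in enumerate(filled) if not math.isnan(v)]
    if not known:
        raise ValueError("series contains no observed values")

    for i, v in enumerate(filled):
        if not math.isnan(v):
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = filled[right]   # leading gap: back-fill
        elif right is None:
            filled[i] = filled[left]    # trailing gap: forward-fill
        else:
            # Interior gap: weight the two nearest observed neighbours.
            w = (i - left) / (right - left)
            filled[i] = filled[left] + w * (filled[right] - filled[left])
    return filled


series = [1.0, float("nan"), 3.0, float("nan"), float("nan"), 6.0, float("nan")]
clean = interpolate_nans(series)

# `clean` no longer contains NaNs, so it can be passed on, e.g.:
#   import pmdarima as pm
#   model = pm.auto_arima(clean)
```

Dropping the missing points is closest to the "omit" behaviour asked for above, but it shifts the remaining observations together, which may matter for seasonal data; interpolation keeps the original spacing. If the data already lives in pandas, `Series.dropna()` and `Series.interpolate()` cover the same two options.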

@tgsmith61591
Member

Closing since this is currently a "wontfix," but I'll leave the backlog tag in case scikit-learn decides to support this behavior later.
