ENH: missing value handling in basic statistics #5416

Open
josef-pkt opened this issue Dec 10, 2018 · 1 comment

Comments

@josef-pkt
Member

josef-pkt commented Dec 10, 2018

Currently we don't have much support for missing values beyond the `missing` option in models, which does rowwise deletion.

#2630 is for improving descriptive statistics.

For univariate statistics like mean and std, including robust statistics like mad or the quantile-based skew and kurtosis, it would be better to have case/elementwise deletion instead of rowwise deletion.

One problem is that simple vectorization will not work anymore if columns have NaNs/missing values in different rows. For simple statistics there are nan-aware functions, but for the others we might need either a mask solution like the one the MaskedArray statistics use (essentially 0/1 weights after replacing the missing values with something finite) or a loop over columns/series.
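
A rough sketch of the options mentioned above, using plain NumPy only. The data and the helper `mad_by_column` are made up for illustration and are not existing statsmodels API; the MAD here is the unscaled median absolute deviation.

```python
import numpy as np

# Made-up data: the NaNs sit in different rows of each column.
x = np.array([[1.0, 10.0],
              [2.0, np.nan],
              [np.nan, 30.0],
              [4.0, 40.0]])

# Rowwise deletion: only rows 0 and 3 are complete, so half the data is lost.
complete = x[~np.isnan(x).any(axis=1)]
mean_rowwise = complete.mean(axis=0)

# Elementwise deletion, option 1: nan-aware reductions where they exist.
mean_nan = np.nanmean(x, axis=0)

# Option 2: 0/1 weights after replacing NaN with something finite (here 0).
w = ~np.isnan(x)
x0 = np.where(w, x, 0.0)
mean_weights = x0.sum(axis=0) / w.sum(axis=0)

# The same idea through MaskedArray.
mean_masked = np.ma.masked_invalid(x).mean(axis=0).filled(np.nan)


def mad_by_column(a):
    """Unscaled median absolute deviation per column, dropping NaNs per column.

    Hypothetical helper: loops over columns because each column keeps a
    different number of observations.
    """
    out = np.full(a.shape[1], np.nan)
    for j in range(a.shape[1]):
        col = a[:, j]
        col = col[~np.isnan(col)]
        if col.size:
            out[j] = np.median(np.abs(col - np.median(col)))
    return out


mad = mad_by_column(x)
```

The loop is the most general fallback, since each column can end up with a different number of observations; the weights/mask version stays vectorized but only works for statistics that can be written as weighted sums.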

@rlucas7
Contributor

rlucas7 commented Dec 11, 2018

This CV post might also be relevant? (in tsa though)
https://stats.stackexchange.com/questions/381378/abbreviations-statsmodels-handling-nan
