add warning for pandas sparse Dataframe in check_array #16021
The added test fails and gives |
Linux pylatest_pip_openblas_pandas passes this test.
This bug was fixed in numpy/numpy#5710.
Is there a pd 0.23 equivalent of pd.api.types.is_sparse, @TomAugspurger? I suppose I'm fine with having this feature limited to recent Pandas.
I'm not sure if there's a method that'll work that far back.
On Tue, Jan 7, 2020 at 11:59 PM Joel Nothman wrote:
Is there a pd 0.23 equivalent of pd.api.types.is_sparse, @TomAugspurger?
I suppose I'm fine with having this feature limited to recent Pandas
Can somebody please guide me to get rid of these failing |
Thank you for the PR @rushabh-v !
Otherwise lgtm
sklearn/utils/validation.py
Outdated
from pandas.api.types import is_sparse
if array.dtypes.apply(is_sparse).any():
    warnings.warn(
        "pandas Dataframe having sparse columns found."
Suggested change:
- "pandas Dataframe having sparse columns found."
+ "pandas.DataFrame having sparse columns found."
Done
sklearn/utils/validation.py
Outdated
if array.dtypes.apply(is_sparse).any():
    warnings.warn(
        "pandas Dataframe having sparse columns found."
        "It will be inflated automatically."
Is "inflated" the formal term? We may use "densified" elsewhere
I have used "densified" there now.
@jnothman Can you review it again for the changes you requested, please?
I don't think another review from me is needed. Please await another core developer's review. Thanks
A few comments, otherwise LGTM thanks!
doc/whats_new/v0.23.rst
Outdated
@@ -131,3 +131,6 @@ Changelog

- |Enhancement| improve error message in :func:`utils.validation.column_or_1d`.
  :pr:`15926` by :user:`Loïc Estève <lesteve>`.
- |Enhancement| add warning in :func:`utils.validation.check_array` for
  pandas sparse Dataframe.
Suggested change:
- pandas sparse Dataframe.
+ pandas sparse DataFrame.
pd = pytest.importorskip('pandas')
# restrict the pd versions < '0.24.0' as they have a bug in is_sparse func
if LooseVersion(pd.__version__) < '0.24.0':
    return
Suggested change:
- return
+ pytest.skip(reason="pandas 0.24+ required.")
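The point of this suggestion is that a bare return makes the test report as passed on old pandas, while a skip makes the missing coverage visible in the test report. A minimal sketch of the same idea follows. It is not the PR's test: it assumes `pytest.importorskip` with its `minversion` argument as an alternative to the explicit version comparison, and `warn_on_sparse_columns` is a hypothetical stand-in for the check under review, using an `isinstance` test against `pd.SparseDtype` rather than `is_sparse`.

```python
import warnings

import pytest


def warn_on_sparse_columns(frame):
    """Hypothetical stand-in for the check under review (not sklearn code)."""
    import pandas as pd

    # Assumption: detect sparse columns via their dtype class.
    if any(isinstance(dtype, pd.SparseDtype) for dtype in frame.dtypes):
        warnings.warn(
            "pandas.DataFrame with sparse columns found. "
            "It will be converted to a dense numpy array."
        )


def test_sparse_frame_warns():
    # Reported as skipped (not passed) when pandas is missing or older
    # than 0.24.0.
    pd = pytest.importorskip("pandas", minversion="0.24.0")
    frame = pd.DataFrame({"a": pd.arrays.SparseArray([0, 0, 1])})
    with pytest.warns(UserWarning, match="sparse columns"):
        warn_on_sparse_columns(frame)
```

Run under pytest, the test is either executed or listed as skipped with a reason, so a run against old pandas is not mistaken for a passing check.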
sklearn/utils/validation.py
Outdated
from pandas.api.types import is_sparse
if array.dtypes.apply(is_sparse).any():
    warnings.warn(
        "pandas.DataFrame having sparse columns found."
Suggested change:
- "pandas.DataFrame having sparse columns found."
+ "pandas.DataFrame with sparse columns found."
sklearn/utils/validation.py
Outdated
if array.dtypes.apply(is_sparse).any():
    warnings.warn(
        "pandas.DataFrame having sparse columns found."
        "It will be densified automatically."
Suggested change:
- "It will be densified automatically."
+ "It will be converted to a dense numpy array."
Made all the changes
Thanks!
Fixes #15976
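For readers who want to try the behaviour discussed in this thread, the following is a rough standalone sketch of the check with the reviewers' final wording applied. It is not the code merged into `check_array`: `warn_if_sparse_frame` is a hypothetical helper name, and the sketch swaps the `is_sparse` call from the diff for an `isinstance` check against `pd.SparseDtype`, on the assumption that this is easier to run on current pandas releases. It also adds a space between the two message fragments, since adjacent string literals are concatenated as-is.

```python
import warnings

import pandas as pd


def warn_if_sparse_frame(frame):
    """Hypothetical helper: warn when any column of `frame` is sparse."""
    # Assumption: an isinstance check on each column dtype stands in for
    # pandas.api.types.is_sparse, which the PR diff uses.
    has_sparse = any(
        isinstance(dtype, pd.SparseDtype) for dtype in frame.dtypes
    )
    if has_sparse:
        warnings.warn(
            # The first literal ends with a space so the two sentences
            # do not run together after concatenation.
            "pandas.DataFrame with sparse columns found. "
            "It will be converted to a dense numpy array."
        )
    return has_sparse


# One sparse column and one ordinary dense column.
df = pd.DataFrame(
    {"a": pd.arrays.SparseArray([0, 0, 1]), "b": [1.0, 2.0, 3.0]}
)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    flagged = warn_if_sparse_frame(df)
```

A frame with only dense columns passes through without a warning, so callers who never use sparse dtypes should see no change.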