New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DEPR: Deprecate NDFrame.as_matrix #18458

Merged
merged 3 commits into from Nov 26, 2017

Conversation

Projects
None yet
3 participants
@topper-123
Contributor

topper-123 commented Nov 23, 2017

  • [x ] xref #18262
  • [x ] passes git diff upstream/master -u -- "*.py" | flake8 --diff
  • [ x] whatsnew entry

Deprecating NDFrame.as_matrix as per discussion in #18262.

@codecov

This comment has been minimized.

codecov bot commented Nov 24, 2017

Codecov Report

Merging #18458 into master will increase coverage by 0.02%.
The diff coverage is 85.71%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #18458      +/-   ##
==========================================
+ Coverage    91.3%   91.32%   +0.02%     
==========================================
  Files         163      163              
  Lines       49781    49783       +2     
==========================================
+ Hits        45451    45464      +13     
+ Misses       4330     4319      -11
Flag Coverage Δ
#multiple 89.12% <85.71%> (+0.02%) ⬆️
#single 40.72% <64.28%> (-0.01%) ⬇️
Impacted Files Coverage Δ
pandas/core/panel.py 96.85% <0%> (-0.29%) ⬇️
pandas/core/internals.py 94.47% <100%> (ø) ⬆️
pandas/core/generic.py 95.78% <83.33%> (+0.05%) ⬆️
pandas/plotting/_converter.py 65.25% <0%> (+1.81%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 38f41e6...48e1fc8. Read the comment docs.

@jorisvandenbossche jorisvandenbossche added this to the 0.22.0 milestone Nov 24, 2017

@jsexauer jsexauer referenced this pull request Nov 24, 2017

Open

DEPR: deprecations from prior versions #6581

0 of 85 tasks complete
@jorisvandenbossche

Thanks for the PR! Left some comments

@@ -82,7 +82,7 @@ Deprecations
~~~~~~~~~~~~
- ``Series.from_array`` and ``SparseSeries.from_array`` are deprecated. Use the normal constructor ``Series(..)`` and ``SparseSeries(..)`` instead (:issue:`18213`).
-
- ``NDFrame.as_matrix`` is deprecated. Use ``NDFrame.values`` instead (:issue:`18458`).

This comment has been minimized.

@jorisvandenbossche

jorisvandenbossche Nov 24, 2017

Member

Can you use DataFrame instead of NDFrame (basically NDFrame should never show up in the docs, it is an internal implementation detail, although there are some places where it leaks through ..)

@@ -3735,6 +3735,9 @@ def _get_bool_data(self):
def as_matrix(self, columns=None):
"""
DEPRECATED: This method will be removed in a future version.
Use :meth:`NDFrame.values` instead.

This comment has been minimized.

@jorisvandenbossche

jorisvandenbossche Nov 24, 2017

Member

same comment here

@@ -3770,6 +3773,8 @@ def as_matrix(self, columns=None):
--------
pandas.DataFrame.values
"""
warnings.warn("This method will be removed in a future version. "
"Use ``.values`` instead.")

This comment has been minimized.

@jorisvandenbossche

jorisvandenbossche Nov 24, 2017

Member

This needs to be a FutureWarning, and you also need to set a stacklevel (this is a top-level method, stacklevel=2 should be the correct one)

This comment has been minimized.

@jorisvandenbossche

jorisvandenbossche Nov 24, 2017

Member

Can you also mention as_matrix explicitly in the message (that is clearer as you don't always directly see what line or what method on a certain line is causing the warning).
And also follow a bit the typical message like "'as_matrix' is deprecated and will be removed in a future version. Use .."

@@ -243,9 +243,9 @@ def test_itertuples(self):
def test_len(self):
assert len(self.frame) == len(self.frame.index)
def test_as_matrix(self):
def test_values(self):

This comment has been minimized.

@jorisvandenbossche

jorisvandenbossche Nov 24, 2017

Member

Can you leave one test (or now add a test_as_matrix_deprecated that copies eg the first case of this test) that uses as_matrix and that asserts it raises a warning and is the same as .values

This comment has been minimized.

@topper-123

topper-123 Nov 24, 2017

Contributor

Ok. I've added in below test_repr_with_mi_nat

@jorisvandenbossche

This comment has been minimized.

Member

jorisvandenbossche commented Nov 24, 2017

General question: I am wondering if we shouldn't rather recommend to do np.asarray(df) instead of df.values ?

@topper-123

This comment has been minimized.

Contributor

topper-123 commented Nov 24, 2017

Interesting, does it conform to .values in all cases?.

Anyway, I suggest that that will be a separate issue from this one.

@topper-123

This comment has been minimized.

Contributor

topper-123 commented Nov 24, 2017

I've changed the PR according to the comments. Is this ok?

@jreback jreback referenced this pull request Nov 24, 2017

Open

DEPR: let's deprecate #18262

12 of 36 tasks complete
@@ -3791,7 +3796,10 @@ def values(self):
int32. By numpy.find_common_type convention, mixing int64 and uint64
will result in a flot64 dtype.
"""
return self.as_matrix()

This comment has been minimized.

@jreback

jreback Nov 24, 2017

Contributor

just return .values

This comment has been minimized.

@jorisvandenbossche

jorisvandenbossche Nov 24, 2017

Member

This is the def of .values

This comment has been minimized.

@jreback

jreback Nov 24, 2017

Contributor

oh, then we should push this entirely down to the BlockManager, so pass the axis as well. The BM can then do the transpose, we don't like to do things in user code like this.

This comment has been minimized.

@jreback

jreback Nov 24, 2017

Contributor

We should actually deprecate .values as well, in favor of .to_numpy() which is the future spelling anyhow.

This comment has been minimized.

@jorisvandenbossche

jorisvandenbossche Nov 24, 2017

Member

This is already almost entirely pushed down, since it is calling directly BlockManager.as_matrix (but OK, the transpose could be done inside BlockManager.as_matrix). The consolidate_inplace needs to happens in NDFrame I think.

Deprecating .values is a no-go IMO, but let's deprecate that in another issue if you want

This comment has been minimized.

@topper-123

topper-123 Nov 24, 2017

Contributor

oh, then we should push this entirely down to the BlockManager, so pass the axis as well. The BM can then do the transpose, we don't like to do things in user code like this.

I'm not too familiar with the BlockManager, except accessing it with ._data. How do I pass the axis to it, and how do I transpose in a BlockManager? E.g. it doesn't have a .transpose method.

This comment has been minimized.

@jreback

jreback Nov 25, 2017

Contributor

look at _take in pandas/core/generic.py, you just pass the get_block_manager_axis to it.

then add axis= to def as_matrix in pandas/core/internals and do the transpose there.

The reason for pushing this to internals is to avoid all of the wrapping layers (e.g. frame/series, etc.) to know about internal details like this.

This comment has been minimized.

@topper-123

topper-123 Nov 25, 2017

Contributor

as_matrix doesn't have a axis parameter. Do you mean adding a new axis parameter there and transposing inside, if axis=0 (where I assume from my limited knowledge that BlockManager.axes[0] is always the same as dataFrame.axes[1], correct?)

This comment has been minimized.

@topper-123

topper-123 Nov 25, 2017

Contributor

Hi @jreback, I've tried quite a bit today and I can't push this down while getting the tests to pass.

Could you look at my latest commit (not passing ATM) and give some advice?

@@ -369,6 +369,12 @@ def test_values(self):
self.frame.values[:, 0] = 5.
assert (self.frame.values[:, 0] == 5).all()
def test_as_matrix_deprecated(self):
with tm.assert_produces_warning(FutureWarning):

This comment has been minimized.

@jreback

jreback Nov 24, 2017

Contributor

add the issue number

This comment has been minimized.

@topper-123

topper-123 Nov 24, 2017

Contributor

added.

if len(self.blocks) == 0:
return np.empty(self.shape, dtype=float)
other_axis = abs(axis-1)

This comment has been minimized.

@jreback

jreback Nov 25, 2017

Contributor

don't do this

if items is not None:
mgr = self.reindex_axis(items, axis=0)
mgr = self.reindex_axis(items, axis=other_axis)

This comment has been minimized.

@jreback

jreback Nov 25, 2017

Contributor

leave this alone

return self.as_matrix()
self._consolidate_inplace()
bm_axis = self._get_block_manager_axis(axis=1)
return self._data.as_matrix(axis=bm_axis)

This comment has been minimized.

@jreback

jreback Nov 25, 2017

Contributor

instead of this, just pass transpose = self._AXIS_REVERSED in.

This comment has been minimized.

@topper-123

topper-123 Nov 25, 2017

Contributor

But _data.as_matrix has no transpose parameter, do you mean axis=self._AXIS_REVERSED?

This comment has been minimized.

@jreback

jreback Nov 25, 2017

Contributor

no I mean add the argument transpose=False rather than axis prob simpler for now.

else:
mgr = self
if self._is_single_block or not self.is_mixed_type:
return mgr.blocks[0].get_values()
arr = mgr.blocks[0].get_values()
else:

This comment has been minimized.

@jreback

jreback Nov 25, 2017

Contributor

change this to transpose instead. should be straightforward from here.

This comment has been minimized.

@topper-123

topper-123 Nov 26, 2017

Contributor

Thanks a lot, transpose is of course much better than axis.

The issue was actually in the if len(self.blocks) == 0: block, as the empty array also must be transposed.

Everything is green now locally and I've pushed that upstream.

@topper-123

This comment has been minimized.

Contributor

topper-123 commented Nov 26, 2017

A thought: BlockManager.as_matrix is a strange name as it implies a np.matrix is returned.

Should I not just change it to BlockManager.as_array? Or, do any external libraries rely on calling BlockManager directly, e.g. statsmodel?

@jreback

This comment has been minimized.

Contributor

jreback commented Nov 26, 2017

@topper-123 ok this lgtm. ping on green.

@jreback

This comment has been minimized.

Contributor

jreback commented Nov 26, 2017

Should I not just change it to BlockManager.as_array? Or, do any external libraries rely on calling BlockManager directly, e.g. statsmodel?

this is completely private and internal. as_array would be fine. while you are at it, can you add a doc-string.

ping when ready.

tp
self._consolidate_inplace()
if self._AXIS_REVERSED:
return self._data.as_matrix(columns).T

This comment has been minimized.

@jreback

jreback Nov 26, 2017

Contributor

this should just return self.values

This comment has been minimized.

@topper-123

topper-123 Nov 26, 2017

Contributor

Returning self.values implies that columns=None which isn't necessary true for user code.

@@ -3842,7 +3848,7 @@ def as_blocks(self, copy=True):
.. deprecated:: 0.21.0
NOTE: the dtypes of the blocks WILL BE PRESERVED HERE (unlike in
as_matrix)
as_array)

This comment has been minimized.

@jreback

jreback Nov 26, 2017

Contributor

leave this spelling, this is a top-level method name

This comment has been minimized.

@topper-123

topper-123 Nov 26, 2017

Contributor

ok.

@@ -3670,19 +3670,22 @@ def copy(self, deep=True, mgr=None):
return self.apply('copy', axes=new_axes, deep=deep,
do_integrity_check=False)
def as_matrix(self, items=None):
def as_array(self, transpose=False, items=None):
if len(self.blocks) == 0:

This comment has been minimized.

@jreback

jreback Nov 26, 2017

Contributor

can you add a doc-string

This comment has been minimized.

@topper-123

topper-123 Nov 26, 2017

Contributor

done

@@ -3770,10 +3773,12 @@ def as_matrix(self, columns=None):
--------
pandas.DataFrame.values
"""
warnings.warn("method ``as_matrix`` will be removed in a future version. "
"Use ``values`` instead.", FutureWarning, stacklevel=2)
self._consolidate_inplace()
if self._AXIS_REVERSED:

This comment has been minimized.

@jreback

jreback Nov 26, 2017

Contributor

ok so this should be the same as below (e.g. passing transpose=). make sure we still have a test on this (e.g. that passes columns).

This comment has been minimized.

@topper-123

topper-123 Nov 26, 2017

Contributor

Ok, and test_as_matrix_deprecated has been modified to a take columns param.

@@ -3770,10 +3773,11 @@ def as_matrix(self, columns=None):
--------
pandas.DataFrame.values
"""
warnings.warn("method ``as_matrix`` will be removed in a future version. "

This comment has been minimized.

@jreback

jreback Nov 26, 2017

Contributor

"method method .as_matrix()...."
"Use .values instead"

This comment has been minimized.

@topper-123

topper-123 Nov 26, 2017

Contributor

I not sure I understand, assume you want the backticks gone. Change uploaded.

@topper-123

This comment has been minimized.

Contributor

topper-123 commented Nov 26, 2017

All green, @jreback

@jreback jreback merged commit f26bed6 into pandas-dev:master Nov 26, 2017

3 checks passed

ci/circleci Your tests passed on CircleCI!
Details
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@jreback

This comment has been minimized.

Contributor

jreback commented Nov 26, 2017

thanks @topper-123

@topper-123 topper-123 deleted the topper-123:deprecate_as_matrix branch Nov 26, 2017

jlewis91 added a commit to jlewis91/shap that referenced this pull request May 16, 2018

Fix pandas 0.23.0 deprecation of as_matrix
```
FutureWarning: Method .as_matrix will be removed in a future version.
Use .values instead.
```
pandas-dev/pandas#18458
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment