[MRG+1] Fix safe_indexing with read-only indices #9507

@lesteve lesteve commented Aug 7, 2017

Fix #9483.

DataFrame.iloc raises a ValueError ("buffer source array is read-only") when it is passed a read-only index array.

Here is a simplified snippet (from #9483 (comment)):

import numpy as np
import pandas as pd

df = pd.DataFrame({'first': np.ones(100, dtype='float64')})
indices = np.array([1, 3, 6])
indices.flags.writeable = False  # make the index array read-only
df.iloc[indices]  # ValueError: buffer source array is read-only
Stack-trace
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-1-a32cbf97fa72> in <module>()
      5 indices = np.array([1, 3, 6])
      6 indices.flags.writeable = False
----> 7 df.iloc[indices]

~/miniconda3/lib/python3.6/site-packages/pandas/core/indexing.py in __getitem__(self, key)
   1326         else:
   1327             key = com._apply_if_callable(key, self.obj)
-> 1328             return self._getitem_axis(key, axis=0)
   1329 
   1330     def _is_scalar_access(self, key):

~/miniconda3/lib/python3.6/site-packages/pandas/core/indexing.py in _getitem_axis(self, key, axis)
   1736         # a list of integers
   1737         elif is_list_like_indexer(key):
-> 1738             return self._get_list_axis(key, axis=axis)
   1739 
   1740         # a single integer

~/miniconda3/lib/python3.6/site-packages/pandas/core/indexing.py in _get_list_axis(self, key, axis)
   1713         """
   1714         try:
-> 1715             return self.obj.take(key, axis=axis, convert=False)
   1716         except IndexError:
   1717             # re-raise with different error message

~/miniconda3/lib/python3.6/site-packages/pandas/core/generic.py in take(self, indices, axis, convert, is_copy, **kwargs)
   1926         new_data = self._data.take(indices,
   1927                                    axis=self._get_block_manager_axis(axis),
-> 1928                                    convert=True, verify=True)
   1929         result = self._constructor(new_data).__finalize__(self)
   1930 

~/miniconda3/lib/python3.6/site-packages/pandas/core/internals.py in take(self, indexer, axis, verify, convert)
   4009         new_labels = self.axes[axis].take(indexer)
   4010         return self.reindex_indexer(new_axis=new_labels, indexer=indexer,
-> 4011                                     axis=axis, allow_dups=True)
   4012 
   4013     def merge(self, other, lsuffix='', rsuffix=''):

~/miniconda3/lib/python3.6/site-packages/pandas/core/internals.py in reindex_indexer(self, new_axis, indexer, axis, fill_value, allow_dups, copy)
   3895             new_blocks = [blk.take_nd(indexer, axis=axis, fill_tuple=(
   3896                 fill_value if fill_value is not None else blk.fill_value,))
-> 3897                 for blk in self.blocks]
   3898 
   3899         new_axes = list(self.axes)

~/miniconda3/lib/python3.6/site-packages/pandas/core/internals.py in <listcomp>(.0)
   3895             new_blocks = [blk.take_nd(indexer, axis=axis, fill_tuple=(
   3896                 fill_value if fill_value is not None else blk.fill_value,))
-> 3897                 for blk in self.blocks]
   3898 
   3899         new_axes = list(self.axes)

~/miniconda3/lib/python3.6/site-packages/pandas/core/internals.py in take_nd(self, indexer, axis, new_mgr_locs, fill_tuple)
   1044             fill_value = fill_tuple[0]
   1045             new_values = algos.take_nd(values, indexer, axis=axis,
-> 1046                                        allow_fill=True, fill_value=fill_value)
   1047 
   1048         if new_mgr_locs is None:

~/miniconda3/lib/python3.6/site-packages/pandas/core/algorithms.py in take_nd(arr, indexer, axis, out, fill_value, mask_info, allow_fill)
   1469     func = _get_take_nd_function(arr.ndim, arr.dtype, out.dtype, axis=axis,
   1470                                  mask_info=mask_info)
-> 1471     func(arr, indexer, out, fill_value)
   1472 
   1473     if flip_order:

pandas/_libs/algos_take_helper.pxi in pandas._libs.algos.take_2d_axis0_float64_float64 (pandas/_libs/algos.c:110417)()

~/miniconda3/lib/python3.6/site-packages/pandas/_libs/algos.cpython-36m-x86_64-linux-gnu.so in View.MemoryView.memoryview_cwrapper (pandas/_libs/algos.c:124730)()

~/miniconda3/lib/python3.6/site-packages/pandas/_libs/algos.cpython-36m-x86_64-linux-gnu.so in View.MemoryView.memoryview.__cinit__ (pandas/_libs/algos.c:120965)()

ValueError: buffer source array is read-only
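
The traceback ends in a Cython memoryview that refuses a read-only buffer, so one caller-side way to sidestep the error is to hand pandas a writeable copy of the indices. The following is a minimal sketch under that assumption, not necessarily the change made in this PR; the helper name `writeable_indices` is hypothetical and exists in neither pandas nor scikit-learn.

```python
import numpy as np


def writeable_indices(indices):
    """Return `indices` as-is if writeable, otherwise a writeable copy.

    Hypothetical helper for illustration only.
    """
    indices = np.asarray(indices)
    # ndarray.copy() always returns a new, writeable array.
    return indices if indices.flags.writeable else indices.copy()


read_only = np.array([1, 3, 6])
read_only.flags.writeable = False

safe = writeable_indices(read_only)
print(safe.flags.writeable)  # the copy is writeable; the original is untouched
```

The result can then be passed on, for example as `df.iloc[writeable_indices(indices)]`. The copy is only made for read-only inputs, so writeable index arrays incur no extra allocation.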

lesteve commented Aug 7, 2017

For completeness I opened an issue in pandas: pandas-dev/pandas#17192

@jnothman jnothman left a comment

LGTM

@jnothman jnothman changed the title [MRG] Fix safe_indexing with read-only indices [MRG+1] Fix safe_indexing with read-only indices Aug 7, 2017
@ogrisel ogrisel merged commit 682e85f into scikit-learn:master Aug 8, 2017
ogrisel commented Aug 8, 2017

LGTM as well. Merged. @jnothman I think it should be added to your list of backports for 0.19.

@lesteve lesteve deleted the fix-safe-indexing-with-read-only-indices branch August 8, 2017 08:26
@ogrisel ogrisel mentioned this pull request Aug 8, 2017
jnothman pushed a commit to jnothman/scikit-learn that referenced this pull request Aug 8, 2017
yarikoptic added a commit to yarikoptic/scikit-learn that referenced this pull request Aug 12, 2017
Release 0.19.0

* tag '0.19.0': (99 commits)
  DOC one more version issue in doc
  skip docstring tests because not useful to users and has some issues
  deprecation of n_components happened in 0.19 not 0.18 (scikit-learn#9527)
  sync whatsnew with master so I'm less confused
  DOC more navigation links
  DOC a note on data leakage and pipeline (scikit-learn#9510)
  DOC set release date to Friday
  DOC Update news and menu for 0.19 release
  DOC list of contributors to 0.19
  DOC Change release date to Thursday
  DOC Remove some whitespace from what's new
  Update what's new for recent changes
  Use base.is_classifier instead instead of isinstance (scikit-learn#9482)
  Fix safe_indexing with read-only indices (scikit-learn#9507)
  [MRG+1] add scorer based on explained_variance_score (scikit-learn#9259)
  fix wrong assert in test_validation (scikit-learn#9480)
  [MRG+1] FIX Add missing mixins to ClassifierChain (scikit-learn#9473)
  Bring last code block in line with the image. (scikit-learn#9488)
  FIX Pass sample_weight as kwargs in VotingClassifier (scikit-learn#9493)
  FIX Incorrent implementation of noise_variance_ in PCA._fit_truncated (scikit-learn#9108)
  ...
yarikoptic added a commit to yarikoptic/scikit-learn that referenced this pull request Aug 12, 2017
* releases: (99 commits)
yarikoptic added a commit to yarikoptic/scikit-learn that referenced this pull request Aug 12, 2017
* dfsg: (99 commits)
paulha pushed a commit to paulha/scikit-learn that referenced this pull request Aug 19, 2017
AishwaryaRK pushed a commit to AishwaryaRK/scikit-learn that referenced this pull request Aug 29, 2017
maskani-moh pushed a commit to maskani-moh/scikit-learn that referenced this pull request Nov 15, 2017
jwjohnson314 pushed a commit to jwjohnson314/scikit-learn that referenced this pull request Dec 18, 2017
Successfully merging this pull request may close these issues.

GridsearchCV.fit throws ValueError when passed a large dataframe that contains an Object column