Checks for left_index and right_index merge parameters #14434

Closed
Contributor

ivallesp commented Oct 16, 2016

Hi,

I just made a mistake while doing an analysis with pandas, and it motivated me to implement two checks which, in my opinion, are necessary.

I was trying to perform a merge and I confused the parameters "left_on" and "right_on" with "left_index" and "right_index". I ran the code and it did not raise any error. It produced a table which seemed fine in terms of shape. I suspect that the tables got merged on the index of each data frame. I think it would be a great idea to check that right_index and left_index are of type bool and, if they are not, raise an error. This way we avoid more people making the same mistake as I did :D.
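
To illustrate the idea, here is a minimal sketch of such a check written as a standalone helper. It is not the actual change to pandas/tools/merge.py: the helper name `validate_bool_flag` and the exact error wording are assumptions made for this example.

```python
def validate_bool_flag(name, value):
    """Raise ValueError unless ``value`` is a real bool (hypothetical helper)."""
    if not isinstance(value, bool):
        raise ValueError(
            "{0} parameter must be of type bool, not {1}".format(name, type(value))
        )
    return value


# A column name passed by mistake is truthy, so without a check it could be
# silently treated like True. With the check, the mistake surfaces at once.
try:
    validate_bool_flag("left_index", "key1")
except ValueError as exc:
    message = str(exc)
```

Genuine booleans pass through unchanged, so existing calls such as `left_index=True` would not be affected by a check of this shape.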

Tests passed

Thanks!

codecov-io commented Oct 16, 2016 edited

Current coverage is 85.25% (diff: 100%)

Merging #14434 into master will increase coverage by <.01%

@@             master     #14434   diff @@
==========================================
  Files           140        140          
  Lines         50631      50635     +4   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
+ Hits          43166      43171     +5   
+ Misses         7465       7464     -1   
  Partials          0          0          

Powered by Codecov. Last update c31ea34...e18b7c9

pandas/tools/merge.py
@@ -471,6 +471,14 @@ def __init__(self, left, right, how='inner', on=None,
raise ValueError(
'can not merge DataFrame with instance of '
'type {0}'.format(type(right)))
+ if not isinstance(left_index, bool):
@sinhrks

sinhrks Oct 16, 2016

Member

pls use pandas.api.types.is_bool.

@ivallesp

ivallesp Oct 17, 2016

Contributor

done
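
One likely reason for preferring `pandas.api.types.is_bool` over `isinstance(..., bool)` is that it also recognises NumPy booleans. The snippet below is only an illustrative comparison, assuming NumPy and a pandas version that provides `pandas.api.types` are installed; the variable names are made up for the example.

```python
import numpy as np
from pandas.api.types import is_bool

flag = np.bool_(True)

# NumPy booleans are not instances of the built-in bool type.
plain_check = isinstance(flag, bool)

# is_bool accepts both built-in and NumPy booleans.
pandas_check = is_bool(flag)

# A column name passed by mistake is still rejected.
string_check = is_bool("key1")
```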

pandas/tools/merge.py
@@ -471,6 +471,14 @@ def __init__(self, left, right, how='inner', on=None,
raise ValueError(
'can not merge DataFrame with instance of '
'type {0}'.format(type(right)))
+ if not isinstance(left_index, bool):
+ raise ValueError(
+ 'left_index parameter must be of type bool, not '
@sinhrks

sinhrks Oct 16, 2016

Member

can u add tests to check expected error is raised?

@ivallesp

ivallesp Oct 17, 2016

Contributor

done, thanks!
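
For readers following along, a test of this kind could look roughly like the sketch below. It exercises a hypothetical stand-in function (`validate_bool_flag`, not part of pandas) rather than `merge` itself, so it only shows the shape of an "expected error is raised" test, not the test that was added in this PR.

```python
import unittest


def validate_bool_flag(name, value):
    # Hypothetical stand-in for the check under review.
    if not isinstance(value, bool):
        raise ValueError(
            "{0} parameter must be of type bool, not {1}".format(name, type(value))
        )


class TestBoolFlagValidation(unittest.TestCase):
    def test_non_bool_raises(self):
        # Column names, lists, integers and None must all be rejected.
        for bad in ("key1", ["key1"], 1, None):
            with self.assertRaises(ValueError):
                validate_bool_flag("left_index", bad)

    def test_bool_passes(self):
        # Real booleans must not raise.
        validate_bool_flag("left_index", True)
        validate_bool_flag("right_index", False)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestBoolFlagValidation)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```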

sinhrks added this to the 0.19.1 milestone Oct 16, 2016

sinhrks added the Reshaping label Oct 16, 2016

pandas/tests/test_categorical.py
@@ -4049,6 +4049,24 @@ def test_merge(self):
result = pd.merge(cleft, cright, how='left', left_on='b', right_on='c')
tm.assert_frame_equal(result, expected)
+ # params left_index right_index checks
+ self.assertRaises(ValueError,
@jreback

jreback Oct 17, 2016 edited

Contributor

wrong file. should go in tools/tests/test_merge.py

@ivallesp

ivallesp Oct 17, 2016

Contributor

thanks!. Updated

Contributor

ivallesp commented Oct 17, 2016

Updated!

Contributor

ivallesp commented Oct 18, 2016

Is there anything else that still needs to be changed? Thanks!

@@ -109,6 +109,15 @@ def test_merge_misspecified(self):
self.assertRaises(ValueError, merge, self.df, self.df2,
left_on=['key1'], right_on=['key1', 'key2'])
+ def test_index_and_on_parameters_confusion(self):
+ self.assertRaises(ValueError, merge, self.df, self.df2, how='left',
@jreback

jreback Oct 19, 2016

Contributor

add the github issue reference as a comment

@jorisvandenbossche

jorisvandenbossche Oct 20, 2016

Owner

Can you address this one?

Contributor

jreback commented Oct 19, 2016

pls add a whatsnew entry in 0.19.1. ping on green.

Contributor

ivallesp commented Oct 19, 2016

Done!

Contributor

ivallesp commented Oct 20, 2016

@jreback green

jreback closed this in 2d3a739 Oct 20, 2016

@jorisvandenbossche

@ivallesp two small comments, looks good!

+
+New features
+~~~~~~~~~~~~
+- Add checks to assure that left_index and right_index are of type bool
@jorisvandenbossche

jorisvandenbossche Oct 20, 2016

Owner

Can you move this to the bug fixes section? (I think that it is rather a bug that it accepted a wrong value, than a new feature that it now checks that)

@@ -109,6 +109,15 @@ def test_merge_misspecified(self):
self.assertRaises(ValueError, merge, self.df, self.df2,
left_on=['key1'], right_on=['key1', 'key2'])
+ def test_index_and_on_parameters_confusion(self):
+ self.assertRaises(ValueError, merge, self.df, self.df2, how='left',
@jorisvandenbossche

jorisvandenbossche Oct 20, 2016

Owner

Can you address this one?

@ivallesp No matter, @jreback already merged and addressed those comments!

Contributor

jreback commented Oct 20, 2016

@jorisvandenbossche was your 2nd one the comment reference in the tests? (decided not a big deal)

yep, but indeed not a big deal

@tworec tworec added a commit to RTBHOUSE/pandas that referenced this pull request Oct 21, 2016

@ivallesp @tworec ivallesp + tworec ERR: Checks for left_index and right_index merge parameters
Author: Iván Vallés Pérez <ivanvallesperez@gmail.com>

Closes #14434 from ivallesp/add-check-for-merge-indices and squashes the following commits:

e18b7c9 [Iván Vallés Pérez] Add some checks for assuring that the left_index and right_index parameters have correct types. Tests added.
bcdda29

@jorisvandenbossche jorisvandenbossche added a commit that referenced this pull request Nov 1, 2016

@ivallesp @jorisvandenbossche ivallesp + jorisvandenbossche [Backport #14434] ERR: Checks for left_index and right_index merge parameters

Author: Iván Vallés Pérez <ivanvallesperez@gmail.com>

Closes #14434 from ivallesp/add-check-for-merge-indices and squashes the following commits:

e18b7c9 [Iván Vallés Pérez] Add some checks for assuring that the left_index and right_index parameters have correct types. Tests added.

(cherry picked from commit 2d3a739)
b0e4589

@amolkahat amolkahat added a commit to amolkahat/pandas that referenced this pull request Nov 26, 2016

@ivallesp @amolkahat ivallesp + amolkahat ERR: Checks for left_index and right_index merge parameters
Author: Iván Vallés Pérez <ivanvallesperez@gmail.com>

Closes #14434 from ivallesp/add-check-for-merge-indices and squashes the following commits:

e18b7c9 [Iván Vallés Pérez] Add some checks for assuring that the left_index and right_index parameters have correct types. Tests added.
51cf9f4