Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Checks for left_index and right_index merge parameters #14434

Closed
wants to merge 1 commit into from
Closed

Checks for left_index and right_index merge parameters #14434

wants to merge 1 commit into from

Conversation

ivallesp
Copy link
Contributor

Hi,

I just committed an error when I was doing an analysis using pandas and this motivated me to implement two checks which in my opinion are necessary.

I was trying to perform a merge and I confused the parameters "left_on" and "right_on" for "left_index" and "right_index". I ran the code and it did not raised me any error. It produced a table which seem to be fine in terms of shape. I suspect that what happened is that the tables got merged by the index of the data frame. I think it would be a great idea to check if right_index and left_index are of type bool, if not, raise an error. This way we will avoid that more people got the same error as mine :D.

Tests passed

Thanks!

@codecov-io
Copy link

codecov-io commented Oct 16, 2016

Current coverage is 85.25% (diff: 100%)

Merging #14434 into master will increase coverage by <.01%

@@             master     #14434   diff @@
==========================================
  Files           140        140          
  Lines         50631      50635     +4   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
+ Hits          43166      43171     +5   
+ Misses         7465       7464     -1   
  Partials          0          0          

Powered by Codecov. Last update c31ea34...e18b7c9

@@ -471,6 +471,14 @@ def __init__(self, left, right, how='inner', on=None,
raise ValueError(
'can not merge DataFrame with instance of '
'type {0}'.format(type(right)))
if not isinstance(left_index, bool):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pls use pandas.api.types.is_bool.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -471,6 +471,14 @@ def __init__(self, left, right, how='inner', on=None,
raise ValueError(
'can not merge DataFrame with instance of '
'type {0}'.format(type(right)))
if not isinstance(left_index, bool):
raise ValueError(
'left_index parameter must be of type bool, not '
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can u add tests to check expected error is raised?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done, thanks!

@sinhrks sinhrks added the Error Reporting Incorrect or improved errors from pandas label Oct 16, 2016
@sinhrks sinhrks added this to the 0.19.1 milestone Oct 16, 2016
@sinhrks sinhrks added the Reshaping Concat, Merge/Join, Stack/Unstack, Explode label Oct 16, 2016
@@ -4049,6 +4049,24 @@ def test_merge(self):
result = pd.merge(cleft, cright, how='left', left_on='b', right_on='c')
tm.assert_frame_equal(result, expected)

# params left_index right_index checks
self.assertRaises(ValueError,
Copy link
Contributor

@jreback jreback Oct 17, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wrong file. should go in tools/tests/test_merge.py

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks!. Updated

@ivallesp
Copy link
Contributor Author

Updated!

@ivallesp
Copy link
Contributor Author

Is there still any change to do? Thanks!

@@ -109,6 +109,15 @@ def test_merge_misspecified(self):
self.assertRaises(ValueError, merge, self.df, self.df2,
left_on=['key1'], right_on=['key1', 'key2'])

def test_index_and_on_parameters_confusion(self):
self.assertRaises(ValueError, merge, self.df, self.df2, how='left',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add the github issue reference as a comment

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you address this one?

@jreback
Copy link
Contributor

jreback commented Oct 19, 2016

pls add a whatsnew entry in 0.19.1. ping on green.

@ivallesp
Copy link
Contributor Author

Done!

@ivallesp
Copy link
Contributor Author

@jreback green

@jreback jreback closed this in 2d3a739 Oct 20, 2016
Copy link
Member

@jorisvandenbossche jorisvandenbossche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ivallesp two small comments, looks good!


New features
~~~~~~~~~~~~
- Add checks to assure that left_index and right_index are of type bool
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you move this to the bug fixes section? (I think that it is rather a bug that it accepted a wrong value, than a new feature that it now checks that)

@@ -109,6 +109,15 @@ def test_merge_misspecified(self):
self.assertRaises(ValueError, merge, self.df, self.df2,
left_on=['key1'], right_on=['key1', 'key2'])

def test_index_and_on_parameters_confusion(self):
self.assertRaises(ValueError, merge, self.df, self.df2, how='left',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you address this one?

@jorisvandenbossche
Copy link
Member

@ivallesp No matter, @jreback already merged and addressed those comments!

@jreback
Copy link
Contributor

jreback commented Oct 20, 2016

@jorisvandenbossche was your 2nd one the comment reference in the tests? (decided not a big deal)

@jorisvandenbossche
Copy link
Member

yep, but indeed not a big deal

tworec pushed a commit to RTBHOUSE/pandas that referenced this pull request Oct 21, 2016
Author: Iván Vallés Pérez <ivanvallesperez@gmail.com>

Closes pandas-dev#14434 from ivallesp/add-check-for-merge-indices and squashes the following commits:

e18b7c9 [Iván Vallés Pérez] Add some checks for assuring that the left_index and right_index parameters have correct types. Tests added.
jorisvandenbossche pushed a commit that referenced this pull request Nov 1, 2016
…rameters

Author: Iván Vallés Pérez <ivanvallesperez@gmail.com>

Closes #14434 from ivallesp/add-check-for-merge-indices and squashes the following commits:

e18b7c9 [Iván Vallés Pérez] Add some checks for assuring that the left_index and right_index parameters have correct types. Tests added.

(cherry picked from commit 2d3a739)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Error Reporting Incorrect or improved errors from pandas Reshaping Concat, Merge/Join, Stack/Unstack, Explode
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants