Add appropriate error on color size mismatch in `scatter` #7363

Merged
merged 5 commits into from Nov 9, 2016

Conversation

Projects
None yet
6 participants
Contributor

bcongdon commented Oct 30, 2016

Addresses #7314.

I imagine that the call to ravel is a necessary evil, but if not, it might be more prudent to compare shape of x and c_array

Contributor

NelleV commented Oct 30, 2016

Can you please add a test?

Contributor

bcongdon commented Oct 30, 2016

Yup! Added one.

@NelleV

NelleV approved these changes Oct 30, 2016

NelleV changed the title from Add appropriate error on color size mismatch in `scatter` to [MRG+1] Add appropriate error on color size mismatch in `scatter` Oct 30, 2016

Contributor

NelleV commented Oct 30, 2016

Thanks!
LGTM

@efiring

Scatter argument handling is tricky...

lib/matplotlib/axes/_axes.py
+ c = np.ma.ravel(c_array)
+ elif c_array.size not in (3, 4):
+ # Wrong size. Not a rgb/rgba and not same size as x
+ raise ValueError("x and c must be the same size")
@efiring

efiring Oct 30, 2016

Owner

I don't think this is the right place or way to do the checking. It is not taking into account the docstring description of c. See below.

lib/matplotlib/axes/_axes.py
+ c = np.ma.ravel(c_array)
+ elif c_array.size not in (3, 4):
+ # Wrong size. Not a rgb/rgba and not same size as x
+ raise ValueError("x and c must be the same size")
if c_array is None:
colors = c # must be acceptable as PathCollection facecolors
@efiring

efiring Oct 30, 2016

Owner

Note the comment. I think what you need to do is put in a check here to ensure that c is "acceptable as PathCollection facecolors". You can look at the Collections code to see what that entails.

@tacaswell

tacaswell Oct 30, 2016

Owner

Or push this checking in to collections and make sure a sensible exception is raised there?

@tacaswell

tacaswell Oct 30, 2016

Owner

Er, no we can't due to the masked point logic a few lines down.

lib/matplotlib/tests/test_axes.py
+@cleanup
+def test_color_length_mismatch():
+ N = 500
+ x, y = np.random.rand(N), np.random.rand(N)
@efiring

efiring Oct 30, 2016

Owner

In the interest of keeping tests minimal and direct, you could reduce N to 5, and use arange instead of rand.
You could also add a test for the handling of c when it is an rgba or sequence of them.

@@ -4014,14 +4014,15 @@ def scatter(self, x, y, s=None, c=None, marker=None, cmap=None, norm=None,
else:
try:
c_array = np.asanyarray(c, dtype=float)
- if c_array.size == x.size:
@efiring

efiring Oct 30, 2016

Owner

Looking back at @anntzer's comment in #7314, I think this line should be checking that c_array.shape matches the original shape of x and y. That implies that x and y should be checked earlier for matching shapes, and that their original common shape should be saved so it can be used in the c shape-check.

@tacaswell

tacaswell Oct 30, 2016

Owner

The docstrings say x/y must be (n, ) shaped, but it does not look like we check that anywhere. I suspect we have the ravel to in fact support (n, 1) and (1, n) shaped arrays arrays.

@story645

story645 Oct 31, 2016

Member

What about a preprocess step where x and y get squeezed? that'd convert both (n, 1) and (1, n) to (n, ) and leave anything else alone.

Owner

tacaswell commented Oct 30, 2016

I basically agree with @efiring on this review.

Contributor

bcongdon commented Oct 31, 2016

Thanks for the comments. I switched around the logic to test c_array against the original shape of x and y. I allowed it to match either one, as I wasn't sure how else to differentiate the (1,n) vs (n,1) case that @tacaswell described.

It looks like PathCollections eventually uses mcolors.to_rgba_array to check the validity of the facecolors, so I used that to assert the validity of c as a color sequence.

lib/matplotlib/axes/_axes.py
+ mcolors.to_rgba_array(colors)
+ except ValueError:
+ # c not acceptable as PathCollection facecolor
+ raise ValueError("c not acceptable as color sequence")
@tacaswell

tacaswell Oct 31, 2016

Owner

This error message should probably contain the x shape, y shape, and c shape. At this point we have tried all of the things the user could have meant, have not been able to work it out so we should give feedback on everything we tried.

Contributor

bcongdon commented Oct 31, 2016

Added the shape of c, x, and y to the ValueError message.

Member

story645 commented Nov 2, 2016 edited

@bcongdon I think the point is that arrays of shape (n, 1) and (1, n) should be treated as if they have equal shape, but checking on size is a bad idea 'cause an array of shape (2,2) has equal size with an array of (4,1) or (1,4).

Owner

tacaswell commented Nov 8, 2016

@bcongdon This seems to have conflicts in the test file, can you rebase?

@efiring can you please re-review (you have a request changes which I am pretty sure are addressed).

I agree with @story645 that using squeeze instead of ravel would make more sense, but worry that would break undocumented (but probably relied on) behavior (because users). If we do that it should be a follow up PR and not block this one as-is.

tacaswell added this to the 2.0.1 (next bug fix release) milestone Nov 8, 2016

tacaswell changed the title from [MRG+1] Add appropriate error on color size mismatch in `scatter` to [MRG+2] Add appropriate error on color size mismatch in `scatter` Nov 8, 2016

Contributor

bcongdon commented Nov 8, 2016

Please excuse my novice git skills. Not sure if I rebased correctly... Is there a better way to rebase the current master changes without having all the subsequent commits showing up in the PR?

Owner

tacaswell commented Nov 8, 2016

The last step needs to be a force-push. your last commit merges the just-rebased commits back into your local branch so they will show up twice. See http://matplotlib.org/devdocs/devel/gitwash/development_workflow.html?highlight=rebase#rebasing-a-pull-request-pr

Contributor

bcongdon commented Nov 8, 2016

Thanks! 👍

Member

QuLogic commented Nov 8, 2016

There are still 8 commits when there should be 4; when you rebase, be sure to drop the extra copy. Use git rebase -i for an interactive rebase and drop one of the copies.

bcongdon added some commits Oct 30, 2016

@bcongdon bcongdon Add appropriate error on color size mismatch in `scatter` 28b4f51
@bcongdon bcongdon Add scatter color len mismatch ValueError test e218bcf
@bcongdon bcongdon Refactor c_array checking; Lower N in color len mismatch test d03b0e0
@bcongdon bcongdon Add c, x, y shapes to unacceptable color sequence ValueError message
65e6a64
Contributor

bcongdon commented Nov 8, 2016

Whoops, didn't realize that. Thanks

lib/matplotlib/axes/_axes.py
+ # c not acceptable as PathCollection facecolor
+ msg = ("c of shape {0} not acceptable as a color sequence "
+ "for x with shape {1}, y with shape {2}")
+ raise ValueError(msg.format(c.shape, x.shape, y.shape))
@efiring

efiring Nov 8, 2016

Owner

I see possibilities for improvement in this block: first, the error message is based on the raveled x and y shapes, not the original shapes, so this could be confusing. Maybe just say 'for x with size...' etc. and use x.size etc. After all, x and y are documented as 1-D and treated as 1-D, so it is their size that matters, not their shapes. Second, I think that you could make it both more compact and more efficient by keeping the work done by mcolors.to_rgba_array. You can delete the colors = c, and replace the line inside the 'try' with 'colors = mcolors.to_rgba_array(c)'. It will be subject to another to_rgba_array call inside PathCollection, which will be able to recognize that it already has an rgba array, instead of using a list comprehension with a function call for each color.

@bcongdon

bcongdon Nov 9, 2016

Contributor

Sounds good. I made those changes.

@bcongdon bcongdon Scatter color ValueError now reports size; More efficient color valid…
…ation
49e7156
@efiring

efiring approved these changes Nov 9, 2016

efiring changed the title from [MRG+2] Add appropriate error on color size mismatch in `scatter` to Add appropriate error on color size mismatch in `scatter` Nov 9, 2016

@efiring efiring merged commit 5901b38 into matplotlib:master Nov 9, 2016

1 of 3 checks passed

continuous-integration/appveyor/pr AppVeyor build failed
Details
coverage/coveralls Coverage decreased (-4.5%) to 61.839%
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
Owner

efiring commented Nov 9, 2016

When I tried to backport this, I realized that it contains an error: instead of adding a test, it is replacing an unrelated test. I should have noticed this before merging. I will fix it right now.

Owner

efiring commented Nov 9, 2016

I don't want to spend any more time thrashing around with the tests, which are incompatible between v2.x and master, so I am not going to backport this. Therefore I am moving the milestone to 2.1. That should be soon enough.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment