Feature stack base #1671

Merged
merged 9 commits into from Jan 27, 2013

Conversation

Projects
None yet
5 participants
Member

dmcdougall commented Jan 16, 2013

New version of #1517 with a test, typos fixed, and a CHANGELOG.

Member

dmcdougall commented Jan 16, 2013

@Tillsten Are you happy with the CHANGELOG and the whats_new?

Member

dmcdougall commented Jan 16, 2013

@Tillsten Instead of embedding an image to the whats_new file, I'll write an example and just refer to that. I think that's cleaner.

Member

dmcdougall commented Jan 16, 2013

Ok, I think this is good to go now. Tests pass locally except for tight_bbox (which has nothing to do with the stackplot baseline test, it's a weird font thing I occasionally get):

Expected:

bbox_inches_tight-expected

Computed:

bbox_inches_tight

Difference:

bbox_inches_tight-failed-diff

Owner

tacaswell commented Jan 17, 2013

I have also been intermittently getting this failure. (python2.7)

Member

dmcdougall commented Jan 17, 2013

@tacaswell Specifically on this branch, or in general?

Owner

tacaswell commented Jan 17, 2013

@dmcdougall in general (typically with in a few commits of master).

Sorry I wasn't clear.

Contributor

Tillsten commented Jan 17, 2013

@dmcdougall Just thanks for doing the final touches!

Member

dmcdougall commented Jan 17, 2013

@Tillsten No problem.

Feedback permitting, someone else should push the big green button!

@pelson pelson and 1 other commented on an outdated diff Jan 17, 2013

lib/matplotlib/tests/test_axes.py
@@ -863,6 +863,41 @@ def test_stackplot():
ax.set_xlim((0, 10))
ax.set_ylim((0, 70))
+
+@image_comparison(baseline_images=['stackplot_test_baseline'])
@pelson

pelson Jan 17, 2013

Member

Think we need to "remove_text" from the test results (to avoid freetype version differences across architectures)

@dmcdougall

dmcdougall Jan 17, 2013

Member

Yes! I think I meant to do this and forgot. Thanks.

@pelson pelson and 2 others commented on an outdated diff Jan 17, 2013

lib/matplotlib/stackplot.py
r = []
+ if baseline == 'zero':
+ first_line = 0.
+
+ elif baseline == 'sym':
+ first_line = -np.sum(y, 0) * 0.5
+ stack += first_line[None, :]
+
+ elif baseline == 'wiggle':
+ m = y.shape[0]
+ first_line = (y * (m - 0.5 - np.arange(0, m)[:, None])).sum(0)
+ first_line /= -m
+ stack += first_line
+
+ elif baseline == 'weighted_wiggle':
+ #TODO: Vectorize this stuff.
@pelson

pelson Jan 17, 2013

Member

I think that is attainable & desirable. Any chance of getting that in this PR?

@dmcdougall

dmcdougall Jan 17, 2013

Member

Working on it now. There's lots of optimisations I can make here, actually.

@Tillsten

Tillsten Jan 17, 2013

Contributor

@dmcdougall Allready done.

nstack = np.cumsum(y, axis=0)
m, n = y.shape
center = np.zeros(n)
total = np.sum(y, 0)
increase = np.hstack((y[:, 0:1], np.diff(y)))
below_size = np.vstack((np.zeros((1, n)), 
                        np.cumsum(y[:0:-1,:], 0)))[::-1,:] 
below_size += 0.5*y
move_up = below_size / total
move_up[:, 0] = 0.5
center = (move_up - 0.5) * increase
center = np.cumsum(center.sum(0))
first_line = center - 0.5 * total
nstack += first_line
@dmcdougall

dmcdougall Jan 17, 2013

Member

@Tillsten The test fails with your proposed changes.

Member

pelson commented Jan 17, 2013

Excellent feature. Only a few minor comments, but I think it looks great!

@dmcdougall dmcdougall and 1 other commented on an outdated diff Jan 17, 2013

lib/matplotlib/stackplot.py
+
+ elif baseline == 'weighted_wiggle':
+ #TODO: Vectorize this stuff.
+ m, n = y.shape
+ center = np.zeros(n)
+ total = np.sum(y, 0)
+ move_up = np.zeros((m, n))
+ increase = np.append(np.zeros((m, 1)), np.diff(y), axis=1)
+
+ below_size = np.vstack((np.zeros((1, n)), np.cumsum(y[:0:-1,:], 0)))[::-1,:]
+ below_size += 0.5 * y
+ move_up = below_size / total.T - 0.5
+
+ for i in range(1, n):
+ center[i] = center[i - 1]
+ center[i] += np.dot(move_up[:, i], increase[:, i])
@dmcdougall

dmcdougall Jan 17, 2013

Member

@Tillsten I used your idea to construct below_size and it works nicely! But now I'm stuck here. Any ideas on how to move out the center with that inter-dependence inside the loop? I tried modifying your approach but it didn't work.

@Tillsten

Tillsten Jan 17, 2013

Contributor

Are you sure my code dient work? I also fester the code and the max difff was 1e-17 maybe enough to let it fail. I am at my phone at the moment but I think using multiply, sum and than cumsum should do the trick.

@dmcdougall

dmcdougall Jan 17, 2013

Member

I copied and pasted verbatim into the else block, replacing all I had there already. Maybe I fluffed it up. You could try it independently and run the test for feedback. That would actually be helpful.

Member

dmcdougall commented Jan 17, 2013

Wooooo vectorised!

Thanks for you help and advice @Tillsten. Do you like the proposed solution? Is there anything you'd like to add/change before it is rebased and merged?

Contributor

Tillsten commented Jan 18, 2013

@dmcdougall I checked both versions, both are getting excat the same result on my machine. I prefer mine, because
your diag generates an n-n-matrix, which could be quite big.

edit: I am wrong, it just extract the diagonal (the dot product is (m, m)). But still, some unnecessary terms are calculated. Some simple timing shows a 2.5 increase for my method.

Contributor

Tillsten commented Jan 18, 2013

Also i was able to simplify below_size.

stack = np.cumsum(y, axis=0)
m, n = y.shape
center = np.zeros(n)
total = np.sum(y, 0)
increase = np.hstack((y[:, 0:1], np.diff(y)))
below_size = total - nstack
below_size += 0.5*y
move_up = below_size / total
move_up[:, 0] = 0.5
center = (move_up - 0.5) * increase
center = np.cumsum(center.sum(0))
first_line = center - 0..5 * total
stack += first_line
Contributor

Tillsten commented Jan 18, 2013

Test is here:

https://www.wakari.io/usermgmt/nb/tillsten/Vectorizing_the_loop

Also note that this just probably premature optimizing, but vetorizing stuff is surprisingly fun :)

Member

dmcdougall commented Jan 18, 2013

@Tillsten Ok. Your method is faster, so let's use that! I must've fluffed something up in the copy-paste. Thanks for collaborating with me. I'll update the PR with your approach.

Also note that this just probably premature optimizing, but vetorizing stuff is surprisingly fun :)

Yes it is! What's more fun is to work with somebody on it. Thanks for all your hard work and guidance, I really appreciate it.

Member

dmcdougall commented Jan 18, 2013

@Tillsten Out of interest, would you be able to add a timing test comparing against the origin double for loop too? Not for any reason other than I am curious.

Member

dmcdougall commented Jan 18, 2013

@Tillsten I get a local test failure with your patch (implemented in c356f44). Could you perhaps take a look? Here's the diff from the test output.

Expected:

stackplot_test_baseline-expected

Computed:

stackplot_test_baseline

Diff:

stackplot_test_baseline-failed-diff

Contributor

Tillsten commented Jan 18, 2013

@dmcdougall After including the orginal loop, i am very happy that we verctorize it. There is a factor of 10 000 of differnce.

Member

dmcdougall commented Jan 18, 2013

There is a factor of 10 000 of differnce.

Down comes the house! That's what I like to see. Good work. Tests are now passing locally, too.

Member

dmcdougall commented Jan 18, 2013

I'm going to rebase this pull request so it will merge cleanly. During that process, I will squash down the vectorisation commits into a single vectorisation commit. I'll then check for PEP8 compliancy. After that I'll run the tests locally and report back.

After that, this should be good to go.

Contributor

Tillsten commented Jan 18, 2013

I have not too experience with the workflow, but why not squash everything together?

Member

dmcdougall commented Jan 18, 2013

@Tillsten I can do that, too. The reason I did it this way is because then each commit adds something logical and complete in bite-size chunks. This makes tools like git bisect easier to use should something go wrong.

Perhaps this is better thrown out for a consensus. @pelson and @mdboom What do you guys think?

Member

WeatherGod commented Jan 18, 2013

My personal feelings about squashing commits is to be conservative with
it. Squashing everything together makes for very big diffs, but not
squashing can lead to many confusing commit messages and clutter
histories. Squashing the vectorization commits makes sense here.

Member

dmcdougall commented Jan 20, 2013

This seems reasonably stable now, so I'll merge it if there's no more feedback by, say, tomorrow. It'd be good to get this into master so people can play with it.

@pelson pelson commented on the diff Jan 21, 2013

examples/pylab_examples/stackplot_demo2.py
@@ -0,0 +1,23 @@
+import numpy as np
+import matplotlib.pyplot as plt
+
+np.random.seed(0)
+def layers(n, m):
@pelson

pelson Jan 21, 2013

Member

I know its only an example, but a docstring would be nice here - something which focuses on the purpose of the function, rather than its args/kwargs.

@WeatherGod

WeatherGod Jan 21, 2013

Member

Actually, this is not uncommon to do in the examples. IIRC, the sphinx
doc-builder will take the docstring portion of the example, and render it
as ReST text above the source code portion. If one feels like having a
docstring there, feel free. The more the better.

@dmcdougall

dmcdougall Jan 21, 2013

Member

Sure thing :)

@Tillsten

Tillsten Jan 22, 2013

Contributor

Explanation of the code:
This is just a direct copy of the original code. All it does is calculating random Gaussians.

@dmcdougall

dmcdougall Jan 22, 2013

Member

Done in bd204e1.

@pelson pelson commented on the diff Jan 21, 2013

lib/matplotlib/tests/test_axes.py
@@ -863,6 +863,42 @@ def test_stackplot():
ax.set_xlim((0, 10))
ax.set_ylim((0, 70))
+
+@image_comparison(baseline_images=['stackplot_test_baseline'],
+ remove_text=True)
+def test_stackplot_baseline():
+ np.random.seed(0)
+ def layers(n, m):
@pelson

pelson Jan 21, 2013

Member

I would prefer we were testing the actual example function. Something like:

from mpl_examples.pylab_examples.stackplot_demo2 import layers

should do the trick.

This comment is probably controversial - so some agreement/disagreement from others on whether this should be done would be useful.

@dmcdougall

dmcdougall Jan 21, 2013

Member

Your comment alludes to the DRY principle, which I'm in favour of.

@dmcdougall

dmcdougall Jan 22, 2013

Member

Actually, after some thought, I've decided against this. Importing from the example like this is not a stable way to maintain the test. The tests should not change. The examples might change.

@dmcdougall

dmcdougall Jan 26, 2013

Member

Also, importing like that will actually run the example at import-time, i.e. during the test.

Member

pelson commented Jan 21, 2013

This is an excellent PR, and from my point of view, is on the cusp of being merged. I've added two questions/actions - one of which I would appreciate others either agreeing, or throwing it out of the park.

Cheers,

Contributor

Tillsten commented Jan 25, 2013

merge it?

Member

dmcdougall commented Jan 26, 2013

@Tillsten I would rather wait until @pelson confirms his concerns have been addressed before merging.

In general I like to wait for at least one of the other developers to give it a once-over, but I realise that's not always possible with people being busy.

Member

dmcdougall commented Jan 26, 2013

Also, the tests that were added here are now failing. Curious.

Member

dmcdougall commented Jan 26, 2013

Locally I don't get the stack_base errors with python 2.7. The error bar and tight_bbox failures have nothing to do with these changes.

Member

dmcdougall commented Jan 26, 2013

Rebasing addresses the error_bar failures locally. Now all tests pass locally with py2.7. Travis will probably have a different story.

@dmcdougall dmcdougall added a commit that referenced this pull request Jan 27, 2013

@dmcdougall dmcdougall Merge pull request #1671 from dmcdougall/feature_stack_base
Feature stack base
daf551d

@dmcdougall dmcdougall merged commit daf551d into matplotlib:master Jan 27, 2013

1 check failed

default The Travis build failed
Details

dmcdougall deleted the dmcdougall:feature_stack_base branch Jan 27, 2013

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment