
[Sprint] Matlab fplot #1143

Closed

wants to merge 2 commits into from

Conversation

dmcdougall
Member

Addressing #330.

Right, so I have tried this once before (#737), but the consensus seemed to be that I should encapsulate this in a class to allow some kind of 'zooming' effect, as in the Mandelbrot example.

This code is not complete, it needs improving, and there are probably better ways of doing the things I have done. I'm making this a PR so people can chime in with feedback and the like.

@akhmerov
Contributor

@dmcdougall:
I have recently implemented adaptive function evaluation for an unrelated project. My code uses parallel evaluation, so it is not directly suitable for matplotlib (and it would most likely be overkill). Nevertheless, let me share some ideas about the algorithm to be used for adaptive computation. First of all, adaptive plots in both 2D and 3D have been implemented in Sage, so you can look at what they do. My considerations were as follows.

We want to end up with a set of points that 'look nice', sampling the function more densely where its behavior is more complicated. First of all, 'looking nice' means that all the data will be rescaled to fit into the plot, which automatically means that an absolute tolerance is not too useful. Instead, both the x- and y-coordinates should be scaled to fit a 1x1 square. The scaling should be recalculated when new data points are added, and if we expect singularities, we may limit the total scale to some hundred times the initial scale and not plot any points that do not fit into the maximal scale, because these are due to a singularity.
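
A minimal sketch of that rescaling step (my own illustration, not code from the PR; the function name and the zero-span guard are hypothetical):

import numpy as np

def rescale_to_unit(xs, ys):
    # Map the sampled points into a 1x1 square so that tolerances
    # become relative to the visible extent of the data.
    xs = np.asarray(xs, dtype=float)
    ys = np.asarray(ys, dtype=float)
    x_span = float(xs.max() - xs.min()) or 1.0  # guard against constant data
    y_span = float(ys.max() - ys.min()) or 1.0
    return (xs - xs.min()) / x_span, (ys - ys.min()) / y_span

The spans would be recomputed each time new points are added, as described above.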

Another thing I consider important: rather than gradients, we see angles. To a human, the sequence of points [1000, 1999, 3000] looks much smoother than [1, 0, 1]. This means that we should end up with a list of segments such that every pair of consecutive segments differs in the angle it makes with the x-axis by less than a certain tolerance (only after x and y are rescaled, of course). As an additional quality criterion, I would require that the length of each segment be smaller than a threshold.
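
A sketch of that angle criterion (the tolerances tol and max_len are hypothetical, and the points are assumed to be already rescaled to the unit square):

import math

def angle(p, q):
    # Angle that the segment p -> q makes with the x-axis.
    return math.atan2(q[1] - p[1], q[0] - p[0])

def needs_refinement(p, q, r, tol=0.1, max_len=0.05):
    # A triple of points fails if the two segments bend too sharply
    # at q, or if the first segment exceeds the length threshold.
    bend = abs(angle(p, q) - angle(q, r))
    too_long = math.hypot(q[0] - p[0], q[1] - p[1]) > max_len
    return bend > tol or too_long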

I think the preferred data structure for this is a list rather than an array, since it's much easier to insert entries into a list using list.insert. Then, on each iteration step, I would run over triples of elements and check whether their angle difference is within the tolerance. To keep the speed reasonable, I'd also keep a list of coordinates of the triples that were modified in the previous step, because only those have to be re-checked.

EDIT: It turns out that list item insertion is O(N), and so an ordered set (http://code.activestate.com/recipes/576696/) is a better data structure.
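
Putting the pieces together, a rough sketch of the iteration described above (rescaling is omitted for brevity, needs_refinement is from the previous sketch, and a plain sorted list stands in for the ordered set):

def refine(f, xs, tol=0.1, max_iter=10):
    # Start from a coarse grid and bisect the intervals around any
    # point that fails the angle test, until all triples pass or the
    # iteration budget runs out.
    pts = sorted((x, f(x)) for x in xs)
    for _ in range(max_iter):
        new_xs = set()
        for p, q, r in zip(pts, pts[1:], pts[2:]):
            if needs_refinement(p, q, r, tol):
                new_xs.add((p[0] + q[0]) / 2)
                new_xs.add((q[0] + r[0]) / 2)
        if not new_xs:
            break
        pts = sorted(pts + [(x, f(x)) for x in new_xs])
    return pts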

@dmcdougall
Member Author

@akhmerov Thanks for your suggestions; I'm really glad someone a little more experienced in this area has looked at the algorithm.

Firstly, re-scaling to fit into the unit square is a good option, though I have accounted for this when taking the derivative. The minimal step size you should use grows as the abscissae grow; this is a numerical consideration, and I think both of us are solving the same problem with two different methods. I take your point, though.

Secondly, yes, absolute error is not ideal. As you point out, relative error would be better, but I seem to recall (when I started this a while ago) having some problems with relative error; I'll look into it again.

Thirdly, your point on angles rather than derivatives is -- I think -- a great idea. Derivatives look different at different scales. However, angles do not capture the regularity of a function. As a consequence, we may 'miss' certain more ill-behaved aspects of the function we wish to plot. This raises the question: should fplot produce something that looks good, or something that is more true to the function's actual behaviour? Having said that, I can't really think of any non-trivial examples where taking derivatives would be better that are not also horribly contrived.

Lastly, regarding the data structure: checking only the previously modified abscissae would boost performance, I agree. Out of interest, in what version of Python were sets introduced? Matplotlib supports Python 2.6 and upwards, so we'd have to be careful here...

Thanks again for your comments. It's people like you who test change-sets and give advice that make software better. I appreciate you taking the time to do that.

@akhmerov
Contributor

@dmcdougall

Firstly, re-scaling to fit into the unit square is a good option, though I have accounted for this when taking the derivative.

I don't think that taking the derivative is sufficient. It does transform everything to the dimension (y_max - y_min)/(x_max - x_min), but that scale isn't bounded by any value. Consider the exponential function, plotted on the interval 0 to 20, as an example. Your algorithm will encounter a derivative larger than 1e6 on every segment that involves coordinates larger than 13, and will consider each of these intervals singular. At the same time, if you were rescaling, there would be nothing ill-behaved.
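
A quick numeric check of this point (my own illustration):

import numpy as np

xs = np.linspace(0, 20, 201)
ys = np.exp(xs)
raw_slope = np.diff(ys) / np.diff(xs)
# The same slopes measured in unit-square coordinates stay modest.
unit_slope = (np.diff(ys) / np.ptp(ys)) / (np.diff(xs) / np.ptp(xs))
print(raw_slope.max())   # ~5e8: the late segments all look 'singular'
print(unit_slope.max())  # ~19: nothing ill-behaved after rescaling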

This raises the question: should fplot produce something that looks good, or something that is more true to the function's actual behaviour?

If you want to sample the function more where it changes rapidly, you can also impose a threshold on the rescaled length of each interval.

Out of interest, in what version of Python were sets introduced?

Sets were introduced in their own module in Python 2.3, and migrated to built-ins in 2.4.

import numpy as np
import matplotlib

class FPlot:
Member

Should subclass object.

Member

Nudge.

@dmcdougall
Member Author

On Sun, Aug 26, 2012 at 08:35:34AM -0700, Anton Akhmerov wrote:

@dmcdougall

Firstly, re-scaling to fit into the unit square is a good option, though I have accounted for this when taking the derivative.

I don't think that taking the derivative is sufficient. It does transform everything to the dimension (y_max - y_min)/(x_max - x_min), but that scale isn't bounded by any value. Consider the exponential function, plotted on the interval 0 to 20, as an example. Your algorithm will encounter a derivative larger than 1e6 on every segment that involves coordinates larger than 13, and will consider each of these intervals singular. At the same time, if you were rescaling, there would be nothing ill-behaved.

Ah, OK, good point. Continuous functions don't necessarily have a bounded derivative on the whole line. So let's say I want to plot tan between 0 and pi. Your suggestion would be to scale it to [0, 1], and refine in the interval [0, 1]. How do you determine when a function has a discontinuity? Personally, if I weren't allowed to take the derivative, I would look at f(x) - f(c) for x and c very close together. By 'close' I mean let c be such that x - c is the current step size. This would work for functions that are discontinuous but have a bounded derivative, like the Heaviside function, for instance.
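
A sketch of that test (slope_tol is a hypothetical tolerance):

def looks_discontinuous(f, x, c, slope_tol=1e3):
    # Across a genuine jump (e.g. the Heaviside step) f(x) - f(c)
    # stays finite as x - c shrinks, so this secant slope grows like
    # 1/(x - c) and eventually exceeds any fixed tolerance; a function
    # with bounded derivative never trips it.
    return abs(f(x) - f(c)) / abs(x - c) > slope_tol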

Another question: let's say I scale the x-coordinates to be between 0 and 1. When the function is unbounded, like tan, how do you determine when to stop in the y-direction?

Out of interest, in what version of Python were sets introduced?

Sets were introduced in their own module in Python 2.3, and migrated to built-ins in 2.4.

Thanks for checking this!


@akhmerov
Contributor

@dmcdougall:
Discontinuities in the derivative or in the function itself do not really hurt you (likewise intervals where the function is not defined): one can just refine down to the dissection limit, and the result will be very reasonable. The problem arises with quickly divergent functions: there, the singular values would make the vertical scale completely unreasonable.

As I wrote earlier, a way to deal with unbounded functions, in my opinion, is to limit the y-scale to, e.g., 100 times the initially sampled y-scale. One can do cleverer things, like choosing the limits so that the plotted information is maximized. That, however, would have more general usability for determining the plotting limits of any plot, and it would require understanding what a good definition of the amount of information read out from a plot is.
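
A minimal sketch of that clamping rule (the function name and the masking strategy are mine; factor=100 follows the suggestion above):

import numpy as np

def clamp_to_initial_scale(ys, y_init, factor=100.0):
    # Mask any sample lying outside `factor` times the initially
    # sampled y-range; such points are treated as belonging to a
    # singularity and are left out of the plot.
    span = float(y_init.max() - y_init.min()) or 1.0
    center = 0.5 * float(y_init.max() + y_init.min())
    half = 0.5 * factor * span
    return np.ma.masked_outside(np.asarray(ys, float),
                                center - half, center + half)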

Anton

@pelson
Member

pelson commented Mar 30, 2013

Needs a little bit of polish, some tests, and a fanfare (documentation), but other than that, I'm behind this - nice feature!

@dmcdougall
Member Author

@pelson I think @akhmerov's suggestions here are too good to pass up. We have two options for pushing this forward: a) we can merge this now (with polish, tests, and docs) and work on implementing @akhmerov's suggestions incrementally after the fact; or b) I can work @akhmerov's suggestions directly into this PR and have the whole lot merged.

I think I prefer b), since we'll be merging higher-quality code. I'd rather have it work right, and to an extremely high quality, from the get-go. I think 1.3.x is still a reasonable target in either case.

@pelson
Member

pelson commented Apr 2, 2013

I'd rather have it work right, and to an extremely high quality, from the get-go.

I'm fine with it either way - the merge-and-improve approach is equally valid in my eyes, but whichever you're most comfortable with is great (it's your feature after all!).

@pelson
Member

pelson commented May 13, 2013

@dmcdougall: Go or no go for v1.3? There is always v1.4... 😄

@mdboom
Member

mdboom commented May 24, 2013

@dmcdougall: I'm going to move the milestone on this to 1.4.x, since that's what I'm gathering from the discussion. Looking forward to this feature!

@pelson
Member

pelson commented May 28, 2013

@dmcdougall as you're wanting to get some more development done on this PR, I suggest we close it until it's ready for more attention/review. My only concern with doing that is that it would be easy to forget about it (which I do not want to do). Perhaps we should keep a wiki of longer-term activities and a feature wishlist, which could be converted to PRs when somebody gets around to doing them...

@mdboom
Member

mdboom commented May 28, 2013

In the 1.3 cycle we've had a number of issues and PRs that were tagged as 1.3.x for things that were "punted" from 1.2, and this did a relatively good job of helping those things get done. If we keep milestones for things that are almost ready for a release, or punted from a previous release, and leave unmilestoned issues for "ongoing work on master", I think it helps the important things rise to the surface. I don't know whether adding another index would actually make things any easier with respect to finding things. What I don't think we should do is close it -- I think marking it as 1.4.x is the best way to remind ourselves to get it done. But we can take that discussion to matplotlib-devel if there's disagreement.

@dmcdougall
Member Author

Yeah, I'd like to leave this open; PRs that are still open are easier for me to follow. I'm really sorry this has festered. I'm hoping to find time, perhaps during the sprint this week, to hit this out of the park. @akhmerov's suggestions shouldn't be too hard to implement; I've just got to sit down and do them.

Thanks for the ping. Will get down to it this week.

@dmcdougall
Member Author

@pelson Can you play with this interactively? I think your idea of trying to get this merged and then making incremental improvements on top is probably better. A separate issue can be opened for the improvements.

@mdboom
Member

mdboom commented Jun 29, 2013

Just to be consistent, the demo should probably call show and be interactive. It's a bit surprising that it just writes files. Plus, using this stuff interactively is an important part of its "coolness" factor.

@@ -0,0 +1,22 @@
import numpy as np
import matplotlib
matplotlib.use('agg')
Member

Seems a little unnecessary.

@mdboom
Member

mdboom commented Jun 29, 2013

Also -- maybe the example could use a cooler-looking function? tan(x) is boring 😦

@dmcdougall
Member Author

This is now awesome. I decided to keep the number of points used to partition the domain fixed, and it's much smoother.

I think I can still do the adaptive refinement, but I'll do it with a fixed number of x-coordinates. That way the window doesn't become over-resolved and sluggish.

Edit: Sprinty McSprintington

1D Callable function plotting.
"""

import sys
Member

nitpick: sys isn't used

@NelleV
Member

NelleV commented Jun 30, 2013

Good work on this PR! I didn't have time to test it, but it looks pretty cool.

I've noticed that you created a new module, fplot, in the lib/matplotlib folder. Shouldn't this module be private? We probably don't want people to import from it directly. Also, now that we've refactored the axes module, maybe all plotting functions should be in lib/matplotlib/axes, or under their own private folder, lib/matplotlib/_plotting? This is something we will need to think about in the next steps of the refactoring of the Axes class.

Plotting functionality for Python callables, à la Matlab's 'fplot'
function.

A fixed step-size is used, but it is recalculated to fit the viewport on
zooming and panning.  This has the effect of 'discovering' more of the function
when the user zooms in on the plot.

Only one callable is supported so far; it would be desirable to support
an array/list of callables and plot them all simultaneously.

@dmcdougall
Member Author

@NelleV Should the private module thing be done here or in a new pull request that refactors all of them together? I'm thinking I'd like to keep those separate, but there's certainly an argument to be made otherwise.

@NelleV
Member

NelleV commented Jul 1, 2013

@NelleV https://github.com/NelleV Should the private module thing be done here or in a new pull request that refactors all of them together? I'm thinking I'd like to keep those separate, but there's certainly an argument to be made otherwise.

I think as long as we do it before releasing 1.4, it should be fine to leave it. Let's open a ticket and bother about this later.



@dmcdougall
Member Author

I'm going to add another demo to this so we aren't stuck with tan. Once I've done that, I'd really like people to get playing with it so we can merge incremental improvements based on community feedback.

@mdboom
Member

mdboom commented Jul 1, 2013

It seems like, in an ideal world, we'd want to base the number of points used on the number of pixels in the axes, with res then being specified in terms of subpixels. I'm not sure that's easy to find out given the current layout, but I haven't spent any time looking.

@akhmerov
Contributor

akhmerov commented Jul 1, 2013

Actually, for most functions that would be overkill: the highest resolution should be based on that, while simple functions should need far fewer points.

@mdboom
Member

mdboom commented Jul 1, 2013

@akhmerov: Can you elaborate? My motivation here is to avoid a plot that looks fine on-screen showing visible segments when printed to a higher-resolution device.

@dmcdougall
Member Author

Actually, my longer-term goal is to utilise #2189: use self.n points while panning and zooming, and then do a refinement only when the event hasn't fired for a while ('a while' needs discussion, but I think my point is clear). This way no refinement is done on the fly; it's just too sluggish.

@dmcdougall
Member Author

And by 'event', I mean the {x,y}lim_changed events.
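
A sketch of that deferred refinement using matplotlib's timer API (the wiring is my own; refine stands for whatever expensive re-sampling FPlot would do):

class DeferredRefiner:
    # Stay coarse while the user pans/zooms; run the expensive
    # refinement only once the view has been idle for delay_ms.
    def __init__(self, ax, refine, delay_ms=500):
        self.timer = ax.figure.canvas.new_timer(interval=delay_ms)
        self.timer.single_shot = True
        self.timer.add_callback(refine)
        ax.callbacks.connect('xlim_changed', self._on_limits_changed)
        ax.callbacks.connect('ylim_changed', self._on_limits_changed)

    def _on_limits_changed(self, ax):
        # Each pan/zoom event restarts the idle countdown.
        self.timer.stop()
        self.timer.start()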

@akhmerov
Contributor

akhmerov commented Jul 1, 2013

The number of points is not really a valuable characteristic of a plot. If the function is linear, you may be fine with just two points. If you have a certain maximal resolution in mind, defined by the device, then the number of points required to render a curve smoothly on that device will depend very strongly on the curve.

@mdboom
Member

mdboom commented Jul 1, 2013

The renderer includes a simplification algorithm that removes points which are collinear with their neighbours, and it would, in fact, reduce a straight line to two points. The issue at hand here is to get enough points so that that algorithm can be left to do its job. (But apologies, since I was extremely implicit about that.)
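
The gist of that simplification, sketched (this is not matplotlib's actual path-simplification code, just the idea):

def drop_collinear(pts, tol=1e-9):
    # Keep an interior point only if it bends the path: the cross
    # product below is twice the area of the triangle (p, q, r), and
    # it is ~0 when the three points are collinear.
    kept = [pts[0]]
    for q, r in zip(pts[1:], pts[2:]):
        p = kept[-1]
        cross = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
        if abs(cross) > tol:
            kept.append(q)
    kept.append(pts[-1])
    return kept  # a straight line collapses to its two endpoints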

@akhmerov
Contributor

akhmerov commented Jul 1, 2013

One should still keep in mind that in many cases evaluating the function itself may be expensive. Cheap functions usually don't require adaptiveness: there you can just specify a lot of points from the start.

@pelson
Member

pelson commented Jan 9, 2014

This PR is a year old, so I'm closing it. Please feel free to update and re-invigorate it, and it can be re-opened.

Thanks
