Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

Already on GitHub? Sign in to your account

Plot limit with transform #731

Merged
merged 11 commits into from Aug 21, 2012

Conversation

Projects
None yet
3 participants
Member

pelson commented Mar 1, 2012

This pull request represents a significant chunk of work to address a simple bug:

import matplotlib.pyplot as plt
import matplotlib.transforms as mtrans

ax = plt.axes()
off_trans = mtrans.Affine2D().translate(10, 10)

plt.plot(range(11), transform=off_trans + ax.transData)

print(ax.dataLim)

The result should be [10, 10, 20, 20], but the offset transform has not been taken into account.

Since a path transformation can be costly, it made sense to use the created Line's cached transform concept.
This threw up another, quite confusing bug:

import matplotlib.projections.polar as polar
import matplotlib.transforms as mtrans
import matplotlib.path as mpath
import numpy as np

full = mtrans.Affine2D().translate(1, 0) + polar.PolarAxes.PolarTransform()

verts = np.array([[0, 0], [5, 5], [2, 0]])
p = mpath.Path(verts)

tpath = mtrans.TransformedPath(p, full)
partial_p, aff = tpath.get_transformed_path_and_affine()

print full.transform_path_affine(full.transform_path_non_affine(p))
print full.get_affine().transform_path_affine(full.transform_path_non_affine(p))

The numbers themselves aren't important, suffice to say that the former is correct.

Additionally, the need for non-affine Transform subclasses to implement transform_non_affine and also copy this definition into transform too is confusing/obfuscating e.g.:

class PolarTransform(Transform):
        def transform(self, tr):
            # ...
            # do some stuff
            # ...
            return xy
        transform.__doc__ = Transform.transform.__doc__

        transform_non_affine = transform
        transform_non_affine.__doc__ = Transform.transform_non_affine.__doc__

This latter complaint is the result of an optimisation that will see little benefit (transform stacks are typically mostly Affine, and the non-affine part is easily cached).

Therefore this pull request represents a simplification (at the cost of a couple more function calls) of the current Transform framework. Whilst it is my opinion that the Transform class heirachy remains non-optimally representative of the problem space, I have tried to be pragmatic in my changes for both backwards compatibility and size of review considerations.

This pull request is independent of the invalidation mechanism upgrade being discussed in #723, and a merge between the two should be straight forward.

The tests run exactly the same as they did before commencing this work (they weren't passing on my machine in the first place, but the RMS values have not changed at all). The run time has gone up 5 seconds up to 458 seconds (~1% slower), but this includes the new tests as a result of this pull.

Note this change subtly affects the way one should implement a Transform. If you are implementing a non affine transformation, then you should
override the transform_non_affine, rather than overriding the transform & copying the transform into transform_non_affine too. e.g.:

class PolarTransform(Transform):
        def transform_non_affine(self, tr):
            # ...
            # do some stuff
            # ...
            return xy
        transform_non_affine.__doc__ = Transform.transform_non_affine.__doc__

The documentation is still representative of this change, hence there are few documentation changes included in this pull request.

@ghost ghost assigned mdboom Mar 12, 2012

Collaborator

jdh2358 commented Mar 17, 2012

Hey Phil,

I just did a quick read through of this PR and it will take me much more time for me to have something intelligent to say. While I wrote the original transformations infrastructure, @mdboom did a thorough revamp of it for mpl 0.98 and he is the resident expert. So hopefully he will chime in here shortly. From my read, it is obvious that this is very careful, well thought out and well tested work, so congrats on successfully diving into one of the hairiest parts of the codebase.

I don't use any exotic non-affine transformations in my own work, just the plain vanilla logarithmic scales, so I am not intimately familiar with many of the issues arising there. I do use the blended transformations quite a bit, and this leads me to my question. In the original bug you discuss:

off_trans = mtrans.Affine2D().translate(10, 10)
plt.plot(range(11), transform=off_trans + ax.transData)
print(ax.dataLim)

which produces incorrect datalim, you write "The result should be [10, 10, 20, 20]". It is not obvious to me that mpl is producing the incorrect result here. I'm not really disagreeing with you here, mainly looking to be educated.

In the simple case of a "blended transformation" produced by axvline:

In [46]: ax = plt.gca()

In [47]: line, = ax.plot([0, 1], [.5, .7])

In [48]: ax.dataLim
Out[48]:
Bbox(array([[ 0. ,  0.5],
       [ 1. ,  0.7]]))

In [49]: linev = ax.axvline(0.5)

In [50]: ax.dataLim
Out[50]:
Bbox(array([[ 0. ,  0.5],
       [ 1. ,  0.7]]))

In [51]: linev.get_ydata()
Out[51]: [0, 1]

In this case, even though the axvline has y coords in [0, 1], this does not affect the axes datalim y limits, because using the blended transform (x is data, y is axes) we do not consider the y coordinate to be in the data system at all. The line will span the y space regardless of the x data limits or view limits. Obviously when presented with a generic transformation, there is no way for mpl to infer this, so we give it a hint with the x_isdata and y_isdata which is then respected by update_from_path. I see you support this behavior in your comment about backwards compatibility in _update_line_limits

Now this breaks down if I create the blended transformation on my own and do not set these x_isdata/y_isdata hints:

In [53]: ax = plt.gca()

In [54]: trans = transforms.blended_transform_factory(
    ax.transData, ax.transAxes)

In [55]: line, = ax.plot([1,2,3], transform=trans)

In [56]: ax.dataLim
Out[56]:
Bbox(array([[ 0.,  1.],
       [ 2.,  3.]]))

In the original (pre-Michael refactor) transformation scheme, the dataLim gave the bounding box of the untransformed data, the viewLim gave the bounding box of the view port, and the transformation took you from data limits to figure pixels. The reason the y-data in the blended axes lines like axvline were not considered was because these were already considered "pre-transformed" if you will, and hence not part of the data limits. Ie, only the raw, untransformed data were included in the datalim.

This early scheme obviously wasn't robust enough to handle all the wonderful transformations we can now support following Michael's refactor, but he left those attributes (dataLim, viewLim) in for backwards compatibility, and I think some of the weirdness we are now seeing is a result of a more thorough treatment of transformations trying to maintain as much backwards compatibility as possible.

I mention all of this just for historical context so you might understand why some of the weird things you are seeing are in there. I am not advocating slavish backwards compatibility, because I'd rather have a correct and robust system going forward, and you've done an admirable job in your patch supporting these idiosyncracies already. What I'm trying to understand is why the dataLim in the case of the initial bug you proposed should utilize the transform at all, when the dataLim in the other cases we have considered (log scales, polar transforms, funky blended transforms from axvline etc), all ignore the transformation.

I haven't yet gotten to your second "quite confusing bug" :-)

Owner

mdboom commented Mar 19, 2012

I have to agree with John that a lot of work has clearly been put into this pull request on some of the trickiest parts of the code, without much guidance from the original author (I've been busy on many non-matplotlib things lately). I hope we can (over time) simplify rather than further complicate the transforms system -- in the long term even by breaking backward compatibility if there's good benefits to doing so.

Let me address each bug independently.

  1. I'm not sure it's actually a bug. I don't think it's ever been the case that the auto view limit algorithm takes into account the transformation of the data. The transform member of the artist is intended to be a way to get from data coordinates to the page, not to be a way to pre-transform the data coordinates into effectively other data coordinates. To put this another way, the "actual" data coordinates should have an impact on the ticking the view box, "transformed" data coordinates should not, and are not intended to. That's what the scale and projection infrastructure is for. The transform member is used, for example, by the "Shadow" patch to slightly offset the drawing of the data without affecting the data itself. (And I think it's the fault of documentation that that isn't very clear -- the first commit here says it's fixing it to work as documented, but I'm not sure what documentation you're referring to -- we should clarify anything that is misleading). Consider the case of a log transform -- if that was done by prepending a log transform to the line's transform member, there would be no way of communicating to the axes ticker that ticks should be handled differently. I think what is more appropriate, given the current imperfect framework, is to define a new scale or projection along these lines:

http://matplotlib.sourceforge.net/devel/add_new_projection.html?highlight=new%20projection

And, as defining a scale is somewhat heavyweight, providing a new interface to handle the simple case where one just wants to provide a 1D transformation function would be a nice new features, IMHO. It's is also possible to transform the data before passing it to matplotlib. But of course, understanding your use case here better may help us arrive at an even better solution.

As for 2): I agree it's a bug. Is there anyway you could pull out independently the solution just for that part? Also note you say the first result is correct -- I tend to agree, but with this pull request the former returns the same erroneous results as the latter.

  1. Changing how transforms are defined: The current approach leaves some flexibility in that one can define three implementations, affine, non-affine and combined. "Combined" in most cases will be equivalent to affine + non-affine, but there may be cases where it is more efficient (particularly in code not written using numpy) to do both in a single swoop, and I wouldn't want to lose that flexibility. As it stands now, if a Transform subclass defines transform and not transform_non_affine, transform_non_affine implicitly delegates transform. (See transforms.py:1098). So the micro-optimization where transforms add transform_non_affine = transform is not strictly necessary, but it doesn't change the behavior. So, unless I'm missing something, I don't think your change is necessary (even if only for reducing keystrokes). One might argue that we should have transform delegate to transform_non_affine instead of the reverse, but that seems like change for change's sake. The reason it is the way it is now is that I assumed most people writing custom transformations would be writing non-affine ones, and this allows them to be ignorant of the whole affine/non-affine distinction and just write a method called transform.

Thanks for all your work on this -- it's great to have someone picking it apart and this level of detail. I hope I don't come across as discouraging pull requests -- in fact I'd love to pass this baton along and I think there's a lot of food for thought here. As for next steps, I think it might be most helpful to have an independent pull request just for #2, and continue to discuss ways of supporting the use case suggested by #1 (perhaps on a new issue).

Member

pelson commented Mar 20, 2012

Mike, John,

Firstly, thank you very much for all of your time looking at this non trivial pull request, your feedback is massively valuable and I really appreciate it. I should add that normally when I say the word bug, it tends to mean "it is not behaving as I would expect"; I guess that is not its strict definition in the dictionary :-) .

I hope I don't come across as discouraging pull requests

Not at all. The beauty of github is that we can have in-depth discussions about how code should look and behave and I only see benefit from your input.

John, your example is a good one. In my opinion the current behaviour is undesirable:

>>> import matplotlib.pyplot as plt
>>> ax = plt.gca()
>>> line, = ax.plot([0.2, 0.3], [0.6, 0.7]) 
>>> ax.dataLim
Bbox(array([[ 0.2,  0.6],
       [ 0.3,  0.7]]))
>>> line_ax, = ax.plot([0, 1], [0.1, 0.9], transform=ax.transAxes) 
>>> ax.dataLim
Bbox(array([[ 0. ,  0.1],
       [ 1. ,  0.9]]))

It is my opinion that the dataLim should be unaffected by anything which does not involve the transData. If a transform only involves a part (be it x or y components) then only that part should affect the dataLim. This information is derivable (trans.contains_branch(self.transData) in this pull request) and doesn't need to be limited to the easily controlled cases such as axvline. I haven't yet overriden the behaviour of contains_branch for blended transforms, partially as it seems the result needs to be a 2-tuple rather than a single boolean, but certainly the capability should be fairly straight forward.

The transform member of the artist is intended to be a way to get from data coordinates to the page, not to be a way to pre-transform the data coordinates into effectively other data coordinates.

I guess this is the fundamental shift that this pull request is trying to achieve. The way the transform framework has been implemented means that there is great flexibility when it comes to handling data in different coordinate systems on the same plot (or projection in your terms) without having to "re-project" in advance (i.e. I can work in each datum's "native" coordinate system). Thanks to this I am able to plot polar data with a theta origin at pi/4 from north on the same plot as data with a different theta origin. Similarly, I am able to work with geospatial data from multiple map projections and put them all on to an arbitrary map projection - without having to re-project the data first:

ax = plt.axes(projection='map_projection1')
plt.plot(xdata_in_proj2_coords, ydata_in_proj2_coords, 
           transform=map_projection2_to_1_transform + ax.transData)

This code pretty much just works with v1.1, except for the data limit calculation which currently assumes that xdata_in_proj2_coords is in projection1's coordinate system.

I don't want to make this post to long, so I will leave it there for now with the hope that this has been sufficient to explain why I made this pull request and that it will help inform our discussion further.

All the best,

Member

pelson commented May 1, 2012

Woops, I added (and have subsequently removed) a commit which I didn't intend to include.

Member

pelson commented Aug 5, 2012

@mdboom: I am still keen to get this functionality in before the 20th. It will need a little work to address some of your concerns, and I hope to avoid the need for x_isdata and y_isdata.

Member

pelson commented Aug 13, 2012

@mdboom: I have rebased and removed the use of x_isdata in favour of deriving this information from the transforms. This was surprisingly easy with the changes being proposed here, and for me is a good sign that the changes are valuable.

The things which I think need discussion (please add more if you have anything) are:

  • a usecase for "pre-transforming" ones data and allowing the dataLim and viewLim to reflect those values
  • undoing the "Changing how transforms are defined" (3)
  • extracting the fix to the transform non-affine + transform affine to a separate request

I suggest we have these discussions in this PR, but try to keep the posts short-ish. If it gets a bit noisy, we can always use the devel mailing list.

Member

pelson commented Aug 13, 2012

A usecase for "pre-transforming" ones data

The example I gave previously is my primary usecase for making it possible to plot data which is not necessarily in the ax.transData coordinate system. I would like to be able to define matplotlib axes subclasses which represent map projections (e.g. Hammer) and be able to add lines, polygons and images from other cartographic projections (e.g. PlateCarree). I do not want to limit the user as to which cartographic projection they use for their data. To take a tangible case, suppose a user as an image in Mercator that they want to put next to a line in PlateCarree, onto a Robinson map. The syntax that I would like to achieve is:

plate_carree = MyProjectionLibrary.PlateCarree()
robinson = MyProjectionLibrary.Robinson()
mercator = MyProjectionLibrary.Mercator()

ax = plt.axes(projection=robinson)
plt.plot(lons, lats, transform=plate_carree, zorder=100)
ax.imshow(img, transform=mercator)

The beauty of this interface is that it is so familiar to a mpl user that they can just pick it up and run. Apart from the need for me to expose a _as_mpl_transform api which would provide parity with the _as_mpl_projection interface, this all just works (my axes subclass does the image transformations), except from the fact that the dataLim has been incorrectly set to be in plate_carree coordinates rather than robinson ones.

Member

pelson commented Aug 13, 2012

Owner

mdboom commented Aug 14, 2012

Ok -- @pelson: does that mean you're working on a fix?

I think, if we can, it would be nice to include this before the freeze to get it lots of random user testing before the release. This is one of the more "open heart surgery" PR's in the pipe.

Member

pelson commented Aug 14, 2012

Does that mean you're working on a fix?

No, but I will do in the next 3 hours or so.

This is one of the more "open heart surgery" PR's in the pipe.

Ha. I see what you mean. I agree that, because the unit tests don't have full coverage, the only way we can have confidence with code is to put it out in the wild.

@pelson pelson and 1 other commented on an outdated diff Aug 14, 2012

lib/matplotlib/patches.py
return artist.Artist.get_transform(self)
def get_patch_transform(self):
+ """
+ Return the :class:`~matplotlib.transforms.Transform` ... I'm not sure
@pelson

pelson Aug 14, 2012

Member

@mdboom: Would you know what these (get_patch_transform and get_data_transform) are for? I would like to get a one liner for their purpose.

@pelson

pelson Aug 20, 2012

Member

Needs resolving before merging.

@mdboom

mdboom Aug 20, 2012

Owner

data_transform maps the data coordinates to physical coordinates.

patch_transform maps the native coordinates of the patch to physical coordinates. For example, to draw a circle with a radius of 2, the original circle goes from (-1, -1) to (1, 1) (i.e. radius == 1), and the patch transform would scale it up to 2.

Member

pelson commented Aug 14, 2012

Ok. I think this is in a good state now (changelog and tests polished a little). travis-ci seems to be having a little bit of a problem atm, which I don't think is related, so I haven't actually been able to test this on python3 just yet.

@pelson pelson commented on an outdated diff Aug 20, 2012

doc/api/api_changes.rst
+ >>> print(ax.viewLim)
+ Bbox('array([[ 0., 0.],\n [ 90., 90.]])')
+
+* One can now easily get a transform which goes from one transform's coordinate system
+ to another, in an optimized way, using the new subtract method on a transform. For instance,
+ to go from data coordinates to axes coordinates::
+
+ >>> import matplotlib.pyplot as plt
+ >>> ax = plt.axes()
+ >>> data2ax = ax.transData - ax.transAxes
+ >>> print(ax.transData.depth, ax.transAxes.depth)
+ 3, 1
+ >>> print(data2ax.depth)
+ 2
+
+ for versions before 2.0 this could only be achieved in a sub-optimal way, using
@pelson

pelson Aug 20, 2012

Member

version number is wrong. Should be 1.2

@pelson pelson commented on an outdated diff Aug 20, 2012

lib/matplotlib/transforms.py
has_inverse = False
+ """True if this transform as a corresponding inverse transform."""
@pelson

pelson Aug 20, 2012

Member

as -> has

@pelson pelson commented on an outdated diff Aug 20, 2012

lib/matplotlib/transforms.py
self._a = a
self._b = b
self.set_children(a, b)
self._mtx = None
+ if DEBUG:
+ def __str__(self):
+ return '(%s, %s)' % (self._a, self._b)
+
+ @property
+ def depth(self):
+ return self._a.depth + self._b.depth
+
+ def _iter_break_from_left_to_right(self):
+ for lh_compliment, rh_compliment in self._a._iter_break_from_left_to_right():
+ yield lh_compliment, rh_compliment + self._b
+ for lh_compliment, rh_compliment in self._b._iter_break_from_left_to_right():
+ yield self._a + lh_compliment, rh_compliment
+
+
@pelson

pelson Aug 20, 2012

Member

One too many newlines

Member

pelson commented Aug 20, 2012

@mdboom: I don't know when you last read this, so I am holding back from rebasing as the only way you can have confidence in the rebase is to re-read the whole lot.

Merging by hand at this point is probably a better bet. Are you happy to do this, or would you like me to do it?

Owner

mdboom commented Aug 20, 2012

Why don't you address the small documentation changes and do a rebase -- it's probably easiest to comment on the rebase here as a pull request than in a manual merge.

Member

pelson commented Aug 20, 2012

Ok. Will do shortly. Just working on pickle PR.

Member

pelson commented Aug 20, 2012

Ok. The conflicts were to do with the test_transform.py and api_changes.rst and were pretty straight forward. This is now good to go as far as I can see.

Owner

mdboom commented Aug 20, 2012

Thanks for the rebase -- I'll have another look at this today, but it may not be until later in the day.

@mdboom mdboom commented on an outdated diff Aug 21, 2012

lib/matplotlib/lines.py
@@ -446,6 +446,10 @@ def recache(self, always=False):
self._invalidy = False
def _transform_path(self, subslice=None):
+ """
+ Puts a TransformedPath instance at self._transformed_path,
+ all invalidation of the transform is then handled by the TransformedPath instance.
@mdboom

mdboom Aug 21, 2012

Owner

Too long line.

Owner

mdboom commented Aug 21, 2012

Ok -- I think other than my nit about line length and filling in the docstrings for get_data_transform and get_patch_transform, I think this is good to go.

Member

pelson commented Aug 21, 2012

Thanks Mike. All done.

@mdboom mdboom added a commit that referenced this pull request Aug 21, 2012

@mdboom mdboom Merge pull request #731 from pelson/plot_limit_with_transform
Plot limit with transform
98f6eb2

@mdboom mdboom merged commit 98f6eb2 into matplotlib:master Aug 21, 2012

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment