Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pylab.plot markers aren't independent from lines (pylab: 1.9.2) #4338

Closed
enfeizhan opened this issue Apr 15, 2015 · 20 comments
Closed

pylab.plot markers aren't independent from lines (pylab: 1.9.2) #4338

enfeizhan opened this issue Apr 15, 2015 · 20 comments
Milestone

Comments

@enfeizhan
Copy link

pylab.plot is supposed to draw lines and/or markers but it turns out markers aren't independent from lines. Here is an example:

import matplotlib.pylab as plt
marker_size=10
plt.figure()
plt.subplot(221)
plt.plot(
        range(10),
        linestyle='--',
        )
# drawing a line looks fine
plt.subplot(222)
plt.plot(
        range(10),
        marker='o',
        markersize=marker_size,
        markeredgewidth=2,
        markeredgecolor='r',
        )
# drawing markers is a bit problematic. There is always a line associated with markers.
plt.subplot(223)
plt.plot(
        range(10),
        linestyle=' ',
        marker='o',
        markersize=marker_size,
        markeredgewidth=2,
        markeredgecolor='r',
        )
# In order to ONLY draw markers, 'linestyle' should be set an empty string.
plt.subplot(224)
plt.plot(
        range(10),
        linestyle='--',
        linewidth=3,
        marker='o',
        markeredgewidth=2,
        markeredgecolor='r',
        markersize=marker_size,
        )
# It's even more strange that when you do want makers with a line, 'linestyle' also affects marker
# edges. In this example, line is set dashed. Then the marker's edges get dashed too!
plt.draw()
plt.show()

There are of-course ways to get around these problems, but wondering maybe this is a bug unexpected by maintainers.

@tacaswell tacaswell added this to the next point release milestone Apr 15, 2015
@tacaswell
Copy link
Member

Case 2 and 3 are the intended behaviour, if a linestlye is not specified, use the default (which by default is '-'). Can you point to where in the documentation leads you to expect that case 3 would not draw a line?

The linestlye getting applied to the markers I don't think is the intended behaviour and is probably a bug.

@WeatherGod
Copy link
Member

indeed, there is plot() and there is scatter(). Plot draws lines with
markers denoting data, while scatter draws just markers. There are also
other important differences such as being able to apply colormaps to
scatter markers or to have many different sizes of markers. All of these
things can be done with scatter(), but not plot(). The fact that plot() can
be done without a line is a (accidental?) feature that used to be heavily
used back when scatter plots were not optimized for large number of points
(that has been mostly fixed now).

As for the linestyle question, I can't reproduce it with a somewhat recent
checkout of the code. The red outlines are solid to me. I even increased
the size of the marker just to be sure, and they are solid, not dashed.
Which version are you using?

On Wed, Apr 15, 2015 at 10:14 AM, Thomas A Caswell <notifications@github.com

wrote:

Case 2 and 3 are the intended behaviour, if a linestlye is not specified,
use the default (which by default is '-'). Can you point to where in the
documentation leads you to expect that case 3 would not draw a line?

The linestlye getting applied to the markers I don't think is the intended
behaviour and is probably a bug.


Reply to this email directly or view it on GitHub
#4338 (comment)
.

@efiring
Copy link
Member

efiring commented Apr 15, 2015

On 2015/04/15 4:46 AM, Benjamin Root wrote:

The fact that plot() can
be done without a line is a (accidental?) feature that used to be heavily
used back when scatter plots were not optimized for large number of points
(that has been mostly fixed now).

No. It has always been intended that there are three use cases for plot:
with a line only, with markers only, and with line and markers.
Consider the shorthand plot arguments: 'ro', for example, gives red
circular markers, with no line, 'ro-' markers and line, 'r-' or 'r',
line only. There has never been any intention that scatter should be
used as a substitute for the first of these.

If dashed linestyle is being applied to marker boundaries in plot, I
believe it should be considered a bug. If support for other than solid
linestyles is desired for markers, then it should be controlled by a new
"markerlinestyle" to be consistent with "markeredgewidth" etc.

Eric

@WeatherGod
Copy link
Member

It may have always been intended that way, but that isn't how users see it.
From the user's perspective, line plots are done using plot(), while
scatter plots are done through scatter(). The default plot() call makes the
line, so not having the line is the exceptional behavior. The fact that so
many people ran into the old performance issue in scatter() that was easily
worked around by using plot-without-lines indicates to me that many other
people see it that way, too.

In any case, I can not reproduce the bug indicated by the OP, and I see no
reason why plot markers should be independent from lines, insofar as if I
specify just "color", then I expect both the line and the marker to follow
that color. If I want a different color for the marker, then I have
"markerfacecolor" and "markeredgecolor". So, the same logic would follow
for linestyles, if one would ever want to have a linestyle applied to
marker edges (doubtful).

On Wed, Apr 15, 2015 at 2:26 PM, Eric Firing notifications@github.com
wrote:

On 2015/04/15 4:46 AM, Benjamin Root wrote:

The fact that plot() can
be done without a line is a (accidental?) feature that used to be heavily
used back when scatter plots were not optimized for large number of
points
(that has been mostly fixed now).

No. It has always been intended that there are three use cases for plot:
with a line only, with markers only, and with line and markers.
Consider the shorthand plot arguments: 'ro', for example, gives red
circular markers, with no line, 'ro-' markers and line, 'r-' or 'r',
line only. There has never been any intention that scatter should be
used as a substitute for the first of these.

If dashed linestyle is being applied to marker boundaries in plot, I
believe it should be considered a bug. If support for other than solid
linestyles is desired for markers, then it should be controlled by a new
"markerlinestyle" to be consistent with "markeredgewidth" etc.

Eric


Reply to this email directly or view it on GitHub
#4338 (comment)
.

@mdboom
Copy link
Member

mdboom commented Apr 15, 2015

The bug -- that markers are picking up the linestyle -- may be backend-specific. What backend are you using?

@WeatherGod
Copy link
Member

I am using TkAgg.

On Wed, Apr 15, 2015 at 2:57 PM, Michael Droettboom <
notifications@github.com> wrote:

The bug -- that markers are picking up the linestyle -- may be
backend-specific. What backend are you using?


Reply to this email directly or view it on GitHub
#4338 (comment)
.

@efiring
Copy link
Member

efiring commented Apr 15, 2015

On 2015/04/15 8:54 AM, Benjamin Root wrote:

It may have always been intended that way, but that isn't how users see it.

Some users. Your perspective is new to me.

From the user's perspective, line plots are done using plot(), while
scatter plots are done through scatter(). The default plot() call makes the
line, so not having the line is the exceptional behavior. The fact that so

I wouldn't call non-default "exceptional".

I don't want the plot-with-markers-only to be discouraged, or
deprecated, or typically replaced with scatter. Scatter has a different
interface and vastly different range of use cases.

Being able to switch between line only and markers only in plot is an
important capability, fundamental to use cases that are typical in at
least some areas.

many people ran into the old performance issue in scatter() that was easily
worked around by using plot-without-lines indicates to me that many other
people see it that way, too.

But that's not a valid argument for saying scatter should be used when
plot-without-lines would be perfectly adequate.

In any case, I can not reproduce the bug indicated by the OP, and I see no
reason why plot markers should be independent from lines, insofar as if I
specify just "color", then I expect both the line and the marker to follow
that color. If I want a different color for the marker, then I have
"markerfacecolor" and "markeredgecolor". So, the same logic would follow
for linestyles, if one would ever want to have a linestyle applied to
marker edges (doubtful).

That's the point: one doesn't want that, so if it occurs, it's a bug.

@WeatherGod
Copy link
Member

I completely realize that it is some users. I make zero claim that all
users see it that way. I am also not discouraging plot-with-markers-only.
Nor am I providing an argument for saying that scatter() should be used.
I am only pointing out that there are many who do see the the distinction
between plot() and scatter() differently, and the performance regression
situation was presented as evidence that a not insignificant number of
users see it differently.

I should point out that similar discussions have occurred over in the
MATLAB community, so the whole plot vs scatter distinction argument isn't
even specific to matplotlib. Even MATLAB had a performance regression with
regards to scatter(), go figure...

On Wed, Apr 15, 2015 at 3:30 PM, Eric Firing notifications@github.com
wrote:

On 2015/04/15 8:54 AM, Benjamin Root wrote:

It may have always been intended that way, but that isn't how users see
it.

Some users. Your perspective is new to me.

From the user's perspective, line plots are done using plot(), while
scatter plots are done through scatter(). The default plot() call makes
the
line, so not having the line is the exceptional behavior. The fact that
so

I wouldn't call non-default "exceptional".

I don't want the plot-with-markers-only to be discouraged, or
deprecated, or typically replaced with scatter. Scatter has a different
interface and vastly different range of use cases.

Being able to switch between line only and markers only in plot is an
important capability, fundamental to use cases that are typical in at
least some areas.

many people ran into the old performance issue in scatter() that was
easily
worked around by using plot-without-lines indicates to me that many other
people see it that way, too.

But that's not a valid argument for saying scatter should be used when
plot-without-lines would be perfectly adequate.

In any case, I can not reproduce the bug indicated by the OP, and I see
no
reason why plot markers should be independent from lines, insofar as if I
specify just "color", then I expect both the line and the marker to
follow
that color. If I want a different color for the marker, then I have
"markerfacecolor" and "markeredgecolor". So, the same logic would follow
for linestyles, if one would ever want to have a linestyle applied to
marker edges (doubtful).

That's the point: one doesn't want that, so if it occurs, it's a bug.


Reply to this email directly or view it on GitHub
#4338 (comment)
.

@enfeizhan
Copy link
Author

@tacaswell the documentation says,

Plot lines and/or markers to the ...

When "or" is applied here, I would interpret as plot-line-only or plot-markers-only.

@WeatherGod matplotlib.__version__ gives me 1.4.0; matplotlib.pylab.__version__ gives me 1.9.2.

@mdboom matplotlib.get_backend() and 'matplotlib.pylab.get_backend()` all gives me MacOSX.

While typing this reply, I got a breakthrough. I think @mdboom is correct, because the result looks differently before and after being saved!
Here is what it looks like before saved (Attention to the markers' edge in the lower right subplot):

screen shot 2015-04-16 at 10 43 34 am

And here is after saved (Now the markers' edge looks different from it in the last figure):
figure_2

Thank you guys for your attention! Though this may not be serious, I appreciate plot can produce three kinds of plot, as @efiring mentioned. In my opinion scatter shouldn't be a substitute for plot-marker-only, because under pandas scatter explicitly needs one column for x-axis and a second for y-axis but plot assumes the row index is for x-axis and all the input columns are for y-axis, which is very nice for datetimeIndex and each columns give a different curve.

@tacaswell
Copy link
Member

I have a history of encouraging users to use plot-with-out-a-line in any case where they wanted just markers and only to use scatter if they wanted to scale either the size or color of the markers as a function of a third (or fourth) dimension.

attn @mdehoon this looks like a macosx specific issue.

@tacaswell
Copy link
Member

I would also point out that ax.plot and DataFrame.plot are only roughly related.

@enfeizhan
Copy link
Author

Thanks @tacaswell! I'll migrate to using ax.plot instead of DataFrame.plot:)

@mdehoon
Copy link
Contributor

mdehoon commented Apr 16, 2015

I believe the bug is in the Line2D class, specifically the draw() method, in this section:

        funcname = self._lineStyles.get(self._linestyle, '_draw_nothing')
        if funcname != '_draw_nothing':
            tpath, affine = transf_path.get_transformed_path_and_affine()
            if len(tpath.vertices):
                self._lineFunc = getattr(self, funcname)
                funcname = self.drawStyles.get(self._drawstyle, '_draw_lines')
                drawFunc = getattr(self, funcname)
                drawFunc(renderer, gc, tpath, affine.frozen())    # (A) 

        if self._marker:     # (B)
            gc = renderer.new_gc()
            self._set_gc_clip(gc)
            rgbaFace = self._get_rgba_face()

At line (A), we call drawFunc and give it the graphics context. drawFunc tells the graphics context to draw lines as dashed lines. At line (B), we use the same graphics context to draw the marker. So it also gets drawn with dashed lines.

The bug can be fixed by saving and restoring the graphics context around line (A):

                gc = renderer.new_gc()
                drawFunc(renderer, gc, tpath, affine.frozen()) # (A)
                gc.restore()

@tacaswell
Copy link
Member

In the code block in line (B) there is a call to render.new_gc() which in the base implementation returns a new GraphicsContextBase (https://github.com/matplotlib/matplotlib/blob/master/lib/matplotlib/backend_bases.py#L711) so I do not see restoring the gc used to draw the line will affect anything (and why this currently works with Agg).

It looks like the OSX backend returns the same context (not a new one) when new_gc) is called which I think rather solidly makes this a bug is the the osx code (https://github.com/matplotlib/matplotlib/blob/master/lib/matplotlib/backends/backend_macosx.py#L98). From a quick survey, it looks like only osx and cairo do not return a new gc, Agg, ps, pdf, and pgf do.

If this is related the inherent limitations of the osx backend (that it can't have more than one gc?) we need to have a discussion about how much we want to complicate/restrict/rewrite the rest of the library to deal with it.

I suspect that we could work around this by having the OSX backend return proxy objects and keep a weakref to them (so it can tell when the clients of the gc have gone out of scope) and maintain a stack of states internally.

It might be simpler to use proxy objects which just keep a copy of the state the GC should have, a ref to the osx gc, and updates the osx gc dynamically every time it is used.

@tacaswell tacaswell modified the milestones: proposed next point release, next point release Apr 16, 2015
@mdehoon
Copy link
Contributor

mdehoon commented Apr 16, 2015

We had this discussion several years ago. new_gc is a bit of a misnomer; it does not guarantee to create a new graphics context. In fact, on Mac OS X it never returns a new graphics context, as the OS has no way of creating a new graphics context from scratch. This is not a problem as long as the calling code does not assume that the graphics context is actually new. Matplotlib abides by this rule whenever new_gc is used. I think that there are very few cases in matplotlib where this rule is not followed; if there were, we'd be running into bugs all the time.

@WeatherGod
Copy link
Member

Maybe it is time to consider rewriting graphics contexts as context
managers?

On Thu, Apr 16, 2015 at 9:23 AM, mdehoon notifications@github.com wrote:

We had this discussion several years ago. new_gc is a bit of a misnomer;
it does not guarantee to create a new graphics context. In fact, on Mac OS
X it never returns a new graphics context, as the OS has no way of creating
a new graphics context from scratch. This is not a problem as long as the
calling code does not assume that the graphics context is actually new.
Matplotlib abides by this rule whenever new_gc is used. I think that there
are very few cases in matplotlib where this rule is not followed; if there
were, we'd be running into bugs all the time.


Reply to this email directly or view it on GitHub
#4338 (comment)
.

@mdehoon
Copy link
Contributor

mdehoon commented Apr 16, 2015

Btw it seems to me that we can get rid of the call to new_gc in the line following (B), and the corresponding gc.restore() further down.

@mdehoon
Copy link
Contributor

mdehoon commented Apr 17, 2015

I am seeing the same bug with gtkcairo on Linux.

Anyway, I am a bit confused about the draw() method of Line2D. Modifying the graphics context in lines 692-708 is only relevant if funcname != '_draw_nothing', as tested in line 711. Then wouldn't it make sense to perform this test first, and then modify the graphics context?

@mdehoon
Copy link
Contributor

mdehoon commented Apr 17, 2015

@enfeizhan Can you try pull request #4348 to confirm it fixes this bug, and that at doesn't introduce new bugs? It would be great if you could check both with the Mac OS X native backend and with the Tkagg backend.

tacaswell added a commit that referenced this issue Apr 17, 2015
FIX : Reorder the code in the draw() method of Line2D

fixes #4338
@mdehoon
Copy link
Contributor

mdehoon commented Apr 18, 2015

Once again the Mac OS X native backend is absolved of all blame :-).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants