matplotlib pcolormesh seems to slide some data around on the plot #6331

Closed
mnky9800n opened this issue Apr 25, 2016 · 11 comments

Comments

@mnky9800n

Note: I originally posted this question on Stack Overflow.

Matplotlib Version = 1.5.1
Installation method = anaconda

I'm plotting data with the matplotlib functions pcolormesh and imshow. With pcolormesh I get artifacts where some of the data appears to slide around:

[image: pcolormesh output]

whereas imshow does not:

[image: imshow output]

I was able to produce an example that shows the same artifacts:

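import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Three independent columns of normally distributed random floats
data = pd.DataFrame({'x': np.random.normal(loc=0.5, size=5000),
                     'y': np.random.normal(loc=0.5, size=5000),
                     'z': np.random.normal(loc=0.5, size=5000)})

# Pivot to a 2-D table; each (x, y) pair fills one cell, the rest are NaN
data_pivot = data.pivot(index='x', columns='y', values='z')
x = data_pivot.index.values
y = data_pivot.columns.values
z = data_pivot.values
masked_data = np.ma.masked_invalid(z)  # mask the NaN cells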

which produces figures like the following:

fig, ax = plt.subplots(1, figsize=(8,8))
ax.pcolormesh(x, y, masked_data)

[image: pcolormesh output for the example data]

Where do these artifacts come from? There isn't anything wrong with the data as far as I can tell since the original data and the made up data produce the same result.

@mnky9800n changed the title from "matplotlib pcolormesh creates data artifacts" to "matplotlib pcolormesh seems to slide some data around on the plot" on Apr 25, 2016
@efiring
Member

efiring commented Apr 25, 2016

Pcolormesh requires X and Y to define a quadrilateral grid. Each Zij determines the color of a quadrilateral defined by X and Y at ij, i,j+1, i+1,j, i+1,j+1. In other words, X and Y define boundaries, not centers. I don't understand how you are generating your X and Y, but I suspect the result is not consistent with what pcolormesh requires.
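A minimal sketch of the boundary convention described above, using made-up arrays. The names `x_edges` and `y_edges` and all values are illustrative assumptions, not data from this issue. It assumes 1-D edge arrays, which pcolormesh accepts for a rectilinear grid:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt

# Z has 2 rows and 3 columns, so the grid needs 3 row edges and 4 column edges.
z = np.arange(6).reshape(2, 3)
x_edges = np.array([0.0, 1.0, 2.0, 5.0])  # last cell is deliberately 3x wider
y_edges = np.array([0.0, 1.0, 2.0])

fig, ax = plt.subplots()
mesh = ax.pcolormesh(x_edges, y_edges, z)

# One color value per quadrilateral: 2 * 3 cells.
n_cells = mesh.get_array().size
```

Because the last interval in `x_edges` is wider, the last column of cells is drawn wider as well; pcolormesh draws whatever grid it is given.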

@mnky9800n
Author

Does this mean that Z always needs to be the same size as the layer beneath it? For instance, when plotting basemap maps?

@WeatherGod
Member

No, each pcolormesh "object" is independent of the others.


@WeatherGod
Member

Also, which version of matplotlib are you using? Older versions had some very weird corner cases that might explain this problem (it was an oddity with handling polygons, but it was only exhibited in contourf3d() plots, IIRC).

@WeatherGod
Member

And I just noticed that you already stated the version (v1.5.1), so that isn't the problem...

@mnky9800n
Author

I thought that there was a problem with parsing nans, so I added some using an example from the basemap documentation for pcolormesh:

from osgeo import gdal
import numpy as np
import matplotlib.pyplot as plt
ds = gdal.Open("/home/max/Downloads/BasemapTutorial-master/code_examples/sample_files/dem.tiff")
# Cast to float so NaN can be assigned (assumes the DEM may be stored as integers)
data = ds.ReadAsArray().astype(float)

data[data < data.mean()] = np.nan

# `map` is the Basemap instance from the tutorial example (not defined in this snippet)
x = np.linspace(0, map.urcrnrx, data.shape[1])
y = np.linspace(0, map.urcrnry, data.shape[0])

xx, yy = np.meshgrid(x, y)
masked_data = np.ma.masked_invalid(data)

fig, ax = plt.subplots(1, figsize=(4,4))

ax.pcolormesh(xx, yy, masked_data)
ax.set_ylim(0, y.max())
ax.set_xlim(0, x.max())

image

However, as you can see, this doesn't affect the figure the way it did before.

So I tried to replicate this as closely as possible in the code I actually want to get to work and I still get artifacts:

In[5]: fmd_df.head()
Out[5]:
       lat    lon   b_value
1067  21.8  121.6  0.429033
1068  21.8  121.7  0.443072
1143  21.9  121.3  0.427714
1144  21.9  121.4  0.431214
1145  21.9  121.5  0.424853

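from mpl_toolkits.basemap import Basemap

# Rows are latitudes, columns are longitudes
val_df = fmd_df.pivot(index='lat', columns='lon', values='b_value')

fig, ax = plt.subplots(1, figsize=(8,8))

# Map extent taken from the range of the pivoted data
m = Basemap(projection='merc',
            llcrnrlon=val_df.columns.min(),
            llcrnrlat=val_df.index.min(),
            urcrnrlon=val_df.columns.max(),
            urcrnrlat=val_df.index.max(),
            )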
m.drawcoastlines()

lats = val_df.index
lons = val_df.columns
data = np.ma.masked_invalid(val_df.values)

lons, lats = np.meshgrid(lons, lats)

m.pcolormesh(lons, lats, data, latlon=True)

image

I'm really not clear as to why this is happening.

@mnky9800n
Author

mnky9800n commented Apr 25, 2016

Two additional images for reference comparing imshow and pcolormesh.

image

image

And if we zoom in on the artifact:

fmd_slice = fmd_df[(fmd_df.lat >= 45) & (fmd_df.lon >= 150)]
val_df = fmd_slice.pivot(index='lat', columns='lon', values='b_value')

fig, ax = plt.subplots(1, figsize=(8,8))

m = Basemap(projection='merc', 
              llcrnrlon=val_df.columns.min(), 
              llcrnrlat=val_df.index.min(), 
              urcrnrlon=val_df.columns.max(), 
              urcrnrlat=val_df.index.max(),
           )
m.drawcoastlines()
# labels takes four flags: [left, right, top, bottom]
m.drawmeridians(np.arange(0, 180, 5), labels=[0, 0, 0, 1])
m.drawparallels(np.arange(0, 90, 5), labels=[1, 0, 0, 0])

lats = val_df.index
lons = val_df.columns
data = np.ma.masked_invalid(val_df.values)

lons, lats = np.meshgrid(lons, lats)

# m.pcolormesh(lons, lats, data, latlon=True, vmin=0, vmax=1.5)
m.imshow(data, vmin=0, vmax=1.5, interpolation='none')
# m.colorbar()
ax.set_title('imshow')

# ax.set_title('pcolormesh')

image
image

@efiring
Member

efiring commented Apr 25, 2016

In your last example, with 1-D lats and lons variables: are they sorted and strictly increasing, with uniform spacing, no gaps, no nans? It looks like this is not the case, and that is the source of the difference. For example, where you have the horizontal stretching in the "artifacts", this is simply because you have a larger interval in your lons array. Imshow can only make rectangular blocks, all the same size and shape. Pcolormesh uses the grid you give it; you are giving it an irregular grid. Try plotting your meshgrid output with dots and you will see it.
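The checks suggested above can be sketched roughly as follows. The helper name `spacing_report` and the sample `lons` and `lats` values are hypothetical, chosen only to show a gap; they are not the data from this issue.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt

def spacing_report(coords, rtol=1e-6):
    """Return (is_sorted, is_uniform, steps) for a 1-D coordinate array."""
    coords = np.asarray(coords, dtype=float)
    steps = np.diff(coords)
    is_sorted = bool(np.all(steps > 0))  # strictly increasing
    is_uniform = bool(np.allclose(steps, steps[0], rtol=rtol)) if steps.size else True
    return is_sorted, is_uniform, steps

# Hypothetical coordinates with a missing stretch between 121.5 and 121.8
lons = np.array([121.3, 121.4, 121.5, 121.8, 121.9])
lats = np.array([21.8, 21.9, 22.0])
is_sorted, is_uniform, steps = spacing_report(lons)

# Plot the grid nodes as dots; the wider interval shows up as a visible gap.
xx, yy = np.meshgrid(lons, lats)
fig, ax = plt.subplots()
ax.plot(xx, yy, "k.")
```

Here `steps` contains one interval of roughly 0.3 among intervals of roughly 0.1, which is the kind of irregularity that pcolormesh draws as a stretched cell.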

As for the dimensions of the original 1-D lons and lats that go into the meshgrid: ideally, lons should be 1 element longer than the second dimension of Z, and lats should be 1 longer than the first dimension of Z, so that there is a boundary quadrilateral for each element of Z. If this is not the case, you lose the last row and column of Z.
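If only cell centers are available, one common workaround is to estimate the boundaries from them. The sketch below is an assumption-laden illustration, not part of matplotlib: the helper `centers_to_edges` is hypothetical, and it places each interior edge halfway between neighboring centers and extrapolates the two outer edges by the same half-step.

```python
import numpy as np

def centers_to_edges(centers):
    """Estimate N+1 cell edges from N sorted cell centers (N >= 2)."""
    centers = np.asarray(centers, dtype=float)
    mid = 0.5 * (centers[:-1] + centers[1:])       # interior edges
    first = centers[0] - (mid[0] - centers[0])     # extrapolated lower edge
    last = centers[-1] + (centers[-1] - mid[-1])   # extrapolated upper edge
    return np.concatenate([[first], mid, [last]])

# Hypothetical latitude centers on a 0.1-degree grid
lat_centers = np.array([21.8, 21.9, 22.0])
lat_edges = centers_to_edges(lat_centers)
```

With edges built this way for both axes, each element of Z gets its own quadrilateral and no row or column is dropped.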

@mnky9800n
Copy link
Author

I'm glad you confirmed this because I was inspecting the data I was feeding the plots and it is missing some steps. Nice to know that the problem is typically sitting in front of the computer.
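One possible way to restore the missing steps is to reindex the pivoted table onto a complete, evenly spaced grid so that the gaps become NaN cells. This is a sketch under assumptions: the 0.1-degree `step`, the helper `regular_axis`, and the small sample frame are hypothetical, and coordinates are rounded to 6 decimals so that floating-point labels match exactly.

```python
import numpy as np
import pandas as pd

# Hypothetical sample with a gap in latitude (22.0 and 22.1 are missing)
fmd_df = pd.DataFrame({
    "lat": [21.8, 21.8, 21.9, 21.9, 22.2],
    "lon": [121.6, 121.7, 121.3, 121.4, 121.5],
    "b_value": [0.43, 0.44, 0.43, 0.43, 0.42],
})
step = 0.1  # assumed grid spacing

val_df = fmd_df.pivot(index="lat", columns="lon", values="b_value")
# Round labels so they compare equal to the generated grid values
val_df.index = np.round(val_df.index.to_numpy(), 6)
val_df.columns = np.round(val_df.columns.to_numpy(), 6)

def regular_axis(values, step):
    """Evenly spaced coordinates covering min..max of `values`."""
    n = int(round((values.max() - values.min()) / step)) + 1
    return np.round(values.min() + step * np.arange(n), 6)

full_lats = regular_axis(val_df.index.to_numpy(), step)
full_lons = regular_axis(val_df.columns.to_numpy(), step)

# Missing rows and columns are inserted and filled with NaN
regular_df = val_df.reindex(index=full_lats, columns=full_lons)
data = np.ma.masked_invalid(regular_df.values)
```

After reindexing, the coordinate arrays are uniformly spaced, so pcolormesh and imshow should place the same cells in the same positions.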

@efiring
Member

efiring commented Apr 25, 2016

It looks like no further action is needed so I am closing this.

@efiring efiring closed this as completed Apr 25, 2016
@efiring
Member

efiring commented Apr 25, 2016

@mnky9800n You might add a note to your stackoverflow post to completely resolve it.
