GridInterface bin support for Histogram and QuadMesh #2160

Merged: 25 commits, Dec 22, 2017

@philippjfr
Member

philippjfr commented Nov 28, 2017

This PR adds support for binned coordinates to the GridInterface, addressing #547. It is one of the final conversions of Elements to a Dataset type.

  • Get unit tests passing
  • Add __setstate__ for pickle compatibility
  • Move Histogram and QuadMesh tests into testbinneddataset.py or similar
  • Add examples of irregular meshes to reference documentation and gallery

Several nice-to-haves, which need not hold up merging this PR:

  • Support for xarray Datasets
  • Support for irregular meshes (i.e. 2D array coordinates)

I've now also managed to support irregular meshes, which are a commonly requested feature, particularly for GeoViews. Support for datashading quadmeshes will come in a later PR (probably after TriMesh is merged). Here are some irregular meshes:

All the usual machinery for exploring high-dimensional datasets also works. For example, here is a multi-dimensional, irregularly gridded dataset (requested in holoviz/geoviews#57 and holoviz/geoviews#73).

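import numpy as np
import xarray as xr
import holoviews as hv

# Build a regular grid, then skew it so that lon and lat become 2D (irregular) coordinates
lon, lat = np.meshgrid(np.linspace(-20, 20, 5), np.linspace(0, 30, 4))
lon += lat/10
lat += lon/10

da = xr.DataArray(np.arange(40).reshape(2, 4, 5), dims=['z', 'y', 'x'],
                  coords = {'lat': (('y', 'x'), lat),
                            'lon': (('y', 'x'), lon),
                            'z': [0, 1]}, name='A')

# Declare lat, lon and z as key dimensions, then group the QuadMesh by z
ds = hv.Dataset(da, ['lat', 'lon', 'z'])
ds.to(hv.QuadMesh, groupby=['z'])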

@philippjfr changed the title from "Added GridInterface bin support for Histogram and QuadMesh" to "GridInterface bin support for Histogram and QuadMesh" on Nov 28, 2017
@philippjfr
Member Author

philippjfr commented Nov 30, 2017

Here is an overview of the new functionality.

@jbednar
Member

jbednar commented Nov 30, 2017

Nice!

@philippjfr added this to the v1.10 milestone on Nov 30, 2017
@philippjfr
Member Author

philippjfr commented Nov 30, 2017

@jbednar One question I have is how we can properly support datashading a QuadMesh. Currently I see two options:

  1. We can treat them like points, which means the output resolution cannot exceed the density of the sparsest part of the mesh; otherwise the output will show sampling artifacts.

  2. We can treat them like rasters, which I think means that the areas will not be appropriately weighted, and I haven't figured out how to generate appropriate coordinates for each pixel after regridding.

Neither seems quite right, so the real solution is getting actual support for quadmeshes into datashader.

@jbednar
Member

jbednar commented Nov 30, 2017

What about slicing every bin in half and rendering it as a TriMesh?
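To make the suggestion concrete, here is a minimal NumPy sketch of splitting every quad cell into two triangles. It is not the code merged in this PR; the function name `quadmesh_to_triangles` and the convention that `xs` and `ys` hold cell-corner coordinates are assumptions made for illustration.

```python
import numpy as np

def quadmesh_to_triangles(xs, ys):
    """Split each cell of a quad mesh into two triangles (hypothetical helper).

    xs, ys: 2D arrays of shape (ny+1, nx+1) holding cell-corner coordinates.
    Returns (vertices, simplices), where vertices has shape (n_vertices, 2)
    and simplices has shape (2*ny*nx, 3) and indexes into vertices.
    """
    ny, nx = xs.shape[0] - 1, xs.shape[1] - 1
    vertices = np.column_stack([xs.ravel(), ys.ravel()])

    # Flat index of the first corner of every cell (row-major vertex layout)
    rows, cols = np.meshgrid(np.arange(ny), np.arange(nx), indexing='ij')
    ll = (rows * (nx + 1) + cols).ravel()  # corner (r, c)
    lr = ll + 1                            # corner (r, c+1)
    ul = ll + (nx + 1)                     # corner (r+1, c)
    ur = ul + 1                            # corner (r+1, c+1)

    # Cut each quad along the lr-ul diagonal
    first = np.column_stack([ll, lr, ul])
    second = np.column_stack([lr, ur, ul])
    simplices = np.concatenate([first, second])
    return vertices, simplices

xs, ys = np.meshgrid(np.linspace(0, 1, 4), np.linspace(0, 1, 3))
vertices, simplices = quadmesh_to_triangles(xs, ys)
print(vertices.shape, simplices.shape)
```

With this ordering, all "first" triangles come before all "second" triangles, so per-cell values of shape `(ny, nx)` could be assigned to triangles with `np.tile(values.ravel(), 2)`.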

@philippjfr
Member Author

Neat idea!

@philippjfr
Member Author

philippjfr commented Dec 19, 2017

I've now implemented datashader rasterization of QuadMesh types, based on your suggestion. It's fairly fast once precompute is enabled.

[Two screenshots of the rasterized output, taken Dec 19, 2017]

@philippjfr
Member Author

@jbednar @jlstevens Requesting review.

"Qz = np.sin(Y) + np.sin(X)\n",
"Z = np.sqrt(X**2 + Y**2)\n",
"\n",
"print(Qx.shape, Qz.shape, Z.shape)\n",
Contributor

Stray print? If not, it needs to print a proper message.

Member Author

Sure, it's just meant to show how the data differs between unevenly sampled and irregularly sampled QuadMeshes.

Contributor

Ok. Then I would draw attention to that more explicitly, including a proper print message, maybe in a separate cell.

t3s = (js+1)*(s0+1)+t1%s0
t4s = t2s
t5s = t3s
t6s = t3s+1
Contributor

This is pretty unreadable and the variable names are uninformative.

Contributor

If this is doing some self-contained computation, it would be better as a utility, which can at least have a docstring explaining what it is supposed to do.

Member Author

Sure, a utility is fine.

Member Author

I end up with all kinds of circular import issues if I try to move it to element.util, so I've added various comments which should clarify it.


Note: Deprecate as part of 2.0
Contributor

Is this being tested via the current notebook tests?

Member Author

It was, but it won't be once the tests have been rebuilt.

            raise error('Key dimension values and value array %s '
                        'shapes do not match. Expected shape %s, '
                        'actual shape: %s' % (vdim, expected[::-1], shape), cls)
        return data, {'kdims':kdims, 'vdims':vdims}, {}


    @classmethod
    def irregular(cls, dataset, dim):
Contributor

You've used the word 'irregular' a lot. I am wondering if 'uneven' would have been clearer, though it probably isn't worth changing now.

Member Author

irregular != uneven; they are different concepts.
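As a rough illustration of the distinction, assuming "uneven" means 1D per-axis coordinates with non-constant spacing and "irregular" means 2D coordinate arrays (as in the PR description), the two cases differ in the shapes of their coordinates. The variable names here are hypothetical.

```python
import numpy as np

z = np.arange(12).reshape(3, 4)

# Uneven: one 1D coordinate array per axis, with non-constant spacing
x_uneven = np.array([0.0, 1.0, 3.0, 7.0])
y_uneven = np.array([0.0, 2.0, 3.0])

# Irregular: one 2D coordinate array per axis, matching the shape of the values,
# so the x position of a sample depends on its row as well as its column
X, Y = np.meshgrid(np.linspace(-20, 20, 4), np.linspace(0, 30, 3))
x_irregular = X + Y / 10
y_irregular = Y + X / 10

print('uneven coordinate shapes:   ', x_uneven.shape, y_uneven.shape, 'values:', z.shape)
print('irregular coordinate shapes:', x_irregular.shape, y_irregular.shape, 'values:', z.shape)
```

An uneven mesh can always be expanded to 2D coordinates with `np.meshgrid`, but an irregular mesh cannot in general be reduced to a pair of 1D arrays.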

if edges and not isedges:
    data = cls._infer_interval_breaks(data)
elif not edges and isedges:
    data = np.convolve(data, [0.5, 0.5], 'valid')
Contributor

Why is there a convolution here?

Member Author

Easiest way to compute a rolling mean, i.e. edges -> bin centers.
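A small sketch of why the convolution works: with the kernel `[0.5, 0.5]` and mode `'valid'`, each output is the mean of two adjacent edges, so N+1 edges yield N centers. The reverse helper `infer_edges` below is a hypothetical illustration of going from centers to edges by midpoint extrapolation; it is not claimed to match the `_infer_interval_breaks` used in the PR.

```python
import numpy as np

edges = np.array([0.0, 1.0, 3.0, 7.0])

# Mean of each adjacent pair of edges: 4 edges -> 3 bin centers (0.5, 2.0, 5.0)
centers = np.convolve(edges, [0.5, 0.5], 'valid')

# The same result written out explicitly
explicit = (edges[:-1] + edges[1:]) / 2

def infer_edges(centers):
    """Hypothetical inverse: place edges midway between centers and
    extrapolate the two outer edges by half the neighbouring spacing."""
    centers = np.asarray(centers, dtype=float)
    half = np.diff(centers) / 2
    first = centers[0] - half[0]
    last = centers[-1] + half[-1]
    return np.concatenate([[first], centers[:-1] + half, [last]])

print(centers, explicit)
print(infer_edges([0.5, 1.5, 2.5]))
```

For evenly spaced bins the round trip is exact; for unevenly spaced bins, midpoint extrapolation generally does not recover the original edges.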

@jlstevens
Contributor

Looks fine and happy to merge once you've addressed my questions above.

@jlstevens
Contributor

After discussing 'uneven' versus 'irregular', I think an irregular mesh example where the sample positions have a small (but obvious) random jitter would make the meaning of 'irregular' a lot clearer.
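One possible way to build such a jittered example is sketched below. The grid size, jitter amplitude and test function are arbitrary choices for illustration and are not taken from the PR.

```python
import numpy as np

rng = np.random.default_rng(0)

# Start from a regular grid with unit spacing in both directions
xs, ys = np.meshgrid(np.linspace(0, 10, 11), np.linspace(0, 5, 6))

# Displace every sample by up to 20% of the spacing; staying well below
# half the spacing keeps neighbouring samples in their original order
jitter = 0.2
xs_j = xs + rng.uniform(-jitter, jitter, xs.shape)
ys_j = ys + rng.uniform(-jitter, jitter, ys.shape)

# Values sampled at the jittered positions
zs = np.sin(xs_j) * np.cos(ys_j)

print(xs_j.shape, ys_j.shape, zs.shape)
```

The resulting `xs_j` and `ys_j` are 2D coordinate arrays that can no longer be described by one 1D array per axis, which is what makes the mesh irregular rather than merely uneven.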

@jlstevens
Contributor

Tests have passed. Ready to merge?

@philippjfr
Member Author

Yes, let's merge; it should get as much testing and exercise as possible before the next release.

@jlstevens merged commit 0f5ad5d into master on Dec 22, 2017
@jlstevens
Contributor

Merged!

@philippjfr deleted the binned_datasets branch on January 13, 2018, 13:12