Basic folder and data structures and pyvista plotting #12

banesullivan · 2020-06-13T21:41:20Z

Just opening this PR to show the diff

In summary, there is a new geometry module that defines the spatial data structures of subsurface data. These data structures draw much inspiration from PyVista yet there is a fundamental difference we are implementing here: the idea of "meshes" and "grids".

There are two submodules for mesh and grid data structures. Mesh data structures will track all of their data/spatial references in pandas DataFrames, while grid data structures will track all of their data/spatial references in an xarray Dataset.

Meshes are any subsurface data where all of the points of that data need to be explicitly defined. Examples of data that would be in the mesh form include: point clouds, line sets (wells), triangulated surfaces, structured grids (e.g. discretize.CurviMesh or pyvista.StructuredGrid), tetrahedralized volumes.

Grids are any subsurface data with implicitly defined points. Examples of these kinds of data include rasters, rectilinear grids (think seismic volumes), etc. The idea here is that the entire dataset can be defined by a few parameters like origin, U/V direction (orientation), and cell spacings. These kinds of data make sense to track in xarray.

It's important to note that I am classifying structured grids as meshes here because all of the points of those types of data structures must be known and cannot be implicitly defined.
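
To make the mesh/grid distinction concrete, here is a minimal sketch (not code from this PR) of how the points of a uniform grid can be recovered from an origin, a spacing, and a shape. All variable names and values are hypothetical, and the U/V orientation is left out for brevity, so the axes are assumed to be aligned with X, Y, Z.

```python
import numpy as np

# Hypothetical parameters of an implicitly defined, axis-aligned grid.
origin = np.array([100.0, 200.0, -50.0])  # x0, y0, z0
spacing = np.array([10.0, 10.0, 5.0])     # point spacing along each axis
shape = (4, 3, 2)                         # number of points along each axis

# Per-axis coordinates recovered from the three parameters above.
x, y, z = (origin[i] + spacing[i] * np.arange(shape[i]) for i in range(3))

# Expanding to explicit points gives what a mesh would have to store row by row.
points = np.stack(np.meshgrid(x, y, z, indexing="ij"), axis=-1).reshape(-1, 3)
print(points.shape)
```

The three 1D arrays `x`, `y`, `z` are the kind of thing that could serve as coordinates of an xarray Dataset, while the expanded `points` array is what a DataFrame-backed mesh would hold explicitly.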

bluetyson · 2020-06-14T03:35:29Z

So is there a reason to use a DataFrame rather than a GeoDataFrame if we are tracking coordinates?

banesullivan · 2020-06-14T03:54:09Z

GeoDataFrames from GeoPandas only support 2D data AFAIK. Is there 3D support?

bluetyson · 2020-06-14T07:05:27Z

Ah, yes, right!

Leguark

Thanks for the work, @banesullivan, I really like it. The only doubt I have at the moment is how many dataframes we really need.

As I see it we have:

points df: 0D properties including XYZ
cell data df: Anything but 0D
df_tri: connects the points df elements to create higher dimensional objects that have cell data

Am I missing some construct here?

I think if we are able to construct the classes only with those - well defined - three frames, it would make the design really clean since any higher level object would be some combination of them!
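
As a rough illustration of the three frames listed above (a sketch with made-up column names and values, not the classes in this PR), two triangles sharing an edge could look like this:

```python
import pandas as pd

# Points frame: XYZ plus a 0D (per-point) property. Names are illustrative only.
points = pd.DataFrame(
    {
        "X": [0.0, 1.0, 1.0, 0.0],
        "Y": [0.0, 0.0, 1.0, 1.0],
        "Z": [0.0, 0.0, 0.0, 0.0],
        "temperature": [10.0, 12.0, 11.0, 9.0],
    }
)

# Connectivity frame: each row holds the point indices of one triangle.
df_tri = pd.DataFrame({"a": [0, 0], "b": [1, 2], "c": [2, 3]})

# Cell data frame: one row of properties per triangle.
cell_data = pd.DataFrame({"lithology": ["sand", "shale"]})

# Look up the vertex coordinates of triangle 1 through the connectivity frame.
tri1_xyz = points.loc[df_tri.loc[1].to_numpy(), ["X", "Y", "Z"]]
print(tri1_xyz)
```

The lookup relies on the points frame keeping its default integer index, which is an assumption of this sketch.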

I am getting very pumped with this project :D

Leguark · 2020-06-14T07:19:35Z

subsurface/geometry/mesh.py

+
+        """
+        self._df_points = pd.DataFrame(columns=['X', 'Y', 'Z'])
+        self._df_point_data = pd.DataFrame()


Is it worth having two dataframes, or should we make just one and expose the two parts as views:

self._df = pd.Data...
self._df_points is self._df[['X', 'Y', 'Z']]
self._df_point_data is self._df[the rest]
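
A minimal sketch of that single-frame idea, assuming a hypothetical `PointSet` class with read-only properties (none of these names come from the diff):

```python
import pandas as pd


class PointSet:
    """Hypothetical class: one frame, with coordinates and data exposed as selections."""

    _xyz = ["X", "Y", "Z"]

    def __init__(self, df: pd.DataFrame):
        self._df = df

    @property
    def points(self) -> pd.DataFrame:
        # Coordinate columns only.
        return self._df[self._xyz]

    @property
    def point_data(self) -> pd.DataFrame:
        # Everything except the coordinate columns.
        return self._df.drop(columns=self._xyz)


ps = PointSet(
    pd.DataFrame(
        {"X": [0.0, 1.0], "Y": [0.0, 0.5], "Z": [0.0, -1.0], "porosity": [0.2, 0.3]}
    )
)
print(ps.points.columns.tolist(), ps.point_data.columns.tolist())
```

One caveat: selecting a list of columns in pandas generally returns a new object rather than a true view, so any writes would need to go through the underlying `_df`.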

florian-wagner · 2020-06-15T14:55:01Z

Hi @banesullivan,

thanks for kicking off the data structure API by bringing in your valuable pv-experience. I agree that xarray objects are predestined for structured grids and that a collection of pandas data frames (at least one for the nodes and one for the cells) is a good and flexible representation of general cell-based data sets.

With regard to the latter: Do we need different classes depending on the cell type? I agree that common functionality could be shared within something like _CellDataMixin to avoid redundancy, but is that necessary, or could we directly aim for a more general UnstructuredMesh object (similar to pyvista/VTK)?

pyGIMLi for example can handle various cell types (hex, tets, triangular prisms, quadratic pyramids, etc.) but more importantly a mixture of those (as the shape functions for FEM calculations are automatically generated based on the cell type). Are there reasons against a more general object where the number of columns in the cell data frame represents the maximum number of nodes of a cell (let's say 6), and a tetrahedron would only use 4 of them with the unused two being NaNs?

Just thinking out loud here ;-)

Cheers
Florian

Leguark · 2020-06-16T07:21:47Z

I like that idea, @florian-wagner. I have been thinking since Saturday about how we can represent all geometries with the minimum number of (in this case) DataFrames, but with as many columns as we need, and I arrived at a similar idea:

Point data DataFrame: [X, Y, Z] and properties
Element data DataFrame: [Int1, Int2, Int3, Int4, ..] and properties
- Depending on the number of points per element (how many ints, or whatever the official name is) we would have lines, triangles...
- Regardless of dimensionality, the property will always be located at the center of the element and will represent the whole volume

Am I missing something Florian?

Hopefully I will get a week or two to work on this in July

florian-wagner · 2020-06-16T07:33:02Z

No, that's exactly what I had in mind. This would allow arbitrary geometries with point- and cell-based values in two compact pandas data frames. The first 8 columns in the cell dataframe would be reserved for node indices (not six as I wrote earlier...) and could be NaN if not used, or columns Int5-Int8 could even be dropped when a mesh only contains tets, for example.
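
A rough sketch of such a NaN-padded cell frame, using the Int1..Int8 column names from the comment above; the node indices and the `resistivity` property are made up for illustration:

```python
import numpy as np
import pandas as pd

# Up to 8 node-index columns per cell (enough for a hexahedron).
node_cols = [f"Int{i}" for i in range(1, 9)]

cells = pd.DataFrame(
    [
        [0, 1, 2, 3, np.nan, np.nan, np.nan, np.nan],  # tetrahedron: 4 nodes
        [4, 5, 6, 7, 8, 9, 10, 11],                    # hexahedron: 8 nodes
    ],
    columns=node_cols,
)
cells["resistivity"] = [100.0, 250.0]  # hypothetical cell-based property

# Number of nodes per cell, recovered from the non-NaN entries.
n_nodes = cells[node_cols].notna().sum(axis=1)

# A mesh containing only tets can drop the unused columns entirely.
tets = cells[n_nodes == 4].dropna(axis=1, how="all")
print(tets.columns.tolist())
```

Note that NaN padding turns the index columns into floats; pandas' nullable `Int64` dtype would be one way to keep them as integers, at the cost of some extra conversion before passing them to other libraries.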

When all libraries could read and write such a format + VTK I/O + pyvista viz, I would already be a happy user of subsurface.

Leguark · 2020-09-24T10:40:15Z

I think it is time to merge this. Some of the ideas behind these changes: https://github.com/softwareunderground/subsurface/blob/mig_dev/sdd.md

This is the summary of features:

Created the full folder structure.
Added base_structures (StructuredData and UnstructuredData)
Adapted elements (points, lines, trisurf etc)
Pyvista plotting for all the elements

prisae

Just a few comments. I guess we will discuss more on Monday.

Leguark added 3 commits June 13, 2020 10:20

- a bit of structure

649bd31

- More cleaning

a380a80

- Small hacks

3d010fc

banesullivan mentioned this pull request Jun 13, 2020

mixed tab/space indentation #13

Open

banesullivan added 9 commits June 13, 2020 17:46

Cleanup tabs to spaces

d2f5f97

geometric_structures -> geometry

6516dc7

Rename modules in geometry

b0b1b46

Add some more basic features

95e5d10

pointset.py -> mesh.py

d2319f0

Add basic PolyData classes

7e86897

Add tetra mesh and fix data dict

9fd38e4

Add curvi mesh

04c41ce

Add reminder for Octrees

e453430

Leguark reviewed Jun 14, 2020

- Welly to Subsurface first prototype

08fa01a

hackmd-deploy and others added 7 commits June 16, 2020 09:00

first_commit

e3685eb

Data Structures definition

1e48f53

Data Structures definition

03df684

- [ENH] Added primary structures

f1d1891

- [CLN] Refactor modules to new terminology - [CLN] Split requiremtes

- [CLN] Renamed geometry to structs

304ed8f

- [ENH] Welly is optional - [ENH] structured_elements.py and unstructured_elements.py have been updated - [ENH] subpackage for visualiztion. pyvista code moved there - [DOC] Improved definition of the interfaces subpackage

Merge remote-tracking branch 'origin/mig_dev' into mig_dev

b9721fe

- [DOC] typo

b297b5b

Leguark mentioned this pull request Sep 16, 2020

Decide if point_data should be a fundamental attribute of UnstructuredData or should be part of unstructured_elements #15

Closed

- [TEST] Tests updated to new structure. Pyvista tests added

98f05eb

- [ENH] Pyvista plotting

Leguark and others added 10 commits September 21, 2020 10:30

- [DOC] Added docstrings to the updated classes

bc588ed

- [TEST] Added a bunch of test data

015880f

- [ENH] Added and testes StructuredData dataclass

f4c2066

Design document update

24058a5

- [ENH] added to_pyvista_grid

b56ba81

- [ENH] added support for StructuredSurfaces and pyvista plotting

f4add3f

- [CLN] remove deprecated code

05cecfe

- [TEST] Fixed Schaaf's faults test

5671929

Design document update

c314177

- [CLN] Fixed ssd typo for good

a4ac7e9

Leguark marked this pull request as ready for review September 24, 2020 10:36

Leguark requested a review from alex-schaaf September 24, 2020 10:36

Leguark requested a review from prisae September 24, 2020 10:40

Leguark changed the title ~~T20 progress~~ Basic folder and data structures and pyvista plotting Sep 24, 2020

prisae approved these changes Sep 24, 2020

Review threads: requirements.txt (outdated, resolved), sdd.md (resolved)

- [CLN] removed versions in requirements.txt

0ca3d25

- [DOC] added geoh5py to the sdd.md

prisae approved these changes Sep 24, 2020

Leguark merged commit 6a8916a into master Sep 28, 2020
