Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unification of point / cell sets among formats #619

Open
tianyikillua opened this issue Jan 5, 2020 · 33 comments
Open

Unification of point / cell sets among formats #619

tianyikillua opened this issue Jan 5, 2020 · 33 comments

Comments

@tianyikillua
Copy link
Contributor

tianyikillua commented Jan 5, 2020

Objectives

Unify the names (gmsh:physical, medit:ref, flac3d:zone...) and implementations of point / cell sets among formats.

Names

Currently these different names need to be hard-coded when the subsets are transferred among formats, like

if key in mesh.cell_data and "medit:ref" in mesh.cell_data[key]:
    labels = mesh.cell_data[key]["medit:ref"]
elif key in mesh.cell_data and "gmsh:physical" in mesh.cell_data[key]:
    # Translating gmsh data to medit is an important case, so treat it
    # explicitly here.
    labels = mesh.cell_data[key]["gmsh:physical"]
elif key in mesh.cell_data and "flac3d:zone" in mesh.cell_data[key]:
    labels = mesh.cell_data[key]["flac3d:zone"]
else:
    labels = numpy.ones(len(data), dtype=int)

Two solutions

  1. Either provide a list of these names like
CELL_SETS_NAMES = [
    "gmsh:physical", "medit:ref", "flac3d:zone", ...
]
  1. Adopt a unified name and object for point / cell sets, like
mesh.point_sets = ...
mesh.cell_sets = ...

Implementations

An example of mesh composed of 5 line elements and 6 nodes. We have two point sets: "Red" and "Green", and two cell sets: "Orange" and "Blue".

subsets

  1. Use a point_data and cell_data as indicator functions. Using the current terminology for MED, we have
mesh.point_data["point_tags"] = [0, 1, 1, 2, 1, 0]
mesh.point_tags = {
    1: ["Red"],
    2: ["Green"],
}

mesh.cell_data["line"]["cell_tags"] = [-1, -2, -1, 0, -2]
mesh.cell_tags = {
    -1: ["Orange"],
    -2: ["Blue"],
}

Note that with this implementation, we need two lists: the indicator function list that assigns a unique value to each point / cell, as well as a list that indicates the actual names corresponding to these unique values. This implementation is suited when we want to visualize point / cell sets in ParaView, since they are modeled as a point_data / cell_data.

This implementation is currently adopted by (to be completed)

  • gmsh
  • medit
  • med
  • flac3d
  1. Explicit enumerations of points / cell id's
mesh.point_sets = {
    "Red": [1, 2, 4],
    "Green": [3],
}
mesh.cell_sets = {
    "Orange": {"line": [0, 2]},
    "Blue": {"line": [1, 4]},
}

This implementation uses less memory and is more explicit. However it needs to be converted to the 1st method when visualizing them under ParaView. A universal conversion function can be provided in the Mesh class.

This method is already sketched (not yet exposed) in various formats

  • permas
  • abaqus

Question: can this 2nd method adopted universally for all mesh formats, considering their own specificities?

@gdmcbain
Copy link
Contributor

gdmcbain commented Jan 5, 2020

Thank you for launching this. I think it is the most pressing issue. I'll close #290 in favour of this.

I think 'gmsh' needs to be split into MSH2 and MSH4.1.

Another format that I'm actively working on so that it will support #290 is XDMF.

@gdmcbain
Copy link
Contributor

gdmcbain commented Jan 5, 2020

Question: can this 2nd method adopted universally for all mesh formats, considering their own specificities?

I think the answer to this is pretty much negative without a lot of cleverness? I envisage internally a Union of two representations, one isomorphic to MEDIT or MSH2, the other fulfilling #290. Conversion between the two, imperfect, as discussed previously, but independent of format as far as possible.

@gdmcbain
Copy link
Contributor

gdmcbain commented Jan 6, 2020

I'm hoping that fixing this will clarify and resolve limitations in reading and writing of MSH4.1 such as #524 #536 #550 #614.

@tianyikillua
Copy link
Contributor Author

tianyikillua commented Jan 6, 2020

I think the answer to this is pretty much negative without a lot of cleverness?

Maybe I need a quick catch up here, but what are the reasons why we can not represent sets in MSH4 by

mesh.point_sets = {
    "Red": [1, 2, 4],
    "Green": [3],
}
mesh.cell_sets = {
    "Orange": {"line": [0, 2], "quad": [1, 10, 11]},
    "Blue": {"line": [1, 4]},
}

@gdmcbain
Copy link
Contributor

gdmcbain commented Jan 6, 2020

None, I think, except historical. This looks better, a better reflection of the Gmsh MSH4.1 intent, capable of fixing the outstanding issues. I think you're on the right track.

Oh, do you mean why not dump the cell_data cells->tags approach and use this cels_sets tags->sets of cells for everything? The former is a more obvious and direct reflection of the 'ref' in MEDIT and 'physical entity' in MSH2. As discussed, one can more or less convert between the two representations, possibly lossily unless with additional bookkeeping, so I think that if one were only working in formats like MEDIT and MSH2 that one wouldn't want to convert to a set-based tagging and back again.

@nschloe
Copy link
Owner

nschloe commented Jan 6, 2020

Thanks the for the nice write-up.

Small remark: The original example doesn't contain the case where one point/cell belongs to multiple sets (e.g., "iron" and "boundary").

@tianyikillua
Copy link
Contributor Author

tianyikillua commented Jan 6, 2020

For points / cells that are "tagged" with multiple sets,

  1. The 1st point_data / cell_data method requires creating intersections of sets (as MED does)
  2. The 2nd "explicit enumerations of points / cell id's" method naturally covers the possibilities of having several sets that contain the same point / cell.

@nschloe
Copy link
Owner

nschloe commented Jan 6, 2020

Let me add that, coming from a PDE background, the material would probably be implemented as a function. As opposed to tags, the limitation here is that every point/cell has exactly one value.

I like this representation

mesh.cell_sets = {
    "Orange": {"line": [0, 2]},
    "Blue": {"line": [1, 4]},
}

for sets.

@nschloe
Copy link
Owner

nschloe commented Jan 6, 2020

Note also that gmsh calls "tag" what we'd rather refer to as a function, i.e.,

mesh.cell_data["line"]["cell_tags"] = [-1, -2, -1, 0, -2]

You probably realize that already.

@gdmcbain
Copy link
Contributor

gdmcbain commented Jan 6, 2020

That's MSH2, not MSH4.

@gdmcbain
Copy link
Contributor

gdmcbain commented Jan 6, 2020

In MSH2, each cell was assigned exactly one ‘physical entity’. If one defined two physical entities that both contained a particular cell, that cell was actually duplicated so that each only had to belong to a single physical entity. (Remember the comment 2018-02-05 from #175.)

But MSH4 did away with this. If I take the same Gmsh GEO snippet from that comment

Point (1) = {0, 0, 0, 1};
Point (2) = {1, 0, 0, 1};
Point (3) = {2, 0, 0, 1};

Line (1) = {1, 2};
Line (2) = {2, 3};

Physical Line("left", 1) = {1};
Physical Line("all", 2) = {1, 2};

and run it today through Gmsh 4.5.0-git-0c1b63c8d with

gmsh -1 twoseg.geo

I get

$MeshFormat
4.1 0 8
$EndMeshFormat
$PhysicalNames
2
1 1 "left"
1 2 "all"
$EndPhysicalNames
$Entities
3 2 0 0
1 0 0 0 0 
2 1 0 0 0 
3 2 0 0 0 
1 0 0 0 1 0 0 2 1 2 2 1 -2 
2 1 0 0 2 0 0 1 2 2 2 -3 
$EndEntities
$Nodes
5 3 1 3
0 1 0 1
1
0 0 0
0 2 0 1
2
1 0 0
0 3 0 1
3
2 0 0
1 1 0 0
1 2 0 0
$EndNodes
$Elements
2 2 1 2
1 1 1 1
1 1 2 
1 2 1 1
2 2 3 
$EndElements

This MSH4.1 is quite different to the MSH2.2 generated by Gmsh two years ago. (I can get the same today with -format msh2.) Note that the MSH4.1 has not duplicated the left element.

The problem is that meshio.read doesn't read this MSH4.1 correctly. It doesn't complain but

>>> mesh.cells
{'line': array([[0, 1],
       [1, 2]])}
>>> mesh.cell_data
{'line': {'gmsh:physical': array([1, 2])}}

The cells are right but the membership of line cell 1 in "gmsh:physical" 2 is lost. I suspect this is the same as one or more of the issues listed above yesterday.

@gdmcbain
Copy link
Contributor

gdmcbain commented Jan 6, 2020

For points / cells that are "tagged" with multiple sets,

  1. The 1st point_data / cell_data method requires creating intersections of sets (as MED does)
  2. The 2nd "explicit enumerations of points / cell id's" method naturally covers the possibilities of having several sets that contain the same point / cell.

Yes.

I think the 1st representation is natural for MEDIT and MSH2, but it's definitely defective for MED and MSH4.1, maybe Exodus II, XDMF, …

There's not much benefit for MEDIT & MSH2 in using the second representation that I can think of. Possibly convenient for meshio in not having two representations, but then if read-writing MEDIT/MSH2, it would mean a pointless roundtrip conversion between first and second representations. Maybe not that big a negative, but meanwhile: would it be a reasonable step to just get all the meshio implementations of the second representation into line under the banner of ‘uniformation of point / cell sets among formats’. Then after that we can think about whether the first representation can be dropped from formats for which it's adequate.

@nschloe
Copy link
Owner

nschloe commented Jan 6, 2020

I wouldn't worry about MSH2.2 at all. So yeah, let's see if we can get representation 2 to work. Let's best start with those formats where it's native.

@gdmcbain
Copy link
Contributor

gdmcbain commented Jan 6, 2020

I mostly mentioned MSH2.2 as our current implementation of MSH4.1 basically coerces the latter into the former.

Also though MSH2.2 is still the latest MSH version written by Gmsh 3.0.6 which is the one in the official Ubuntu package for Ubuntu 18.04, the latest Ubuntu LTS.

MSH2.2 was an important format for many years so there are a few meshes out there written in it; e.g. scikit-fem/docs/examples/box.msh.

I don't think we need to touch it here though, no.

@gdmcbain
Copy link
Contributor

gdmcbain commented Jan 7, 2020

I don't think we need to touch it here though, no.

Actually in the same spirit of unification, it would be good to see all meshio implementations of the first representation unified ("gmsh:physical" and "medit:ref" spring to mind). That has always grated. But perhaps not if the first representation is to be abandoned (something I'm not sure about).

@tianyikillua
Copy link
Contributor Author

Yes these two representations are both valid and can be inter-converted if needed. For points / cells belonging to multiple subsets, the 1st method involves introducing new cell tags corresponding to intersections of sets, while the 2nd method is more convenient.

Note that internally the MED format actually adopts the 1st method!

mesh.point_tags =
{2: ['Side'],
 3: ['Side', 'Top'],
 4: ['Top']}

mesh.cell_tags =
{-6: ['Top circle'],
 -7: ['Top', 'Top and down'],
 -8: ['Top and down'],
 -9: ['A', 'B'],
 -10: ['B'],
 -11: ['A', 'B', 'C'],
 -12: ['B', 'C'],
 -13: ['C']}

so the tag 3 (which is automatically created internally by MED) is assigned to nodes that belong both to "Side" and "Top"; the tag -11 is for cells that belong to the subsets "A", "B" and "C". From this the user can reconstruct the 2nd representation if he wants.

What I suggest is that if these additional tags for intersections of sets are already pre-computed and available in the mesh format, then the 1st representation can be used. Otherwise (for MSH4 as I understand), construct and expose these subsets using the 2nd method is more feasible, as the computation of set intersections could be tedious.

In practice,

  • When the 1st method is used, these tags should remain inside the point_data / cell_data structure, with a name to be defined: either specific for each format (in this case a list of these names could be provided), either a unique name
  • When the 2nd method is used, expose these subsets directly in the Mesh object by mesh.point_sets and mesh.cell_sets.

@gdmcbain
Copy link
Contributor

gdmcbain commented Jan 8, 2020

either specific for each format (in this case a list of these names could be provided), either a unique name

I vote for a unique name. That makes is trivial for meshio to convert cell-tags between any formats using the first representation, e.g. MSH2.2 and MEDIT, which it currently does not attempt.

It also means that any downstream program wanting to open a mesh and find the cell-tags (e.g. signifying subdomains or parts of the boundary in a mixed boundary value problem) can do so via meshio without worrying about whether the input mesh was written in MSH2.2 or MEDIT or whatever. For a real code example, see: skfem.importers.meshio.from_meshio for subdomains and parts of boundaries.

@tianyikillua
Copy link
Contributor Author

Yes I also vote for a unique name. Should some special treatment (underscore, add a "meshio" prefix, ...) be used for its name to reflect that it is a special point / cell data?

@gdmcbain
Copy link
Contributor

Yes, I guess so. I don't foresee the particular choice being that important. I think either of your suggestions will work fine. Say "meshio:tag"?

@tianyikillua
Copy link
Contributor Author

@nschloe @gdmcbain @keurfonluu I begin to work on an idea yesterday.

Since several formats (GMSH in particular) may define several tag possibilities (gmsh:physical and gmsh:geometrical) within a same mesh, using a unique name like meshio:tag only for one of them is not nice.

What I propose now, is to

  1. Keep the original point/cell tag inside point_data and cell_data, like gmsh:physical, medit:ref...
  2. For interoperationality among formats, when reading a mesh file (one of ) the tag information is also stored in a new attribute mesh.point_tags / mesh.cell_tags. This new attribute is used when writing to a new mesh file.

The name point_tags and cell_tags are in parallel with the point_sets and cell_sets. They reflect two different implementations of regions.

What do you think?

@gdmcbain
Copy link
Contributor

gdmcbain commented Feb 13, 2020

Thanks for taking this up again.

I wouldn't be too worried about "gmsh:geometrical"; I don't believe that it is ever used in applications. Also it's only MSH 2.2, isn't it? In referring to Gmsh formats, the latter must be distinguished from MSH 4.1, they're incompatible and far from isomorphic.

One other thing with MSH 4.1 is that cells (or rather cellblocks, but cells through their membership of blocks) can have zero or more physical tags. This was supposed to be the case in MSH 2.2 too but Gmsh itself didn't write MSH 2.2 like this, duplicating or multiplying elements when they were specified according to the GEO file to have two or more physical tags. (This is.documented here a few hundred issues back.) I'm not sure whether there are other file formats that allow other than one physical tag per cell, probably. MEDIT has exactly one tag per cell. To handle this, should each cell in .cell_tags get a (possibly empty) list or set of tags?

Besides MSH 2.2, which formats have 'several tag possibilities'?

I particularly like the idea of starting to get the physical tags out of .cell_data. As @keurfonluu wrote earlier today or yesterday, this contains fields like temperature which are quite different in character, float rather than int.

Also, I'm not sure whether this is the right place to ask, but could you elaborate a little on .cell_sets? I've only seen them used in formats which I don't use so am not clear on their more applicability or semantics.

@tianyikillua
Copy link
Contributor Author

One other thing with MSH 4.1 is that cells (or rather cellblocks, but cells through their membership of blocks) can have zero or more physical tags. This was supposed to be the case in MSH 2.2 too but Gmsh itself didn't write MSH 2.2 like this, duplicating or multiplying elements when they were specified according to the GEO file to have two or more physical tags. (This is.documented here a few hundred issues back.) I'm not sure whether there are other file formats that allow other than one physical tag per cell, probably. MEDIT has exactly one tag per cell. To handle this, should each cell in .cell_tags get a (possibly empty) list or set of tags?

How does GMSH treat exactly points / cells that are tagged by several sets? In your example (#619 (comment))

$Elements
2 2 1 2
1 1 1 1
1 1 2 
1 2 1 1
2 2 3 
$EndElements

It appears to me that the 1st element is only tagged by 1, although in the GEO file it is also tagged by 2.

In MED, when cells / points belong to several subsets defined by the user, then internally intersections of subsets are created with a new tag (#619 (comment)), so meshio can easily read these new tags for each point / cell.

If in GMSH no new tags are created for represent intersections of original subsets, I suggest adopt first the point_sets and cell_sets approach, already operational in the ABAQUS reader.

In [3]: mesh
Out[3]:
<meshio mesh object>
  Number of points: 270
  Number of cells:
    hexahedron: 88
  Point sets: Nfix, Nsurface

In [4]: mesh.point_sets
Out[4]:
{'Nfix': array([  0,  45,  90, 135, 180, 225]),
 'Nsurface': array([  0,  45,  46,   1,  47,   2,  48,   3,  49,   4,  50,   5,  51,
          6,  52,   7,  53,   8,  54,   9,  55,  10,  56,  11,  57,  12,
         58,  13,  59,  14,  60,  15,  61,  16,  62,  17,  63,  18,  64,
         19,  65,  20,  66,  21,  67,  22,  68,  23,  69,  24,  70,  25,
         71,  26,  72,  27,  73,  28,  74,  29,  75,  30,  76,  31,  77,
         32,  78,  33,  79,  34,  80,  35,  81,  36...])}

In mesh.point_sets and mesh.cell_sets, we just enumerate the point / cell natural id (corresponding to mesh.points and each block inside mesh.cells) defining the subset. Using this representation, for your GMSH example, that would be (suppose only one block of two line elements is present in cells)

mesh.cell_sets = {
    "left": [np.array([0])],
    "all": [np.array([0, 1])]
}

the name of each tag can be directly put into this structure.

@gdmcbain
Copy link
Contributor

gdmcbain commented Feb 13, 2020

It appears to me that the 1st element is only tagged by 1, although in the GEO file it is also tagged by 2.

The tags aren't given directly in the MSH 4.1 $Elements section. This one has two blocks of elements. The first block of elements belongs to 1-dimensional 'entity' 1, the second to 1-dimensional entity 2. Those element blocks belonging to entity (1, 1) have 2 physical tags, 1 and 2, corresponding to the names "left" and "all". The element blocks belonging to entity (1, 2) have only 1 physical tag, 2, "all".

@nschloe
Copy link
Owner

nschloe commented Feb 13, 2020

I like cell_sets and point_sets a lot since, in my view, these are the only actual tags: Each cell can have multiple, doesn't need to have any. This should be used for things like boundary, inlet etc.

As opposed to that, many formats define regions in terms an integer-valued function on cells. Each cell has exactly one value. meshio shouldn't care about that. "You give me an integer function? Okay, here's your cell_data." If a format has proper tags -> cell_sets.

For the interoperability ideas: point_tags, cell_tags don't need to copy over the array. Make it a private attribute _point_tag_name, _cell_tag_name, to a key in {point,cell}_sets.

@gdmcbain
Copy link
Contributor

Thanks for the explanation of Mesh.cell_sets. It looks like it could cover MSH 4.1, albeit with some redunduncy, each element of a block always having the same tags.

@nschloe
Copy link
Owner

nschloe commented Feb 13, 2020

each element of a block always having the same tags

Sorry to ask, you probably have explained that somewhere before: Are those gmsh 4.1 blocks always of the same cell type?

@gdmcbain
Copy link
Contributor

Yes. The four ints beginning each block of elements are

 entityDim(int) entityTag(int) elementType(int) numElementsInBlock(size_t)

so exactly one elementType.

@tianyikillua
Copy link
Contributor Author

I'll try to first work on a version that covers all formats but msh4.1.

@tianyikillua
Copy link
Contributor Author

tianyikillua commented Feb 13, 2020

A first draft is now in https://github.com/tianyikillua/meshio/commits/unification-point-cell-tags.

Already done

  • abaqus: point and cell sets
  • avsucd: cell tags
  • exodus: point sets
  • flac3d: cell tags
  • gmsh22: cell tags
  • med: point and cell tags
  • medit: point and cell tags
  • ugrid: cell tags

TODO

  • gmsh44: cell tags
  • permas: point and cell sets
  • ugrid: cell tags (WARNING after reviewing the code, it seems that when writing it does not support cells that contain multiple blocks of a same cell type due to ugrid_counts[key] = data.shape[0], also should in fact "triangle" be before "quad"?)

Repr of mesh also now prints point / cell tags.

In [4]: meshio.read("flac3d/flac3d_mesh_ex.f3grid")                             
Out[4]: 
<meshio mesh object>
  Number of points: 160
  Number of cells:
    hexahedron: 45
    pyramid: 9
    hexahedron: 18
    wedge: 9
    hexahedron: 6
    wedge: 3
    hexahedron: 6
    wedge: 3
    pyramid: 6
    tetra: 3
  Cell data: flac3d:zone
  Cell tags: 1, 2, 3, 4

In [5]: meshio.read("flac3d/flac3d_mesh_ex.f3grid").cell_tags                   
Out[5]: 
[array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
        1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
        2]),
 array([2, 2, 2, 2, 2, 2, 2, 2, 2]),
 array([4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4]),
 array([4, 4, 4, 4, 4, 4, 4, 4, 4]),
 array([3, 3, 3, 3, 3, 3]),
 array([3, 3, 3]),
 array([3, 3, 3, 3, 3, 3]),
 array([3, 3, 3]),
 array([3, 3, 3, 3, 3, 3]),
 array([3, 3, 3])]

Now point / cell tags can be transferred among formats, through mesh.point_tags and mesh.cell_tags. Here the cell data gmsh:physical is transferred into med:family.

In [3]: meshio.read("msh/insulated-2.2.msh")                                    
Out[3]: 
<meshio mesh object>
  Number of points: 67
  Number of cells:
    line: 21
    triangle: 111
  Cell data: gmsh:physical, gmsh:geometrical
  Cell tags: 1, 2, 3

mesh = meshio.read("msh/insulated-2.2.msh")
mesh.write("test.med")
mesh = meshio.read("test.med")

In [14]: mesh                                                                   
Out[14]: 
<meshio mesh object>
  Number of points: 67
  Number of cells:
    line: 21
    triangle: 111
  Cell data: med:family
  Cell tags: 1, 2, 3

@tianyikillua
Copy link
Contributor Author

@gdmcbain As you can see from the previous flac3d example, the 1st (hexa) block contains two tags. So each cell block is not necessarily associated with a unique tag.

@gdmcbain
Copy link
Contributor

gdmcbain commented Feb 13, 2020

That (heterogeneous tags within a block) seems like a terrible design decision, but that's not our call. I haven't heard of flac3d but I guess it's out there and meshio has to support it. Thank you for highlighting this.

@tianyikillua
Copy link
Contributor Author

I will push a draft PR for easy review and discussion.

@keurfonluu
Copy link
Contributor

That (heterogeneous tags within a block) seems like a terrible design decision, but that's not our call. I haven't heard of flac3d but I guess it's out there and meshio has to support it. Thank you for highlighting this.

FLAC3D is not a proper meshing software. It's a commercial mechanical simulator mostly used in civil engineering. It has a meshing tool that stitches pre-meshed grids all together. For instance, you can create a big pyramid made of smaller hexs and tets that corresponds to the roof of a house for instance. The order of cells in FLAC3D is defined by the order you create your objects.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants