TPMS/lattice perf: PyVista 0.48 SMP tools + parallelize ImplicitLattice field eval #131

@kmarchais

After auditing the actual hot paths, only one item from PyVista 0.48 maps onto a real microgen bottleneck. The bigger win is unrelated to PyVista and lives in ImplicitLattice. Recording both here.

1. enable_smp_tools() — real win for TPMS / generic implicit shapes

The TPMS surface/sheet/skeletal pipeline runs heavy VTK filters on a dense StructuredGrid (resolution³ cells, often 100³-200³):

  • microgen/shape/tpms.py:357,360,375,391: grid.clip_scalar(...) for sheet / upper / lower
  • microgen/shape/tpms.py:397,403,412: extract_surface().clean().triangulate()
  • microgen/shape/tpms.py:428: grid.contour(isosurfaces=[0.0], ...) (marching cubes)
  • microgen/shape/shape.py:167: same contour pattern in the generic implicit shape pipeline
  • microgen/shape/implicit_lattice.py:259,277: clip_scalar / contour for lattice surfacing
  • microgen/box_mesh.py:552,603: per-phase threshold + extract_surface

These are exactly the filters VTK's SMP backend (TBB / OpenMP) parallelizes. Marching cubes and clip on a dense grid are typically 50-80% of TPMS wall time, so wrapping the surface-generation entry points with pv.enable_smp_tools() should give a real multi-core speedup with no algorithmic change.

Action: bump pyvista>=0.46 to pyvista>=0.48 in pyproject.toml, refresh the stale "pyvista : 0.42.2" line in microgen/report.py:48, then prototype and benchmark enable_smp_tools() around Tpms.surface / Tpms.grid_sheet / Shape contour at e.g. resolution 100 / 150 / 200.
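A minimal timing harness for that benchmark could look like the sketch below. It uses only the standard library; `time_call` and `compare` are hypothetical helper names, and the two callables passed in are placeholders for "build the surface at this resolution" with and without SMP enabled.

```python
import time
from statistics import median


def time_call(func, *args, repeats=3, **kwargs):
    """Return the median wall time in seconds over `repeats` calls of func."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args, **kwargs)
        samples.append(time.perf_counter() - start)
    return median(samples)


def compare(baseline, candidate, resolutions, repeats=3):
    """Time two callables that each take a resolution.

    Returns one row per resolution: (resolution, t_baseline, t_candidate, speedup).
    """
    rows = []
    for res in resolutions:
        t_base = time_call(baseline, res, repeats=repeats)
        t_cand = time_call(candidate, res, repeats=repeats)
        # Guard against a zero reading on very coarse timers.
        speedup = t_base / t_cand if t_cand > 0 else float("inf")
        rows.append((res, t_base, t_cand, speedup))
    return rows
```

For the real measurement, `baseline` would generate the TPMS surface at the given resolution as today and `candidate` would do the same with pv.enable_smp_tools() active. Whether that call is a global switch or a scoped context should be checked against the PyVista 0.48 documentation before wiring it in.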

2. ImplicitLattice field eval — bigger lever, not a PyVista feature

The ImplicitLattice bottleneck is not in any VTK filter, so SMP tools don't help here. It's the Python triple loop in microgen/shape/implicit_lattice.py:207-223:

for ix in range(self.repeat_cell[0]):
    for iy in range(self.repeat_cell[1]):
        for iz in range(self.repeat_cell[2]):
            ...
            cell_field = _compute_strut_field(points, nodes, self._base_struts, ...)
            field = smooth_min(field, cell_field, self.smoothing)

For each of the repeat_cell[0] × repeat_cell[1] × repeat_cell[2] cells, _compute_strut_field is evaluated over the full grid (resolution³ points) and reduced into the running field with smooth_min. Cost scales as O(N_cells × N_grid_points × N_struts) with no parallelism.
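To put a number on that scaling, here is a small back-of-the-envelope helper. The function name and the example values (a 3×3×3 lattice, 12 struts per cell) are illustrative assumptions, not measurements from microgen.

```python
def strut_evaluations(repeat_cell, resolution, n_struts):
    """Count point-to-strut distance evaluations for the triple loop.

    repeat_cell: (nx, ny, nz) number of repeated cells per axis
    resolution:  grid points per axis, so the grid has resolution**3 points
    n_struts:    struts in the base cell
    """
    n_cells = repeat_cell[0] * repeat_cell[1] * repeat_cell[2]
    n_points = resolution ** 3
    return n_cells * n_points * n_struts


# 27 cells x 1_000_000 points x 12 struts
print(strut_evaluations((3, 3, 3), 100, 12))  # 324000000
```

Doubling the resolution multiplies the count by 8, which is why the loop dominates long before the VTK filters do.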

Options, in order of effort vs. payoff:

  1. Numba @njit(parallel=True) with prange over cells. Move the triple loop + _compute_strut_field + smooth_min into a single kernel and prange the outer cell loop. Near-linear scaling on cores, no algorithmic change. ~1 day of work.
  2. Localize via spatial index. Each strut only affects points within radius + smoothing. Build a cKDTree of grid points, query per strut, update only the local slab of the SDF. Turns O(P · S) into O(P + S · k), where P is the number of grid points, S the total number of struts, and k the number of grid points within reach of one strut. Big asymptotic win for sparse / large lattices.
  3. GPU offload (JAX / Torch). Embarrassingly parallel; biggest absolute speedup but heavy dependency surface.
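A rough sketch of what option 1 might look like follows. Everything in it is an assumption for illustration: `lattice_field` and its signature are hypothetical, the point-to-segment distance stands in for whatever _compute_strut_field actually computes, and the polynomial smooth minimum stands in for microgen's smooth_min. The sketch also parallelizes over grid points rather than over cells, because each output value is then written by exactly one iteration and no cross-thread reduction is needed. Numba is imported optionally so the kernel still runs (slowly) without it.

```python
import numpy as np

try:  # Numba is optional: without it the kernel runs as plain Python.
    from numba import njit, prange
except ImportError:
    prange = range

    def njit(*args, **kwargs):
        def decorator(func):
            return func
        return decorator


@njit(parallel=True)
def lattice_field(points, starts, ends, offsets, radius, k):
    """Distance-like field of a strut lattice, one value per point.

    points:       (P, 3) grid point coordinates
    starts, ends: (S, 3) end points of the base-cell struts (nonzero length)
    offsets:      (C, 3) translation of each repeated cell
    radius:       strut radius
    k:            smoothing width; 0.0 gives a hard minimum
    """
    n_points = points.shape[0]
    out = np.empty(n_points)
    for p in prange(n_points):  # points are independent: no write conflicts
        acc = 1e30
        for c in range(offsets.shape[0]):
            for s in range(starts.shape[0]):
                # Project the point onto the translated segment.
                dot_ab = 0.0
                dot_ap = 0.0
                for i in range(3):
                    ab = ends[s, i] - starts[s, i]
                    ap = points[p, i] - (starts[s, i] + offsets[c, i])
                    dot_ab += ab * ab
                    dot_ap += ap * ab
                t = min(max(dot_ap / dot_ab, 0.0), 1.0)
                dist2 = 0.0
                for i in range(3):
                    ab = ends[s, i] - starts[s, i]
                    ap = points[p, i] - (starts[s, i] + offsets[c, i])
                    diff = ap - t * ab
                    dist2 += diff * diff
                d = np.sqrt(dist2) - radius
                # Polynomial smooth minimum (assumed form, not microgen's).
                h = 0.0
                if k > 0.0:
                    h = max(k - abs(acc - d), 0.0) / k
                acc = min(acc, d) - 0.25 * h * h * k
        out[p] = acc
    return out
```

If the outer cell loop is parallelized instead, as the option describes, each thread would need its own partial field that is combined afterwards, since several cells write to the same grid points.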

(1) is the clear next step. (2) becomes attractive once the grid gets very fine.
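The localization idea in (2) can be illustrated without a tree when the grid is regular and axis-aligned. The sketch below swaps the cKDTree query for direct index arithmetic on the grid axes: for each strut it finds the index slab of the strut's bounding box inflated by radius + margin and updates only that slab. The names `segment_distance` and `localized_field` are hypothetical, a hard minimum replaces smooth_min, and `margin` is an assumed truncation distance that would have to cover the smoothing width in a real implementation.

```python
import numpy as np


def segment_distance(points, a, b):
    """Distance from each row of points (n, 3) to the segment a-b."""
    ab = b - a
    t = np.clip((points - a) @ ab / (ab @ ab), 0.0, 1.0)
    closest = a + t[:, None] * ab
    return np.linalg.norm(points - closest, axis=1)


def localized_field(axes, struts, radius, margin):
    """Truncated strut field on a regular grid.

    axes:   three sorted 1-D coordinate arrays (x, y, z)
    struts: iterable of (a, b) end-point pairs, each a length-3 array
    Returns min(margin, exact field): points no strut reaches keep `margin`.
    """
    x, y, z = axes
    reach = radius + margin
    field = np.full((x.size, y.size, z.size), float(margin))
    for a, b in struts:
        lo = np.minimum(a, b) - reach
        hi = np.maximum(a, b) + reach
        # Index slab of the inflated bounding box along each axis.
        sl = tuple(
            slice(np.searchsorted(ax, l, side="left"),
                  np.searchsorted(ax, h, side="right"))
            for ax, l, h in zip(axes, lo, hi)
        )
        sub = np.stack(
            np.meshgrid(x[sl[0]], y[sl[1]], z[sl[2]], indexing="ij"), axis=-1
        )
        d = segment_distance(sub.reshape(-1, 3), a, b) - radius
        field[sl] = np.minimum(field[sl], d.reshape(sub.shape[:3]))
    return field
```

Any point outside a strut's inflated box is farther than radius + margin from that strut, so skipping it cannot change the truncated result. The zero isosurface is therefore the same as in the full evaluation, while each strut touches only its own neighbourhood.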

Items considered and dropped

For the record, after checking the codebase these PyVista 0.48 features do not apply:

  • wrap() perf work. No pv.wrap() calls anywhere in microgen. pv.StructuredGrid(...) is constructed once at init.
  • remove_nan_cells. No NaN handling code in tpms_grading.py / Infill / anywhere else; implicit functions are continuous and finite by construction.
  • Dataset accessor / reader-writer registry. Only ~15 to_pyvista / from_pyvista calls total across the codebase, all internal one-shot conversions. No user-facing ceremony to dissolve.
  • Arrow / DataFrame export. No pandas / polars / pyarrow usage today; pure speculation.
  • Modernized notebook HTML repr. Cosmetic; existing repr in the example notebooks already works.
