After auditing the actual hot paths, only one item from PyVista 0.48 maps onto a real microgen bottleneck. The bigger win is unrelated to PyVista and lives in ImplicitLattice. Recording both here.
1. enable_smp_tools() — real win for TPMS / generic implicit shapes
The TPMS surface/sheet/skeletal pipeline runs heavy VTK filters on a dense StructuredGrid (resolution³ cells, often 100³-200³):
- microgen/shape/tpms.py:357,360,375,391: grid.clip_scalar(...) for sheet / upper / lower
- microgen/shape/tpms.py:397,403,412: extract_surface().clean().triangulate()
- microgen/shape/tpms.py:428: grid.contour(isosurfaces=[0.0], ...) (marching cubes)
- microgen/shape/shape.py:167: same contour pattern in the generic implicit shape pipeline
- microgen/shape/implicit_lattice.py:259,277: clip_scalar / contour for lattice surfacing
- microgen/box_mesh.py:552,603: per-phase threshold + extract_surface
These are exactly the filters VTK's SMP backend (TBB / OpenMP) parallelizes. Marching cubes and clip on a dense grid are typically 50-80% of TPMS wall time, so wrapping the surface-generation entry points with pv.enable_smp_tools() should give a real multi-core speedup with no algorithmic change.
Action: bump pyvista>=0.46 → >=0.48 in pyproject.toml, refresh the stale pyvista : 0.42.2 line in microgen/report.py:48, then prototype + benchmark enable_smp_tools() around Tpms.surface / Tpms.grid_sheet / Shape contour at e.g. resolution 100 / 150 / 200.
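A minimal sketch of the benchmark harness this action implies, using only the standard library. The names time_call and compare are hypothetical helpers, not part of microgen or PyVista. The harness takes the context manager to test as a parameter, so it makes no assumption about how enable_smp_tools() is invoked.

```python
import contextlib
import statistics
import time


def time_call(build, context_factory=contextlib.nullcontext, repeats=3):
    """Return the median wall time in seconds of build() run inside the context."""
    samples = []
    for _ in range(repeats):
        with context_factory():
            start = time.perf_counter()
            build()
            samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def compare(make_build, resolutions, context_factory, repeats=3):
    """Time every resolution without and with the context.

    make_build(resolution) must return a zero-argument callable that does the
    work to be measured. Returns (resolution, baseline_s, wrapped_s, speedup).
    """
    rows = []
    for resolution in resolutions:
        build = make_build(resolution)
        baseline = time_call(build, repeats=repeats)
        wrapped = time_call(build, context_factory, repeats=repeats)
        speedup = baseline / wrapped if wrapped > 0 else float("inf")
        rows.append((resolution, baseline, wrapped, speedup))
    return rows
```

If pv.enable_smp_tools() turns out to be usable as a context manager (an assumption to verify against the PyVista 0.48 documentation), it could be passed as context_factory with resolutions [100, 150, 200]. The callable returned by make_build should rebuild the shape from scratch on every call, so that any cached surface does not hide the filter cost.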
2. ImplicitLattice field eval — bigger lever, not a PyVista feature
The ImplicitLattice bottleneck is not in any VTK filter, so SMP tools don't help here. It's the Python triple loop in microgen/shape/implicit_lattice.py:207-223:
    for ix in range(self.repeat_cell[0]):
        for iy in range(self.repeat_cell[1]):
            for iz in range(self.repeat_cell[2]):
                ...
                cell_field = _compute_strut_field(points, nodes, self._base_struts, ...)
                field = smooth_min(field, cell_field, self.smoothing)
For each of the repeat_cell[0] × repeat_cell[1] × repeat_cell[2] cells, _compute_strut_field is evaluated over the full grid (resolution³ points) and the result is folded into field with smooth_min. Cost scales as O(N_cells × N_grid_points × N_struts), with no parallelism.
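To put a number on that scaling, here is a back-of-the-envelope count. The inputs (a 3 × 3 × 3 lattice, a 100³ grid, 12 struts per cell) are assumed example values, not measurements from microgen, and strut_distance_evaluations is a hypothetical helper.

```python
import math


def strut_distance_evaluations(repeat_cell, grid_points, struts_per_cell):
    """Count point-to-strut distance evaluations: N_cells * N_grid_points * N_struts."""
    n_cells = math.prod(repeat_cell)
    return n_cells * grid_points * struts_per_cell


# Assumed example values, for illustration only.
evaluations = strut_distance_evaluations((3, 3, 3), 100**3, 12)
print(f"{evaluations:.2e}")  # 3.24e+08
```

Doubling the resolution multiplies the count by 8, and doubling the repeat count along each axis multiplies it by another 8.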
Options, in order of effort vs. payoff:
- Numba @njit(parallel=True) with prange over cells. Move the triple loop + _compute_strut_field + smooth_min into a single kernel and prange the outer cell loop. Expected near-linear scaling with core count, no algorithmic change. ~1 day of work.
- Localize via spatial index. Each strut only affects points within radius + smoothing. Build a cKDTree of grid points, query per strut, update only the local slab of the SDF. Turns O(P · S) into O(P + S · k), with P grid points, S struts and k grid points within reach of one strut. Big asymptotic win for sparse / large lattices.
- GPU offload (JAX / Torch). Embarrassingly parallel; biggest absolute speedup but heavy dependency surface.
Numba is the clear next step. The spatial index becomes attractive once the grid gets very fine.
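A rough sketch of what a fused Numba kernel could look like, assuming Numba is an acceptable dependency. Everything here is illustrative. lattice_field, its arguments, the capsule-style strut distance and the polynomial smooth_min are stand-ins, not microgen's actual _compute_strut_field or smooth_min. The sketch also parallelizes over grid points rather than over cells: blending all cells into one shared field from several threads would need a cross-thread reduction, whereas each grid point can be computed independently.

```python
import math

import numpy as np

try:
    from numba import njit, prange
except ImportError:  # fallback so the sketch still runs (slowly) without Numba
    prange = range

    def njit(*args, **kwargs):
        if args and callable(args[0]):
            return args[0]
        return lambda func: func


@njit()
def smooth_min(a, b, k):
    """Polynomial smooth minimum (assumed stand-in for microgen's smooth_min)."""
    if k <= 0.0:
        return min(a, b)
    h = max(k - abs(a - b), 0.0) / k
    return min(a, b) - 0.25 * h * h * k


@njit(parallel=True)
def lattice_field(points, starts, ends, radius, offsets, k):
    """Signed distance to the struts of every cell, blended with smooth_min.

    points:       (P, 3) grid point coordinates
    starts, ends: (S, 3) strut end points of the base cell
    offsets:      (C, 3) translation applied to the base cell for each repeat
    """
    n_points = points.shape[0]
    field = np.empty(n_points, dtype=np.float64)
    for i in prange(n_points):  # grid points are independent of each other
        px = points[i, 0]
        py = points[i, 1]
        pz = points[i, 2]
        value = np.inf
        for c in range(offsets.shape[0]):
            for s in range(starts.shape[0]):
                ax = starts[s, 0] + offsets[c, 0]
                ay = starts[s, 1] + offsets[c, 1]
                az = starts[s, 2] + offsets[c, 2]
                dx = ends[s, 0] - starts[s, 0]
                dy = ends[s, 1] - starts[s, 1]
                dz = ends[s, 2] - starts[s, 2]
                length_sq = dx * dx + dy * dy + dz * dz
                if length_sq > 0.0:
                    # Project the point onto the segment and clamp to its ends.
                    t = ((px - ax) * dx + (py - ay) * dy + (pz - az) * dz) / length_sq
                    t = min(1.0, max(0.0, t))
                else:
                    t = 0.0
                ex = px - (ax + t * dx)
                ey = py - (ay + t * dy)
                ez = pz - (az + t * dz)
                dist = math.sqrt(ex * ex + ey * ey + ez * ez) - radius
                value = smooth_min(value, dist, k)
        field[i] = value
    return field
```

The kernel never materializes a per-cell field of P values, so it also drops the temporary arrays the Python loop allocates on every iteration. Whether it matches the existing output exactly depends on how closely the stand-in functions mirror the real ones, so a regression test against the current implementation would be the first step.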
Items considered and dropped
For the record, after checking the codebase these PyVista 0.48 features do not apply:
- wrap() perf work. No pv.wrap() calls anywhere in microgen. pv.StructuredGrid(...) is constructed once at init.
- remove_nan_cells. No NaN handling code in tpms_grading.py / Infill / anywhere else; implicit functions are continuous and finite by construction.
- Dataset accessor / reader-writer registry. Only ~15 to_pyvista / from_pyvista calls total across the codebase, all internal one-shot conversions. No user-facing ceremony to dissolve.
- Arrow / DataFrame export. No pandas / polars / pyarrow usage today, so any benefit is purely speculative.
- Modernized notebook HTML repr. Cosmetic; existing repr in the example notebooks already works.