Fix vector DMInterpolation overshoot at cell-shared boundaries#164
Merged
lmoresi merged 1 commit intodevelopmentfrom May 5, 2026
Merged
Fix vector DMInterpolation overshoot at cell-shared boundaries#164lmoresi merged 1 commit intodevelopmentfrom
lmoresi merged 1 commit intodevelopmentfrom
Conversation
Contributor
There was a problem hiding this comment.
Pull request overview
Fixes incorrect/garbage values produced by the PETSc-based evaluator for vector-valued MeshVariables at a small set of boundary-coincident Q2 DOF locations, and hardens post-load vector visibility by forcing proper local↔global sync / mesh-lvec invalidation after read_timestep.
Changes:
- Add “hint cell” recovery + output zeroing + reference-coordinate clamping in
DMInterpolationSetUp_UW/DMInterpolationEvaluate_UWto avoid uninitialised/overshoot results at cell-shared boundaries. - Harden
MeshVariable.read_timestep()by syncing var local→global and invalidating the cached mesh-level local vector. - Add a developer reproducer script documenting the bug and the expected round-trip behavior.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
src/underworld3/function/petsc_tools.c |
Changes PETSc interpolation setup/evaluation to prevent uninitialised outputs and mitigate boundary/reference-coordinate issues for vectors. |
src/underworld3/discretisation/discretisation_mesh_variables.py |
Ensures read_timestep() results are visible through mesh-level vectors by syncing and invalidating cached mesh _lvec. |
docs/developer/design/_repro_dminterp_multifield_bug.py |
Adds a minimal reproducer/regression gate script for the multi-field/vector interpolation failure mode. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Comment on lines
+1317
to
+1321
| self._sync_lvec_to_gvec() | ||
| if self.mesh._lvec is not None: | ||
| self.mesh._lvec.destroy() | ||
| self.mesh._lvec = None | ||
| self.mesh._stale_lvec = True |
| """ | ||
|
|
||
| import numpy as np | ||
| import sympy |
Comment on lines
+106
to
+120
| /* | ||
| Build a per-point cell map combining DMLocatePoints results with | ||
| the upstream `owning_cell` hint. PETSc's _REMOVE flag silently | ||
| drops points sitting exactly on inter-cell boundaries (e.g. the | ||
| degree-2 cell-interior nodes of a Q2 vector field that happen to | ||
| floating-point-coincide with a face). The upstream caller already | ||
| filters out genuinely-out-of-domain points via | ||
| `mesh.points_in_domain`, so any unlocated point here is in the | ||
| mesh; the hint from `mesh.get_closest_cells` gives a valid | ||
| adjacent cell. Using the hint to recover those points avoids | ||
| silent data loss in the interpolant — the symptom in Phase H was | ||
| `uw.function.evaluate(u.sym, u.coords)` returning ~1e+246 at five | ||
| isolated cell-midpoint nodes (uninitialised PETSc memory in the | ||
| output buffer). | ||
| */ |
Comment on lines
+121
to
138
| PetscInt *recovery_cells = NULL; | ||
| PetscCall(PetscMalloc1(N, &recovery_cells)); | ||
| for (PetscInt k = 0; k < N; ++k) | ||
| recovery_cells[k] = owning_cell ? (PetscInt)owning_cell[k] : -1; | ||
| for (p = 0; p < numFound; ++p) { | ||
| if (foundCells[p].index >= 0) foundProcs[foundPoints ? foundPoints[p] : p] = rank; | ||
| if (foundCells[p].index >= 0) { | ||
| PetscInt orig_p = foundPoints ? foundPoints[p] : p; | ||
| recovery_cells[orig_p] = foundCells[p].index; | ||
| foundProcs[orig_p] = rank; | ||
| } | ||
| } | ||
| /* Recover unlocated points on rank 0 using the hint. */ | ||
| if (owning_cell && rank == 0) { | ||
| for (PetscInt k = 0; k < N; ++k) { | ||
| if (foundProcs[k] == size && recovery_cells[k] >= 0) | ||
| foundProcs[k] = rank; | ||
| } | ||
| } |
Comment on lines
+263
to
+276
| // Clamp reference coords to the cell's valid range. When a | ||
| // hint cell was used (via SetUp's owning_cell fallback) for a | ||
| // point sitting on a shared face, the physical-to-reference | ||
| // map can return ξ slightly outside [-1, 1] (e.g. 1+1e-16) — | ||
| // and the Q2 / higher-order basis polynomial evaluates to | ||
| // large numbers when extrapolated. For shared-boundary points | ||
| // the correct answer is the cell-boundary value, which the | ||
| // basis returns when ξ is clamped to the cell's valid range. | ||
| // [-1, 1] covers quad / hex elements; triangle / tet maps | ||
| // already return ξ inside their reference simplex so this | ||
| // clamp is a no-op for them at well-behaved coordinates. | ||
| for (d = 0; d < cdim; ++d) { | ||
| if (xi[d] < -1.0) xi[d] = -1.0; | ||
| else if (xi[d] > 1.0) xi[d] = 1.0; |
`uw.function.evaluate(vec_var.sym, coords)` silently returned values far outside the data range — up to ~1e+246 — at a small handful of degree-2 cell-interior nodes whose coordinates floating-point-coincide with an inter-cell face. Scalars were unaffected; vectors lost their y-component to either uninitialised PETSc memory or basis-polynomial overshoot. Three layered defects in `src/underworld3/function/petsc_tools.c`: 1. `DMInterpolationEvaluate_UW` skipped the FE evaluation for points with `cells[p] = -1` (rejected by `DMLocatePoints`'s strict containment test) but never zeroed `interpolant[p*dof + ...]`. Result: those points presented as whatever was in PETSc-allocated output memory (1e+246-class outliers). 2. `DMInterpolationSetUp_UW` discarded the upstream `owning_cell` hint for any point `DMLocatePoints` rejected. The hint comes from `mesh.get_closest_cells()` which is reliable: callers already filter for in-domain points via `mesh.points_in_domain()` upstream so any unlocated point at this stage is genuinely inside the mesh, just sitting on a face PETSc decided not to claim. `recovery_cells` map now combines `foundCells` with the hint; rank 0 claims ownership of hint-recovered points. 3. `DMPlexCoordinatesToReference` returns ξ slightly outside [-1, 1] for boundary points evaluated against an adjacent cell, and the Q2 basis polynomial extrapolates wildly there. Clamping ξ to the cell's reference range produces the correct boundary value (the basis is continuous at faces, so either adjacent cell yields the same answer once ξ is clamped to the shared face). `MeshVariable.read_timestep` also gains a local→global vector sync plus mesh-lvec invalidation. This isn't part of the visible bug but makes `read_timestep` symmetric with the sibling `load_from_h5_plex_vector` and removes a foot-gun for any caller reading `_gvec` directly after a load. Regression gate: `docs/developer/design/_repro_dminterp_multifield_bug.py` builds a fresh mesh, assigns known constants to a vector + several scalars, evaluates each at its own DOF nodes via `uw.function.evaluate`, and reports max abs error. Pre-fix output: vector `err = 7.0` (full component loss); scalar `err = 2.8e-14`. Post-fix: every variable round-trips at machine precision (1.4e-14). This bug silently corrupted the y-component of any vector field evaluated through `uw.function.evaluate`. Anything downstream that sets initial conditions, checks BC enforcement, post-processes, or visualises a velocity field could pick up the corruption depending on whether mesh resolution placed a degree-2 interior node on a cell face. Found while diagnosing a Phase H pyvista snapshot that showed `max |u| = 70` instead of `V_top = 0.227`. Underworld development team with AI support from [Claude Code](https://claude.com/claude-code)
346b1a0 to
8a56a80
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
uw.function.evaluate(vec_var.sym, coords)silently returned valuesfar outside the data range at a small handful of degree-2
cell-interior nodes whose coordinates floating-point-coincide with
an inter-cell face. Scalars were unaffected; vectors lost their
y-component to either uninitialised PETSc memory (values up to
~1e+246) or Q2 basis-polynomial overshoot.
This bug silently corrupted any vector field evaluated through
uw.function.evaluate— initial conditions, BC enforcement checks,post-processing, visualisation, anything downstream of the canonical
evaluator on certain mesh resolutions. Found while diagnosing a
visualisation snapshot that showed
max |u| = 70instead ofV_top = 0.227.Three layered defects in
src/underworld3/function/petsc_tools.cUninitialised output for unlocated points —
DMInterpolationEvaluate_UWskipped the FE evaluation forpoints with
cells[p] = -1but never zeroedinterpolant[p*dof + ...]. Now zeroed explicitly.Lost upstream cell hint —
DMInterpolationSetUp_UWdiscardedthe
owning_cellhint passed in bymesh.get_closest_cells().recovery_cellsmap now combinesDMLocatePointsresults withthe hint; rank 0 claims ownership of hint-recovered points.
Justification: callers already filter for in-domain points via
mesh.points_in_domain()upstream, so any point unlocated hereis genuinely inside the mesh — just sitting on a shared face
PETSc's strict containment test rejected.
Reference-coord overshoot —
DMPlexCoordinatesToReferencereturns ξ slightly outside [-1, 1] for boundary points evaluated
against an adjacent cell, and the Q2 basis polynomial
extrapolates wildly. ξ now clamped to the cell's valid range —
correct boundary value because the basis is continuous at faces.
MeshVariable.read_timestephardeningAdds a local→global vector sync + mesh-lvec invalidation at the end
of
read_timestep. Not part of the visible bug, but makesread_timestepsymmetric with siblingload_from_h5_plex_vectorand removes a foot-gun for any caller reading
_gvecdirectlyafter a load.
Test plan
docs/developer/design/_repro_dminterp_multifield_bug.pybuilds a fresh mesh, assigns known constants to a vector +
multiple scalars, evaluates each at its own DOF nodes,
reports max abs error. Both fresh-build and save+load.
- Pre-fix: vector
err = 7.0(full y-component loss);scalar
err = 2.8e-14.- Post-fix: every variable round-trips at machine precision
(1.4e-14).
Underworld development team with AI support from Claude Code