[Math] New QIPC ops for single-threaded linalg #683

Open
hughperkins wants to merge 26 commits into main from hp/new-qipc-ops-linalg

Conversation

@hughperkins
Collaborator


Adds a free function quadrants.lang.matrix_ops.frobenius_inner(A, B) and a
matching Matrix.frobenius_inner(other) method, computing
⟨A, B⟩ = Σ_ij A_ij B_ij. Mirrors the existing norm_sqr function — they are
the same operation when A == B.
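A quick numpy reference for the contract (the quadrants versions run per thread inside a kernel; this sketch is just the math):

```python
import numpy as np

def frobenius_inner(A, B):
    # <A, B> = sum_ij A_ij * B_ij (reduces to norm_sqr when B is A)
    return float(np.sum(A * B))

A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[5.0, 6.0], [7.0, 8.0]])
print(frobenius_inner(A, B))   # 1*5 + 2*6 + 3*7 + 4*8 = 70.0
print(frobenius_inner(A, A))   # same value as norm_sqr(A) = 30.0
```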

Tests parametrise over arch=qd.gpu (CUDA / Metal / Vulkan / AMDGPU) for both
f32 and f64, on square sizes 2, 3, 6, 9, 12 (matching qipc's IPC needs) plus
rectangular shapes 9×12, 12×3, 2×4 to cover the non-square use cases (Hessian
blocks in qipc).

Closes the "Frobenius inner product" gap row in
perso_hugh/doc/qipc/qipc_gaps_linalg.md.
…12 · 12×9)

Adds test_matmul_chain_qipc_sizes_{f32,f64} verifying that the largest matmul
chain qipc's IPC pipeline needs (9×12 · 12×12 · 12×9 → 9×9) compiles cleanly
and matches numpy. Both the chained form (A @ B @ C) and the staged form
(AB = A @ B; AB @ C) are checked, since the chained form may stress the
backend codegen differently (intermediate has 1296 FMAs unrolled).
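The shape contract being tested, as a numpy sketch (sizes taken from the commit message; the quadrants kernel specifics are omitted):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((9, 12))
B = rng.standard_normal((12, 12))
C = rng.standard_normal((12, 9))

chained = A @ B @ C   # single expression, as in the kernel's A @ B @ C
AB = A @ B            # staged form
staged = AB @ C

assert chained.shape == (9, 9)
assert np.allclose(chained, staged)
```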

Parametrised over arch=qd.gpu so CUDA / Metal / Vulkan / AMDGPU all run.
Quadrants imposes no enforced size cap on matmul; the "n·m ≤ 32" warning
in qipc's design doc is qipc-side only.

Closes the "Matrix.__matmul__ correctness at large sizes" gap row in
perso_hugh/doc/qipc/qipc_gaps_linalg.md (gap (verify) → ✅).
qipc's ARAP rotation R = U @ V.T must be a proper rotation (det R = +1) for
any input deformation gradient F. The libuipc convention enforced by qipc is
det(U) = det(V) = +1 always, with the sign of det(F) absorbed into σ
(σ may have a negative entry when det(F) < 0).

This test verifies that quadrants' qd.svd at 3×3 follows the same convention,
across the cases qipc actually exercises:
  - identity (det = +1), reflection (det = -1), generic positive- and
    negative-det matrices, SPD, near rank-deficient, near-degenerate
    singular values.
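The sign convention can be sketched on top of numpy's SVD (this is an illustrative reference, not the quadrants kernel code): flip the last column of whichever factor carries a reflection and absorb the sign into σ, so that reconstruction is preserved while det(U) = det(V) = +1.

```python
import numpy as np

def svd_det_plus_one(F):
    # Force det(U) = det(V) = +1; the sign of det(F) moves into sigma.
    U, s, Vt = np.linalg.svd(F)
    V = Vt.T.copy()
    if np.linalg.det(U) < 0:
        U[:, -1] *= -1.0   # flipping a column of U and negating the matching
        s[-1] *= -1.0      # sigma entry leaves U @ diag(s) @ V.T unchanged
    if np.linalg.det(V) < 0:
        V[:, -1] *= -1.0
        s[-1] *= -1.0
    return U, s, V

F = np.diag([1.0, 1.0, -1.0])   # a reflection: det(F) = -1
U, s, V = svd_det_plus_one(F)
# det(U) = det(V) = +1, sigma carries one negative entry,
# and U @ diag(s) @ V.T still reconstructs F
```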

Parametrised over arch=qd.gpu × {f32, f64}.
…pivoting

Adds _inverse_lu in matrix_ops.py: in-place Gauss elimination with partial
pivoting, fully unrolled (static range for all loop bounds, runtime int for
pivot-row index). The inverse function dispatches to it for N >= 5; sizes
1–4 keep the existing closed-form cofactor-expansion paths. Precondition
relaxed from dim_lt(0, 5) to dim_lt(0, 13).

The implementation maintains a working copy `a` for in-place LU and a
parallel matrix `b` initialised to identity that receives the same row
swaps + row eliminations; at the end `b = L⁻¹ P` and the inverse is read
column-by-column by back-solving `U x = b[:, c]`.
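The scheme reads, in plain numpy (a sketch of the steps described above, not the unrolled quadrants code):

```python
import numpy as np

def inverse_lu(A):
    # In-place Gauss elimination with partial pivoting on a working copy
    # `a`, mirroring every row swap / elimination into `b` (starts as I);
    # then the inverse is read column by column by back-solving U x = b[:, c].
    n = A.shape[0]
    a = A.astype(float).copy()
    b = np.eye(n)
    for k in range(n):
        piv = k + np.argmax(np.abs(a[k:, k]))   # partial pivot
        if piv != k:
            a[[k, piv]] = a[[piv, k]]
            b[[k, piv]] = b[[piv, k]]
        for i in range(k + 1, n):
            f = a[i, k] / a[k, k]
            a[i, k:] -= f * a[k, k:]
            b[i] -= f * b[k]
    inv = np.empty((n, n))
    for c in range(n):                           # back-substitution per column
        x = b[:, c].copy()
        for i in range(n - 1, -1, -1):
            x[i] = (x[i] - a[i, i + 1:] @ x[i + 1:]) / a[i, i]
        inv[:, c] = x
    return inv

rng = np.random.default_rng(0)
M = rng.standard_normal((6, 6)) + 6.0 * np.eye(6)   # diagonally dominant
print(np.allclose(inverse_lu(M) @ M, np.eye(6)))    # True
```

A matrix with a zero in `[0, 0]` (e.g. a permutation matrix) exercises the pivoting path specifically, matching the permuted-upper-triangular test factory.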

Tests at tests/python/test_linalg.py::test_inverse_large_{f32,f64} cover
N ∈ {5..12} × {diagonally-dominant, SPD, permuted-upper-triangular}. The
permuted-upper-triangular factory has a zero in [0, 0] so it specifically
exercises the pivoting path. Tolerance scales with condition number ×
machine epsilon (50 × cond × eps + dtype floor).

Parametrised over arch=qd.gpu so CUDA / Metal / Vulkan / AMDGPU all run.
Adds python/quadrants/_funcs_sym_eig_general.py with sym_eig_general(A, dt)
and make_spd(A, dt). Implements Eigen 3.4's SelfAdjointEigenSolver compute()
path: Householder tridiagonalisation + implicit QR with Wilkinson shift +
ascending sort. Direct port of qipc/_src/core/linalg/evd.py — qipc can drop
its private copy once this lands.

qd.sym_eig now dispatches: N=2/3 keep the existing closed-form
_sym_eig{2,3}x3 paths; 4 ≤ N ≤ 12 → sym_eig_general. Also exposes
qd.make_spd(A) which projects a symmetric matrix to the nearest PSD matrix
in Frobenius norm by clamping eigenvalues to ≥ 0 — qipc's per-element
Hessian projection.
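The projection itself is one eigendecomposition plus a clamp; a numpy reference (illustrative, not the quadrants implementation):

```python
import numpy as np

def make_spd(A):
    # Nearest-PSD projection in Frobenius norm: clamp eigenvalues to >= 0.
    # Assumes A is symmetric (as in qipc's per-element Hessian projection).
    w, Q = np.linalg.eigh(A)
    return (Q * np.maximum(w, 0.0)) @ Q.T   # Q diag(max(w, 0)) Q^T

A = np.diag([2.0, -1.0, 3.0])
P = make_spd(A)   # negative eigenvalue clamped: diag(2, 0, 3)
```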

Tests at tests/python/test_eig.py:
  - test_sym_eig_general_{f32,f64}: N ∈ {4, 5, 6, 9, 12} × {random symmetric,
    SPD, indefinite, diagonal, repeated-eigenvalues}. Verifies eigenvalues
    match numpy, eigenvectors are orthogonal, and Q diag(λ) Qᵀ ≈ A.
  - test_make_spd_{f32,f64}: N ∈ {4, 6, 9, 12} × {indefinite, random, SPD}.
    Verifies symmetry, PSD-ness (min eig ≥ -tol), and that the result
    matches numpy-reference clamping.

Parametrised over arch=qd.gpu so CUDA / Metal / Vulkan / AMDGPU all run.
The previous Householder + implicit-QR port produced wrong eigenvalues for
N>3 (off-diagonal residue ~50% of the input scale), and the algorithm's
many static branches did not lend themselves to debugging via printf.

Switch sym_eig_general (and the make_spd that builds on it) to cyclic
Jacobi. The Jacobi loop is fully unrolled with static(range): runtime range
loops in @func that return values were observed to iterate only once on
this branch, so static unrolling of MAX_SWEEPS=6 sweeps is what actually
reduces the off-diagonals across passes. The 6-sweep budget gives ~6 digits
in f32 and ~12 digits in f64 for N≤9 on the test factories.
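A minimal numpy sketch of cyclic Jacobi with a fixed sweep budget (illustrative; the quadrants version is fully unrolled in registers):

```python
import numpy as np

def jacobi_eig(A, max_sweeps=12):
    # Cyclic Jacobi: each sweep visits every (p, q) pair and applies a
    # rotation zeroing A[p, q]; V accumulates the eigenvectors.
    n = A.shape[0]
    a = A.astype(float).copy()
    V = np.eye(n)
    for _sweep in range(max_sweeps):
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p, q]) < 1e-30:
                    continue
                theta = 0.5 * np.arctan2(2.0 * a[p, q], a[q, q] - a[p, p])
                c, s = np.cos(theta), np.sin(theta)
                J = np.eye(n)
                J[p, p] = J[q, q] = c
                J[p, q], J[q, p] = s, -s
                a = J.T @ a @ J
                V = V @ J
    w = np.diag(a)
    order = np.argsort(w)        # ascending, as qd.sym_eig returns
    return w[order], V[:, order]
```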

N=12 (used by qipc's 12×12 contact Hessians) is dropped from this path:
the fully static-unrolled Jacobi at N=12 with 6 sweeps does not finish
compiling within 15 minutes on CUDA. Either a blocked/partially-runtime
implementation, or porting qipc's exact `sym_evd` template-mutation pattern
(ndarray of compound types passed via template()), is needed to recover
N=12 — tracked as a follow-up.

- _MAX_SWEEPS = 6 (sweep / (p,q) / per-row updates all static).
- 2-pass right/left rotation (no `r != p, q` static guards) keeps the
  unrolled body lean enough to compile for N≤9.
- sym_eig() now raises for N≥10 with a follow-up note.
- test_eig.py: parametrize sym_eig_general / make_spd at N ∈ {4, 5, 6, 9}
  (drop 12).
Hugh's catch: the previous sweep loop wasn't broken — it was being
parallelized. quadrants @qd.kernel reaches into a callee @qd.func and
parallelizes the func's outermost runtime range loop when the kernel
itself doesn't have one. The sweep loop saw _MAX_SWEEPS threads each
running one sweep on a fresh copy of the locals, with last-write-wins.
That's why static(range) was the only thing that "worked", and that's
what blew up the compile time at N=12.
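A toy numpy illustration (not quadrants code) of why parallelizing the sweep loop breaks convergence: under last-write-wins, each thread runs one sweep from the same starting locals, so the surviving result is a single sweep rather than six chained ones.

```python
import numpy as np

def sweep(a):
    # One cyclic-Jacobi-style sweep over the strict upper triangle.
    a = a.copy()
    n = a.shape[0]
    for p in range(n - 1):
        for q in range(p + 1, n):
            theta = 0.5 * np.arctan2(2.0 * a[p, q], a[q, q] - a[p, p])
            c, s = np.cos(theta), np.sin(theta)
            J = np.eye(n)
            J[p, p] = J[q, q] = c
            J[p, q], J[q, p] = s, -s
            a = J.T @ a @ J
    return a

def offdiag(a):
    return np.abs(a - np.diag(np.diag(a))).max()

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 6))
A = A + A.T

seq = A
for _ in range(6):
    seq = sweep(seq)   # sequential: sweeps chain, off-diagonals vanish

par = sweep(A)         # "parallelized" loop: every thread sweeps the
                       # ORIGINAL locals once; last write wins, so only
                       # a single sweep's worth of reduction survives
```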

Fix is a one-liner in the test kernel (for _tid in range(1):) plus
flipping the sweep loop back to runtime range:

  for _sweep in range(_MAX_SWEEPS):   # was static(range(_MAX_SWEEPS))

With that, N=12 compiles in ~2 minutes (vs not finishing in 15 min
before) and the cap in qd.sym_eig() goes back up to 12, restoring qipc's
target sizes.

- _MAX_SWEEPS = 12 (was 6) — runtime, so cost is per-call, not compile.
- qd.sym_eig() supports N ∈ {2..12}; N≥13 raises.
- test_eig.py: re-add N=12 to the parametrize lists; wrap kernel calls
  in `for _tid in range(1):`.
- Docstring on _funcs_sym_eig_general points at the gotcha note in
  perso_hugh that explains the parallel-for behavior.

@chatgpt-codex-connector (bot) left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 40af9db7ab


  eigvecs[i, j] = zero
  eigvecs[i, i] = one

  for _sweep in range(_MAX_SWEEPS):
P1 Badge Make Jacobi sweeps sequential inside sym_eig

For callers that use qd.sym_eig in the normal scalar-kernel style, e.g. eigvals[None], eigvecs[None] = qd.sym_eig(A[None]) without a dummy outer for, this runtime range is parallelized by the kernel machinery instead of executing the 12 sweeps sequentially; the new implementation's own docstring calls out that this produces incorrect iteration semantics. That means the newly supported n >= 4 path can return a non-converged eigensystem unless every caller restructures its kernel, unlike the existing 2x2/3x3 API. Since _MAX_SWEEPS is a fixed constant, this should be hidden inside the implementation rather than imposed as a caller requirement.




-__all__ = ["randn", "polar_decompose", "eig", "sym_eig", "svd", "solve"]
+__all__ = ["randn", "polar_decompose", "eig", "sym_eig", "make_spd", "svd", "solve"]

P2 Badge Update the API snapshot for public additions

Adding make_spd to _funcs.__all__ exposes qd.make_spd through the top-level wildcard import, and this same commit also exposes Matrix.frobenius_inner; however tests/python/test_api.py::test_api compares dir(qd) and dir(qd.Matrix) against hard-coded lists that contain neither name. The existing CPU API snapshot test will therefore fail even though the new APIs are intentional, so please update the expected API lists with these additions.


Comment on lines +622 to +623
  if A.n <= 12:
      return sym_eig_general(A, dt)

P2 Badge Keep user docs in sync with the expanded API

This expands the public qd.sym_eig contract from 2x2/3x3 to sizes up to 12, but no docs/ files were changed; AGENTS.md specifically requires docs updates for public API or usage changes. The current user guide still advertises qd.sym_eig as 2x2/3x3-only, and related changes in this commit also leave the documented inverse() size cap and new public helpers stale, so users will get contradictory guidance.


@github-actions

Tests (test_eig.py): four new contract / edge-case tests for the
N≥4 cyclic-Jacobi path, all parametrized over qd.gpu × N ∈ {4,6,9,12}
and wrapped in the required `for _tid in range(1):` outer loop:

  - test_sym_eig_alpha_identity_f64 — α·I (incl. α=0) at every N to
    cover the fully-degenerate / repeated-eigenvalue case.
  - test_make_spd_idempotent_f64 — make_spd(make_spd(A)) ≈ make_spd(A)
    over indefinite / negative-definite / SPD inputs.
  - test_make_spd_negative_definite_zero_f64 — all-negative-eig
    inputs project to the zero matrix.
  - test_sym_eig_above_cap_raises — N=13 raises with a clear
    "up to 12" message instead of silently miscompiling.

30/30 pass on cluster (CUDA + Vulkan + CPU, ~32 min) and amddesktop
(AMDGPU + Vulkan + CPU, ~10 min).

Docs (user_guide/decompositions.md): updated the table to include
N up to 12 for qd.sym_eig and add qd.make_spd; documented the
cyclic-Jacobi path, the for-_tid kernel pattern requirement, and
compile/runtime cost characteristics. Replaced the inlined make_spd
3×3 snippet with a real qd.make_spd example for N≥4 (kept the 3×3
recipe for users below the cap). Added a Frobenius inner-product
section to user_guide/matrix_vector.md and updated the Matrix.inverse
size cap from 4×4 to 12×12.
@github-actions

- decompositions.md → linalg_per_thread.md (rename + retitle "Per-thread
  linear algebra"; updated opening paragraph to make the per-thread,
  non-cooperative semantics explicit; updated "Related" cross-links).
- matrix_vector.md → split: type / storage / declaration content stays
  here; element-wise + closed-form ops (arithmetic, dot/cross, norm,
  transpose/det/trace/inverse, frobenius_inner, mat-vec/mat-mat multiply)
  move to matrix_vector_per_thread.md.
- index.md toctree: replaces `decompositions` with
  `matrix_vector_per_thread` + `linalg_per_thread` under Core concepts.

The "_per_thread" suffix marks pages whose ops run per thread in
registers with no cross-thread cooperation (no shared memory, no syncs,
no warp/subgroup primitives). Cross-thread / sparse linalg under
qd.linalg.* is a separate axis and is not covered yet.

No source / test changes — pure docs reorg.
The dispatch shape-cap exceptions in qd.polar_decompose / qd.svd /
qd.eig / qd.solve and the dim assertion in qd.solve all said "2D
matrix" / "3D matrix" when they meant "2×2 / 3×3 matrix". "2D matrix"
conventionally means "rank-2 tensor", which is true of every matrix —
so the message was actively misleading.

Updated to:
  - "Polar decomposition only supports 2×2 and 3×3 matrices."
  - "SVD only supports 2×2 and 3×3 matrices."
  - "Eigen solver only supports 2×2 matrices."
  - "Solver only supports 2×2 and 3×3 matrices."
  - assert "Only 2×2 and 3×3 matrices are supported"

Also updated the one test that pinned on the old wording
(test_ast_refactor.py::test_raise) and dropped the FIXME from the
linalg_per_thread.md doc.

Verified locally on amddesktop: test_raise passes 3/3 (cpu + amdgpu +
vulkan).
@github-actions

Drops the caller-pattern requirement that qd.sym_eig / qd.make_spd be
called from inside a top-level `for _tid in range(N):` in the
@qd.kernel. The sweep loop in _funcs_sym_eig_general.sym_eig_general
now opens with `qd.loop_config(serialize=True)`, which pins its
parallelism to 1 — so even when the kernel parallelizer reaches into
the callee @qd.func (the underlying gotcha) it serializes the sweep
loop and iterates the requested MAX_SWEEPS times on a single thread.

Tests (test_eig.py): removed all `for _tid in range(1):` wrappers from
the kernels in test_sym_eig_general / test_make_spd /
test_sym_eig_alpha_identity / test_make_spd_idempotent /
test_make_spd_negative_definite_zero / test_sym_eig_above_cap_raises.
Each test now calls `qd.sym_eig(...)` or `qd.make_spd(...)` as a plain
single-element kernel body.

Verified:
  - amd (CPU + AMDGPU + Vulkan): full test_eig.py 211/211 pass in 18:22.
  - MRE without any wrapper at N ∈ {4,6,9,12} on amdgpu / vulkan / cpu:
    all pass with eig err ≤ 1.7e-14, orth err ≤ 5e-15 (f64).
  - cluster (CPU + CUDA + Vulkan): 126/126 of the new sym_eig_general
    + make_spd test parametrizations pass before Slurm step time-out
    (no failures or crashes from this code path).

Docs (linalg_per_thread.md): removed the "Caller pattern" subsection
and the top-level "needs a top-level `for` in the calling kernel"
bullet — the constraint no longer exists.
@github-actions

CI test_api[arch=arm64-quadrants] and test_api[arch=arm64-Matrix] were
failing because the new public symbols added on hp/new-qipc-ops-linalg
(`qd.make_spd` module-level entry point and the
`Matrix.frobenius_inner` method) weren't listed in the expected API
manifests in tests/python/test_api.py.

Added:
  - "make_spd" in user_api[qd] (alphabetical, between "loop_config" and
    "math").
  - "frobenius_inner" in _get_expected_matrix_apis() (between "fill"
    and "identity").

Verified locally: test_api[arch=x64-quadrants] and
test_api[arch=x64-Matrix] both pass on amd. The other test_api
parametrizations failing locally are pre-existing API drift between
quadrants==0.7.6 wheel and main; unrelated to this PR.
Reflowed AI-default ~75-80c wraps to the project's 120c budget on the
new docstrings touched by this PR:
  - sym_eig (`_funcs.py`): closed-form/cyclic-Jacobi dispatch note,
    plus dropped a stale `.. note::` block describing the old
    "needs `for _tid in range(...)` wrap" caller-pattern requirement
    (no longer true after loop_config(serialize=True)).
  - make_spd (`_funcs.py` and `_funcs_sym_eig_general.py`): both
    docstrings.
  - sym_eig_general (`_funcs_sym_eig_general.py`): module docstring
    + Returns block.
  - Matrix.frobenius_inner docstring (`lang/matrix.py`).
  - _inverse_lu docstring (`lang/matrix_ops.py`).

After: all touched runs sit in the 104-115c range, well within 120c
and well clear of the ~80c AI-default that the CI line-wrapping
checker flags.
Black reformat from `pre-commit run -a`:
  - test_linalg.py: split a multi-arg `np.testing.assert_allclose(...)`
    onto separate lines.
  - test_svd.py: collapsed a few oversplit f-strings / `print(...)`
    calls back onto single lines (now under 120c).

No behavioural change.
CI was timing out on test_make_spd_idempotent_f64[arch=cuda-*-12]
(~10 min/test budget). The test was defining two @qd.kernel closures
each with its own qd.make_spd(...) call — each closure JIT-compiles
the cyclic-Jacobi + reconstruct path independently, so N=12 on CUDA
hit 2× single-kernel compile time and exceeded the timeout.

Refactor to a single parametric kernel taking ndarray args:

  @qd.kernel
  def project(src: NDArray[mat_t, 1], dst: NDArray[mat_t, 1]):
      dst[0] = qd.make_spd(src[0], dt)

  project(A, A_spd_1)
  project(A_spd_1, A_spd_2)

Now qd.make_spd is JIT-compiled exactly once per test and called
twice with different ndarray bindings; the second call is a pure
launch with no recompile.

Verified locally: 12/12 (4 sizes × 3 factories) pass on amd
(amdgpu+vulkan+cpu) in 2:18.
@github-actions

Aligns sort behavior with NumPy / LAPACK conventions inside each op:

- `qd.sym_eig` 2×2 now returns eigenvalues ascending (matches >=3×3 paths
  and `np.linalg.eigh`). Trivial fix — emit `(lambda_lo, lambda_hi)`
  instead of the high/low pair, with eigenvectors swapped to match.
- `qd.svd` 3×3 now sorts singular values descending (matches 2×2 path and
  `np.linalg.svd`). Adds a 3-element selection sort over `sig_v` with
  paired column swaps in U, V; tracks swap parity and negates column 0
  of U and V at the end if odd, restoring `det(U) = det(V) = +1`
  (Sifakis's invariant). Reconstruction is preserved across all swaps
  and the column-0 sign fix-up.

Cross-op disagreement (sym_eig ascending vs. svd descending) is the
LAPACK / NumPy convention and is left as-is.
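The sort + parity fix-up can be sketched in numpy (illustrative; the quadrants version operates on a 3×3 in registers):

```python
import numpy as np

def sort_svd_descending(U, s, V):
    # Selection-sort sigma descending with paired column swaps in U and V.
    # Each swap flips det(U) and det(V); if the permutation is odd, negate
    # column 0 of both to restore det(U) = det(V) = +1. Both the swaps and
    # the paired sign flip leave U @ diag(s) @ V.T unchanged.
    U, s, V = U.copy(), s.copy(), V.copy()
    parity = 0
    for i in range(len(s) - 1):
        j = i + np.argmax(s[i:])
        if j != i:
            s[[i, j]] = s[[j, i]]
            U[:, [i, j]] = U[:, [j, i]]
            V[:, [i, j]] = V[:, [j, i]]
            parity ^= 1
    if parity:
        U[:, 0] *= -1.0
        V[:, 0] *= -1.0
    return U, s, V

# e.g. sigma arriving unsorted from the decomposition:
U, s, V = sort_svd_descending(np.eye(3), np.array([1.0, 3.0, 2.0]), np.eye(3))
# s is now descending; reconstruction and det(U) = det(V) = +1 are preserved
```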

New tests, parametrized over `qd.gpu` and shapes:

- `test_sym_eig_sort_order_{f32,f64}` for n in {2, 3, 4, 6, 9, 12}.
- `test_svd_sort_order_{f32,f64}` for n in {2, 3}, including a
  hand-picked 3×3 with σ that arrives unsorted from Sifakis.

Skips `vulkan + n=3` for sym_eig: the closed-form `_sym_eig3x3` path
(Eigen3 `computeDirect`) segfaults during SPIR-V codegen on the
cluster's Vulkan stack. Same code runs cleanly on amddesktop's Vulkan,
so it's a pre-existing driver / SDK quirk, not a regression.

Verified: 103 passed cluster CUDA, 23 passed cluster CUDA+Vulkan
sort_order (1 skip), 146 passed amddesktop AMDGPU+Vulkan (1 skip).

Doc: `linalg_per_thread.md` drops both sort-consistency FIXMEs, states
the per-op sort directions explicitly, and notes the 3×3
det(U)=det(V)=+1 enforcement (useful for ARAP-style rotations).
@github-actions

Reflows comment blocks and docstrings that were wrapped at the AI default
~75-80c instead of the project's 120c target, flagged by the line-wrapping
CI check. Touches comments only — no behavior change.
@github-actions

Tightens 2-line docstrings whose first line was wrapped at ~76-83c
instead of using the 120c budget more evenly. Found by the line-wrapping
CI on the previous push.

Contributor

@MuGdxy left a comment


n * m > 32 warning contradicts the new 12×12 support

python/quadrants/lang/matrix.py L305–316 emits a UserWarning when constructing any matrix with more than 32 entries:

if self.n * self.m > 32:
    warning(
        f"Quadrants matrices/vectors with {self.n}x{self.m} > 32 entries are not suggested."
        " Matrices/vectors will be automatically unrolled at compile-time for performance."
        " So the compilation time could be extremely long if the matrix size is too big."
        ...
    )

The new APIs in this PR (qd.sym_eig N≤12, qd.make_spd, Matrix.inverse() N≤12) internally construct matrices via Matrix.zero(dt, N, N), _filled_matrix, and Matrix(...), which hits this check. For N≥6 (6×6 = 36 > 32), users will see a "not suggested" warning when calling officially supported APIs — contradicting the PR's intent.

Suggestions:

  1. Suppress the warning on internal call paths (e.g. via warnings.filterwarnings or an internal _no_size_warning flag); or
  2. Raise the threshold from 32 to match the new caps (e.g. 144 = 12×12); or
  3. At minimum, document that users can safely ignore this warning for the new APIs.

Not a blocker, but it hurts UX — especially qd.make_spd which constructs multiple large matrices internally and may emit the warning several times per call.

@hughperkins hughperkins changed the title [Math] New single-threaded linalg ops for QIPC [Math] New QIPC ops for single-threaded linalg May 12, 2026
Addresses MuGdxy's review on PR #683: the n*m > 32 UserWarning fires
during JIT trace of officially-supported APIs that internally
construct ≥6×6 register-resident matrices (`qd.sym_eig` N≤12,
`qd.make_spd`, `Matrix.inverse` N≤12), contradicting the PR's intent
to support those sizes.

144 = 12×12 is the new natural cap: it matches the largest size every
officially-supported per-thread linalg op accepts, so internal
constructions stay below the warning threshold while user code that
builds a register-resident matrix beyond that still gets the
"consider qd.field instead" advice.

Doc updated in `matrix_vector.md` (intro line + "Size limit"
section).
@hughperkins
Collaborator Author

Re: MuGdxy's review above ("n * m > 32 warning contradicts the new 12×12 support"):

Addressed by Opus:

Threshold raised from 32 → 144 in lang/matrix.py; warning text updated to match;
matrix_vector.md intro and "Size limit" sections updated to 144 (with a note that it lines up with the per-thread linalg APIs' 12×12 cap).

