fix: matrix_power exponent gradient — clamp subgradient + Python-scalar support by bruAristimunha · Pull Request #29 · spdlearn/spd_learn

bruAristimunha · 2026-05-17T12:38:31Z

Context

Follow-up to #26, which added a gradient w.r.t. the exponent parameter in matrix_power.backward. Two issues remained after that merge.

Bug 1 — spurious huge exponent gradient at clamped eigenvalues

For numerical stability, eigenvalues below get_epsilon(dtype, "eigval_power") (~1e-13 for double) are clamped to that threshold. The X-gradient correctly picks subgradient 0 there:

# core.py:412-413 (existing)
s_deriv = exponent * s_clamped.pow(exponent - 1.0)
s_deriv[s <= threshold] = 0   # subgradient 0 at clamped eigenvalues

But the new exponent-gradient kept flowing through s_modified * log(s_safe) = threshold^e * log(threshold). For s = 1e-30, e = -0.5, this produces grad_exponent ≈ -6.18e+07 instead of the expected ~-0.49.

As a result, any training run that passes near-singular SPD matrices through matrix_power can blow up.
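The arithmetic behind the unmasked term can be checked with plain Python. This is only a sketch: the threshold value (1e3 × float64 machine epsilon, which is of order 1e-13) is an assumption standing in for whatever get_epsilon(dtype, "eigval_power") returns, and the upstream gradient is taken to be 1.

```python
import math
import sys

# Assumed threshold of order 1e-13; the real value comes from get_epsilon(...).
threshold = 1e3 * sys.float_info.epsilon
e = -0.5      # exponent from the example above
s = 1e-30     # eigenvalue far below the threshold, so it gets clamped

# Term the exponent gradient picked up before the fix:
# threshold**e * log(threshold), tens of millions in magnitude.
unmasked = threshold ** e * math.log(threshold)

# Subgradient-0 convention: a clamped eigenvalue contributes nothing.
masked = unmasked if s > threshold else 0.0

print(f"unmasked: {unmasked:.3e}, masked: {masked}")
```

Under this assumed threshold the unmasked term has the same order of magnitude as the value reported above, and masking removes it entirely.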

Bug 2 — Python-scalar exponent crashes `backward`

Several internal call sites pass a Python float:

# modules/liebn.py:289
matrix_power.apply(X, self.theta)   # self.theta is a Python float

PR #26's backward calls .reshape_as(exponent), which raises TypeError: reshape_as(): argument 'other' must be Tensor, not float. Reproduced on main.
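The failure mode can be reproduced outside the library. A minimal sketch, where `grad` is a hypothetical stand-in for the reduced exponent gradient and is not taken from core.py:

```python
import torch

grad = torch.tensor(1.5)  # hypothetical reduced exponent gradient

# Passing a Python float to reshape_as is rejected by the argument parser.
try:
    grad.reshape_as(0.5)
    raised = False
except TypeError:
    raised = True

# Normalising the exponent to a 0-d tensor first makes the same call valid.
exponent = torch.as_tensor(0.5, dtype=grad.dtype, device=grad.device)
reshaped = grad.reshape_as(exponent)

print(raised, reshaped.shape)
```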

Fix

forward: cast exponent to a 0-d tensor on X's device/dtype via torch.as_tensor so the .reshape_as call always sees a tensor.
backward: apply the same subgradient-0 convention to exp_g via torch.where(s > threshold, exp_g, 0.0).
Skip the exponent-gradient computation (including the n×n matmul U.mT @ grad_output @ U) when ctx.needs_input_grad[1] is False — saves work on the common path where a Python scalar is passed.
Docstring updated: exponent : float or torch.Tensor.
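The three changes can be illustrated on a simplified, element-wise analogue of the eigenvalue step. This is a sketch under stated assumptions, not the code in core.py: `ClampedPow` is a hypothetical name, it works on a vector of eigenvalues rather than a full SPD matrix (so there is no eigendecomposition or U.mT @ grad_output @ U), and the threshold value is assumed.

```python
import math
import torch


class ClampedPow(torch.autograd.Function):
    """Hypothetical element-wise analogue: clamp(s, threshold) ** exponent."""

    @staticmethod
    def forward(ctx, s, exponent, threshold):
        # Accept a Python scalar: normalise to a 0-d tensor on s's device/dtype.
        exponent = torch.as_tensor(exponent, dtype=s.dtype, device=s.device)
        s_clamped = s.clamp(min=threshold)
        out = s_clamped.pow(exponent)
        ctx.save_for_backward(s, s_clamped, exponent, out)
        ctx.threshold = threshold
        return out

    @staticmethod
    def backward(ctx, grad_output):
        s, s_clamped, exponent, out = ctx.saved_tensors
        active = s > ctx.threshold          # False where the value was clamped
        zero = torch.zeros_like(out)
        grad_s = grad_exponent = None
        if ctx.needs_input_grad[0]:
            s_deriv = exponent * s_clamped.pow(exponent - 1.0)
            grad_s = grad_output * torch.where(active, s_deriv, zero)
        if ctx.needs_input_grad[1]:         # skipped for Python-scalar exponents
            exp_g = torch.where(active, out * s_clamped.log(), zero)
            grad_exponent = (grad_output * exp_g).sum().reshape_as(exponent)
        return grad_s, grad_exponent, None


threshold = 1e3 * torch.finfo(torch.float64).eps   # assumed, of order 1e-13

# Tensor exponent: only the unclamped eigenvalue (2.0) contributes.
s = torch.tensor([1e-30, 2.0], dtype=torch.float64, requires_grad=True)
e = torch.tensor(-0.5, dtype=torch.float64, requires_grad=True)
ClampedPow.apply(s, e, threshold).sum().backward()
expected = 2.0 ** -0.5 * math.log(2.0)

# Python-scalar exponent: backward runs and returns no exponent gradient.
s2 = torch.tensor([1e-30, 2.0], dtype=torch.float64, requires_grad=True)
ClampedPow.apply(s2, -0.5, threshold).sum().backward()
```

With the mask in place the exponent gradient stays bounded on the near-singular input, and the clamped entry receives a zero gradient for both arguments.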

Net diff: +25 / −13 lines.

Tests

tests/test_functional.py::test_matrix_power now covers:

existing gradcheck with 0-d tensor exponent (unchanged)
backward with a Python-scalar exponent doesn't crash
exponent gradient stays bounded (< 1) on a near-singular input where the bug previously produced ~6e7

Full suite: 707 passed, 146 skipped (no regressions).

Authors

@bruAristimunha

…alues, accept Python scalars

PR #26 added a gradient w.r.t. the exponent in matrix_power.backward, but two issues remained:

1. With near-singular inputs, the X-gradient correctly picks subgradient 0 at clamped eigenvalues (s_deriv[s <= threshold] = 0 in `derivative`), but the new exponent-gradient kept flowing through `s_modified * log(s_safe) = threshold^e * log(threshold)`, producing huge spurious values (e.g. ~6e7 for s=1e-30, e=-0.5). Apply the same subgradient-zero convention.
2. Passing a Python scalar (as several call sites do, e.g. `matrix_power.apply(X, self.theta)` in modules/liebn.py) crashed backward at `.reshape_as(exponent)`. Normalize the exponent to a 0-d tensor in forward and return None for non-Variable inputs.

The exponent-gradient block also runs only when needed (skips the n×n matmul `U.mT @ grad_output @ U` on the common no-grad-on-exponent path). Regression tests cover both cases.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 22fda2bbbb


github-actions · 2026-05-18T15:40:17Z

📚 Documentation Preview

📦 Download Documentation Artifact

Download the documentation-html artifact from the workflow run to view the docs locally.


chatgpt-codex-connector Bot reviewed May 17, 2026


Comment thread spd_learn/functional/core.py

pre-commit

6f4671f

bruAristimunha merged commit a40daf6 into main May 18, 2026
11 checks passed

bruAristimunha deleted the fix/matrix-power-exponent-grad-clamp branch May 18, 2026 15:41

bruAristimunha merged 2 commits into main from fix/matrix-power-exponent-grad-clamp
