v0.8.0 - Matrix/Vector methods and optimisation
This release adds a vector-oriented Matrix API: six new methods
(vecdot, cross, normalize, perpendicular, angle,
magnitude_squared), two new read-only properties (size, length),
and a unified in_place= keyword on every unary method. Together
they round out Matrix as a first-class vector and
batch-of-vectors type. Internally, every _math.c op family was
refactored onto X-macro templates, which restores the compiler's
auto-vectoriser: 44 of 71 benched rows improved by ≥10%, with
representative wins of −50% to −88% on aggregates, broadcast
arithmetic, and normalize. The _math extension now ships with
-O3 (Linux/macOS) or /O2 (Windows), so end users pick up the
wins by default.
New Features
- Vector-oriented Matrix methods: six new methods designed for
  the Nx2 / 2xN / Nx3 / 3xN vector and batch-of-vectors shapes
  that show up in examples/boids.py and similar simulation code:

  - magnitude_squared(axis=None): squared L2 norm without the
    sqrt step. Cheaper than magnitude() and safe for sub-normal
    thresholding.
  - vecdot(other, axis=None): axis-aware inner product matching
    numpy.linalg.vecdot. Not equivalent to numpy.dot; use @ for
    matrix multiplication. Same-shape, row-broadcast (1xN vs
    MxN), and column-broadcast (Mx1 vs MxN) operands are all
    supported.
  - cross(other, axis=None): 2D scalar z-component or 3D cross
    product. Five shape paths share one method: 1x2 / 2x1
    returns a float; 1x3 / 3x1 returns a same-orientation
    Matrix; Nx2 / 2xN batches collect per-vector scalars;
    Nx3 / 3xN batches return same-shape Matrix results. axis=
    disambiguates the square 2x2 / 3x3 shapes (default per-row).
  - normalize(axis=None, in_place=False): divide every element
    by its magnitude. Zero-magnitude rows / columns are returned
    as exact zeros (no NaN, no division by zero). axis= selects
    per-row, per-column, or total normalisation.
  - perpendicular(axis=None, in_place=False): rotate every 2D
    vector 90° counter-clockwise: (x, y) -> (-y, x). Accepts a
    single 2D vector, an Nx2 row batch, or a 2xN column batch.
  - angle(axis=None): polar angle atan2(y, x) of every 2D
    vector. Returns a float for a single 2D vector input,
    otherwise a Matrix of per-vector angles.
- Matrix.size property: total element count (rows * columns).
  Matches numpy.ndarray.size.
- Matrix.length property: Frobenius (L2) magnitude as a
  read-only @property so vector-like code reads naturally
  (direction.length, velocity.length) without the parentheses of
  a method call. Equivalent to magnitude() with no axis argument.
- in_place= keyword on every unary Matrix method: transpose,
  ceil, floor, round, negate, abs, plus the new normalize and
  perpendicular all accept in_place=True to mutate self and
  return it. Replaces the older transpose_in_place() method (see
  Breaking Changes below).
- axis= keyword on aggregate methods: sum, mean, min, max,
  magnitude, and the new magnitude_squared now share a tri-state
  axis= argument (None / 0 / 1) decoded through a single
  classifier. Negative axes (-1 / -2) are accepted for NumPy
  parity.
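
The 2D semantics listed above can be illustrated in plain Python. This is a minimal sketch, not bocpy's implementation: the helper names (magnitude_squared, normalize, perpendicular, cross2d, angle, normalise_axis) are hypothetical stand-ins that work on plain tuples rather than Matrix objects, and the handling of negative axes assumes NumPy's convention for a 2-D array (-1 means axis 1, -2 means axis 0).

```python
import math


def magnitude_squared(v):
    """Squared L2 norm: skips the sqrt, so threshold checks stay cheap."""
    return sum(c * c for c in v)


def normalize(v):
    """Scale v to unit length; a zero-magnitude vector comes back as exact zeros."""
    mag = math.sqrt(magnitude_squared(v))
    if mag == 0.0:
        return tuple(0.0 for _ in v)  # no NaN, no division by zero
    return tuple(c / mag for c in v)


def perpendicular(v):
    """Rotate a 2D vector 90 degrees counter-clockwise: (x, y) -> (-y, x)."""
    x, y = v
    return (-y, x)


def cross2d(a, b):
    """Scalar z-component of the cross product of two 2D vectors."""
    return a[0] * b[1] - a[1] * b[0]


def angle(v):
    """Polar angle atan2(y, x) of a 2D vector, in radians."""
    x, y = v
    return math.atan2(y, x)


def normalise_axis(axis):
    """Fold None / 0 / 1 / -1 / -2 into the tri-state None / 0 / 1.

    Assumption: negative axes follow NumPy's 2-D convention.
    """
    if axis is None:
        return None
    if axis in (0, -2):
        return 0
    if axis in (1, -1):
        return 1
    raise ValueError(f"axis must be None, 0, 1, -1 or -2, got {axis!r}")


print(normalize((3.0, 4.0)))           # (0.6, 0.8)
print(normalize((0.0, 0.0)))           # (0.0, 0.0)
print(perpendicular((2.0, 3.0)))       # (-3.0, 2.0)
print(cross2d((1.0, 0.0), (0.0, 1.0))) # 1.0
print(normalise_axis(-1))              # 1
```

With the real API the same operations would be method calls on a Matrix, with axis= choosing between per-row, per-column, and whole-matrix behaviour.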
Improvements
- Auto-vectorised _math.c op kernels: the binary, aggregate,
  unary, and two-operand-aggregate op families inside _math.c
  are now stamped from per-family descriptor tables, one kernel
  per (op, shape) combination. Each per-element body is
  literally substituted into its own monomorphic inner loop,
  restoring the precondition for GCC's / Clang's auto-vectoriser.
  Representative wins (lower is better):

  ====================================== ========== ========== ======
  Bench row                              0.7.0 (ns) 0.8.0 (ns) Δ
  ====================================== ========== ========== ======
  mean() shape=(1000, 100)               44179.6    9001.6     −79.6%
  mean(1) shape=(1000, 100)              51699.4    7058.5     −86.3%
  max(1) shape=(1000, 100)               97184.2    11322.7    −88.3%
  magnitude() shape=(1000, 3)            1098.2     306.8      −72.1%
  add col-bcast shape=(1000, 100)        37823.4    20172.5    −46.7%
  div same-shape shape=(1000, 100)       80134.2    45458.9    −43.3%
  normalize() shape=(1000, 3) axis=None  3644.6     1775.5     −51.3%
  ====================================== ========== ========== ======

  Four rows in code paths untouched by the refactor regressed by
  5–15% from layout drift (_math.so .text grew +125% from kernel
  specialisation); none are on a hot path. No behavioural
  change; test_matrix.py passes unchanged.
- -O3 / /O2 on bocpy._math: the math extension now sets
  per-platform extra_compile_args in setup.py (-O3 -fno-plt on
  Linux/macOS, /O2 on Windows) so end-user wheels and editable
  installs both pick up the auto-vectoriser wins above. Other
  bocpy extensions are unaffected. The SBOM hash for _math.*.so
  will drift accordingly; see :doc:sbom for the auditor-facing
  note.
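
The per-platform flag selection described above could be expressed roughly as follows. This is a sketch under stated assumptions, not the project's actual setup.py: the function name math_compile_args is hypothetical, and it assumes every non-Windows platform should receive the Linux/macOS flags.

```python
import sys


def math_compile_args(platform=sys.platform):
    """Return optimisation flags for the _math extension on the given platform."""
    if platform.startswith("win"):
        # MSVC spells its optimisation level /O2.
        return ["/O2"]
    # GCC / Clang on Linux and macOS.
    return ["-O3", "-fno-plt"]


# In a setup.py the result would be passed to setuptools as
# Extension(..., extra_compile_args=math_compile_args()) for the _math
# extension only, leaving the other extensions on their default flags.
print(math_compile_args("linux"))   # ['-O3', '-fno-plt']
print(math_compile_args("darwin"))  # ['-O3', '-fno-plt']
print(math_compile_args("win32"))   # ['/O2']
```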
Breaking Changes
- Matrix.transpose_in_place() removed: superseded by
  Matrix.transpose(in_place=True), which returns self and so
  composes the same way every other unary method does.
  Migration is mechanical: replace m.transpose_in_place() with
  m.transpose(in_place=True).
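
Because the migration is a one-for-one textual substitution, it can be scripted. The sketch below is an illustration, not a tool shipped with this release: the function name migrate_source is hypothetical, and the regex only matches calls written with empty parentheses (optionally containing whitespace).

```python
import re

# Matches ".transpose_in_place()" on any receiver expression.
_OLD_CALL = re.compile(r"\.transpose_in_place\(\s*\)")


def migrate_source(source: str) -> str:
    """Rewrite removed transpose_in_place() calls to transpose(in_place=True)."""
    return _OLD_CALL.sub(".transpose(in_place=True)", source)


before = "m.transpose_in_place()\nresult = (a @ b).transpose_in_place()\n"
print(migrate_source(before))
```

Review the resulting diff before committing, since a purely textual rewrite cannot tell whether the receiver is actually a Matrix.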
Documentation
- New Matrix API entries in :doc:api for size, length,
  magnitude_squared, vecdot, cross, normalize, perpendicular,
  and angle, plus updated in_place= keyword signatures on the
  existing unary methods.
Tests
- 234 new test cases for the new Matrix methods and properties
  (1571 → 1805 passed). Coverage includes a stub-guard test that
  greps __init__.pyi for every new C-level name and in-cown
  coverage exercising each new method inside @when.
- Portable overflow regex + cross 2x3 / 3x2 contract pinning:
  the cross-product test for the doubly-valid 2x3 / 3x2 shapes
  now pins the 2D-batch interpretation explicitly, locking the
  documented behaviour.
Internal
- scripts/bench_matrix.py: bench harness used to gate the
  refactor: --json append mode, --report-median per-row merge,
  200 ms warmup, batch-size auto-tuning.
- scripts/validate_wheel.py + scripts/_vendored_warehouse_wheel.py:
  stdlib-only wheel RECORD validator and a vendored slice of
  Warehouse's wheel parser; used by the PR gate to catch RECORD
  regressions before PyPI does.
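
For readers unfamiliar with RECORD checks, the core idea can be sketched with the standard library alone. This is not the code in scripts/validate_wheel.py: the function names record_hash and check_record are hypothetical, the wheel contents are modelled as an in-memory dict of path to bytes, and only sha256 entries are handled. It relies on the standard wheel RECORD layout of path, hash, and size, where the hash is an unpadded URL-safe base64 digest.

```python
import base64
import csv
import hashlib
import io


def record_hash(data: bytes) -> str:
    """Hash a file's bytes the way a wheel RECORD entry spells it."""
    digest = hashlib.sha256(data).digest()
    encoded = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return "sha256=" + encoded


def check_record(record_text: str, files: dict) -> list:
    """Return a list of human-readable problems; empty means consistent."""
    problems = []
    listed = set()
    for row in csv.reader(io.StringIO(record_text)):
        if len(row) != 3:
            problems.append(f"malformed row: {row!r}")
            continue
        path, hash_, size = row
        listed.add(path)
        if path not in files:
            problems.append(f"{path}: listed in RECORD but not in the wheel")
            continue
        if not hash_ and not size:
            continue  # RECORD lists itself without a hash or size
        data = files[path]
        if hash_ != record_hash(data):
            problems.append(f"{path}: hash mismatch")
        if size != str(len(data)):
            problems.append(f"{path}: size mismatch")
    for path in sorted(files.keys() - listed):
        problems.append(f"{path}: in the wheel but missing from RECORD")
    return problems


payload = b"x = 1\n"
files = {"pkg/__init__.py": payload, "pkg-1.0.dist-info/RECORD": b""}
record_text = (
    f"pkg/__init__.py,{record_hash(payload)},{len(payload)}\n"
    "pkg-1.0.dist-info/RECORD,,\n"
)
print(check_record(record_text, files))  # []
```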
CI / build
- cibuildwheel v3.4.0 → v3.4.1 and clang-format-action pin
  normalised to the underlying commit SHA (Dependabot's
  preferred format). Both pins move in lock-step with the
  github-actions Dependabot group.
- idna 3.16 → 3.17 in ci/constraints-docs.txt. Five other
  Dependabot proposals (docutils 0.23, ruamel-yaml 0.19,
  sphinx-tabs 3.4.7+, sphinx-toolbox 4.2, and standard-imghdr
  3.13) require Python ≥3.11 and so cannot enter a universal
  lock that still includes Python 3.10; a comment above
  requires-python = ">=3.10" in pyproject.toml lists them for
  the post-3.10-EOL bump.
- flake8 extend-exclude for .copilot/, build/, sphinx/build/,
  and the scratch .env* venvs so the walker no longer trips on
  generated or vendored Python files.