Conversation
Agent-Logs-Url: https://github.com/Quafadas/vecxt/sessions/7d847bd1-604b-4b29-a265-f3cb0abf602d Co-authored-by: Quafadas <24899792+Quafadas@users.noreply.github.com>
|
@copilot I've updated the design doc to roll back the use of specialised. Can you review your implementation to check consistency with that new principle? |
…imitives Agent-Logs-Url: https://github.com/Quafadas/vecxt/sessions/9b13c6c2-31a2-45c9-a1e2-aa70b9a9f636 Co-authored-by: Quafadas <24899792+Quafadas@users.noreply.github.com>
Done in bc60626. Removed all `@specialized` usage. |
|
@copilot Write out a detailed implementation plan for Milestone 2. |
…e-type-factories' into copilot/ndarray-core-type-factories
Here is the detailed implementation plan for Milestone 2.

## Milestone 2: Indexing + Views — Detailed Plan

Goal: Read/write individual elements, and create views of an NDArray without copying data. All code lives in `vecxt/src/`.

### Files
1. Element read —
|
| Group | Tests |
|---|---|
| Element read | 1D, 2D, 3D correct value; col-major layout reads correct element; strided/offset view reads correctly |
| Element write | 1D, 2D mutation; mutation through a view is visible on the original data |
| Slice | Correct shape/offset/strides; data aliasing (mutation visible); slice of slice |
| Transpose 2D | Shape/strides swapped; T.T identity; element equivalence |
| Transpose N-D | Arbitrary permutation; invalid permutation rejected |
| Reshape | Contiguous → new shape, correct strides; non-contiguous triggers copy; wrong numel rejected |
| Squeeze | All-1 dims removed; single dim removed; non-1 dim rejected |
| Unsqueeze | Shape gains new size-1 dim; unsqueeze.squeeze round-trip |
| Flatten | Contiguous returns view (same data reference); non-contiguous returns fresh array |
| toArray | Dense col-major returns clone; strided view materialises correctly |
| Bounds check | Out-of-range index throws; wrong rank throws; `DoBoundsCheck.no` skips checks |
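The aliasing rows in the table above hinge on one fact: a view is just the same backing array plus its own offset and strides. A minimal standalone sketch of the behaviour those tests pin down (the `View` class here is hypothetical, not vecxt's API):

```scala
// A toy strided view: it holds the SAME underlying array plus its own
// offset/strides, so writes through the view are visible in the flat data
// and through every other view of that data.
final case class View(data: Array[Double], offset: Int, strides: Array[Int], shape: Array[Int]):
  private def lin(ix: Int*): Int =
    var p = offset
    var k = 0
    while k < ix.length do { p += ix(k) * strides(k); k += 1 }
    p
  def apply(ix: Int*): Double = data(lin(ix*))
  def update(i: Int, j: Int, v: Double): Unit = data(lin(i, j)) = v

@main def aliasingDemo(): Unit =
  // 2x3 col-major matrix over a flat backing array
  val backing = Array(1.0, 2.0, 3.0, 4.0, 5.0, 6.0)
  val m = View(backing, offset = 0, strides = Array(1, 2), shape = Array(2, 3))
  // a "slice" of columns 1..2: same data, shifted offset
  val s = View(backing, offset = 2, strides = Array(1, 2), shape = Array(2, 2))
  s(0, 0) = 99.0             // write through the view...
  assert(backing(2) == 99.0) // ...is visible in the original flat data
  assert(m(0, 1) == 99.0)    // and through any other view of it
```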
### Key invariants throughout

- View operations (`slice`, `transpose`, `squeeze`, `unsqueeze`) always share `arr.data` — mutation through one is visible through the other (NumPy semantics, documented).
- Copy operations (`toArray`, `flatten` on non-contiguous, `reshape` on non-contiguous) return a fresh `Array`.
- Stride formula everywhere: `offset + Σ indices(k) * strides(k)`.
- All entry points respect `BoundsCheck` — inline, erasable at call site with `DoBoundsCheck.no`.
- All code in `vecxt/src/` (cross-platform shared only — no JVM/JS/Native forks in M2).
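The stride formula in the invariants list can be demonstrated standalone; a small sketch (the helper name `linearIndex` is illustrative, not vecxt's API):

```scala
// The single addressing rule: linearIndex = offset + Σ indices(k) * strides(k).
def linearIndex(offset: Int, indices: Array[Int], strides: Array[Int]): Int =
  var p = offset
  var k = 0
  while k < indices.length do { p += indices(k) * strides(k); k += 1 }
  p

@main def strideDemo(): Unit =
  // col-major strides for shape [2,3,4] are [1, 2, 6];
  // element (1, 2, 3) sits at 1*1 + 2*2 + 3*6 = 23
  assert(linearIndex(0, Array(1, 2, 3), Array(1, 2, 6)) == 23)
  // a transpose is just permuted shape/strides: element (3, 2, 1) of the
  // reversed-axis view addresses the same flat slot
  assert(linearIndex(0, Array(3, 2, 1), Array(6, 2, 1)) == 23)
```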
|
@copilot implement it |
|
@copilot continue with the implementation. Implement milestone 2 in code, with tests etc as per the plan you've written out above. |
…en, toArray Agent-Logs-Url: https://github.com/Quafadas/vecxt/sessions/12022e30-8b23-45cf-9d8f-c6ccf26801ca Co-authored-by: Quafadas <24899792+Quafadas@users.noreply.github.com>
Done in e17a842. Implemented all of Milestone 2: new files `vecxt/src/ndarrayOps.scala` and `vecxt/src/NDArrayCheck.scala`.
All 140 tests pass. |
|
@copilot I've pushed an update to the design doc which changes the approach to broadcasting. We want it to be explicit rather than implicit. Please check that diff. Once done, implement milestone 3 in accordance with the plan below.

## Milestone 3 — Detailed Design: Element-wise Operations (Double)

### Overview

Milestone 3 introduces arithmetic, comparison, unary, and in-place operations on `NDArray[Double]`. The operations do not require M4 (reductions) or M5 (Matrix bridge). The only prerequisite is Milestone 2 (indexing + views).

### Design principles for this milestone

- Broadcasting is explicit, never implicit: callers align shapes with `broadcastTo`/`broadcastPair` before arithmetic, so every broadcast site is visible in code.

### Broadcasting algorithm

Broadcasting is the mechanism behind `broadcastTo`. Given a source shape and a target shape:

Step 1 — Shape alignment. Right-align the shapes, padding the shorter one with ones on the left.

Step 2 — Dimension compatibility. For each aligned dimension pair, the sizes must be equal or one of them must be 1; otherwise the shapes are incompatible.

Step 3 — Broadcast strides. For each input, compute effective strides in the output rank: prepended dimensions and original dimensions of size 1 get stride 0; all other dimensions keep their original stride.

A stride of 0 means "always read the same element along this axis" — no copy, no expansion.

### Implementation utilities (package `vecxt`)
```scala
import vecxt.ndarray.NDArray

object broadcast:

  /** Compute the output shape for broadcasting two shapes. Throws BroadcastException on incompatibility. */
  def broadcastShape(a: Array[Int], b: Array[Int]): Array[Int] = ...

  /** Compute broadcast-extended strides for `arr` into `outShape`.
    * Pads with 0 for prepended dimensions; sets 0 for original dimensions of size 1.
    */
  def broadcastStrides(arr: NDArray[?], outShape: Array[Int]): Array[Int] = ...

  /** True if two shapes are identical (no broadcasting needed). */
  def sameShape(a: Array[Int], b: Array[Int]): Boolean = ...

extension [A](arr: NDArray[A])
  /** Return a zero-copy view of this NDArray broadcast to `targetShape`.
    * Dimensions of size 1 are expanded via stride-0; prepended dimensions get stride 0.
    * Throws BroadcastException if shapes are incompatible.
    */
  def broadcastTo(targetShape: Array[Int]): NDArray[A] = ...

/** Broadcast both operands to their common shape. Convenience for explicit broadcasting.
  * Returns (a', b') where both have shape == broadcastShape(a.shape, b.shape).
  */
def broadcastPair[A](a: NDArray[A], b: NDArray[A]): (NDArray[A], NDArray[A]) =
  val outShape = broadcast.broadcastShape(a.shape, b.shape)
  (a.broadcastTo(outShape), b.broadcastTo(outShape))
```
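The helpers above are shown as signatures only; as a standalone sketch of what `broadcastShape` and `broadcastStrides` compute, here is a plain-`Array[Int]` version (the stride argument is passed explicitly instead of being read from an `NDArray`):

```scala
// Right-align shapes, pad with 1s on the left; dims must match or be 1.
def broadcastShape(a: Array[Int], b: Array[Int]): Array[Int] =
  val n = math.max(a.length, b.length)
  Array.tabulate(n) { k =>
    // right-aligned lookup; missing (prepended) dims behave like size 1
    val da = if k < n - a.length then 1 else a(k - (n - a.length))
    val db = if k < n - b.length then 1 else b(k - (n - b.length))
    if da == db || db == 1 then da
    else if da == 1 then db
    else throw new IllegalArgumentException(s"incompatible dims $da vs $db")
  }

// Prepended dimensions and original size-1 dimensions get stride 0;
// all other dimensions keep their original stride.
def broadcastStrides(shape: Array[Int], strides: Array[Int], outShape: Array[Int]): Array[Int] =
  val pad = outShape.length - shape.length
  Array.tabulate(outShape.length) { k =>
    if k < pad then 0                  // prepended dimension: stride 0
    else if shape(k - pad) == 1 then 0 // size-1 dimension expanded: stride 0
    else strides(k - pad)
  }

@main def broadcastDemo(): Unit =
  // [3,1] against [4]: right-align to [3,1] vs [1,4], result [3,4]
  assert(broadcastShape(Array(3, 1), Array(4)).sameElements(Array(3, 4)))
  // col-major strides of a [3,1] array are [1,3]; its size-1 axis reads stride 0
  assert(broadcastStrides(Array(3, 1), Array(1, 3), Array(3, 4)).sameElements(Array(1, 0)))
```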
### N-dimensional iteration kernel

All general-case (non-fast-path) binary ops share one iteration function. This kernel handles arbitrary strides, offsets, and stride-0 broadcast inputs uniformly.

The key insight is that iterating over a column-major output in linear order (0 to numel-1) lets each linear index `j` be decomposed into per-axis coordinates as `(j / cumProd(k)) % outShape(k)`, where the cumulative products are precomputed once. This is O(ndim) per element, which is acceptable for the general path; contiguous same-shape inputs take the flat fast path instead.

Concrete kernel (cross-platform, lives in `vecxt/src/ndarrayDoubleOps.scala`):

```scala
private def binaryOpGeneral(
    a: NDArray[Double],
    b: NDArray[Double],
    outShape: Array[Int],
    aStrides: Array[Int], // broadcast strides for a into outShape
    bStrides: Array[Int], // broadcast strides for b into outShape
    f: (Double, Double) => Double
): NDArray[Double] =
  val n = shapeProduct(outShape)
  val out = new Array[Double](n)
  // Precompute cumulative products for coordinate decomposition
  val ndim = outShape.length
  val cumProd = new Array[Int](ndim)
  cumProd(0) = 1
  var d = 1
  while d < ndim do
    cumProd(d) = cumProd(d - 1) * outShape(d - 1)
    d += 1
  end while
  var j = 0
  while j < n do
    var posA = a.offset
    var posB = b.offset
    var k = 0
    while k < ndim do
      val coord = (j / cumProd(k)) % outShape(k)
      posA += coord * aStrides(k)
      posB += coord * bStrides(k)
      k += 1
    end while
    out(j) = f(a.data(posA), b.data(posB))
    j += 1
  end while
  new NDArray(out, outShape, colMajorStrides(outShape), 0)
end binaryOpGeneral
```

For the common 1D and 2D cases (where the inner coordinate loop is 1 or 2 iterations), the JVM JIT can keep the per-element overhead small.

### File layout
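The kernel's coordinate decomposition can be checked in isolation; a small sketch (the helper name `coords` is illustrative):

```scala
// In col-major linear order, j = Σ coord(k) * cumProd(k), so
// coord(k) = (j / cumProd(k)) % shape(k) recovers each axis index.
def coords(j: Int, shape: Array[Int], cumProd: Array[Int]): Array[Int] =
  Array.tabulate(shape.length)(k => (j / cumProd(k)) % shape(k))

@main def decomposeDemo(): Unit =
  val shape = Array(2, 3, 4)
  val cumProd = Array(1, 2, 6) // running products of shape(0..k-1)
  // round-trip: recompose every linear index from its coordinates
  for j <- 0 until 24 do
    val c = coords(j, shape, cumProd)
    val back = c(0) * cumProd(0) + c(1) * cumProd(1) + c(2) * cumProd(2)
    assert(back == j)
  assert(coords(23, shape, cumProd).sameElements(Array(1, 2, 3)))
```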
No new platform-specific files are needed. The fast path delegates to vecxt's existing flat `Array[Double]` ops.

### API reference

All extension methods live in `vecxt/src/ndarrayDoubleOps.scala`.

#### Binary ops (element-wise, same-shape required)

```scala
extension (a: NDArray[Double])
  def +(b: NDArray[Double]): NDArray[Double]
  def -(b: NDArray[Double]): NDArray[Double]
  def *(b: NDArray[Double]): NDArray[Double]
  def /(b: NDArray[Double]): NDArray[Double]
```

Throws `ShapeMismatchException` when the shapes differ. To operate on differently-shaped arrays, broadcast explicitly first:

```scala
val (a2, b2) = NDArray.broadcastPair(a, b)
val c = a2 + b2
// or:
val c = a + b.broadcastTo(a.shape)
```

#### Scalar ops

```scala
extension (a: NDArray[Double])
  def +(s: Double): NDArray[Double]
  def -(s: Double): NDArray[Double]
  def *(s: Double): NDArray[Double]
  def /(s: Double): NDArray[Double]

extension (s: Double)
  def +(a: NDArray[Double]): NDArray[Double]
  def -(a: NDArray[Double]): NDArray[Double]
  def *(a: NDArray[Double]): NDArray[Double]
  def /(a: NDArray[Double]): NDArray[Double]
```

Scalar ops are conceptually a special case of array-vs-broadcast (scalar as a 0-D array), but are implemented directly on the flat data rather than through the broadcast kernel.

#### Unary ops

```scala
extension (a: NDArray[Double])
  def neg: NDArray[Double]     // element-wise negation
  def abs: NDArray[Double]     // element-wise |x|
  def exp: NDArray[Double]     // element-wise e^x
  def log: NDArray[Double]     // element-wise ln(x)
  def sqrt: NDArray[Double]    // element-wise √x
  def tanh: NDArray[Double]    // element-wise tanh(x)
  def sigmoid: NDArray[Double] // element-wise 1 / (1 + e^{-x})
```

Unary ops are always free of broadcasting complexity. The fast path for contiguous arrays delegates to vecxt's existing flat `Array[Double]` ops.

#### In-place binary ops (mutating)

`+=`, `-=`, `*=`, `/=` mutate the receiver in place, with a col-major fast path and the general stride kernel otherwise.
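A minimal standalone sketch of the two in-place paths, a flat loop for dense col-major data and a strided walk otherwise, shown for scalar `+=` (helper names hypothetical, plain arrays rather than the real NDArray API):

```scala
// Fast path: a dense col-major array is just its flat data, so one loop suffices.
def addInPlaceFlat(data: Array[Double], s: Double): Unit =
  var i = 0
  while i < data.length do
    data(i) += s
    i += 1

// General path (1-D case): walk the view's elements via offset + i * stride.
def addInPlaceStrided(data: Array[Double], offset: Int, stride: Int, n: Int, s: Double): Unit =
  var i = 0
  while i < n do
    data(offset + i * stride) += s
    i += 1

@main def inPlaceDemo(): Unit =
  val a = Array(1.0, 2.0, 3.0, 4.0)
  addInPlaceFlat(a, 10.0)
  assert(a.sameElements(Array(11.0, 12.0, 13.0, 14.0)))
  // mutate every second element, starting at offset 1 (a strided view)
  addInPlaceStrided(a, offset = 1, stride = 2, n = 2, s = 100.0)
  assert(a.sameElements(Array(11.0, 112.0, 13.0, 114.0)))
```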
|
Agent-Logs-Url: https://github.com/Quafadas/vecxt/sessions/297d8ade-9e29-44b5-8626-4d7f180ca09b Co-authored-by: Quafadas <24899792+Quafadas@users.noreply.github.com>
Done in 4be1e34. Milestone 3 implemented:
All 140 JVM tests pass. |
Adds the foundational `NDArray[A]` type to vecxt, a full set of indexing and view operations, and element-wise arithmetic/comparison operations on `NDArray[Double]` — a cross-platform, N-dimensional array with configurable strides, offset, and column-major default layout.

### New files

- `vecxt/src/ndarray.scala` — `NDArray[A]` class + companion
  - `@publicInBinary()` private constructor; no `@specialized` — `Array[A]` for primitives is already unboxed at the JVM level, and `@specialized` in Scala 3 can silently de-specialize with `inline` and extension methods
  - `ndim`, `numel`, `isColMajor`, `isRowMajor`, `isContiguous`, `layout`
  - `apply` (full strides / column-major convenience), `fromArray`, `zeros`, `ones`, `fill`
  - `colMajorStrides`, `shapeProduct`, `mkNDArray` (package-private unchecked constructor for view operations)
- `vecxt/src/NDArrayCheck.scala` — inline bounds checks (erasable via `BoundsCheck`)
  - `strideNDArrayCheck` — rank consistency, positive dims, offset bounds, corner-index range
  - `dimNDArrayCheck` — shape product vs data length
  - `shapeCheck` — non-empty, all-positive shape
  - `indexNDArrayCheck` — rank and per-axis bounds for element access
  - `InvalidNDArray` exception
- `vecxt/src/ndarrayOps.scala` — extension methods on `NDArray[A]` for indexing and views
  - `apply` overloads for 1D/2D/3D/4D (transparent inline, zero allocation) + N-D `Array[Int]` variant
  - `update` overloads matching all `apply` variants (enables `arr(i, j) = value` syntax)
  - `slice(dim, start, end)` — zero-copy view; adjusts offset and shrinks one shape dimension
  - `T` — 2D transpose shorthand (zero-copy, validates ndim=2)
  - `transpose(perm)` — N-D axis permutation with full permutation validation (zero-copy)
  - `reshape(newShape)` — zero-copy view when `isColMajor`; copies via `toArray` otherwise
  - `squeeze` / `squeeze(dim)` — remove all or a specific size-1 dimension (zero-copy)
  - `unsqueeze(dim)` / `expandDims(dim)` — insert a size-1 dimension (zero-copy)
  - `flatten` — 1D view if contiguous; copies to col-major order otherwise
  - `toArray` — fast `data.clone()` when `isColMajor`; col-major odometer iteration otherwise
- `vecxt/src/broadcast.scala` — explicit broadcasting (no implicit broadcast in binary ops)
  - `broadcastTo(targetShape)` — inline zero-copy view with stride-0 expansion for broadcast dimensions
  - `broadcastPair(a, b)` — broadcasts both operands to their common shape
  - `broadcastShape`, `broadcastStrides`, `sameShape` — helpers
  - `BroadcastException`, `ShapeMismatchException` — error types
- `vecxt/src/ndarrayDoubleOps.scala` — element-wise operations on `NDArray[Double]`
  - `+`, `-`, `*`, `/` (same shape required; use `broadcastTo`/`broadcastPair` to align shapes first)
  - `+`, `-`, `*`, `/` (both `ndarray op scalar` and `scalar op ndarray`)
  - `neg`, `abs`, `exp`, `log`, `sqrt`, `tanh`, `sigmoid`
  - `+=`, `-=`, `*=`, `/=` (col-major fast path + general stride kernel), plus scalar `+=`, `-=`, `*=`, `/=`
  - `>`, `<`, `>=`, `<=`, `=:=`, `!:=` (array and scalar variants, return `NDArray[Boolean]`)
  - col-major fast paths delegate to existing flat `Array[Double]` ops; general stride-kernel for non-col-major/broadcast views

### Modified
- `vecxt/src/all.scala` — adds `export vecxt.ndarray.*`, `export vecxt.ndarrayOps.*`, `export vecxt.NDArrayDoubleOps.*`, `export vecxt.broadcast.*`

### Key invariants

- Broadcasting is explicit: callers use `broadcastTo`/`broadcastPair` to align shapes before arithmetic — consistent with vecxt's existing `Array[Double]` ops and making broadcast sites visible in code.
- View operations (`slice`, `transpose`, `squeeze`, `unsqueeze`, `broadcastTo`) always share `arr.data` — mutation through one is visible through the other (NumPy semantics).
- Copy operations (`toArray`, `flatten` on non-contiguous, `reshape` on non-col-major, binary ops on non-col-major) return a fresh col-major `Array`.
- Stride formula everywhere: `offset + Σ indices(k) * strides(k)`.
- All entry points respect `BoundsCheck` — inline, erasable at call site with `DoBoundsCheck.no`.
- All code in `vecxt/src/` (cross-platform shared only — no JVM/JS/Native forks).

### Usage
Strides follow the same column-major convention as `Matrix` (rowStride=1, colStride=rows): for shape `[d₀, d₁, …]`, strides = `[1, d₀, d₀·d₁, …]`.
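The convention above can be sketched as a running product; this mirrors what the companion's `colMajorStrides` is described as computing:

```scala
// Col-major strides for shape [d0, d1, ...] are the running products
// [1, d0, d0*d1, ...]: the same rule Matrix uses (rowStride = 1, colStride = rows).
def colMajorStrides(shape: Array[Int]): Array[Int] =
  val s = new Array[Int](shape.length)
  var acc = 1
  var k = 0
  while k < shape.length do
    s(k) = acc
    acc *= shape(k)
    k += 1
  s

@main def strideConventionDemo(): Unit =
  assert(colMajorStrides(Array(2, 3, 4)).sameElements(Array(1, 2, 6)))
  assert(colMajorStrides(Array(5)).sameElements(Array(1)))
```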