v1.4.0

@imezx imezx released this 01 Dec 06:11
· 17 commits to main since this release

# v1.4.0-rc5

# What's new?

  • (Experimental) KAN (Kolmogorov-Arnold Network) at Gradien.Experimental.NN.
  • (Added) Tokenizers (inspired by HuggingFace) at Gradien.Tokenizer.
  • (Added) GroupNorm layer at Gradien.NN.
  • (Added) Adam optimizer now supports L2 weight decay (applied to gradients) at Gradien.Optim.
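The Adam weight-decay item above says the decay is applied to gradients, which reads as classic L2 regularization rather than AdamW-style decoupled decay. Below is a minimal sketch of that idea in illustrative Python. It is not Gradien's Luau source, and the function name, argument names, and defaults are assumptions made for the example.

```python
import math

def adam_step(params, grads, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
              eps=1e-8, weight_decay=0.0):
    """One Adam update over flat lists of floats; m and v are updated in place.

    weight_decay is classic L2 regularization: it is added to the gradient
    before the moment estimates, unlike AdamW's decoupled decay.
    (Hypothetical helper for illustration, not a Gradien API.)
    """
    new_params = []
    for i, (p, g) in enumerate(zip(params, grads)):
        g = g + weight_decay * p            # L2 term folded into the gradient
        m[i] = beta1 * m[i] + (1.0 - beta1) * g
        v[i] = beta2 * v[i] + (1.0 - beta2) * g * g
        m_hat = m[i] / (1.0 - beta1 ** t)   # bias correction, t starts at 1
        v_hat = v[i] / (1.0 - beta2 ** t)
        new_params.append(p - lr * m_hat / (math.sqrt(v_hat) + eps))
    return new_params
```

Because the decay term passes through the moment estimates, a parameter with a zero loss gradient still shrinks toward zero whenever `weight_decay` is non-zero.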

# What's changed?

  • (Changes) Several minor changes & improvements.
    • Linear now handles nil values more safely and no longer raises a stack error.
    • Other minor changes to strings.
  • (BugFix) Fixed a mismatch in Flatten.

# v1.4.0-rc4

# What's new?

  • (Experimental) Hierarchical RL (Feudal Networks) at Gradien.Experimental.RL.
  • (Added) ConvTranspose2d at Gradien.NN.
  • (Added) ConvTranspose2d (Ops) at Gradien.Ops.

# What's changed?

  • (Experimental) Optimized Mish activation by adding a threshold.
  • (Changed) A few fixes and improvements to Visual3D.
  • (Changed) Optimizations.
    • Now uses native operations.
    • (Experimental) Cached global math functions.
    • (Experimental) Some ops are now precalculated.
  • (Changed) Updated docs.
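For the Mish threshold item above, one common way such a threshold works is to return the input unchanged once it is large enough that `tanh(softplus(x))` is indistinguishable from 1. The sketch below is illustrative Python under that assumption. The cutoff value of 20 is a guess, and the release notes do not say which threshold Gradien uses.

```python
import math

def mish(x, threshold=20.0):
    """Mish activation x * tanh(softplus(x)) with a shortcut for large inputs.

    The threshold value is an assumption made for this sketch.
    """
    if x > threshold:
        # softplus(x) ~ x and tanh(x) ~ 1 here, so skip exp/log entirely.
        return x
    return x * math.tanh(math.log1p(math.exp(x)))
```

Besides saving the `exp`, `log`, and `tanh` calls, the shortcut also avoids overflow in `exp` for very large inputs.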

# v1.4.0-rc3

# What's new?

  • (Experimental) State Space Models (SSMs) / Mamba at Gradien.NN & Gradien.Ops.
  • (Added) Sophia Optimizer (2nd Order Approximation) at Gradien.Optim.
  • (Added) is_contiguous, contiguous & expand Tensor methods.

# What's changed?

  • (Changed) Optimized BLAS Matrix Multiplication
    • Switched from the naive IJK loop order to IKJ (often 5-10x faster than IJK for large matrices).
  • (Changed) Optimized Pooling
    • Removed redundant index recalculations and duplicate assignments inside the innermost loop of MaxPool2d.
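To illustrate the IJK to IKJ change mentioned above, here is a small sketch in illustrative Python over row-major flat arrays. It is not Gradien's BLAS code, and the function name and argument layout are assumptions. The point is only the loop order: with `j` innermost, both `b` and `c` are read and written contiguously.

```python
def matmul_ikj(a, b, n, k, m):
    """Multiply row-major flat arrays a (n x k) and b (k x m) using IKJ order."""
    c = [0.0] * (n * m)
    for i in range(n):
        for p in range(k):
            a_ip = a[i * k + p]          # loaded once, reused across the row
            b_row = p * m
            c_row = i * m
            for j in range(m):
                # b and c are both walked contiguously in the inner loop
                c[c_row + j] += a_ip * b[b_row + j]
    return c
```

In the naive IJK order the inner loop strides down a column of `b`, which is the access pattern that tends to hurt cache behavior on large matrices.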

# v1.4.0-rc2

# What's new?


# What's changed?

  • (Changes) Several small fixes & improvements
    • Fixed incorrect dumping and loading of State for the optimizer & trainer.
    • Corrected Softmax backward pass to use the efficient Jacobian-vector product.
    • Added strict shape assertions for Math operations.
    • Softmax and Activations now reuse internal buffers to reduce memory allocation during training loops.
    • Fused Kernels: NN.Linear now combines matrix multiplication and bias addition into a single parallel block, reducing thread synchronization overhead significantly.
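The Softmax backward item above can be sketched as follows. For softmax output `y` and upstream gradient `g`, the gradient with respect to the logits is `y_i * (g_i - sum_j g_j * y_j)`, which avoids building the full Jacobian. The softmax Jacobian is symmetric, so the Jacobian-vector and vector-Jacobian products coincide. This is illustrative Python with hypothetical function names, not Gradien's implementation.

```python
import math

def softmax(logits):
    """Max-shifted softmax over a list of floats."""
    shift = max(logits)
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_backward(y, grad_out):
    """Gradient w.r.t. the logits given softmax output y and upstream grad.

    Uses dx_i = y_i * (g_i - sum_j g_j * y_j), which is O(n) and never
    materializes the n x n Jacobian diag(y) - y y^T.
    """
    dot = sum(g * yi for g, yi in zip(grad_out, y))
    return [yi * (g - dot) for g, yi in zip(grad_out, y)]
```

The resulting gradient always sums to zero, which reflects the fact that adding a constant to every logit leaves the softmax output unchanged.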

# v1.4.0-rc1

# What's new?

  • (Experimental) Quantum-Inspired Metaheuristic Neural Network (QIMHNN) at Gradien.Experimental.Models.

  • (Experimental) Metaheuristic Optimizer for QIMHNN at Gradien.Experimental.Optim.

  • (Experimental) SwarmPSO Optimizer for QIMHNN at Gradien.Experimental.Optim.

  • (Added) Adafactor Optimizer at Gradien.Optim.

  • (Added) Accumulated Optimizer (Gradient Accumulation) wrapper at Gradien.Optim.
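As a rough picture of what a gradient accumulation wrapper like the one listed above does, the sketch below buffers gradients from several micro-batches and forwards one averaged update to an inner optimizer. It is illustrative Python. The class names, the `step(params, grads)` signature, and the choice to average rather than sum are all assumptions, not Gradien's API.

```python
class SGD:
    """Stand-in inner optimizer (hypothetical, for illustration only)."""
    def __init__(self, lr):
        self.lr = lr

    def step(self, params, grads):
        for i, g in enumerate(grads):
            params[i] -= self.lr * g


class Accumulated:
    """Buffers gradients and forwards an averaged update every `every` calls."""
    def __init__(self, inner, every):
        self.inner = inner
        self.every = every
        self.count = 0
        self.buffer = None

    def step(self, params, grads):
        if self.buffer is None:
            self.buffer = [0.0] * len(grads)
        for i, g in enumerate(grads):
            self.buffer[i] += g
        self.count += 1
        if self.count < self.every:
            return False                  # still accumulating, no update yet
        mean = [g / self.every for g in self.buffer]
        self.inner.step(params, mean)
        self.buffer = None                # start a fresh accumulation window
        self.count = 0
        return True
```

With `every = 2`, two micro-batches of size B produce the same update as one batch of size 2B would, assuming the loss is a per-sample mean.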


# What's changed?

  • (Changes) Small fixes

    • Several small fixes (specifics were not recorded).
  • (Changes) Several small improvements

    • Removed most metatables.
    • Slight performance bumps (micro-optimizations only).

# v1.4.0-rc0

# What's new?

  • Gradien is now available on Wally!

  • (Added) Prebuilt models at Gradien.Models.

    • MLP, ResMLP, ConvNet, TransformerEncoder, SequenceClassifier, AutoEncoder.
  • (Added) Multi-head self-attention at Gradien.NN.Attention.

    • Attention.new(embedDim, numHeads, opts) with support for dropout, custom initializers, causal masking, and getLastAttention() inspection.
    • Backed by a numerically stable softmax in Gradien.Ops.Softmax.
  • (Added) Multi-agent RL wrapper at Gradien.RL.MultiAgent.

    • Wraps a list of agents with:

      • getAgent(i), size()
      • act(i, state, step) and actAll(states, step)
      • parameters() and zeroGrad() that fan out across agents.
  • (Added) Buffer utilities at Gradien.Util.Buffer.

    • High-level encode/decode of Luau values and tensors into buffer objects.
  • (Added) Profiler at Gradien.Util.Profiler.

    • API: .start/stop, .scope, .wrap, .instrument, .snapshot, .report, .withEnabled, .get, .reset/flush.
  • (Added) Stable Softmax at Gradien.Ops.Softmax.

    • SoftmaxOps.forward(logits) implements a max-shifted, numerically stable softmax used by nn.Softmax and nn.Attention.
  • (Added) Classification-oriented trainer constructor at Gradien.Trainer.

    • Trainer.newClassification(cfg, opts?):

      • Default loss: nn.Losses.cross_entropy_backward (with optional label smoothing).
      • Default metric: Metrics.accuracy.
      • Returns a regular Trainer wired for supervised classification.
  • (Added) Cosine schedule with warmup in Gradien.Optim.Schedulers.

    • S.linearWarmupThenCosine(lr, warmupSteps, totalSteps, lrMin):

      • Linear warmup phase followed by cosine decay toward lrMin.
  • (Added) Snapshot ↔ buffer helpers at Gradien.State.

    • State.toBuffer(snap): buffer – serializes a Types.Snapshot via Util.Buffer.
    • State.fromBuffer(buf): Snapshot? – reconstructs snapshots from a buffer.
  • (Added) New initializers at Gradien.Init.

    • heNormal(W), heUniform(W), lecunNormal(W), lecunUniform(W) – fan-in/out aware weight initializers.
  • (Added) Tensor view & shape helpers at Gradien.Tensor.

    • Tensor.reshape(t, newShape) – view-style reshape (shared storage) with size checking.
    • Tensor.slice(t, dim, startIdx, endIdx?, step?) – strided slices with view-based implementation.
    • Tensor.transpose(t, dim1?, dim2?) – generic axis swap; defaults to 2D transpose when dims are omitted.
    • Tensor.narrow(t, dim, startIdx, length) – thin wrapper around slice for PyTorch-style narrowing.
    • Tensor.noGrad(t) – in-place: marks a tensor as non-differentiable and clears _grad.
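The cosine schedule with warmup listed above (`S.linearWarmupThenCosine(lr, warmupSteps, totalSteps, lrMin)`) can be sketched as below. This is illustrative Python that mirrors the argument order from the notes. The 1-based step convention, the exact warmup ramp, and the clamping past `total_steps` are assumptions, and Gradien's Luau version may differ in those details.

```python
import math

def linear_warmup_then_cosine(lr, warmup_steps, total_steps, lr_min):
    """Return a function mapping a 1-based step to a learning rate."""
    def schedule(step):
        if warmup_steps > 0 and step <= warmup_steps:
            return lr * step / warmup_steps          # linear ramp up to lr
        span = max(1, total_steps - warmup_steps)
        progress = min(1.0, (step - warmup_steps) / span)
        # cos goes from 1 to -1, so the decay factor goes from 1 to 0
        return lr_min + 0.5 * (lr - lr_min) * (1.0 + math.cos(math.pi * progress))
    return schedule
```

For example, with `lr = 1.0`, `warmup_steps = 10`, `total_steps = 110`, and `lr_min = 0.1`, the rate reaches 1.0 at step 10, passes through roughly 0.55 at step 60, and settles at 0.1 from step 110 onward.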

# What's changed?

  • (Changed) Tensor & Autograd

    • Tensor now uses explicit computeStrides and view objects internally to implement reshape, slice, and transpose without copying storage, while still propagating gradients.

    • Tensor.detach() still returns a detached view, but Tensor.noGrad() was added for in-place disabling of gradients.

    • autograd.Tape.matmul:

      • Allocates A._grad / B._grad with the correct dtype (Tensor.zeros(..., x._dtype)).
      • Accumulates gradients into existing .grad buffers instead of overwriting them.
    • Tape.noGrad(fn) no longer wraps fn in pcall.

  • (Changed) Initializers

    • All initializer functions now operate on Types.Tensor and share the internal _randn() normal sampler.
    • Existing initializers (e.g. xavierUniform) are updated to use fan-in/fan-out computations consistent with the new He/LeCun variants.
  • (Changed) BatchNorm & Metrics

    • nn.BatchNorm1d:

      • Running statistics now have shape {D, 1} instead of {1, 1} and are tracked per-feature.
      • Training mode computes per-channel means and variances and updates runningMean / runningVar with the configured momentum.
      • Eval mode uses the stored per-channel statistics for normalization.
    • Metrics:

      • Multi-class precision/recall/F1 now pre-initialize the tpC, fpC, fnC arrays with zeros to avoid nil indexing on unseen classes.
      • Confusion matrix allocation uses table.create(C, 0) and fills with zeros, fixing edge cases when some classes never appear.
  • (Changed) Convolutions & Softmax

    • ops.Conv2d:

      • Reimplemented using helper routines (copyShape, makeMatrixView, im2col, col2im, reshapeInPlace, transposeMatrix, addInto) and BLAS.matmul.
      • Keeps the public signature the same (Conv2d(X, W)), but the forward pass now uses an im2col + GEMM approach for better performance.
    • nn.Conv2d:

      • Continues to delegate to ops.Conv2d, inheriting the new, more efficient kernel without changing its module API.
    • nn.Softmax:

      • Simplified to delegate to Ops.Softmax.forward, consolidating the softmax implementation in ops/Softmax.luau.
  • (Changed) RL Replay Buffers

    • Gradien.RL.Replay:

      • Now requires t.state and t.nextState to be tensors and asserts their presence.
      • Infers stateDim and dtype on first push and stores state vectors as dense arrays in Util.Buffer buffers instead of raw tables.
      • sample(batchSize) reconstructs batched state / next-state tensors from the underlying buffers.
    • Gradien.RL.UniformReplay & Gradien.RL.PrioritizedReplay:

      • Similarly updated to serialize state vectors into buffers via Util.Buffer and to rebuild batched S / NS tensors on sampling.
      • Insert logic and bookkeeping (head, size_) are clarified and wrapped in explicit conditionals.
  • (Changed) Schedulers & Trainer

    • Trainer.fit now works through a typed FitOptions table (epochs, stepsPerEpoch, onMetric), assigning defaults via a local fitOpts but remaining backwards compatible with previous usage.
  • (Changed) Small fixes

    • nn.BatchNorm1d, replay buffers, and metrics all gained more explicit shape checks, zero-initialization, and assert messages to catch configuration errors earlier.
  • (Changes) Several small improvements

    • Slight performance bumps, especially on heavy ops, compared to previous versions.
    • Small Types fix for Tensor.
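The im2col + GEMM approach mentioned for ops.Conv2d above can be illustrated with a small single-sample sketch: every input patch is unrolled into a column, after which the convolution reduces to one matrix product with the flattened filters. This is illustrative Python on nested lists with stride 1 and no padding. The helper names and data layout are assumptions and do not reflect Gradien's internal routines.

```python
def im2col(x, kh, kw):
    """Unroll every kh x kw patch of x (C x H x W nested lists) into a column.

    Returns a (C*kh*kw) x (out_h*out_w) matrix; stride 1, no padding.
    """
    c, h, w = len(x), len(x[0]), len(x[0][0])
    out_h, out_w = h - kh + 1, w - kw + 1
    cols = []
    for ch in range(c):
        for di in range(kh):
            for dj in range(kw):
                cols.append([x[ch][i + di][j + dj]
                             for i in range(out_h) for j in range(out_w)])
    return cols, out_h, out_w


def conv2d(x, weights):
    """Cross-correlate x with weights (F x C x kh x kw) via im2col + matmul."""
    kh, kw = len(weights[0][0]), len(weights[0][0][0])
    cols, out_h, out_w = im2col(x, kh, kw)
    out = []
    for filt in weights:
        # Flatten the filter in the same (channel, di, dj) order as im2col rows.
        flat = [v for ch in filt for row in ch for v in row]
        plane = [sum(f * col[p] for f, col in zip(flat, cols))
                 for p in range(out_h * out_w)]
        out.append([plane[r * out_w:(r + 1) * out_w] for r in range(out_h)])
    return out
```

The trade-off is extra memory for the unrolled matrix in exchange for routing the heavy arithmetic through a single, well-optimized matrix multiply.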

Full Changelog: v1.3.0...v1.4.0