Nicole 0.2.0 - PyTorch Backend with Autograd and Device Control

Release Date: February 6, 2026

Version 0.2 introduces a major backend migration from NumPy to PyTorch, enabling automatic differentiation, GPU acceleration, and enhanced device management for tensor network computations with Abelian symmetries.

🔄 Backend Migration - NumPy to PyTorch

Core Infrastructure Changes

Complete migration from NumPy to PyTorch as the tensor backend
All tensor operations now leverage PyTorch's optimized kernels
Backward compatibility maintained for existing user code
Updated dependencies: torch>=2.5 replaces numpy>=2.0 as primary backend
Preserved block-sparse semantics with PyTorch tensors

🎯 Autograd Support

Gradient Tracking

requires_grad property: Control gradient computation for individual tensors
Automatic differentiation: Through all tensor operations (add, sub, mul, contract, decomp)
Tensor.backward() method: Backpropagate from a scalar (0D) tensor to compute gradients
Full computational graph support: For optimization workflows in variational algorithms
Element-wise operations: Gradient flow is preserved
Integration with PyTorch's autograd engine: Access gradients via underlying torch.Tensor blocks
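
As a rough illustration of what reading gradients off the underlying torch.Tensor blocks can look like, the sketch below uses plain PyTorch only. The dict-of-blocks layout and the blockwise_trace helper are assumptions made for this example; they are not Nicole's API.

```python
import torch

torch.manual_seed(0)

# Hypothetical stand-in for a block-sparse tensor: dense blocks keyed by a charge label.
blocks_a = {
    0: torch.randn(2, 3, dtype=torch.float64, requires_grad=True),
    1: torch.randn(4, 5, dtype=torch.float64, requires_grad=True),
}
blocks_b = {
    0: torch.randn(3, 2, dtype=torch.float64),
    1: torch.randn(5, 4, dtype=torch.float64),
}


def blockwise_trace(a, b):
    """Sum of tr(A_q @ B_q) over matching blocks; the result is a 0D tensor."""
    total = torch.zeros((), dtype=torch.float64)
    for key in a:
        total = total + torch.trace(a[key] @ b[key])
    return total


# enable_grad() guards against gradient tracking having been switched off globally.
with torch.enable_grad():
    loss = blockwise_trace(blocks_a, blocks_b)
    loss.backward()  # backward() needs a scalar (0D) result

# Each block now carries its own gradient: d/dA tr(A @ B) = B^T.
grad_block_0 = blocks_a[0].grad
```

Every block is an ordinary leaf tensor here, so anything that accepts torch.Tensor parameters (an optimizer, for instance) can consume the blocks directly.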

Gradient Management

requires_grad parameter in constructors (zeros, random, from_scalar)
Setter for requires_grad to enable/disable gradient tracking
Default: gradients disabled (torch.set_grad_enabled(False)) for performance
Compatible with PyTorch optimizers for gradient-based optimization
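
Because gradients are disabled by default, code that needs them has to opt back in. A minimal plain-PyTorch sketch of that pattern follows; it uses torch.set_grad_enabled(False) as a scoped context manager to mimic the global default, and it does not call any Nicole function.

```python
import torch

x = torch.tensor([1.0, 2.0], requires_grad=True)  # leaf parameter

# Scoped equivalent of the global default described above.
with torch.set_grad_enabled(False):
    y = (x * x).sum()
    tracked_while_disabled = y.requires_grad  # no graph is recorded here

    # Opt back in only around the code that needs gradients.
    with torch.enable_grad():
        optimizer = torch.optim.SGD([x], lr=0.1)
        loss = (x * x).sum()
        loss.backward()   # x.grad = 2 * x
        optimizer.step()  # x <- x - 0.1 * x.grad
```

The same idea applies to a full optimization loop: wrap the forward pass and the backward() call in torch.enable_grad(), and leave everything else under the cheaper no-gradient default.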

🖥️ Device Management

Multi-Device Support

CPU: Full dtype support (float32, float64, complex64, complex128)
CUDA (NVIDIA): Full dtype support with GPU acceleration
MPS (Apple Silicon): float32/complex64 with automatic dtype normalization

Device Operations

Tensor.device property: Query tensor placement
Tensor.to(device) method: Transfer tensors between devices
Tensor.cpu() convenience method: Move to CPU
Tensor.cuda() convenience method: Move to CUDA device
device parameter: In constructors (zeros, random, from_scalar)
Automatic device consistency validation: In tensor operations
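
The device-consistency check can be pictured with a few lines of plain PyTorch. The helpers pick_device and check_same_device below are hypothetical names for this sketch; how Nicole implements its own validation is not shown in these notes.

```python
import torch


def pick_device() -> torch.device:
    """Prefer CUDA, then MPS, then fall back to CPU."""
    if torch.cuda.is_available():
        return torch.device("cuda")
    if torch.backends.mps.is_available():
        return torch.device("mps")
    return torch.device("cpu")


def check_same_device(*tensors: torch.Tensor) -> torch.device:
    """Return the common device, or raise if the tensors are spread over several."""
    devices = {t.device for t in tensors}
    if len(devices) != 1:
        names = sorted(str(d) for d in devices)
        raise ValueError(f"expected one device, got {names}")
    return devices.pop()


a = torch.ones(2, 2, device="cpu")
b = torch.ones(2, 2, device="cpu")
common = check_same_device(a, b)

# The "meta" device holds no data, which makes it a handy second device on any machine.
stray = torch.empty(2, 2, device="meta")
try:
    check_same_device(a, stray)
    mismatch_detected = False
except ValueError:
    mismatch_detected = True

# Transfers follow the usual PyTorch pattern.
moved = a.to(pick_device())
```

Failing early with a clear message is usually preferable to letting a mixed-device contraction surface as a lower-level runtime error.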

MPS Dtype Normalization

normalize_dtype_for_device() utility function in typing module
Automatic float64 → float32 conversion on MPS
Automatic complex128 → complex64 conversion on MPS
Transparent handling: In constructors and .to() method
Comprehensive test coverage: For MPS compatibility
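
The normalization rule itself is small enough to sketch. The function below is a hypothetical illustration named normalize_dtype_sketch; the signature and behavior of the real normalize_dtype_for_device() may differ.

```python
import torch

# 64-bit floating and complex dtypes are mapped to their 32-bit-per-component counterparts.
_MPS_DOWNCAST = {
    torch.float64: torch.float32,
    torch.complex128: torch.complex64,
}


def normalize_dtype_sketch(dtype: torch.dtype, device) -> torch.dtype:
    """Return a dtype usable on `device`, downcasting only when the target is MPS."""
    if isinstance(device, torch.device):
        device_type = device.type
    else:
        device_type = str(device).split(":")[0]  # "mps:0" -> "mps"
    if device_type == "mps":
        return _MPS_DOWNCAST.get(dtype, dtype)
    return dtype


on_mps = normalize_dtype_sketch(torch.float64, "mps")
on_mps_complex = normalize_dtype_sketch(torch.complex128, "mps:0")
on_cpu = normalize_dtype_sketch(torch.float64, "cpu")
```

Note that downcasting loses precision, so results computed on MPS should not be expected to match float64 results on CPU or CUDA to full double precision.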

🧪 Testing Infrastructure

Comprehensive Test Coverage

708 tests covering all functionality (up from 662 in v0.1)
New test modules: test_autograd.py, test_device.py
MPS dtype normalization tests integrated into test_device.py
Device management tests for CPU, CUDA, MPS
Autograd tests for gradient computation and backward()
Gradient flow tests for operations (add, sub, mul, contract)
All existing tests updated for PyTorch backend

Test Organization

Device tests in tests/support/test_device.py (new)
Autograd tests in tests/support/test_autograd.py (new)
GPU tests skip gracefully when hardware unavailable
MPS-specific tests for dtype normalization
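
One common way to make hardware-dependent tests skip gracefully is a conditional skip decorator. The sketch below uses the standard-library unittest module and plain PyTorch tensors; the test framework and the test names used in the Nicole repository are not stated here, so treat both as assumptions.

```python
import unittest

import torch

HAS_CUDA = torch.cuda.is_available()
HAS_MPS = torch.backends.mps.is_available()


class DeviceSmokeTest(unittest.TestCase):
    def test_cpu_float64(self):
        t = torch.zeros(2, dtype=torch.float64, device="cpu")
        self.assertEqual(t.device.type, "cpu")

    @unittest.skipUnless(HAS_CUDA, "CUDA device not available")
    def test_cuda_roundtrip(self):
        t = torch.arange(4.0).to("cuda")
        self.assertEqual(t.device.type, "cuda")
        self.assertTrue(torch.equal(t.cpu(), torch.arange(4.0)))

    @unittest.skipUnless(HAS_MPS, "MPS device not available")
    def test_mps_float32(self):
        t = torch.zeros(2, dtype=torch.float32, device="mps")
        self.assertEqual(t.dtype, torch.float32)


def run_suite() -> unittest.TestResult:
    """Run the smoke tests; unavailable hardware shows up as skips, not failures."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(DeviceSmokeTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling run_suite() on a CPU-only machine reports the CUDA and MPS cases as skipped while the run as a whole still succeeds.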

🚀 Performance Optimizations

Backend Changes

Replaced numpy arrays with torch tensors throughout codebase
torch.randn() for random generation with generator support
torch.eye() for identity matrices
torch.zeros() for zero initialization
torch.complex() for complex number construction
Maintained block-sparse structure with PyTorch tensors
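
For readers unfamiliar with the PyTorch calls listed above, here is a short sketch of reproducible complex random generation that combines torch.randn() with an explicit generator and torch.complex(). The helper seeded_complex_block is a hypothetical name and not part of Nicole.

```python
import torch


def seeded_complex_block(shape, seed):
    """Build a complex128 block from two seeded real draws."""
    gen = torch.Generator(device="cpu").manual_seed(seed)
    real = torch.randn(shape, generator=gen, dtype=torch.float64)
    imag = torch.randn(shape, generator=gen, dtype=torch.float64)
    return torch.complex(real, imag)  # float64 parts give complex128


block_a = seeded_complex_block((2, 3), seed=1234)
block_b = seeded_complex_block((2, 3), seed=1234)  # same seed, same values

identity = torch.eye(3, dtype=torch.float64)
zeros = torch.zeros(2, 3, dtype=torch.float64)
```

Passing a dedicated torch.Generator keeps the draw independent of the global random state, which helps when tests or benchmarks need repeatable blocks.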

Performance Features

Disabled autograd by default (torch.set_grad_enabled(False))
Set default device to CPU (torch.set_default_device('cpu'))
Efficient device transfers with minimal overhead
GPU acceleration for large-scale computations
Block-sparse algorithms unchanged, now with PyTorch backend

📚 API Surface Updates

Core (Enhanced)

Tensor.requires_grad: Property for gradient tracking control
Tensor.backward(): Compute gradients for scalar tensors
Tensor.device: Property for querying device placement
Tensor.to(): Move tensors between devices
Tensor.cpu(): Convenience method for CPU transfer
Tensor.cuda(): Convenience method for CUDA transfer

Utilities (New)

normalize_dtype_for_device(): Helper function in typing module

Constructors (Enhanced)

device parameter in zeros, random, from_scalar
requires_grad parameter in zeros, random, from_scalar

Operations

All operations now support autograd and device management
Gradient flow preserved through operations

📊 Statistics

Code Changes

161 commits on the develop branch
52 files changed: 3,723 insertions, 1,026 deletions
Major refactors: tensor.py, test suite updates
New helper functions: normalize_dtype_for_device() in typing module

Test Coverage

708 comprehensive tests (46 new tests since v0.1)
16 device management tests (including MPS)
21 autograd tests for gradient computation
All tests pass on CPU, CUDA, and MPS devices

🎓 Target Users

Researchers in quantum many-body physics, machine learning, and quantum information who require:

GPU acceleration for large-scale tensor network simulations
Automatic differentiation for variational algorithms (variational MPS, PEPS optimization)
Modern optimization workflows with PyTorch ecosystem integration

💡 Use Cases

Variational Algorithms

Gradient-based optimization of tensor network states
Variational MPS and PEPS algorithms
Neural network quantum states (NQS)
Integration with PyTorch optimizers (Adam, SGD, etc.)

GPU Acceleration

Large tensor network contractions on CUDA
Batch processing on GPU
Apple Silicon (MPS) support for Mac users
Efficient device transfers for hybrid CPU/GPU workflows

Machine Learning Integration

Seamless integration with PyTorch ecosystem
Compatibility with PyTorch data loaders and optimizers
Mixed precision training support
Automatic differentiation for custom tensor network layers

✅ Compatibility

Breaking Changes: None - fully backward compatible with v0.1.x API

Note: The internal backend changed from NumPy to PyTorch, but the user-facing API is unchanged. Existing code will continue to work without modification.
