Nicole 0.2.0 - PyTorch Backend with Autograd and Device Control
Release Date: February 6, 2026
Version 0.2 introduces a major backend migration from NumPy to PyTorch, enabling automatic differentiation, GPU acceleration, and enhanced device management for tensor network computations with Abelian symmetries.
🔄 Backend Migration - NumPy to PyTorch
Core Infrastructure Changes
- Complete migration from NumPy to PyTorch as the tensor backend
- All tensor operations now leverage PyTorch's optimized kernels
- Backward compatibility maintained for existing user code
- Updated dependencies: `torch>=2.5` replaces `numpy>=2.0` as the primary backend
- Preserved block-sparse semantics with PyTorch tensors
🎯 Autograd Support
Gradient Tracking
- `requires_grad` property: Control gradient computation for individual tensors
- Automatic differentiation: Through all tensor operations (add, sub, mul, contract, decomp)
- `Tensor.backward()` method: Compute gradients for scalar tensors (0D)
- Full computational graph support: For optimization workflows in variational algorithms
- Element-wise operations: Preserve gradient flow through operations
- Integration with PyTorch's autograd engine: Access gradients via underlying `torch.Tensor` blocks
Gradient Management
- `requires_grad` parameter in constructors (`zeros`, `random`, `from_scalar`)
- Setter for `requires_grad` to enable/disable gradient tracking
- Default: gradients disabled (`torch.set_grad_enabled(False)`) for performance
- Compatible with PyTorch optimizers for gradient-based optimization
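As a rough illustration of reading gradients from the underlying `torch.Tensor` blocks, the sketch below uses plain PyTorch only. The dictionary of blocks keyed by an integer sector label is a stand-in assumed for illustration; it is not Nicole's internal layout or API. Gradient tracking is re-enabled locally with `torch.enable_grad()`, on the assumption that the global default may have been switched off.

```python
import torch

# Illustrative stand-in for a block-sparse tensor: one dense block per
# symmetry sector. The dict layout and integer labels are assumptions.
blocks = {
    0: torch.tensor([[1.0, 2.0], [3.0, 4.0]], dtype=torch.float64, requires_grad=True),
    1: torch.tensor([[0.5]], dtype=torch.float64, requires_grad=True),
}

# Re-enable tracking locally in case it has been switched off globally.
with torch.enable_grad():
    # Scalar (0-D) loss: squared Frobenius norm summed over all blocks.
    loss = sum((block * block).sum() for block in blocks.values())
    loss.backward()

# Each block now carries d(loss)/d(block) = 2 * block in its .grad field.
for label, block in blocks.items():
    print(label, block.grad)
```

`backward()` is only defined without arguments for a 0-D result, which is why the release restricts `Tensor.backward()` to scalar tensors.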
🖥️ Device Management
Multi-Device Support
- CPU: Full dtype support (`float32`, `float64`, `complex64`, `complex128`)
- CUDA (NVIDIA): Full dtype support with optimal GPU performance
- MPS (Apple Silicon): `float32`/`complex64` with automatic dtype normalization
Device Operations
- `Tensor.device` property: Query tensor placement
- `Tensor.to(device)` method: Transfer tensors between devices
- `Tensor.cpu()` convenience method: Move to CPU
- `Tensor.cuda()` convenience method: Move to CUDA device
- `device` parameter: In constructors (`zeros`, `random`, `from_scalar`)
- Automatic device consistency validation: In tensor operations
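The device workflow above can be sketched at the level of the underlying blocks with plain PyTorch. The helpers `pick_device` and `check_same_device` are hypothetical names introduced here for illustration; they are not part of Nicole, whose own validation logic may differ.

```python
import torch

def pick_device() -> torch.device:
    """Return CUDA if available, then MPS, otherwise CPU."""
    if torch.cuda.is_available():
        return torch.device("cuda")
    if torch.backends.mps.is_available():
        return torch.device("mps")
    return torch.device("cpu")

def check_same_device(*tensors: torch.Tensor) -> torch.device:
    """Raise ValueError unless all tensors live on one device (hypothetical helper)."""
    devices = {t.device for t in tensors}
    if len(devices) != 1:
        raise ValueError(f"tensors are on different devices: {sorted(map(str, devices))}")
    return devices.pop()

device = pick_device()
# float32 is used because it is supported on CPU, CUDA, and MPS alike.
a = torch.ones(2, 2, dtype=torch.float32).to(device)
b = torch.eye(2, dtype=torch.float32, device=device)
check_same_device(a, b)      # validate placement before operating
result = (a @ b).cpu()       # bring the result back to host memory
```

Checking placement up front gives a clear error message instead of a failure deep inside a contraction.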
MPS Dtype Normalization
- `normalize_dtype_for_device()` utility function in `typing` module
- Automatic float64 → float32 conversion on MPS
- Automatic complex128 → complex64 conversion on MPS
- Transparent handling: In constructors and `.to()` method
- Comprehensive test coverage: For MPS compatibility
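The normalization rule described above amounts to a small lookup. The function below is a minimal sketch of that rule under an assumed name and signature (`normalize_dtype_sketch(dtype, device)`); it is not the source of Nicole's `normalize_dtype_for_device()`, whose actual signature is not given in these notes.

```python
import torch

# Double-precision dtypes and their single-precision counterparts on MPS.
_MPS_DOWNCAST = {
    torch.float64: torch.float32,
    torch.complex128: torch.complex64,
}

def normalize_dtype_sketch(dtype: torch.dtype, device) -> torch.dtype:
    """Return a dtype usable on `device` (hypothetical stand-in, not Nicole's code)."""
    # Accept "mps", "cuda:0", or a torch.device; keep only the device type.
    device_type = str(device).split(":")[0]
    if device_type == "mps":
        return _MPS_DOWNCAST.get(dtype, dtype)
    return dtype

print(normalize_dtype_sketch(torch.float64, "mps"))
print(normalize_dtype_sketch(torch.float64, "cuda:0"))
```

Applying the same rule in constructors and in `.to()` keeps a tensor's dtype consistent no matter how it reached the MPS device.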
🧪 Testing Infrastructure
Comprehensive Test Coverage
- 708 tests covering all functionality (up from 662 in v0.1)
- New test modules: `test_autograd.py`, `test_device.py`
- MPS dtype normalization tests integrated into `test_device.py`
- Device management tests for CPU, CUDA, MPS
- Autograd tests for gradient computation and `backward()`
- Gradient flow tests for operations (add, sub, mul, contract)
- All existing tests updated for PyTorch backend
Test Organization
- Device tests in `tests/support/test_device.py` (new)
- Autograd tests in `tests/support/test_autograd.py` (new)
- GPU tests skip gracefully when hardware unavailable
- MPS-specific tests for dtype normalization
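One common way to make hardware-dependent tests skip gracefully is a `pytest.mark.skipif` marker keyed on device availability. The sketch below assumes the suite runs under pytest; the marker and test names are illustrative and are not taken from Nicole's test files.

```python
import pytest
import torch

# Reusable markers: the test is reported as skipped when the device is missing.
requires_cuda = pytest.mark.skipif(
    not torch.cuda.is_available(), reason="CUDA device not available"
)
requires_mps = pytest.mark.skipif(
    not torch.backends.mps.is_available(), reason="MPS device not available"
)

@requires_cuda
def test_cuda_roundtrip():
    x = torch.arange(4, dtype=torch.float64)
    assert torch.equal(x.to("cuda").cpu(), x)

@requires_mps
def test_mps_float32_roundtrip():
    # float32 only: MPS does not support float64.
    x = torch.arange(4, dtype=torch.float32)
    assert torch.equal(x.to("mps").cpu(), x)
```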
🚀 Performance Optimizations
Backend Changes
- Replaced numpy arrays with torch tensors throughout codebase
- `torch.randn()` for random generation with generator support
- `torch.eye()` for identity matrices
- `torch.zeros()` for zero initialization
- `torch.complex()` for complex number construction
- Maintained block-sparse structure with PyTorch tensors
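For readers unfamiliar with these PyTorch calls, the sketch below shows how a seeded `torch.Generator` makes random blocks reproducible and how `torch.complex()` assembles a complex block from real and imaginary parts. The helper `random_complex_block` is a hypothetical name used only here; how Nicole wires these calls together internally is not shown in these notes.

```python
import torch

def random_complex_block(shape, seed):
    """Build a reproducible complex128 block (hypothetical helper)."""
    gen = torch.Generator().manual_seed(seed)
    real = torch.randn(shape, generator=gen, dtype=torch.float64)
    imag = torch.randn(shape, generator=gen, dtype=torch.float64)
    # float64 real and imaginary parts combine into a complex128 tensor.
    return torch.complex(real, imag)

first = random_complex_block((2, 3), seed=0)
second = random_complex_block((2, 3), seed=0)   # same seed, same values

identity = torch.eye(3, dtype=torch.complex128)
empty = torch.zeros((2, 3), dtype=torch.complex128)
```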
Performance Features
- Disabled autograd by default (`torch.set_grad_enabled(False)`)
- Set default device to CPU (`torch.set_default_device('cpu')`)
- Efficient device transfers with minimal overhead
- GPU acceleration for large-scale computations
- Block-sparse algorithms unchanged, now with PyTorch backend
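The effect of the two global defaults can be seen in plain PyTorch. This sketch sets them explicitly and then restores the previous gradient mode; whether Nicole expects users to re-enable tracking with `torch.enable_grad()` or handles it internally is not stated in these notes, so treat the context manager as one possible approach.

```python
import torch

previous = torch.is_grad_enabled()
torch.set_grad_enabled(False)        # the global default described above
torch.set_default_device("cpu")      # new tensors land on the CPU

x = torch.ones(3, requires_grad=True)
untracked = (x * 2).sum()            # no graph is recorded
with torch.enable_grad():
    tracked = (x * 2).sum()          # graph recorded inside this block only

print(untracked.requires_grad, tracked.requires_grad)

torch.set_grad_enabled(previous)     # restore whatever was active before
```

Skipping graph construction avoids the memory and bookkeeping cost of autograd for workloads that never call `backward()`.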
📚 API Surface Updates
Core (Enhanced)
- `Tensor.requires_grad`: Property for gradient tracking control
- `Tensor.backward()`: Compute gradients for scalar tensors
- `Tensor.device`: Property for querying device placement
- `Tensor.to()`: Move tensors between devices
- `Tensor.cpu()`: Convenience method for CPU transfer
- `Tensor.cuda()`: Convenience method for CUDA transfer
Utilities (New)
- `normalize_dtype_for_device()`: Helper function in `typing` module
Constructors (Enhanced)
- `device` parameter in `zeros`, `random`, `from_scalar`
- `requires_grad` parameter in `zeros`, `random`, `from_scalar`
Operations
- All operations now support autograd and device management
- Gradient flow preserved through operations
📊 Statistics
Code Changes
- 161 commits across develop branch
- 52 files changed: 3,723 insertions, 1,026 deletions
- Major refactors: `tensor.py`, test suite updates
- New helper functions: `normalize_dtype_for_device()` in `typing` module
Test Coverage
- 708 comprehensive tests (46 new tests since v0.1)
- 16 device management tests (including MPS)
- 21 autograd tests for gradient computation
- All tests pass on CPU, CUDA, and MPS devices
🎓 Target Users
Researchers in quantum many-body physics, machine learning, and quantum information who require:
- GPU acceleration for large-scale tensor network simulations
- Automatic differentiation for variational algorithms (variational MPS, PEPS optimization)
- Modern optimization workflows with PyTorch ecosystem integration
💡 Use Cases
Variational Algorithms
- Gradient-based optimization of tensor network states
- Variational MPS and PEPS algorithms
- Neural network quantum states (NQS)
- Integration with PyTorch optimizers (Adam, SGD, etc.)
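A minimal sketch of such a gradient-based loop, written in plain PyTorch rather than against Nicole's API: a two-component vector stands in for the variational parameters, and a fixed 2x2 matrix `H` stands in for a Hamiltonian. Both are toy assumptions chosen so the example stays small. `torch.optim.SGD` is used for predictable convergence; `torch.optim.Adam` takes the same parameter list.

```python
import torch

# Toy "Hamiltonian" with eigenvalues 1 and 3 (an assumption for illustration).
H = torch.tensor([[2.0, 1.0], [1.0, 2.0]], dtype=torch.float64)
# Variational parameters; in a tensor network these would be the tensor blocks.
v = torch.tensor([1.0, 0.5], dtype=torch.float64, requires_grad=True)

def energy(vec):
    """Rayleigh quotient <v|H|v> / <v|v>, a 0-D tensor."""
    return (vec @ H @ vec) / (vec @ vec)

optimizer = torch.optim.SGD([v], lr=0.1)

with torch.enable_grad():            # in case tracking is disabled globally
    initial_energy = energy(v).item()
    for _ in range(200):
        optimizer.zero_grad()        # clear gradients from the previous step
        loss = energy(v)
        loss.backward()              # populate v.grad
        optimizer.step()             # update v in place
final_energy = energy(v).item()

print(initial_energy, final_energy)
```

The loop drives the energy down toward the smallest eigenvalue of `H`, which is the same pattern a variational MPS or PEPS optimization follows with a larger parameter set.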
GPU Acceleration
- Large tensor network contractions on CUDA
- Batch processing on GPU
- Apple Silicon (MPS) support for Mac users
- Efficient device transfers for hybrid CPU/GPU workflows
Machine Learning Integration
- Seamless integration with PyTorch ecosystem
- Compatibility with PyTorch data loaders and optimizers
- Mixed precision training support
- Automatic differentiation for custom tensor network layers
✅ Compatibility
Breaking Changes: None - fully backward compatible with v0.1.x API
Note: Internal backend changed from NumPy to PyTorch, but user-facing API unchanged. Existing code will continue to work without modification.