[Core] UnmanagedSpan<T> with native long indexing for >2B element arrays #598

@Nucs

Overview

NumSharp arrays are currently limited to ~2 billion elements due to .NET's Span<T> using int indices. This feature introduces UnmanagedSpan<T> with native long indexing, enabling NumSharp to support arrays with more than int.MaxValue elements.

Problem

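On 64-bit platforms NumPy indexes arrays with a signed 64-bit integer, so an array can hold up to 2^63 - 1 elements in principle (in practice, limited by available memory). NumSharp's reliance on Span<T> creates a hard ceiling at int.MaxValue (2,147,483,647 elements):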

// Current limitation: the element count must fit in an int
var arr = np.zeros(3_000_000_000);  // Fails: 3_000_000_000 exceeds int.MaxValue (2_147_483_647)

This blocks scientific computing use cases with large datasets, high-resolution imaging, and genomics data.
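
To make the ceiling concrete, here is a minimal sketch of why an element count above int.MaxValue cannot reach Span<T>, whose pointer constructor takes an int length. The class name `SpanLimitDemo` and the helper `FitsInSpan` are hypothetical and exist only for this illustration; they are not part of NumSharp.

```csharp
using System;

public static class SpanLimitDemo
{
    // Hypothetical helper: true if a Span<T> could describe this many elements.
    public static bool FitsInSpan(long elementCount) =>
        elementCount >= 0 && elementCount <= int.MaxValue;

    public static void Main()
    {
        long wanted = 3_000_000_000L;

        Console.WriteLine(FitsInSpan(wanted));        // False
        Console.WriteLine(FitsInSpan(int.MaxValue));  // True

        try
        {
            // Span<T>(void*, int) takes an int length, so the count must be narrowed first.
            int narrowed = checked((int)wanted);
            Console.WriteLine(narrowed);
        }
        catch (OverflowException)
        {
            Console.WriteLine("3_000_000_000 does not fit in an int length");
        }
    }
}
```

Without `checked`, the narrowing cast would silently wrap to a negative number, which is why the limit tends to surface as confusing argument errors rather than a clear "too large" message.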

Solution

Port the .NET runtime's Span<T> implementation to a new UnmanagedSpan<T> type with:

  • long for all lengths and indices
  • Full SIMD acceleration (Vector128/256/512)
  • where T : unmanaged constraint (appropriate for NumSharp's numeric types)
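
As a rough illustration of the shape such a type could take, the sketch below wraps a raw pointer and a long length behind a bounds-checked indexer. `LongSpanSketch<T>` is a hypothetical, heavily simplified stand-in and is not the proposed UnmanagedSpan<T>; it omits SIMD, the read-only variant, and most of the Span<T> surface. It assumes a 64-bit process, .NET 6 or later (for NativeMemory), and a project compiled with AllowUnsafeBlocks.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical, simplified stand-in for a long-indexed span; not the actual NumSharp type.
public readonly unsafe ref struct LongSpanSketch<T> where T : unmanaged
{
    private readonly T* _pointer;
    private readonly long _length;

    public LongSpanSketch(void* pointer, long length)
    {
        if (length < 0) throw new ArgumentOutOfRangeException(nameof(length));
        _pointer = (T*)pointer;
        _length = length;
    }

    public long Length => _length;

    public ref T this[long index]
    {
        get
        {
            // One unsigned comparison rejects both negative and too-large indices.
            if ((ulong)index >= (ulong)_length) throw new IndexOutOfRangeException();
            return ref _pointer[index];
        }
    }

    public LongSpanSketch<T> Slice(long start)
    {
        if ((ulong)start > (ulong)_length) throw new ArgumentOutOfRangeException(nameof(start));
        return new LongSpanSketch<T>(_pointer + start, _length - start);
    }
}

public static class LongSpanDemo
{
    public static unsafe void Main()
    {
        long length = 16;  // small on purpose; the same code path accepts counts above int.MaxValue
        void* memory = NativeMemory.Alloc((nuint)length, (nuint)sizeof(int));
        try
        {
            var span = new LongSpanSketch<int>(memory, length);
            for (long i = 0; i < span.Length; i++) span[i] = (int)(i * 10);

            var tail = span.Slice(12L);
            Console.WriteLine(tail.Length);  // 4
            Console.WriteLine(tail[0]);      // 120
        }
        finally
        {
            NativeMemory.Free(memory);
        }
    }
}
```

The `unmanaged` constraint is what makes the `T*` field legal, which fits NumSharp's numeric element types.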

New Types

| Type | Description |
| --- | --- |
| UnmanagedSpan<T> | Mutable span with long indexing |
| ReadOnlyUnmanagedSpan<T> | Read-only span with long indexing |

SIMD-Optimized Operations

The search, comparison, fill, and reverse operations use Vector128/256/512 acceleration when available; BinarySearch is not vectorized but takes and returns long indices:

| Method | Description |
| --- | --- |
| Contains<T> | SIMD search for value |
| IndexOf<T> | SIMD first-occurrence search |
| LastIndexOf<T> | SIMD last-occurrence search |
| IndexOfAny<T> | SIMD search for any of N values |
| SequenceEqual<T> | SIMD sequence comparison |
| Fill<T> | SIMD memory fill |
| Reverse<T> | SIMD in-place reverse |
| BinarySearch<T> | Binary search with long indices |

Key Technical Decisions

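  1. 64-bit addressing is NOT a SIMD limitation. The .NET vector load/store helpers (for example Vector256.LoadUnsafe) take nuint element offsets, which are 64-bit on x64, so AVX2/AVX-512 code paths can address buffers beyond int.MaxValue. The int limitation was purely a .NET API design choice.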

  2. Internal types:

    • Loop variables use nuint (64-bit on x64)
    • Unsafe.Add calls use (nint) casts
    • Method signatures use long for public API

  3. Compatibility:

    • Works on .NET 8.0 and .NET 10.0
    • Conditional compilation for AVX-512 (#if NET9_0_OR_GREATER)
    • Scalar fallback when SIMD unavailable
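
The decisions above can be sketched in one small routine: a long-based public signature, a nuint loop counter, a vectorized main loop, and a scalar tail that uses Unsafe.Add with an (nint) cast. This is an illustrative sketch and not NumSharp's implementation; the class name `LongIndexOfSketch` is hypothetical, only Vector128 is used to keep it short, and it assumes .NET 7 or later and a 64-bit process.

```csharp
using System;
using System.Numerics;
using System.Runtime.CompilerServices;
using System.Runtime.Intrinsics;

public static class LongIndexOfSketch
{
    // Hypothetical helper: first index of `value` in `length` bytes starting at `start`, or -1.
    public static long IndexOf(ref byte start, long length, byte value)
    {
        nuint len = (nuint)length;   // assumes length >= 0
        nuint i = 0;

        if (Vector128.IsHardwareAccelerated && len >= (nuint)Vector128<byte>.Count)
        {
            Vector128<byte> target = Vector128.Create(value);
            nuint lastVectorStart = len - (nuint)Vector128<byte>.Count;

            for (; i <= lastVectorStart; i += (nuint)Vector128<byte>.Count)
            {
                // LoadUnsafe takes a nuint element offset, so no int appears in the hot loop.
                Vector128<byte> block = Vector128.LoadUnsafe(ref start, i);
                uint mask = Vector128.Equals(block, target).ExtractMostSignificantBits();
                if (mask != 0)
                    return (long)i + BitOperations.TrailingZeroCount(mask);
            }
        }

        // Scalar tail, which also serves as the fallback when SIMD is unavailable.
        for (; i < len; i++)
        {
            if (Unsafe.Add(ref start, (nint)i) == value)
                return (long)i;
        }

        return -1;
    }

    public static void Main()
    {
        byte[] data = new byte[100];
        data[70] = 7;

        Console.WriteLine(IndexOf(ref data[0], data.Length, 7));  // 70
        Console.WriteLine(IndexOf(ref data[0], data.Length, 9));  // -1
    }
}
```

A production version would add wider Vector256/Vector512 paths ahead of the Vector128 loop; the indexing pattern stays the same.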

Files Changed

src/NumSharp.Core/Utilities/SpanSource/
├── UnmanagedSpan.cs              # Core span type (long _length, long indexer)
├── ReadOnlyUnmanagedSpan.cs      # Read-only variant
├── UnmanagedSpanHelpers.cs       # Memmove, Clear, Reverse (SIMD)
├── UnmanagedSpanHelpers.T.cs     # SIMD search/compare operations (~2400 lines)
├── UnmanagedSpanExtensions.cs    # Extension methods dispatching to SIMD
├── UnmanagedBuffer.cs            # Low-level memory operations
├── UnmanagedSpanDebugView.cs     # Debugger visualization
└── UnmanagedSpanThrowHelper.cs   # Exception helpers

API Examples

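// Create a span over a large unmanaged buffer (needs ~5 GB of free memory)
using System.Runtime.InteropServices;

unsafe {
    long length = 5_000_000_000L;
    void* ptr = NativeMemory.Alloc((nuint)length);   // Alloc takes nuint, so the long needs an explicit cast
    try {
        var span = new UnmanagedSpan<byte>(ptr, length);

        span.Fill(0xFF);                          // SIMD fill
        long idx = span.IndexOf((byte)0x00);      // SIMD search, returns long
        span.Reverse();                           // SIMD reverse

        var slice = span.Slice(2_500_000_000L);   // Long slicing
    } finally {
        NativeMemory.Free(ptr);                   // Unmanaged memory must be released explicitly
    }
}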

Testing

  • Unit tests for all SIMD operations
  • Edge cases: empty spans, single element, vector boundary conditions
  • Large array tests (when memory available)

Breaking Changes

None - this is additive. Existing code using Span<T> continues to work. UnmanagedSpan<T> is opt-in for >2B element scenarios.

Related Work

This is part of the larger int64 indexing initiative to align NumSharp with NumPy's capacity:

  • Shape dimensions: int[] → long[]
  • Strides: int[] → long[]
  • SliceDef indices: int → long
  • Storage size: int → long
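
To show why strides and offsets need to widen together with the span length, here is a minimal sketch of a linear offset computation over long strides. `LongOffsetSketch` and `GetOffset` are hypothetical names for this illustration only, and strides are assumed to be expressed in elements rather than bytes.

```csharp
using System;

public static class LongOffsetSketch
{
    // Hypothetical helper: linear element offset from per-dimension indices and strides.
    public static long GetOffset(long[] indices, long[] strides)
    {
        if (indices.Length != strides.Length)
            throw new ArgumentException("indices and strides must have the same rank");

        long offset = 0;
        for (int d = 0; d < indices.Length; d++)
            offset = checked(offset + indices[d] * strides[d]);  // overflow throws instead of wrapping

        return offset;
    }

    public static void Main()
    {
        // Shape (3, 1_500_000_000) in row-major order: strides are (1_500_000_000, 1).
        long[] strides = { 1_500_000_000L, 1L };
        long offset = GetOffset(new long[] { 2, 5 }, strides);

        Console.WriteLine(offset);                 // 3000000005
        Console.WriteLine(offset > int.MaxValue);  // True
    }
}
```

Even though each dimension here fits in an int, the combined offset does not, so widening only the span length would not be enough.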

Checklist

  • Port Span<T> source from dotnet/runtime
  • Convert all indices/lengths to long
  • Add SIMD-optimized value type methods (INumber)
  • Add extension methods with type dispatch
  • .NET 8.0 compatibility (IsAddressLessThanOrEqualTo shim)
  • .NET 10.0 AVX-512 support
  • Build succeeds on both target frameworks
  • Integration with NumSharp's ArraySlice/UnmanagedStorage
  • Performance benchmarks vs standard Span

Labels: NumPy 2.x Compliance, core, performance