
[new dtypes, NEP50] fully supported Half/Complex/SByte, np.* alias overhaul, NumPy 2.x type alias alignment #612

Merged
Nucs merged 59 commits into master from worktree-half
Apr 22, 2026
Nucs commented Apr 22, 2026

TL;DR

Full NumPy-parity support for three new dtypes, SByte (int8), Half (float16), and Complex (complex128), across every np.* API, operator, IL kernel, and reduction. A new DateTime64 helper type closes a 64-case conversion gap vs NumPy's datetime64. The np.* class-level type aliases are now fully aligned with NumPy 2.4.2 (breaking changes: np.byte = int8, np.complex64 throws, np.uint = uintp, np.intp is platform-detected), and np.dtype(string) is rewritten as a FrozenDictionary lookup covering every NumPy 2.x type code.

55 commits, +30k / −5.0k lines, 165 files. 34 NumPy-parity bugs closed. ~7,000+ tests / 0 failures / 11 skipped.

Full release notes: docs/releases/RELEASE_0.51.0-prerelease.md


Major features

New dtypes: SByte (int8), Half (float16), Complex (complex128)

  • NPTypeCode extended (SByte=5, Half=16, Complex=128) across every extension method.
  • Type aliases: np.int8, np.sbyte, np.float16, np.half.
  • np.find_common_type — ~80 new NEP50 promotion entries.
  • NDArray Get*/Set* methods for the three dtypes.
  • Iterator casts: NDIterator.Cast.{Half,Complex,SByte}.cs.
  • ILKernelGenerator extended across .Binary, .Unary, .Unary.Math, .Comparison, .Reduction, .Reduction.Axis, .Masking.NaN, .Scan, .Scalar.

DateTime64 helper type

readonly struct with NumPy datetime64 semantics — full long.MinValue..long.MaxValue tick range, NaT == long.MinValue sentinel that propagates through arithmetic and compares like IEEE NaN. Closes 64 datetime fuzz diffs that previously forced DateTime.MinValue fallbacks.
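The NaT rules described above can be sketched in a few lines of Python. This is only an illustration of the stated semantics (sentinel equal to long.MinValue, propagation through arithmetic, NaN-like comparison); the class name `DateTime64Sketch` and its methods are hypothetical and are not the NumSharp API. Tick overflow is deliberately not modeled.

```python
NAT = -(2**63)  # long.MinValue, the NaT sentinel named in the description


class DateTime64Sketch:
    """Hypothetical stand-in: a tick count plus NaT rules (not the real struct)."""

    def __init__(self, ticks):
        self.ticks = ticks

    @property
    def is_nat(self):
        return self.ticks == NAT

    def add_ticks(self, delta):
        # NaT propagates: arithmetic on NaT yields NaT.
        if self.is_nat:
            return DateTime64Sketch(NAT)
        return DateTime64Sketch(self.ticks + delta)

    def __eq__(self, other):
        # Like IEEE NaN, NaT never compares equal, not even to itself.
        if self.is_nat or other.is_nat:
            return False
        return self.ticks == other.ticks

    def __lt__(self, other):
        # Ordered comparisons involving NaT are always False.
        if self.is_nat or other.is_nat:
            return False
        return self.ticks < other.ticks


nat = DateTime64Sketch(NAT)
print(nat.add_ticks(5).is_nat, nat == nat, nat != nat)
```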

NumPy 2.x type alias alignment (breaking)

| Alias | Before | After | Reason |
|---|---|---|---|
| np.byte | byte (uint8) | sbyte (int8) | NumPy C-char convention |
| np.complex64 | alias → complex128 | throws NotSupportedException | no silent widening |
| np.csingle | alias → complex128 | throws NotSupportedException | same rationale |
| np.uint | uint64 | uintp (pointer-sized) | NumPy 2.x |
| np.intp | nint | long on 64-bit | nint has NPTypeCode.Empty which breaks dispatch |
| np.uintp | nuint | ulong on 64-bit | same |
| np.int_ | long | intp | NumPy 2.x: int_ == intp |

Added: np.short, np.ushort, np.intc, np.uintc, np.longlong, np.ulonglong, np.single, np.cdouble, np.clongdouble. Platform-detected: np.@long, np.@ulong (C-long: 32-bit MSVC / 64-bit *nix LP64).
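The C-long split mentioned above (32-bit under Windows LLP64, 64-bit under LP64) can be observed from Python with the standard `ctypes` module. This only illustrates the platform convention; it says nothing about how NumSharp performs its own detection.

```python
import ctypes
import struct

# Width of the platform C `long`: 32 on Windows (LLP64), 64 on 64-bit Linux/macOS (LP64).
c_long_bits = ctypes.sizeof(ctypes.c_long) * 8

# Pointer width, which is what intp/uintp track.
pointer_bits = struct.calcsize("P") * 8

print(f"C long: {c_long_bits} bits, pointer: {pointer_bits} bits")
```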

np.dtype(string) parser rewrite

Regex replaced with FrozenDictionary<string, Type> built at static init. Covers every NumPy 2.x single-char code (? b B h H i I l L q Q p P e f d g D G), sized form (b1, i1..i8, u1..u8, f2/f4/f8, c16), lowercase name (int8..int64, uint8..uint64, float16..float64, complex128, intc, intp, longlong, …), and NumSharp-friendly alias (SByte, Half, Complex, …). Unsupported NumPy codes (S, U, M, m, O, V, complex64/F/c8) throw NotSupportedException with explanatory messages.
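As a rough sketch of the table-lookup approach (not the actual parser), an immutable mapping built once can resolve codes and reject unsupported ones. The table below covers only a handful of the codes listed above, `parse_dtype` is a hypothetical name, the error messages are invented for illustration, and `NotImplementedError` stands in for NotSupportedException.

```python
from types import MappingProxyType

# Read-only lookup table built once; a small subset of the codes in the description.
_DTYPE_TABLE = MappingProxyType({
    "?": "bool", "b1": "bool",
    "b": "int8", "i1": "int8", "int8": "int8", "SByte": "int8",
    "e": "float16", "f2": "float16", "float16": "float16", "Half": "float16",
    "D": "complex128", "c16": "complex128", "complex128": "complex128",
    "Complex": "complex128",
})

# Codes the description lists as deliberately rejected.
_UNSUPPORTED = frozenset({"S", "U", "M", "m", "O", "V", "complex64", "F", "c8"})


def parse_dtype(code):
    """Resolve a dtype string, rejecting unsupported codes explicitly."""
    if code in _UNSUPPORTED:
        # Illustrative message only.
        raise NotImplementedError(f"dtype code {code!r} is not supported")
    try:
        return _DTYPE_TABLE[code]
    except KeyError:
        raise ValueError(f"unknown dtype code {code!r}") from None


print(parse_dtype("f2"), parse_dtype("Complex"))
```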

np.finfo + np.iinfo extended

  • np.finfo(Half) — IEEE binary16 machine limits.
  • np.finfo(Complex) — reports underlying float64 (NumPy parity: finfo(complex128).dtype == float64).
  • np.iinfo(SByte) — int8 with signed min/max.

NumPy-canonical exception types at rejection sites

| Site | Before | After |
|---|---|---|
| Non-integer shift ops | NotSupportedException | TypeError ("ufunc '…' not supported for the input types, … safe casting") |
| Invalid index type | ArgumentException | new IndexError ("only integers, slices, ellipsis, newaxis, integer/boolean arrays are valid indices") |
| np.repeat on non-integer repeats | permissive truncation | TypeError ("Cannot cast … to dtype('int64') according to the rule 'safe'") |
| Explicit cast NDArray[Complex] → non-complex scalar | silent imaginary drop | TypeError ("can't convert complex to int/float/…"); matches Python's int(complex) |
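The Python behavior that the last row mirrors is easy to confirm: `int()` on a complex value raises `TypeError`, so dropping the imaginary part has to be an explicit choice. The exact message text varies between Python versions, so the sketch does not depend on it.

```python
z = complex(3.7, 4.2)

# int(complex) is rejected outright rather than silently dropping the imaginary part.
try:
    int(z)
    raised = False
except TypeError:
    raised = True

# Taking the real part must be spelled out by the caller.
explicit = int(z.real)

print(raised, explicit)
```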

Bitshift operators

<< and >> added to NDArray (NDArray.Shift.cs). Two overloads per direction (NDArray↔NDArray, NDArray↔object). C# synthesizes <<= / >>= (reassign, not in-place).

Casting overhaul (Rounds 1-5E)

  • Modular wrapping for integer overflow (no more OverflowException).
  • NaN / Inf → 0 consistently for float → int targets.
  • IConvertible constraint removed from generic converters to admit Half / Complex.
  • UnmanagedMemoryBlock.Allocate(Type, long, object) — direct boxing casts replaced with Converts.ToXxx(fill) dispatchers; fill = 1 on a Half array no longer throws InvalidCastException.

Complex matmul preserves imaginary

Default.MatMul.2D2D::MatMulMixedType<TResult> short-circuits to a dedicated MatMulComplexAccumulator when TResult == Complex — the prior double accumulator dropped imaginary parts.
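To illustrate why the accumulator type matters, the sketch below contrasts a real-valued accumulator with a complex one in a plain triple-loop matmul. How exactly the old NumSharp path lost the imaginary part is not shown in this description, so the "real accumulator" variant is an assumed reconstruction for illustration; both function names are hypothetical.

```python
def matmul_real_accumulator(a, b):
    """Assumed failure mode: the running sum is a float, so only real parts survive."""
    rows, inner, cols = len(a), len(b), len(b[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = 0.0
            for k in range(inner):
                acc += (a[i][k] * b[k][j]).real  # imaginary part discarded here
            out[i][j] = acc
    return out


def matmul_complex_accumulator(a, b):
    """Accumulate in complex so the imaginary part is preserved."""
    rows, inner, cols = len(a), len(b), len(b[0])
    out = [[0j] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = 0j
            for k in range(inner):
                acc += a[i][k] * b[k][j]
            out[i][j] = acc
    return out


a = [[1 + 2j]]
b = [[3 + 4j]]
print(matmul_real_accumulator(a, b), matmul_complex_accumulator(a, b))
```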

34 NumPy-parity bugs closed (B1-B37 minus B34 accepted BCL divergence, B38 alias)

Creation: np.eye diagonal stride + negative N/M validation, np.asanyarray/asarray dtype override, np.frombuffer dtype parser + byte-swap for Half/Complex. Arithmetic: Complex 1/0, floor_divide(inf), integer power overflow, reciprocal(int) dtype preservation, floor/ceil/trunc(int) no-op preservation. Reductions: Half min/max, Complex mean/std/var/prod/cumsum/argmax axis, Complex nansum NaN-skip, Half var/std dtype preservation, np.unique(Complex), NaN-first in Complex argmax. Unary math: Half+Complex log10/log2/cbrt/exp2/log1p/expm1, Complex sign(inf) unit vector, Half subnormal precision. Comparison: Complex NaN short-circuit. Statistics: Half+Complex nanmean/nanstd/nanvar.

Accepted BCL divergences (4): Complex (inf+0j)^(1+1j), exp2(complex(inf, inf)), SByte // 0 and % 0, frombuffer(">f2"/">c16") loses byte-order prefix.


Test plan

  • CI green on net8.0 + net10.0 — suite must stay at 0 failed / 11 skipped.
  • Half/Complex/SByte coverage sweeps — re-run the three probe matrices (Creation 330 cases, Arithmetic 109, Reductions 80) against NumPy 2.4.2. Expected 100% / 96.3% / 100%.
  • NumPy 2.x alias parity: NpTypeAliasParityTests (174 LoC) checks every np.* alias on Windows 64-bit. Platform-gated assertions for np.@long, np.@ulong, np.intp, np.uintp, np.int_, np.@uint, np.@int.
  • np.dtype string parser: DTypeStringParityTests (319 LoC) covers every NumPy code. DTypePlatformDivergenceTests (166 LoC) verifies 'l' / 'L' / 'int' platform-dependent behavior.
  • Complex64 refusal: Complex64RefusalTests (116 LoC) locks in that np.complex64 / np.csingle throw rather than silently widen.
  • Scalar cast matrix: NDArrayScalarCastTests (384 LoC) exercises the 15×15 0-d cast grid + Complex→non-complex TypeError.
  • Cross-type fills: UnmanagedMemoryBlockAllocateTests (226 LoC) for Allocate(Type, long, object) with Half/Complex fills from int/double.
  • Linux/macOS leg validation — at least one non-Windows CI machine must exercise the C-long platform-detection branches (_cLongType, _cULongType) since Windows CI locks in the 32-bit LLP64 convention.
  • Breaking changes review: 9 behavior-affecting changes (np.byte, np.complex64/np.csingle throw, np.uint, np.intp, np.int_, shift TypeError, index IndexError, repeat TypeError, complex cast TypeError) documented in the release notes. Downstream consumers (TensorFlow.NET, ML.NET integrations, Gym.NET, Pandas.NET) should review.

Remaining work / follow-ups

  • np.finfo("float16") / np.finfo("half"): the np.finfo(NPTypeCode) constructor still needs Half in its internal IsFloatType fast path; string parsing already resolves correctly.
  • Missing functions (pre-existing, not in scope for this PR): np.sort, np.flip family, np.split family, np.diag/np.diagonal/np.trace, np.diff/np.gradient/np.ediff1d, np.pad, np.tile, np.round (alias).
  • np.where was added to master ([API] Support np.where via ILKernelGenerator #606) after this branch was created; this branch is now rebased on top of that change, so no action is needed here.

Nucs added 30 commits April 21, 2026 19:24
…type support

Core type system infrastructure for three new NumPy-compatible data types:

NPTypeCode enum:
- SByte = 5 (matches TypeCode.SByte)
- Half = 16 (new value for System.Half)
- Complex = 128 (existing, now implemented in switches)

Updated extension methods:
- GetTypeCode, AsType, SizeOf, IsRealNumber, IsUnsigned, IsSigned
- GetGroup, GetPriority, ToTypeCode, ToTYPECHAR, AsNumpyDtypeName
- GetAccumulatingType, GetDefaultValue, GetOneValue
- IsFloatingPoint, IsInteger, IsSimdCapable, IsNumerical

Memory management:
- UnmanagedMemoryBlock: FromArray and Allocate
- ArraySlice: Scalar and all Allocate overloads
- UnmanagedStorage: typed fields and SetInternalArray

Type properties:
- SByte: 1 byte, signed integer, SIMD capable, "int8"
- Half: 2 bytes, floating point, not SIMD capable, "float16"
- Complex: 16 bytes, real number, not SIMD capable, "complex128"

Special handling:
- Half and Complex don't implement IConvertible
- Conversions use intermediate double for Half
- Complex uses direct cast or constructs from real

Note: Many DefaultEngine operations still need switch statement updates.
See docs/NEW_DTYPES_IMPLEMENTATION.md for remaining work.
…tters

Updated UnmanagedStorage.Getters.cs:
- GetValue(int[]): Added SByte, Half, Complex cases
- GetValue(long[]): Added SByte, Half, Complex cases
- GetAtIndex: Added SByte, Half, Complex cases
- Added GetSByte, GetHalf, GetComplex direct getter methods
- Added long[] overloads for new types

Updated UnmanagedStorage.Setters.cs:
- SetAtIndex: Added SByte, Half, Complex cases

Note: Half uses direct cast, Complex uses System.Numerics.Complex.
Both work correctly for getting/setting values from storage.
…8) dtype support

Implements full support for three new NumPy-compatible data types:
- SByte (int8): NPTypeCode.SByte = 5, maps to np.int8
- Half (float16): NPTypeCode.Half = 16, maps to np.float16
- Complex (complex128): NPTypeCode.Complex = 128, maps to np.complex128

Core changes:
- Add conversion methods in Converts.Native.cs (ToSByte, ToHalf, ToComplex)
- Add CreateFallbackConverter in Converts.cs for Half/Complex (no IConvertible)
- Update UnmanagedMemoryBlock.Casting.cs to use typed generic CastTo path
- Add ToSByte/ToHalf array conversion methods in ArrayConvert.cs
- Create NDIterator.Cast.SByte/Half/Complex.cs for iteration support

Verified working:
- Array creation: np.array(new sbyte/Half/Complex[])
- np.zeros/ones/empty with NPTypeCode.SByte/Half/Complex
- dtype string parsing: np.dtype("int8"), np.dtype("float16"), np.dtype("complex128")
- Type conversion: arr.astype(NPTypeCode.SByte/Half/Complex)

Special handling:
- Half: Doesn't implement IConvertible, conversions go through double
- Complex: Doesn't implement IConvertible or IComparable, excluded from unique/clip/randint
- SByte: Full parity with byte, SIMD possible but uses fallback path

ILKernelGenerator files use fallback paths (functional but not SIMD optimized).

Closes #567 (int8), #568 (float16), partially addresses #569 (complex128)
…c and reductions

This commit enables full arithmetic operations and basic reductions for the
three new NumPy-compatible types: SByte (int8), Half (float16), Complex (complex128).

Key changes:

np.cs:
- Added type aliases: np.int8, np.sbyte, np.float16, np.half

np.find_common_type.cs:
- Added all type promotion entries for SByte, Half, Complex
- Both arr_arr and arr_scalar tables updated (~80 new entries)
- Follows NumPy 2.x promotion rules

ILKernelGenerator.cs:
- GetTypeSize: Added SByte=1, Half=2, Complex=16
- GetClrType: Added mappings for all three types
- CanUseSimd: SByte is SIMD capable; Half/Complex are not
- EmitLoadIndirect/EmitStoreIndirect: Added SByte (Ldind_I1/Stind_I1),
  Half/Complex (Ldobj/Stobj)
- EmitConvertTo: Added SByte (Conv_I1) + EmitHalfOrComplexConversion
- EmitScalarOperation: Added EmitHalfOperation, EmitComplexOperation
  - Half: converts to double, performs op, converts back
  - Complex: uses System.Numerics.Complex operator methods
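The "convert to double, operate, convert back" pattern for Half can be imitated in Python with the `struct` module's binary16 format. This is a sketch of the rounding behavior only, not of the emitted IL; `to_half` and `half_add` are hypothetical helpers, and `struct` raises `OverflowError` for magnitudes beyond the half range instead of returning infinity.

```python
import struct


def to_half(x):
    """Round a Python float (a double) to the nearest binary16 value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def half_add(a, b):
    # Widen both operands to double, add, then round the result back to binary16.
    return to_half(to_half(a) + to_half(b))


# Above 2048 the spacing between adjacent halves is 2, so adding 0.5 is lost.
print(half_add(1.0, 2.0), half_add(2048.0, 0.5), to_half(0.1))
```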

ArraySlice.cs:
- Fixed Scalar() methods to handle Half/Complex which don't implement IConvertible
- Uses pattern matching to preserve type when value is already correct type

DefaultEngine.ReductionOp.cs:
- Added SByte to sum_elementwise_il switch
- Added SumElementwiseHalfFallback() - iterator-based for Half
- Added SumElementwiseComplexFallback() - iterator-based for Complex

Verified working:
- Array creation: np.array(new sbyte[]/Half[]/Complex[])
- Arithmetic: sbyte+sbyte, half+half, complex+complex
- Type conversion: byte->sbyte, byte->half, byte->complex
- np.sum() for all three types
Complete the ILKernelGenerator support for the three new dtypes:
- SByte (int8): SIMD-capable, same patterns as Byte for most ops
- Half (float16): Scalar path via double conversion
- Complex (complex128): Scalar path, special handling for abs/sign

ILKernelGenerator.Reduction.cs:
- EmitLoadZero: Add SByte (Ldc_I4_0), Half (Half.Zero), Complex (Complex.Zero)
- EmitLoadOne: Add SByte (Ldc_I4_1), Half (via double conversion), Complex (Complex.One)
- EmitLoadMinValue: Add SByte (sbyte.MinValue), Half (Half.NegativeInfinity)
- EmitLoadMaxValue: Add SByte (sbyte.MaxValue), Half (Half.PositiveInfinity)
- Complex throws NotSupportedException for Min/Max (no comparison operators)

ILKernelGenerator.Reduction.Axis.cs:
- CreateAxisReductionKernel: Add SByte to SIMD dispatch path
- ReadAsDouble: Add SByte, Half, Complex (uses Real part)
- WriteFromDouble: Add SByte, Half, Complex
- ConvertToDouble: Add SByte, Half, Complex
- ConvertFromDouble: Add SByte, Half, Complex

ILKernelGenerator.Unary.Math.cs:
- EmitAbsCall: Add SByte (bitwise like Int16), Half (via Math.Abs), Complex (magnitude)
- EmitSignCall: Add SByte (comparison pattern), Half (NaN-safe), Complex (unit vector z/|z|)
- EmitConvertFromInt: Add SByte (Conv_I1), Half (via double), Decimal, Complex
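The unit-vector definition of Complex sign (z/|z|) mentioned above looks like this in Python. It is a minimal sketch for finite inputs; returning 0 for a zero input is an assumption of this sketch, and infinities and NaN are not handled.

```python
def complex_sign(z):
    """Return z / |z| for finite non-zero z, and 0 for z == 0 (assumed convention)."""
    magnitude = abs(z)
    if magnitude == 0:
        return 0j
    return z / magnitude


print(complex_sign(3 + 4j), complex_sign(0j), complex_sign(-2 + 0j))
```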
…f/Complex

Fixes:
- Half.NaN, Half.Zero, Half.PositiveInfinity, Half.NegativeInfinity are
  static **properties**, not fields - changed GetField to GetProperty().GetGetMethod()
- Complex abs IL was invalid due to extra Ldloca - simplified to just call Complex.Abs
- Added CachedMethods for all Half/Complex reflection lookups to avoid
  repeated reflection at kernel generation time

CachedMethods added:
- HalfToDouble, DoubleToHalf (op_Explicit conversion methods)
- HalfIsNaN, HalfNaN, HalfZero, HalfPositiveInfinity, HalfNegativeInfinity
- ComplexAbs, ComplexDivisionByDouble, ComplexZero, ComplexOne, ComplexCtor

Verified NumPy parity:
- SByte: sum, abs, sign all match NumPy
- Half: sum(NaN)=NaN, abs, sign(NaN)=NaN all match NumPy
- Complex: sum, sign (unit vector) match NumPy
  (abs returns Complex(magnitude,0) instead of float64 - type preservation limitation)
- Axis reductions: sum(axis), mean(axis) all match NumPy
NumPy behavior: np.abs(complex_array) returns a float64 array containing
the magnitudes, not a complex array.

Before: np.abs([1+2j, -3+4j]) → [Complex(2.236,0), Complex(5,0)]
After:  np.abs([1+2j, -3+4j]) → [2.236, 5.0] (dtype=float64)

Implementation:
- DefaultEngine.Abs() now detects Complex input and calls ExecuteComplexAbs()
- ExecuteComplexAbs() uses iterator-based approach to compute Complex.Abs()
  for each element, storing double results
- The IL kernel for Complex abs is bypassed since type changes

Verified with Python:
>>> np.abs(np.array([1+2j, -3+4j, 0+0j, 5+0j])).dtype
dtype('float64')
…or fallback

Moved Complex→Double abs logic into ILKernelGenerator where it belongs:

ILKernelGenerator.Unary.cs / ILKernelGenerator.Scalar.cs:
- Added special case for Complex abs that calls Complex.Abs directly
- Emits Complex.Abs (returns double magnitude) instead of converting
  Complex→Double (which extracts Real part) then calling Math.Abs

DefaultEngine.UnaryOp.cs:
- Added SByte, Half, Complex to scalar dispatch (uses Storage.Get* methods)
- Added SByte, Half, Complex to output type dispatch

Default.Abs.cs:
- Removed iterator-based ExecuteComplexAbs fallback
- Now delegates to ExecuteUnaryOp with output type = Double
- IL kernel handles the Complex→Double type change

This is the proper fix - abs goes through the IL kernel path like all other
unary operations, with special handling for Complex input type.
Add typed accessor methods for the three new dtypes:
- GetSByte/SetSByte (int[] and long[] overloads)
- GetHalf/SetHalf (int[] and long[] overloads)
- GetComplex/SetComplex (int[] and long[] overloads)

These methods are added to both UnmanagedStorage and NDArray (wrapper).

Also includes scalar extraction support in DefaultEngine for binary
and unary operations using the new Get* methods.
Add 109 new tests covering:
- Basic operations (create, zeros, ones) - 12 tests
- Arithmetic (add, multiply, negate, divide) - 11 tests
- Reductions (sum, prod, mean, min, max, std, var, argmax, argmin) - 24 tests
- Unary operations (abs, sign, sqrt, floor, ceil, exp, log, sin) - 19 tests
- Cumulative operations (cumsum, cumprod) - 6 tests
- Type promotion (mixed type operations) - 6 tests
- Edge cases (NaN, infinity, all/any, count_nonzero, broadcasting, slicing) - 16 tests
- Comparison (equal, less than, astype conversions, power) - 15 tests

68 tests pass, 41 marked [OpenBugs] for operations not yet supported:
- Half/Complex IL kernel comparisons
- Half/Complex unary math functions (sqrt, exp, log, sin, floor, ceil)
- Complex multiply/negate
- SByte/Half/Complex min/max/argmin/argmax
- SByte/Half/Complex all/any
- SByte/Complex dot
- Complex mean/std/axis reductions
- Half/Complex astype conversions
- SByte power

All expected values verified against NumPy 2.x.
IL Kernel Fixes:
- Add EmitUnaryComplexOperation for Complex: Negate, Sqrt, Exp, Log, Sin, Cos, Tan, Square, Reciprocal, Sign
- Add EmitUnaryHalfOperation for Half: Negate, Abs, Sqrt, Sin, Cos, Tan, Exp, Log, Floor, Ceil, Truncate, Square, Reciprocal, Sign, IsNaN, IsInf, IsFinite
- Add cached methods for Complex/Half operators and math functions
- Fix Half ArgMax/ArgMin to use helper method (IL comparison doesn't work for Half)
- Add ComplexSignHelper (returns unit vector z/|z|)
- Add HalfSignHelper (handles NaN properly)

Reduction Fixes:
- Add SByte/Half to All/Any dispatch in DefaultEngine
- Add Half/Complex special handlers in Default.All.cs and Default.Any.cs
- Add SByte/Half ArgMax/ArgMin dispatch in DefaultEngine.ReductionOp.cs

Test Status:
- 90 NewDtypes tests pass (excluding OpenBugs)
- 15 OpenBugs remain for: Mean, Std, NaN-aware reductions, Dot, Power, CumProd, ArgMax/ArgMin by magnitude, type promotion, infinity operations

Fixed tests (removed [OpenBugs]):
- SByte: Min, Max, ArgMax, ArgMin, All_Any
- Half: Sqrt, Floor, Ceil, Exp, Sin, ArgMax, ArgMin, All_Any, Sign
- Complex: Negate, Sqrt, Exp, Log, All_Any, Equal, LessThan, NaN_Comparisons, Sign
- Add Complex and Half cases to EmitReductionCombine for Sum/Prod
- Add EmitHalfBinaryOp helper for Half arithmetic in reductions
- Add cached ComplexOpAddition and ComplexOpMultiply methods
- Add (Half,Half) and (Complex,Complex) axis reduction kernel paths
- Fix CombineScalarsPromoted to preserve Complex imaginary part
- Fix DivideByCount and GetIdentityValueTyped for Half/Complex
- Add Complex handling in mean_elementwise_il to return Complex
- Add Complex handling in std/var fallbacks using |x-mean|^2

This fixes Complex_Mean, Complex_Std, Complex_Sum_Axis tests.
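The |x-mean|^2 formulation for Complex variance can be sketched as follows. It assumes population variance (divide by n), which is NumPy's default; the function names are illustrative. Note that the mean is complex while the variance and standard deviation are real.

```python
import math


def complex_mean(values):
    return sum(values) / len(values)


def complex_var(values):
    """Population variance of complex values: mean of |x - mean|^2 (a real number)."""
    mean = complex_mean(values)
    return sum(abs(v - mean) ** 2 for v in values) / len(values)


def complex_std(values):
    return math.sqrt(complex_var(values))


data = [1 + 1j, 3 + 3j]
print(complex_mean(data), complex_var(data), complex_std(data))
```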
Replace GetMethod("op_Explicit", ...) with CachedMethods.HalfToDouble
and CachedMethods.DoubleToHalf to avoid AmbiguousMatchException when
Half has multiple op_Explicit methods with same parameter but different
return types.

Fixes Half_Plus_Complex_PromotesToComplex test.
…nd type conversion

Cumulative operations:
- Add Complex fallback path in cumsum_elementwise_fallback using Complex accumulator
- Add Complex fallback path in cumprod_elementwise_fallback using Complex accumulator

Mean operation:
- Handle Half separately in mean_elementwise_il to preserve float16 dtype
- Update ReduceMean output type logic to preserve Half/Single/Double/Complex
- NumPy 2.x parity: mean(float16) returns float16, not float64

Type conversion:
- Add Half case in generic ChangeType<T> for converting any type to Half
- Add Half/Complex source type handling in non-generic ChangeType for Double/Single/Decimal targets
- Fixes InvalidCastException when Half doesn't implement IConvertible

Tests now passing: Complex_CumProd, Half_Mean, Half_Std, and more cumulative tests
- Use direct float->Half cast instead of going through double
- Keep axis mean output as Double for compatibility with axis reduction kernels
- Element-wise mean still preserves dtype per NumPy 2.x (float32->float32)
- Update test to reflect element-wise mean dtype preservation
…N reductions

## Complex ArgMax/ArgMin (magnitude-based)
- Add ArgMaxComplexHelper/ArgMinComplexHelper in ILKernelGenerator.Reduction.Arg.cs
- Add Complex case to EmitArgMaxMinSimdLoop and EmitReductionScalarLoop
- Add NPTypeCode.Complex cases to argmax/argmin_elementwise_il

## IsInf implementation
- Implement Default.IsInf using IL kernel (was returning null)
- Add ComplexIsNaNHelper, ComplexIsInfinityHelper, ComplexIsFiniteHelper
- Add IsNan/IsInf/IsFinite cases to EmitUnaryComplexOperation

## Half NaN-aware reductions
- Add NPTypeCode.Half to NanSum/NanProd/NanMin/NanMax type checks
- Add NanSumHalfHelper, NanProdHalfHelper, NanMinHalfHelper, NanMaxHalfHelper
- Add NanReduceScalarHalf fallback method

## Tests fixed (removed [OpenBugs])
- Complex_ArgMax_ByMagnitude, Complex_ArgMin_ByMagnitude
- Half_Infinity_Operations, Complex_Infinity_Operations
- Half_NanSum, Half_NanMin

OpenBugs: 58 -> 27 failing
Updated Default.Dot to pass through the input dtype to ReduceAdd
instead of allowing it to promote to Int64. This matches NumPy
behavior where np.dot for int8 arrays returns int8.

Tests fixed (removed [OpenBugs]):
- SByte_Dot - now preserves SByte dtype
- SByte_Power - was already working, removed stale attribute
Updated CreateFallbackConverter to use Converts.ToXxx methods with
unchecked wrapping instead of Convert.ChangeType which throws
OverflowException on integer overflow.

This ensures NDArray.astype() matches NumPy behavior for:
- Integer-to-integer overflow wrapping (e.g., int64(-1) -> uint32 = 4294967295)
- Float special values (inf/nan) -> integer (returns 0 or min value)
- Half/Complex -> integer conversions

The fallback converter now dispatches to the appropriate Converts.ToXxx
method based on output type, using integer path for integer inputs and
double path for float inputs.
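Integer-to-integer wrapping of the kind described above is plain modular arithmetic on the target width. A minimal Python sketch, with hypothetical helper names:

```python
def wrap_unsigned(value, bits):
    """Reduce an integer modulo 2**bits, as an unsigned target of that width would."""
    return value & ((1 << bits) - 1)


def wrap_signed(value, bits):
    """Same reduction, reinterpreted as a two's-complement signed value."""
    unsigned = wrap_unsigned(value, bits)
    if unsigned >= 1 << (bits - 1):
        return unsigned - (1 << bits)
    return unsigned


# int64(-1) -> uint32, the example from the commit message.
print(wrap_unsigned(-1, 32), wrap_signed(128, 8), wrap_signed(255, 8))
```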
Fixed type promotion tables for float16 (Half) to match NumPy behavior:

arr_arr table (array + array):
- float16 + int16/uint16 → float32 (was float16)
- float16 + int32/uint32/int64/uint64 → float64 (was float32)
- int16/uint16 + float16 → float32 (symmetric)
- int32/uint32/int64/uint64 + float16 → float64 (symmetric)

arr_scalar table (array + scalar):
- int_arr + float16_scalar → promotes (float16 is strongly typed)
- float16_arr + int_scalar → float16 (C# int is weakly typed like Python)

Also fixed CreateFallbackConverter to use Converts.ToXxx methods with
unchecked wrapping instead of Convert.ChangeType for NumPy-compatible
integer overflow behavior.

All 121 arr+arr type pairs verified against NumPy 2.x.
All 5687 tests passing.
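The arr_arr entries listed above for float16 follow a simple width rule, which can be written as a small lookup. This restates only the float16-with-integer pairs given in this commit message (plus the int8/uint8 case from the later test list); it is not the full promotion table, and the function name is hypothetical.

```python
INT_BITS = {
    "int8": 8, "uint8": 8,
    "int16": 16, "uint16": 16,
    "int32": 32, "uint32": 32,
    "int64": 64, "uint64": 64,
}


def promote_float16_with(int_dtype):
    """Result dtype of float16 array + integer array, per the entries listed above."""
    bits = INT_BITS[int_dtype]
    if bits <= 8:
        return "float16"   # 8-bit integers fit in float16 precision
    if bits <= 16:
        return "float32"   # 16-bit integers need float32
    return "float64"       # 32- and 64-bit integers go to float64


for name in INT_BITS:
    print(f"float16 + {name} -> {promote_float16_with(name)}")
```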
Complex comparisons:
- Add lexicographic ordering for Complex comparisons (< > <= >=)
- NumPy 2.x compares Complex first by real part, then imaginary part
- Add ComplexLessThanHelper, ComplexLessEqualHelper, etc. helper methods
- Add SByte, Half, Complex to scalar comparison dispatch
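Lexicographic ordering (real part first, then imaginary part) maps directly onto tuple comparison in Python. The helpers below are a sketch for finite values only; NaN handling is not modeled, and the names merely echo the helper names in the commit message.

```python
def complex_less_than(a, b):
    """a < b, comparing the real part first and the imaginary part on ties."""
    return (a.real, a.imag) < (b.real, b.imag)


def complex_less_equal(a, b):
    return (a.real, a.imag) <= (b.real, b.imag)


print(complex_less_than(1 + 5j, 2 + 0j))   # real parts differ: 1 < 2
print(complex_less_than(1 + 1j, 1 + 2j))   # real parts tie: compare imaginary parts
```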

Array creation functions:
- Fix np.ones for Half and SByte by adding explicit cases
- Fix np.full and np.full_like to use NPTypeCode overload of ChangeType
  instead of TypeCode cast (which doesn't support Half/Complex)
- Fix ArraySlice.Allocate to handle when fill value is already Half
  (Half doesn't implement IConvertible)

Type promotion:
- Fix float16 + int16/uint16 to return float32 (was returning float16)
- Matches NumPy 2.x behavior where larger integers require more float precision
This commit brings NumSharp's type conversion behavior to 100% parity
with NumPy 2.x astype() semantics.

Key changes:

1. Float → Unsigned Integer (the critical fix):
   - BEFORE: Negative floats returned 0 (WRONG)
   - AFTER: Truncate toward zero, then wrap modularly (NumPy behavior)
   - Examples: -1.0→uint8 = 255, -3.7→uint8 = 253

2. Float → Integer truncation:
   - Truncation toward zero (not rounding to nearest)
   - 3.7 → 3, -3.7 → -3, 0.9 → 0, -0.9 → 0

3. NaN/Inf → Integer special values:
   - int8/int16: returns 0
   - int32: returns int.MinValue (-2147483648)
   - int64: returns long.MinValue
   - uint8/uint16/uint32: returns 0
   - uint64: returns 2^63 (9223372036854775808)

4. Integer overflow wrapping:
   - Values outside target range wrap modularly
   - Examples: 256.0→uint8 = 0, 1000.0→uint8 = 232

5. Bool conversion:
   - 0 → False, nonzero → True
   - NaN → True, Inf → True (any nonzero is True)

Updated tests to expect NumPy-compatible behavior instead of the
previous IConvertible rounding semantics.

Files modified:
- Converts.Native.cs: Core float→int conversion methods
- Converts.cs: Added System.Numerics using for Complex
- Test files: Updated expectations to match NumPy behavior

Verified against NumPy 2.x output for all edge cases.
NumPy uses int32 as intermediate type when converting floats to byte.
- Values outside int32 range overflow to 0
- Half always fits in int32 range, so simplified path

Verified edge cases match NumPy:
- 1e30 -> uint8: 0 (overflow)
- 256.0 -> uint8: 0 (wraps)
- 1000.0 -> uint8: 232 (wraps)
- -3.7 -> uint8: 253 (truncate+wrap)
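The float → uint8 rule described in these commits (NaN/Inf → 0, truncate toward zero, go through an int32 intermediate, then wrap) can be sketched in Python and checked against the edge cases listed above. The function name is hypothetical; this mirrors the described behavior, not the actual Converts code.

```python
import math

INT32_MIN, INT32_MAX = -(2**31), 2**31 - 1


def float_to_uint8(x):
    """Sketch of the described float -> uint8 conversion."""
    if math.isnan(x) or math.isinf(x):
        return 0                      # special values map to 0 for small targets
    truncated = math.trunc(x)         # toward zero, not round-to-nearest
    if truncated < INT32_MIN or truncated > INT32_MAX:
        return 0                      # outside the int32 intermediate: 0
    return truncated & 0xFF           # wrap modulo 256


for value in (1e30, 256.0, 1000.0, -3.7, -1.0):
    print(value, "->", float_to_uint8(value))
```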
Add 43 new tests covering all NumPy-compatible dtype conversion behaviors:

- Float to integer truncation (toward zero, not round)
- Negative float to unsigned integer (truncate then wrap)
- Positive float overflow wrapping for small types
- Float outside int32 range returns 0 for small types
- NaN/Inf special handling for all integer types
- Half (float16) conversions
- Integer-to-integer wrapping and narrowing
- Bool conversions (0→False, nonzero→True, NaN→True)
- Complex number conversions (discard imaginary)
- NDArray.astype() integration tests

All expected values verified against NumPy 2.x output.
Test count: 5687 → 5730 (+43)
Complex comparisons (lexicographic ordering):
- Complex_LessThan_Lexicographic: verifies c1 < c2 uses lexicographic order
- Complex_GreaterThan_Lexicographic: verifies c1 > c2
- Complex_LessEqual_Lexicographic: verifies c1 <= c2
- Complex_GreaterEqual_Lexicographic: verifies c1 >= c2
All verified against NumPy 2.x behavior

Type promotion tests:
- Half_Plus_Int16_PromotesToFloat32: float16 + int16 = float32
- Half_Plus_UInt16_PromotesToFloat32: float16 + uint16 = float32
- Half_Plus_Int8_StaysHalf: float16 + int8 = float16 (int8 fits)
- Half_Plus_Int32_PromotesToFloat64: float16 + int32 = float64

Array creation tests:
- SByte_Ones, Half_Ones, Complex_Ones: np.ones with new dtypes
- SByte_Full, Half_Full, Complex_Full: np.full with new dtypes

All expected values verified against NumPy 2.x
Add DtypeConversionMatrixTests.cs with 70 tests covering all 12 source types
converting to all 12 target types, verified against NumPy 2.4.2 output.

Coverage includes:
- Bool: 2 values × 12 targets (False, True)
- Int8: 5 values × 12 targets (0, 1, -1, 127, -128)
- UInt8: 5 values × 12 targets (0, 1, 127, 128, 255)
- Int16: 5 values × 12 targets (0, 1, -1, 32767, -32768)
- UInt16: 5 values × 12 targets (0, 1, 32767, 32768, 65535)
- Int32: 5 values × 12 targets (0, 1, -1, MAX, MIN)
- UInt32: 5 values × 12 targets (0, 1, 2147483647, 2147483648, MAX)
- Int64: 5 values × 12 targets (0, 1, -1, MAX, MIN)
- UInt64: 5 values × 12 targets (0, 1, INT64_MAX, INT64_MAX+1, MAX)
- Float32: 8 values × 12 targets (0, 1, -1, 3.7, -3.7, NaN, +Inf, -Inf)
- Float64: 8 values × 12 targets (same as Float32)
- Half: 8 values × 12 targets (same as Float32)

Edge cases covered:
- Integer wrapping (signed ↔ unsigned, wider → narrower)
- Float truncation toward zero (3.7 → 3, -3.7 → -3)
- Negative float → unsigned wrapping (-1.0 → 255 for uint8)
- NaN/Inf special handling per target type
- Float16 precision limits (65535 → inf, INT32_MAX → inf)

Total dtype conversion tests: 113 (43 parity + 70 matrix)
Expand DtypeConversionMatrixTests.cs with 16 more tests covering:

Large float edge cases:
- float64(1e10) → int32 returns MIN_VALUE (not 0)
- float64(1e10) → uint32 wraps to 1410065408 (not 0)
- float64(-1e10) → uint32 wraps to 2884901888
- float64(1e19/1e20) → int64/uint64 overflow behavior

Exact boundary tests:
- float64 at int8 boundaries (127, 128, -128, -129)
- float64 at uint8 boundaries (255, 256)
- Small fractions (0.1, 0.999999) all truncate to 0

NumSharp-specific type coverage:
- Char → int32/uint8 (ASCII values)
- int → Char (uses low bits)
- Complex → float64/int32/bool (takes real part, pure imaginary is truthy)
- Decimal → float64/int32 (preserves/truncates)

All edge case values verified against NumPy 2.4.2 output.

Total tests: 86 matrix + 43 parity = 129 dtype conversion tests
Ensure all 144 source→target dtype combinations are tested:
- Add Half (float16) as target to all Float32/Float64/Float16 NaN/Inf tests
- Add Half, float32, float64 targets to all fractional float tests
- Add separate Xxx_ToHalf tests for all integer types (Bool, Int8-64, UInt8-64)
- Add Float16_ToFloat32, Float16_ToFloat64 tests
- Add Float32_ToHalf, Float32_NaNInf_ToHalf tests
- Add Float64_ToHalf, Float64_NaNInf_ToHalf tests

Coverage matrix (12 source × 12 target types):
- Bool: 2 values → 12 targets ✓
- Int8/UInt8: 5 values each → 12 targets ✓
- Int16/UInt16: 5 values each → 12 targets ✓
- Int32/UInt32: 5 values each → 12 targets ✓
- Int64/UInt64: 5 values each → 12 targets ✓
- Float16/32/64: 8 values each → 12 targets ✓

Total dtype conversion tests: 146 (103 matrix + 43 parity)
All conversions verified against NumPy 2.4.2
Complex as SOURCE (5 values × 12 targets = 60 conversions):
- Complex_Zero_ToAllTypes: 0+0j → all 12 types
- Complex_One_ToAllTypes: 1+0j → all 12 types
- Complex_NegativeOne_ToAllTypes: -1+0j → all 12 types (wrapping)
- Complex_Fractional_ToAllTypes: 3.7+4.2j → all 12 types
- Complex_PureImaginary_ToAllTypes: 0+1j → all 12 types

Complex as TARGET (8 source types):
- Bool_ToComplex: False→0+0j, True→1+0j
- Int8_ToComplex: sbyte values → Complex
- UInt8_ToComplex: byte values → Complex
- Int32_ToComplex: int values → Complex
- Float32_ToComplex: float values → Complex
- Float64_ToComplex: double values → Complex
- Float64_NaNInf_ToComplex: NaN/Inf → Complex(NaN/Inf, 0)
- Half_ToComplex: Half values → Complex

All conversions verified against NumPy 2.4.2:
- Real part extraction for numeric targets
- Bool considers magnitude (pure imaginary is True)
- Integer targets truncate real part
- Unsigned targets wrap negative values
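The four rules above can be summarized in a short Python sketch using the same sample values as the tests (3.7+4.2j, 0+1j, -1+0j). Function names are hypothetical, and overflow and NaN/Inf handling are omitted.

```python
import math


def complex_to_float(z):
    return z.real                      # real part extraction


def complex_to_bool(z):
    return z != 0                      # any non-zero magnitude is True


def complex_to_int(z):
    return math.trunc(z.real)          # truncate the real part toward zero


def complex_to_uint8(z):
    return math.trunc(z.real) & 0xFF   # negative real parts wrap modulo 256


print(complex_to_float(3.7 + 4.2j), complex_to_int(3.7 + 4.2j))
print(complex_to_bool(1j), complex_to_uint8(-1 + 0j))
```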

Total dtype conversion tests: 155 (112 matrix + 43 parity)
Fixes 7 bugs in Converts.ToXxx where the paths were either throwing
InvalidCastException/OverflowException instead of wrapping modularly as
NumPy does, or silently emitting inconsistent values for special inputs.

Since Char is 16-bit unsigned, its conversion semantics now mirror
NumPy's uint16 behavior. Decimal (no NumPy equivalent) follows the same
wrapping + NaN/Inf->0 pattern as the small integer types.

Fixes
-----
Bug #1: ToChar(object) no longer throws for Half/Complex/bool. Replaced
the naive ((IConvertible)value).ToChar(null) call with a full switch
expression covering all 15 NumSharp types, matching the ToByte(object)
pattern.

Bug #2: ToDecimal(object) no longer throws for Complex/Half. Added a
switch expression routing Complex (real part only), Half, and all other
types through their typed ToDecimal overloads.

Bug #3: ToChar(bool) no longer throws. Previously called
((IConvertible)value).ToChar(null) which .NET rejects for bool. Now
returns 1 for true, 0 for false.

Bug #4: ToChar(float/double/decimal) no longer throws. Previously
delegated to IConvertible which rejects all three. Now follows the
NumPy small-integer pattern: NaN/Inf -> 0, out-of-int32-range -> 0,
otherwise truncate toward zero and wrap via (char)(ushort)(int).
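
A rough Python model of the small-integer pattern described for Bug #4, returning the UTF-16 code unit as an int. The function name `to_char_code` is hypothetical, and applying the int32 range check to the truncated integer (rather than to the raw double) is an assumption of this sketch.

```python
import math

INT32_MIN, INT32_MAX = -2**31, 2**31 - 1

def to_char_code(value: float) -> int:
    # NaN and +/-Inf map to 0.
    if math.isnan(value) or math.isinf(value):
        return 0
    n = math.trunc(value)            # truncate toward zero
    if n < INT32_MIN or n > INT32_MAX:
        return 0                     # outside int32 range -> 0
    return n & 0xFFFF                # wrap like (char)(ushort)(int)
```

With this model, 65.9 maps to 65, -1.0 wraps to 65535, and 1e20 collapses to 0.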

Bug #5: ToChar(Half) now checks NaN/Inf before casting. Previously a
raw (char)(ushort)value cast produced inconsistent values (Half.NaN
-> 0, Half.PositiveInfinity -> 65535). Now both special cases return 0,
matching all other Half->integer methods in this file.

Bug #6: ToByte/UInt16/UInt32/UInt64(decimal) now wrap modularly for
negative values instead of throwing OverflowException. This also fixes
ToSByte/Int16/Int32/Int64(decimal) for out-of-range values, which
previously threw. All decimal->integer conversions now route through
decimal.Truncate + intermediate int/long cast to match the float->int
behavior already in place.

Bug #7: ToDecimal(float/double) now returns 0m for NaN/Inf and for
values exceeding decimal's range (~+/-7.9e28) instead of throwing
OverflowException. This also adjusts ToDecimal(Half) for consistency
(NaN/Inf -> 0) and ToDecimal(Complex) to route through the same
double->decimal path.

Notable ripple effects
----------------------
decimal->int16/int32/int64 overflow now returns MinValue (matching
NumPy's float->intN overflow convention) instead of throwing. This is
new behavior, but no existing test relied on the old throw path.

All existing ToXxx(decimal) for signed types remain backward
compatible for in-range values.

Tests
-----
Added test/NumSharp.UnitTest/Casting/ConvertsBattleTests.cs with 45
tests covering each bug at both the scalar Converts.ToXxx layer and
the array astype() layer. Tests verify:

- bool/float/double/decimal/Half/Complex -> Char produce correct
  truncated + wrapped values with NaN/Inf -> 0
- negative decimal -> byte/uint16/uint32/uint64 produce wrapped values
  (e.g. -1m -> 255, 65535, 4294967295, ulong.MaxValue)
- double NaN/Inf/overflow -> Decimal produce 0m (not throw)
- Half NaN/Inf -> Char produce 0 (not 65535)
- ToChar(object) and ToDecimal(object) handle all 15 types
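
The negative decimal -> unsigned expectations listed above follow from truncate-then-wrap. A minimal Python sketch, assuming the truncated value fits the intermediate integer (overflow of that intermediate is not modeled); `decimal_to_unsigned` is a hypothetical name:

```python
from decimal import Decimal

def decimal_to_unsigned(value: Decimal, bits: int) -> int:
    # int() on a Decimal truncates toward zero, like decimal.Truncate.
    n = int(value)
    # Modular wrap into [0, 2**bits).
    return n % (1 << bits)
```

Calling it with `Decimal(-1)` and 8, 16, 32, or 64 bits reproduces the 255, 65535, 4294967295, and ulong.MaxValue values quoted in the test list.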

Test result: 5901 passed / 0 failed / 11 skipped on both net8.0 and
net10.0. OpenBugs category unaffected (48 failures, same as before).
…ngeType

Second round of audit found 12 more bugs across five groups, all
missed by Round 1. All fixed via TDD. Scalar/object/array paths now
have 105/105 consistency on the 15-source x 7-target matrix.

Groups and fixes
----------------

Group A: typed ToXxx(char) scalar methods
  .NET's IConvertible.ToBoolean/Single/Double/Decimal on char all
  throw InvalidCastException. Our typed methods delegated to
  IConvertible and inherited the breakage.

  - ToBoolean(char)  throws -> return value != 0
  - ToSingle(char)   throws -> return (float)value
  - ToDouble(char)   throws -> return (double)value
  - ToDecimal(char)  throws -> return (decimal)value

Group B: ToXxx(object) dispatchers missing char
  ToBoolean/ToSingle/ToDouble/ToHalf/ToComplex(object) had only
  Half/Complex if-checks before falling through to IConvertible,
  which throws for char. Refactored to full switch expressions
  matching the ToByte(object) pattern used by the integer targets.

  Each dispatcher now handles: Half, Complex, char, bool, all primitive
  numerics, decimal, plus IConvertible fallback for unknown types.

  Also consolidated the (object, IFormatProvider) overloads to
  delegate to the no-provider version since provider is only
  meaningful for string targets.

Group C: CreateFallbackConverter for Half/Complex output
  The lambda called ic.ToDouble(null) which throws for char source.
  Replaced both Half and Complex output lambdas to delegate to
  Converts.ToHalf((object)in) / Converts.ToComplex((object)in) which
  now handle char via Group B fix.

Group D: CreateDefaultConverter NaN/overflow safety
  Used Convert.ChangeType which doesn't handle Half.NaN or
  Complex.NaN (throws OverflowException for decimal target).
  Replaced with delegation to Converts.ChangeType(obj, NPTypeCode)
  which uses the NumPy-aware helpers.

Group E: ChangeType(object, NPTypeCode, IFormatProvider) at line 1141
  Still used raw IConvertible (Round 1 fixed only the 2-arg version).
  Threw ArgumentException for SByte/Half/Complex targets (missing
  from switch), threw InvalidCastException for Half/Complex source
  (Complex doesn't implement IConvertible), threw OverflowException
  for NaN (IConvertible doesn't do NumPy's MinValue pattern).
  Replaced with delegation to the 2-arg version, handling provider
  only for String target.

Internal helper fixes
---------------------
  ToDecimal_NumPy in Converts.cs was missing the Complex case, which
  caused array-path Complex -> decimal to go to IConvertible and
  throw (Complex doesn't implement it). Same issue in ToHalf_NumPy.
  Both helpers now include all 15 source types.

Tests
-----
Added 29 new tests to ConvertsBattleTests.cs:
  - 4 typed ToXxx(char) scalar tests
  - 6 ToXxx(object) char dispatcher tests
  - 6 array path char->target tests
  - 2 FindConverter char->Half/Complex tests
  - 3 CreateDefaultConverter NaN safety tests
  - 8 ChangeType(obj, tc, provider) tests covering Half/Complex/SByte
    targets, Half/Complex sources, and NaN -> MinValue/zero

Test totals: 74 battletests pass (up from 45), full suite passes
5930 / 0 fail / 11 skip on both net8.0 and net10.0 (up from 5901).

Final consistency: 105/105 on the 15-source x 7-target scalar-vs-
object matrix (was 102/105 before this round).
Nucs added 28 commits April 21, 2026 19:24
…lex/char

Three remaining paths in Converts that still used raw IConvertible casts,
causing throws for Half/Complex sources (neither implements IConvertible)
and char→Boolean (char's IConvertible.ToBoolean unsupported by the BCL).

Group A: String target in NumPy-aware ChangeType dispatchers
---------------------------------------------------------
ChangeType<TOut>(Object) at Converts.cs:159 and ChangeType(Object, NPTypeCode)
at Converts.cs:223 called ((IConvertible)value).ToString(InvariantCulture)
for String target. This threw InvalidCastException for Half/Complex sources.

Replaced with IFormattable-based conversion:
    value is IFormattable f ? f.ToString(null, InvariantCulture) : value.ToString()

All 15 NumSharp scalar types implement IFormattable (Boolean, Char, SByte,
Byte, Int16/32/64, UInt16/32/64, Single, Double, Decimal, Half, Complex),
so this covers every supported source. Fallback to value.ToString() preserves
contract for exotic types.

Group B: FindConverter routes through the fixed path
---------------------------------------------------
FindConverter<Half, string>() and FindConverter<Complex, string>() previously
threw because CreateFallbackConverter → CreateDefaultConverter routes through
Converts.ChangeType(obj, NPTypeCode.String), which is now fixed by Group A.

Group C: .NET-style ChangeType(Object, TypeCode, IFormatProvider)
---------------------------------------------------------------
Converts.Native.cs:62-120 two overloads (TypeCode, not NPTypeCode) used raw
IConvertible for every case. Broken for:
- Half source to any target: InvalidCastException (no IConvertible)
- Complex source to any target: InvalidCastException (no IConvertible)
- char source to Boolean: InvalidCastException (char's IConvertible.ToBoolean
  unsupported per BCL design)
- Double source to integer target: rounded (3.7 → 4) instead of truncating
  to 3 as NumPy does

Refactored to route through Converts.ToXxx(object) dispatchers which handle
Half/Complex/char sources and apply NumPy-parity semantics (truncation,
wrapping, NaN handling). DateTime target preserved on raw IConvertible since
DateTime is not a NumPy dtype. String target uses IFormattable as in Group A.

Result
------
All 7 conversion paths (typed scalar / object dispatcher / array astype /
FindConverter / Converts<T> / ChangeType<TOut>(Object) / ChangeType(Object,
TypeCode)) now consistently support every combination of the 15 source and
target types, or throw with the same error where intentional.

Tests
-----
+21 battletests in ConvertsBattleTests.cs (Round 4 region), total 118.
Full suite: 5974/0/11 on both net8.0 and net10.0 (was 5953 before, +21).
Zero regressions.

Notes
-----
DateTime target via ChangeType(Object, TypeCode) still uses raw IConvertible;
Half/Complex/char → DateTime therefore throws. DateTime is not a NumPy dtype
and not part of the scalar parity guarantee, so left as-is.

Half.NaN → Int32 remains MIN_VALUE across all 7 paths. This matches NumPy
2.x on x86-64 (implementation-defined behavior). Consistent everywhere.
…omplex

Two leftover sites from docs/plans/LEFTOVER.md (H1+H2 and H3) that broke
NumPy-aligned operations for Half and Complex sources:

H1+H2: ArraySlice.Allocate(*, count, fill) [src/Backends/Unmanaged/ArraySlice.cs]
-----------------------------------------------------------------------------
Two overloads (NPTypeCode-based at L404-426, Type-based at L479-501) cast
the fill value to IConvertible to invoke ToBoolean/ToSByte/.../ToDecimal.
This throws InvalidCastException when fill is Half or Complex.

The Half target case had a partial workaround:
    fill is Half h ? h : (Half)Convert.ToDouble(fill)
But Convert.ToDouble also routes through IConvertible internally, so a
Complex fill targeting Half still throws. Same for Half fill targeting
Complex (line 422/497).

Replaced all 26 IConvertible/Convert calls in both overloads with
Converts.ToXxx(fill) which handles all 15 dtypes via the object
dispatcher (NumPy-parity truncation/wrapping/NaN semantics).

H3: np.searchsorted [src/Sorting_Searching_Counting/np.searchsorted.cs]
---------------------------------------------------------------------
Three Convert.ToDouble(arr.Storage.GetValue(...)) sites (L51, L61, L85)
boxed array elements before conversion. Throws when array dtype is
Half or Complex.

Replaced with Converts.ToDouble which handles Half/Complex (Complex
discards imaginary, matching NumPy's ComplexWarning sort behavior).

Also added 'using NumSharp.Utilities;' for Converts access.

Tests
-----
+16 battletests in ConvertsBattleTests.cs (Round 5A region):

ArraySlice.Allocate (11):
- NPTypeCode_Int32_FillHalf, NPTypeCode_Double_FillHalf
- NPTypeCode_Int32_FillComplex_DiscardsImaginary
- NPTypeCode_Half_FillComplex_DiscardsImaginary
- NPTypeCode_Complex_FillHalf, NPTypeCode_Bool_FillComplex_NonZero
- NPTypeCode_Char_FillHalf, NPTypeCode_Int32_FillInt (regression)
- Type_Int32_FillHalf, Type_Half_FillComplex, Type_Complex_FillHalf

Searchsorted (5):
- HalfArray_FindsPosition, HalfArray_DoubleValue_FindsPosition
- ComplexArray_FindsPosition, HalfArray_MultipleValues_Works
- DoubleArray_FindsPosition (regression)

Total battletests: 134 (was 118 in Round 4, +16).
Full suite: 5990/0/11 on both net8.0 and net10.0 (was 5974, +16 from
new battletests + 0 net change from production).
Zero regressions.

Note
----
docs/plans/LEFTOVER.md updated to track Round 5A completion. Remaining
H4-H8 + M1-M4 sites from that doc are still TODO for Round 5B+.
…t sites

Completes the docs/plans/LEFTOVER_CONVERTS.md audit by fixing the remaining
20 sites across 9 files outside the Converts utility. All paths that boxed
values for System.Convert.ToXxx or cast to IConvertible now route through
Converts.ToXxx (object dispatcher) for full Half/Complex/char support.

5B: Math/BLAS/Convolve scalar fallbacks (7 sites)
-------------------------------------------------
- H4 Default.MatMul.2D2D.cs:323,329 — matmul scalar fallback now accepts Half.
  Complex preserves real part only (scalar fallback uses double accumulator —
  full Complex matmul needs separate accumulator path; documented as Misaligned).
- H5 Default.Dot.NDMD.cs:371,375 — dot product scalar fallback for Half.
- H6 NdArray.Convolve.cs:154,155 — convolve scalar path now boxes Half pointer
  derefs explicitly: Converts.ToDouble((object)aPtr[j]).
- H8 DefaultEngine.ReductionOp.cs:310 — mean of scalar Half via null-typeCode
  fallback no longer throws. Returns Double dtype (NumPy returns Float16 —
  separate dtype-decision issue, not in scope).

5C: Scan kernel scalar accumulators (13 sites)
----------------------------------------------
- H7 ILKernelGenerator.Scan.cs lines 1128, 1138, 1148, 1947-1987, 2392-2432.
  AxisCumProd/AxisCumSum/ElementwiseCumSum scalar accumulator paths box
  generic TIn* deref to call Converts.ToInt64/Double/Single/UInt64/Decimal.
  Enables np.cumsum / np.cumprod on Half (1D) and Complex arrays.
  Note: AxisCumSum on Half still throws "AxisCumSum not supported for type Half"
  earlier in dispatch — separate IL kernel issue, not in this fix's scope.

5D: Edge cases (7 sites)
------------------------
- M1 np.repeat.cs:75,172 — Half/Complex as repeats array now permissively
  truncates to int64 (NumPy throws TypeError; documented as Misaligned).
- M2 Default.Shift.cs:136 — Half/Complex shift amount conversion. Defensive
  fix; np.left_shift's asanyarray(Half) rejects Half upstream.
- M3+M4 NDArray.Indexing.Selection.{Setter,Getter}.cs — added Half/Complex
  case branches before IConvertible case in slice-conversion switches.
  Defensive fix; deeper validation switch (Getter:70-87, Setter:75-97) still
  rejects Half/Complex with "Unsupported indexing type" error.

Tests
-----
+11 new battletests in ConvertsBattleTests.cs (Round 5B/5C/5D regions):

Round 5B (5):
- MatMul_HalfMatrix_ScalarFallback_Works
- MatMul_ComplexMatrix_RealOnlyLimitation [Misaligned]
- Dot_HalfArray_Works
- Convolve_HalfArrays_Works
- Mean_ScalarHalfArray_Works

Round 5C (5):
- CumSum_HalfArray_Works, CumProd_HalfArray_Works
- CumSum_ComplexArray_Works, CumProd_ComplexArray_Works
- CumSum_DoubleArray_Works (regression)

Round 5D (1):
- Repeat_HalfRepeats_PermissiveTruncate [Misaligned]

Total battletests: 145 (was 134 in Round 5A, +11).
Full suite: 6001/0/11 on both net8.0 and net10.0 (was 5990, +11).
Zero regressions.

NumPy Parity Reference (NumPy 2.4.2)
-------------------------------------
Verified expected outputs against actual NumPy:
- matmul(half2x2, half2x2) = [[19,22],[43,50]] float16
- dot(half[1,2,3], half[4,5,6]) = 32 float16
- convolve(half[1,2,3], half[0,1,0.5]) = [0,1,2.5,4,1.5] float16
- cumsum(half[1,2,3,4]) = [1,3,6,10] float16
- cumprod(half[1,2,3,4]) = [1,2,6,24] float16
- cumsum(complex[1+1j,2,3-1j]) = [1+1j,3+1j,6+0j]
- cumprod(complex[1+1j,2,3-1j]) = [1+1j,2+2j,8+4j]

Misaligned (NumSharp more permissive than NumPy):
- np.repeat with Half repeats: NumSharp truncates, NumPy throws TypeError
- arr[Half(2)] / arr[Complex(2,0)]: NumSharp would truncate (if validation
  switch were expanded), NumPy throws IndexError
- np.left_shift(arr, Half(2)): NumSharp would truncate (defensive), NumPy
  throws TypeError

Documentation
-------------
- docs/plans/LEFTOVER_CONVERTS.md (new): scannable audit reference for the
  20 remaining sites with status tracking (H1-H3 Round 5A, H4-H8 + M1-M4
  Round 5B+5C+5D), proposed fixes per site, and skip rationale for the
  ~50 NF (no fix needed) sites.
Previously all DateTime/TimeSpan numeric conversions threw InvalidCastException
at runtime because the implementations delegated to `IConvertible.ToXxx()`,
which DateTime does not support for numeric targets and which TimeSpan does not
implement at all. Every primitive -> DateTime path failed, every DateTime ->
primitive path failed, and TimeSpan conversions were commented out as
"disallowed".

These aren't NumPy dtypes in NumSharp, but NumPy's datetime64/timedelta64 do
have well-defined conversion semantics (int64 internally; NaT = int64.MinValue;
bool = (int64 != 0); NaN/Inf -> NaT). This change mirrors those semantics
using DateTime.Ticks / TimeSpan.Ticks as the int64 representation.

Key parity points (verified against NumPy 2.4.2):
- DateTime <-> int64 via Ticks (wraps on smaller-int cast, exactly like
  NumPy int64->intN wrapping).
- TimeSpan <-> int64 via Ticks — full int64 range, so TimeSpan.MinValue.Ticks
  == long.MinValue == NumPy NaT exactly. bool(TimeSpan.MinValue) == True.
- NaN/Inf -> TimeSpan.MinValue (exact NaT parity) or DateTime.MinValue
  (best-effort; DateTime cannot represent negative ticks or long.MinValue).
- bool(DateTime/TimeSpan) = Ticks != 0 (NumPy: int64 != 0).
- Out-of-range numeric -> DateTime collapses to DateTime.MinValue. Only
  documented divergence: bool(DateTime.MinValue)=False because DateTime
  cannot hold the int64.MinValue sentinel that makes NumPy bool(NaT)=True.
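
The parity points above can be illustrated with plain integers standing in for ticks. This is a sketch in Python, not the NumSharp code: the function names are hypothetical, finite inputs are assumed to fit in int64, and the DateTime upper bound is the DateTime.MaxValue.Ticks value quoted elsewhere in this PR.

```python
import math

NAT = -2**63                                 # int64 minimum, NumPy's NaT
DATETIME_MAX_TICKS = 3155378975999999999     # DateTime.MaxValue.Ticks

def double_to_timespan_ticks(value: float) -> int:
    # NaN/Inf become NaT; TimeSpan spans the full int64 range.
    if math.isnan(value) or math.isinf(value):
        return NAT
    return math.trunc(value)   # assumes the value fits in int64

def ticks_to_datetime_ticks(ticks: int) -> int:
    # DateTime cannot hold negative ticks or the NaT sentinel, so anything
    # outside its valid range collapses to 0 (DateTime.MinValue).
    return ticks if 0 <= ticks <= DATETIME_MAX_TICKS else 0

def ticks_to_bool(ticks: int) -> bool:
    return ticks != 0
```

The documented divergence shows up directly: `ticks_to_bool(NAT)` is True, but after clamping to the DateTime range the same input yields False.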

Changes:
- Converts.Native.cs: replace every `((IConvertible)dt).ToXxx(null)` with
  `ToXxx(dt.Ticks)`; add full TimeSpan conversion family (previously "disallowed");
  add DateTime/TimeSpan cases to every ToXxx(object) dispatcher; add
  ToTimeSpan(...) family; rewrite numeric->DateTime through TicksToDateTime
  helper that clamps to valid DateTime range.
- Converts.cs: add DateTime/TimeSpan cases to every _NumPy fast-path helper and
  route fallback through Converts.ToXxx(object) instead of lossy ToDouble cast.
- ConvertsDateTimeParityTests.cs: 61 parity tests covering DateTime <-> all
  12 dtypes, TimeSpan <-> all 12 dtypes, bool semantics incl. NaT, NaN/Inf
  handling, object-dispatch paths, ChangeType integration, and round-trips.
  Every expected value cross-verified with live NumPy 2.4.2 output.

All 6070 non-OpenBugs/HighMemory tests pass (including 369 casting tests).
…orms

Two corrections to the Round 5B+5C+5D test additions:

1. Restored 6 Misaligned tests removed when prepping Round 5B+5C+5D commit:
   each now asserts the CURRENT divergent behavior (typically a throw) so we
   lock in the divergence. When NumSharp's behavior changes (toward or away
   from NumPy alignment), the test breaks and forces explicit review.

2. Added duplicate test forms preserving original intent for tests where the
   test core was changed during the prior round.

Restored Misaligned tests
-------------------------
CumSum_HalfMatrix_Axis0_NotSupported
CumSum_HalfMatrix_Axis1_NotSupported
  Asserts NotSupportedException ("AxisCumSum not supported for type Half").
  NumPy: returns float16 cumsum. NumSharp: throws — IL kernel doesn't have
  Half axis-cumsum support. H7 fix only enables 1D scalar accumulator.

LeftShift_HalfShiftAmount_AsObject_NotSupported
  Asserts NotSupportedException ("Unable resolve asanyarray for type Half").
  Path: np.left_shift(arr, object) → np.asanyarray(Half) which rejects Half.

LeftShift_HalfShiftAmount_AsNDArray_NotSupported
  Asserts NotSupportedException ("left_shift only supports integer types,
  got Half"). Path: np.left_shift(arr, NDArray) → LeftShift dtype validation
  rejects Half. Both paths are upstream of M2 fix.

Indexing_HalfIndex_Getter_NotSupported
Indexing_ComplexIndex_Getter_NotSupported
  Asserts ArgumentException ("Unsupported indexing type"). Deeper validation
  switch (Getter:70-87) rejects Half/Complex BEFORE reaching M3+M4 fixed
  switch. NumPy also rejects (IndexError), so NumSharp rejection is closer
  to NumPy than silent-truncate would be.

Duplicate test forms (Round 5E region)
--------------------------------------
MatMul_ComplexMatrix_NumPyParity_DropsImaginary [Misaligned]
  Lock in current real-only behavior with explicit reason strings citing
  the NumPy expected values. Complements MatMul_ComplexMatrix_RealOnlyLimitation
  by stating WHY the asserted values diverge from NumPy.

Mean_ScalarHalfArray_DtypeMismatch [Misaligned]
  Lock in dtype divergence: NumSharp returns Double, NumPy returns Half.
  Complements Mean_ScalarHalfArray_Works (which asserts the value but not
  the dtype) by explicitly checking the dtype.

`<<` operator note
------------------
The original test attempted `arr << NDArray.Scalar((Half)2)` which is a
compile error: NDArray defines &, |, ^, ~, +, -, *, /, % operators but NOT
`<<` or `>>`. The np.left_shift function calls are the only way to test
this behavior. Documented in the test file.

Tests
-----
+8 battletests (6 restored Misaligned + 2 duplicate forms; 153 total).
Full suite: 6070/0/11 on both net8.0 and net10.0. Zero regressions.
…DateTime

Exhaustive battletest against NumPy 2.4.2 uncovered two related double-precision
bugs at the int64 boundary. The root cause is the same: `(double)long.MaxValue`
rounds UP to 2^63 (same double bit pattern as (double)(long.MaxValue+1)), so
comparing `value > long.MaxValue` as doubles returns false even when the user
passed a value that NumPy treats as overflow.

Fix 1 - ToInt64(double):
  Before: ToInt64((double)long.MaxValue) = long.MaxValue (saturating cast, .NET 7+)
  After:  returns long.MinValue (NaT), matching NumPy's
          np.float64(long.max).astype(int64) == int64.min
  Technique: exclusive upper bound at 9223372036854775808.0 (= 2^63, the smallest
  double > long.MaxValue). Values >= that constant are rejected as overflow.
  (double)long.MinValue stays representable as long so no lower-bound change.
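
The rounding at the boundary and the exclusive upper bound can be checked in Python, whose floats are IEEE 754 doubles. This is an illustrative model only; `to_int64` here is a hypothetical stand-in, not the C# method.

```python
import math

INT64_MIN, INT64_MAX = -2**63, 2**63 - 1
TWO_POW_63 = 2.0**63   # 9223372036854775808.0

def to_int64(value: float) -> int:
    if math.isnan(value) or math.isinf(value):
        return INT64_MIN
    # Inclusive lower bound, exclusive upper bound at 2**63.
    if value < -TWO_POW_63 or value >= TWO_POW_63:
        return INT64_MIN
    return math.trunc(value)

# float(INT64_MAX) rounds up to 2**63, so a guard written as
# "value > float(INT64_MAX)" can never fire for that input.
rounded = float(INT64_MAX)
```

Here `rounded == TWO_POW_63` holds, which is why the double-vs-double comparison was a no-op and why the sketch rejects everything at or above 2**63.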

Fix 2 - ToTimeSpan(double):
  Same issue, same fix. Previously returned long.MaxValue ticks; now returns
  TimeSpan.MinValue (NaT), matching NumPy timedelta64 behavior.

Fix 3 - ToDateTime(double) ArgumentOutOfRangeException:
  Before: ToDateTime(Converts.ToDouble(DateTime.MaxValue)) threw.
  (double)DateTime.MaxValue.Ticks = 3155378976000000000 (rounded up from
  3155378975999999999), and the bound it was checked against,
  DateTimeMaxTicksAsDouble, rounds to the same double, so the guard was a
  no-op. `new DateTime(long)` then threw for the oversized tick count.
  After: routes through TicksToDateTime which re-validates after the long cast,
  collapsing out-of-range to DateTime.MinValue (NaT-equivalent).

Battletest methodology:
- Ran side-by-side comparison of NumPy 2.4.2 output and NumSharp output across
  a 49-case grid covering all 12 dtypes, zero/pos/neg, int-boundary values,
  NaN/Inf, 1e20, and tick counts of real DateTimes.
- Verified bit-for-bit float32/float64 parity via hex representation (both
  produce identical IEEE 754 bit patterns for datetime64 -> float casts).
- Confirmed TimeSpan has 100% NumPy parity for every case including NaT.
- Documented 5 DateTime-only divergences that are inherent to .NET DateTime
  range constraints (cannot hold negative ticks or 2^63 sentinel).

Added 5 regression tests to ConvertsDateTimeParityTests covering the exact
failure modes, bringing suite to 66 parity tests. All 6075 non-OpenBugs tests
pass on net10.0.
Exhaustive dtype × dtype battletest against NumPy 2.4.2 (14×14 = 196 pairs,
1162 test cases) found ONE unexpected divergence:

  double 1e20 -> uint32:
    NumPy:    0
    NumSharp: 4294967295

Root cause: `ToUInt32(double)` only checked NaN/Inf, then did
`unchecked((uint)(long)value)`. In .NET 7+, `(long)1e20` saturates to
long.MaxValue (= 0x7FFFFFFFFFFFFFFF), and the low 32 bits are 0xFFFFFFFF
= uint.MaxValue. NumPy's int64 intermediate treats 1e20 as overflow and
returns int64.MinValue, whose low 32 bits are 0.

Fix: add the same exclusive overflow check used in ToInt64(double) —
values < long.MinValue or >= 2^63 short-circuit to 0, matching NumPy's
NaT-propagation through the int64 intermediate.

Verified fix:
  1e20 / -1e20 / 1e25 / NaN / +/-Inf -> uint32 = 0 (was 4294967295 for
  positive finite overflow)
  3.7 / -3.7 -> uint32 = 3 / 4294967293 (normal wrap, unchanged)
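
To make the root cause concrete, the sketch below contrasts a saturating int64 intermediate (as the commit describes for the .NET cast) with the overflow check. It is a Python model with hypothetical function names, and the "before" path only handles finite inputs.

```python
import math

INT64_MIN, INT64_MAX = -2**63, 2**63 - 1

def saturating_int64(value: float) -> int:
    # Models a saturating double -> long cast (finite inputs only).
    return max(INT64_MIN, min(INT64_MAX, math.trunc(value)))

def to_uint32_before(value: float) -> int:
    # Low 32 bits of the saturated intermediate.
    return saturating_int64(value) & 0xFFFFFFFF

def to_uint32_after(value: float) -> int:
    if math.isnan(value) or math.isinf(value):
        return 0
    if value < -2.0**63 or value >= 2.0**63:
        return 0   # overflow of the int64 intermediate
    return math.trunc(value) & 0xFFFFFFFF
```

`to_uint32_before(1e20)` reproduces the 4294967295 result, while `to_uint32_after(1e20)` gives 0 and leaves the normal wrap for -3.7 unchanged.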

Battletest methodology:
- Generated side-by-side dtype × dtype matrix (14 source dtypes × 14
  destination dtypes × representative values per type = 1162 rows).
- Normalized cosmetic formatting (float source labels "1.9" vs
  "1.899999976158142", etc.) and compared per-cell results.
- After this fix + label normalization: 0 unexplained diffs remain.
  All 64 remaining divergences are inherent DateTime range limits
  (negative ticks, > DateTime.MaxValue.Ticks, NaT sentinel) and are
  already documented.

Added NumPyParity_DoubleToUInt32_LargePositiveOverflowsToZero test
covering the exact failure + sanity checks for normal path.

All 6482 non-OpenBugs tests pass on net10.0.
Self-double-check after claiming '100% parity' uncovered SIX additional bugs
that the initial battletest missed because the test inputs were too clean
(round numbers like 1e20, not fractional like 2147483647.4).

Bug 1 — ToInt32(double) for fractional values near int32 boundary:
  Before: ToInt32(2147483647.4) = int.MinValue (treated as overflow)
  After:  ToInt32(2147483647.4) = 2147483647 (truncate then range-check)
  NumPy parity: np truncates toward zero FIRST, then range-checks the
  truncated integer. Old check `value > int.MaxValue` compares doubles —
  2147483647.4 > 2147483647.0 is true, tripping the overflow guard for a
  value that should truncate cleanly inside int32 range.

Bug 2-5 — ToSByte / ToByte / ToInt16 / ToUInt16 / ToChar (double):
  Same root cause and same user-visible failure. All now route through
  ToInt32(value) (NumPy's int32 intermediate for small int targets), then
  unchecked-wrap to the narrower type.
  Before: ToSByte(2147483647.4) = 0 (incorrect overflow sentinel)
  After:  ToSByte(2147483647.4) = -1 (low byte of 2147483647)
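
The difference between the old guard and truncate-then-range-check is easy to reproduce. A Python sketch under the assumption that the small-integer targets simply keep the low bits of the int32 intermediate; all function names are hypothetical.

```python
import math

INT32_MIN, INT32_MAX = -2**31, 2**31 - 1

def to_int32_old(value: float) -> int:
    # Old guard: compares the untruncated double against the bounds.
    if math.isnan(value) or value > INT32_MAX or value < INT32_MIN:
        return INT32_MIN
    return math.trunc(value)

def to_int32(value: float) -> int:
    if math.isnan(value) or math.isinf(value):
        return INT32_MIN
    n = math.trunc(value)            # truncate toward zero first
    if n < INT32_MIN or n > INT32_MAX:
        return INT32_MIN             # then range-check the integer
    return n

def to_sbyte(value: float) -> int:
    low = to_int32(value) & 0xFF     # keep the low byte
    return low - 256 if low >= 128 else low
```

For 2147483647.4 the old guard reports overflow, the new path returns 2147483647, and the sbyte result is -1 (low byte 0xFF).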

Bug 6 — ToUInt64(double) at the (double)long.MaxValue boundary:
  Previously ToUInt64((double)long.MaxValue) = 9223372036854775807
  (saturating .NET cast leaking through). NumPy returns 9223372036854775808
  (= 2^63 = uint64 overflow sentinel). Same precision-boundary pattern:
  (double)long.MaxValue rounds to 2^63 and the guard `value > long.MaxValue`
  compares doubles that are equal. Fixed with explicit 2^63/2^64 bounds plus
  a dedicated `[2^63, 2^64)` branch that uses the direct `(ulong)value` cast.
  Also catches (double)ulong.MaxValue which is exactly 2^64 (overflow).

Verification:
- Full dtype×dtype matrix (1162 rows): 64 diffs remain, all 64 are
  documented DateTime clamping (physical .NET limit).
- 24216-case randomized fuzz against NumPy 2.4.2: 0 diffs.
- Full test suite: 6483 tests pass, 0 failures on net10.0.

The earlier 'self-audit' was incomplete because the fuzz inputs were
strongly biased toward round numbers via random.randint/uniform in a wide
range, which almost never generated values of the form "int_max + epsilon"
that trigger these precision-boundary bugs.
Closes three bug clusters identified during battletest vs NumPy 2.4.2
(tracked in docs/plans/LEFTOVER.md). Pure NumPy-parity fixes. No regressions:
all 6483 pre-existing tests pass; +35 new battletests.

B11 — Half/Complex unary math ----------------------------------------------
Added log10, log2, cbrt, exp2, log1p, expm1 for Half.
Added log10, log2, exp2, log1p, expm1 for Complex. (cbrt intentionally left
unsupported — NumPy's np.cbrt raises TypeError for complex inputs.)

Implementation:
  - Half: direct dispatch to BCL Half.Log10/Log2/Cbrt/Exp2/LogP1/ExpM1.
  - Complex: Complex.Log10 direct, then composed — log2 via helper
    ComplexLog2Helper (Log(z) * 1/ln(2) via scalar mul on Real/Imaginary
    to avoid Complex.Log(z, base) producing NaN imag for z=0+0j),
    exp2 via Pow(2+0j, z), log1p via Log(1+z), expm1 via Exp(z)-1.
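
The compositions above can be written out with Python's `cmath` for illustration. This mirrors only the formulas, not the IL that is emitted; the function names are hypothetical, and because `cmath.log` raises ValueError for zero, the z=0+0j edge case is not covered by this sketch.

```python
import cmath
import math

LN2 = math.log(2.0)

def complex_log2(z: complex) -> complex:
    # Scale the real and imaginary parts of ln(z) separately by 1/ln(2).
    w = cmath.log(z)
    return complex(w.real / LN2, w.imag / LN2)

def complex_exp2(z: complex) -> complex:
    return cmath.exp(z * LN2)        # 2**z

def complex_log1p(z: complex) -> complex:
    return cmath.log(1 + z)

def complex_expm1(z: complex) -> complex:
    return cmath.exp(z) - 1
```

As a sanity check, `complex_log2(8 + 0j)` is approximately 3 and `complex_exp2(3 + 0j)` is approximately 8.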

Files:
  - src/NumSharp.Core/Backends/Kernels/ILKernelGenerator.cs
    (new CachedMethods: HalfLog10, HalfLog2, HalfCbrt, HalfExp2, HalfLogP1,
     HalfExpM1, ComplexLog10, ComplexLogBase, ComplexOpSubtraction)
  - src/NumSharp.Core/Backends/Kernels/ILKernelGenerator.Unary.Decimal.cs
    (emit cases for Half + Complex; ComplexLog2Helper)

B10 + B17 — Half/Complex maximum/minimum/clip ------------------------------
Previously threw 'ClipNDArray not supported for dtype Half/Complex'.
Added Half (NaN-propagating) and Complex (lex comparison with first-NaN-wins)
paths to both contiguous and general dispatchers in Default.ClipNDArray.cs.
This single fix closes BOTH np.maximum/np.minimum (which route through
np.clip) AND np.clip itself for Half+Complex.

Semantics matched against NumPy 2.4.2:
  - Half: Math.Max/Min don't exist for Half, so explicit HalfMaxNaN/HalfMinNaN
    helpers: if either operand is NaN, return NaN (matches np.maximum NaN rule).
  - Complex: "NaN-containing" = Real or Imag is NaN. If either operand is
    NaN-containing, return it (first operand wins when both NaN-containing).
    Non-NaN pairs compared lex: real-then-imag.
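
A compact Python model of the maximum rules listed above (minimum is symmetric). Python floats stand in for Half, and the names `half_max`, `has_nan`, and `complex_max` are hypothetical rather than the helpers in Default.ClipNDArray.cs.

```python
import math

def half_max(a: float, b: float) -> float:
    # NaN-propagating maximum: any NaN operand yields NaN.
    if math.isnan(a) or math.isnan(b):
        return math.nan
    return a if a >= b else b

def has_nan(z: complex) -> bool:
    return math.isnan(z.real) or math.isnan(z.imag)

def complex_max(a: complex, b: complex) -> complex:
    # First NaN-containing operand wins.
    if has_nan(a):
        return a
    if has_nan(b):
        return b
    # Otherwise compare lexicographically: real part, then imaginary part.
    return a if (a.real, a.imag) >= (b.real, b.imag) else b
```

So `complex_max(1 + 5j, 2 + 0j)` picks `2 + 0j` on the real part, and `complex_max(1 + 2j, 1 + 3j)` falls back to the imaginary part.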

Files:
  - src/NumSharp.Core/Backends/Default/Math/Default.ClipNDArray.cs

B14 — Half/Complex nanmean/nanstd/nanvar -----------------------------------
Previously these returned NaN for Half/Complex because the scalar/axis paths
fell through to regular mean/std/var (which propagate NaN).

Implementation:
  - Half nanmean/nanstd/nanvar return Half (NumPy parity: np.nanmean(float16)
    returns float16). Accumulate in double for precision, convert to Half at
    the end.
  - Complex nanmean returns Complex; nanstd/nanvar return float64 (NumPy
    parity). Variance formula: mean(|z - mean(z)|²), consistent with NumPy's
    complex variance definition.
  - NaN detection for Complex: Re or Im is NaN.
  - All-NaN slice → NaN (parity).
  - ddof parameter preserved.
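
The "accumulate in double, convert to Half at the end" approach for nanmean can be sketched in Python, using the `struct` module's `e` format to round to IEEE 754 binary16. The names `round_to_half` and `nanmean_half` are hypothetical, and this covers only the flat (no axis) case.

```python
import math
import struct

def round_to_half(x: float) -> float:
    # Round-trip through binary16 via struct's 'e' format.
    return struct.unpack("<e", struct.pack("<e", x))[0]

def nanmean_half(values: list[float]) -> float:
    total = 0.0          # double-precision accumulator
    count = 0
    for v in values:
        if not math.isnan(v):
            total += v
            count += 1
    if count == 0:
        return math.nan  # all-NaN input -> NaN
    return round_to_half(total / count)
```

The final rounding is visible for inputs such as `[1.0, 2.0, 2.0]`, where the double mean 1.666... becomes the nearest float16 value 1.6669921875.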

Files:
  - src/NumSharp.Core/Statistics/np.nanmean.cs (+nanmean_axis_half, +nanmean_axis_complex, +ApplyKeepdims shared helper)
  - src/NumSharp.Core/Statistics/np.nanstd.cs  (+nanstd_axis_half, +nanstd_axis_complex)
  - src/NumSharp.Core/Statistics/np.nanvar.cs  (+nanvar_axis_half, +nanvar_axis_complex)

Tests ---------------------------------------------------------------------
+ test/NumSharp.UnitTest/NewDtypes/NewDtypesBattletestRound6Tests.cs
  35 battletests covering all three bug clusters. Each expected value
  mirrors a python -c "import numpy as np" invocation documented in the
  test's XML comment.
  - 7 Half unary math tests (incl. NaN propagation)
  - 6 Complex unary math tests (incl. log2(0+0j) = -inf+0j edge case
    and cbrt-NotSupportedException parity)
  - 8 Half/Complex maximum/minimum/clip tests (NaN, lex, first-NaN-wins,
    imag-only NaN)
  - 14 Half/Complex nanmean/nanstd/nanvar tests (scalar, axis, all-NaN, dtype)

docs/plans/LEFTOVER.md — updated Round 6 sprint entry noting B10/B11/B14
closed in this PR.
Closes three Complex axis-reduction bugs from docs/plans/LEFTOVER.md.
All fixes extend existing dispatcher/helper conventions rather than
introducing parallel code paths.

B18 — np.cumprod(Complex, axis=N) dropped imaginary part
---------------------------------------------------------
`Default.Reduction.CumMul.cs::ExecuteAxisCumProdFallback` iterated each
axis slice as `AsIterator<double>()` which collapses Complex to its Real
part. NumPy preserves Complex: np.cumprod([[1+1j,2+2j,3+3j],[4+4j,...],...], axis=0)
must yield [[1+1j,2+2j,3+3j],[0+8j,0+20j,0+36j],[-56+56j,...]].

Fix: Added a Complex branch at the top of the fallback (mirrors the
existing Complex branch in `cumprod_elementwise_fallback` right below it)
that starts the accumulator at `Complex.One` and multiplies with the
Complex `*=` operator.
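
A plain-Python rendering of an axis-0 cumulative product with a complex accumulator, for comparison with the expected output quoted above. `cumprod_axis0` is a hypothetical name, and the matrix entries beyond those implied by the quoted result (8+8j and 9+9j) are assumptions.

```python
def cumprod_axis0(rows: list[list[complex]]) -> list[list[complex]]:
    # One accumulator per column, starting at the multiplicative identity.
    acc = [complex(1, 0)] * len(rows[0])
    out = []
    for row in rows:
        acc = [a * x for a, x in zip(acc, row)]
        out.append(acc)
    return out

# Entries not implied by the quoted output are assumed for illustration.
data = [
    [1 + 1j, 2 + 2j, 3 + 3j],
    [4 + 4j, 5 + 5j, 6 + 6j],
    [7 + 7j, 8 + 8j, 9 + 9j],
]
result = cumprod_axis0(data)
```

The second row of `result` is `[8j, 20j, 36j]` and `result[2][0]` is `-56+56j`, which a real-only accumulator cannot produce.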

B19 — np.max/min(Complex, axis=N) returned all zeros
-----------------------------------------------------
Axis reduction flows through `CreateAxisReductionKernelScalar<Complex, Complex>`
→ `AxisReductionScalarHelper` → `CombineScalarsPromoted<Complex,Complex>`.
The existing Complex branch handled Sum/Mean/Prod but for Min/Max fell
through to `_ => cAccum`, so the accumulator stayed at its identity
(`Complex.Zero`) for every output element.

Fix: Two minimal edits to existing functions —
  - `CombineScalarsPromoted<Complex,Complex>`: route Min/Max through a
    new private `ComplexLexPick(a, b, pickGreater)` helper that does
    NumPy-parity lex compare on (Real, Imaginary) with NaN-first-wins
    propagation (NaN-containing = Re OR Im is NaN).
  - `GetIdentityValueTyped<Complex>`: return `Complex(+inf,+inf)` for Min
    and `Complex(-inf,-inf)` for Max so the first finite element
    displaces the identity under lex comparison (parallels how
    `double.PositiveInfinity` works for the scalar double path right
    below this branch).

No new kernel/dispatcher paths — Complex flows through the same
scalar/promoted pipeline as every other type.

B20 — np.std/var(Complex, axis=N) computed real-only variance
--------------------------------------------------------------
`CreateAxisVarStdReductionKernel` had no Complex branch, so Complex fell
through to `CreateAxisVarStdKernelGeneral` whose `ReadAsDouble(Complex)`
discards imaginary. The general path then computed
`Var(Re(z)) = E[(Re(z) - mean(Re(z)))²]`, not the complex variance
`E[|z - mean(z)|²]`.

Fix: Added a Complex branch to the same dispatcher switch, following
the existing Decimal convention exactly —
  - `CreateAxisVarStdKernelTypedComplex` factory (mirrors
    `CreateAxisVarStdKernelTypedDecimal`)
  - `AxisVarStdComplexHelper` two-pass helper (mirrors
    `AxisVarStdDecimalHelper`): Pass 1 computes Complex mean via
    component sums; Pass 2 accumulates |z - mean|² = dR² + dI² and
    divides by `axisSize - ddof`. Output dtype is double (NumPy parity:
    np.var/std of complex input returns float64).

No changes to Decimal/Int/Single/Double/General kernel code paths.
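A minimal Python sketch of the two-pass scheme described above, assuming `ddof < n` (the function name `complex_var` is hypothetical and this is not the `AxisVarStdComplexHelper` source). It also shows the real-only variance the old general path produced for the same data.

```python
import math

def complex_var(values, ddof=0):
    """Two-pass variance of complex samples; the result is a real float."""
    n = len(values)
    # Pass 1: complex mean via component sums.
    mean_re = sum(z.real for z in values) / n
    mean_im = sum(z.imag for z in values) / n
    # Pass 2: accumulate |z - mean|^2 = dR^2 + dI^2.
    acc = 0.0
    for z in values:
        d_re = z.real - mean_re
        d_im = z.imag - mean_im
        acc += d_re * d_re + d_im * d_im
    return acc / (n - ddof)

zs = [1 + 2j, 3 + 4j, 5 + 6j]
complex_variance = complex_var(zs)                               # 16/3
real_only = complex_var([complex(z.real, 0.0) for z in zs])      # 8/3
std = math.sqrt(complex_variance)
```

The factor-of-two gap between the two results on this input is exactly the imaginary contribution that `ReadAsDouble(Complex)` discarded.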

Tests
-----
+ test/NumSharp.UnitTest/NewDtypes/NewDtypesBattletestRound7Tests.cs
  19 battletests. Each expected value is documented inline with the
  python -c "import numpy as np; ..." invocation that produced it.
  - B17 (2 regression checks for np.clip, closed in Round 6)
  - B18 (3 tests: axis=0, axis=1, elementwise-unchanged)
  - B19 (7 tests: max/min × axis=0/1, lex-tiebreak, NaN propagation,
    Sum/Prod/Mean regression)
  - B20 (7 tests: var/std × axis=0/1, ddof, elementwise-unchanged,
    double-path regression)

All 6537 pre-existing + new tests pass on net8.0 and net10.0; no
regressions.
Round 6 closes the NDArray operator-overload gaps vs NumPy discovered
during the Round 1-5E dtype-conversion audit. Python verified:
  arr=[1,2,4,8] << 2        -> [4, 8, 16, 32]
  arr=[16,8,4,2] >> 1       -> [8, 4, 2, 1]
  [1,2,4,8] << [0,1,2,3]    -> [1, 4, 16, 64]

New file: src/NumSharp.Core/Operations/Elementwise/NDArray.Shift.cs
  - operator <<(NDArray, NDArray) -> TensorEngine.LeftShift
  - operator <<(NDArray, object)  -> lhs << np.asanyarray(rhs)
  - operator >>(NDArray, NDArray) -> TensorEngine.RightShift
  - operator >>(NDArray, object)  -> lhs >> np.asanyarray(rhs)

Pattern mirrors NDArray.OR.cs / .AND.cs / .XOR.cs. Two overloads per
direction instead of three, because C# shift-operator rules require the
declaring type on the LHS — so "object << NDArray" is NOT possible.
Callers needing that form use np.left_shift(object, NDArray) or cast
explicitly. C# 11+ relaxed the "RHS must be int" restriction (net8/net10
with LangVersion=latest qualify), enabling NDArray << NDArray.

Compound <<= / >>= are synthesized by the C# compiler from the binary
operators (sugar for "a = a << b"). Unlike NumPy these are NOT in-place
— C# compound operators on class types cannot mutate the original
storage. The compound test in the battletests locks in this divergence.
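The rebinding behavior can be illustrated with a Python analog (a toy class, not NumSharp or NumPy code; `Arr` is hypothetical). A Python type that defines only `__lshift__` behaves like the C# compound operator: `a <<= 2` creates a new object and rebinds `a`. NumPy arrays additionally define `__ilshift__`, which is why the same statement mutates in place there.

```python
class Arr:
    """Toy array that defines << but no in-place variant."""

    def __init__(self, data):
        self.data = list(data)

    def __lshift__(self, k):
        # Always returns a new object, like the C# binary operator.
        return Arr(x << k for x in self.data)

a = Arr([1, 2, 4, 8])
alias = a          # second reference to the same storage
a <<= 2            # no __ilshift__, so this is sugar for a = a << 2

print(a.data)      # [4, 8, 16, 32]
print(alias.data)  # [1, 2, 4, 8]  (original storage untouched)
```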

NOT added in Round 6:
  - implicit operator NDArray(Half)     — already present (line 35)
  - explicit operator Half(NDArray)     — already present (line 137)
  - explicit operator Complex(NDArray)  — already present (line 143)
The handover assumed these were missing; audit of current code shows
they were landed in prior commits using the EnsureCastableToScalar
pattern. No change needed.

Tests: +13 under "Round 6: Operator overloads" region in
ConvertsBattleTests.cs:
  - LeftShift_Operator_IntScalar_Works
  - LeftShift_Operator_NDArrayRhs_Works
  - LeftShift_Operator_NDArrayScalarRhs_Works
  - LeftShift_Operator_ObjectRhs_Works (boxed int path)
  - LeftShift_Operator_Compound_ReassignsReference (C# semantics doc)
  - RightShift_Operator_IntScalar_Works
  - RightShift_Operator_NDArrayRhs_Works
  - RightShift_Operator_Compound_ReassignsReference
  - LeftShift_Operator_UnsignedByte_TypePromotion_Works
  - LeftShift_Operator_HalfObjectRhs_NotSupported [Misaligned]
  - LeftShift_Operator_HalfNDArrayRhs_NotSupported [Misaligned]
  - RightShift_Operator_HalfObjectRhs_NotSupported [Misaligned]
  - RightShift_Operator_HalfNDArrayRhs_NotSupported [Misaligned]

The 4 Misaligned duplicates mirror the existing Round 5D function-form
rejections (LeftShift_HalfShiftAmount_As{Object,NDArray}_NotSupported)
and exercise the same two upstream rejection paths (np.asanyarray Half
rejection + TensorEngine.LeftShift dtype validation), now reachable via
operator form. Remove [Misaligned] and flip assertions if Half support
is added to either path.

Also updated the stale comment in the Round 5D region that read
"NDArray does NOT define a '<<' operator". That was true until this commit.

Test counts:
  ConvertsBattleTests: 153 -> 166 (+13)
  Full suite (CI filter): 6550 / 0 / 11 on both net8.0 and net10.0

No changes to TensorEngine, np.left_shift, np.right_shift, Default.Shift,
or any engine code. Pure user-facing API surface addition.
…eTimeOffset

Adds a new `DateTime64` struct to NumSharp (`src/NumSharp.Core/DateTime64.cs`),
modeled after .NET 10's `System.DateTime` but with NumPy `datetime64` semantics:
full `long.MinValue..long.MaxValue` tick range (no `DateTimeKind` bits) and a
`NaT == long.MinValue` sentinel that propagates through arithmetic and compares
like IEEE NaN (NaT != NaT, any ordering with NaT returns False).

Closes the 64 DateTime-related diffs discovered in the earlier battletest:
  * Group A (src=dt64): 32 cases where NumPy's `datetime64` can hold raw int64
    values (-1, int.MinValue, long.MinValue) that `System.DateTime` physically
    cannot — `new DateTime(-1L)` throws, forcing NumSharp's path to collapse
    the source to `DateTime.MinValue` (Ticks=0) and then converting from 0.
  * Group B (dst=dt64): 32 cases where a value (-1, NaN, 1e20, long.MinValue)
    has to become dt64 — NumPy stores the raw int64; NumSharp previously
    clamped to `DateTime.MinValue` because `DateTime.Ticks` must be in
    [0, 3_155_378_975_999_999_999].

`DateTime64` sidesteps this by storing `long _ticks` directly; the 64 diffs are
all covered by NumPy-exact behavior now.
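A small Python sketch of the sentinel semantics described above, modeling ticks as plain integers constrained to the int64 range. The helper names are hypothetical and this is not the `DateTime64` implementation; overflow saturating to NaT follows the behavior stated elsewhere in this commit message.

```python
INT64_MIN = -2**63
INT64_MAX = 2**63 - 1
NAT = INT64_MIN            # NaT sentinel, same bit pattern as long.MinValue

def add_ticks(ticks, delta):
    """Add a tick delta; NaT propagates and overflow saturates to NaT."""
    if ticks == NAT:
        return NAT
    result = ticks + delta
    if result < INT64_MIN or result > INT64_MAX:
        return NAT
    return result          # a result equal to INT64_MIN is NaT by definition

def op_equal(a, b):
    """NumPy-style ==: any comparison involving NaT is False."""
    return a != NAT and b != NAT and a == b

def op_less(a, b):
    return a != NAT and b != NAT and a < b

print(add_ticks(-1, 1))            # 0: negative ticks are ordinary values
print(add_ticks(NAT, 5) == NAT)    # True: NaT propagates
print(op_equal(NAT, NAT))          # False: NaT != NaT, like IEEE NaN
```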

Files:
  * src/dotnet/src/libraries/System.Private.CoreLib/src/System/DateTime.cs
    src/dotnet/src/libraries/System.Private.CoreLib/src/System/DateTimeOffset.cs
    Downloaded verbatim from dotnet/runtime main (.NET 10) as the reference
    template for DateTime64. Serves as source-of-truth for .NET behavior.

  * src/dotnet/INDEX.md — indexed the two new files + updated purpose.

  * src/NumSharp.Core/DateTime64.cs — the new struct (~550 lines). Interop:
      - Implicit widenings: DateTime -> DateTime64 (drops Kind),
        DateTimeOffset -> DateTime64 (uses UtcTicks), long -> DateTime64.
      - Explicit narrowings: DateTime64 -> DateTime (throws for NaT/out-of-range),
        DateTime64 -> DateTimeOffset (UTC, throws similarly),
        DateTime64 -> long (returns raw ticks; NaT = long.MinValue).
      - Plus `ToDateTime(fallback)` / `TryToDateTime(out)` non-throwing variants.
    Mirrors DateTime's public API: Year/Month/Day/Hour/Minute/Second/Millisecond/
    Microsecond/Nanosecond (delegated to System.DateTime when in range; throw for
    NaT/out-of-range), DayOfWeek, DayOfYear, Date, TimeOfDay, Now/UtcNow/Today,
    Add/AddDays/AddHours/AddMinutes/AddSeconds/AddMilliseconds/AddMicroseconds/
    AddTicks/AddMonths/AddYears/Subtract (NaT propagates, overflow saturates to
    NaT matching NumPy), DaysInMonth, IsLeapYear, Parse/TryParse/ParseExact/
    TryParseExact, ToString/TryFormat (ISO-8601 default; "NaT" for NaT;
    `DateTime64(ticks=N)` for out-of-.NET-range), ToUnixTimeSeconds/
    ToUnixTimeMilliseconds + FromUnixTimeSeconds/FromUnixTimeMilliseconds.
    Implements IComparable, IComparable<DateTime64>, IEquatable<DateTime64>,
    IConvertible, IFormattable, ISpanFormattable.

  * src/NumSharp.Core/Utilities/Converts.DateTime64.cs — partial file with
    all ToX(DateTime64) (routes through Ticks as int64: wrap/truncate/promote
    matching `datetime64.astype(dtype)`) and ToDateTime64(X) (sign-extends /
    reinterprets to int64; float NaN/Inf/overflow -> NaT; NumPy-exact).
    Object dispatcher `ToDateTime64(object)` handles all primitive and date
    types including DateTime, DateTimeOffset, TimeSpan, and string ("NaT").

  * src/NumSharp.Core/Utilities/Converts.Native.cs — added
    `DateTime64 d64 => ToX(d64),` case to each of the 16 `ToX(object)`
    dispatchers (ToBoolean, ToChar, ToSByte, ToByte, ToInt16, ToUInt16, ToInt32,
    ToUInt32, ToInt64, ToUInt64, ToSingle, ToDouble, ToDecimal, ToHalf,
    ToComplex, ToTimeSpan).

  * src/NumSharp.Core/Utilities/Converts.cs — mirrored the DateTime64 dispatch
    case into every `ToX_NumPy(object)` helper (16 places) and `ToLong_NumPy`.

  * src/NumSharp.Core/Backends/NPTypeCode.cs — fixed latent collision:
    `TypeCode.DateTime (16) == (int)NPTypeCode.Half (16)`, which meant
    `InfoOf<DateTime>.NPTypeCode` previously resolved to NPTypeCode.Half.
    `GetTypeCode(typeof(DateTime))` now returns `NPTypeCode.Empty` (DateTime
    is not a NumPy dtype).

  * src/NumSharp.Core/Utilities/InfoOf.cs — changed the default `Size` path
    from `Marshal.SizeOf<T>()` to `Unsafe.SizeOf<T>()`. `Marshal.SizeOf`
    rejects structs it cannot marshal, such as `System.DateTime`
    ("Type 'System.DateTime' cannot be marshaled as an unmanaged structure");
    `Unsafe.SizeOf` works for any managed struct and gives the correct
    in-memory layout size.

  * test/NumSharp.UnitTest/Casting/ConvertsDateTime64ParityTests.cs — 46 tests
    verifying NumPy-exact behavior on:
      - DateTime64 -> every primitive (Group A cases: -1, int.MinValue, NaT,
        long.MaxValue, Jan1_2024_Ticks), with reference values from NumPy 2.4.2.
      - Every primitive -> DateTime64 (Group B cases: -1, long.MinValue=NaT,
        long.MaxValue, NaN, +/-Inf, 1e20, decimal/Complex overflow).
      - Interop: DateTime/DateTimeOffset/long <-> DateTime64.
      - NaT semantics: NaT != NaT, comparisons with NaT return False, NaT
        propagates through arithmetic (+TimeSpan, -TimeSpan, AddDays, AddHours).
      - Formatting: "NaT" / ISO-8601 / "DateTime64(ticks=N)" for out-of-range.
      - InfoOf: DateTime / DateTime64 / TimeSpan all resolve to NPTypeCode.Empty
        with Size=8 (previously DateTime collided with Half).
      - Object dispatcher: every ToX(object) handles DateTime64 correctly;
        ToDateTime64(object) handles every source type.

Battletest (in-terminal `python -c` vs `dotnet_run`) — not committed, but:
  * 1,476 dtype x dtype cases (covering the original 64 diffs): 0 real diffs.
  * 6,168 fuzz cases (500 random int64/float64 values x 12 target dtypes):
    0 real diffs. All remaining "diffs" are float32 string-formatting only
    (same IEEE 754 bits, different decimal digits).

Full test suite: 6,596 passed / 0 failed on both net8.0 and net10.0
(46 new DateTime64 tests + 67 existing DateTime tests all pass).

Design notes:
  * We keep the full System.DateTime conversion surface in Converts.* — users
    passing DateTime values continue to get the .NET-range-clamped behavior.
    DateTime64 is the escape-hatch for full NumPy parity when int64 ticks
    outside [0, DateTime.MaxValue.Ticks] are needed.
  * DateTime64 does NOT get an NPTypeCode entry — it's not (yet) a NumPy dtype
    in NumSharp's supported-dtype list. It's a "conversion-only" type, the way
    DateTime/TimeSpan are handled.
  * `DaysInMonth` and `IsLeapYear` are provided as static helpers for DateTime
    API parity even though they don't involve the DateTime64 instance.
Per clarification that DateTime64 is a CONVERSION HELPER TYPE (not an NPTypeCode
dtype), trim the API surface, fix contract issues, and harden edge-case paths.

Removed (scope creep — not needed for the 64-diff goal):
  * Calendar properties that throw for NaT: Year/Month/Day/Hour/Minute/Second/
    Millisecond/Microsecond/Nanosecond, DayOfWeek/DayOfYear, Date/TimeOfDay.
  * Calendar arithmetic shortcuts: AddDays/AddHours/AddMinutes/AddSeconds/
    AddMilliseconds/AddMicroseconds/AddMonths/AddYears. (Keep AddTicks +
    Add(TimeSpan) + Subtract(TimeSpan) + operator +/-, which is all NumPy's
    dt64 + td64 math needs.)
  * Static calendar helpers: IsLeapYear, DaysInMonth.
  * Wall-clock helpers: Now, UtcNow, Today.
  * Unix-time helpers: ToUnixTimeSeconds/Milliseconds, FromUnixTimeSeconds/
    Milliseconds.
  * Parse extras: ParseExact, TryParseExact (kept basic Parse/TryParse + NaT).
  * Calendar constructors: Year/Month/Day, DateOnly+TimeOnly.

Users needing calendar arithmetic should convert to System.DateTime first
(lossless for in-range ticks), do the math there, then convert back.

Fixed:
  * IConvertible.GetTypeCode() now returns TypeCode.Object, not
    TypeCode.DateTime. DateTime64 is NOT System.DateTime — returning the
    DateTime code would make Convert.ChangeType treat them as the same type
    and take the fast-path that assumes DateTime semantics.

  * Equality contract split (mirrors System.Double's NaN handling):
      - Equals(DateTime64) is bit-equal on ticks (NaT.Equals(NaT) → true) so
        GetHashCode is contract-compliant and NaT can be used as a
        Dictionary<DateTime64, _> key.
      - operator == / != / < / > / <= / >= follow NumPy (NaT vs anything
        returns false for ==/</>/<=/>=, true for !=).
    This is the exact split .NET uses for double: `double.NaN.Equals(double.NaN)`
    is true while `double.NaN == double.NaN` is false, so both a
    Dictionary<double, _> and IEEE arithmetic work.

  * Hardened float → int64 rule centralised in
    DateTime64.FromDoubleOrNaT(double). Explicitly rejects values outside
    (−2^63, +2^63) before the cast, so the result no longer depends on
    CLR implementation-defined behavior for out-of-range `(long)double`.
    Converts.ToDateTime64(double) now delegates to this helper.

  * TryFormat no longer allocates on the hot path. Writes "NaT" and
    ISO-8601 directly into the destination span via DateTime.TryFormat.
    Only the rare out-of-.NET-range case still allocates (for the
    "DateTime64(ticks=N)" string).
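The float-to-int64 rule from the list above can be sketched in Python as a range check that runs before any conversion. The name `from_double_or_nat` mirrors the C# helper but this is only an illustration; truncation toward zero for in-range values is an assumption based on ordinary C-style casts.

```python
import math

NAT = -2**63               # NaT sentinel (long.MinValue)
TWO_63 = 2.0 ** 63         # exactly representable as a double

def from_double_or_nat(x):
    """Convert a double to int64 ticks, mapping NaN/Inf/out-of-range to NaT."""
    # Reject everything outside the open interval (-2^63, +2^63) up front,
    # so the conversion below never sees a value it cannot represent.
    if math.isnan(x) or not (-TWO_63 < x < TWO_63):
        return NAT
    return int(x)          # truncates toward zero (assumed cast semantics)

print(from_double_or_nat(1e20) == NAT)          # True
print(from_double_or_nat(float("nan")) == NAT)  # True
print(from_double_or_nat(-1.0))                 # -1
```

Note that `-2^63` itself lands on NaT either way, since that bit pattern is the sentinel.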

New tests (11 additions; 46 → 57 tests in the parity suite):
  * NaT_EqualsFollowsDotNetContract: verifies NaT.Equals(NaT)==true, the
    hash contract holds, and NaT works as a Dictionary key.
  * Arithmetic_OverflowSaturatesToNaT: MaxValue + 1 tick → NaT.
  * ToString_CustomFormat_DelegatesToDateTime: "yyyy-MM-dd", "HH:mm:ss" formats.
  * TryFormat_WritesDirectlyIntoSpan: covers NaT, out-of-range, valid,
    and destination-too-small paths.
  * Parse_NaTLiteral_IsCaseSensitive: "NaT" works, "nat" throws (NumPy parity).
  * Parse_ValidISO_RoundTripsFromToString.
  * TryParse_InvalidInput_ReturnsFalse.
  * IConvertible_GetTypeCode_IsObject: not TypeCode.DateTime.
  * IConvertible_ToType_HandlesCommonTargets: long/ulong/double/DateTime/
    DateTimeOffset/TimeSpan/DateTime64/string.
  * IConvertible_NaT_ToDateTime_ClampsToMinValue: verifies numeric members
    return raw tick bits (NumPy parity for NaT).
  * ConvertChangeType_RoundTripViaIConvertible: standard Convert.ChangeType
    path works end-to-end.

Tests updated:
  * NaT_ArithmeticPropagates: replaced removed AddDays/AddHours calls with
    AddTicks / operator +/- / Subtract to match the trimmed surface.
  * NaT_EqualityFollowsNumPy → NaT_OperatorEqualityFollowsNumPy: now only
    asserts the operator behavior (not Equals), since Equals has moved to
    the .NET contract semantics.

Results:
  * Full suite: 6607 passed / 0 failed on both net8.0 and net10.0.
  * Fuzz battletest (6,168 cases vs NumPy 2.4.2):
    0 real diffs maintained.
  * File size: DateTime64.cs went from 820 → 559 lines (-261 lines of
    scope-creep calendar code).
…fixes

Expands Round 6/7 happy-path coverage (B10/B11/B14/B17/B18/B19/B20) with 111
edge-case tests in a new file `NewDtypesEdgeCasesRound6and7Tests.cs`. Every
expected value is pinned to a NumPy 2.4.2 invocation captured in-line. 106
pass; 5 flag newly-identified parity bugs filed as B21–B24 in LEFTOVER.md.

Test areas added
----------------
  B11 Half unary math     22 tests  subnormals, +/-inf, NaN, MaxValue boundary
  B11 Complex unary math  20 tests  log10/log2(-0), log1p(-inf), exp2(-1+0j),
                                    VeryLarge |z|, NaN carriers, principal branches
  B10/B17 max/min/clip    11 tests  broadcasting, both-NaN, subnormals vs 0,
                                    Inf vs finite, clip lo>hi, Complex lex ties,
                                    Inf+real / NaN+imag edge, 0j vs -0j
  B14 nan*                21 tests  all-NaN slice, single-valid, ddof boundary,
                                    keepdims, axis=-1, 3D axis=0/1/2,
                                    subnormal precision, NaN-real / NaN-imag only
  B18 cumprod             7 tests   zero propagation, Inf/NaN carriers,
                                    axis=-1, 3D axis=0/1/2, single-elem axis
  B19 max/min             13 tests  axis=-1, keepdims, all-equal, +/-Inf axis,
                                    3D axis=0/1/2, lex ties on real
  B20 std/var             13 tests  single-elem axis (0 var), ddof=n/>n,
                                    keepdims, axis=-1, 3D, large-magnitude
                                    (cancellation check), subnormal precision
  Parity regression       10 tests  Complex log10(-0+0j), log10(-inf+0j),
                                    log10(inf+infj), log1p(-inf+0j),
                                    expm1(inf+0j), Half subnormal cbrt/exp2,
                                    Half log1p near -1, 2D clip broadcast,
                                    np.var/std Complex dtype lock-in

Newly-identified bugs (filed in LEFTOVER.md, tagged [OpenBugs])
---------------------------------------------------------------

  B21 Half log1p/expm1 precision loss on subnormals
      Half.LogP1(2^-24) returns 0 because (1 + 2^-24) rounds to 1 in Half
      precision. NumPy promotes to double internally; fix is one line in the
      Half log1p IL branch (Conv_R8 -> Math.Log1p -> Conv_Half).

  B22 Complex exp2 at +-Inf real returns (NaN, NaN)
      np.exp2(-inf+0j) should be 0+0j; NumSharp returns NaN+NaNj.
      np.exp2(+inf+0j) should be inf+0j; NumSharp returns NaN+NaNj.
      BCL Complex.Pow(2+0j, z) quirk for infinite real. Fix: inline special
      cases in the Complex exp2 IL branch.

  B23 np.var/np.std(Complex, axis=N) returns Complex array for single-elem axis
      When the reduced axis has size 1, trivial-axis fast path skips the
      Var/Std output-dtype promotion and returns the input element verbatim.
      Should return Double [0.0] like NumPy. Fix: route Complex through the
      Var/Std kernel even for axis size 1.

  B24 np.var/np.std(Complex, axis=N, ddof>n) returns negative value not +inf
      NumPy clamps divisor=max(n-ddof, 0); NumSharp's AxisVarStdComplexHelper
      uses raw (n-ddof). For ddof > n the divisor becomes negative giving a
      negative variance. Fix: one-line Math.Max(n-ddof, 0).

Test methodology
----------------
1. Enumerated edge-case categories per bug: subnormals, +-Inf, NaN carriers,
   +/-0, empty/single-element axis, keepdims, ddof boundaries, 3D axis=0/1/2,
   broadcasting, principal-branch checks.
2. Captured expected values by running python -c "import numpy as np; ..."
   against NumPy 2.4.2, pinning each reference in an in-line comment.
3. Probed NumSharp output via dotnet run file-based script to confirm
   divergences before tagging [OpenBugs].
4. Added regression-guard tests for edge cases that DO match NumPy so any
   future refactor of ILKernelGenerator's unary Complex / Half branch
   doesn't silently regress the working edge cases.

Results
-------
  New file:  106 passed (CI-mode) + 5 [OpenBugs] fail (expected)
  Full suite: 6713 passed / 0 failed / 11 skipped per framework (up from
              6537 baseline; CI-style filter excludes [OpenBugs]+[HighMemory])

No source code changed. All 4 new bugs come with ready-to-pass tests that
will turn green automatically once the surgical fixes land (total fix
scope: ~30 lines across 3 files).
Four surgical fixes that bring NumSharp's Half/Complex behavior to 100%
parity with NumPy 2.4.2 on the edge cases surfaced in Round 8. Each bug
was investigated end-to-end: NumPy's algorithm reverse-engineered,
NumSharp's divergence pinpointed, fix applied to match NumPy exactly.
All 5 previously-failing tests now pass; [OpenBugs] tags removed.

B21 — Half log1p/expm1 subnormal precision
------------------------------------------
NumPy: np.log1p(float16(2**-24)) → 5.96e-08
NumSharp: 0 (.NET's Half.LogP1 computes (1 + x) in Half precision; for
x ≤ 2^-11, half of the Half machine epsilon 2^-10, this rounds to 1, so
LogP1 → log(1) = 0)

Why: Half has only 10 mantissa bits, so its machine epsilon (2^-10) ≫ smallest
subnormal 2^-24. (The .NET constant `Half.Epsilon` is that smallest
subnormal, not the machine epsilon.) NumPy promotes to float32 internally.
However .NET's float.LogP1 has the SAME problem: the float32 machine epsilon
(2^-23) is also coarser than Half's smallest subnormal. Double (machine
epsilon 2^-52) is required to preserve precision.

Fix (ILKernelGenerator.Unary.Decimal.cs + ILKernelGenerator.cs):
  Emit IL that promotes Half → double, calls double.LogP1/ExpM1, then
  converts back to Half:
    [Half x]
    → call Half.op_Explicit(Half):double        // HalfToDouble (already cached)
    → call double.LogP1(double):double          // new DoubleLogP1 cache
    → call Half.op_Explicit(double):Half        // DoubleToHalf (already cached)
  Same pattern for Expm1 with double.ExpM1. Removed the now-unused
  HalfLogP1 and HalfExpM1 CachedMethods entries.
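The rounding argument can be checked with a short Python sketch that uses `struct` to round values to IEEE binary16 and binary32. It only models the precision of the intermediate `1 + x`, it does not call any .NET API, and the helper names are hypothetical.

```python
import math
import struct

def to_half(x):
    """Round a Python float (binary64) to the nearest binary16 value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def to_single(x):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

x = 2.0 ** -24                                # smallest Half subnormal

naive_half = math.log(to_half(1.0 + x))       # 1 + x collapses to 1 in Half
naive_single = math.log(to_single(1.0 + x))   # and also in float32
via_double = to_half(math.log1p(x))           # promote, log1p, round back

print(naive_half, naive_single)  # 0.0 0.0
print(via_double == x)           # True: the subnormal survives the round trip
```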

B22 — Complex exp2(±inf+0j) returns (NaN, NaN)
-----------------------------------------------
NumPy: np.exp2(-inf+0j) = 0+0j;  np.exp2(+inf+0j) = inf+0j
NumSharp: NaN+NaNj for both

Why: NumSharp used inline IL Complex.Pow(new Complex(2,0), z). .NET's
Complex.Pow evaluates as exp(z * log(2)); for z = ±inf+0j, the complex
multiplication (±inf + 0j) * 0.693 produces ±inf + NaN·j (IEEE inf·0 = NaN
in the imaginary dimension), then exp of that propagates NaN.

Fix (ILKernelGenerator.Unary.Decimal.cs):
  Replaced inline IL with a helper `ComplexExp2Helper(Complex z)`, modeled
  after the existing ComplexLog2Helper (same refactor pattern Round 6 used
  for log2):
    if (z.Imaginary == 0.0)
        return new Complex(Math.Pow(2.0, z.Real), 0.0);  // IEEE ±inf/NaN
    return Complex.Pow(new Complex(2.0, 0.0), z);        // general case
  Math.Pow(2, ±inf) correctly gives 0 and +inf per IEEE. All Round 6
  finite-input tests still pass (Math.Pow(2, r) == Complex.Pow(2+0j, r+0j)
  for finite r).
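A Python sketch of the same pure-real special case (standard library only; `exp2_complex` and `pow2_real` are hypothetical names and the general branch uses `cmath.exp` rather than a `Complex.Pow` equivalent). The last line shows the IEEE `inf * 0` product that poisons the imaginary component when the multiplication is done naively.

```python
import cmath
import math

LN2 = math.log(2.0)

def pow2_real(r):
    """2**r for a real exponent, returning inf instead of raising on overflow."""
    try:
        return math.pow(2.0, r)
    except OverflowError:
        return math.inf

def exp2_complex(z):
    if z.imag == 0.0:
        # Pure-real input: stay on the real line so +/-inf behaves per IEEE.
        return complex(pow2_real(z.real), 0.0)
    # General case: 2**z = exp(z * ln 2), scaled component-wise.
    # (cmath.exp may raise OverflowError for very large real parts.)
    return cmath.exp(complex(z.real * LN2, z.imag * LN2))

print(exp2_complex(complex(-math.inf, 0.0)))  # 0j
print(exp2_complex(complex(math.inf, 0.0)))   # (inf+0j)
print(math.isnan(math.inf * 0.0))             # True: the naive imaginary term
```

A NaN imaginary part fails the `== 0.0` test and therefore takes the general branch, which is the behavior wanted for `exp2(r+nanj)`.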

B23 — Complex var/std single-element axis returns Complex dtype
---------------------------------------------------------------
NumPy: np.var([[1+2j]], axis=0) → array([0.], dtype=float64) shape=(1,)
NumSharp: returns NDArray(dtype=Complex, value=(0,0)) — wrong dtype,
correct value

Why: Var/Std's trivial-axis fast path (`if (shape[axis] == 1) return
np.zeros(..., typeCode ?? arr.GetTypeCode.GetComputingType())`) used
GetComputingType() which for Complex returns Complex. NumPy's rule: complex
variance is real-valued, so output dtype is float64. The main IL axis path
already returns Double correctly; only the trivial-axis fast path diverged.

Fix (Default.Reduction.Var.cs + Default.Reduction.Std.cs):
  Override the default dtype for Complex input in the trivial-axis path:
    var zerosType = typeCode
        ?? (arr.GetTypeCode == NPTypeCode.Complex
            ? NPTypeCode.Double
            : arr.GetTypeCode.GetComputingType());

  GetComputingType() is a general-purpose helper used by np.sin and similar
  where Complex→Complex IS correct, so it cannot be changed globally.

B24 — ddof > n returns negative variance instead of +inf
--------------------------------------------------------
NumPy: np.var([[1+2j, 3+4j, 5+6j]], axis=1, ddof=4) → array([inf])
NumSharp: array([-16])   (ddof=5 → -8,  etc.)

Why (revised from Round 8's initial diagnosis): The per-dtype axis Var/Std
kernels all take ddof=0 by design — ddof is applied post-hoc in the
dispatcher as `var_ddof = var_0 * n / (n - ddof)`. For ddof == n the raw
formula gives +inf (correct); for ddof > n it gives `n / -k` (a negative
multiplier), silently turning variance negative. NumPy clamps the divisor
to max(n-ddof, 0) making ddof >= n uniformly yield +inf.

Fix (Default.Reduction.Var.cs + Default.Reduction.Std.cs):
  double divisor = Math.Max(axisSize - ddof, 0);
  double adjustment = (double)axisSize / divisor;         // Var
  double adjustment = Math.Sqrt((double)axisSize / divisor); // Std

This fix applies to ALL dtypes flowing through the IL Var/Std path, not
just Complex. Prior to this, any caller doing np.var(float_array, axis=N,
ddof > n) would silently receive negative variance.
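The post-hoc adjustment and its clamp can be sketched in Python (hypothetical names, not the dispatcher source). Python raises on float division by zero, so the zero-divisor case is written out explicitly to model IEEE `n / 0.0 = +inf`.

```python
import math

def ddof_adjustment(n, ddof):
    """Multiplier that converts a ddof=0 variance into a ddof=k variance."""
    divisor = max(n - ddof, 0)      # the clamp: never negative
    if divisor == 0:
        return math.inf             # IEEE n / 0.0 for n > 0
    return n / divisor

def var_with_ddof(var0, n, ddof):
    return var0 * ddof_adjustment(n, ddof)

var0, n = 2.0, 4                    # illustrative values, not from the tests
print(var_with_ddof(var0, n, 2))    # 4.0
print(var_with_ddof(var0, n, 4))    # inf  (ddof == n)
print(var_with_ddof(var0, n, 6))    # inf  (ddof > n, clamped)
print(var0 * n / (n - 6))           # -4.0 (the unclamped formula)
```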

Test updates
------------
Removed [OpenBugs] attribute from the 5 tests that were flagging these
bugs. They now pass under the CI-style filter (TestCategory!=OpenBugs).
Per-test inline comments amended to reference the Round 9 fix.

Results
-------
  Before Round 9: 6713 passed / 0 failed / 11 skipped (CI-style)
                  5 [OpenBugs] tests documenting B21-B24 failing as expected
  After  Round 9: 6718 passed / 0 failed / 11 skipped (CI-style)
                  Edge-case file: 111 pass / 0 fail (was 106 / 5)

OpenBugs count delta: 53 → 48 failed (exactly the 5 I retagged, zero
accidental closures of unrelated bugs). Confirms these fixes are tightly
scoped to the 4 bugs they target.

Source change footprint
-----------------------
  ILKernelGenerator.cs                 — +14 lines  (2 cached methods + notes)
  ILKernelGenerator.Unary.Decimal.cs   — +30 lines  (4 IL emit changes + helper)
  Default.Reduction.Var.cs             — +12 lines  (B23 dtype + B24 clamp)
  Default.Reduction.Std.cs             — +12 lines  (same B23/B24 fixes)
  4 source files, ~40 net-new lines of code, ~30 lines of comments/notes.
Round 9's B22 fix introduced a `ComplexExp2Helper` method that the IL emit
called into — a cheap shortcut that added helper-method overhead (method
prologue/epilogue, no inlining guarantee from the JIT) for what should be
~15 IL instructions. Replaced with inline IL following the established
pattern used by EmitUnaryOperation Complex Sign (ILKernelGenerator.Unary.Math.cs
lines 712-744) for the zero/non-zero branch structure.

IL structure (reads like the equivalent C#):
  Stack: [Complex z]
  -> stloc locZ
  -> ldloca locZ; call Complex.get_Imaginary; ldc.r8 0.0; bne.un lblGeneral
     // Pure-real branch
  -> ldc.r8 2.0
     ldloca locZ; call Complex.get_Real; call Math.Pow
     ldc.r8 0.0
     newobj Complex(double, double)
     br lblEnd
  -> MarkLabel(lblGeneral)
     // General branch (existing Complex.Pow path)
     ldc.r8 2.0; ldc.r8 0.0; newobj Complex(double, double)
     ldloc locZ
     call Complex.Pow(Complex, Complex)
  -> MarkLabel(lblEnd)

Bne_Un branches when values are not equal OR either is unordered (NaN), so
z.Imaginary = NaN correctly takes the general Complex.Pow branch rather than
being treated as "pure real". This preserves NumPy's
exp2(r+nanj) = nan+nanj behavior.

Added to CachedMethods:
  ComplexGetReal       (Complex.get_Real PropertyInfo.GetGetMethod())
  ComplexGetImaginary  (Complex.get_Imaginary PropertyInfo.GetGetMethod())

Removed the now-dead ComplexExp2Helper method.

Regression
----------
  All 5 B11_Complex_Exp2 tests still pass (including the ±inf ones).
  All 165 Round 6/7/8 tests pass.
  Full suite: 6718 passed / 0 failed / 11 skipped (unchanged from Round 9).
…all hops

Following the same eliminate-the-helper refactor as ComplexExp2Helper,
eliminate nine more helpers (in the six groups below) whose bodies are small
enough that a direct IL emit is cleaner than a static-method call. The
helpers were resolved at emit time via
`typeof(ILKernelGenerator).GetMethod(nameof(XxxHelper))` and invoked through
an emitted call, which (a) adds method-call overhead per element in a hot
unary/comparison loop, (b) doesn't benefit from JIT inlining across the emit
boundary, and (c) duplicates logic already expressed inline in neighboring
emit paths.

Helpers inlined
---------------

1. ComplexIsNaNHelper      → inline: `double.IsNaN(z.Real) | double.IsNaN(z.Imag)`
2. ComplexIsInfinityHelper → inline: `double.IsInfinity(z.Real) | double.IsInfinity(z.Imag)`
3. ComplexIsFiniteHelper   → inline: `double.IsFinite(z.Real) & double.IsFinite(z.Imag)`

   All three share the same IL shape — a predicate applied to both components
   combined via `and`/`or`. Factored into `EmitComplexComponentPredicate(il,
   predicate, combineWithAnd)` helper (~10 IL ops each).

4. ComplexLog2Helper       → inline: Complex.Log(z) + scale both components
                                     by constant 1/ln(2).
   Stack: Ldsfld LogE_Inv_Ln2Field (new FieldInfo in CachedMethods pointing
   at the existing runtime-computed internal static double).

5. ComplexSignHelper       → removed entirely; replaced by a call to the
                             existing inline emission at EmitSignCall(il,
                             NPTypeCode.Complex) in Unary.Math.cs:712. The
                             helper was duplicating logic that already lived
                             inline — just dead code now.

6. ComplexLessThanHelper,
   ComplexLessEqualHelper,
   ComplexGreaterThanHelper,
   ComplexGreaterEqualHelper
                           → all four collapsed into one parameterized emit:
                             `EmitComplexLexCompare(il, ComparisonOp op)`.

   The four lex-compare variants have identical structure —
     if (strict(aR, bR)) return true;
     if (reverseStrict(aR, bR)) return false;
     return imagCmp(aI, bI) [| (aI == bI) if inclusive];
   — parameterized by three OpCodes (realBranchTrue, realBranchFalse,
   imagStrictCmp) and a bool (inclusive). The switch on ComparisonOp picks
   the parameter tuple; emit is a single pass. ~35 IL ops per variant.

New cached handles
------------------
  DoubleIsInfinity, DoubleIsFinite  (MethodInfo on System.Double)
  ComplexGetReal, ComplexGetImaginary (instance property getters, already
                                       added in the ComplexExp2 inline commit)
  LogE_Inv_Ln2Field                  (FieldInfo pointing at the existing
                                       1/ln(2) runtime-computed constant,
                                       repurposed from file scope to
                                       internal static for reflection access)

Kept as helpers (not inlined)
-----------------------------
  HalfSignHelper                  — 3-way NaN/Zero/sign branch on Half, Half
                                     has no dedicated IL opcodes, so the emit would be ~25
                                     ops with Half-specific method lookups.
  AxisVarStd*Helper, NanSum*Helper,
  ArgMax*Helper, ArgMin*Helper,
  CountTrueSimdHelper             — 50+ line kernel bodies with SIMD loops;
                                     inlining them would bloat every emit
                                     instance and gain nothing (the kernel
                                     is already the hot path, not the call).

Semantics verified
------------------
  All 6 inlined paths probed via dotnet run with representative edge inputs
  (NaN-carrying Complex for the 3 predicates, equal/unequal real+imag
  combinations for all 4 lex-compare variants including ties and inclusive
  equality, 0+0j/principal-branch for log2).

Regression
----------
  165 Round 6/7/8 tests pass.
  Full CI-style suite: 6718 / 0 / 11 per framework — unchanged.
  OpenBugs count unchanged (no accidental closures or new failures).

Helpers deleted
---------------
  ComplexSignHelper, ComplexIsNaNHelper, ComplexIsInfinityHelper,
  ComplexIsFiniteHelper, ComplexLog2Helper, ComplexLessThanHelper,
  ComplexLessEqualHelper, ComplexGreaterThanHelper, ComplexGreaterEqualHelper
  — 9 helpers gone, no behavioral change.
…etest

Side-by-side battletest of the inlined IL kernels vs NumPy 2.4.2 (232 test
cases spanning IEEE corners: ±inf, NaN, subnormals, ±0, lex ties) uncovered
two pre-existing parity bugs that the prior helper-method implementations
had also been silently returning wrong values for — plus three sign-of-zero
IEEE divergences introduced by .NET BCL limitations.

B25 — Complex ordered comparison with NaN returns True (pre-existing)
---------------------------------------------------------------------

  np.array([complex(nan, 0)]) >= np.array([complex(1, 0)])  → False  (NumPy)
                                                            → True   (NumSharp)

The lex-compare emit uses Blt/Bgt which are *ordered* (NaN → branch not
taken). For aR = NaN, bR = 1, both branches skip and the code falls through
to the imag-component compare, which returns True when imag happens to be
equal. NumPy's rule: any NaN in any component → result is False.

Fix: added a 4-check NaN short-circuit at the top of EmitComplexLexCompare.
If isnan(aR) || isnan(aI) || isnan(bR) || isnan(bI), branch directly to
lblFalse before the real-part compares. This matches NumPy on all 4 ops
(lt/le/gt/ge).
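
The rule above can be sketched in plain Python (stdlib only). This is an
illustration of the semantics, not the emitted IL; the function name and the
string-valued `op` parameter are hypothetical.

```python
import math

def complex_lex_compare(a: complex, b: complex, op: str) -> bool:
    """Lexicographic compare (real first, imag as tie-break), NaN short-circuit."""
    # Any NaN in any of the four components makes every ordered compare False.
    if any(math.isnan(v) for v in (a.real, a.imag, b.real, b.imag)):
        return False
    key_a, key_b = (a.real, a.imag), (b.real, b.imag)
    if op == "lt":
        return key_a < key_b
    if op == "le":
        return key_a <= key_b
    if op == "gt":
        return key_a > key_b
    if op == "ge":
        return key_a >= key_b
    raise ValueError(f"unknown op: {op}")
```

Without the first check, `complex(nan, 0) >= complex(1, 0)` would fall through
to the imaginary compare and report True, which is the B25 symptom.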

B26 — Complex Sign for infinite magnitude returns NaN+NaNj (pre-existing)
-------------------------------------------------------------------------

  np.sign(complex(+inf, 0))    → (1+0j)      (NumPy)
                               → (nan+nanj)  (NumSharp)
  np.sign(complex(-inf, 0))    → (-1+0j)
  np.sign(complex(0, +inf))    → (0+1j)
  np.sign(complex(0, -inf))    → (0-1j)
  np.sign(complex(inf, inf))   → (nan+nanj)  # NumSharp already matches here

The Complex Sign emit (EmitSignCall's Complex case) used `z / |z|`
unconditionally. For single-component infinite z, |z| = inf, so
Complex.op_Division(inf+0j, inf) returns nan+nanj.

NumPy's rule: when |z| is infinite:
  both components infinite → nan+nanj  (direction indeterminate)
  one component infinite   → unit vector along that component

Fix: added a magnitude-is-infinite branch in the Complex Sign emit. Extract
z.Real and z.Imaginary to locals, check each with double.IsInfinity, and
emit the appropriate result:
  isinf(r) && isinf(i) → new Complex(NaN, NaN)
  isinf(r)             → new Complex(Math.CopySign(1.0, r), 0.0)
  isinf(i)             → new Complex(0.0, Math.CopySign(1.0, i))
Otherwise fall through to the existing z / |z| path. Added MathCopySign
MethodInfo to CachedMethods.
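
A minimal Python sketch of the branch order described above. The function name
is hypothetical, and the zero-magnitude case (returning 0) is an assumption
added so the sketch does not divide by zero.

```python
import math

def complex_sign(z: complex) -> complex:
    r, i = z.real, z.imag
    r_inf, i_inf = math.isinf(r), math.isinf(i)
    if r_inf and i_inf:
        # Both components infinite: direction is indeterminate.
        return complex(math.nan, math.nan)
    if r_inf:
        # Unit vector along the real axis, keeping the sign of r.
        return complex(math.copysign(1.0, r), 0.0)
    if i_inf:
        # Unit vector along the imaginary axis, keeping the sign of i.
        return complex(0.0, math.copysign(1.0, i))
    mag = abs(z)
    if mag == 0.0:
        # Assumption for this sketch: sign(0) is 0.
        return complex(0.0, 0.0)
    return z / mag  # the pre-existing z / |z| path
```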

Sign-of-zero preservation (.NET BCL workaround)
-----------------------------------------------

Three minor IEEE divergences caused by .NET BCL operations dropping the
sign of zero:

  np.log1p(float16(-0))     → -0        (NumPy)
                            → +0        (NumSharp: double.LogP1 drops sign)
  np.expm1(float16(-0))     → -0        (same)
  np.exp2(complex(-0, -0))  → (1, -0)   (NumPy)
                            → (1, +0)   (NumSharp: hardcoded 0.0)

Fix for Half log1p/expm1: wrap the result in Math.CopySign(result, input)
in the IL. Safe because log1p and expm1 preserve the sign of their argument
over their entire domain (log1p(x) has the same sign as x when x ∈ (-1, ∞);
expm1(x) has the same sign as x for all x).

Fix for Complex exp2: the pure-real branch of the exp2 inline IL now passes
z.Imaginary through instead of hardcoded 0.0. The branch is only entered
when z.Imaginary == 0 (per the Bne_Un check), so the value is always ±0 and
passing it through preserves the input's sign of zero.
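
Both fixes can be illustrated in Python with math.copysign. The function names
are hypothetical. Python's own math.log1p / math.expm1 may already keep the
sign of zero, so the wrap is shown only to mirror the IL change, and the exp2
sketch covers only the pure-real branch for inputs small enough not to
overflow.

```python
import math

def log1p_keep_sign(x: float) -> float:
    # log1p(x) has the same sign as x on (-1, inf), so copying the sign of
    # the input onto the result is a no-op except for restoring -0.0.
    return math.copysign(math.log1p(x), x)

def expm1_keep_sign(x: float) -> float:
    # expm1(x) has the same sign as x for every finite x.
    return math.copysign(math.expm1(x), x)

def exp2_pure_real(z: complex) -> complex:
    # Assumes z.imag is +0.0 or -0.0; pass it through instead of writing 0.0.
    return complex(2.0 ** z.real, z.imag)
```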

Battletest results
------------------

After all fixes: 230 of 232 cases match NumPy exactly.

Remaining 2 divergences (accepted as documented):
- exp2(complex(1e300, 0)) → NumSharp inf+0j vs NumPy inf+nanj
  NumPy computes via exp(z·ln2); exp(1e300·ln2) overflows to inf, and the
  imaginary component then becomes inf·0 = NaN. NumSharp's
  Math.Pow(2, 1e300) = inf path skips this IEEE quirk entirely and returns
  a clean inf+0j. Arguably preferable.

- exp2(complex(inf, inf)) → NumSharp nan+nanj vs NumPy inf+nanj
  The general case z.Imaginary != 0 routes through .NET's Complex.Pow,
  which has a separate BCL quirk for this input. Fixing would require a
  full exp(z·ln2) inline rewrite — not justified for a single edge case in
  the dual-infinity regime.

Both are far outside practical numerical-computing usage.

Test coverage
-------------

15 new tests in NewDtypesEdgeCasesRound6and7Tests.cs:
  4× B25: NaN in real or imag of a or b (lt/le/gt/ge); + non-NaN regression
  7× B26: ±inf real, ±inf imag, both-inf NaN, + finite-nonzero and zero regressions
  4× sign-of-zero: Half log1p/expm1(-0), Complex exp2(-0 imag) preservation,
                   + +0-stays-+0 regression

Full suite: 6733 / 0 / 11 per framework (up 15 from Round 9's 6718).
OpenBugs count unchanged (no accidental closures or new failures).

Files changed
-------------
  ILKernelGenerator.cs                     +2 lines  (MathCopySign cached)
  ILKernelGenerator.Comparison.cs          +17 lines (NaN short-circuit)
  ILKernelGenerator.Unary.Math.cs          +56 lines (Complex Sign inf branch)
  ILKernelGenerator.Unary.Decimal.cs       +27 lines (CopySign wraps + exp2 imag pass-through)
  NewDtypesEdgeCasesRound6and7Tests.cs    +159 lines (15 tests + region comments)
  LEFTOVER.md                             +91 lines  (Round 10 analysis + summary)
…sweep

First systematic coverage sweep: every supported np.* Creation function
battletested against NumPy 2.4.2 across the three new dtypes. 189-case
pipe-delimited matrix, pre-fix parity 93.7% (177/189), post-fix 100%.

Bugs closed
-----------

B27 - np.eye(N, M, k) wrong diagonal stride for non-square matrices or k != 0
    Affected all dtypes (not specific to the new ones). Previous implementation
    used `j += N+1` as the diagonal stride through the flat row-major buffer,
    but for a (N, M) matrix in C-order, consecutive diagonal elements are M+1
    apart, not N+1. Carried an unused `int i` variable and broken `skips`
    adjustment for negative k.
    Fix: rewrote with explicit row iteration formula
      rowStart = max(0, -k); rowEnd = min(N, cols - k)
      for i in [rowStart, rowEnd): flat[i*cols + (i+k)] = 1
    Also inlined Half/Complex/SByte-safe `one` construction (same pattern as
    np.ones) so Convert.ChangeType is never asked to cast double->Half/Complex.
    Site: src/NumSharp.Core/Creation/np.eye.cs
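
    The row-iteration formula, sketched over a flat row-major list in Python.
    The function name is hypothetical and no argument validation is done here.

```python
def eye_flat(n, m=None, k=0):
    """Return an (n, cols) identity-like matrix as a flat row-major list."""
    cols = n if m is None else m
    flat = [0] * (n * cols)
    row_start = max(0, -k)        # first row whose diagonal column is >= 0
    row_end = min(n, cols - k)    # first row whose diagonal column is >= cols
    for i in range(row_start, row_end):
        flat[i * cols + (i + k)] = 1   # column index on diagonal k is i + k
    return flat
```

    Consecutive writes land cols + 1 apart in the flat buffer, which is the
    stride the old `j += N+1` loop got wrong for N != M.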

B28 - np.asanyarray(NDArray, Type dtype) ignores dtype override on NDArray input
    The final `astype` conversion at the bottom of asanyarray was unreachable
    for NDArray inputs because the NDArray case returned early. Also the
    post-switch check compared `a.GetType() != dtype` (always true for
    container object vs element dtype) which is the wrong comparison.
    Fix: route the NDArray case through the bottom branch and compare against
    `ret.dtype` (the NDArray's element dtype) instead of the container type.
    Site: src/NumSharp.Core/Creation/np.asanyarray.cs

B29 - np.asarray(NDArray, Type dtype) overload missing (API gap vs NumPy)
    NumPy supports `np.asarray(arr, dtype=X)` returning `arr` as-is when dtype
    matches, else an astype'd copy. NumSharp only had scalar/array overloads.
    Fix: added explicit NDArray overload with same-dtype fast path and
    astype fallback for conversion. Uses ReferenceEquals for the null check
    because NDArray overrides `operator==` to return a broadcast NDArray<bool>.
    Site: src/NumSharp.Core/Creation/np.asarray.cs

Test coverage
-------------

New file: test/NumSharp.UnitTest/NewDtypes/NewDtypesCoverageSweep_Creation_Tests.cs
83 tests, all passing, across all Creation APIs x 3 new dtypes:
  zeros/ones  (11), empty (3), full (8), arange (9), linspace (6),
  eye (11, incl. B27 regression), identity (3), _like variants (11),
  meshgrid (3), frombuffer (4), copy (3), asarray (3, incl. B29),
  asanyarray (4, incl. B28), np.array (6).

Full suite after Round 11: 6816 / 0 / 11 per framework (up 83 from
Round 10's 6733). OpenBugs count unchanged.

Methodology
-----------

Python ref generator emits pipe-delimited KERNEL|FUNC|INPUT|SHAPE|DTYPE|VALUES
rows for each Creation function x {Half, Complex, SByte} with edge-case
inputs. C# mirror (file-based dotnet_run script with
`#:project NumSharp.Core.csproj`) produces identical rows. Python diff
script parses both and compares with tolerance per dtype (Half 1e-3,
Complex 1e-12, SByte exact). Divergences triaged into bug vs acceptable
divergence vs same-throw-behavior.
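
A rough sketch of what such a diff step could look like. The real diff script
is not shown in this commit, so everything here is assumed: the function
names, the dtype labels used as tolerance keys, and the VALUES encoding
(comma-separated literals that Python's complex() can parse).

```python
import math

# Per-dtype tolerances quoted above; the dtype labels are assumed.
TOLERANCE = {"float16": 1e-3, "complex128": 1e-12, "int8": 0.0}

def parse_row(line):
    kernel, func, inp, shape, dtype, values = line.rstrip("\n").split("|")
    vals = [complex(v) for v in values.split(",") if v.strip()]
    return (kernel, func, inp), shape, dtype, vals

def values_match(a, b, tol):
    if len(a) != len(b):
        return False
    for x, y in zip(a, b):
        for p, q in ((x.real, y.real), (x.imag, y.imag)):
            if math.isnan(p) and math.isnan(q):
                continue          # NaN matches NaN for parity purposes
            if p == q:
                continue          # also covers equal infinities
            if not abs(p - q) <= tol:
                return False
    return True

def diff_rows(ref_lines, ns_lines):
    """Return the keys of reference rows that the mirror does not reproduce."""
    mirror = {}
    for line in ns_lines:
        key, shape, dtype, vals = parse_row(line)
        mirror[key] = (shape, dtype, vals)
    mismatches = []
    for line in ref_lines:
        key, shape, dtype, vals = parse_row(line)
        other = mirror.get(key)
        if (other is None or other[0] != shape or other[1] != dtype
                or not values_match(vals, other[2], TOLERANCE.get(dtype, 0.0))):
            mismatches.append(key)
    return mismatches
```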

Files changed
-------------

src/NumSharp.Core/Creation/np.eye.cs          (B27 rewrite)
src/NumSharp.Core/Creation/np.asanyarray.cs   (B28 fix)
src/NumSharp.Core/Creation/np.asarray.cs      (B29 new overload)
test/NumSharp.UnitTest/NewDtypes/NewDtypesCoverageSweep_Creation_Tests.cs (new, 83 tests)
docs/plans/LEFTOVER.md                        (Round 11 section)
Second-pass coverage search of Creation APIs closed three more parity bugs
uncovered by 141 additional probe cases targeting gaps from Round 11
(dtype inference, error paths, byte-order prefixes, frombuffer string codes,
extreme eye dimensions, 4D+ arrays, asanyarray list/scalar inputs, etc.).

Pre-fix parity: 92% (130/141). Post-fix: 100% (141/141).
Total Creation sweep coverage: 330 probe cases at 100%, 111 regression tests.

Bugs closed
-----------

B30 - frombuffer(buffer, string dtype) parser incomplete and wrong for int8
    ParseDtypeString switch missing Half ("f2"/"e"), Complex ("c16"/"D"/"c8"/
    "F"), and INCORRECTLY mapped "i1"/"b" to NPTypeCode.Byte (uint8). NumPy's
    "i1"/"b" codes mean SIGNED 8-bit int (int8/SByte); the existing comment
    even admitted "signed byte maps to byte" as a known error.
    Fix: added Half/Complex branches, corrected i1/b to SByte. Single-precision
    complex codes (c8/F) widen to complex128 since NumSharp does not ship a
    separate complex64 type.
    Site: src/NumSharp.Core/Creation/np.frombuffer.cs

B31 - ByteSwapInPlace doesn't handle Half or Complex
    After B30 enabled "f2"/"c16" in the parser, big-endian prefixed dtypes
    (">f2", ">c16") triggered byte-swap path that silently fell through for
    Half/Complex because ByteSwapInPlace only had Int16/UInt16, Int32/UInt32/
    Single, Int64/UInt64/Double branches. Half came back as subnormals, Complex
    as denormals.
    Fix: Half reuses the 2-byte (ushort*) swap path (same underlying width).
    Complex loops `count * 2` 8-byte doubles since Complex = [real, imag]
    pair, each needing independent big-endian-to-native swap. SByte (1 byte)
    needs no swap — noted in comment.
    Accepted divergence: NumPy's dtype string ">f2"/">c16" preserves byte
    order in the dtype; NumSharp returns "float16"/"complex128" (dtype carries
    no byte-order info). Values are correct after the in-place swap.
    Site: src/NumSharp.Core/Creation/np.frombuffer.cs
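
    The swap widths can be illustrated with Python's struct module. The helper
    name is hypothetical, and the demo converts big-endian to little-endian
    explicitly rather than to whatever the host order happens to be.

```python
import struct

def swap_in_place(buf: bytearray, width: int) -> None:
    """Reverse the bytes of every `width`-byte element in buf."""
    for off in range(0, len(buf), width):
        buf[off:off + width] = buf[off:off + width][::-1]

# Half: one 2-byte element per value.
half_be = bytearray(struct.pack(">2e", 1.5, -2.0))
swap_in_place(half_be, 2)
half_values = struct.unpack("<2e", half_be)

# Complex128: swap each 8-byte double separately (count * 2 doubles),
# not the 16-byte pair as a unit, so real and imag keep their order.
cplx_be = bytearray(struct.pack(">2d", 1.0, -2.0))   # one value: 1 - 2j
swap_in_place(cplx_be, 8)
real, imag = struct.unpack("<2d", cplx_be)
```

    Swapping the 16-byte pair as a single unit would also exchange the real
    and imaginary parts, which is why the 8-byte loop runs twice per element.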

B32 - np.eye(N, M, k) doesn't validate negative N / M
    Shape.Matrix(-1, -1) computed size as (-1)*(-1) = 1 via integer multiply,
    producing a 1-element array with shape = (-1, -1). NumPy raises
    ValueError: negative dimensions are not allowed.
    Fix: argument validation at top of eye() - throws ArgumentException
    with NumPy-aligned message.
    Site: src/NumSharp.Core/Creation/np.eye.cs

Test coverage
-------------

28 new tests appended to NewDtypesCoverageSweep_Creation_Tests.cs:
  B30: 6 tests covering all new string dtype codes (f2, e, c16, D, i1, b)
  B31: 2 tests verifying big-endian Half and Complex swap correctly
  B32: 3 tests (negative N, negative M, 0x0 edge case still works)
  Extended coverage: 17 tests (full inference, arange int-truncation,
    extreme eye diagonals, linspace n=2 no-endpoint, 4D/5D zeros/ones,
    3D np.array, meshgrid sparse/ij, _like from views, large-N arange,
    all-zero-dim shape, scalar shape, frombuffer count=0).

Local test class: 83 -> 111 tests, all passing.
Full suite: 6816 -> 6844 / 0 / 11 per framework.

Methodology
-----------

Three probe matrices (`ref_creation2.py`, `ref_creation3.py`, `ref_creation4.py`)
with matching C# mirrors (`ns_creation2.cs` etc.) ran against NumPy 2.4.2.
Each probe targeted a different angle: dtype inference / error paths;
byte-order prefixes + scalar-shape edge cases; overload equivalence +
meshgrid variants + extreme dimensions. Same diff_creation.py with
tolerance per dtype.

Files changed
-------------

src/NumSharp.Core/Creation/np.frombuffer.cs   (B30 parser + B31 swap)
src/NumSharp.Core/Creation/np.eye.cs          (B32 validation)
test/NumSharp.UnitTest/NewDtypes/NewDtypesCoverageSweep_Creation_Tests.cs (+28 tests)
docs/plans/LEFTOVER.md                        (Round 12 section)
…osed

Systematic battletest of every arithmetic function / operator for the three
new dtypes vs NumPy 2.4.2. 109-case probe matrix covering +, -, *, /, %, //,
**, unary -, np.{add,subtract,multiply,divide,power,mod,floor_divide,
true_divide,negative,positive,abs,absolute,reciprocal,sign,square,sqrt,
floor,ceil,trunc,sin,cos,tan,exp,log}, broadcasting, overflow, div/0,
NaN propagation.

Pre-fix parity: 84.4% (92/109). Post-fix: 96.3% (105/109).
Remaining 4 cases are accepted BCL divergences (documented below).

Bugs closed
-----------

B3 / B38 - Complex 1/0 returned (NaN, NaN) vs NumPy (inf, NaN)
    .NET Complex.op_Division uses Smith's algorithm, which returns (NaN, NaN)
    for a/(0+0j) regardless of a. NumPy does component-wise IEEE division:
    (a.real/0, a.imag/0), giving (inf, NaN) for (1+0j)/(0+0j), (inf, inf)
    for (1+1j)/(0+0j), (NaN, NaN) for (0+0j)/(0+0j).
    Fix: Replaced op_Division call in EmitComplexOperation with a
    ComplexDivideNumPy helper that special-cases b==(0,0) and defers to
    BCL for all finite divisors (ULP-identical for finite inputs).
    Site: ILKernelGenerator.cs (EmitComplexOperation + new helper)
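
    A Python sketch of the special case. Names are hypothetical; Python raises
    on float division by zero, so a small IEEE-style helper stands in for the
    hardware divide. Dividing both components by b.real is an assumption about
    which zero is used.

```python
import math

def ieee_div(x: float, y: float) -> float:
    """Float division following IEEE 754 for a zero divisor (no exception)."""
    if y != 0.0:
        return x / y
    if x == 0.0 or math.isnan(x):
        return math.nan                      # 0/0 and NaN/0 are NaN
    return math.copysign(math.inf, x) * math.copysign(1.0, y)

def complex_divide(a: complex, b: complex) -> complex:
    if b.real == 0.0 and b.imag == 0.0:
        # Zero divisor: component-wise IEEE division, as described above.
        return complex(ieee_div(a.real, b.real), ieee_div(a.imag, b.real))
    return a / b                             # defer to the built-in divide
```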

B33 - Half/float/double floor_divide(inf, x) returned inf vs NumPy NaN
    NumPy's npy_floor_divide_@type@ rule: if a/b is non-finite, return NaN.
    NumSharp did `Math.Floor(a/b)` which preserves inf (.NET Math.Floor(inf)
    = inf). This affected both the Half path (ILKernelGenerator.cs
    Half-specific emit) and the MixedType / SIMD kernel paths.
    Fix: EmitFloorWithInfToNaN helper that wraps Math.Floor with an
    IsInfinity check, replacing the result with NaN when infinite. Patched
    three call sites covering all float dtypes.
    Sites: ILKernelGenerator.cs x2, ILKernelGenerator.Binary.cs x1
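
    The wrapper's behavior, sketched in Python with hypothetical names.
    Python's math.floor raises on infinity instead of returning it, so the
    non-finite check has to come first here; the sketch also assumes a nonzero
    divisor.

```python
import math

def floor_with_inf_to_nan(q: float) -> float:
    # A non-finite quotient becomes NaN instead of being floored.
    if math.isinf(q) or math.isnan(q):
        return math.nan
    return float(math.floor(q))

def floor_divide(a: float, b: float) -> float:
    # Sketch only: b is assumed nonzero (Python raises on float / 0.0).
    return floor_with_inf_to_nan(a / b)
```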

B35 - Integer power (int8/byte/int16-64) overflow wrong
    np.power(np.int8[50], np.int8[7]) returned -1 (NumSharp) vs -128 (NumPy).
    EmitPowerOperation routed integer inputs through Math.Pow(double, double),
    which loses precision past 2^53 and then casts back via undefined runtime
    behavior. NumPy uses native integer exponentiation with modular wrap.
    Fix: New PowerInteger fast-path in DefaultEngine.Power that uses native
    C# repeated squaring with unchecked multiplication. Covers all 8 integer
    dtypes (SByte/Byte/Int16/UInt16/Int32/UInt32/Int64/UInt64). Includes
    NumPy-parity negative-exponent handling: (1)^-n = 1, (-1)^-n = ±1
    depending on the parity of n, (|a|>1)^-n = 0.
    Site: Default.Power.cs
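
    Repeated squaring with wrap-around, sketched for int8 in Python. Python
    ints do not overflow, so the wrap is applied explicitly after each
    multiply. Names are hypothetical, and the result for base 0 with a
    negative exponent is not specified above; this sketch returns 0 for it.

```python
def wrap_int8(v: int) -> int:
    """Reduce v to the int8 range [-128, 127] with two's-complement wrap."""
    return ((v + 128) % 256) - 128

def power_int8(base: int, exp: int) -> int:
    if exp < 0:
        if base == 1:
            return 1
        if base == -1:
            return 1 if exp % 2 == 0 else -1
        return 0            # |base| > 1 (and, by assumption here, base == 0)
    result = 1
    b = wrap_int8(base)
    while exp > 0:
        if exp & 1:
            result = wrap_int8(result * b)
        b = wrap_int8(b * b)   # square for the next bit of the exponent
        exp >>= 1
    return result
```

    power_int8(50, 7) gives -128, the NumPy value quoted for B35.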

B36 - np.reciprocal(int_array) returned float64 (auto-promoted via
    ResolveUnaryReturnType) instead of preserving int dtype with C-truncated
    1/x. NumPy: reciprocal(int8 2) = 0, dtype int8.
    Fix: ReciprocalInteger fast-path in DefaultEngine.Reciprocal when no
    dtype override and input is integer dtype. Loops all 8 int types with
    x==0 ? 0 : 1/x via native integer division (so 1/2 = 0 in C).
    Site: Default.Reciprocal.cs
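
    The per-element rule in Python, with a hypothetical name. Python's `//`
    floors toward negative infinity, so the sketch truncates with int() to
    match C-style integer division.

```python
def reciprocal_int(x: int) -> int:
    """C-style truncated 1/x for an integer element; 0 maps to 0."""
    if x == 0:
        return 0
    return int(1 / x)   # int() truncates toward zero, like C integer division
```

    Only x = 1 and x = -1 produce a nonzero result; every other integer
    truncates to 0.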

B37 - np.floor / np.ceil / np.trunc(int_array) returned float64 instead of
    preserving input dtype as no-op. Same root cause as B36
    (ResolveUnaryReturnType promotes integer to Double, then applies Math.X
    and returns Double).
    Fix: Early-return `Cast(nd, nd.GetTypeCode, copy: true)` when input is
    integer and no dtype override requested. Uses existing NPTypeCodeExtensions
    .IsInteger() helper.
    Sites: Default.Floor.cs, Default.Ceil.cs, Default.Truncate.cs

Accepted divergences
--------------------

1. Complex (inf+0j)^(1+1j): BCL Complex.Pow via exp(b*log(a)) fails at
   inf inputs; NumPy handles via C complex math library. Matching would
   require rewriting Complex.Pow manually. [Misaligned] same rationale
   as Round 10's accepted exp2(inf+infj) divergence.

2. SByte integer // 0 and % 0: NumSharp returns garbage via double-cast
   path (infinity -> undefined int cast); NumPy with seterr=ignore returns
   0. Neither is "correct" in absolute terms; documented as runtime-
   seterr-dependent behavior.

Test coverage
-------------

New file: test/NumSharp.UnitTest/NewDtypes/NewDtypesCoverageSweep_Arithmetic_Tests.cs
33 tests covering all 6 closed bugs + 12 smoke tests for +/-/*/% across
the three dtypes, overflow wraps, unary negate semantics, abs for complex,
square, sign IEEE semantics, broadcasting.

Updated: test/NumSharp.UnitTest/Backends/Kernels/KernelMisalignmentTests.cs
Reciprocal_Integer_TypePromotion now asserts NumPy-parity behavior (int8
preserved, 1/2 = 0) instead of the previously-documented wrong behavior.
[Misaligned] attribute retained since int->int32 scalar promotion is
orthogonal.

Full suite: 6844 -> 6877 / 0 / 11 per framework.

Files changed
-------------

src/NumSharp.Core/Backends/Default/Math/Default.Floor.cs
src/NumSharp.Core/Backends/Default/Math/Default.Ceil.cs
src/NumSharp.Core/Backends/Default/Math/Default.Truncate.cs
src/NumSharp.Core/Backends/Default/Math/Default.Reciprocal.cs
src/NumSharp.Core/Backends/Default/Math/Default.Power.cs
src/NumSharp.Core/Backends/Kernels/ILKernelGenerator.cs
src/NumSharp.Core/Backends/Kernels/ILKernelGenerator.Binary.cs
test/NumSharp.UnitTest/Backends/Kernels/KernelMisalignmentTests.cs
test/NumSharp.UnitTest/NewDtypes/NewDtypesCoverageSweep_Arithmetic_Tests.cs (new)
docs/plans/LEFTOVER.md
…losed

Systematic battletest of every reduction function against NumPy 2.4.2 for
the three new dtypes. 80-case probe matrix surfaced ten of the twelve
remaining open bugs. Pre-fix parity 72.5%, post-fix 100%.

Scope: sum, prod, cumsum, cumprod, min, max, amax, amin, argmax, argmin,
mean, std, var, all, any, count_nonzero, nansum, nanprod, nanmin, nanmax,
nanmean, nanstd, nanvar - elementwise + axis variants.

Bugs closed
-----------

B1 - Half min/max elementwise returned +/-inf
    IL OpCodes.Bgt/Blt don't work on Half struct; the accumulator stayed at
    identity (negative/positive infinity) since no comparison ever succeeded.
    Fix: Half-specific iterator fallbacks that promote to double for
    comparison with NaN propagation.
    Site: Default.ReductionOp.cs

B2 - Complex mean axis returned Double, dropping imaginary
    Unconditional typeCode ?? Double forced axis kernels into the Double path.
    Fix: Dedicated MeanAxisComplex iterator that accumulates in Complex and
    divides by slice length, preserving the full complex mean.
    Site: Default.Reduction.Mean.cs

B4 - np.prod(Half/Complex) threw NotSupportedException
    Switch statement in prod_elementwise_il had no Half/Complex/SByte branches.
    Fix: Added SByte to IL path, Half/Complex iterator-based fallbacks.
    Site: Default.ReductionOp.cs

B5 - SByte axis reduction threw NotSupportedException
    GetIdentityValue<T> and CombineScalars<T> in the SIMD factory had no SByte.
    Fix: Added SByte branches with identity values and pair combiner.
    Site: ILKernelGenerator.Reduction.Axis.Simd.cs

B6 - Half/Complex cumsum axis threw at kernel execution
    The axis scan helpers AxisCumSumGeneral/SameType throw NotSupportedException
    mid-execution for Half/Complex. The factory try-catch doesn't help since
    the exception fires on delegate invocation.
    Fix: Skip IL fast path for Half/Complex; route to iterator fallback which
    already handles the arithmetic. Added Complex-specific branch in the
    fallback to preserve imaginary (default uses AsIterator<double>).
    Site: Default.Reduction.CumAdd.cs

B7 - argmax/argmin axis threw NotSupportedException for Half/Complex/SByte
    CreateAxisArgReductionKernel factory has no branches for these types; the
    exception occurs at kernel-creation time inside GetOrAdd and propagates.
    Fix: Short-circuit to iterator fallback for Half/Complex/SByte that calls
    argmax_elementwise_il per slice. Also fixed Half/Complex elementwise
    argmax/argmin (same Bgt/Blt-on-Half issue + lex-compare for Complex).
    Sites: Default.Reduction.ArgMax.cs, Default.ReductionOp.cs

B8 - Complex min/max elementwise threw NotSupportedException
    No Complex branch in min/max_elementwise_il.
    Fix: Iterator fallbacks using NumPy-parity lexicographic comparison
    (real first, imag as tie-break). NaN in either component produces a
    NaN result.
    Site: Default.ReductionOp.cs

B12 - Complex argmax tiebreak returned wrong index
    IL kernel used non-lex comparison (likely magnitude-based).
    Fix: Replaced with lex-compare iterator fallbacks.
    Site: Default.ReductionOp.cs

B15 - Complex nansum propagated NaN instead of skipping
    Dispatcher had an early-return for non-float types that sent Complex to
    regular Sum (which doesn't skip NaN).
    Fix: Dedicated NanSumComplex path that iterates with Complex accumulator,
    skipping entries with NaN in Real or Imag. Supports both elementwise
    and axis reductions.
    Site: Default.Reduction.Nan.cs
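
    The elementwise skip rule, sketched in Python with a hypothetical name
    (the axis variant would apply the same loop per slice):

```python
import math

def nansum_complex(values):
    total = complex(0.0, 0.0)
    for z in values:
        # Skip the entry if either component is NaN.
        if math.isnan(z.real) or math.isnan(z.imag):
            continue
        total += z
    return total
```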

B16 - Half std/var axis returned Double
    Same pattern as B2 - unconditional Double output. NumPy preserves Half
    input dtype for var/std (Complex -> Double since variance is non-negative
    real).
    Fix: axisOutType = typeCode ?? (Complex ? Double : GetComputingType()).
    Sites: Default.Reduction.Var.cs, Default.Reduction.Std.cs

Test coverage
-------------

New file: test/NumSharp.UnitTest/NewDtypes/NewDtypesCoverageSweep_Reductions_Tests.cs
34 tests (B1 x4, B2 x2, B4 x4, B5 x2, B6 x2, B7 x3, B8 x4, B12 x2,
B15 x3, B16 x3, smoke x5).

Updated pre-existing [Misaligned] tests in ConvertsBattleTests.cs that
documented wrong behavior - now assert NumPy-correct values and removed
[Misaligned] attributes: Mean_ScalarHalfArray_Works,
Mean_ScalarHalfArray_DtypeMismatch, CumSum_HalfMatrix_Axis0_NotSupported,
CumSum_HalfMatrix_Axis1_NotSupported.

Full suite: 6877 -> 6911 / 0 / 11 per framework.

Progress
--------

Round 14 closed 10 of 12 remaining open bugs in a single pass.
Before: B1, B2, B4, B5, B6, B7, B8, B9, B12, B13, B15, B16 (12 open)
After:  B9, B13 (2 open, 34 closed so far)

Files changed
-------------

src/NumSharp.Core/Backends/Default/Math/DefaultEngine.ReductionOp.cs
src/NumSharp.Core/Backends/Default/Math/Reduction/Default.Reduction.ArgMax.cs
src/NumSharp.Core/Backends/Default/Math/Reduction/Default.Reduction.CumAdd.cs
src/NumSharp.Core/Backends/Default/Math/Reduction/Default.Reduction.Mean.cs
src/NumSharp.Core/Backends/Default/Math/Reduction/Default.Reduction.Nan.cs
src/NumSharp.Core/Backends/Default/Math/Reduction/Default.Reduction.Std.cs
src/NumSharp.Core/Backends/Default/Math/Reduction/Default.Reduction.Var.cs
src/NumSharp.Core/Backends/Kernels/ILKernelGenerator.Reduction.Axis.Simd.cs
test/NumSharp.UnitTest/NewDtypes/NewDtypesCoverageSweep_Reductions_Tests.cs (new)
test/NumSharp.UnitTest/Casting/ConvertsBattleTests.cs (updated)
docs/plans/LEFTOVER.md
Closes the last two open parity bugs from the new-dtypes coverage sweep.
All 34 tracked bugs (B1-B37 minus B34 accepted-divergence and B38 alias)
are now closed.

Bugs closed
===========

B9 — np.unique(Complex) threw NotSupportedException
---------------------------------------------------

Root cause: The NDArray.unique() switch dispatch had no case for
NPTypeCode.Complex. The generic unique<T>() had an IComparable<T>
constraint that System.Numerics.Complex cannot satisfy.

Fix (src/NumSharp.Core/Manipulation/NDArray.unique.cs):
  - Added `case NPTypeCode.Complex: return uniqueComplex();` to the switch.
  - Added dedicated protected method uniqueComplex() that mirrors the
    generic hash-dedup path but uses the Comparison<T>-overload of
    LongIntroSort.Sort (no IComparable constraint needed).
  - Added NaNAwareComplexComparer class providing lexicographic compare
    (real, then imag) with any-NaN values sorted to end — matching the
    NaN-at-end semantics already used by NaNAwareDoubleComparer /
    NaNAwareSingleComparer for the float/double path and consistent with
    NumPy's unique sort order.
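
The ordering that comparer provides can be sketched in Python as a sort key.
Names are hypothetical and only the ordering is shown; the hash-dedup step is
omitted.

```python
import math

def nan_aware_key(z: complex):
    has_nan = math.isnan(z.real) or math.isnan(z.imag)
    # NaN-bearing values sort after everything else; the rest sort by
    # (real, imag). The placeholder zeros keep NaN out of the comparisons.
    return (has_nan, 0.0 if has_nan else z.real, 0.0 if has_nan else z.imag)

def sort_complex_nan_last(values):
    return sorted(values, key=nan_aware_key)
```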

Verified against NumPy 2.4.2 across 7 input patterns (sorted, reversed,
all-duplicates, single-element, same-real-different-imag, NaN-mid, pure-
imaginary-NaN). All match.

B13 — Complex argmax/argmin with NaN returned wrong index
---------------------------------------------------------

Root cause: The Round-14 ArgMaxComplexFallback / ArgMinComplexFallback
(the B12 fix) used a pure lexicographic compare and silently skipped
NaN-bearing values, because they satisfied neither the "greater" nor the
"less" branch. NumPy returns the index of the first Complex value that has
NaN in either component.

Example divergences (pre-fix):

  argmax([1+2j, nan+0j, 3+1j])   NumPy=1 NumSharp=2
  argmax([1+2j, 3+0j, nan+1j])   NumPy=2 NumSharp=1
  argmax([1+2j, 3+nanj, 5+1j])   NumPy=1 NumSharp=2
  argmin([3+1j, nan+0j, 1+2j])   NumPy=1 NumSharp=2

Fix (src/NumSharp.Core/Backends/Default/Math/DefaultEngine.ReductionOp.cs):
Added NaN-first check at the top of both loops in ArgMaxComplexFallback /
ArgMinComplexFallback: if the first element has NaN in either component,
return 0 immediately; if any subsequent element has NaN in either
component, return its index immediately. Mirrors the pattern already used
in ArgMaxHalfFallback / ArgMinHalfFallback (Round 14 B1).

Axis coverage: ArgReductionAxisFallback (B7 fix) calls the scalar
argmax_elementwise_il per slice, so the axis variant inherits the same
NaN-first semantics without additional changes.
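
A Python sketch of the NaN-first loop, with hypothetical names. It folds the
first-element check into the main loop, which returns the same indices as the
two-step description above.

```python
import math

def _has_nan(z: complex) -> bool:
    return math.isnan(z.real) or math.isnan(z.imag)

def argmax_complex(values):
    if not values:
        raise ValueError("argmax of an empty sequence")
    best = 0
    for idx, z in enumerate(values):
        if _has_nan(z):
            return idx                     # first NaN-bearing element wins
        if (z.real, z.imag) > (values[best].real, values[best].imag):
            best = idx                     # strict compare keeps first of ties
    return best

def argmin_complex(values):
    if not values:
        raise ValueError("argmin of an empty sequence")
    best = 0
    for idx, z in enumerate(values):
        if _has_nan(z):
            return idx
        if (z.real, z.imag) < (values[best].real, values[best].imag):
            best = idx
    return best
```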

Test coverage
=============

Appended 18 new tests to NewDtypesCoverageSweep_Reductions_Tests.cs:
  - B9:  9 tests (basic dedup, sorted, reversed, all-dup, single, same-
         real, NaN-mid, pure-imag-NaN, non-contig view)
  - B13: 9 tests (argmax NaN mid/first/last/imag-only, argmin NaN
         mid/first, lex-regression protecting B12, argmax axis with NaN)

Full suite: 6911 -> 6929 / 0 / 11 per framework (net8.0 + net10.0).

Comprehensive audit
===================

Verified all 34 closed bugs B1-B37 link to existing fix files and
passing regression tests:

  Bug  Round  Fix site                                               Test file
  ---  -----  -----------------------------------------------------  -----------------------
  B1   14     Default.ReductionOp.cs (Half min/max fallbacks)        Sweep_Reductions
  B2   14     Default.Reduction.Mean.cs (MeanAxisComplex)            Sweep_Reductions
  B3   13     ILKernelGenerator.cs (ComplexDivideNumPy)              Sweep_Arithmetic
  B4   14     Default.ReductionOp.cs (Prod SByte + Half/Complex)     Sweep_Reductions
  B5   14     ILKernelGenerator.Reduction.Axis.Simd.cs (SByte ID)    Sweep_Reductions
  B6   14     Default.Reduction.CumAdd.cs (skip IL + Complex iter)   Sweep_Reductions
  B7   14     Default.Reduction.ArgMax.cs (ArgReductionAxisFallback) Sweep_Reductions
  B8   14     Default.ReductionOp.cs (Min/MaxComplex lex)            Sweep_Reductions
  B9   15     NDArray.unique.cs (uniqueComplex)                      Sweep_Reductions
  B10  6      Clip Half/Complex                                      BattletestRound6Tests
  B11  6      Unary math Half/Complex (log10/log2/cbrt/exp2/log1p)   BattletestRound6Tests
  B12  14     Default.ReductionOp.cs (ArgMax/MinComplex lex)         Sweep_Reductions
  B13  15     Default.ReductionOp.cs (NaN-first Complex arg)         Sweep_Reductions
  B14  6      nanmean/nanstd/nanvar Half + Complex                   BattletestRound6Tests
  B15  14     Default.Reduction.Nan.cs (NanSumComplex)               Sweep_Reductions
  B16  14     Default.Reduction.{Std,Var}.cs (preserve Half)         Sweep_Reductions
  B17  6      Clip Half/Complex axis                                 BattletestRound7Tests
  B18  7      Complex cumprod axis                                   BattletestRound7Tests
  B19  7      Complex max/min axis                                   BattletestRound7Tests
  B20  7      Complex std/var axis                                   BattletestRound7Tests
  B21  9      Half log1p/expm1 subnormal Double promotion            EdgeCasesRound6and7
  B22  9      Complex exp2 inf-real via Math.Pow(2,r)                EdgeCasesRound6and7
  B23  9      Complex var/std single-elem axis Double zero           EdgeCasesRound6and7
  B24  9      Var/Std ddof>n clamps max(n-ddof, 0)                   EdgeCasesRound6and7
  B25  10     Complex lex compare NaN short-circuit                  EdgeCasesRound6and7
  B26  10     Complex Sign inf magnitude                             EdgeCasesRound6and7
  B27  11     np.eye.cs (rewrite diagonal stride)                    Sweep_Creation
  B28  11     np.asanyarray.cs (NDArray fast-path astype)            Sweep_Creation
  B29  11     np.asarray.cs (NDArray+Type overload)                  Sweep_Creation
  B30  12     np.frombuffer.cs (ParseDtypeString)                    Sweep_Creation
  B31  12     np.frombuffer.cs (ByteSwapInPlace)                     Sweep_Creation
  B32  12     np.eye.cs (negative-dim validation)                    Sweep_Creation
  B33  13     ILKernelGenerator.Binary.cs (EmitFloorWithInfToNaN)    Sweep_Arithmetic
  B34  —      Accepted BCL divergence (Complex.Pow inf)              n/a
  B35  13     Default.Power.cs (PowerInteger wrap)                   Sweep_Arithmetic
  B36  13     Default.Reciprocal.cs (ReciprocalInteger C-trunc)      Sweep_Arithmetic
  B37  13     Default.{Floor,Ceil,Truncate}.cs (IsInteger no-op)     Sweep_Arithmetic
  B38  —      Alias of B3 (combined during Round 13)                 n/a

Verification pass:
  - Every listed fix file exists at documented path (20/20 spot-checked)
  - Every listed regression test method exists (184+ across new-dtypes
    test files)
  - Full suite passes on both frameworks: 6929 / 0 / 11
  - Probe matrices re-run post-R15:
      Creation (189):   100.0%
      Creation-2 (68):  100.0%
      Creation-3 (41):  95.1% (2 dtype-name-string divergences — behavior
                                correct, representation differs)
      Creation-4 (32):  100.0%
      Arithmetic (109): 96.3% (4 accepted BCL divergences: 2 Complex.Pow
                                inf + 2 SByte int-div-by-zero)
      Reductions (80):  100.0%
  - Spot-checks for 14 representative fixes (B1, B3, B6, B8, B9, B13,
    B14, B16, B26, B27, B30, B35, B36, B37) all verified post-commit

Totals
======

  Closed:      34 bugs (B1-B33, B35-B37, minus B9/B13 previously open)
  Not-a-bug:   2       (B34 accepted BCL divergence, B38 alias of B3)
  Still open:  0

Coverage sweep complete for Half / Complex / SByte across Creation,
Arithmetic, and Reductions API surface.
All 11 tests pass on both frameworks (net8.0 + net10.0) after Rounds 6-15
fixes. Their [OpenBugs] attribute was filtering them out of the CI
default run, hiding passing coverage. Stale tag removed along with the
inaccurate "not supported yet" comments.

Tests re-enabled in CI:
  NewDtypesArithmeticTests.cs
    Complex_Multiply                    (closed by R6)
    Complex_Multiply_Scalar             (closed by R6)
  NewDtypesComparisonTests.cs
    Half_AsType_ToComplex               (closed pre-R11)
  NewDtypesCumulativeTests.cs
    Complex_CumProd                     (closed by R7 / B18)
  NewDtypesEdgeCaseTests.cs
    Complex_Dot                         (closed by R6)
  NewDtypesReductionTests.cs
    Half_Mean                           (closed by R14 / B2+B16)
    Half_Std                            (closed by R14 / B16)
    Complex_Mean                        (closed by R14 / B2)
    Complex_Std                         (closed by R7  / B20)
    Complex_Sum_Axis                    (closed by R7  / B19)
  NewDtypesTypePromotionTests.cs
    Half_Plus_Complex_PromotesToComplex (closed pre-R11)

Full suite: 6929 -> 6940 / 0 / 11 per framework (default CI filter).
Completes the NumPy 2.4.2 parity pass started with Rounds 1-15 by aligning
np.* class-level type aliases, rewriting np.dtype(string) with a full
FrozenDictionary lookup, extending finfo/iinfo to the new dtypes, adding
TypeError / IndexError throwing at NumPy-canonical rejection sites, and
plumbing Complex through matmul + UnmanagedMemoryBlock fills. ~1,912 new
test LoC across 9 new files + updates to 6 existing test files.

np.* type aliases (src/NumSharp.Core/APIs/np.cs)
------------------------------------------------
Breaking changes to match NumPy 2.4.2:

  Alias           Before           After             Note
  np.byte         byte (uint8)  -> sbyte (int8)      NumPy C-char convention
  np.complex64    complex128    -> throws NSE        no silent widening
  np.csingle      complex128    -> throws NSE        no silent widening
  np.uint         uint64        -> uintp (ptr)       NumPy 2.x
  np.intp         nint          -> long on 64-bit    nint maps to NPTypeCode.Empty,
                                                     which breaks dispatch
  np.uintp        nuint         -> ulong on 64-bit
  np.int_         long          -> intp              NumPy 2.x (int_ == intp)

  NSE = NotSupportedException.

Added aliases:
  np.short, np.ushort, np.intc, np.uintc, np.longlong, np.ulonglong,
  np.single, np.cdouble, np.clongdouble

Platform-detected (C-long convention: 32 bits on Windows/MSVC (LLP64),
64 bits on *nix LP64):
  np.@long, np.@ulong
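
The C-long convention above can be checked from Python with the standard
library alone. This is a minimal sketch of the platform detection, not the
NumSharp implementation; `c_long_alias` is a hypothetical helper introduced
only for illustration.

```python
import ctypes
import struct

# Width of C `long` on this platform: 32 bits under LLP64 (Windows/MSVC),
# 64 bits under LP64 (64-bit Linux/macOS).
c_long_bits = ctypes.sizeof(ctypes.c_long) * 8

# Pointer width, which is what intp / uintp follow.
pointer_bits = struct.calcsize("P") * 8


def c_long_alias(bits: int) -> str:
    """Hypothetical helper: fixed-width dtype name for a C long of `bits` bits."""
    return {32: "int32", 64: "int64"}[bits]


print("C long ->", c_long_alias(c_long_bits), "| pointer bits:", pointer_bits)
```

On 64-bit Windows the two widths differ (32 vs 64), which is why the C-long
aliases and intp/uintp need separate detection.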

np.dtype(string) parser (src/NumSharp.Core/Creation/np.dtype.cs)
-----------------------------------------------------------------
Regex parser replaced with a FrozenDictionary<string, Type> built once
at static init. Platform-detection helpers (_cLongType, _cULongType,
_intpType, _uintpType) declared BEFORE the dictionary since static
initializers run top-down and BuildDtypeStringMap reads them.

Covers:
  - Single-char NumPy codes: ? b B h H i I l L q Q p P e f d g D G
  - Sized forms: b1 i1 u1 i2 u2 i4 u4 i8 u8 f2 f4 f8 c16
  - Lowercase names: bool int8..int64 uint8..uint64 float16..float64
                     complex complex128 half single double byte ubyte
                     short ushort intc uintc int_ intp uintp bool_ int
                     uint long ulong longlong ulonglong longdouble
                     clongdouble
  - NumSharp-friendly: SByte Byte UByte Int16..UInt64 Half Single Float
                       Double Complex Bool Boolean boolean Char char
                       decimal

Unsupported codes throw NotSupportedException:
  - Bytestring (S / a), Unicode (U), datetime (M), timedelta (m),
    object (O), void (V) - NumSharp has no equivalents
  - complex64 / 'F' / 'c8' - NumSharp only has complex128
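
The build-once lookup described above can be sketched in Python with a
read-only mapping. The table below is an illustrative subset, the function
name `parse_dtype` is hypothetical, `NotImplementedError` stands in for
NotSupportedException, and the `ValueError` for unknown strings is an
assumption (the commit does not say what unknown strings raise).

```python
from types import MappingProxyType

# Illustrative subset of the string -> dtype table, built once at import time.
_DTYPE_MAP = MappingProxyType({
    "?": "bool", "b1": "bool", "bool": "bool",
    "b": "int8", "i1": "int8", "int8": "int8",
    "B": "uint8", "u1": "uint8", "uint8": "uint8",
    "e": "float16", "f2": "float16", "half": "float16",
    "f": "float32", "f4": "float32", "single": "float32",
    "d": "float64", "f8": "float64", "double": "float64",
    "D": "complex128", "c16": "complex128", "complex": "complex128",
})

# Valid NumPy codes that are rejected per the list above.
_UNSUPPORTED = frozenset(
    {"S", "a", "U", "M", "m", "O", "V", "F", "c8", "complex64"}
)


def parse_dtype(code: str) -> str:
    """Hypothetical parser: one dictionary lookup, no regex."""
    if code in _UNSUPPORTED:
        # Stand-in for NotSupportedException.
        raise NotImplementedError(f"dtype '{code}' has no equivalent")
    try:
        return _DTYPE_MAP[code]
    except KeyError:
        # Assumption: the error type for unknown strings is not stated.
        raise ValueError(f"unknown dtype string '{code}'") from None
```

Keeping the unsupported codes in their own set lets the error message
distinguish "valid NumPy, not supported here" from "not a dtype string".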

np.finfo + np.iinfo (src/NumSharp.Core/APIs/np.{finfo,iinfo}.cs)
----------------------------------------------------------------
np.finfo gains:
  - Half (IEEE binary16: bits=16, eps=2^-10, smallest_subnormal=2^-24,
    maxexp=16, minexp=-14)
  - Complex (reports underlying float64 values with dtype=float64 per
    NumPy parity: finfo(complex128).dtype == float64)

np.iinfo gains SByte (int8) with signed min/max and 'i' kind.
IsSupportedType extended to accept Half, Complex, SByte.
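
The Half constants above follow directly from the IEEE binary16 layout
(1 sign bit, 5 exponent bits, 10 fraction bits). A short sketch that derives
them from raw bit patterns using Python's `struct` module, whose `e` format
is binary16; `half_from_bits` is a helper introduced here for illustration.

```python
import struct


def half_from_bits(bits: int) -> float:
    """Decode a 16-bit pattern as an IEEE 754 binary16 value."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]


one = half_from_bits(0x3C00)                 # 1.0
next_after_one = half_from_bits(0x3C01)      # smallest Half above 1.0
eps = next_after_one - one                   # spacing at 1.0
smallest_subnormal = half_from_bits(0x0001)  # lowest nonzero bit pattern
smallest_normal = half_from_bits(0x0400)     # exponent field = 1
largest = half_from_bits(0x7BFF)             # largest finite Half

print(eps, smallest_subnormal, smallest_normal, largest)
```

The largest finite value is just below 2^16, consistent with maxexp=16, and
the smallest normal is 2^-14, consistent with minexp=-14.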

find_common_type table (src/NumSharp.Core/Logic/np.find_common_type.cs)
-----------------------------------------------------------------------
~30 table entries swapped from np.complex64 -> np.complex128 to reflect
NumPy 2.4.2 rules and avoid relying on the now-throwing alias. No
behavioral change for callers: the previous complex64 alias pointed at
Complex anyway.

NDArray implicit/explicit casts
-------------------------------
src/NumSharp.Core/Casting/Implicit/NdArray.Implicit.ValueTypes.cs

Added implicit scalar -> NDArray for `sbyte` and `Half`.
Added explicit NDArray -> `sbyte` scalar.

Common validation factored into EnsureCastableToScalar(nd, targetType,
targetIsComplex):
  - ndim != 0                           -> IncorrectShapeException
  - non-complex target + complex source -> TypeError

Python's `int(complex(1, 2))` raises TypeError; NumSharp matches.
NumPy itself only emits a ComplexWarning and discards the imaginary part;
NumSharp treats that case as a hard error because it has no warning
mechanism.
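
A rough Python sketch of the two validation rules, under the assumption
that the checks run in the order listed. `IncorrectShapeError` and the
function signatures are illustrative stand-ins, not the C# API.

```python
class IncorrectShapeError(ValueError):
    """Stand-in for IncorrectShapeException (illustrative name)."""


def ensure_castable_to_scalar(ndim: int,
                              source_is_complex: bool,
                              target_is_complex: bool) -> None:
    """Reject casts that cannot produce a single scalar of the target kind."""
    if ndim != 0:
        raise IncorrectShapeError("only 0-d arrays convert to a scalar")
    if source_is_complex and not target_is_complex:
        raise TypeError("cannot convert complex to a real scalar")


def python_int_of_complex_raises() -> bool:
    """Check the Python behavior the commit cites."""
    try:
        int(complex(1, 2))
    except TypeError:
        return True
    return False
```

Complex-to-Complex passes the second check, so only casts that would have
to discard an imaginary part are rejected.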

NumPy-parity error types at rejection sites
--------------------------------------------
  - Default.Shift.ValidateIntegerType: NotSupportedException -> TypeError
    ("ufunc 'left_shift' not supported for the input types, ... safe casting")
  - NDArray.Indexing.Selection.{Getter,Setter}: ArgumentException -> IndexError
    ("only integers, slices (':'), ellipsis ('...'), numpy.newaxis ('None')
     and integer or boolean arrays are valid indices")
  - np.repeat: permissive Half/Complex truncation -> TypeError
    ("Cannot cast array data from dtype('float16') to dtype('int64')
     according to the rule 'safe'")

New exception (src/NumSharp.Core/Exceptions/IndexError.cs):
  public class IndexError : NumSharpException
  Mirrors Python's IndexError. Raised for out-of-range subscripts and
  invalid index types (e.g. float/complex index on an ndarray).

UnmanagedMemoryBlock.Allocate cross-type fill
---------------------------------------------
src/NumSharp.Core/Backends/Unmanaged/UnmanagedMemoryBlock.cs

Replaced direct unboxing casts `(Half)fill` / `(Complex)fill` / etc. with
Utilities.Converts.ToXxx(fill) dispatchers. Previously, `fill = 1` passed
to a Half array threw InvalidCastException because a boxed int cannot
be unboxed to Half. Fills now follow the same NumPy-parity wrapping path
as the rest of the casting subsystem (int -> Half, double -> Complex, etc.).

Complex matmul
--------------
src/NumSharp.Core/Backends/Default/Math/BLAS/Default.MatMul.2D2D.cs

MatMulMixedType<TResult> now short-circuits to MatMulComplexAccumulator
when TResult is Complex - the double-precision accumulator was dropping
imaginary parts for Complex outputs. The dedicated path accumulates in
Complex across K and writes Complex-precision results.
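
To illustrate why the accumulator type matters, here is a small pure-Python
sketch. `matmul_real_accumulator` is a simplified stand-in for the failure
mode described above (a real-valued accumulator), not a reproduction of the
original C# code; both function names are illustrative.

```python
def matmul_complex(a, b):
    """Naive (M,K) x (K,N) product with a complex accumulator."""
    rows, inner, cols = len(a), len(b), len(b[0])
    out = [[0j] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = 0j  # complex accumulator carried across the K dimension
            for k in range(inner):
                acc += a[i][k] * b[k][j]
            out[i][j] = acc
    return out


def matmul_real_accumulator(a, b):
    """Contrast case: a float accumulator keeps only the real part."""
    rows, inner, cols = len(a), len(b), len(b[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = 0.0
            for k in range(inner):
                acc += (a[i][k] * b[k][j]).real  # imaginary part is lost here
            out[i][j] = acc
    return out


a = [[1 + 2j, 3j]]        # shape (1, 2)
b = [[2], [1 - 1j]]       # shape (2, 1)
print(matmul_complex(a, b), matmul_real_accumulator(a, b))
```

For these inputs the products are 2+4j and 3+3j, so the complex path yields
5+7j while the real accumulator yields only 5.0.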

np.asanyarray
-------------
src/NumSharp.Core/Creation/np.asanyarray.cs

Half and System.Numerics.Complex added to the scalar-detection branch.
Previously fell through to "Unable to resolve asanyarray for type
Half/Complex" because neither matched IsPrimitive.

Test coverage (~1,912 new LoC)
------------------------------
  NpTypeAliasParityTests              174 LoC - every np.* alias vs NumPy 2.4.2
  np.finfo.NewDtypesTests             262 LoC - Half / Complex finfo
  np.iinfo.NewDtypesTests              95 LoC - SByte iinfo
  UnmanagedMemoryBlockAllocateTests   226 LoC - cross-type fills
  ComplexToRealTypeErrorTests         170 LoC - Complex -> int/float scalar cast
  NDArrayScalarCastTests              384 LoC - 0-d cast matrix (implicit + explicit)
  Complex64RefusalTests               116 LoC - complex64 / csingle throw
  DTypePlatformDivergenceTests        166 LoC - 'l'/'L'/'int' platform behavior
  DTypeStringParityTests              319 LoC - every dtype string vs NumPy

Updates to existing tests:
  - ConvertsBattleTests.cs: [Misaligned] tags removed from Half/Complex
    repeat/shift/index cases; assertions aligned to NumPy-parity TypeError
  - ShiftOpTests.cs: NotSupportedException -> TypeError
  - np.finfo.BattleTest / np.iinfo.BattleTest: "float" now -> 64 bits
    (alias for float64); "int" now -> intp (64 on 64-bit)
  - np.dtype.Test: split into Case1_ValidForms / renamed classes
  - np.find_common_type.Test: complex64 -> complex128; added
    Case4b_c8_ThrowsNotSupported guard

Docs
----
  docs/website-src/docs/NDArray.md       663 LoC - user-facing NDArray guide
  docs/website-src/docs/dtypes.md        610 LoC - dtype reference
  docs/website-src/docs/toc.yml          NDArray + Dtypes added to TOC
  docs/plans/REVIEW_FINDINGS.md          306 LoC - review notes
  docs/releases/RELEASE_0.51.0-prerelease.md - release notes for the branch

Audit of "NPTypeCode.Single =>" arrow-switch expressions across 23 files
found 11 gaps where Half/SByte/Complex were missing. Fixed each and
tightened several related behaviors to strict NumPy 2.x parity.

## Core fixes (Half/SByte/Complex coverage)

- np.repeat: add SByte/Half/Complex to RepeatScalarTyped/RepeatArrayTyped
  switches. Previously threw NotSupportedException for these dtypes.

- np.any / np.all axis: add SByte/Half/Complex to axis dispatch
  (generic ComputePerAxis<T> already supports via unmanaged constraint).

- ILKernelGenerator.Reduction.Axis.Arg: add SByte/Half/Complex to
  argmax/argmin axis dispatch. Added ArgReduceAxisHalfNaN (NumPy
  first-NaN-wins semantics via double) and ArgReduceAxisComplex
  (lexicographic real-then-imag, NaN propagates). sbyte added to
  CompareGreater/Less.

- ReductionKernel.GetMinValue/GetMaxValue: add SByte/Half/Complex
  identities (sbyte.Min/MaxValue, Half.Negative/PositiveInfinity,
  Complex(inf,0) sentinels for Max/Min identity on empty arrays).

- Default.Reduction.Nan ExecuteNanAxisReductionScalar: add Half case +
  ReduceNanAxisScalarHalf helper covering NanSum/NanProd/NanMin/NanMax.
  Previously silently returned 0 for Half axis NaN reductions.

- ILKernelGenerator.Reduction.Axis.NaN: updated doc comment clarifying
  Half/Complex route to scalar fallback (resolved by the above fix).

- Default.ATan2: add SByte/Half to ConvertToDouble/ConvertToDecimal and
  Half to result-type switch. Complex excluded (NumPy arctan2 rejects
  complex inputs — matches np.arctan2 TypeError).

- np.can_cast ValueFitsInType: add Half (range-checked ±65504) and
  Complex (always true from real) to every `case` arm; added
  `case Half h:` and `case Complex c:`. Full 13×13 can_cast matrix
  now matches NumPy exactly.

- ILKernelGenerator EmitDecimalConversion: added SByte conversion via
  new CachedMethods.DecimalImplicitFromSByte / DecimalToSByte. Previously
  sbyte↔decimal IL conversions threw NotSupportedException.

- np.sctype2char: fix Boolean '?' (was incorrectly 'b'), add SByte 'b',
  add Half 'e'. Matches NumPy 2.x np.dtype(x).char.
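
The argmax semantics named in the list above (first NaN wins for Half,
lexicographic real-then-imag for Complex) can be sketched in plain Python.
These are illustrative 1-D versions with hypothetical names, not the axis
kernels themselves.

```python
import math


def argmax_first_nan(values):
    """Index of the maximum; the first NaN encountered wins."""
    if not values:
        raise ValueError("argmax of an empty sequence")
    best = 0
    for i, v in enumerate(values):
        if math.isnan(v):
            return i              # NaN propagates: first NaN index is returned
        if v > values[best]:
            best = i              # strict '>' keeps the first of equal maxima
    return best


def argmax_complex(values):
    """Index of the maximum under (real, imag) lexicographic ordering."""
    if not values:
        raise ValueError("argmax of an empty sequence")
    best = 0
    for i, v in enumerate(values):
        if math.isnan(v.real) or math.isnan(v.imag):
            return i              # NaN in either component propagates
        if (v.real, v.imag) > (values[best].real, values[best].imag):
            best = i
    return best
```

The Half variant described above compares via double; in Python every float
is already a double, so no conversion step appears in the sketch.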

## Strict-parity fixes discovered during verification

- ATan2 auto-promotion now matches NumPy 2.x per-input targeting:
  bool/i8/u8 → float16, i16/u16 → float32, i32+/i64+/char → float64,
  float types preserved, binary takes max. Added PromoteATan2Single +
  PromoteATan2Binary helpers. Previously everything except f32+f32
  promoted to double.

- common_type_code rewritten to match NumPy exactly:
  * Boolean input: raises TypeError "non-numeric array" (NumPy parity)
  * Any Complex → Complex
  * Any Decimal → Decimal (NumSharp extension)
  * Any integer/char → Double (forces float64 even if smaller float present)
  * Otherwise: max pure float (Half < Single < Double)
  12×12 matrix now identically matches NumPy.

- Empty reduction dtype: sum/prod of empty array now uses
  GetAccumulatingType() so int/bool → Int64/UInt64, floats preserved.
  Previously returned input dtype (sum([], sbyte) gave SByte, NumPy
  gives int64). Fixed in Default.Reduction.Add (HandleEmptyArrayReduction
  + IsEmpty path) and Default.Reduction.Product (both IsEmpty paths).
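
The common_type_code rules listed above read naturally as an ordered
decision chain. A sketch over dtype names as strings follows; evaluating the
rules in the order listed (Complex before Decimal) is an assumption, and the
function name and string labels are illustrative.

```python
_INTEGERS = frozenset({
    "int8", "uint8", "int16", "uint16",
    "int32", "uint32", "int64", "uint64", "char",
})
_FLOAT_RANK = {"float16": 0, "float32": 1, "float64": 2}


def common_type(first: str, *rest: str) -> str:
    """Sketch of the rules above, applied in the order they are listed."""
    dtypes = (first, *rest)
    if "bool" in dtypes:
        raise TypeError("can't get common type for non-numeric array")
    if "complex128" in dtypes:
        return "complex128"
    if "decimal" in dtypes:          # NumSharp extension, per the list above
        return "decimal"
    if any(d in _INTEGERS for d in dtypes):
        return "float64"             # forced even if a smaller float is present
    # Only pure floats remain: take the widest one.
    return max(dtypes, key=_FLOAT_RANK.__getitem__)
```

Note that a single integer input is enough to force float64, so
`common_type("int8", "float16")` does not stay at float16.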

## Test additions

- test/NumSharp.UnitTest/APIs/np.common_type.BattleTest.cs:
  Complete rewrite — 77 comprehensive tests covering:
  - Boolean TypeError (5 tests)
  - Single integer inputs → Double (9 tests)
  - Single float preserved (3 tests)
  - Complex/Decimal (2 tests)
  - Pure float combos → max float (9 tests)
  - Integer+Integer combos (7 tests)
  - Integer+Float combos (10 tests)
  - Complex combos (9 tests)
  - Decimal combos with float/int/complex (5 tests)
  - NDArray / Type overloads (12 tests)
  - Argument validation (3 tests)

- test/NumSharp.UnitTest/Backends/Kernels/BinaryOpTests.cs:
  8 new ATan2_* tests pinning Half/SByte/Int16 NumPy parity, including:
  ATan2_Float16_ReturnsHalf, ATan2_Int8_ReturnsFloat16,
  ATan2_UInt8_ReturnsFloat16, ATan2_Int16_ReturnsFloat32,
  ATan2_Float16_Int8_ReturnsFloat16,
  ATan2_Float16_Int32_ReturnsFloat64,
  ATan2_Int16_Float16_ReturnsFloat32.

- test/NumSharp.UnitTest/APIs/np.type_checks.BattleTest.cs:
  Updated Sctype2Char_Boolean to expect '?' (matches NumPy);
  added Sctype2Char_SByte ('b') and Sctype2Char_Half ('e').
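
The promotion rule pinned by the ATan2_* tests above can be sketched as a
per-input lookup followed by a max over float widths. Treating uint32 and
uint64 like their signed counterparts is an assumption drawn from the
"i32+/i64+" wording; Decimal is left out, and `promote_atan2` is a
hypothetical name.

```python
# Per-input float target, following the mapping given in the commit message.
_PER_INPUT = {
    "bool": "float16", "int8": "float16", "uint8": "float16",
    "int16": "float32", "uint16": "float32",
    "int32": "float64", "uint32": "float64",
    "int64": "float64", "uint64": "float64", "char": "float64",
    "float16": "float16", "float32": "float32", "float64": "float64",
}
_RANK = ["float16", "float32", "float64"]


def promote_atan2(x1: str, x2: str) -> str:
    """Result dtype of arctan2(x1, x2): the wider of the two per-input targets."""
    return max(_PER_INPUT[x1], _PER_INPUT[x2], key=_RANK.index)


for pair in [("float16", "int8"), ("float16", "int32"), ("int16", "float16")]:
    print(pair, "->", promote_atan2(*pair))
```

The three pairs printed correspond to ATan2_Float16_Int8_ReturnsFloat16,
ATan2_Float16_Int32_ReturnsFloat64, and ATan2_Int16_Float16_ReturnsFloat32.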

## Verification methodology

Every change was verified against NumPy 2.x via python_run reference runs.
The side-by-side 13×13 can_cast grid and 12×12 common_type grid both
produce output identical to NumPy. Half→double widening is exact per
IEEE 754 (every binary16 value is representable in binary64), so a
Half→double→Half round trip is lossless and matches NumPy's internal
float16 handling.

## Test results

7192 passed / 0 failed / 11 skipped on both net8.0 and net10.0.
(+63 net tests vs pre-audit; rewritten common_type suite replaced
14 older tests with 77 parity-locked ones.)

## Behavioral breaking changes (NumPy parity)

- np.sctype2char(Boolean): 'b' → '?'
- np.common_type(Boolean): returned Double → now throws TypeError
- np.arctan2(i8/u8/bool): returned Double → now returns Half
- np.arctan2(i16/u16): returned Double → now returns Single
- np.arctan2(f16): returned Double → now returns Half
- np.sum/np.prod of empty integer array: returned input dtype →
  now returns Int64/UInt64 accumulating type
@Nucs Nucs marked this pull request as ready for review April 22, 2026 20:13
@Nucs Nucs merged commit 5eedb81 into master Apr 22, 2026
7 checks passed