fix(us-bf-023): remove duplicate methods in OptimizationResult #200

Merged

ooples merged 1 commit into merge-dev2-to-master from fix/us-bf-023-resolve-duplicate-methods-optimizationresult on Oct 23, 2025

Conversation

@ooples (Owner) commented Oct 23, 2025

Summary

Removes duplicate method definitions in OptimizationResult that were causing CS0111 compilation errors.

Changes:

  • Removed duplicate DeepCopy() method definition (kept first occurrence)
  • Removed duplicate WithParameters(Vector<T>) method definition (kept first occurrence)

Build Status:

  • Before: 2 CS0111 errors for duplicate methods
  • After: 0 errors in OptimizationResult.cs
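For context, CS0111 is the C# compiler error raised when a type declares two members with identical signatures. A minimal repro sketch (hypothetical class, not the actual OptimizationResult code):

    // Hypothetical repro — not the actual OptimizationResult source:
    public class Example<T>
    {
        public Example<T> DeepCopy() => this;

        // error CS0111: Type 'Example<T>' already defines a member called
        // 'DeepCopy' with the same parameter types
        public Example<T> DeepCopy() => this;
    }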

User Story: US-BF-023

- Removed duplicate DeepCopy() method definition (lines 511-528)
- Removed duplicate WithParameters() method definition (lines 535-552)
- Resolves CS0111 compilation errors for OptimizationResult class

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings October 23, 2025 15:45
Copilot AI (Contributor) left a comment

Pull Request Overview

This PR fixes CS0111 compilation errors in OptimizationResult.cs by removing duplicate method definitions that were causing build failures.

Key Changes:

  • Removed duplicate DeepCopy() method definition (lines 507-531)
  • Removed duplicate WithParameters(Vector<T>) method definition (lines 533-556)

ooples merged commit e96b172 into merge-dev2-to-master on Oct 23, 2025
0 of 2 checks passed
ooples deleted the fix/us-bf-023-resolve-duplicate-methods-optimizationresult branch on October 23, 2025 17:43
ooples pushed a commit that referenced this pull request Apr 18, 2026
feat(tensors-parity): diagnostics surface + subclass PredictEager routing (gaps 9+14)

## Gap 9 — autotune cache diagnostics

Tensors ships AutotuneCache (Helpers.Autotune namespace) with per-kernel
autotune storage. Users couldn't see whether it was active. Filed Tensors
issue #200 for the WarmupCommonKernelsAsync convenience; added diagnostics
surface here so users at least see the cache path + hardware fingerprint.

- src/Diagnostics/AccelerationDiagnostics.cs: GetReport now emits
  AutotuneCache.DefaultCachePath + CurrentHardwareFingerprint.
  AccelerationSnapshot carries both as AutotuneCachePath / AutotuneHardwareFingerprint.

## Gap 14 — subclass Predict() routing through PredictCompiled

11 NeuralNetwork subclasses overrode Predict as literally `return Forward(input);`,
bypassing the base's PredictCompiled auto-compile path. Refactored each to
override PredictEager (the base's compile-lambda eager fallback) instead,
keeping Forward as the implementation but routing through CompiledModelHost.

After: every Predict on these 11 models goes through _compileHost.Predict,
which traces → compiles → replays (and triggers disk caching via PlanCache
when configured, from Gap 1).
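The refactor pattern, in sketch form (PredictEager and the Tensor<T> shapes are taken from this message; the exact base-class signatures may differ):

    // Before (per subclass): overriding Predict bypassed the compile path.
    public override Tensor<T> Predict(Tensor<T> input)
    {
        return Forward(input);
    }

    // After: override the eager fallback instead. The base Predict stays in
    // control and routes through _compileHost.Predict (trace -> compile -> replay).
    protected override Tensor<T> PredictEager(Tensor<T> input)
    {
        return Forward(input);
    }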

Files touched:
- src/NeuralNetworks/ConvolutionalNeuralNetwork.cs
- src/NeuralNetworks/EfficientNetNetwork.cs
- src/NeuralNetworks/FastText.cs
- src/NeuralNetworks/GloVe.cs
- src/NeuralNetworks/MobileNetV2Network.cs
- src/NeuralNetworks/ResNetNetwork.cs
- src/NeuralNetworks/SiameseNeuralNetwork.cs
- src/NeuralNetworks/UNet3D.cs
- src/NeuralNetworks/VGGNetwork.cs
- src/NeuralNetworks/VoxelCNN.cs
- src/NeuralNetworks/Word2Vec.cs

Forward methods unchanged — they still have their GPU-resident fast path
(TryForwardGpuOptimized etc.) and shape-validation logic. The base's
PredictCompiled treats Forward as the eager fallback but AutoTracer fires
on first call regardless.

## Verify

- dotnet build -f net10.0 — clean
- dotnet build -f net471 — clean
- dotnet test CompiledTapeTrainingStep + FusedOptimizer — 9/9 passing

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ooples pushed a commit that referenced this pull request Apr 19, 2026
feat(tensors-parity): diagnostics surface + subclass PredictEager routing (gaps 9+14)
ooples added a commit that referenced this pull request Apr 20, 2026
…loses #1015) (#1155)

* chore(deps): bump AiDotNet.Tensors 0.38.0 → 0.42.3

Also bumps AiDotNet.Native.OneDNN in lockstep. Picks up the recent
perf work (backward-op primitive fast paths, net471 SIMD gap fix,
memory planning / tile scheduling / operator reordering, plan
serialization + stitching audit fixes) and is the baseline for the
dead-JIT-scaffolding cleanup in issue #1015.

Both net10.0 and net471 build with 0 errors. No source changes needed
— the Tensors API is backward-compatible across 0.38 → 0.42.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: remove dead src/InferenceOptimization/ tree and TapeLayerBridge

The src/InferenceOptimization/ directory (16 .cs files + Core/IR/Kernels/Passes
subdirs + README + ARCHITECTURE) was orphaned JIT IR/optimization-pass
scaffolding from a compilation model that no longer exists. Zero production
code in src/ referenced it — only tests and benchmarks did. The AiDotNet.Tensors
compilation system (AutoTracer + CompiledInferencePlan + CompiledTrainingPlan,
auto-enabled) is the replacement.

Removed:
- src/InferenceOptimization/ (full tree, 36 .cs files)
- tests/AiDotNet.Tests/InferenceOptimization/ + IntegrationTests/InferenceOptimization/
- AiDotNetBenchmarkTests/InferenceOptimization/
- The InferenceOptimization PR #768 regression-tests region in MergedPRBugFixTests.cs
- global using MemoryLayout / QuantizationParams aliases in src/Helpers/UsingsHelper.cs
  and tests/AiDotNet.Tests/GlobalUsings.cs (nothing else in either project uses
  either of those short names)
- src/NeuralNetworks/SyntheticData/TapeLayerBridge.cs — public method
  ExportMLPGeneratorGraph had zero callers; xmldoc falsely claimed WGAN-GP use
  but WGANGP.cs:450-456 already uses GradientTape<T> directly
- SyntheticTabularGeneratorBase.ExportMLPGeneratorGraph wrapper (dead)
- SyntheticTabularGeneratorBase.SupportsJitCompilation + ExportComputationGraph
  stubs (both threw NotSupportedException)

Scrubbed TapeLayerBridge mentions from MedSynthGenerator, MisGANGenerator,
TimeGANGenerator private helper xmldoc (the helpers themselves remain —
they're still used by in-progress gradient-penalty code or kept for
analysis).

Cosmetic: the "// InferenceOptimization Operations" and
"// Fused Operations for InferenceOptimization" comment labels in
src/Enums/OperationType.cs are replaced with generic labels. The enum
values themselves are public API and left in place.

LIVE and untouched (different systems that share a prefix):
- src/Inference/InferenceOptimizer.cs (KV-cache, speculative decoding)
- src/Configuration/InferenceOptimizationConfig.cs (quantization config)

Build: net10.0 + net471 both green, 0 errors. Tests build green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor: remove dead AiDotNet-side ExportComputationGraph scaffolding

The removed AiDotNet JitCompiler left behind a graveyard of stub
methods and now-unreachable graph-building helpers. All of them either
threw NotSupportedException or built ComputationNode<T> trees for a
compiler that no longer exists.

Deleted:
- ExportComputationGraph method on: LayerBase, NeuralNetworkBase,
  ClassifierBase, MultiLabelClassifierBase, RegressionBase,
  NonLinearRegressionBase, DecisionTreeRegressionBase,
  DecisionTreeAsyncRegressionBase, DiffusionModelBase, VAEModelBase,
  NoisePredictorBase, AutoMLModelBase, ShardedModelBase, CausalModelBase,
  OnlineLearningModelBase, TimeSeriesModelBase,
  ReinforcementLearningAgentBase, SurvivalModelBase, NBEATSBlock,
  AiModelResult, MCDropoutLayer, BayesianDenseLayer.
- ConvertLayerToGraph helper on NeuralNetworkBase.
- SupportsJitCompilation property on LayerBase + same accompanying
  bases (kept on IActivationFunction / IVectorActivationFunction
  interfaces and their implementations since those are still consumed
  by LoRA / SE / Hyperbolic layer fallback paths).
- Layer-internal graph helpers: LayerBase.ApplyActivationToGraph,
  CanActivationBeJitted, SparseLinearLayer.ApplyActivationToComputationNode,
  SqueezeAndExcitationLayer.ApplyActivationToGraphNode,
  SpyNetLayer.{BuildComputationGraph,BuildPyramidGraph,CreateGridFromFlowGraph,
  CreateIdentityGrid,CreateScaleTensor},
  DeformableConvolutionalLayer.BuildComputationGraph.
- NonLinearRegressionBase kernel-graph helpers:
  ComputeLinearKernel, ComputeRBFKernel, ComputeSigmoidKernel,
  ComputePolynomialKernel, ComputeLaplacianKernel, CreateFilledTensorLike.
- DecisionTreeRegressionBase / DecisionTreeAsyncRegressionBase
  ExportNodeAsComputationGraph + GetMaxFeatureIndexFromTree helpers.
- NBEATSBlock.ApplyBasisExpansionGraph helper.
- TestScaffoldGenerator emitting the SupportsJitCompilation /
  ExportComputationGraph stubs into generated test fixtures.

Stale xmldoc bullets "JIT compilation support via ExportComputationGraph()"
removed from GraphAttentionLayer, GraphIsomorphismLayer, GraphSAGELayer,
PrincipalNeighbourhoodAggregationLayer.

Tests: removed obsolete *_JitRemoved_SupportsJitIsFalse* assertions from
BaseClassesIntegrationTests and the LoRA/KNN/LWR/RotaryPE/QuantizedAttention
SupportsJitCompilation checks. Removed MixedPrecisionIntegrationTests'
TestLayer override of SupportsJitCompilation / ExportComputationGraph.

Compilation is transparent via AiDotNet.Tensors' AutoTracer (auto-enabled,
hot paths compile to CompiledInferencePlan after the 2nd call). Opt out
via TensorCodecOptions.Current.EnableCompilation or the still-supported
AiModelBuilder.ConfigureJitCompilation() builder API (which projects
config onto the live TensorCodecOptions under the hood).
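For example, the global opt-out described above, sketched (assumes EnableCompilation is a settable bool flag):

    // Global opt-out — assumes EnableCompilation is a settable flag:
    TensorCodecOptions.Current.EnableCompilation = false;

    // Alternatively, AiModelBuilder.ConfigureJitCompilation() projects its
    // config onto this same live TensorCodecOptions instance.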

Build: net10.0 + net471 + tests all green, 0 errors.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(deps): bump AiDotNet.Tensors 0.42.3 → 0.46.0

Newer Tensors release while this PR was in flight. Brings the work
after 0.42.x (0.43, 0.44, 0.45, 0.46). Backward-compatible API — no
source changes required.

Build: net10.0 + net471 both green, 0 errors.
Auto-compile regression tests (CompileForwardTests +
CompiledModelHostTests, 14 total): all pass on both TFMs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: address PR #1155 review comments on NBEATSBlock

Three review comments from CodeRabbit, all on src/TimeSeries/NBEATSBlock.cs:

1. (Major) ParameterCount / GetParameters / SetParameters ignored the
   trainable V_b / V_f bases in generic mode — only fc weights + biases
   were exported/imported. That made parameter round-trips drop learned
   basis state. Fixed: include the bases in all three APIs when
   _useInterpretableBasis == false, and also re-register them in
   ReRegisterParameters so the SetParameters tensor swap doesn't drop
   them from the trainable registry.

2. (Major) Constructor validated lookbackWindow / forecastHorizon /
   hiddenLayerSize / numHiddenLayers but accepted thetaSizeBackcast,
   thetaSizeForecast, and polynomialDegree without validation. Invalid
   values deferred failure to tensor allocation/math paths where
   diagnosis is much harder. Added explicit checks: both theta sizes
   must be positive; polynomialDegree must be non-negative when
   useInterpretableBasis is true.

3. (Critical) UpdateParameters(T learningRate) was an empty override
   that silently ignored update calls — the kind of silent no-op that
   becomes an accuracy regression you can only find by bisecting.
   Replaced with a fail-fast InvalidOperationException pointing
   callers at the tape-based training path (CompiledTapeTrainingStep),
   so misuse is caught at the training boundary instead of producing
   silently-undertrained models.
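Fixes 2 and 3 in sketch form (parameter and method names from this message; bodies condensed):

    // 2. Constructor guards — fail at construction, not in tensor math:
    if (thetaSizeBackcast <= 0)
        throw new ArgumentException("Must be positive.", nameof(thetaSizeBackcast));
    if (thetaSizeForecast <= 0)
        throw new ArgumentException("Must be positive.", nameof(thetaSizeForecast));
    if (useInterpretableBasis && polynomialDegree < 0)
        throw new ArgumentException("Must be non-negative.", nameof(polynomialDegree));

    // 3. Fail-fast replacement for the silent no-op override:
    public override void UpdateParameters(T learningRate)
    {
        throw new InvalidOperationException(
            "NBEATSBlock trains via the tape-based path (CompiledTapeTrainingStep); " +
            "direct UpdateParameters calls are not supported.");
    }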

Build: net10.0 + net471 + tests all green, 0 errors.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: two more PR #1155 review comments on NBEATSBlock

1. (Major) Interpretable theta sizes weren't validated against
   polynomialDegree + 1. ComputeBasisTensor + ApplyBasisExpansion both
   cap usable rows at polynomialDegree + 1, so oversized thetaSizeBackcast
   / thetaSizeForecast silently allocated dead trainable weights that
   couldn't influence the output. Added explicit checks for
   interpretable mode.

2. (Critical) ForwardInternal's generic branch in ApplyBasisExpansion
   returned theta directly instead of multiplying by the learned V_b /
   V_f basis tensors. With the Phase 1 fix that made those bases
   round-trip through GetParameters/SetParameters, PredictSingle would
   diverge from both Forward() (which uses _basisBackcast/_basisForecast
   via matmul) and from loaded-model state. Changed
   ApplyBasisExpansion to take a basis tensor argument and multiply
   by it in the generic branch, matching training and tape paths per
   Oreshkin et al. 2020 §3.2.
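In sketch form (the matmul helper name is assumed; only the shape of the fix matters here):

    // ApplyBasisExpansion, generic branch — sketch:
    private Tensor<T> ApplyBasisExpansion(Tensor<T> theta, Tensor<T> basis)
    {
        // Previously: return theta;  — the learned V_b / V_f were never applied.
        // Now: project theta through the learned basis, matching Forward()'s
        // matmul against _basisBackcast / _basisForecast.
        return Engine.TensorMatMul(theta, basis);
    }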

Build: net10.0 clean, 0 errors.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor: audit pass — remove dead SupportsJitCompilation properties + empty IJitCompilable regions

Sweep #1 of the end-to-end JIT audit. The IJitCompilable interface no
longer exists (removed with the JitCompiler). Every remaining reference
was either a stale region marker with no content, or a vestigial
SupportsJitCompilation property that nothing reads.

Removed:
- SupportsJitCompilation property on 11 model base classes (AutoML,
  CausalInference, Classification, MultiLabelClassification, Diffusion,
  NoisePredictor, VAE, DistributedTraining/ShardedModelBase,
  NeuralNetworkBase, OnlineLearning, Survival). Kept on
  IActivationFunction / IVectorActivationFunction implementations since
  LoRA / SqueezeAndExcitation / HyperbolicLinear / SparseLinear still
  consult those for activation-graph fallback (live code path).
- 21 empty #region IJitCompilable Override markers across the
  SyntheticData generators (AutoDiffTab, CausalGAN, CopulaGAN,
  CTABGANPlus, CTGAN, DPCTGAN, FinDiff, GOGGLE, MedSynth, MisGAN,
  OCTGAN, PATEGAN, REaLTabFormer, TabDDPM, TabFlow, TableGAN, TabLLMGen,
  TabSyn, TabTransformerGen, TimeGAN, TVAE).
- 4 empty #region IJitCompilable Implementation Override markers in
  Regression (AdaBoostR2, ExtremelyRandomizedTrees, GradientBoosting,
  RandomForest) and TransferLearning (TransferRandomForest).
- ExpressionTree.BuildComputationGraph (private method with only
  recursive self-calls; nothing external called it after JitCompiler
  removal).
- VectorModel.VectorToTensor (private method inside #region
  IJitCompilable Implementation; only defined, never referenced inside
  or outside the file).
- SuperNet.ExportOperationGraph + SuperNet.Forward (both dead, only
  existed to satisfy the removed IJitCompilable interface).

Build: net10.0 + net471 both green, 0 errors.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(jit-audit): remove dead BuildComputationGraph chain + IJitCompilable xmldoc

Audit A: continues the dead JIT-scaffolding sweep from 010bc09 / fb9ebb2.
After removal in prior commits of the JitCompiler / InferenceOptimization IR /
ExportComputationGraph surface, these eight layer files still carried a closed
chain of private BuildComputationGraph() methods that only called each other
(zero external callers), plus xmldoc remarks referencing IJitCompilable on
interfaces that were removed. All of it is dead.

- 8 neural-network layers: remove private BuildComputationGraph chain
  (DenseBlock, InvertedResidual, RRDB, RRDBNetGenerator, ResidualDense,
   SqueezeAndExcitation, Transition, UNetDiscriminator)
- 5 KnowledgeDistillation teacher-model xmldocs: strip IJitCompilable
  references; note Tensors' AutoTracer handles auto-compile transparently
- DeepReinforcementLearningAgentBase: same xmldoc fix + point to
  TensorCodecOptions.Current.EnableCompilation for opt-out
- InterfaceGuard / IFullModel: scrub IJitCompilable from the remarks list

Build: net10.0 + net471 clean (0 errors).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(jit-audit): remove orphan IJitCompilable stubs from tests & docs

After the production-side IJitCompilable removal in prior commits, test
mocks, testconsole examples, the benchmark harness, and the golden-standard
pattern doc still carried SupportsJitCompilation + ExportComputationGraph
stubs for a removed interface — purely orphan code that compiled only
because no interface required those members.

This sweep removes all of them (29 test files + 3 testconsole examples +
1 benchmark helper + GOLDEN_STANDARD_PATTERNS.md). The only remaining
SupportsJitCompilation references are on IActivationFunction /
IVectorActivationFunction, which are part of the live ComputationNode
graph-mode autodiff path (distinct from the removed JIT compiler).

Build: net10.0 + net471 clean across src/, tests/, testconsole/, and
AiDotNetBenchmarkTests/.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(jit-audit): remove stale JitCompiler/InferenceOptimization filters from CI

Deep Audit A found 5 dangling references to deleted test namespaces in CI and
test-sharding configs. These filters targeted tests that were deleted in the
JIT/InferenceOptimization sweep — now no-ops that confuse CI dashboards.

- .github/workflows/sonarcloud.yml:
  - Unit-06 shard: drop UnitTests.JitCompiler (deleted); rename "JIT/KD/..." -> "KD/..."
  - Exclusion filter: drop UnitTests.JitCompiler (no longer emits tests)
  - Drop "Other - InferenceOptimization" shard entirely (namespace deleted)
- scripts/run-tests-sharded.ps1:
  - Drop "JitCompiler" from unitNamespaceRoots
  - Rename Unit-07 shard "Interpretability/JIT/KD" -> "Interpretability/KD"
  - Drop shard 13 "InferenceOptimization" (namespace deleted); renumber PromptEngineering -> 13

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(tensors-parity): acceleration diagnostics + SymbolicShape dynamic-batch compile (gaps 2+5)

Part of the Tensors-0.46.0 parity work (closes 2 of 14 gaps; see Tensors
issues #197 + #198 for the two gaps that need upstream Tensors features).

## Gap 5 — acceleration diagnostics (PlatformDetector + NativeLibraryDetector)

Tensors already ships PlatformDetector (SIMD level, cache sizes, GPU
support flags) and NativeLibraryDetector (OpenBLAS / CLBlast / cuDNN /
MKL availability). AiDotNet was ignoring both — users had no visibility
into which acceleration path was actually engaged.

- src/Diagnostics/AccelerationDiagnostics.cs — new facade wrapping both
  detectors. GetReport() returns a human-readable summary; GetSnapshot()
  returns a structured AccelerationSnapshot for programmatic assertions.
- AiModelBuilder.ReportAccelerationStatus(Action<string>? logger) — opt-in
  builder method. Runs after ApplyGpuConfiguration so the snapshot reflects
  the engine state the built model will actually see.
- AiModelResult.AccelerationSnapshot — new property on every AiModelResult.
  7 construction sites updated via AttachDiagnostics() helper.
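Intended usage, sketched (generic arguments and the builder chain are illustrative):

    // Opt-in acceleration report during build:
    var result = await new AiModelBuilder<float, Tensor<float>, Tensor<float>>()
        .ConfigureModel(myNet)
        .ReportAccelerationStatus(Console.WriteLine)  // SIMD level, GPU flags, BLAS libs
        .BuildAsync();

    // Or assert programmatically against the structured snapshot:
    var snapshot = result.AccelerationSnapshot;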

## Gap 2 — SymbolicShape for dynamic batch/seq-len compile keys

CompiledModelHost keyed the compile cache on concrete shape via
GetOrCompileInference(shape, forward). Every batch-size change forced a
fresh trace+compile — wasteful for real inference traffic where request
batches vary. Tensors exposes SymbolicShape.BatchDynamic /
BatchAndSeqDynamic / AllDynamic + a 3-arg GetOrCompileInference overload
for exactly this case.

- src/NeuralNetworks/CompiledModelHost.cs: new SymbolicShapeMode enum
  (Static / BatchDynamic / BatchAndSeqDynamic / AllDynamic). Default =
  BatchDynamic (matches PyTorch torch.compile(dynamic=True) default).
- Predict() builds a SymbolicShape from mode + concrete shape and calls
  the 3-arg overload, falling back to the 2-arg concrete overload when
  rank is too small (e.g. 1-D scalar input with BatchDynamic requested).
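A sketch of the mode-to-key mapping (whether the SymbolicShape members are factories taking the concrete shape is an assumption; the static arm is hypothetical):

    // Build the compile key from the configured mode + concrete input shape:
    SymbolicShape key = mode switch
    {
        SymbolicShapeMode.BatchDynamic       => SymbolicShape.BatchDynamic(concreteShape),
        SymbolicShapeMode.BatchAndSeqDynamic => SymbolicShape.BatchAndSeqDynamic(concreteShape),
        SymbolicShapeMode.AllDynamic         => SymbolicShape.AllDynamic(concreteShape),
        _                                    => SymbolicShape.Exact(concreteShape), // hypothetical static arm
    };
    // key then feeds the 3-arg GetOrCompileInference overload, so one plan
    // serves all batch sizes instead of one plan per concrete shape.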

## Verify

dotnet build -f net10.0 — clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(tensors-parity): disk-backed plan caching via CompiledPlanLoader (gap 1)

PyTorch-parity equivalent: torch.jit.save + torch.jit.load. Before this
change AiDotNet recompiled every forward-pass plan on every process start —
wasteful for production serving where warm inference latency matters.

Tensors 0.46.0 already ships everything needed: ICompiledPlan.SaveAsync,
CompiledPlanLoader.LoadInferenceAsync, PlanCompatibilityInfo for hardware-
fingerprint gating. We just weren't wiring it.

## Public API

    await new PredictionModelBuilder<float, Tensor<float>, Tensor<float>>()
        .ConfigureModel(myNet)
        .ConfigurePlanCaching(@"C:\PlanCache")   // NEW
        .BuildAsync();

Plans are saved under modelTypeName_T_v{structureVersion}_s{shapeHash}.plan.
Per-(model, T, version, shape) — one directory can host multiple models
without collision. Loads that fail PlanCompatibilityInfo fall through to
a fresh compile silently.
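A sketch of the key construction this implies (helper name hypothetical; the SHA256-over-int32[] detail comes from the PlanCache notes below):

    // Hypothetical helper mirroring the documented file-name format:
    static string PlanFileName(string modelTypeName, string t, int structureVersion, int[] shape)
    {
        var bytes = new byte[shape.Length * sizeof(int)];
        Buffer.BlockCopy(shape, 0, bytes, 0, bytes.Length);
        var hash = Convert.ToHexString(
            System.Security.Cryptography.SHA256.HashData(bytes));
        return $"{modelTypeName}_{t}_v{structureVersion}_s{hash}.plan";
    }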

## Files

- src/NeuralNetworks/PlanCache.cs: new. Static Current, directory-based
  storage, atomic writes (tmp + rename). Shape hash = SHA256 of int32[].
- src/NeuralNetworks/CompiledModelHost.cs:
  - ctor now accepts optional modelIdentity — null = disk caching off
  - new fields: _diskCheckedShapes (one load attempt per shape-version),
    _preloadedPlans (in-memory cache of disk-loaded plans)
  - Predict(): before GetOrCompileInference, call TryUseDiskCachedPlan.
    If hit, skip compile entirely.
  - After fresh compile, fire-and-forget save via Task.Run so Predict
    doesn't block on IO.
- src/NeuralNetworks/NeuralNetworkBase.cs: _compileHost is now assigned
  in the ctor so GetType().FullName reflects the concrete subclass —
  different model types don't collide on disk keys.
- src/Diffusion/NoisePredictors/NoisePredictorBase.cs: same change.
- src/AiModelBuilder.cs + src/Interfaces/IAiModelBuilder.cs: new
  ConfigurePlanCaching(directory) fluent method.

## Verify

dotnet build -f net10.0 — clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(tensors-parity): add FusedLinear in FeedForward + Tensors op profiling (gaps 6/7/12)

## Gap 12 — Fused GEMM+Bias+Activation (already mostly wired; completing)

Audit confirmed the major linear layers (DenseLayer, FullyConnectedLayer,
and several others) already dispatch through Engine.FusedLinear /
FusedLinearGpu for CPU + GPU paths.

The notable miss was FeedForwardLayer.Forward, which was doing separate
TensorMatMul + Reshape + TensorBroadcastAdd + ApplyActivation calls (4 kernel
launches per forward). Refactored to use Engine.FusedLinear(input, weights,
biases, fusedActivation) with the standard pre-activation tape-safe fallback
for training.

- src/NeuralNetworks/Layers/FeedForwardLayer.cs: Forward() rewritten.
  Mirror of DenseLayer's fused-inference path.
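Before/after in sketch form (Engine.FusedLinear per this message; the training check and eager helper name are illustrative):

    // Before: four kernel launches per forward —
    //   TensorMatMul + Reshape + TensorBroadcastAdd + ApplyActivation.

    // After: one fused GEMM+bias+activation dispatch for inference, keeping
    // the pre-activation eager path when a tape needs intermediate values:
    var output = isTraining
        ? ForwardEager(input)  // illustrative name for the tape-safe fallback
        : Engine.FusedLinear(input, weights, biases, fusedActivation);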

## Gap 7 — Tensors per-op profiling (orthogonal to existing AiDotNet ProfilerSession)

AiDotNet already has its own ProfilerSession / ProfileReport / AiModelResult.
ProfilingReport surfacing HIGHER-level workflow timings (Welford stats,
hierarchical call trees, reservoir percentiles, memory tracking — richer
than Tensors' simpler per-op profiler). Tensors has no parity with that
feature set, so we keep it.

What was missing: visibility into LOWER-level tensor-op kernel timings.
Tensors ships PerformanceProfiler.Instance which wraps every engine op in
an IDisposable scope — useful for finding which kernel (MatMul, Softmax,
LayerNorm) is the actual bottleneck.

- src/Diagnostics/ProfilingReport.cs: new. TensorsOperationProfile wraps
  PerformanceProfiler output. FormatSummary formats top-N ops.
- src/AiModelBuilder.cs + src/Interfaces/IAiModelBuilder.cs: new
  EnableTensorsOpProfiling() fluent method.
- src/Models/Results/AiModelResult.Diagnostics.cs: new
  TensorsOperationProfile property. Sits alongside existing ProfilingReport
  (not replacing it).
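Sketched usage (names from this message; FormatSummary's argument list is assumed):

    // Enable Tensors' per-op kernel profiling alongside the existing ProfilerSession:
    var result = await new AiModelBuilder<float, Tensor<float>, Tensor<float>>()
        .ConfigureModel(myNet)
        .EnableTensorsOpProfiling()   // wraps engine ops via PerformanceProfiler.Instance
        .BuildAsync();

    // Top-N kernel timings (MatMul, Softmax, LayerNorm, ...):
    Console.WriteLine(result.TensorsOperationProfile?.FormatSummary());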

## Verify

dotnet build -f net10.0 — clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(tensors-parity): diagnostics surface + subclass PredictEager routing (gaps 9+14)

## Gap 9 — autotune cache diagnostics

Tensors ships AutotuneCache (Helpers.Autotune namespace) with per-kernel
autotune storage. Users couldn't see whether it was active. Filed Tensors
issue #200 for the WarmupCommonKernelsAsync convenience; added diagnostics
surface here so users at least see the cache path + hardware fingerprint.

- src/Diagnostics/AccelerationDiagnostics.cs: GetReport now emits
  AutotuneCache.DefaultCachePath + CurrentHardwareFingerprint.
  AccelerationSnapshot carries both as AutotuneCachePath / AutotuneHardwareFingerprint.

## Gap 14 — subclass Predict() routing through PredictCompiled

11 NeuralNetwork subclasses overrode Predict as literally `return Forward(input);`,
bypassing the base's PredictCompiled auto-compile path. Refactored each to
override PredictEager (the base's compile-lambda eager fallback) instead,
keeping Forward as the implementation but routing through CompiledModelHost.

After: every Predict on these 11 models goes through _compileHost.Predict,
which traces → compiles → replays (and triggers disk caching via PlanCache
when configured, from Gap 1).

Files touched:
- src/NeuralNetworks/ConvolutionalNeuralNetwork.cs
- src/NeuralNetworks/EfficientNetNetwork.cs
- src/NeuralNetworks/FastText.cs
- src/NeuralNetworks/GloVe.cs
- src/NeuralNetworks/MobileNetV2Network.cs
- src/NeuralNetworks/ResNetNetwork.cs
- src/NeuralNetworks/SiameseNeuralNetwork.cs
- src/NeuralNetworks/UNet3D.cs
- src/NeuralNetworks/VGGNetwork.cs
- src/NeuralNetworks/VoxelCNN.cs
- src/NeuralNetworks/Word2Vec.cs

Forward methods unchanged — they still have their GPU-resident fast path
(TryForwardGpuOptimized etc.) and shape-validation logic. The base's
PredictCompiled treats Forward as the eager fallback but AutoTracer fires
on first call regardless.

## Verify

- dotnet build -f net10.0 — clean
- dotnet build -f net471 — clean
- dotnet test CompiledTapeTrainingStep + FusedOptimizer — 9/9 passing

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: address 5 CodeRabbit review comments on PR #1155

- NBEATSBlock ctor: extract CreateInputShape / CreateOutputShape static
  factories that validate lookbackWindow / forecastHorizon BEFORE the
  base(...) call. Invalid values now surface as ArgumentException with
  the right nameof(...) tag instead of a downstream LayerBase<T> shape
  error.
- InterfaceGuard: class visibility reduced from public to internal to
  match the AiDotNet facade pattern. InternalsVisibleTo on src/AiDotNet.csproj
  already grants access to AiDotNetTests / AiDotNetTestConsole / AiDotNet.Serving /
  AiDotNetBenchmarkTests, so the 58 existing test call sites still compile.
  Doc remark added explaining the visibility choice.
- PretrainedTeacherModel + TransformerTeacherModel: reworded "auto-compiles
  via Tensors' AutoTracer" remarks. The wrapper only invokes the delegate;
  whether auto-compile actually happens depends entirely on what's inside
  the delegate. Removed the unconditional guarantee and added a note that
  external paths (ONNX, REST, etc.) won't pick up engine optimizations.
- SelfTeacherModel.GetLogits: rewrote XML-doc so summary/returns/exception
  match the throw-only behavior (method has no underlying model to run and
  always throws InvalidOperationException). Previous summary said "Gets
  logits from the underlying model" which was misleading.

Verify: dotnet build net10.0 + net471 — 0 errors.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: franklinic <franklin@ivorycloud.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>