Add tensor-based traditional-local model and benchmarkable local model comparison #12
Merged
sharpninja merged 6 commits into main on Mar 18, 2026
Co-authored-by: sharpninja <16146732+sharpninja@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Add Specflow tests based on research paper use cases" to "Add SpecFlow coverage for paper-aligned runtime use cases" on Mar 18, 2026
Copilot AI changed the title from "Add SpecFlow coverage for paper-aligned runtime use cases" to "Add local hosted-model benchmarking and model-switchable SpecFlow scenarios" on Mar 18, 2026
Copilot AI changed the title from "Add local hosted-model benchmarking and model-switchable SpecFlow scenarios" to "Add tensor-based traditional-local model and benchmarkable local model comparison" on Mar 18, 2026
sharpninja approved these changes on Mar 18, 2026
Pull request overview
This PR turns `traditional-local` into a real, tensor-based local model in `BitNetSharp.Core` and upgrades the app/test surface so multiple local models can be hosted and compared consistently (CLI `--model` / `--compare-model`, BenchmarkDotNet benchmarks, and SpecFlow scenarios).
Changes:
- Adds `TraditionalLocalModel` (tensor-based ordered-context LM) and wires it into the app via `IHostedAgentModel`.
- Introduces BenchmarkDotNet model-comparison flow and option parsing for `--model` / `--compare-model`.
- Adds SpecFlow feature coverage for hosted response/stream/host build and updates docs to describe benchmarking/model comparison.
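The `--model` / `--compare-model` option parsing listed above might look roughly like the following. This is a minimal sketch under stated assumptions, not the PR's `HostedAgentBenchmarkOptions` code: `ModelOptionSketch`, `ParseModels`, and the `defaultModel` parameter are hypothetical names, and the `--name=value` form is assumed from the example command in the PR description.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, for illustration only.
public static class ModelOptionSketch
{
    private const string ModelPrefix = "--model=";
    private const string CompareModelPrefix = "--compare-model=";

    // Collects model specifiers from the arguments, primary model first,
    // and drops exact (case-sensitive) duplicates.
    public static string[] ParseModels(IEnumerable<string> args, string defaultModel)
    {
        var primary = defaultModel;
        var comparisons = new List<string>();

        foreach (var arg in args)
        {
            if (arg.StartsWith(ModelPrefix, StringComparison.Ordinal))
            {
                primary = arg.Substring(ModelPrefix.Length);
            }
            else if (arg.StartsWith(CompareModelPrefix, StringComparison.Ordinal))
            {
                comparisons.Add(arg.Substring(CompareModelPrefix.Length));
            }
        }

        return new[] { primary }
            .Concat(comparisons)
            .Distinct(StringComparer.Ordinal)
            .ToArray();
    }
}
```

Deduplicating with an ordinal comparer means a model passed as both `--model` and `--compare-model` would be benchmarked once rather than twice.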
Reviewed changes
Copilot reviewed 23 out of 24 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| tests/BitNetSharp.Tests/Steps/PaperAlignedRuntimeSteps.cs | New SpecFlow step bindings for hosted response, streaming, host build, and training. |
| tests/BitNetSharp.Tests/Features/PaperAlignedRuntime.feature | New SpecFlow scenarios parameterized over bitnet-b1.58-sharp and traditional-local. |
| tests/BitNetSharp.Tests/BitNetSharp.Tests.csproj | Adds SpecFlow packages needed to run new feature tests. |
| tests/BitNetSharp.Tests/BitNetModelTests.cs | Adds tests for traditional-local, benchmark option parsing, and host summary fields. |
| src/BitNetSharp.Core/TraditionalLocalModel.cs | Implements the tensor-based traditional-local model (train + generate). |
| src/BitNetSharp.Core/BitNetSharp.Core.csproj | Adds System.Numerics.Tensors dependency for the new model. |
| src/BitNetSharp.App/TraditionalLocalHostedAgentModel.cs | Hosts TraditionalLocalModel behind IHostedAgentModel and training interface. |
| src/BitNetSharp.App/Program.cs | Updates CLI to select models, run benchmarks, support train/visualize/host across models. |
| src/BitNetSharp.App/LocalCommandModelConfig.cs | Adds JSON config loader for invoking other local models via a command. |
| src/BitNetSharp.App/LocalCommandHostedAgentModel.cs | Implements a hosted model that executes an external local command runner. |
| src/BitNetSharp.App/IHostedAgentModel.cs | Introduces shared hosted-model abstraction + train/inspect capability interfaces. |
| src/BitNetSharp.App/HostedAgentModelFactory.cs | Centralizes creation of built-in models and JSON-configured local command models. |
| src/BitNetSharp.App/HostedAgentBenchmarks.cs | Adds BenchmarkDotNet benchmarks for host build/query/stream/train. |
| src/BitNetSharp.App/HostedAgentBenchmarkOptions.cs | Adds benchmark option parsing and environment transport for BDN. |
| src/BitNetSharp.App/BitNetSharp.App.csproj | Adds BenchmarkDotNet dependency. |
| src/BitNetSharp.App/BitNetHostedAgentModel.cs | Wraps BitNetPaperModel behind IHostedAgentModel (+ weight inspection). |
| src/BitNetSharp.App/BitNetChatClient.cs | Replaces paper-model-only chat client with IHostedAgentModel-backed client. |
| src/BitNetSharp.App/BitNetAgentHost.cs | Generalizes agent host build to accept IHostedAgentModel and enriches summary. |
| docs/usage.md | Updates usage docs for --model, benchmark command, and traditional-local training. |
| docs/benchmarking.md | Adds new docs page describing benchmarking/model comparison and JSON configs. |
| docs/architecture.md | Updates architecture docs to reflect multi-model hosting and benchmarking. |
| docs/SUMMARY.md | Adds benchmarking page to GitBook navigation. |
| docs/README.md | Links benchmarking page in the documentation map. |
| .gitignore | Ignores generated SpecFlow *.feature.cs outputs. |
Comment on lines +117 to +118

    _host?.Dispose();
    _model?.Dispose();

    }

    return new HostedAgentBenchmarkOptions(
        models.Distinct(StringComparer.Ordinal).ToArray(),

    DefaultModelId => new BitNetHostedAgentModel(BitNetBootstrap.CreatePaperModel(verbosity)),
    TraditionalLocalModelId => new TraditionalLocalHostedAgentModel(verbosity),
    _ => throw new ArgumentException(
        $"Unknown model specifier '{value}'. Use '{DefaultModelId}', '{TraditionalLocalModelId}', or an absolute path to a local command model JSON file.",

        return;
    }

    Train(BitNetTrainingCorpus.CreateDefaultExamples(), epochs: DefaultTrainingEpochs);
Comment on lines +329 to +349

    private int SelectNextToken(float[] probabilities, bool allowEndToken)
    {
        var bestTokenId = _endTokenId;
        var bestProbability = float.NegativeInfinity;

        for (var tokenId = 0; tokenId < probabilities.Length; tokenId++)
        {
            if (tokenId == _beginTokenId || tokenId == _unknownTokenId || (!allowEndToken && tokenId == _endTokenId))
            {
                continue;
            }

            var probability = probabilities[tokenId];
            if (probability > bestProbability)
            {
                bestTokenId = tokenId;
                bestProbability = probability;
            }
        }

        return bestTokenId;
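The selection loop quoted above is a greedy argmax with a few token ids masked out. A standalone sketch of the same idea follows, with the special-token fields replaced by an explicit `excluded` array. The class name `GreedySelection` and the `excluded` and `fallbackTokenId` parameters are illustrative assumptions, not part of the PR.

```csharp
using System;

// Illustrative only: mirrors the shape of the quoted loop without the model's fields.
public static class GreedySelection
{
    // Returns the index of the highest-probability token, skipping any id in `excluded`.
    // Returns `fallbackTokenId` when every candidate is excluded.
    public static int SelectNextToken(float[] probabilities, int[] excluded, int fallbackTokenId)
    {
        var bestTokenId = fallbackTokenId;
        var bestProbability = float.NegativeInfinity;

        for (var tokenId = 0; tokenId < probabilities.Length; tokenId++)
        {
            if (Array.IndexOf(excluded, tokenId) >= 0)
            {
                continue;
            }

            var probability = probabilities[tokenId];
            if (probability > bestProbability)
            {
                bestTokenId = tokenId;
                bestProbability = probability;
            }
        }

        return bestTokenId;
    }
}
```

Because the comparison is strict (`>`), ties resolve to the lowest token id, and a NaN probability is never selected.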
    - var chatClient = new BitNetChatClient(model);
    + var chatClient = new HostedModelChatClient(model);

      builder.Services.AddSingleton(model);

        return;
    }

    using var model = HostedAgentModelFactory.Create(modelSpecifier, verbosity);
This change upgrades the agent benchmarking/model-comparison path so `traditional-local` is a real local model rather than a placeholder. The agent, benchmarks, and SpecFlow scenarios can now compare hosted local models through the same execution surface.

Hosted model comparison
- … `IHostedAgentModel` abstraction.
- … `bitnet-b1.58-sharp` and `traditional-local` selectable via `--model` / `--compare-model`.

Tensor-based `traditional-local`
- … `BitNetSharp.Core.TraditionalLocalModel`.
- … `System.Numerics.Tensors` for ordered-context training/inference:
- … `traditional-local` into core so it is a real local model, not just a host-side shim.

Training/query parity
- … `traditional-local` training through the same default dataset used for benchmarking/spec coverage.
- … `traditional-local` to a stronger 24-epoch baseline so hosted queries produce meaningful learned output.

Benchmark/SpecFlow alignment

Docs/tests
- … `traditional-local` as a tensor-based local comparison model.
- … `traditional-local` responses

Example:

    dotnet run --configuration Release \
      --project /home/runner/work/BitNet-b1.58-Sharp/BitNet-b1.58-Sharp/src/BitNetSharp.App/BitNetSharp.App.csproj \
      -- benchmark \
      --model=bitnet-b1.58-sharp \
      --compare-model=traditional-local \
      --prompt="how are you hosted"

This now benchmarks a real tensor-based local model against the paper-aligned hosted model through the same agent wrapper.
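To illustrate what "the same execution surface" could mean in practice, here is a rough sketch of running one prompt through several models behind a shared interface. Everything in it is an assumption made for illustration: `IExampleHostedModel`, `EchoModel`, and `ComparisonSketch` are hypothetical stand-ins, the real `IHostedAgentModel` members are not shown on this page, and `Stopwatch` is swapped in for the BenchmarkDotNet measurement the PR actually uses.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical stand-in for a shared hosted-model abstraction.
public interface IExampleHostedModel
{
    string ModelId { get; }
    string Respond(string prompt);
}

// Toy model that only echoes the prompt; it stands in for a real local model.
public sealed class EchoModel : IExampleHostedModel
{
    public EchoModel(string modelId) => ModelId = modelId;

    public string ModelId { get; }

    public string Respond(string prompt) => $"[{ModelId}] {prompt}";
}

public static class ComparisonSketch
{
    // Runs the same prompt through every model and records the response
    // and elapsed wall-clock time, keyed by model id.
    public static Dictionary<string, (string Response, TimeSpan Elapsed)> Compare(
        IEnumerable<IExampleHostedModel> models, string prompt)
    {
        var results = new Dictionary<string, (string Response, TimeSpan Elapsed)>(StringComparer.Ordinal);

        foreach (var model in models)
        {
            var stopwatch = Stopwatch.StartNew();
            var response = model.Respond(prompt);
            stopwatch.Stop();
            results[model.ModelId] = (response, stopwatch.Elapsed);
        }

        return results;
    }
}
```

A single `Stopwatch` pass is only a rough signal; a benchmarking harness adds warmup and repeated iterations, which is why the comparison in this PR goes through BenchmarkDotNet.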