Add tensor-based `traditional-local` model and benchmarkable local model comparison by Copilot · Pull Request #12 · sharpninja/BitNet-b1.58-Sharp

Copilot · 2026-03-18T00:15:05Z

This change upgrades the agent benchmarking/model-comparison path so traditional-local is a real local model rather than a placeholder. The agent, benchmarks, and SpecFlow scenarios can now compare hosted local models through the same execution surface.

Hosted model comparison
- Generalizes the app host around a shared IHostedAgentModel abstraction.
- Keeps bitnet-b1.58-sharp and traditional-local selectable via --model / --compare-model.
- Preserves support for other local models through local command JSON configs.

Tensor-based traditional-local
- Replaces the previous count/dictionary implementation with BitNetSharp.Core.TraditionalLocalModel.
- Uses System.Numerics.Tensors for ordered-context training/inference:
  - token embeddings
  - context flattening
  - dot-product logits
  - softmax next-token prediction
- Moves traditional-local into core so it is a real local model, not just a host-side shim.
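
The four steps listed above (token embeddings, context flattening, dot-product logits, softmax) can be sketched as a single forward pass. This is a minimal illustration only: the class name, method names, array shapes, and the plain-loop math are all assumptions made for the example, not the `TraditionalLocalModel` code from this PR. It deliberately avoids the `System.Numerics.Tensors` package so it has no dependencies.

```csharp
using System;

// Hypothetical sketch: names, shapes, and plain-loop math are assumptions,
// not the TraditionalLocalModel implementation added by this PR.
public static class OrderedContextSketch
{
    // Concatenates the embeddings of the context tokens, in order, into one vector.
    public static float[] FlattenContext(float[][] embeddings, int[] contextTokenIds)
    {
        var dimension = embeddings[0].Length;
        var flat = new float[contextTokenIds.Length * dimension];
        for (var i = 0; i < contextTokenIds.Length; i++)
        {
            Array.Copy(embeddings[contextTokenIds[i]], 0, flat, i * dimension, dimension);
        }

        return flat;
    }

    // One logit per vocabulary entry: the dot product of that entry's
    // output row with the flattened context vector.
    public static float[] Logits(float[][] outputWeights, float[] flatContext)
    {
        var logits = new float[outputWeights.Length];
        for (var tokenId = 0; tokenId < outputWeights.Length; tokenId++)
        {
            var sum = 0f;
            for (var j = 0; j < flatContext.Length; j++)
            {
                sum += outputWeights[tokenId][j] * flatContext[j];
            }

            logits[tokenId] = sum;
        }

        return logits;
    }

    // Numerically stable softmax: subtract the maximum logit before exponentiating.
    public static float[] Softmax(float[] logits)
    {
        var max = float.NegativeInfinity;
        foreach (var logit in logits)
        {
            max = Math.Max(max, logit);
        }

        var probabilities = new float[logits.Length];
        var total = 0f;
        for (var i = 0; i < logits.Length; i++)
        {
            probabilities[i] = MathF.Exp(logits[i] - max);
            total += probabilities[i];
        }

        for (var i = 0; i < probabilities.Length; i++)
        {
            probabilities[i] /= total;
        }

        return probabilities;
    }

    // Full forward pass: embed and flatten the context, score every token, normalize.
    public static float[] PredictNext(float[][] embeddings, float[][] outputWeights, int[] contextTokenIds)
        => Softmax(Logits(outputWeights, FlattenContext(embeddings, contextTokenIds)));
}
```

In a project that references `System.Numerics.Tensors`, the inner loops could likely be replaced by `TensorPrimitives.Dot` and `TensorPrimitives.SoftMax`; whether the PR does so is not visible in this page.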

Training/query parity
- Routes traditional-local training through the same default dataset used for benchmarking/spec coverage.
- Raises the default training run for traditional-local to 24 epochs so hosted queries produce meaningful learned output.
- Leaves the paper-aligned BitNet runtime as the supported primary path.

Benchmark/SpecFlow alignment
- BenchmarkDotNet continues to measure the same hosted operations the SpecFlow scenarios exercise:
  - host build
  - query
  - streaming
  - training
- SpecFlow model selection remains parameterized, so scenario behavior is directly comparable by swapping models.

Docs/tests
- Updates docs to describe traditional-local as a tensor-based local comparison model.
- Adds focused tests for:
  - hosted traditional-local responses
  - direct training/query behavior of the tensor model
  - benchmark option/model-selection wiring

Example:

dotnet run --configuration Release \
  --project /home/runner/work/BitNet-b1.58-Sharp/BitNet-b1.58-Sharp/src/BitNetSharp.App/BitNetSharp.App.csproj \
  -- benchmark \
  --model=bitnet-b1.58-sharp \
  --compare-model=traditional-local \
  --prompt="how are you hosted"

This now benchmarks a real tensor-based local model against the paper-aligned hosted model through the same agent wrapper.

Co-authored-by: sharpninja <16146732+sharpninja@users.noreply.github.com>

Copilot

Pull request overview

This PR turns traditional-local into a real, tensor-based local model in BitNetSharp.Core and upgrades the app/test surface so multiple local models can be hosted and compared consistently (CLI --model / --compare-model, BenchmarkDotNet benchmarks, and SpecFlow scenarios).

Changes:

- Adds TraditionalLocalModel (tensor-based ordered-context LM) and wires it into the app via IHostedAgentModel.
- Introduces a BenchmarkDotNet model-comparison flow and option parsing for --model / --compare-model.
- Adds SpecFlow feature coverage for hosted response, streaming, and host build, and updates the docs to describe benchmarking and model comparison.

Reviewed changes

Copilot reviewed 23 out of 24 changed files in this pull request and generated 7 comments.

File	Description
tests/BitNetSharp.Tests/Steps/PaperAlignedRuntimeSteps.cs	New SpecFlow step bindings for hosted response, streaming, host build, and training.
tests/BitNetSharp.Tests/Features/PaperAlignedRuntime.feature	New SpecFlow scenarios parameterized over `bitnet-b1.58-sharp` and `traditional-local`.
tests/BitNetSharp.Tests/BitNetSharp.Tests.csproj	Adds SpecFlow packages needed to run new feature tests.
tests/BitNetSharp.Tests/BitNetModelTests.cs	Adds tests for `traditional-local`, benchmark option parsing, and host summary fields.
src/BitNetSharp.Core/TraditionalLocalModel.cs	Implements the tensor-based `traditional-local` model (train + generate).
src/BitNetSharp.Core/BitNetSharp.Core.csproj	Adds `System.Numerics.Tensors` dependency for the new model.
src/BitNetSharp.App/TraditionalLocalHostedAgentModel.cs	Hosts `TraditionalLocalModel` behind `IHostedAgentModel` and training interface.
src/BitNetSharp.App/Program.cs	Updates CLI to select models, run benchmarks, support train/visualize/host across models.
src/BitNetSharp.App/LocalCommandModelConfig.cs	Adds JSON config loader for invoking other local models via a command.
src/BitNetSharp.App/LocalCommandHostedAgentModel.cs	Implements a hosted model that executes an external local command runner.
src/BitNetSharp.App/IHostedAgentModel.cs	Introduces shared hosted-model abstraction + train/inspect capability interfaces.
src/BitNetSharp.App/HostedAgentModelFactory.cs	Centralizes creation of built-in models and JSON-configured local command models.
src/BitNetSharp.App/HostedAgentBenchmarks.cs	Adds BenchmarkDotNet benchmarks for host build/query/stream/train.
src/BitNetSharp.App/HostedAgentBenchmarkOptions.cs	Adds benchmark option parsing and environment transport for BDN.
src/BitNetSharp.App/BitNetSharp.App.csproj	Adds BenchmarkDotNet dependency.
src/BitNetSharp.App/BitNetHostedAgentModel.cs	Wraps `BitNetPaperModel` behind `IHostedAgentModel` (+ weight inspection).
src/BitNetSharp.App/BitNetChatClient.cs	Replaces paper-model-only chat client with `IHostedAgentModel`-backed client.
src/BitNetSharp.App/BitNetAgentHost.cs	Generalizes agent host build to accept `IHostedAgentModel` and enriches summary.
docs/usage.md	Updates usage docs for `--model`, benchmark command, and `traditional-local` training.
docs/benchmarking.md	Adds new docs page describing benchmarking/model comparison and JSON configs.
docs/architecture.md	Updates architecture docs to reflect multi-model hosting and benchmarking.
docs/SUMMARY.md	Adds benchmarking page to GitBook navigation.
docs/README.md	Links benchmarking page in the documentation map.
.gitignore	Ignores generated SpecFlow `*.feature.cs` outputs.

+        _host?.Dispose();
+        _model?.Dispose();


+        }
+
+        return new HostedAgentBenchmarkOptions(
+            models.Distinct(StringComparer.Ordinal).ToArray(),


+            DefaultModelId => new BitNetHostedAgentModel(BitNetBootstrap.CreatePaperModel(verbosity)),
+            TraditionalLocalModelId => new TraditionalLocalHostedAgentModel(verbosity),
+            _ => throw new ArgumentException(
+                $"Unknown model specifier '{value}'. Use '{DefaultModelId}', '{TraditionalLocalModelId}', or an absolute path to a local command model JSON file.",


+            return;
+        }
+
+        Train(BitNetTrainingCorpus.CreateDefaultExamples(), epochs: DefaultTrainingEpochs);


+    private int SelectNextToken(float[] probabilities, bool allowEndToken)
+    {
+        var bestTokenId = _endTokenId;
+        var bestProbability = float.NegativeInfinity;
+
+        for (var tokenId = 0; tokenId < probabilities.Length; tokenId++)
+        {
+            if (tokenId == _beginTokenId || tokenId == _unknownTokenId || (!allowEndToken && tokenId == _endTokenId))
+            {
+                continue;
+            }
+
+            var probability = probabilities[tokenId];
+            if (probability > bestProbability)
+            {
+                bestTokenId = tokenId;
+                bestProbability = probability;
+            }
+        }
+
+        return bestTokenId;
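
The hunk above is a greedy argmax over the probability vector that skips the begin and unknown tokens, and optionally the end token. A standalone sketch of the same selection logic follows, with the token ids passed as parameters instead of read from fields; the class name and parameter list are hypothetical and exist only so the example compiles on its own.

```csharp
using System;

// Hypothetical standalone version of the selection logic in the hunk above.
// The token ids are parameters here (an assumption); the PR reads them from fields.
public static class GreedySelectionSketch
{
    public static int SelectNextToken(
        float[] probabilities,
        int beginTokenId,
        int endTokenId,
        int unknownTokenId,
        bool allowEndToken)
    {
        // Fall back to the end token if every candidate is masked out.
        var bestTokenId = endTokenId;
        var bestProbability = float.NegativeInfinity;

        for (var tokenId = 0; tokenId < probabilities.Length; tokenId++)
        {
            // Never emit the begin or unknown token; emit the end token only when allowed.
            if (tokenId == beginTokenId || tokenId == unknownTokenId || (!allowEndToken && tokenId == endTokenId))
            {
                continue;
            }

            var probability = probabilities[tokenId];
            if (probability > bestProbability)
            {
                bestTokenId = tokenId;
                bestProbability = probability;
            }
        }

        return bestTokenId;
    }
}
```

With probabilities `{ 0.05f, 0.6f, 0.05f, 0.2f, 0.1f }` and begin, end, and unknown ids of 0, 1, and 2, the call returns 1 when `allowEndToken` is true and 3 when it is false, because the end token is then masked and token 3 has the highest remaining probability.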


-        var chatClient = new BitNetChatClient(model);
+        var chatClient = new HostedModelChatClient(model);

        builder.Services.AddSingleton(model);


+    return;
+}
+
+using var model = HostedAgentModelFactory.Create(modelSpecifier, verbosity);


Initial plan

1f4bba1

Copilot AI assigned Copilot and sharpninja Mar 18, 2026

Copilot started work on behalf of sharpninja March 18, 2026 00:15

Copilot AI and others added 2 commits March 18, 2026 00:21

test: add paper-aligned specflow scenarios

bce41c2

Co-authored-by: sharpninja <16146732+sharpninja@users.noreply.github.com>

test: polish paper-aligned spec wording

448dc95

Co-authored-by: sharpninja <16146732+sharpninja@users.noreply.github.com>

Copilot AI changed the title ~~[WIP] Add Specflow tests based on research paper use cases~~ Add SpecFlow coverage for paper-aligned runtime use cases Mar 18, 2026

Copilot AI requested a review from sharpninja March 18, 2026 00:25

Copilot finished work on behalf of sharpninja March 18, 2026 00:25

Copilot started work on behalf of sharpninja March 18, 2026 00:25

Copilot AI and others added 2 commits March 18, 2026 00:38

feat: add local model benchmarking support

68d2389

Co-authored-by: sharpninja <16146732+sharpninja@users.noreply.github.com>

fix: polish benchmark model comparison flow

f172769

Co-authored-by: sharpninja <16146732+sharpninja@users.noreply.github.com>

Copilot AI changed the title ~~Add SpecFlow coverage for paper-aligned runtime use cases~~ Add local hosted-model benchmarking and model-switchable SpecFlow scenarios Mar 18, 2026

Copilot finished work on behalf of sharpninja March 18, 2026 00:48

Copilot started work on behalf of sharpninja March 18, 2026 01:08

feat: make traditional-local tensor-based and functional

f46d4bb

Co-authored-by: sharpninja <16146732+sharpninja@users.noreply.github.com>

Copilot AI changed the title ~~Add local hosted-model benchmarking and model-switchable SpecFlow scenarios~~ Add tensor-based traditional-local model and benchmarkable local model comparison Mar 18, 2026

Copilot finished work on behalf of sharpninja March 18, 2026 01:22

sharpninja approved these changes Mar 18, 2026

sharpninja marked this pull request as ready for review March 18, 2026 01:30

Copilot AI review requested due to automatic review settings March 18, 2026 01:30

sharpninja merged commit 673b67f into main Mar 18, 2026
3 checks passed

sharpninja deleted the copilot/create-specflow-tests branch March 18, 2026 01:31

Copilot started reviewing on behalf of sharpninja March 18, 2026 01:31

Copilot AI reviewed Mar 18, 2026


sharpninja merged 6 commits into main from copilot/create-specflow-tests


3 participants
