Add tensor-based traditional-local model and benchmarkable local model comparison#12

Merged
sharpninja merged 6 commits into main from copilot/create-specflow-tests on Mar 18, 2026

Copilot AI commented Mar 18, 2026

This change upgrades the agent benchmarking/model-comparison path so traditional-local is a real local model rather than a placeholder. The agent, benchmarks, and SpecFlow scenarios can now compare hosted local models through the same execution surface.

  • Hosted model comparison

    • Generalizes the app host around a shared IHostedAgentModel abstraction.
    • Keeps bitnet-b1.58-sharp and traditional-local selectable via --model / --compare-model.
    • Preserves support for other local models through local command JSON configs.
  • Tensor-based traditional-local

    • Replaces the previous count/dictionary implementation with BitNetSharp.Core.TraditionalLocalModel.
    • Uses System.Numerics.Tensors for ordered-context training/inference:
      • token embeddings
      • context flattening
      • dot-product logits
      • softmax next-token prediction
    • Moves traditional-local into core so it is a real local model, not just a host-side shim.
  • Training/query parity

    • Routes traditional-local training through the same default dataset used for benchmarking/spec coverage.
    • Raises the default training pass for traditional-local to a stronger 24-epoch baseline so hosted queries produce meaningful learned output.
    • Leaves the paper-aligned BitNet runtime as the supported primary path.
  • Benchmark/SpecFlow alignment

    • BenchmarkDotNet continues to measure the same hosted operations the SpecFlow scenarios exercise:
      • host build
      • query
      • streaming
      • training
    • SpecFlow model selection remains parameterized, so scenario behavior is directly comparable by swapping models.
  • Docs/tests

    • Updates docs to describe traditional-local as a tensor-based local comparison model.
    • Adds focused tests for:
      • hosted traditional-local responses
      • direct training/query behavior of the tensor model
      • benchmark option/model-selection wiring
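
The four tensor steps listed under "Tensor-based traditional-local" (token embeddings, context flattening, dot-product logits, softmax next-token prediction) can be sketched roughly as below. This is a minimal illustration, not the code in BitNetSharp.Core.TraditionalLocalModel: the class name, the toy weights, and the fixed two-token context are all assumptions made up for the example, and training is left out entirely. It relies only on `TensorPrimitives.Dot`, `TensorPrimitives.SoftMax`, and `TensorPrimitives.IndexOfMax` from the System.Numerics.Tensors package.

```csharp
using System;
using System.Numerics.Tensors; // NuGet package: System.Numerics.Tensors

// Hypothetical sketch; names and weights are illustrative, not taken from the PR.
public static class TensorPipelineSketch
{
    // Token embeddings: 4 tokens, 2 dimensions each (toy values).
    private static readonly float[][] Embeddings =
    {
        new[] { 1f, 0f },
        new[] { 0f, 1f },
        new[] { 1f, 1f },
        new[] { -1f, 0f },
    };

    // One row per candidate next token; each row spans the flattened
    // context (2 context tokens x 2 dimensions = 4 values).
    private static readonly float[][] OutputWeights =
    {
        new[] { 0f, 0f, 0f, 0f },
        new[] { 1f, 0f, 0f, 1f },
        new[] { 0f, 1f, 1f, 0f },
        new[] { 0.5f, 0.5f, 0.5f, 0.5f },
    };

    // The context must hold exactly two token ids so the flattened
    // vector matches the length of each output weight row.
    public static float[] NextTokenProbabilities(int[] context)
    {
        var embedDim = Embeddings[0].Length;

        // Context flattening: concatenate embeddings in order, so position matters.
        var flat = new float[context.Length * embedDim];
        for (var i = 0; i < context.Length; i++)
        {
            Embeddings[context[i]].AsSpan().CopyTo(flat.AsSpan(i * embedDim, embedDim));
        }

        // Dot-product logits: one score per vocabulary entry.
        var logits = new float[OutputWeights.Length];
        for (var token = 0; token < logits.Length; token++)
        {
            logits[token] = TensorPrimitives.Dot(OutputWeights[token], flat);
        }

        // Softmax turns the logits into a probability distribution.
        var probabilities = new float[logits.Length];
        TensorPrimitives.SoftMax(logits, probabilities);
        return probabilities;
    }

    // Greedy next-token prediction: index of the highest probability.
    public static int PredictNext(int[] context) =>
        TensorPrimitives.IndexOfMax(NextTokenProbabilities(context));
}
```

With these toy weights, `TensorPipelineSketch.PredictNext(new[] { 0, 1 })` returns 1 and `TensorPipelineSketch.PredictNext(new[] { 1, 0 })` returns 2, which shows why an ordered (flattened) context behaves differently from a bag-of-tokens count.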

Example:

dotnet run --configuration Release \
  --project /home/runner/work/BitNet-b1.58-Sharp/BitNet-b1.58-Sharp/src/BitNetSharp.App/BitNetSharp.App.csproj \
  -- benchmark \
  --model=bitnet-b1.58-sharp \
  --compare-model=traditional-local \
  --prompt="how are you hosted"

This now benchmarks a real tensor-based local model against the paper-aligned hosted model through the same agent wrapper.
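
One way the `--model` and `--compare-model` flags from the command above could be collected into a distinct model list is sketched below. The helper name, the fallback to the default model when no flag is given, and the decision to ignore every other argument are assumptions for illustration; the PR's actual parser lives in HostedAgentBenchmarkOptions.cs and may differ. The only detail borrowed from the PR is the `Distinct(StringComparer.Ordinal)` de-duplication.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch; not the PR's HostedAgentBenchmarkOptions parser.
public static class BenchmarkArgsSketch
{
    private static readonly string[] ModelPrefixes = { "--model=", "--compare-model=" };

    public static string[] ParseModels(string[] args, string defaultModel = "bitnet-b1.58-sharp")
    {
        var models = new List<string>();

        foreach (var arg in args)
        {
            foreach (var prefix in ModelPrefixes)
            {
                // Accept only "--flag=value" forms with a non-empty value.
                if (arg.StartsWith(prefix, StringComparison.Ordinal) && arg.Length > prefix.Length)
                {
                    models.Add(arg[prefix.Length..]);
                }
            }
        }

        // Assumed behavior: fall back to the default model when none is given.
        if (models.Count == 0)
        {
            models.Add(defaultModel);
        }

        // Passing the same model to both flags yields a single entry.
        return models.Distinct(StringComparer.Ordinal).ToArray();
    }
}
```

Ordinal comparison keeps model identifiers case-sensitive, so `traditional-local` and `Traditional-Local` would count as two different specifiers in this sketch.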


Copilot AI and others added 2 commits March 18, 2026 00:21
Co-authored-by: sharpninja <16146732+sharpninja@users.noreply.github.com>
Co-authored-by: sharpninja <16146732+sharpninja@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Add Specflow tests based on research paper use cases" to "Add SpecFlow coverage for paper-aligned runtime use cases" Mar 18, 2026
Copilot AI requested a review from sharpninja March 18, 2026 00:25
Copilot AI and others added 2 commits March 18, 2026 00:38
Co-authored-by: sharpninja <16146732+sharpninja@users.noreply.github.com>
Co-authored-by: sharpninja <16146732+sharpninja@users.noreply.github.com>
Copilot AI changed the title from "Add SpecFlow coverage for paper-aligned runtime use cases" to "Add local hosted-model benchmarking and model-switchable SpecFlow scenarios" Mar 18, 2026
Co-authored-by: sharpninja <16146732+sharpninja@users.noreply.github.com>
Copilot AI changed the title from "Add local hosted-model benchmarking and model-switchable SpecFlow scenarios" to "Add tensor-based traditional-local model and benchmarkable local model comparison" Mar 18, 2026
@sharpninja marked this pull request as ready for review March 18, 2026 01:30
Copilot AI review requested due to automatic review settings March 18, 2026 01:30
@sharpninja merged commit 673b67f into main Mar 18, 2026
3 checks passed
@sharpninja deleted the copilot/create-specflow-tests branch March 18, 2026 01:31
Copilot AI left a comment

Pull request overview

This PR turns traditional-local into a real, tensor-based local model in BitNetSharp.Core and upgrades the app/test surface so multiple local models can be hosted and compared consistently (CLI --model / --compare-model, BenchmarkDotNet benchmarks, and SpecFlow scenarios).

Changes:

  • Adds TraditionalLocalModel (tensor-based ordered-context LM) and wires it into the app via IHostedAgentModel.
  • Introduces BenchmarkDotNet model-comparison flow and option parsing for --model/--compare-model.
  • Adds SpecFlow feature coverage for hosted response/stream/host build and updates docs to describe benchmarking/model comparison.
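
As a rough illustration of how a shared hosted-model abstraction lets the CLI, benchmarks, and SpecFlow scenarios swap models by identifier, consider the sketch below. The interface members, the `EchoModel` stand-in, and the factory class are hypothetical and chosen only for this example; the PR's real IHostedAgentModel and HostedAgentModelFactory are not reproduced here, and the JSON-configured local command path is omitted. Only the two model identifiers come from the PR.

```csharp
using System;

// Hypothetical stand-in for a hosted-model abstraction; the members are assumptions.
public interface ISketchHostedModel : IDisposable
{
    string ModelId { get; }
    string GetResponse(string prompt);
}

// Placeholder model that echoes the prompt; it does no real inference.
public sealed class EchoModel : ISketchHostedModel
{
    public EchoModel(string modelId) => ModelId = modelId;

    public string ModelId { get; }

    public string GetResponse(string prompt) => $"[{ModelId}] {prompt}";

    public void Dispose()
    {
        // Nothing to release in this sketch.
    }
}

public static class ModelFactorySketch
{
    public const string DefaultModelId = "bitnet-b1.58-sharp";
    public const string TraditionalLocalModelId = "traditional-local";

    // Maps a model specifier to an instance; unknown specifiers fail fast.
    public static ISketchHostedModel Create(string value) => value switch
    {
        DefaultModelId => new EchoModel(DefaultModelId),
        TraditionalLocalModelId => new EchoModel(TraditionalLocalModelId),
        _ => throw new ArgumentException($"Unknown model specifier '{value}'.", nameof(value)),
    };
}
```

Callers depend only on the interface, so the same code path, for example `using var model = ModelFactorySketch.Create("traditional-local");` followed by `model.GetResponse("how are you hosted")`, runs unchanged when the specifier is swapped.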

Reviewed changes

Copilot reviewed 23 out of 24 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| tests/BitNetSharp.Tests/Steps/PaperAlignedRuntimeSteps.cs | New SpecFlow step bindings for hosted response, streaming, host build, and training. |
| tests/BitNetSharp.Tests/Features/PaperAlignedRuntime.feature | New SpecFlow scenarios parameterized over bitnet-b1.58-sharp and traditional-local. |
| tests/BitNetSharp.Tests/BitNetSharp.Tests.csproj | Adds SpecFlow packages needed to run new feature tests. |
| tests/BitNetSharp.Tests/BitNetModelTests.cs | Adds tests for traditional-local, benchmark option parsing, and host summary fields. |
| src/BitNetSharp.Core/TraditionalLocalModel.cs | Implements the tensor-based traditional-local model (train + generate). |
| src/BitNetSharp.Core/BitNetSharp.Core.csproj | Adds System.Numerics.Tensors dependency for the new model. |
| src/BitNetSharp.App/TraditionalLocalHostedAgentModel.cs | Hosts TraditionalLocalModel behind IHostedAgentModel and training interface. |
| src/BitNetSharp.App/Program.cs | Updates CLI to select models, run benchmarks, support train/visualize/host across models. |
| src/BitNetSharp.App/LocalCommandModelConfig.cs | Adds JSON config loader for invoking other local models via a command. |
| src/BitNetSharp.App/LocalCommandHostedAgentModel.cs | Implements a hosted model that executes an external local command runner. |
| src/BitNetSharp.App/IHostedAgentModel.cs | Introduces shared hosted-model abstraction + train/inspect capability interfaces. |
| src/BitNetSharp.App/HostedAgentModelFactory.cs | Centralizes creation of built-in models and JSON-configured local command models. |
| src/BitNetSharp.App/HostedAgentBenchmarks.cs | Adds BenchmarkDotNet benchmarks for host build/query/stream/train. |
| src/BitNetSharp.App/HostedAgentBenchmarkOptions.cs | Adds benchmark option parsing and environment transport for BDN. |
| src/BitNetSharp.App/BitNetSharp.App.csproj | Adds BenchmarkDotNet dependency. |
| src/BitNetSharp.App/BitNetHostedAgentModel.cs | Wraps BitNetPaperModel behind IHostedAgentModel (+ weight inspection). |
| src/BitNetSharp.App/BitNetChatClient.cs | Replaces paper-model-only chat client with IHostedAgentModel-backed client. |
| src/BitNetSharp.App/BitNetAgentHost.cs | Generalizes agent host build to accept IHostedAgentModel and enriches summary. |
| docs/usage.md | Updates usage docs for --model, benchmark command, and traditional-local training. |
| docs/benchmarking.md | Adds new docs page describing benchmarking/model comparison and JSON configs. |
| docs/architecture.md | Updates architecture docs to reflect multi-model hosting and benchmarking. |
| docs/SUMMARY.md | Adds benchmarking page to GitBook navigation. |
| docs/README.md | Links benchmarking page in the documentation map. |
| .gitignore | Ignores generated SpecFlow *.feature.cs outputs. |

Comment on lines +117 to +118

        _host?.Dispose();
        _model?.Dispose();
    }

    return new HostedAgentBenchmarkOptions(
        models.Distinct(StringComparer.Ordinal).ToArray(),

    DefaultModelId => new BitNetHostedAgentModel(BitNetBootstrap.CreatePaperModel(verbosity)),
    TraditionalLocalModelId => new TraditionalLocalHostedAgentModel(verbosity),
    _ => throw new ArgumentException(
        $"Unknown model specifier '{value}'. Use '{DefaultModelId}', '{TraditionalLocalModelId}', or an absolute path to a local command model JSON file.",

        return;
    }

    Train(BitNetTrainingCorpus.CreateDefaultExamples(), epochs: DefaultTrainingEpochs);

Comment on lines +329 to +349

    private int SelectNextToken(float[] probabilities, bool allowEndToken)
    {
        var bestTokenId = _endTokenId;
        var bestProbability = float.NegativeInfinity;

        for (var tokenId = 0; tokenId < probabilities.Length; tokenId++)
        {
            // Skip special tokens; the end token is skipped only when it is not allowed.
            if (tokenId == _beginTokenId || tokenId == _unknownTokenId || (!allowEndToken && tokenId == _endTokenId))
            {
                continue;
            }

            var probability = probabilities[tokenId];
            if (probability > bestProbability)
            {
                bestTokenId = tokenId;
                bestProbability = probability;
            }
        }

        return bestTokenId;
    }

    - var chatClient = new BitNetChatClient(model);
    + var chatClient = new HostedModelChatClient(model);

      builder.Services.AddSingleton(model);

        return;
    }

    using var model = HostedAgentModelFactory.Create(modelSpecifier, verbosity);