Problem
pkg/miner/miner.go has three // TODO: stubs that keep the miner from executing real inference work:
pkg/miner/miner.go:326 — // TODO: Integrate with actual inference engine (llama.cpp, vllm, etc.) (inside runInference)
pkg/miner/miner.go:356 — // TODO: Integrate with actual chat model (inside runChat)
pkg/miner/miner.go:381 — // TODO: Integrate with embedding model (inside runEmbedding)
All three currently return hard-coded placeholder responses. The miner ships a working queue/API/result loop, but the compute layer is a stub.
Proposal
Introduce a small pluggable backend interface rather than hard-wiring any specific engine binary. That keeps luxfi/ai binary-light while letting operators point at whichever engine they run.
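// pkg/miner/backend/backend.go
package backend

import "context"

// InferenceBackend abstracts the engine that performs the actual compute.
type InferenceBackend interface {
	Chat(ctx context.Context, req ChatRequest) (ChatResponse, error)
	Inference(ctx context.Context, req InferenceRequest) (InferenceResponse, error)
	Embed(ctx context.Context, req EmbedRequest) (EmbedResponse, error)
	Capabilities() Capabilities
	Name() string
}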
Ship two reference backends:
backend/noop — current deterministic placeholder extracted verbatim. Preserves today's behaviour so existing tests pass unchanged. Default.
backend/openai — OpenAI-compatible HTTP adapter (stdlib net/http, no new deps). Since llama.cpp / vllm / ollama all expose an OpenAI-compatible /v1/chat/completions, one adapter covers all three via OPENAI_API_BASE config.
Config extension: Config.Backend (noop | openai) plus Config.OpenAIBase / Config.OpenAIAPIKey / Config.OpenAIModel. Default stays noop — zero behaviour change for existing callers.
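One way the backend selection could be validated is sketched below. The Config struct here carries only the four fields named above, all assumed to be strings; resolveBackend is a hypothetical helper, and requiring OpenAIBase for the openai backend is an assumption rather than something the proposal specifies.

```go
package main

import (
	"errors"
	"fmt"
)

// Config holds only the fields named in the proposal; string types are assumed.
type Config struct {
	Backend      string
	OpenAIBase   string
	OpenAIAPIKey string
	OpenAIModel  string
}

// resolveBackend (hypothetical) normalises the backend name and validates
// the fields that backend needs. An empty value falls back to noop so that
// existing callers see no behaviour change.
func resolveBackend(cfg Config) (string, error) {
	switch cfg.Backend {
	case "", "noop":
		return "noop", nil
	case "openai":
		if cfg.OpenAIBase == "" {
			return "", errors.New("openai backend requires OpenAIBase")
		}
		return "openai", nil
	default:
		return "", fmt.Errorf("unknown backend %q", cfg.Backend)
	}
}

func main() {
	name, err := resolveBackend(Config{})
	fmt.Println(name, err)
}
```

Treating the empty string as noop means a zero-value Config keeps working, which is what "default stays noop" implies for existing callers.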
Scope
pkg/miner/backend/backend.go — interface + request/response types
pkg/miner/backend/noop/noop.go — deterministic mock
pkg/miner/backend/openai/openai.go — HTTP adapter
- Wire Miner.processTask to dispatch through the backend
- Unit tests for interface contract, noop, and openai (httptest-mocked server)
pkg/miner/backend/README.md — short doc with llama.cpp / vllm / ollama pointer commands
Target: ~500–800 LOC, no new go.mod deps, default behaviour preserved.
Happy to send the PR directly.
Contributed by kcolbchain (https://kcolbchain.com) · https://abhishekkrishna.com