This repository is the greenfield Go re-implementation of LLM Interactive Proxy.
The repository implements the Go core v1 stack from .kiro/specs/go-core-reimplementation-v1: canonical lipapi contracts, core routing/B2BUA/executor, bundled frontend and backend plugins, conformance matrix, and a runnable standard distribution binary (cmd/lipstd) that serves the bundled HTTP APIs when configured.
- API parity (spec + matrices): vendor-surface claims are tracked under `.kiro/specs/llm-api-parity/` with row-level status; the README does not assert parity beyond what those matrices mark `implemented` (see also `.kiro/specs/go-core-reimplementation-v1/refclient-spec-matrix.md`).
- canonical Go module and repository layout
- package boundaries aligned with `AGENTS.md` and Kiro steering
- database persistence settings for continuity, secure sessions, and managed PostgreSQL pool tuning are documented in `docs/database-persistence.md`
- typed runtime configuration (`config/config.yaml`): optional `server.read_header_timeout`, `read_timeout`, `write_timeout`, `idle_timeout` (Go duration strings; omit for stdhttp defaults), and `server.max_pending_wire_events` to cap backend adapter pending-event queues per stream (`0` = unlimited). Multi-instance routing example: `config/config.multi-instance.example.yaml`.
  Access / auth: Commented templates at the top of the same file cover `access.mode`, `auth.handler`, `auth.required_level`, `auth.event_delivery` / `auth.event_failure_policy`, `auth.local_api_keys`, and optional `auth.remote`; when omitted, defaults match `internal/core/config/access_auth_model.go`.
  Logging (`logging` in YAML): `level` (`debug|info|warn|error`), `format` (`json|text`), optional `add_source`, optional `access_log` (one structured `http.access` line per request with `method`, `route_group` (first path segment, bounded cardinality), `status`, `duration_ms`, and `trace_id` when present; optional `access_log_include_raw_path: true` adds the full URL path), and `access_log_skip_paths` (path prefixes starting with `/`, e.g. `/healthz`). The process logger is built in `internal/infra/logging` with `slog-multi` + `slog-formatter` error normalization over stdout.
- Architecture / drift: `docs/architecture-guardrails.md`, `docs/adr/0005-architecture-guardrails-and-complexity-budgets.md`. Stage-four extension platform (legal stages, SDK facades, inventory, migration from hook-only plugins): `docs/extension-platform-authoring.md`. Plugin authors bind shared state with `pkg/lipsdk/state.BindPlugin` when a feature spans multiple handlers, and inventory surfaces privileged capability flags such as `auxiliary_requests` for stateful/call-gated feature bundles. Routing breaker semantics: `docs/routing-health-circuit-breaker.md`. Execute-error taxonomy notes: `docs/execerr-classification.md`. HTTP 5xx responses from bundled frontends use a stable generic message for internal executor/upstream failures (`internal error`); operators rely on structured server logs for detail (not backward compatible if you depended on error-body echo of raw upstream text).
- `cmd/lipstd`: creates a `pluginreg.Registry` with `pluginreg.NewRegistry`, resolves default upstream API keys once via `pluginreg.ResolveUpstreamAPIKeysFromEnv` (when YAML leaves keys empty: `OPENAI_API_KEY` plus optional `OPENAI_API_KEY_2` … up to `_32`, contiguous; same pattern for `ANTHROPIC_API_KEY*` and `GEMINI_API_KEY*`), installs the standard bundle via `pluginreg.InstallStandardBundleOn` (factory wiring in `backends_install.go`, `frontends_install.go`, and `features_install.go`; mandatory ids in `lipsdk.StandardDistributionRequirements`), loads config, validates mandatory plugins against that requirements list, assembles `runtime.App` with a non-nil logger, builds `runtimebundle.Built` via `runtimebundle.Build` with an explicit registry in `runtimebundle.BuildOptions` and a non-nil logger, a shared upstream HTTP client built from `httpclient.StandardWithTune` using `httpclient.TransportTuneFromConfig` (YAML `http_client` pool/timeouts; overridable for tests), then serves HTTP with `stdhttp.RunWithRuntime` (which also requires a non-nil logger). HTTP request IDs use a per-process `diag.TraceIDGenerator` created in `stdhttp`, not package-level global state.
  Optional `routing.health.circuit_breaker` (`enabled`, `failure_threshold`, `open_for`) wires executor candidate health; the executor emits structured `lip.route` routing observations when logging is configured. Bundled HTTP frontends classify execute failures with `internal/plugins/frontends/execerr` (reject vs internal). `stdhttp.Run` is a convenience that calls `Build` (with the provided registry) then `RunWithRuntime`.
- test, vet, lint, and vuln-check entrypoints
- QA scripts, optional git hooks, and a GitHub Actions workflow aligned with the sibling `go-live-market-data-aggregator` process (trimmed for this repo: no domain-specific custom vets)
- deterministic IDs/timestamps in frontend encoders and ACP reference paths where reproducibility matters; B2BUA A/B leg ids are opaque random strings (`a_` / `b_` + 32 hex chars) to avoid trivial enumeration; the standard server path injects a real wall clock and non-deterministic RNG for the executor (see `internal/infra/runtimebundle`)
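
The numbered-key lookup described for `cmd/lipstd` (base variable, then `_2` up to `_32`, contiguous) can be sketched as follows. This is a minimal illustration, not the `pluginreg.ResolveUpstreamAPIKeysFromEnv` implementation: `resolveKeys` and its `lookup` parameter are hypothetical names, and returning nothing when the base variable is empty is an assumption.

```go
package main

import (
	"fmt"
	"os"
)

// maxNumberedKeys mirrors the "_32" upper bound stated in this README.
const maxNumberedKeys = 32

// resolveKeys is a hypothetical helper. It reads BASE, then BASE_2,
// BASE_3, ... and stops at the first missing or empty variable, so a gap
// hides every later key. Assumption: an empty BASE yields no keys at all.
func resolveKeys(base string, lookup func(string) string) []string {
	first := lookup(base)
	if first == "" {
		return nil
	}
	keys := []string{first}
	for n := 2; n <= maxNumberedKeys; n++ {
		v := lookup(fmt.Sprintf("%s_%d", base, n))
		if v == "" {
			break // contiguous suffixes only
		}
		keys = append(keys, v)
	}
	return keys
}

func main() {
	// The count depends on the process environment.
	fmt.Println(len(resolveKeys("OPENAI_API_KEY", os.Getenv)))
}
```

Passing the lookup function as a parameter keeps the sketch testable without mutating the process environment.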
- Auth audit events: Structured `lip.auth.*` logs can include operator-visible PII (principal display name, ids, session/client correlation fields). `auth.event_delivery` selects the default log sink, `disabled`, or `custom` (composition root must supply `AuthEventSink`). `auth.event_failure_policy` `best_effort` ignores sink errors; `fail_closed` fails the request when the auth/session audit sink errors while emitting an auth-decision or session-start event, trading availability for stricter observability guarantees.
- Local API keys: `auth.local_api_keys[].key` values must be at least 16 Unicode code points after trimming (validated at config load). For `access.mode: multi_user` on a network-reachable bind, combine strong keys with reverse-proxy rate limiting or a WAF; the proxy does not implement per-client throttling on the auth middleware path.
- Diagnostics: With `diagnostics.enabled`, routes other than `health_path` (attempts, inventory, route trace, pprof) can reveal routing and lineage metadata. Bind the listener to loopback or an admin-only network, and/or set `diagnostics.shared_secret` (at least 12 characters when set). Clients must then send header `X-LIP-Diagnostics-Secret` with that exact value.
- Prometheus metrics: Set `observability.metrics.enabled: true` and optional `observability.metrics.path` (defaults to `/metrics` when omitted in config load). Scrapes expose bounded-label HTTP histograms/counters on `method`, `status_class`, and `route_group` (same coarse bucketing as OTEL span names; `lip_http_request_duration_seconds` uses buckets up to 120s for LLM tail latency), plus `lip_executor_attempts_total`, `lip_executor_backend_open_seconds`, `lip_upstream_request_duration_seconds` (when using the bundled upstream client), and Go/process collectors. Optional `observability.metrics.exemplars_enabled: true` enables OpenMetrics on the scrape handler and attaches `trace_id` exemplars to selected histograms when a span is present. When `diagnostics.shared_secret` is set, scrapers must send `X-LIP-Diagnostics-Secret` for the metrics route (same as attempts/pprof). The metrics path participates in diagnostic path uniqueness checks (must not overlap `health_path`, etc.).
- OpenTelemetry tracing: Set `observability.tracing.enabled: true` and optional `observability.tracing.service_name` (otherwise `OTEL_SERVICE_NAME` or `lipstd`). Optional `observability.tracing.sample_ratio` (strictly between 0 and 1 when set) applies `ParentBased(TraceIDRatioBased)` for new root spans; omit or use `1` for SDK default sampling. Configure OTLP export with standard `OTEL_*` environment variables (`OTEL_EXPORTER_OTLP_ENDPOINT`, `OTEL_EXPORTER_OTLP_PROTOCOL`, trace sampler settings, etc.). Incoming HTTP requests are wrapped with `otelhttp` (coarse span names); the executor adds child spans (`lip.executor.execute`, `lip.executor.backend_open`). The outbound upstream `http.Client` propagates trace context when tracing is active, including when `runtimebundle.BuildOptions.HTTPClient` supplies a custom client (its transport is wrapped). With tracing enabled, JSON logs include `trace_id` and `span_id` when the slog record context carries an active span; `diag.TraceID` prefers explicit LIP correlation (`WithCallDiag`, `X-Trace-ID` / `WithTraceID`) and otherwise falls back to the W3C trace id from span context. Responses echo `X-Trace-ID` whenever a trace id is available on the request context.
- Outbound HTTP proxy: The shared upstream client defaults to honoring `HTTP_PROXY` / `HTTPS_PROXY`. In environments where process environment is not trusted, set `http_client.trust_environment_proxy: false` so those variables are ignored. Optional `http_client` fields (`max_idle_conns`, `max_idle_conns_per_host`, `idle_conn_timeout`, `response_header_timeout`, `client_timeout`, dial/TLS timeouts) tune connection pooling for high concurrency; defaults match `httpclient.DefaultTransportTune`.
- Bedrock cleartext: `disable_https: true` is only accepted when `base_endpoint` resolves to a loopback host, unless `allow_insecure_non_loopback: true` is set on the Bedrock backend row (explicit lab escape hatch).
- SQLite path: `continuity.sqlite_path` must not contain NUL, `?`, `#`, or `&` (avoids ambiguous `file:` DSN parsing).
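
Two of the load-time rules in the list above (minimum local API key length and the SQLite path character restriction) are simple enough to sketch. The function names and error messages here are hypothetical and only illustrate the stated rules; the repository's actual validation code may differ.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
	"unicode/utf8"
)

// minLocalAPIKeyRunes is the 16-code-point minimum stated above.
const minLocalAPIKeyRunes = 16

// validateLocalAPIKey (hypothetical name) counts Unicode code points,
// not bytes, after trimming surrounding whitespace.
func validateLocalAPIKey(key string) error {
	if utf8.RuneCountInString(strings.TrimSpace(key)) < minLocalAPIKeyRunes {
		return errors.New("local API key must be at least 16 code points after trimming")
	}
	return nil
}

// validateSQLitePath (hypothetical name) rejects characters that would make
// a "file:" DSN ambiguous: NUL, '?', '#', and '&'.
func validateSQLitePath(path string) error {
	if strings.ContainsAny(path, "\x00?#&") {
		return errors.New("sqlite path must not contain NUL, '?', '#', or '&'")
	}
	return nil
}

func main() {
	fmt.Println(validateLocalAPIKey("  short-key  "))
	fmt.Println(validateSQLitePath("data/continuity.db?mode=rwc"))
}
```

Counting runes rather than bytes matters for keys containing multi-byte characters, where `len(key)` would overstate the length.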
- `GOMEMLIMIT`: In Kubernetes or other memory-capped environments, set `GOMEMLIMIT` to roughly 80–90% of the cgroup limit so the Go runtime can trigger GC before the OOM killer terminates the process (see the `runtime/debug.SetMemoryLimit` / `GOMEMLIMIT` documentation).
- `GOGC`: The default target GC percentage is usually fine; change it only when profiling shows GC as a meaningful cost.
- PGO (optional): build with profile-guided optimization (e.g. `go build -pgo=default.pgo` after collecting a representative CPU profile) to improve hot paths in production binaries.
- `GOMAXPROCS`: From Go 1.25 onward, the runtime respects CPU cgroup quotas automatically on Linux; earlier toolchains default to the host's logical CPU count, so set it explicitly in CPU-limited containers. Otherwise override it only when you have a deliberate reason (for example pinning to socket count on bare metal).
- CPU and heap profiles: For local investigation, run benchmarks or tests with `go test -cpuprofile=cpu.prof ./path/...` then `go tool pprof cpu.prof`, or heap profiles with `-memprofile`. When `diagnostics.enabled` is true, set `diagnostics.pprof_path` in YAML (for example `/debug/pprof`) to expose the standard library pprof index and endpoints under that prefix. Do not expose pprof on untrusted networks; put the listener behind localhost, a VPN, or an authenticated reverse proxy. Diagnostic paths (`health_path`, `attempts_path`, `inventory_path`, `route_trace_path`, `pprof_path`, and `observability.metrics.path` when metrics are enabled) must be unique after normalization and must not prefix-overlap (validated at config load).
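
The diagnostic path rule above (unique after normalization, no prefix overlap) could look roughly like the sketch below. The function names are hypothetical, and two details are assumptions rather than documented behavior: normalization is done with `path.Clean` plus a leading slash, and "prefix overlap" is interpreted at path-segment boundaries.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// normalize adds a leading slash and cleans the path (assumed normalization).
func normalize(p string) string { return path.Clean("/" + p) }

// overlaps reports whether two normalized paths are equal or one is a
// segment-wise prefix of the other (assumed meaning of "prefix-overlap").
func overlaps(a, b string) bool {
	if a == b || a == "/" || b == "/" {
		return true
	}
	return strings.HasPrefix(b, a+"/") || strings.HasPrefix(a, b+"/")
}

// checkDiagnosticPaths (hypothetical name) returns an error for the first
// pair of configured paths that collide.
func checkDiagnosticPaths(paths []string) error {
	norm := make([]string, len(paths))
	for i, p := range paths {
		norm[i] = normalize(p)
	}
	for i := range norm {
		for j := i + 1; j < len(norm); j++ {
			if overlaps(norm[i], norm[j]) {
				return fmt.Errorf("diagnostic paths %q and %q overlap", paths[i], paths[j])
			}
		}
	}
	return nil
}

func main() {
	fmt.Println(checkDiagnosticPaths([]string{"/healthz", "/debug/pprof", "/metrics"}))
	fmt.Println(checkDiagnosticPaths([]string{"/debug", "/debug/pprof"}))
}
```

Under this interpretation `/metrics` and `/metricsx` do not collide, because the prefix test only matches whole path segments.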
- `lipapi.Call.Validate` enforces maximum sizes on route selectors, IDs, messages/parts/tool counts, part payloads, extensions, and related option strings (see `pkg/lipapi/limits.go`). Oversized canonical requests fail validation before orchestration runs.
- `lipapi.Collect` applies `DefaultCollectLimits` when aggregating streaming events into a single `Collected` struct. Use `CollectWithLimits` for custom caps or `CollectUnbounded` only for tests/harnesses that deliberately exceed defaults.
- `b2bua.MemoryStore` applies a default maximum number of concurrent A-leg rows (`DefaultMemoryStoreMaxLegsWithoutTTL`, currently 100k) whenever `MaxLegs` / `continuity.max_legs` is unset (zero); set a positive value to override. Negative `max_legs` is rejected. TTL (when set) evicts idle rows by age; the max-legs cap still applies and evicts the least-recently-seen rows when the store is over capacity.
- Default route selector when clients omit `X-LIP-Route` is resolved by `config.EffectiveDefaultRouteSelector` from `routing.default_route` in YAML, then the first enabled backend plus registry default model ids (`pluginreg.DefaultWireModel`). Optional top-level `model_aliases` rewrites the full selector string (regexp `pattern` to `replacement`) before parsing; `routing.default_route` is expanded the same way during `runtimebundle.Build`. Backend rows use `id` as the runtime instance id; optional `kind` sets the bundled factory when you need multiple instances of the same adapter (`kind: openai-responses`, `id: openai-primary`). After `config.LoadFile`, call `routing.ValidateModelAliasesConfig(cfg)` so invalid alias rules fail at startup. See `internal/core/config/effective_default_route.go` and `internal/core/routing/aliases.go`.
- SQLite continuity (`continuity.store: sqlite` and `continuity.sqlite_path`) persists A-leg rows and attempt lineage across process restarts (`internal/core/continuity/sqlitestore`, pure-Go driver `modernc.org/sqlite`). `continuity.ttl` / `max_legs` apply only to the in-memory store; combining them with SQLite is rejected at config load until durable pruning exists.
- Hook bus: root `hooks.tool_reactor_error_policy` selects `fail_open` (default), `fail_closed`, or `swallow_event` for tool-reactor errors. Optional reference feature plugins are documented in `internal/plugins/features/REFERENCE_PLUGINS.md`.
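
To illustrate the `model_aliases` idea (a regexp pattern rewriting the full selector string before parsing), here is a minimal sketch. The types and functions are hypothetical stand-ins for whatever lives in `internal/core/routing/aliases.go`; the `backend:model` selector shape and the "first matching rule wins" policy are assumptions made only for the example.

```go
package main

import (
	"fmt"
	"regexp"
)

// aliasRule is a hypothetical compiled form of one model_aliases entry.
type aliasRule struct {
	pattern     *regexp.Regexp
	replacement string
}

// compileAliases compiles (pattern, replacement) pairs and fails fast on an
// invalid pattern, mirroring the "fail at startup" advice above.
func compileAliases(raw [][2]string) ([]aliasRule, error) {
	rules := make([]aliasRule, 0, len(raw))
	for _, r := range raw {
		re, err := regexp.Compile(r[0])
		if err != nil {
			return nil, fmt.Errorf("model alias %q: %w", r[0], err)
		}
		rules = append(rules, aliasRule{pattern: re, replacement: r[1]})
	}
	return rules, nil
}

// rewriteSelector applies the first matching rule (assumed policy) to the
// whole selector string and returns it unchanged when nothing matches.
func rewriteSelector(rules []aliasRule, selector string) string {
	for _, r := range rules {
		if r.pattern.MatchString(selector) {
			return r.pattern.ReplaceAllString(selector, r.replacement)
		}
	}
	return selector
}

func main() {
	rules, err := compileAliases([][2]string{{`^fast$`, "openai-primary:small-model"}})
	if err != nil {
		panic(err)
	}
	fmt.Println(rewriteSelector(rules, "fast"))
}
```

Compiling every pattern once at startup surfaces invalid rules before any request is routed.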
Fast checks (format, go mod tidy drift, build, vet, architecture guardrails in internal/archtest, plus go mod verify in CI or when LIP_VERIFY_MODULE_CACHE=1) plus staged-package tests mirror the sibling repo’s quality-checks / test-staged pattern. See docs/architecture-guardrails.md.
make quality-checks # gofmt -l, tidy+diff guard, go build, go vet; add LIP_VERIFY_MODULE_CACHE=1 to include mod verify locally
make test # quality-checks + go test -parallel=8 ./... (excludes precommit-tagged tests)
make test-precommit-extra # hygiene + executor matrices (-tags=precommit; included in make qa + CI)
make test-fast # quality-checks + tests for staged packages (or all if none staged)
make test-race # skipped on Windows; on Linux/macOS runs scripts/race-check.sh (CI: strict race on Linux)
make test-fuzz # short fuzz smoke on internal/testkit (override: FUZZTIME=30s make test-fuzz)
make bench # benchmarks: testkit, core stream/runtime/routing/diag, frontend streaming encoders (see docs/performance-checks.md)
make qa # quality-checks + unit tests + golangci-lint + govulncheck (via `go tool`, see go.mod)
make hooks-install # set core.hooksPath to .githooks (secret scan on staged files, then quality-gate when .go is staged)

Pre-commit runs scripts/check-staged-secrets first (gitleaks when installed, otherwise high-signal git grep patterns; allowlist via .gitleaks.toml and scripts/secret-scan-allowlist.txt). When staged .go files exist, it then runs scripts/quality-gate (quality checks, staged tests, staged race scan on Linux/macOS only and skipped on Windows, golangci-lint if present, go tool govulncheck).
CI (.github/workflows/qa.yml) runs make quality-checks, unit tests, strict race on Linux, golangci-lint-action, and go tool govulncheck.
Linter config lives in .golangci.yml (golangci-lint v2 schema: staticcheck, govet, revive, the modernize pass, gofumpt as a formatter, and small correctness linters). Install a v2.x binary locally so make lint matches CI.
- `lipsdk/continuity.Store`: `GetALeg` was renamed to `FetchALeg` (golang naming: avoid `Get` on accessors).
- `runtime.ErrNilConfig` and `lipapi.ErrRecoverablePreOutput` `Error()` text now include a `runtime:` / `lipapi:` prefix; use `errors.Is` instead of string equality.
- `lipapi.SessionRef`: the proxy-owned session id field is `AuthoritativeSessionID` (renamed from `SessionID` to avoid stutter with the struct name). JSON still uses the key `SessionID` for wire compatibility.
- `lipsdk.FrontendMount` now takes a single `lipsdk.FrontendMountOptions` after the `http.ServeMux`. `pluginreg.(*Registry).MountFrontend` matches that shape (factory id, mux, options).
- `stdhttp.MountBundledFrontends` takes `stdhttp.MountBundledFrontendsInput` instead of six separate parameters.
- Composition roots: `pluginreg.InstallStandardBundleOn` / `InstallStandardBackendsOn` take `pluginreg.UpstreamAPIKeys` (per-family ordered slices; use `ResolveUpstreamAPIKeysFromEnv` in `main`, or `UpstreamAPIKeys{}` in tests). Env resolution includes numbered keys (`OPENAI_API_KEY_2`, … contiguous suffixes only: scanning stops at the first missing or empty `_N`, so a gap like `_2` unset while `_3` is set will not load `_3`; same pattern for Anthropic/Gemini). Hosted backend YAML accepts optional `api_keys` alongside `api_key` (see `config/config.multi-instance.example.yaml`). `runtime.New`, `runtimebundle.Build`, `stdhttp.Run`, and `stdhttp.RunWithRuntime` require a non-nil `*slog.Logger`. `pluginreg.(*Registry).RegisterBackend` factories return `execbackend.Backend` directly (not `any`). `sqlitestore.New` accepts an existing `*sql.DB` for tests.
- Hosted provider backends (multi-key pools): The bundled `openai-responses`, `openai-legacy`, `anthropic`, and `gemini` backends keep ordered credentials per instance, classify pre-output 401/429 from the official SDKs, and may return `lipapi.RecoverablePreOutputError` from `execbackend.Backend.Open` when no key is usable before the first canonical stream event (including single-key rate limit or auth failure). They do not return an `lipapi.EventStream` that fails only during `lipapi.Collect` for that case. 401 handling: HTTP 401 from the hosted upstream is treated as permanently invalid for that credential inside the process (the pool marks it unusable until restart); this matches static API keys but is not suitable for short-lived tokens that might recover without a restart. 429 / Gemini: OpenAI and Anthropic read `Retry-After` from the SDK error where available; the genai client does not attach response headers to `genai.APIError`, so the Gemini adapter also reads `google.rpc.RetryInfo` from error JSON `details` when present, otherwise it uses a conservative fixed cooldown fallback.
- Terminal stream errors: `lipapi.Collect` and bundled frontend SSE encoders surface terminal upstream failures as `lipapi.ErrStreamTerminal` / `*lipapi.StreamError` (stable `Error()` string `lipapi: stream error`; use `errors.As` for `Code` and `Message`). This replaces embedding provider text directly in `err.Error()`.
- ACP JSON-RPC errors: The bundled ACP client returns `*acp.RPCError` with stable `Error()` text per RPC method; use `Code` / `Message` for vendor detail. Optional `acp.Config.Log` enables debug logs when a best-effort cancel RPC fails after consumer cancellation.
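
Several of the changes above ask callers to move from string comparison to `errors.Is` / `errors.As`. The sketch below shows that pattern with local stand-in types; `errStreamTerminal` and `streamError` are not the real `lipapi` definitions, and the string type of `Code` and the `Unwrap` relationship between them are assumptions for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// errStreamTerminal is a stand-in for a sentinel such as lipapi.ErrStreamTerminal.
var errStreamTerminal = errors.New("lipapi: stream error")

// streamError is a stand-in for a typed error carrying vendor detail in
// fields instead of in the Error() text.
type streamError struct {
	Code    string
	Message string
}

func (e *streamError) Error() string { return "lipapi: stream error" }

// Unwrap links the typed error to the sentinel (assumed relationship).
func (e *streamError) Unwrap() error { return errStreamTerminal }

// wrap simulates a caller adding context with %w.
func wrap(err error) error { return fmt.Errorf("collect: %w", err) }

// describe inspects an error chain without comparing strings.
func describe(err error) string {
	var se *streamError
	if errors.As(err, &se) {
		return fmt.Sprintf("terminal stream error: code=%s message=%s", se.Code, se.Message)
	}
	if errors.Is(err, errStreamTerminal) {
		return "terminal stream error"
	}
	return "other error"
}

func main() {
	err := wrap(&streamError{Code: "upstream_overloaded", Message: "try again later"})
	fmt.Println(describe(err))
	// String equality stops matching as soon as the error is wrapped.
	fmt.Println(err.Error() == "lipapi: stream error")
}
```

Because `errors.As` and `errors.Is` walk the wrap chain, callers keep working when intermediate layers add context to the error.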
- `cmd/lipstd/` - standard distribution entrypoint (registry + runtimebundle + stdhttp)
- `internal/pluginreg/` - standard bundle registration (`register_standard.go`, `*_install.go`) and registry helpers; mandatory bundled ids are defined in `pkg/lipsdk`
- `pkg/lipapi/` - canonical public contracts
- `pkg/lipsdk/` - stable plugin SDK contracts (including `pkg/lipsdk/request` for request-wide transforms, `pkg/lipsdk/toolcatalog` for tool catalog filters, and `pkg/lipsdk/completion` for whole-completion gates; these merge through `feature.FeatureBundle` into the runtime snapshot)
- `internal/core/` - runtime, routing, stream, config, admin, capabilities
- `internal/plugins/` - bundled frontend, backend, and feature plugins
- `internal/stdhttp/` - standard distribution HTTP wiring (mount + `RunWithRuntime`)
- `internal/infra/runtimebundle/` - assembles executor, continuity, shared upstream HTTP, health/observer seams
- `internal/infra/logging/` - YAML-driven `slog` root logger (samber `slog-multi` / `slog-formatter` pipeline)
- `internal/infra/` - shared infrastructure seams
- `internal/testkit/` - test support surface scaffold
- `internal/qa/` - repo hygiene tests (`//go:build precommit`; run via `make test-precommit-extra` or pre-commit hook, not default `make test`)
- `internal/archtest/` - architecture guardrail tests (budgets, forbidden patterns)
- `internal/refbackend/` - spec-shaped HTTP emulator servers for tests (`*_test.go` imports only)
- `internal/refclient/` - official-SDK reference clients for conformance/matrix tests
- `internal/plugins/stores/` - bundled persistence / continuity store plugins (intentional seam alongside backends)
- `scripts/` - quality gate scripts (bash + PowerShell)
- `.githooks/` - optional git hooks
- `.github/workflows/` - CI QA pipeline
- `testdata/` - fixtures and goldens
- `.kiro/` - steering and spec artifacts
make test
make vet
go run ./cmd/lipstd --config ./config/config.yaml

Use config/config.yaml as the default sample; optional access / auth blocks are spelled out in comments at the top of that file before the active server and logging keys.
Install golangci-lint for the full make qa profile; govulncheck is invoked as go tool govulncheck (version pinned via the tool line and golang.org/x/vuln in go.mod).