matdev83/go-llm-interactive-proxy

Go LLM Interactive Proxy

This repository is the greenfield Go re-implementation of LLM Interactive Proxy.

The repository implements the Go core v1 stack from .kiro/specs/go-core-reimplementation-v1: canonical lipapi contracts, core routing/B2BUA/executor, bundled frontend and backend plugins, conformance matrix, and a runnable standard distribution binary (cmd/lipstd) that serves the bundled HTTP APIs when configured.

Current state

  • API parity (spec + matrices) — vendor-surface claims are tracked under .kiro/specs/llm-api-parity/ with row-level status; the README does not assert parity beyond what those matrices mark implemented (see also .kiro/specs/go-core-reimplementation-v1/refclient-spec-matrix.md).
  • canonical Go module and repository layout
  • package boundaries aligned with AGENTS.md and Kiro steering
  • database persistence settings for continuity, secure sessions, and managed PostgreSQL pool tuning are documented in docs/database-persistence.md
  • typed runtime configuration (config/config.yaml): optional server.read_header_timeout, read_timeout, write_timeout, idle_timeout (Go duration strings; omit for stdhttp defaults), and server.max_pending_wire_events to cap backend adapter pending-event queues per stream (0 = unlimited). Multi-instance routing example: config/config.multi-instance.example.yaml. Access / auth — Commented templates at the top of the same file cover access.mode, auth.handler, auth.required_level, auth.event_delivery / auth.event_failure_policy, auth.local_api_keys, and optional auth.remote; when omitted, defaults match internal/core/config/access_auth_model.go. Logging (logging in YAML): level (debug|info|warn|error), format (json|text), optional add_source, optional access_log (one structured http.access line per request with method, route_group (first path segment, bounded cardinality), status, duration_ms, and trace_id when present; optional access_log_include_raw_path: true adds the full URL path), and access_log_skip_paths (path prefixes starting with /, e.g. /healthz). The process logger is built in internal/infra/logging with slog-multi + slog-formatter error normalization over stdout.
  • Architecture / drift: docs/architecture-guardrails.md, docs/adr/0005-architecture-guardrails-and-complexity-budgets.md. Stage-four extension platform (legal stages, SDK facades, inventory, migration from hook-only plugins): docs/extension-platform-authoring.md. Plugin authors bind shared state with pkg/lipsdk/state.BindPlugin when a feature spans multiple handlers, and inventory surfaces privileged capability flags such as auxiliary_requests for stateful/call-gated feature bundles. Routing breaker semantics: docs/routing-health-circuit-breaker.md. Execute-error taxonomy notes: docs/execerr-classification.md. HTTP 5xx responses from bundled frontends use a stable generic message for internal executor/upstream failures (internal error); operators rely on structured server logs for detail. This is not backward compatible if you depended on the error body echoing raw upstream text.
  • cmd/lipstd — creates a pluginreg.Registry with pluginreg.NewRegistry, resolves default upstream API keys once via pluginreg.ResolveUpstreamAPIKeysFromEnv (when YAML leaves keys empty: OPENAI_API_KEY plus optional OPENAI_API_KEY_2 … up to _32, contiguous; same pattern for ANTHROPIC_API_KEY* and GEMINI_API_KEY*), installs the standard bundle via pluginreg.InstallStandardBundleOn (factory wiring in backends_install.go, frontends_install.go, and features_install.go; mandatory ids in lipsdk.StandardDistributionRequirements), loads config, validates mandatory plugins against that requirements list, assembles runtime.App with a non-nil logger, builds runtimebundle.Built via runtimebundle.Build with an explicit registry in runtimebundle.BuildOptions and a non-nil logger, a shared upstream HTTP client built from httpclient.StandardWithTune using httpclient.TransportTuneFromConfig (YAML http_client pool/timeouts; overridable for tests), then serves HTTP with stdhttp.RunWithRuntime (which also requires a non-nil logger). HTTP request IDs use a per-process diag.TraceIDGenerator created in stdhttp, not package-level global state. Optional routing.health.circuit_breaker (enabled, failure_threshold, open_for) wires executor candidate health; the executor emits structured lip.route routing observations when logging is configured. Bundled HTTP frontends classify execute failures with internal/plugins/frontends/execerr (reject vs internal). stdhttp.Run is a convenience that calls Build (with the provided registry) then RunWithRuntime.
  • test, vet, lint, and vuln-check entrypoints
  • QA scripts, optional git hooks, and a GitHub Actions workflow aligned with the sibling go-live-market-data-aggregator process (trimmed for this repo: no domain-specific custom vets)
  • deterministic IDs/timestamps in frontend encoders and ACP reference paths where reproducibility matters; B2BUA A/B leg ids are opaque random strings (a_/b_ + 32 hex chars) to avoid trivial enumeration; the standard server path injects a real wall clock and non-deterministic RNG for the executor (see internal/infra/runtimebundle)
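
The contiguous environment-key pattern described for cmd/lipstd above (OPENAI_API_KEY, then OPENAI_API_KEY_2 up to _32) can be sketched as follows. This is only an illustration: resolveKeys and maxKeys are hypothetical names, not the signature of pluginreg.ResolveUpstreamAPIKeysFromEnv, and the sketch assumes that "contiguous" means resolution stops at the first missing or empty variable.

```go
package main

import (
	"fmt"
	"os"
)

// maxKeys mirrors the "_32" upper bound mentioned in the README.
const maxKeys = 32

// resolveKeys is a hypothetical helper (not the real pluginreg API). It reads
// BASE, then BASE_2, BASE_3, ... and stops at the first missing or empty
// variable, so only a contiguous run of keys is returned.
func resolveKeys(base string, lookup func(string) (string, bool)) []string {
	var keys []string
	for i := 1; i <= maxKeys; i++ {
		name := base
		if i > 1 {
			name = fmt.Sprintf("%s_%d", base, i)
		}
		v, ok := lookup(name)
		if !ok || v == "" {
			break
		}
		keys = append(keys, v)
	}
	return keys
}

func main() {
	keys := resolveKeys("OPENAI_API_KEY", os.LookupEnv)
	fmt.Printf("resolved %d OpenAI key(s)\n", len(keys))
}
```

Passing the lookup function in, rather than calling os.LookupEnv directly, keeps the helper testable with a plain map.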

Security and trust boundaries

  • Auth audit events — Structured lip.auth.* logs can include operator-visible PII (principal display name, ids, session/client correlation fields). auth.event_delivery selects the default log sink, disabled, or custom (composition root must supply AuthEventSink). auth.event_failure_policy best_effort ignores sink errors; fail_closed fails the request when the auth/session audit sink errors while emitting an auth-decision or session-start event, trading availability for stricter observability guarantees.
  • Local API keys: auth.local_api_keys[].key values must be at least 16 Unicode code points after trimming (validated at config load). For access.mode: multi_user on a network-reachable bind, combine strong keys with reverse-proxy rate limiting or a WAF; the proxy does not implement per-client throttling on the auth middleware path.
  • Diagnostics — With diagnostics.enabled, routes other than health_path (attempts, inventory, route trace, pprof) can reveal routing and lineage metadata. Bind the listener to loopback or an admin-only network, and/or set diagnostics.shared_secret (at least 12 characters when set). Clients must then send header X-LIP-Diagnostics-Secret with that exact value.
  • Prometheus metrics — Set observability.metrics.enabled: true and optional observability.metrics.path (defaults to /metrics when omitted in config load). Scrapes expose bounded-label HTTP histograms/counters on method, status_class, and route_group (same coarse bucketing as OTEL span names; lip_http_request_duration_seconds uses buckets up to 120s for LLM tail latency), plus lip_executor_attempts_total, lip_executor_backend_open_seconds, lip_upstream_request_duration_seconds (when using the bundled upstream client), and Go/process collectors. Optional observability.metrics.exemplars_enabled: true enables OpenMetrics on the scrape handler and attaches trace_id exemplars to selected histograms when a span is present. When diagnostics.shared_secret is set, scrapers must send X-LIP-Diagnostics-Secret for the metrics route (same as attempts/pprof). The metrics path participates in diagnostic path uniqueness checks (must not overlap health_path, etc.).
  • OpenTelemetry tracing — Set observability.tracing.enabled: true and optional observability.tracing.service_name (otherwise OTEL_SERVICE_NAME or lipstd). Optional observability.tracing.sample_ratio (strictly between 0 and 1 when set) applies ParentBased(TraceIDRatioBased) for new root spans; omit or use 1 for SDK default sampling. Configure OTLP export with standard OTEL_* environment variables (OTEL_EXPORTER_OTLP_ENDPOINT, OTEL_EXPORTER_OTLP_PROTOCOL, trace sampler settings, etc.). Incoming HTTP requests are wrapped with otelhttp (coarse span names); the executor adds child spans (lip.executor.execute, lip.executor.backend_open). The outbound upstream http.Client propagates trace context when tracing is active, including when runtimebundle.BuildOptions.HTTPClient supplies a custom client (its transport is wrapped). With tracing enabled, JSON logs include trace_id and span_id when the slog record context carries an active span; diag.TraceID prefers explicit LIP correlation (WithCallDiag, X-Trace-ID / WithTraceID) and otherwise falls back to the W3C trace id from span context. Responses echo X-Trace-ID whenever a trace id is available on the request context.
  • Outbound HTTP proxy — The shared upstream client defaults to honoring HTTP_PROXY / HTTPS_PROXY. In environments where process environment is not trusted, set http_client.trust_environment_proxy: false so those variables are ignored. Optional http_client fields (max_idle_conns, max_idle_conns_per_host, idle_conn_timeout, response_header_timeout, client_timeout, dial/TLS timeouts) tune connection pooling for high concurrency; defaults match httpclient.DefaultTransportTune.
  • Bedrock cleartext: disable_https: true is only accepted when base_endpoint resolves to a loopback host, unless allow_insecure_non_loopback: true is set on the Bedrock backend row (explicit lab escape hatch).
  • SQLite path: continuity.sqlite_path must not contain NUL, ?, #, or & (avoids ambiguous file: DSN parsing).
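
Two of the config-load rules above (the 16-code-point minimum for local API keys and the forbidden characters in the SQLite path) are simple enough to sketch. The function names and error messages below are hypothetical and are not taken from the repository's config package; the sketch assumes "trimming" means removing surrounding whitespace.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
	"unicode/utf8"
)

// minLocalAPIKeyRunes is the minimum key length stated in the README.
const minLocalAPIKeyRunes = 16

// validateLocalAPIKey (hypothetical) counts Unicode code points, not bytes,
// after trimming surrounding whitespace.
func validateLocalAPIKey(key string) error {
	n := utf8.RuneCountInString(strings.TrimSpace(key))
	if n < minLocalAPIKeyRunes {
		return fmt.Errorf("local API key has %d code points, need at least %d", n, minLocalAPIKeyRunes)
	}
	return nil
}

// validateSQLitePath (hypothetical) rejects characters that would make a
// file: DSN ambiguous: NUL, '?', '#', and '&'.
func validateSQLitePath(path string) error {
	if strings.ContainsAny(path, "\x00?#&") {
		return errors.New("sqlite path must not contain NUL, '?', '#', or '&'")
	}
	return nil
}

func main() {
	fmt.Println(validateLocalAPIKey("  short-key  "))
	fmt.Println(validateSQLitePath("data/continuity.db?mode=ro"))
}
```

Counting runes rather than bytes matters for non-ASCII keys: a 15-character key written with two-byte characters is 30 bytes long but should still be rejected.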

Operations: container runtime and profiling

  • GOMEMLIMIT — In Kubernetes or other memory-capped environments, set GOMEMLIMIT to roughly 80–90% of the cgroup limit so the Go runtime can trigger GC before the OOM killer terminates the process (see the runtime/debug.SetMemoryLimit / GOMEMLIMIT documentation).
  • GOGC — The default target GC percentage is usually fine; change it only when profiling shows GC as a meaningful cost.
  • PGO — Optional: build with profile-guided optimization (e.g. go build -pgo=default.pgo after collecting a representative CPU profile) to improve hot paths in production binaries.
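  • GOMAXPROCS: From Go 1.25 onward, the runtime on Linux takes the cgroup CPU bandwidth limit into account when choosing the default GOMAXPROCS; older toolchains default to the host's logical CPU count. Override only when you have a deliberate reason (for example pinning to socket count on bare metal).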
  • CPU and heap profiles — For local investigation, run benchmarks or tests with go test -cpuprofile=cpu.prof ./path/... then go tool pprof cpu.prof, or heap profiles with -memprofile. When diagnostics.enabled is true, set diagnostics.pprof_path in YAML (for example /debug/pprof) to expose the standard library pprof index and endpoints under that prefix. Do not expose pprof on untrusted networks; put the listener behind localhost, a VPN, or an authenticated reverse proxy. Diagnostic paths (health_path, attempts_path, inventory_path, route_trace_path, pprof_path, and observability.metrics.path when metrics are enabled) must be unique after normalization and must not prefix-overlap (validated at config load).
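
The GOMEMLIMIT guidance above is normally applied through the environment variable, but the same soft limit can be set programmatically with runtime/debug.SetMemoryLimit. The sketch below only illustrates the 80 to 90 percent arithmetic; softLimit is a hypothetical helper, and the 2 GiB container limit is an assumed value rather than something read from the cgroup filesystem.

```go
package main

import (
	"fmt"
	"math"
	"runtime/debug"
)

// softLimit (hypothetical) returns the given fraction of the container memory
// limit in bytes. Invalid inputs yield math.MaxInt64, which effectively
// leaves the Go runtime without a soft limit.
func softLimit(cgroupLimitBytes int64, fraction float64) int64 {
	if cgroupLimitBytes <= 0 || fraction <= 0 || fraction > 1 {
		return math.MaxInt64
	}
	return int64(float64(cgroupLimitBytes) * fraction)
}

func main() {
	const containerLimit = 2 << 30 // assumed 2 GiB container limit
	limit := softLimit(containerLimit, 0.85)
	prev := debug.SetMemoryLimit(limit) // returns the previously set limit
	fmt.Printf("memory limit: %d bytes (previous %d)\n", limit, prev)
}
```

Setting GOMEMLIMIT in the pod or container spec achieves the same effect without code changes and is usually the simpler choice.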

Resource bounds (memory / DoS hardening)

  • lipapi.Call.Validate enforces maximum sizes on route selectors, IDs, messages/parts/tool counts, part payloads, extensions, and related option strings (see pkg/lipapi/limits.go). Oversized canonical requests fail validation before orchestration runs.
  • lipapi.Collect applies DefaultCollectLimits when aggregating streaming events into a single Collected struct. Use CollectWithLimits for custom caps or CollectUnbounded only for tests/harnesses that deliberately exceed defaults.
  • b2bua.MemoryStore applies a default maximum number of concurrent A-leg rows (DefaultMemoryStoreMaxLegsWithoutTTL, currently 100k) whenever MaxLegs / continuity.max_legs is unset (zero); set a positive value to override. Negative max_legs is rejected. TTL (when set) evicts idle rows by age; the max-legs cap still applies and evicts the least-recently-seen rows when the store is over capacity.
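
The max-legs behavior described above (evict the least-recently-seen rows once the store is over capacity) follows a familiar LRU shape. The sketch below is a simplified, non-concurrent illustration: legStore, newLegStore, and Touch are hypothetical names and do not reflect the actual b2bua.MemoryStore implementation, which also handles TTL.

```go
package main

import (
	"container/list"
	"fmt"
)

// legStore is a hypothetical, non-concurrent stand-in for a capped leg table.
type legStore struct {
	maxLegs int
	order   *list.List               // front = most recently seen
	byID    map[string]*list.Element // leg id -> list element holding the id
}

func newLegStore(maxLegs int) *legStore {
	return &legStore{maxLegs: maxLegs, order: list.New(), byID: make(map[string]*list.Element)}
}

// Touch records that id was just seen, inserting it if needed, then evicts
// least-recently-seen ids until the store is back within maxLegs.
func (s *legStore) Touch(id string) (evicted []string) {
	if el, ok := s.byID[id]; ok {
		s.order.MoveToFront(el)
	} else {
		s.byID[id] = s.order.PushFront(id)
	}
	for s.order.Len() > s.maxLegs {
		oldID := s.order.Remove(s.order.Back()).(string)
		delete(s.byID, oldID)
		evicted = append(evicted, oldID)
	}
	return evicted
}

func main() {
	s := newLegStore(2)
	s.Touch("a_1")
	s.Touch("a_2")
	s.Touch("a_1") // a_1 is now the most recently seen
	fmt.Println(s.Touch("a_3"))
}
```

In this run the final Touch pushes the store over its cap of two, so the row that was seen longest ago (a_2) is the one evicted.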

Routing defaults and continuity

  • Default route selector when clients omit X-LIP-Route is resolved by config.EffectiveDefaultRouteSelector from routing.default_route in YAML, then the first enabled backend plus registry default model ids (pluginreg.DefaultWireModel). Optional top-level model_aliases rewrites the full selector string (regexp pattern to replacement) before parsing; routing.default_route is expanded the same way during runtimebundle.Build. Backend rows use id as the runtime instance id; optional kind sets the bundled factory when you need multiple instances of the same adapter (kind: openai-responses, id: openai-primary). After config.LoadFile, call routing.ValidateModelAliasesConfig(cfg) so invalid alias rules fail at startup. See internal/core/config/effective_default_route.go and internal/core/routing/aliases.go.
  • SQLite continuity (continuity.store: sqlite and continuity.sqlite_path) persists A-leg rows and attempt lineage across process restarts (internal/core/continuity/sqlitestore, pure-Go driver modernc.org/sqlite). continuity.ttl / max_legs apply only to the in-memory store; combining them with SQLite is rejected at config load until durable pruning exists.
  • Hook bus: root hooks.tool_reactor_error_policy selects fail_open (default), fail_closed, or swallow_event for tool-reactor errors. Optional reference feature plugins are documented in internal/plugins/features/REFERENCE_PLUGINS.md.
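
The model_aliases rewrite described above (regexp pattern to replacement, applied to the full selector string before parsing) might look roughly like the sketch below. The types and functions are hypothetical, the selector strings are made up for illustration, and first-match-wins ordering is an assumption; the actual rules live in internal/core/routing/aliases.go.

```go
package main

import (
	"fmt"
	"regexp"
)

// aliasRule is a hypothetical shape for one model_aliases entry.
type aliasRule struct {
	pattern     *regexp.Regexp
	replacement string
}

// compileAliases compiles every pattern up front so an invalid regexp is
// reported at startup rather than on the first request.
func compileAliases(pairs [][2]string) ([]aliasRule, error) {
	rules := make([]aliasRule, 0, len(pairs))
	for _, p := range pairs {
		re, err := regexp.Compile(p[0])
		if err != nil {
			return nil, fmt.Errorf("model alias %q: %w", p[0], err)
		}
		rules = append(rules, aliasRule{pattern: re, replacement: p[1]})
	}
	return rules, nil
}

// rewriteSelector applies the first matching rule (an assumption about
// ordering) and returns the selector unchanged when nothing matches.
func rewriteSelector(rules []aliasRule, selector string) string {
	for _, r := range rules {
		if r.pattern.MatchString(selector) {
			return r.pattern.ReplaceAllString(selector, r.replacement)
		}
	}
	return selector
}

func main() {
	// Made-up alias: "cheap-<model>" is routed to a backend named "backend-a".
	rules, err := compileAliases([][2]string{{`^cheap-(.+)$`, "backend-a:${1}"}})
	if err != nil {
		panic(err)
	}
	fmt.Println(rewriteSelector(rules, "cheap-mini"))
}
```

Compiling at load time mirrors the README's advice to call routing.ValidateModelAliasesConfig after config.LoadFile so that bad rules fail early.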

QA and local workflow

Fast checks (format, go mod tidy drift, build, vet, architecture guardrails in internal/archtest, plus go mod verify in CI or when LIP_VERIFY_MODULE_CACHE=1) plus staged-package tests mirror the sibling repo’s quality-checks / test-staged pattern. See docs/architecture-guardrails.md.

make quality-checks   # gofmt -l, tidy+diff guard, go build, go vet; add LIP_VERIFY_MODULE_CACHE=1 to include mod verify locally
make test             # quality-checks + go test -parallel=8 ./... (excludes precommit-tagged tests)
make test-precommit-extra  # hygiene + executor matrices (-tags=precommit; included in make qa + CI)
make test-fast        # quality-checks + tests for staged packages (or all if none staged)
make test-race        # skipped on Windows; on Linux/macOS runs scripts/race-check.sh (CI: strict race on Linux)
make test-fuzz        # short fuzz smoke on internal/testkit (override: FUZZTIME=30s make test-fuzz)
make bench            # benchmarks: testkit, core stream/runtime/routing/diag, frontend streaming encoders (see docs/performance-checks.md)
make qa               # quality-checks + unit tests + golangci-lint + govulncheck (via `go tool`, see go.mod)
make hooks-install    # set core.hooksPath to .githooks (secret scan on staged files, then quality-gate when .go is staged)

Pre-commit runs scripts/check-staged-secrets first (gitleaks when installed, otherwise high-signal git grep patterns; allowlist via .gitleaks.toml and scripts/secret-scan-allowlist.txt). When staged .go files exist, it then runs scripts/quality-gate (quality checks, staged tests, staged race scan on Linux/macOS only — skipped on Windows, golangci-lint if present, go tool govulncheck).

CI (.github/workflows/qa.yml) runs make quality-checks, unit tests, strict race on Linux, golangci-lint-action, and go tool govulncheck.

Linter config lives in .golangci.yml (golangci-lint v2 schema: staticcheck, govet, revive, the modernize pass, gofumpt as a formatter, and small correctness linters). Install a v2.x binary locally so make lint matches CI.

SDK and HTTP mount API (breaking)

Repository layout

  • cmd/lipstd/ - standard distribution entrypoint (registry + runtimebundle + stdhttp)
  • internal/pluginreg/ - standard bundle registration (register_standard.go, *_install.go) and registry helpers; mandatory bundled ids are defined in pkg/lipsdk
  • pkg/lipapi/ - canonical public contracts
  • pkg/lipsdk/ - stable plugin SDK contracts (including pkg/lipsdk/request for request-wide transforms, pkg/lipsdk/toolcatalog for tool catalog filters, and pkg/lipsdk/completion for whole-completion gates; these merge through feature.FeatureBundle into the runtime snapshot)
  • internal/core/ - runtime, routing, stream, config, admin, capabilities
  • internal/plugins/ - bundled frontend, backend, and feature plugins
  • internal/stdhttp/ - standard distribution HTTP wiring (mount + RunWithRuntime)
  • internal/infra/runtimebundle/ - assembles executor, continuity, shared upstream HTTP, health/observer seams
  • internal/infra/logging/ - YAML-driven slog root logger (samber slog-multi / slog-formatter pipeline)
  • internal/infra/ - shared infrastructure seams
  • internal/testkit/ - test support surface scaffold
  • internal/qa/ - repo hygiene tests (//go:build precommit; run via make test-precommit-extra or pre-commit hook, not default make test)
  • internal/archtest/ - architecture guardrail tests (budgets, forbidden patterns)
  • internal/refbackend/ - spec-shaped HTTP emulator servers for tests (*_test.go imports only)
  • internal/refclient/ - official-SDK reference clients for conformance/matrix tests
  • internal/plugins/stores/ - bundled persistence / continuity store plugins (intentional seam alongside backends)
  • scripts/ - quality gate scripts (bash + PowerShell)
  • .githooks/ - optional git hooks
  • .github/workflows/ - CI QA pipeline
  • testdata/ - fixtures and goldens
  • .kiro/ - steering and spec artifacts

Bootstrap commands

make test
make vet
go run ./cmd/lipstd --config ./config/config.yaml

Use config/config.yaml as the default sample; optional access / auth blocks are spelled out in comments at the top of that file before the active server and logging keys.

Install golangci-lint for the full make qa profile; govulncheck is invoked as go tool govulncheck (version pinned via the tool line and golang.org/x/vuln in go.mod).
