banditcallback: emit per-device first-seen to lantern-cloud #668
Merged
The lantern-cloud bandit catalog now selects http-proxy arms for legacy clients via the unified action space, but those arms had no EXP3 reward signal: clients on the lantern-http-proxy backend don't make the callback at URL-test completion the way sing-box clients do natively. Without a signal, EXP3 weights for http-proxy arms stayed flat at their cold-start prior (explored via gamma but never reinforced).

This change has the http-proxy daemon emit the callback on behalf of those clients. On the first request from a device-id within a TTL window, the new banditcallback.Emitter fires an async best-effort GET to the API's /v1/bandit/callback?token=<arm-token>&did=<device_id> endpoint. The arm-token is plumbed at provisioning via two new INI keys (banditcallbacktoken / banditcallbackurl); the API discriminates arm-tokens from per-probe tokens by the `arm-` prefix and writes a flat-reward update straight to EXP3 + per-route signals.

A device's first connection within the TTL window triggers one emit; subsequent connections from the same device are suppressed in-memory until the window rolls over. Map sweep is opportunistic on the same TTL cadence, so worst-case memory is ~2× unique devices per window without a dedicated reaper goroutine. The API does its own SET-NX-based dedup as defense against a restarted or multi-replica daemon losing its in-memory map and re-firing within the window.

The filter installs after tokenfilter (auth) but before devicefilter (throttling): devicefilter skips when ReportingRedisClient is nil for pro proxies, but the bandit still wants signal for pro arms. It runs as OnFirstOnly because we only need the device-id header once per connection. It is a no-op when Token/URL are empty, so non-bandit-eligible tracks carrying the same binary stay silent.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Contributor
Pull request overview
Adds a new bandit callback mechanism to emit a best-effort, per-device “first seen” signal to lantern-cloud so EXP3 arm weights can be reinforced by real proxy traffic (especially for legacy clients that can’t emit the callback themselves).
Changes:
- Introduces `banditcallback` package with an `Emitter` (TTL-based in-memory dedup) and a proxy `Filter` that triggers async callbacks.
- Adds CLI/INI configuration (`banditcallbacktoken`, `banditcallbackurl`, `banditcallbackttl`) and plumbs these into the proxy.
- Inserts the callback filter into the proxy filter chain (auth-adjacent placement) and adds unit tests for dedup + concurrency behavior.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| http-proxy/main.go | Adds CLI/INI flags and passes bandit-callback settings into the proxy configuration. |
| http_proxy.go | Stores bandit callback config on Proxy, constructs an emitter, and conditionally appends the filter to the chain. |
| banditcallback/banditcallback.go | Implements the TTL-deduped emitter and filter that emits the callback asynchronously. |
| banditcallback/banditcallback_test.go | Adds unit tests validating disabled behavior, dedup, concurrency, and TTL re-emission. |
Address Copilot review feedback on #668: the two comments around the banditcallback filter implied the filter "installs but stays silent" when token/URL are empty, but createFilterChain actually skips the append entirely when the emitter is disabled. Update both comments to match the real behavior:

- banditcallback.New is still cheap to call with empty inputs (the emitter's Enabled() reports false), but the filter is never installed in that case, so there is zero per-request work on non-bandit-eligible builds.
- Note the benchmark-mode caveat (no tokenfilter), since the "after auth" placement only holds in the production path.

No behavior change; comments only.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
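The conditional install described in this commit can be sketched roughly as follows. This is an illustration under assumptions, not the code in http_proxy.go: the `emitter` type, the string-named filters, the `benchmarkMode` flag, and the rule that both token and URL must be non-empty are all hypothetical.

```go
package main

import "fmt"

// emitter is a hypothetical stand-in for banditcallback.Emitter.
type emitter struct{ token, url string }

// newEmitter is cheap to call even with empty inputs.
func newEmitter(token, url string) *emitter {
	return &emitter{token: token, url: url}
}

// Enabled assumes both token and URL are required.
func (e *emitter) Enabled() bool { return e.token != "" && e.url != "" }

// createFilterChain returns filter names in install order.
// Filters are plain strings here to keep the sketch short.
func createFilterChain(e *emitter, benchmarkMode bool) []string {
	var chain []string
	if !benchmarkMode {
		// Assumption: auth is only present in the production path.
		chain = append(chain, "tokenfilter")
	}
	if e.Enabled() {
		// Skipped entirely when disabled, so no per-request work.
		chain = append(chain, "banditcallback")
	}
	chain = append(chain, "devicefilter")
	return chain
}

func main() {
	fmt.Println(createFilterChain(newEmitter("arm-abc", "https://api.example.com/cb"), false))
	// [tokenfilter banditcallback devicefilter]
	fmt.Println(createFilterChain(newEmitter("", ""), false))
	// [tokenfilter devicefilter]
}
```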
Summary
Adds a per-device first-seen callback emitter so the bandit gets EXP3 reward signal for http-proxy arms. Without it, legacy clients can't fire the callback themselves (sing-box clients do at URL-test completion natively), so http-proxy arm weights stayed flat at their cold-start prior — explored via gamma but never reinforced by traffic.
What's new
- `banditcallback` package: `Emitter` with TTL-bounded in-memory dedup keyed on device-id, async best-effort GET to the configured URL on first-seen, opportunistic map sweep (no reaper goroutine). `Filter` wraps the emitter as a proxy filter.
- `banditcallbacktoken`, `banditcallbackurl`, `banditcallbackttl`. Plumbed by the lantern-cloud provisioner. Empty token → emitter `Enabled()` is false → the filter is not installed, so there is no per-request work.
- Installed after `tokenfilter` (auth-gated, no firing on unauthenticated noise) but before `devicefilter` (which skips for pro proxies; we still want signal for pro arms).

API side
lantern-cloud PR https://github.com/getlantern/lantern-cloud/pull/{TBD} adds the matching `/v1/bandit/callback` arm-token handler and the provisioner plumbing.

Test plan
- `go test ./banditcallback/`: unit tests for disabled-when-unconfigured, first-seen fires, dedup suppresses repeats, concurrent first-seen is single-fire, re-emit after TTL
- `go vet ./banditcallback/ ./common/ ./devicefilter/`
- `cmd/config-test` + SigNoz: confirm `arm callback received` log lines, EXP3 weight movement for `r13:t13` arm

🤖 Generated with Claude Code
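For reference, the callback request and the `arm-` prefix check mentioned in this description could look something like the sketch below. The helper names (`buildCallbackURL`, `isArmToken`, `emitAsync`), the example host, and the timeout handling are assumptions for illustration; only the `token` and `did` query parameters and the `arm-` prefix come from the PR text.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"net/url"
	"strings"
	"time"
)

// buildCallbackURL appends token and did to the configured callback URL.
func buildCallbackURL(base, token, deviceID string) (string, error) {
	u, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	q := u.Query()
	q.Set("token", token)
	q.Set("did", deviceID)
	u.RawQuery = q.Encode() // Encode sorts keys, so did precedes token
	return u.String(), nil
}

// isArmToken mirrors the API-side discrimination by the arm- prefix.
func isArmToken(token string) bool {
	return strings.HasPrefix(token, "arm-")
}

// emitAsync fires a best-effort GET in its own goroutine and drops errors,
// so the proxied request is never blocked on the callback.
func emitAsync(client *http.Client, target string, timeout time.Duration) {
	go func() {
		ctx, cancel := context.WithTimeout(context.Background(), timeout)
		defer cancel()
		req, err := http.NewRequestWithContext(ctx, http.MethodGet, target, nil)
		if err != nil {
			return
		}
		resp, err := client.Do(req)
		if err != nil {
			return // best effort: no retry
		}
		resp.Body.Close()
	}()
}

func main() {
	// The host below is a placeholder, not the real API endpoint.
	target, err := buildCallbackURL("https://api.example.com/v1/bandit/callback", "arm-abc", "dev-1")
	if err != nil {
		panic(err)
	}
	fmt.Println(target)
	// https://api.example.com/v1/bandit/callback?did=dev-1&token=arm-abc
	fmt.Println(isArmToken("arm-abc"), isArmToken("probe-xyz")) // true false
	// emitAsync(http.DefaultClient, target, 5*time.Second) would send the request.
}
```

Because the GET is fire-and-forget, a lost callback is simply not retried; the next TTL window gives the device another chance to emit.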