osctrl: defensive hardening round — login timing, rate-limit XFF bypass, JWT rotation, info-disclosure closes#826
Merged
Conversation
Go's ServeMux uses '/' as a wildcard catch-all for any GET request
the mux doesn't have a more-specific pattern for. With RootHandler
returning 200 unconditionally, typos like 'GET /api/v1/totally-fake'
would silently succeed — confusing for clients debugging an
integration and a weak fingerprint signal for scanners
('endpoint X returned 200 → must exist').
Tighten the contract: respond 200 ONLY when r.URL.Path == '/'.
Otherwise fall out to http.NotFound. Doesn't leak endpoint
structure beyond what's already in the public OpenAPI; just stops
the api from silently claiming success on misrouted requests.
…tials
Pentest finding: POST /api/v1/login takes ~300 ms for a valid
username with the wrong password but only ~15-25 ms for an unknown
username, a 10-15x wall-clock differential that lets an
unauthenticated attacker enumerate valid usernames by timing the
response.
Root cause: CheckLoginCredentials short-circuits when m.Get(username)
returns gorm.ErrRecordNotFound and never runs
bcrypt.CompareHashAndPassword in that branch.
Fix: precompute a dummy bcrypt hash at the current BcryptCost
during package init and, on the DB-miss branch, run a discard
CompareHashAndPassword against it. Both branches now spend the
same ~200 ms of bcrypt-cost-12 work; the result is discarded and
the caller still receives (false, AdminUser{}) for non-existent
users.
dummyHash will be nil only if bcrypt init failed (platform crypto
broken); in that case the comparison is skipped and the timing
leak is accepted — but the authentication subsystem is already
unusable in that scenario.
… bypass) Pentest finding: an attacker behind an edge proxy that *appends* to X-Forwarded-For (nginx default, ELB, Cloudflare with default settings) could rotate the header value per request and cycle rate-limit buckets, defeating the limiter that protects /login and the preAuth surfaces. Root cause: ratelimit.KeyByIP returned utils.GetIP(r), which when --trusted-proxies is configured walks the XFF chain right-to-left and returns the first untrusted hop. A typical edge appends the client-supplied XFF on the way in; the right-most-untrusted hop is then the attacker-controlled value. Fix: introduce utils.RemoteIP that returns the direct TCP peer's IP, ignoring every forwarding header even when --trusted-proxies is configured, and switch ratelimit.KeyByIP to use it. utils.GetIP remains the source of truth for audit-log 'real client IP' fields where the operator wants to see past the proxy; the rate-limit key trades that visibility for unspoofability.
Both endpoints accepted anonymous GET and returned osctrl-internal data — the SQL-template starter pack and the carve target list. Two problems: 1. Either response uniquely fingerprints the deployment as osctrl, helping a scanner target version-keyed CVEs rather than guessing what's running. 2. /carves/samples specifically returns the canonical exfiltration shopping list — /etc/passwd, /etc/shadow, \Windows\System32 \config\SAM, browser-keychain paths, shell history. An attacker who reaches the network surface gets a free hint about what data the operator considers carve-worthy. The pre-auth posture was justified on 'the data is static, ships with the binary, the login page can lazy-load it.' Tracing actual consumers shows that's not happening — QuickTemplates lives on /queries/new and the carve-targets picker on /carves/new, both post-login. The pre-auth exposure was a wrong design choice with no consumer benefit. Fix: move both routes inside the existing flagParams.Osquery.Query and flagParams.Osquery.Carve blocks where they're guarded by handlerAuthCheck. The endpoints themselves are unchanged; only the route registration moves.
Two of the four bundled nginx configs (the legacy admin server block and the SPA frontend's dev config) didn't have server_tokens set, so they emit 'Server: nginx/1.27.x' — a free fingerprint of the deployed build for vuln scanners like Shodan, Censys, and nuclei templates that index by exact version string. The other two configs (deploy/nginx/nginx.conf and deploy/docker/conf/nginx/nginx.conf) already have it set; this brings the two stragglers in line. server_tokens off doesn't hide that the server is nginx — there's no portable way to do that in OSS nginx — but it stops version-keyed CVE matching cold.
GET /static/ on the admin returned an HTML directory listing of css/, fonts/, img/, js/. Source was Go's http.FileServer, which enables autoindex by default. Fix: wrap the FileServer in a noDirListing middleware that 404s any request whose URL path ends with '/'. Legitimate file requests (/static/css/custom.css etc.) pass through unchanged. 404 instead of 403 because the listing was the only signal that /static/* exists at all — surfacing a 403 would confirm structure that 404 hides. Verified: GET /static/ -> 404 (was 200 with listing) GET /static/css/ -> 404 (was 200 with listing) GET /static/css/<real> -> 200 (unchanged)
CreateToken's claims were deterministic: Username + Issuer + ExpiresAt (at 1s resolution). HMAC-SHA256 is deterministic for the same key + payload, so two CreateToken calls for the same user in the same second returned identical JWT strings. That silently broke any caller that depends on token rotation as a revocation primitive. The auth middleware in cmd/api/auth.go compares every presented JWT against the stored AdminUser.APIToken (constant- time) — so logging in is supposed to invalidate the previous session by overwriting the stored token to a new value. But if the new "minted" token is bitwise identical to the old one, UpdateToken is a no-op, the stored value doesn't change, and any previously-issued copy (e.g. a JWT exfiltrated from the user's previous device) keeps validating against the unchanged stored value. Stamp a random 16-byte hex jti on every issuance. Even back-to-back calls in the same nanosecond now produce distinct claims → distinct signatures → distinct JWT strings. UpdateToken's overwrite now actually rotates. Regression tests pin both halves: (a) two CreateToken calls return different strings, (b) the parsed claims carry distinct jti values. Verified by neutralising the ID claim — both tests fail loudly.
LoginHandler had a 60s-freshness branch: if the user's stored APIToken had >60s of life left, login returned it as-is instead of minting fresh. The original intent was to avoid handing out a token that would fail mid-request. The side effect was that a second login from a different device (or from the same user after a credential rotation) got the SAME JWT — leaving the previous device's copy valid indefinitely. Drop the reuse branch. Always CreateToken + UpdateToken on successful login. With the jti claim from the previous commit, the new token is provably distinct from any previously-issued token for this user, so UpdateToken actually rotates the stored value. The auth middleware in cmd/api/auth.go compares every presented JWT against the stored APIToken (constant-time), so once UpdateToken overwrites, any previously-issued copy fails 401 on its next request — even though the old JWT is still cryptographically valid against the secret. The DB-row APIToken check IS the revocation primitive; this commit just makes login actually exercise it. Server-side /logout is intentionally NOT in this PR: the richer logout flow (clearing cookies + APIToken + redirecting to IdP end-session URL for federated users) is part of the OIDC PR, which introduces the endpoint with its full behaviour rather than requiring a follow-up rewrite.
javuto
approved these changes
May 18, 2026
Collaborator
javuto
left a comment
There was a problem hiding this comment.
Nice! Thanks for the PR!
4 tasks
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
osctrl: defensive hardening round — login timing, rate-limit XFF bypass, JWT rotation, info-disclosure closes
This is the first of a multi-PR series. Each PR after this one builds
on its predecessor and will go up separately once you've had a chance
to look at this one. This PR is intentionally the smallest of the
series — pure defensive hardening, no new endpoints, no behaviour
change for legitimate users.
Eight discrete fixes, surfaced during the pentest passes I ran against
the dev stack. Numbered below in the order I'd suggest reading the
diff.
1.
pkg/users: close username-enumeration timing leakCheckLoginCredentialsreturned in ~15 ms when the user didn'texist (DB miss → no bcrypt) vs ~300 ms when the user existed but
the password was wrong (DB hit → bcrypt cost-12 compare). A trivial
remote distinguisher for "valid username, wrong password" vs
"unknown username."
Fix: add a package-init
dummyHashbcrypt-hashed at the same cost,and run
bcrypt.CompareHashAndPassword(dummyHash, …)on the DB-misspath. Result is discarded; we still return
false. The compare'ssole purpose is to burn the same wall-clock as a real compare.
Regression test in
pkg/users/users_test.go:TestCheckLoginCredentials_TimingEqualizationasserts ratio < 2.0×.Measured locally: 1.00×. Skipped under
-shortbecause bcryptcost-12 burns ~300ms per iteration.
A second regression test
TestCheckLoginCredentials_UnknownUserStillReturnsFalsepins that the dummyHash compare doesn't accidentally turn an unknown
user into a successful login — the compare result must be discarded.
2.
pkg/ratelimit: close X-Forwarded-For rotation bypassKeyByIPkeyed the token bucket onX-Forwarded-For(orX-Real-IP). When osctrl-api sits behind nginx that's the rightkey — but the same code path runs when osctrl-api is directly
exposed (or when nginx isn't stripping client-supplied
X-Forwarded-For).Result: an attacker rotates the header on each request and the
bucket never fills.
Fix: switch
KeyByIPto use a newutils.RemoteIP(r)helper thatreturns the TCP peer address from
r.RemoteAddr. The trusted-proxycase is handled at the proxy layer (nginx replaces XFF rather than
appending), not in our rate-limiter.
Regression tests:
TestKeyByIPIgnoresForwardingHeaderscovers 5sub-cases (no headers, single XFF, chained XFF, X-Real-IP, both).
TestMiddlewareXFFRotationDoesNotBypassis end-to-end: burst=2 then5 rotated-XFF attempts all 429.
3.
osctrl-api: require auth on/queries/samples+/carves/samplesBoth endpoints returned a sample-template library
(
SELECT * FROM osquery_info LIMIT 5-style) without any auth check.Two issues:
osctrl-api version (each release ships a different starter pack),
letting a scanner version-pin without authenticating.
arguments (
'/etc/shadow','/var/log/secure') that nameprivileged read targets, useful for someone scoping a future
CarveLevel-credential phishing campaign.
Fix: move the route registrations from the pre-auth block to the
authenticated block in
cmd/api/main.go. The handler functionsthemselves were unchanged.
4.
osctrl-api:RootHandler404s on misrouted GETsGo's
ServeMuxuses/as a wildcard catch-all for any GET requestthe mux doesn't have a more-specific pattern for. With
RootHandlerreturning 200 unconditionally, typos like
GET /api/v1/totally-fakesilently succeeded — confusing for clients debugging an integration
and a weak fingerprint signal for scanners ("endpoint X returned
200 → must exist").
Tighten the contract: respond 200 ONLY when
r.URL.Path == "/".Otherwise fall through to
http.NotFound.5.
deploy/nginx: strip nginx version from Server headerWithout
server_tokens off, nginx emitsServer: nginx/1.27.x—a free fingerprint for vuln scanners (Shodan, nuclei) that match
known CVEs by version.
server_tokens offmakes it justServer: nginx. Doesn't hide that this IS nginx (no portable wayin OSS nginx) but stops version-keyed CVE matching.
Applied to both
deploy/docker/conf/nginx/osctrl.confand thedev-stack
frontend-dev.conf.6.
osctrl-admin: disable autoindex on/static/Go's
http.FileServerautoindexed/static/—GET /static/returned an HTML directory listing. Useful recon for a scanner
mapping the deployed JS / CSS / icon set.
Fix: wrap
http.FileServerin anoDirListingmiddleware that404s any path ending in
/. Static asset serving for real files(
/static/js/foo.js, etc.) is unaffected.7.
pkg/users: stamp randomjtion every JWTFoundation for the rotation fix in (8).
CreateToken's claimswere deterministic:
Username+Issuer+ExpiresAt(at 1sresolution). HMAC-SHA256 is deterministic for the same key +
payload, so two
CreateTokencalls for the same user in the samesecond returned identical JWT strings.
That silently broke any caller depending on token rotation as a
revocation primitive. The auth middleware in
cmd/api/auth.gocompares every presented JWT against the stored
AdminUser.APIToken(constant-time) — so logging in is supposed to invalidate the
previous session by overwriting the stored token. But if the new
"minted" token is bitwise identical to the old one,
UpdateTokenis a no-op, the stored value doesn't change, and any previously-
issued copy keeps validating.
Stamp a random 16-byte hex
jti(RFC 7519 §4.1.7) on every issuanceso claims are guaranteed distinct.
Regression tests in
pkg/users/users_test.go:TestCreateTokenIsNonDeterministic— twoCreateTokencallsreturn different strings.
TestCreateTokenStampsJTI— the parsed claims carry distinctnon-empty
jtivalues.Both fail loudly if the
jticlaim is removed (verified locally).8.
osctrl-api: rotate JWT on every loginLoginHandlerhad a 60s-freshness branch: if the user's storedAPITokenhad >60s of life left, login returned it as-is insteadof minting fresh. The original intent was to avoid handing out a
token that would fail mid-request. The side effect was that a
second login from a different device got the SAME JWT — leaving
the previous device's copy valid until natural expiry.
Drop the reuse branch. Always
CreateToken+UpdateTokenonsuccessful login. With the
jticlaim from (7), the new token isprovably distinct, so
UpdateTokenactually rotates the storedvalue. The auth middleware's constant-time compare in
cmd/api/auth.gothen fails the old JWT 401 on its next request —even though the old JWT is still cryptographically valid against
the secret.
The DB-row
APITokencheck IS the revocation primitive; thiscommit just makes login exercise it correctly.
Note: a server-side
/logoutendpoint that explicitly invalidatesthe stored token is part of the next PR in the series (OIDC support
on osctrl-api). The richer logout flow there returns
idp_logout_urlfor federated sessions, which makes more sense tointroduce alongside OIDC than as a half-feature here.
Verified
go build ./...clean.go vet ./...clean.go test ./...green across all packages onpr/security-hardening.bad-user0.21s median vsunknown-user0.21smedian, ratio 1.01×.
X-Forwarded-Forheaders fromthe same TCP peer all returned 429 starting at request A lot of changes everywhere #11.
GET /api/v1/queries/samplesandGET /api/v1/carves/samplesunauthenticated → 401.
GET /api/v1/totally-fake-path→ 404.curl -I http://host:8088/→Server: nginxwith no version.GET /static/on osctrl-admin → 404.jticlaim.APITokeninadmin_users; the previous JWT then fails 401 on/checks-authdespite still being signature-valid.Test plan
/api/v1/loginwith a known-bad user and thenan unknown user; medians should be within 2×.
curl -I http://host/— Server header should notinclude a version.
curl http://admin-host/static/→ 404.Follow-ups (separate PRs, will not open until you signal)
Five more PRs are prepared on my fork and waiting:
pr/auth-shared-package) — pure refactor lifting theexisting OIDC code in
cmd/admin/oidc.gointo a reusablepkg/auth/+pkg/auth/oidc/package. Legacy admin behaviourpreserved.
pr/oidc-api-spa) — OIDC support on osctrl-api + theReact SPA. Introduces
/api/v1/logout,/auth/methods,/auth/oidc/login,/auth/oidc/callback.pr/saml-api-spa) — SAML 2.0 provider for osctrl-api +SPA. 41/41 pentest probes pass against a Keycloak IdP.
pr/spa-users-page) — SPA/userspage reaches paritywith the legacy admin (Add User / Delete User / Reset Password,
env dropdown, bulk grant).
pr/spa-ux-polish) — TargetSelector + carves-new layoutfrom-cookie boot priming.
Happy to open them in sequence as you merge each predecessor, or
file all six at once with stacked-PR base branches set on my
fork — your preference.