Skip to content

feat(logging): Add client_id + service to more logs, expand valid client_ids to log#20233

Draft
LZoog wants to merge 1 commit intomainfrom
FXA-13313
Draft

feat(logging): Add client_id + service to more logs, expand valid client_ids to log#20233
LZoog wants to merge 1 commit intomainfrom
FXA-13313

Conversation

@LZoog
Copy link
Contributor

@LZoog LZoog commented Mar 20, 2026

Because:

  • We need client_id and service as statsd tags on security event metrics
  • We want this logging (and Sentry tags) on all valid client_ids

This commit:

  • Adds client_id/service tags (defaulting to 'none') to the recordSecurityEvent statsd metric
  • Loads client IDs from the fxa_oauth.clients table and passes them to resolveClientTags

closes FXA-13313

follow up to #20213

--

Draft because I have not ran all these tests locally.

You can test this locally by querying the security events table. Try logging in/signing up etc. with an RP, through Sync, through Desktop Relay etc. and check the table entries additionalInfo.

Note: It looks like Firefox always sends us service=sync when they hit session/destroy despite being logged into a different service. That is reflected in the security events table in additionalInfo as service: sync. Not sure that's even worth filing an issue for on their side because if the user enables Sync later which would we use?

// Our OAUTH_CLIENT_IDS env var does not contain all clients.
let registeredClientIds;
try {
const clients = await oauthDb.mysql.getAllClientIds();
Copy link
Contributor Author

@LZoog LZoog Mar 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like that we're having to do this so I'm open to other suggestions 😉

Maybe we don't care about this here because going through our flows should error the user out if they have an invalid client ID in the URL. But since we had a similar issue come up before that resulted in us just removing the tag and because we are now logging this on a ton of statsd events (see the previous PR), I included it for consideration.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I appreciate you added it. We do need to restrict to only valid clients.

…ent_ids to log

Because:
* We need client_id and service as statsd tags on security event metrics
* We want this logging (and Sentry tags) on all valid client_ids

This commit:
* Adds client_id/service tags (defaulting to 'none') to the recordSecurityEvent statsd metric
* Loads client IDs from the fxa_oauth.clients table and passes them to resolveClientTags

closes FXA-13313
// Our OAUTH_CLIENT_IDS env var does not contain all clients.
let registeredClientIds;
try {
const clients = await oauthDb.mysql.getAllClientIds();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do change these from time to time. It's probably okay to just pull this on startup, but we'd need to document that to pickup the new client IDs, we'd need to roll the servers.

Other options would be:

  1. Just query the DB. Since this table rarely changes, the db will cache it anyways and it'll be a quick call.
  2. Query it, but cache it in redis, and query redis. Kinda complicated, but lets us have a ttl.
  3. Poor man's cache, we we just use a setInterval to refresh the values. In this case registeredClientIds would become a callback function that'd just fetch the current state.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh, so I wanted to do 1 initially but I wasn't sure if it'd be kept in memory or would cause some other problem. Thanks for confirming it'll just be cached and it should be fine to do the lookup on every call, I will change the approach 👍

const clients = await oauthDb.mysql.getAllClientIds();
registeredClientIds = new Set(clients.map((c) => c.id.toString('hex')));
} catch (err) {
log.error('server.registeredClientIds', { err });
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd capture a sentry errror.

};

const validClientIds = new Set<string>(Object.values(OAuthNativeClients));
const nativeClientIds = new Set<string>(Object.values(OAuthNativeClients));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not for this PR, but just mentioning it. It'd be really nice if these lists were data driven instead of hard coded. With RP management in the admin panel, this would be easy todo.

try {
await db.securityEvent(eventData);
this.statsd.increment(`accountEvents.recordSecurityEvent.write.${name}`);
this.statsd.increment(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a sad reality, but all the JS code that calls this won't typecheck if the name is actually a valid security event name. I've been bit this before. It's possible we want to safegaurd this with a runtime check of some sort. I guess it depends on how paranoid we want to be about blowing up metrics labels...

}
}
} catch {
// If metricsContext resolution fails, just return undefined for both
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a stickler for empty catch blocks. Is there a reason metricsContext resolution would fail? If it's just hedging against an unexpected state, we could record a statsd metric or capture a sentry message. If it's actually an expected thing, then maybe its okay.

Copy link
Contributor

@dschom dschom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

R+WC - Just the thing about making sure client ids are stay reasonably up to date. We don't change RPs very often, but rolling servers when we do would be annoying and possibly easy to overlook.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants