Conversation
| // Our OAUTH_CLIENT_IDS env var does not contain all clients. | ||
| let registeredClientIds; | ||
| try { | ||
| const clients = await oauthDb.mysql.getAllClientIds(); |
There was a problem hiding this comment.
I don't like that we're having to do this so I'm open to other suggestions 😉
Maybe we don't care about this here because going through our flows should error the user out if they have an invalid client ID in the URL. But since we had a similar issue come up before that resulted in us just removing the tag and because we are now logging this on a ton of statsd events (see the previous PR), I included it for consideration.
There was a problem hiding this comment.
I appreciate you added it. We do need to restrict to only valid clients.
…ent_ids to log Because: * We need client_id and service as statsd tags on security event metrics * We want this logging (and Sentry tags) on all valid client_ids This commit: * Adds client_id/service tags (defaulting to 'none') to the recordSecurityEvent statsd metric * Loads client IDs from the fxa_oauth.clients table and passes them to resolveClientTags closes FXA-13313
| // Our OAUTH_CLIENT_IDS env var does not contain all clients. | ||
| let registeredClientIds; | ||
| try { | ||
| const clients = await oauthDb.mysql.getAllClientIds(); |
There was a problem hiding this comment.
We do change these from time to time. It's probably okay to just pull this on startup, but we'd need to document that to pickup the new client IDs, we'd need to roll the servers.
Other options would be:
- Just query the DB. Since this table rarely changes, the db will cache it anyways and it'll be a quick call.
- Query it, but cache it in redis, and query redis. Kinda complicated, but lets us have a ttl.
- Poor man's cache, we we just use a
setIntervalto refresh the values. In this caseregisteredClientIdswould become a callback function that'd just fetch the current state.
There was a problem hiding this comment.
Ahh, so I wanted to do 1 initially but I wasn't sure if it'd be kept in memory or would cause some other problem. Thanks for confirming it'll just be cached and it should be fine to do the lookup on every call, I will change the approach 👍
| const clients = await oauthDb.mysql.getAllClientIds(); | ||
| registeredClientIds = new Set(clients.map((c) => c.id.toString('hex'))); | ||
| } catch (err) { | ||
| log.error('server.registeredClientIds', { err }); |
There was a problem hiding this comment.
I'd capture a sentry errror.
| }; | ||
|
|
||
| const validClientIds = new Set<string>(Object.values(OAuthNativeClients)); | ||
| const nativeClientIds = new Set<string>(Object.values(OAuthNativeClients)); |
There was a problem hiding this comment.
Not for this PR, but just mentioning it. It'd be really nice if these lists were data driven instead of hard coded. With RP management in the admin panel, this would be easy todo.
| try { | ||
| await db.securityEvent(eventData); | ||
| this.statsd.increment(`accountEvents.recordSecurityEvent.write.${name}`); | ||
| this.statsd.increment( |
There was a problem hiding this comment.
This is a sad reality, but all the JS code that calls this won't typecheck if the name is actually a valid security event name. I've been bit this before. It's possible we want to safegaurd this with a runtime check of some sort. I guess it depends on how paranoid we want to be about blowing up metrics labels...
| } | ||
| } | ||
| } catch { | ||
| // If metricsContext resolution fails, just return undefined for both |
There was a problem hiding this comment.
I'm a stickler for empty catch blocks. Is there a reason metricsContext resolution would fail? If it's just hedging against an unexpected state, we could record a statsd metric or capture a sentry message. If it's actually an expected thing, then maybe its okay.
dschom
left a comment
There was a problem hiding this comment.
R+WC - Just the thing about making sure client ids are stay reasonably up to date. We don't change RPs very often, but rolling servers when we do would be annoying and possibly easy to overlook.
Because:
This commit:
closes FXA-13313
follow up to #20213
--
Draft because I have not ran all these tests locally.
You can test this locally by querying the security events table. Try logging in/signing up etc. with an RP, through Sync, through Desktop Relay etc. and check the table entries
additionalInfo.Note: It looks like Firefox always sends us
service=syncwhen they hitsession/destroydespite being logged into a different service. That is reflected in the security events table inadditionalInfoasservice: sync. Not sure that's even worth filing an issue for on their side because if the user enables Sync later which would we use?