oauth client_id's that are being logged as expired #324
Comments
|
I only put this in percentage, but if/when the question comes up, the answer is "a significant number of requests". |
|
Thanks @jrgm! I'm glad to see it's all assertion-grant-flow-using client codes in this list, not third-party reliers. The "Firefox Accounts" one should be fixed already IIUC since we're no longer caching the access tokens. The next thing we need to do is confirm whether Firefox and Fennec are handling 401s from expired tokens correctly. If they both recover successfully by e.g. just issuing themselves a new token, then I think we'll be OK to re-enable the expiry. (Ideally, clients would be using the server-provided expiry time to refresh their tokens before they expire, but we can't depend on that in general, so must expect some stable background rate of 401s. If it remains "a significant number of requests" then we should try to prioritize changes to client behaviour to reduce it.) |
Note that this data is from production traffic in the past two days. One other thing to note, is that the tokens that are reporting as expired were issued months ago in some percentage of cases. |
Ah, interesting, we'll have to dig in deeper. Linking to kibana dashboard with these events in it for my reference: https://kibana.fxa.us-west-2.prod.mozaws.net/index.html#/dashboard/elasticsearch/OAuth%20App%20Logs |
|
Summarizing current status: we believe that the latest versions of all of these clients should cope gracefully with expired tokens, but they'll happily keep using a token until we tell them it has expired. These events will therefore continue at their current rate until we start expiring the tokens. @jrgm is going to take a look at the latest breakdown and client versions associated with these logs. We should then try some manual testing with expired tokens on the latest versions of Desktop, Fennec and content-server to confirm that their behaviour is OK. If so, once we're at sufficient penetration of update-to-date versions then we can try re-enabling the expiry. |
|
I looked again at two days and proportions are about the same Unfortunately, as these are server-based calls to /verify, we don't have any UA info. |
|
I tried latest fxa-content-server with an expired oauth token, and it did indeed seem to discard its cached token and fetch a new one. |
|
I tried desktop Fx release channel with an expired oauth token, via the following procedure:
No errors were shown in UI, but it failed to fetch the updated avatar. Console output showed a bunch of errors and unhandled promise rejections. So sadly, I suspect we're not ready to start expiring these just yet :-/ I'll keep digging |
|
Status update for the planning meeting: I'm still working on this, slowly, as a background task. I need to re-test the above with the new Firefox release. |
|
I tried this again in latest release (Firefox 42) and it looks like it's working correctly. The browser detects the 401 from the oauth server, discards its cached token and fetches a new one, and successfully proceeds to update its local version of my profile picture. @jrgm, assuming we haven't started seeing any other client ids in |
|
(putting this back in next and assigning to @jrgm for the next action, which is to confirm that no new client ids have started showing up in the |
|
Current counts don't show any new client_id's using expired tokens, and these proportions: 88.50% 5882386c6d801776 - Firefox |
|
Thanks @jrgm, sounds like we can go ahead with switching the expiry back on. I'm going to close this out. |
I looked at production oauth logs for
token.token.verify.expiredover the past two days and did a rollup of the reported client_id's. So here it is on a percentage basis.The text was updated successfully, but these errors were encountered: