Skip to content
This repository has been archived by the owner. It is now read-only.

fix(db): ensure that devices get deleted with session tokens #160

Merged
merged 2 commits into from Aug 23, 2016

Conversation

@philbooth
Copy link
Contributor

philbooth commented Aug 17, 2016

Fixes mozilla/fxa-auth-server#1423 (although I've found another related problem that we also need to fix, see mozilla/fxa-content-server#4060).

The fix itself is straightforward, just ensuring that we delete any device records inside deleteSessionToken. However there is also another change in the migration that may be contentious: I'm explicitly purging all session-less device records.

This is because any such records only exist in our db because of earlier problems with our implementation of device registration. I can think of no legitimate reason for them to exist and we definitely don't want them to show up in anyone's device list.

@rfk, I'm tagging you for the review because of the contentious bit, are you comfortable with us doing that? I know we discussed earlier on about only doing it for placeholder devices but, having given it some further thought, I think we should be more ruthless.

@philbooth philbooth force-pushed the phil/auth-server-issue-1423 branch from 45a97be to 50f6092 Aug 17, 2016
DELETE FROM devices
USING devices LEFT JOIN sessionTokens
ON devices.sessionTokenId = sessionTokens.tokenId
WHERE sessionTokens.tokenId IS NULL;

This comment has been minimized.

Copy link
@rfk

rfk Aug 18, 2016

Member

We should also consider the performance implications of doing this against large tables in production. I wonder if it will cause a window of write outage on the sessionstokens and/or devices table.

This comment has been minimized.

Copy link
@philbooth

philbooth Aug 18, 2016

Author Contributor

Yep, realise it needs @jrgm not to freak out in order to stay in, wanted to see what you thought of the concept first.

I'll try it out on my session-tokeny local db today, although given your point about cliffs and MySQL yesterday, it may not be proving much. I only have ~8 million session tokens at the moment, obviously I can keep increasing that but I've no idea at what point the cliff begins.

@rfk

This comment has been minimized.

Copy link
Member

rfk commented Aug 18, 2016

I'm trying to decide whether there's any reason for a device-registration-aware browser to have a deviceId but not a sessionToken. I know that in the initial design of this feature, we intended the device record to be longer lived, e.g. the device could get disconnected because of a password reset, but when you signed back in it would re-use the existing device record.

I don't think that's currently possible though.

So I think I'm OK with the contentious part here. But if "device records must always have a corresponding sessionToken" is now an invariant we want to enforce, it's probably worth doing a broader audit to ensure it's enforcable and documented. For example, does the change-password flow interact sensibly with this restriction?

@philbooth

This comment has been minimized.

Copy link
Contributor Author

philbooth commented Aug 18, 2016

it's probably worth doing a broader audit to ensure it's enforcable and documented. For example, does the change-password flow interact sensibly with this restriction?

Good point, will do.

@philbooth philbooth added the WIP label Aug 18, 2016
@philbooth

This comment has been minimized.

Copy link
Contributor Author

philbooth commented Aug 18, 2016

I know that in the initial design of this feature, we intended the device record to be longer lived, e.g. the device could get disconnected because of a password reset, but when you signed back in it would re-use the existing device record.

It's dawned on me that either I never realised this when we implemented it, or I forgot all about it. So when I wrote the client code, I should have included affordances to keep track of and re-use the device id across multiple connections/disconnections. My mental model was all wrong.

This explains why the stored procedures for resetAccount and deleteSessionToken were the way they were at the time. And there was me thinking that their failure to DELETE FROM devices was a bug, oh dear. 😊

@philbooth

This comment has been minimized.

Copy link
Contributor Author

philbooth commented Aug 18, 2016

I know that in the initial design of this feature, we intended the device record to be longer lived, e.g. the device could get disconnected because of a password reset, but when you signed back in it would re-use the existing device record.

It's dawned on me that either I never realised this when we implemented it, or I forgot all about it.

Apologies, it turns out we've had this conversation before. And as part of the broader audit you asked me to do, I've now uncovered a bug directly caused by my misunderstanding:

https://bugzilla.mozilla.org/show_bug.cgi?id=1296328

On discovering it, part of me was tempted to go in the other direction and do the server-side work to ensure that device ids remain permanent after the initial registration. But I worry that will:

  1. Require a more complex client patch than the very simple one outlined in my description of the linked bug.
  2. Create a historical spread of Firefox versions that always leave zombie device records in the db (because they aren't preserving the device id).

So I'm still in favour of doing what I set out to in this PR, but just wanted to link to the above as a point of discussion.

@philbooth philbooth force-pushed the phil/auth-server-issue-1423 branch from 50f6092 to 634c85c Aug 18, 2016
@philbooth philbooth force-pushed the phil/auth-server-issue-1423 branch from 634c85c to fa2152b Aug 19, 2016
@philbooth

This comment has been minimized.

Copy link
Contributor Author

philbooth commented Aug 19, 2016

@rfk, in the interests of tying up loose ends before I go on holiday tomorrow and (hopefully) landing a useful fix in time for train 68, I've removed the contentious query from this PR and opened #162 to discuss it.

Hence I'm removing the WIP label and reinstating your face in Waffle. r?

@philbooth philbooth removed the WIP label Aug 19, 2016
@philbooth philbooth force-pushed the phil/auth-server-issue-1423 branch from fa2152b to 840dda6 Aug 19, 2016

DELETE FROM sessionTokens WHERE tokenId = tokenIdArg;
DELETE FROM unverifiedTokens WHERE tokenId = tokenIdArg;
DELETE FROM devices WHERE sessionTokenId = tokenIdArg;

This comment has been minimized.

Copy link
@rfk

rfk Aug 22, 2016

Member

Sadly there's no index on sessionTokenId, so this would result in a full table scan of a pretty sizeable table:

mysql> explain DELETE FROM devices WHERE sessionTokenId = 'foobar';
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table   | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra       |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+
|  1 | DELETE      | devices | NULL       | ALL  | NULL          | NULL | NULL    | NULL |    1 |   100.00 | Using where |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+
1 row in set (0.01 sec)

There is an index on (uid, sessionTokenId) that could be used for this, but we'd have to change the way this procedure is called in order to pass in the uid as well:

 explain DELETE FROM devices WHERE uid='x' AND sessionTokenId = 'foobar';
+----+-------------+---------+------------+-------+-----------------------------------+---------+---------+-------+------+----------+-------------+
| id | select_type | table   | partitions | type  | possible_keys                     | key     | key_len | ref   | rows | filtered | Extra       |
+----+-------------+---------+------------+-------+-----------------------------------+---------+---------+-------+------+----------+-------------+
|  1 | DELETE      | devices | NULL       | range | PRIMARY,UQ_devices_sessionTokenId | PRIMARY | 16      | const |    1 |   100.00 | Using where |
+----+-------------+---------+------------+-------+-----------------------------------+---------+---------+-------+------+----------+-------------+
1 row in set (0.00 sec)
@philbooth

This comment has been minimized.

Copy link
Contributor Author

philbooth commented Aug 22, 2016

Sadly there's no index on sessionTokenId, so this would result in a full table scan of a pretty sizeable table.

Ah, bummer. Would adding the index to the table an option or is that problematic too?

Anyway, I'll take the shipit off the issue in Waffle and come back to this when I get back.

@rfk

This comment has been minimized.

Copy link
Member

rfk commented Aug 22, 2016

Adding another index would probably add too much overhead; a better option may be to look up the uid field from the sessionTokens table, and use that to delete from devices using the existing (uid, sessionTokenId) index. This makes the delete slightly more expensive, but we don't have to maintain an additional index with every insert/update in the table.

@rfk

This comment has been minimized.

Copy link
Member

rfk commented Aug 23, 2016

I've pushed a small fix to make the query use an index; @seanmonstar would you mind doing a second r? on this?

@rfk rfk assigned seanmonstar and unassigned rfk Aug 23, 2016
@rfk

This comment has been minimized.

Copy link
Member

rfk commented Aug 23, 2016

@philbooth also having thought about it for a while, I'm fine with your "delete all the device records that don't have a valid sessionToken" step as well. Perhaps we can bring it back in train-69.

@seanmonstar

This comment has been minimized.

Copy link
Member

seanmonstar commented Aug 23, 2016

@rfk new procedure looks correct. Left me wondering why the devices table has an index on a tuple, whereas the other tables index just the tokenId, but whatcha-gonna-do.

@seanmonstar seanmonstar removed their assignment Aug 23, 2016
@rfk

This comment has been minimized.

Copy link
Member

rfk commented Aug 23, 2016

Left me wondering why the devices table has an index on a tuple

It's to enable both this kind of lookup-by-sessionTokenId query, and to lookup all device records belonging to a single user. Could have used two separate indexes for this I guess, but in code we always have the uid whenever we have the sessionTokenId.

@rfk rfk merged commit 2350b1f into master Aug 23, 2016
3 checks passed
3 checks passed
continuous-integration/travis-ci/pr The Travis CI build passed
Details
continuous-integration/travis-ci/push The Travis CI build passed
Details
coverage/coveralls Coverage remained the same at 93.936%
Details
@rfk rfk removed the waffle:active label Aug 23, 2016
@rfk

This comment has been minimized.

Copy link
Member

rfk commented Aug 23, 2016

whereas the other tables index just the tokenId,

Also on the other tables, the tokenId is the primary key.

@philbooth philbooth deleted the phil/auth-server-issue-1423 branch Aug 26, 2016
@philbooth philbooth restored the phil/auth-server-issue-1423 branch Aug 26, 2016
@philbooth philbooth deleted the phil/auth-server-issue-1423 branch Aug 26, 2016
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
3 participants
You can’t perform that action at this time.