Optimize generate_username query #18061

wjordan · 2017-09-30T03:37:35Z

Followup/fix to #12745.

This PR uses a faster and more accurate MySQL query for generating pseudo-sequential usernames.

While the query in #12745 selected an integer one greater than the maximum of all found integers, this query selects the first gap in the integer sequence. This is more reliable, because a single, arbitrarily high integer no longer forces every future pseudo-sequential number to be strictly greater, and it also turns out to be a faster query.
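The difference between the two strategies can be sketched in a few lines of Python. This is only an illustration of the idea, not the code in this PR: the function names, the sample usernames, and the fallback value of 1 for an empty set are all assumptions made for the example. Note also that the SQL below has no ORDER BY, so which gap MySQL returns depends on its scan order, whereas this sketch always returns the numerically smallest one.

```python
import re


def usernames_to_ints(usernames, prefix="coder"):
    """Collect the integer suffixes of usernames shaped like '<prefix><digits>'."""
    pattern = re.compile(rf"^{re.escape(prefix)}([0-9]+)$")
    return {int(m.group(1)) for name in usernames if (m := pattern.match(name))}


def next_after_max(used):
    """Old strategy: one greater than the largest suffix in use."""
    # Returning 1 for an empty set is an assumption of this sketch.
    return max(used) + 1 if used else 1


def first_gap(used):
    """New strategy: the first n + 1 that is not in use, for some n that is."""
    for n in sorted(used):
        if n + 1 not in used:
            return n + 1
    return 1  # assumption: start at 1 when no suffixed usernames exist


# Hypothetical sample data: one outlier with a very large suffix.
usernames = ["coder1", "coder2", "coder3", "coder5", "coder756735909813", "coderbob"]
used = usernames_to_ints(usernames)

print(next_after_max(used))  # 756735909814
print(first_gap(used))       # 4
```

With the outlier present, the max-based strategy is pushed past it permanently, while the gap-based strategy keeps handing out small numbers.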

Compare:

Previous PR query (super-long number, 4.29 sec):

mysql> SELECT  MAX(CAST(SUBSTRING(`username`, 6) as unsigned)) as `id` FROM `users` WHERE `users`.`deleted_at` IS NULL AND (username LIKE 'coder%' and username RLIKE '^coder[0-9]+$') ORDER BY `users`.`id` ASC LIMIT 1;
+--------------+
| id           |
+--------------+
| 756735909814 |
+--------------+
1 row in set (4.29 sec)

Current PR query (shortest-available number, 2.58 sec):

mysql> SELECT CAST(SUBSTRING(username, 7) as unsigned) + 1 as id  FROM users u   WHERE username LIKE "coder%"     AND username RLIKE "^coder[0-9]+$"      AND NOT EXISTS (       SELECT 1       FROM users u2       WHERE u2.username = CONCAT("coder", CAST(SUBSTRING(u.username, 7) as unsigned) + 1)     )   LIMIT 1;
+------+
| id   |
+------+
| 1125 |
+------+
1 row in set (2.58 sec)

This will fix the issue encountered in our UI tests. The previous query quickly ran into an integer-overrun issue: the max value "999999[...etc...]" was selected and increased by 1 to "1000000[...etc...]", which made the string one character longer. If the 999-etc string was already at the maximum allowed username length, the newly generated username would fail validation. The new query will only encounter this issue if all of the gaps in the sequence are also exhausted, which is much less likely. I've also added a raise for this case, so we should be able to detect this error condition more easily.
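As a rough illustration of the length problem and the added raise, the following Python sketch builds a username from a prefix and a number and fails loudly when the result is too long. The limit of 20 characters, the exception class, and the function name are hypothetical and chosen only for the example; the real limit and error handling live in the application code.

```python
MAX_USERNAME_LENGTH = 20  # hypothetical limit, for illustration only


class UsernameExhaustedError(Exception):
    """Raised when the generated username would exceed the length limit."""


def build_username(prefix, number, max_length=MAX_USERNAME_LENGTH):
    """Return '<prefix><number>', raising instead of returning an invalid name."""
    candidate = f"{prefix}{number}"
    if len(candidate) > max_length:
        raise UsernameExhaustedError(
            f"{candidate!r} is {len(candidate)} characters, limit is {max_length}"
        )
    return candidate


# 'coder' (5 characters) plus fifteen 9s is exactly 20 characters long.
largest = int("9" * 15)
print(build_username("coder", largest))

# Adding 1 turns fifteen 9s into a 16-digit number, so the name no longer fits.
try:
    build_username("coder", largest + 1)
except UsernameExhaustedError as error:
    print("raised:", error)
```

Raising at generation time surfaces the exhausted case directly, rather than leaving it to show up later as a validation failure.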

wjordan · 2017-09-30T03:40:04Z

EXPLAIN output for this query:

mysql> explain SELECT CAST(SUBSTRING(username, 7) as unsigned) + 1 as id  FROM users u   WHERE username LIKE "coder%"     AND username RLIKE "^coder[0-9]+$"      AND NOT EXISTS (       SELECT 1       FROM users u2       WHERE u2.username = CONCAT("coder", CAST(SUBSTRING(u.username, 7) as unsigned) + 1)     )   LIMIT 1;
+----+--------------------+-------+-------+----------------------------------------+----------------------------------------+---------+------+---------+--------------------------+
| id | select_type        | table | type  | possible_keys                          | key                                    | key_len | ref  | rows    | Extra                    |
+----+--------------------+-------+-------+----------------------------------------+----------------------------------------+---------+------+---------+--------------------------+
|  1 | PRIMARY            | u     | range | index_users_on_username_and_deleted_at | index_users_on_username_and_deleted_at | 768     | NULL | 4493510 | Using where; Using index |
|  2 | DEPENDENT SUBQUERY | u2    | ref   | index_users_on_username_and_deleted_at | index_users_on_username_and_deleted_at | 768     | func |       1 | Using where; Using index |
+----+--------------------+-------+-------+----------------------------------------+----------------------------------------+---------+------+---------+--------------------------+

Use more accurate MySQL query for finding gaps in integer sequence.

wjordan · 2017-09-30T03:45:23Z

Since the earlier version of this PR has already been code-reviewed, I plan to merge this fix now so that I can run the full set of UI tests against it right away.

Optimize generate_username query

e44b21a

Use more accurate MySQL query for finding gaps in integer sequence.

wjordan force-pushed the generate_username_fix branch from 7de927e to e44b21a September 30, 2017 03:43

wjordan merged commit 256c411 into staging Sep 30, 2017

wjordan deleted the generate_username_fix branch November 8, 2017 01:49
