User Story
As a DKAN operator, I want to avoid excessive resource usage when API requests use a large offset on a large table.
Currently, an offset of 1000000 on a large table takes MySQL over a second to return results, because it must read and discard that many rows before it reaches the requested ones.
Indexes alone cannot resolve this, but performance can be improved by roughly a factor of 4 by applying the offset to the primary key field alone (record_number) and then joining back to the table to select the remaining fields.
Current query:
SELECT t.field1 AS field1, t.field2 AS field2
FROM datastore_abc123 t
WHERE t.field1 = 'unicorn'
LIMIT 500 OFFSET 1000000;
Proposed query:
SELECT t.field1 AS field1, t.field2 AS field2
FROM datastore_abc123 t
INNER JOIN
(SELECT record_number
FROM datastore_abc123 t
WHERE t.field1 = 'unicorn'
LIMIT 500 OFFSET 1000000)
AS i USING(record_number);
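The rewrite above is mechanical, so it could be generated from the same inputs as the current query. A minimal sketch in Python, for illustration only: the helper name `deferred_join_sql` and the `:value`, `:limit`, and `:offset` placeholders are hypothetical and are not part of DKAN's query builder.

```python
def deferred_join_sql(table, columns, where_sql, key="record_number"):
    """Build the proposed deferred-join SELECT as a string.

    Hypothetical helper: table, column, and key names are interpolated
    directly, so they must come from trusted code, never from request input.
    Values are left as named placeholders for the database driver to bind.
    """
    select_list = ", ".join(f"t.{col} AS {col}" for col in columns)
    return (
        f"SELECT {select_list}\n"
        f"FROM {table} t\n"
        "INNER JOIN\n"
        # The subquery pages over the key column only, so the rows skipped
        # by OFFSET are narrow; full rows are fetched only for the page.
        f"(SELECT {key}\n"
        f"FROM {table} t\n"
        f"WHERE {where_sql}\n"
        "LIMIT :limit OFFSET :offset)\n"
        f"AS i USING({key});"
    )


sql = deferred_join_sql(
    "datastore_abc123", ["field1", "field2"], "t.field1 = :value"
)
print(sql)
```

The filter stays inside the subquery, so the outer query only joins the selected keys back to the table.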
Acceptance Criteria
Query results are unchanged
Query performance in this case (large offset on a large table) is roughly 4x faster
Query performance in other cases is not significantly degraded
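The first criterion can be checked by running both query shapes against the same data and comparing the rows. The sketch below makes several assumptions: it uses an in-memory SQLite database as a stand-in for MySQL, a small made-up table, and an added ORDER BY on record_number, since without an explicit ordering neither query guarantees which rows a LIMIT/OFFSET page contains.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here; the table contents are made up.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE datastore_abc123 ("
    "record_number INTEGER PRIMARY KEY, field1 TEXT, field2 TEXT)"
)
rows = [
    (i, "unicorn" if i % 2 == 0 else "horse", f"value {i}")
    for i in range(1, 5001)
]
conn.executemany("INSERT INTO datastore_abc123 VALUES (?, ?, ?)", rows)

# Current shape, with an ORDER BY added so the page is deterministic.
current_sql = """
SELECT t.field1 AS field1, t.field2 AS field2
FROM datastore_abc123 t
WHERE t.field1 = ?
ORDER BY t.record_number
LIMIT ? OFFSET ?
"""

# Proposed shape: page over the primary key only, then join back.
proposed_sql = """
SELECT t.field1 AS field1, t.field2 AS field2
FROM datastore_abc123 t
INNER JOIN
(SELECT record_number
FROM datastore_abc123 t
WHERE t.field1 = ?
ORDER BY record_number
LIMIT ? OFFSET ?)
AS i USING(record_number)
ORDER BY t.record_number
"""

params = ("unicorn", 50, 1000)  # filter value, limit, offset
current_rows = conn.execute(current_sql, params).fetchall()
proposed_rows = conn.execute(proposed_sql, params).fetchall()

assert current_rows == proposed_rows
print(len(proposed_rows), "rows match")
```

This only covers result equality. The performance criteria would need timing against a realistically sized MySQL table.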