API: Poor performance when querying playbooks or other resources with lots of children #158
Labels
api
Related to the API
cli
Related to the CLI
performance
Related to performance
UI
Related to the built-in user interface
What component is this about?
The API and, to an extent, its consumers (built-in UI, CLI, ara-web, etc.)
What is happening?
For playbooks with many results (say, >=5k), responding to an API call on /api/v1/playbooks/<id> or rendering the playbook in the UI inside the browser can be pretty slow.
Some metrics from a deployment with gunicorn and a local MySQL database on a single good virtual machine:
5 hosts, 276 results, 315 files
5 hosts, 1932 results, 203 files
53 hosts, 7580 results, 44 files
334 hosts, 21074 results, 231 files
195 hosts, 27465 results, 97 files
This is because the API attempts, in good faith, to return all the data about a playbook in the right context in a single call without pagination. We do this for other resources as well: for example, plays include their tasks, which include their results, which include their host.
In hindsight, that was a mistake because it doesn't scale very well. Learning to design an API live in production :)
What should be happening?
Querying for a playbook's details shouldn't return ALL of its child resources (plays, tasks, results, files, records, etc.) because these can easily be obtained (with pagination and search parameters) by querying /api/v1/plays?playbook=<id>, /api/v1/results?playbook=<id>&status=failed and so on.
Fixing this would be a significant API change.
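To illustrate what a consumer would do instead of relying on one large playbook response, here is a minimal sketch of fetching child resources page by page. It assumes the list endpoints return a paginated JSON object with a `results` list and a `next` URL (the shape Django REST framework paginators commonly produce); the helper names `build_url` and `iter_children` and the injected `fetch` callable are hypothetical and not part of the project.

```python
from typing import Callable, Dict, Iterator, Optional
from urllib.parse import urlencode


def build_url(base: str, resource: str, **filters) -> str:
    """Build a list-endpoint URL, e.g. .../api/v1/results?playbook=1&status=failed."""
    query = urlencode(filters)
    return f"{base}/api/v1/{resource}" + (f"?{query}" if query else "")


def iter_children(fetch: Callable[[str], Dict], url: str) -> Iterator[Dict]:
    """Yield every item from a paginated list endpoint, one page at a time.

    `fetch` is any callable that takes a URL and returns the decoded JSON body.
    Assumption: each page is a dict with a "results" list and a "next" URL
    that is None (or absent) on the last page.
    """
    next_url: Optional[str] = url
    while next_url:
        page = fetch(next_url)
        yield from page["results"]
        next_url = page.get("next")
```

With the `requests` library, `fetch` could be something like `lambda url: requests.get(url).json()`, and failed results for a playbook would be iterated with `iter_children(fetch, build_url(base, "results", playbook=1, status="failed"))`. Passing `fetch` in keeps the sketch independent of any particular HTTP client.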