Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

N+1 slowdown on index page (playbooks list) #534

Open
bendem opened this issue Jan 2, 2024 · 2 comments
Open

N+1 slowdown on index page (playbooks list) #534

bendem opened this issue Jan 2, 2024 · 2 comments

Comments

@bendem
Copy link

bendem commented Jan 2, 2024

I was looking into why the index of ara takes so long to load (2s on good days) and I found that ara does 7 queries per playbook displayed in the table (6 counts for plays, tasks, results, hosts, files, records and 1 select for the labels).

We use ARA_PAGE_SIZE=50 but there are really no reason for the page to load that slow.

I've played with it a bit and with postgres you can get the same data off the db in 58ms by counting in the playbooks query and doing a join+aggregate for the labels.

SELECT distinct on ("playbooks"."id")
    "playbooks"."id",
    "playbooks"."created",
    "playbooks"."updated",
    "playbooks"."started",
    "playbooks"."ended",
    "playbooks"."duration",
    "playbooks"."name",
    "playbooks"."ansible_version",
    "playbooks"."client_version",
    "playbooks"."python_version",
    "playbooks"."server_version",
    "playbooks"."status",
    "playbooks"."arguments",
    "playbooks"."path",
    "playbooks"."controller",
    "playbooks"."user",

    (SELECT COUNT(*) AS "__count" FROM "plays" WHERE "plays"."playbook_id" = "playbooks"."id") as "plays_count",
    (SELECT COUNT(*) AS "__count" FROM "tasks" WHERE "tasks"."playbook_id" = "playbooks"."id") as "tasks_count",
    (SELECT COUNT(*) AS "__count" FROM "results" WHERE "results"."playbook_id" = "playbooks"."id") as "results_count",
    (SELECT COUNT(*) AS "__count" FROM "hosts" WHERE "hosts"."playbook_id" = "playbooks"."id") as "hosts_count",
    (SELECT COUNT(*) AS "__count" FROM "files" WHERE "files"."playbook_id" = "playbooks"."id") as "files_count",
    (SELECT COUNT(*) AS "__count" FROM "records" WHERE "records"."playbook_id" = "playbooks"."id") as "records_count",

    array_agg("playbooks_labels"."name") OVER (PARTITION BY "playbooks"."id") as "labels"
FROM "playbooks"
JOIN (
    SELECT
        "playbooks_labels"."playbook_id",
        "labels"."name"
    FROM "labels"
    INNER JOIN "playbooks_labels" ON ("labels"."id" = "playbooks_labels"."label_id")
) "playbooks_labels" ON "playbooks_labels"."playbook_id" = "playbooks"."id"
ORDER BY "playbooks"."id" DESC LIMIT 50;

class ItemCountSerializer(serializers.ModelSerializer):
class Meta:
abstract = True
# For counting relationships to other objects
items = serializers.SerializerMethodField()
@staticmethod
def get_items(obj):
types = ["plays", "tasks", "results", "hosts", "files", "records"]
items = {item: getattr(obj, item).count() for item in types if hasattr(obj, item)}
return items

@bendem
Copy link
Author

bendem commented Jan 2, 2024

Postgres logs with timestamp (1.472s database time).

@bendem bendem changed the title N+1 slowdown on index (playbooks list) N+1 slowdown on index page (playbooks list) Jan 2, 2024
@l00d3r
Copy link

l00d3r commented Jan 9, 2024

I think I am affected by the same thing that is causing the issue here but in a different way, after upgrading ara-server from 1.6.1 to 1.7.0 our playbook runs have had 50% increase in run times (some are 6min to 9min, some went from 14min to 21min )

Why I think the root cause is the same is because there is a ~2sec delay between each task run reporting to ara server.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants