Only join required tables #2528

Merged

@philippthun (Member) commented Oct 1, 2021

ServicePlanListFetcher and ServiceOfferingListFetcher JOIN multiple tables to be used in WHERE conditions. The actual set of tables to be joined depends on the provided filters. Instead of joining a superset of tables, this change ensures that only the required tables will be joined.

Unfortunately, Postgres does not remove the unnecessary JOINs here, so they can result in slower query execution.

Before using another table in a WHERE clause, one simply has to call join_<table>(dataset). This can even be done multiple times for each table as the underlying join(dataset, type, table, on) function ensures that only unique table expressions are added.
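
The idea of an idempotent join helper can be sketched in plain Ruby as follows. This is only an illustration, not the code from this change: `SketchDataset` and `SketchJoins` are hypothetical stand-ins for the real dataset and fetcher, and the table and column names in the join conditions are assumptions made for the example.

```ruby
# Hypothetical stand-in for a query dataset (NOT Sequel, NOT the real fetcher).
SketchDataset = Struct.new(:table, :joins, :conditions) do
  # Render a simplified SQL string from the collected joins and conditions.
  def to_sql
    sql = "SELECT DISTINCT #{table}.* FROM #{table}"
    joins.each { |j| sql += " #{j[:type]} #{j[:table]} ON #{j[:on]}" }
    sql += " WHERE #{conditions.join(' AND ')}" unless conditions.empty?
    sql
  end
end

module SketchJoins
  module_function

  # Add the join only if the same table expression is not already present,
  # so callers may request a table as often as they like.
  def join(dataset, type, table, on)
    return dataset if dataset.joins.any? { |j| j[:table] == table }

    new_joins = dataset.joins + [{ type: type, table: table, on: on }]
    SketchDataset.new(dataset.table, new_joins, dataset.conditions)
  end

  # One join_<table> helper per table; the ON condition is an assumption.
  def join_service_instances(dataset)
    join(dataset, 'INNER JOIN', 'service_instances',
         'service_instances.service_plan_id = service_plans.id')
  end

  # A filter first requests the table it needs, then adds its condition.
  # Real code would use the dataset library's placeholders instead of
  # interpolating values into the SQL string.
  def filter_by_service_instance(dataset, guid)
    dataset = join_service_instances(dataset)
    condition = "service_instances.guid IN ('#{guid}')"
    SketchDataset.new(dataset.table, dataset.joins, dataset.conditions + [condition])
  end
end

ds = SketchDataset.new('service_plans', [], [])
ds = SketchJoins.filter_by_service_instance(ds, 'guid-1')
ds = SketchJoins.join_service_instances(ds) # second request is a no-op
puts ds.to_sql
```

Because each filter requests its own tables, a query with no service instance filter never joins `service_instances` at all, and two filters that need the same table still produce a single JOIN.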

  • I have reviewed the contributing guide

  • I have viewed, signed, and submitted the Contributor License Agreement

  • I have made this pull request to the main branch

  • I have run all the unit tests using bundle exec rake

  • I have run CF Acceptance Tests

@philippthun philippthun marked this pull request as ready for review October 1, 2021 17:50
@philippthun (Member, Author)

We ran EXPLAIN (ANALYZE, BUFFERS) for a SELECT DISTINCT "service_plans".* FROM "service_plans" query with the following WHERE condition:

WHERE ((("service_plans"."public" IS TRUE)
  OR ("plan_orgs"."guid" IN ('org_guid'))
  OR ("broker_spaces"."guid" IN ('space_guid')))
AND ("service_instances"."guid" IN ('service_instance_guid')));

The main indicators were:

Cost: 324,264.94
Buffers: shared hit=67150, temp read=16048 written=16048
Execution Time: 3331.547 ms

After removing the joins that are unnecessary in this case, LEFT JOIN "organizations" AS "broker_orgs" and LEFT JOIN "spaces" AS "plan_spaces", the EXPLAIN (ANALYZE, BUFFERS) output changes to:

Cost: 32,145.98
Buffers: shared hit=14007
Execution Time: 65.044 ms

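So omitting unnecessary JOINs can have a huge impact (on Postgres)! In this case the execution time dropped by a factor of about 50, the estimated cost by a factor of about 10, and the temp reads and writes disappeared entirely.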

@MarcPaquette MarcPaquette self-assigned this Oct 6, 2021
@MarcPaquette MarcPaquette merged commit 79da053 into cloudfoundry:main Oct 6, 2021