
First request to .../builds much slower than others for job with huge build history #3754

Open
xtremerui opened this issue Apr 18, 2019 · 6 comments


@xtremerui
Contributor

xtremerui commented Apr 18, 2019

As shown in the gif (the first 15 seconds are cut), I go to https://ci.concourse-ci.org/teams/examples/pipelines/time-triggered/jobs/job/builds/200845

The first request to the .../builds endpoint took about 17s; compared with the subsequent paginated calls to the same endpoint, it can be more than 5x slower (17s vs 2s vs 3s).

[gif: Apr-18-2019 12-46-22]

For a job that has fewer builds (e.g. a couple hundred), the first call only takes tens to hundreds of milliseconds, and the paginated calls show no big difference.

So there are two questions here:

  1. Why is the call to ../builds so slow for a job that has a huge build history? It fetches the first 100 builds when no range is specified, which should technically be the same amount of work as a request against a job with fewer builds (see the sketch after this list).

  2. Why are the paginated requests to ../builds so much faster (relatively) than the first one?
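
To make question 1 concrete, here is a rough sketch of the difference between the unbounded first request and a bounded paginated one, assuming keyset pagination on build id; these are illustrative queries only, not the ones Concourse actually issues:

-- Sketch only: hypothetical queries against the builds table.
-- First request: no range specified, so just the newest 100 builds of the job.
SELECT b.*
FROM builds b
WHERE b.job_id = $1
ORDER BY b.id DESC
LIMIT 100;

-- Paginated request: same query, bounded by the previous page.
SELECT b.*
FROM builds b
WHERE b.job_id = $1
  AND b.id < $2    -- $2 = smallest build id returned by the previous page
ORDER BY b.id DESC
LIMIT 100;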

@stale

stale bot commented Jul 16, 2019

Beep boop! This issue has been idle for long enough that it's time to check in and see if it's still important.

If it is, what is blocking it? Would anyone be interested in submitting a PR or continuing the discussion to help move things forward?

If no activity is observed within the next week, this issue will be closed, in accordance with our stale issue process.

@stale stale bot added the wontfix label Jul 16, 2019
@stale stale bot closed this as completed Jul 23, 2019
@cirocosta
Member

cirocosta commented Sep 30, 2019

Hey,

I'm reopening this one as this is still a thing.

Thanks!

@cirocosta cirocosta reopened this Sep 30, 2019
@stale stale bot removed the wontfix label Sep 30, 2019
@stale

stale bot commented Nov 29, 2019

Beep boop! This issue has been idle for long enough that it's time to check in and see if it's still important.

If it is, what is blocking it? Would anyone be interested in submitting a PR or continuing the discussion to help move things forward?

If no activity is observed within the next week, this issue will be closed, in accordance with our stale issue process.

@stale stale bot added the wontfix label Nov 29, 2019
@stale stale bot closed this as completed Dec 6, 2019
@marco-m-pix4d
Contributor

Reopening as this is still happening, at least with Concourse 7.9.1.
We will report whether we still see this problem after upgrading to a more recent Concourse.

@marco-m-pix4d marco-m-pix4d reopened this Jan 23, 2024
@xtremerui
Contributor Author

@marco-m-pix4d how big is your builds table? Have you tried to manually clean up unneeded rows in there?

@odormond
Contributor

> @marco-m-pix4d how big is your builds table? Have you tried to manually clean up unneeded rows in there?

We have this many rows in our DB:

teams  pipelines  jobs  resources  builds  build_comments
    5        365  7806       8398  758906              47
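
(For reference, one way to get approximate per-table row counts like these out of PostgreSQL, assuming direct access to the Concourse database, is to query its statistics views:)

-- Approximate live row counts from PostgreSQL's statistics.
SELECT relname, n_live_tup
FROM pg_stat_user_tables
WHERE relname IN ('teams', 'pipelines', 'jobs', 'resources', 'builds', 'build_comments')
ORDER BY relname;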

We run concourse with:

  • --default-build-logs-to-retain=50
  • --max-build-logs-to-retain=1000

I'm not aware of pipelines overriding the default retention setting but I will not rule it out. :-D

I was able to pinpoint the query (as seen from postgres) to:

SELECT b.id, b.name,
       b.job_id,
       b.resource_id,
       b.resource_type_id,
       b.team_id,
       b.status,
       b.manually_triggered,
       b.created_by,
       b.scheduled,
       b.schema,
       b.private_plan,
       b.public_plan,
       b.create_time,
       b.start_time,
       b.end_time,
       b.reap_time,
       j.name,
       r.name,
       b.pipeline_id,
       p.name,
       p.instance_vars,
       t.name,
       b.nonce,
       b.drained,
       b.aborted,
       b.completed,
       b.inputs_ready,
       b.rerun_of,
       rb.name,
       b.rerun_number,
       b.span_context,
       COALESCE(bc.comment, '')
FROM builds b
LEFT OUTER JOIN jobs j ON b.job_id = j.id
LEFT OUTER JOIN resources r ON b.resource_id = r.id
LEFT OUTER JOIN pipelines p ON b.pipeline_id = p.id
LEFT OUTER JOIN teams t ON b.team_id = t.id
LEFT OUTER JOIN builds rb ON rb.id = b.rerun_of
LEFT OUTER JOIN build_comments bc ON b.id = bc.build_id
WHERE b.pipeline_id = $1 AND b.id < $2
ORDER BY COALESCE(b.rerun_of, b.id) DESC, b.id DESC
LIMIT 1;

For the first build id of the pipeline, the b.id < $2 condition forces the query to scan the cross product of the tables, and because the builds table appears twice (as b and rb) it has to scan $O(N_{builds}^2)$ rows. That blows up the runtime: the query takes ~1.5 seconds and sometimes up to 4 seconds.
For the second build, it stops at the first match thanks to the LIMIT 1. In that case, the query only takes ~10ms.
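
One way to confirm this directly is to run the query under EXPLAIN ANALYZE for both cases; a sketch, assuming psql access to the Concourse database, with the placeholders replaced by literal values and the joins that don't affect the cost trimmed for brevity (42 and 100845 are hypothetical ids, not real ones):

-- Sketch: compare the plan and timings for the pipeline's oldest build id
-- versus a recent one as the $2 bound.
EXPLAIN (ANALYZE, BUFFERS)
SELECT b.id, rb.name
FROM builds b
LEFT OUTER JOIN builds rb ON rb.id = b.rerun_of
WHERE b.pipeline_id = 42 AND b.id < 100845
ORDER BY COALESCE(b.rerun_of, b.id) DESC, b.id DESC
LIMIT 1;
-- With the oldest build's id as the bound, the actual row counts and runtime
-- should reflect the full scan described above; with a later build's id the
-- LIMIT 1 lets the scan stop almost immediately.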
