Performance and database issues #2111
We're seeing Concourse jobs stuck in the pending state for a very long time, along with lots of slow queries on our Postgres server. Resources are also slow to trigger, sometimes taking hours, even when
The first time this happened, we got some relief by reducing the amount of logs being retained via
The problem recently struck again and we found another job that needed the
Clues in database:
Please help the Concourse fans on my team keep using Concourse, so that mgmt doesn't force us on to Jenkins. :)
@timrchavez - Default overlay filesystem driver is being used to my knowledge. We're deploying via concourse-deployment with just a small set of ops files to change scale, get https, and use an externalized postgres. Is there a good clue in our interpolated bosh manifest which shows which one? I don't see
On the database front, I did some more postgres sleuthing and noticed that the
It sounds like tuples should normally be close to rows?
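One rough way to check this, offered as a sketch and not as the diagnosis actually used in this thread: PostgreSQL's `pg_stat_user_tables` view reports estimated live and dead tuple counts per table (`n_live_tup`, `n_dead_tup`). The names `DEAD_TUPLE_SQL`, `dead_ratio`, `bloated_tables`, and the 20% threshold below are assumptions made up for illustration; the rows are assumed to come from whatever Postgres driver you already use.

```python
# Standard columns of PostgreSQL's pg_stat_user_tables statistics view.
DEAD_TUPLE_SQL = """
SELECT schemaname, relname, n_live_tup, n_dead_tup, last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
"""


def dead_ratio(n_live_tup, n_dead_tup):
    """Share of a table's tuples that are dead (0.0 for an empty table)."""
    total = n_live_tup + n_dead_tup
    return n_dead_tup / total if total else 0.0


def bloated_tables(rows, threshold=0.2):
    """Flag tables whose dead-tuple share is at or above the threshold.

    rows: iterable of (schemaname, relname, n_live_tup, n_dead_tup,
    last_autovacuum) tuples, i.e. the result of DEAD_TUPLE_SQL.
    The 0.2 threshold is an arbitrary illustrative default.
    """
    flagged = []
    for schema, table, live, dead, _last_autovacuum in rows:
        ratio = dead_ratio(live, dead)
        if ratio >= threshold:
            flagged.append((f"{schema}.{table}", round(ratio, 3)))
    return flagged
```

A table with far more dead tuples than live ones, combined with an old or empty `last_autovacuum`, would suggest that vacuum is not keeping up or is being blocked.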
Update: our friends at Crunchy Data helped us discover that a long-running query/transaction in another schema on the same database was causing our issue. That transaction kept autovacuum from cleaning up dead rows across all schemas, including our concourse schema, and accumulated dead rows can evidently cause major performance problems.
It was a good lesson for us on what the blast radius can be between schemas within the same database. Monitoring is being added to catch both long-running transactions and accumulation of dead tuples.
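As a minimal sketch of what the long-running-transaction side of that monitoring could look like (this is not the check we actually deployed): the `pg_stat_activity` view exposes `xact_start` for every session with an open transaction. The names `LONG_TXN_SQL` and `long_running`, and the one-hour cutoff, are hypothetical choices for illustration.

```python
from datetime import datetime, timedelta, timezone

# Standard columns of PostgreSQL's pg_stat_activity view.
LONG_TXN_SQL = """
SELECT pid, usename, state, xact_start, query
FROM pg_stat_activity
WHERE xact_start IS NOT NULL
ORDER BY xact_start
"""


def long_running(rows, now, max_age=timedelta(hours=1)):
    """Return sessions whose transaction has been open for at least max_age.

    rows: iterable of (pid, usename, state, xact_start, query) tuples,
    i.e. the result of LONG_TXN_SQL. The one-hour default is an assumption.
    """
    offenders = []
    for pid, user, state, xact_start, _query in rows:
        age = now - xact_start
        if age >= max_age:
            offenders.append((pid, user, state, age))
    # Oldest transaction first, since it is the one holding back vacuum.
    return sorted(offenders, key=lambda r: r[3], reverse=True)


# Example with made-up sessions; real rows would come from the database.
example_now = datetime(2019, 3, 22, 12, 0, tzinfo=timezone.utc)
example_rows = [
    (101, "app", "idle in transaction", example_now - timedelta(hours=5), "SELECT 1"),
    (102, "atc", "active", example_now - timedelta(seconds=3), "SELECT 2"),
]
example_offenders = long_running(example_rows, example_now)
```

Sessions in the `idle in transaction` state are worth particular attention, since they hold a transaction open without doing any work.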
For reference, the query we ran to find the guilty query (will show itself in output):
Thanks for the replies and suggestions on how to go after this. Happy Concoursing!
I believe our database admins did kill the sessions, and had a conversation with the service owners.
(In reply to keithkroeger, Friday, March 22, 2019: "We are also seeing a similar problem in our 3.13 environment. Once you found the queries, did you kill the sessions or take some other approach?")
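For anyone wondering what killing a session looks like in practice, here is a hedged sketch. PostgreSQL provides `pg_cancel_backend(pid)` to cancel a backend's current query and `pg_terminate_backend(pid)` to end the whole session. The helper name `end_session_sql` is made up, and the `%s` placeholder assumes a psycopg2-style driver; I don't know exactly which of the two our DBAs used.

```python
def end_session_sql(pid, force=False):
    """Build a parameterized statement to stop a Postgres backend.

    force=False -> pg_cancel_backend: cancels the current query only.
    force=True  -> pg_terminate_backend: ends the whole session.
    Returns (sql, params) for a driver that uses %s placeholders
    (an assumption; adjust for your driver).
    """
    if not isinstance(pid, int) or pid <= 0:
        raise ValueError("pid must be a positive integer")
    func = "pg_terminate_backend" if force else "pg_cancel_backend"
    return f"SELECT {func}(%s)", (pid,)
```

Cancelling first and terminating only if the transaction stays open is the gentler order, and talking to the owner of the offending service is still needed to keep it from recurring.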