Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance issues when there is a large number of failed jobs #325

Open
mattburton opened this issue Sep 14, 2023 · 1 comment
Open

Performance issues when there is a large number of failed jobs #325

mattburton opened this issue Sep 14, 2023 · 1 comment

Comments

@mattburton
Copy link

Hello - we encountered an issue recently where we had a Hangfire instance that had collected a large number of failed jobs (over 6,000) due to some issues in our system. Luckily this was a staging environment, so they were accumulating unnoticed, but when we tried to load the dashboard it was incredibly slow to perform any action. This issue did not appear to affect the creation or execution of jobs themselves, thankfully, but our monitoring tooling is showing that the top 3 slowest queries in our system currently are all Hangfire dashboard / cleanup related:

SELECT j . id Id, j . invocationdata InvocationData, j . arguments Arguments, j . createdat CreatedAt, j . expireat ExpireAt, ? FetchedAt, j . statename StateName, s . reason StateReason, s . data StateData FROM hangfire . job j LEFT JOIN hangfire . state s ON j . stateid = ? . id WHERE j . statename = @StateName ORDER BY j . id DESC LIMIT @Limit OFFSET @Offset

SELECT statename State, COUNT ( id ) Count FROM hangfire . job WHERE statename IS NOT ? GROUP BY statename SELECT COUNT ( * ) FROM hangfire . server SELECT SUM ( value ) FROM ( SELECT SUM ( value ) FROM hangfire . counter WHERE key = ? UNION ALL SELECT SUM ( value ) FROM hangfire . aggregatedcounter WHERE key = ? ) c SELECT SUM ( value ) FROM ( SELECT SUM ( value ) FROM hangfire . counter WHERE key = ? UNION ALL SELECT SUM ( value ) FROM hangfire . aggregatedcounter WHERE key = ? ) c SELECT COUNT ( * ) FROM hangfire . set WHERE key = ?

DELETE FROM hangfire . job WHERE id IN ( SELECT id FROM hangfire . job WHERE expireat < NOW ( ) LIMIT ? )

Perhaps there are missing indexes, or index adjustments needed to handle the situation where there are large numbers of failed jobs?

We have wiped out the current state on our staging server for now, but we have backups for that time period if you need any more specific information.

Thanks in advance for any help you can provide and for all you do to maintain this library - it's greatly appreciated!

@azygis
Copy link
Collaborator

azygis commented Sep 19, 2023

Would like to have the backup of hangfire tables, if you can provide them. Could help a lot to not create imaginary random values.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants