
Remove bull jobs after processing #1040

Merged: dmanjunath merged 6 commits into master from dm-bull-remove-jobs on Nov 6, 2020

Conversation

@dmanjunath dmanjunath (Contributor) commented Nov 5, 2020

Trello Card Link

https://trello.com/c/Hb52wyQl/1665-remove-bull-jobs-after-processing

Description

Bull jobs never get cleaned up from Redis, so they slowly eat away at system memory. This change removes each job after it has been processed.
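
A minimal sketch of what the per-job cleanup looks like, assuming Bull's removeOnComplete / removeOnFail job options; the queue name, Redis URL, and payload below are illustrative placeholders, not the actual creator node code:

const Bull = require('bull')

const rehydrateQueue = new Bull('rehydrateIpfs', 'redis://127.0.0.1:6379')

// removeOnComplete / removeOnFail tell Bull to delete the job's Redis keys
// once the job finishes, instead of leaving them around indefinitely.
rehydrateQueue.add(
  { multihash: 'Qm...' }, // illustrative payload
  { removeOnComplete: true, removeOnFail: true }
)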

Services

  • Discovery Provider
  • Creator Node
  • Identity Service
  • Libs
  • Contracts
  • Service Commands
  • Mad Dog

Does it touch a critical flow like Discovery indexing, Creator Node track upload, Creator Node gateway, or Creator Node file system?


  • ✅ Nope

How Has This Been Tested?

Locally, checked Redis before uploading a track to make sure it was empty, then uploaded the track and hit the gateway to make sure the IPFS rehydrate job doesn't leave anything behind in Redis.

Before any actions

127.0.0.1:6379> KEYS *
1) "bull:creator-node-state-machine-1604608258933:stalled-check"
2) "bull:transcoding-queue:stalled-check"
3) "bull:rehydrateIpfs:stalled-check"
4) "bull:image-processing-queue:stalled-check"
5) "bull:creator-node-sync-queue-1604608258934:stalled-check"

After upload

127.0.0.1:6379> KEYS *
 1) "bull:creator-node-sync-queue-1604608258934:id"
 2) "userReqLimiter::ffff:172.18.0.1"
 3) "rate:/users::ffff:172.18.0.1"
 4) "bull:transcoding-queue:stalled-check"
 5) "bull:image-processing-queue:stalled-check"
 6) "rate:/sync::ffff:172.18.0.1"
 7) "bull:transcoding-queue:1"
 8) "imageReqLimit::ffff:172.18.0.1"
 9) "bull:image-processing-queue:id"
10) "bull:transcoding-queue:active"
11) "bull:transcoding-queue:id"
12) "rate:/audius_users::ffff:172.18.0.1"
13) "bull:image-processing-queue:stalled"
14) "bull:creator-node-sync-queue-1604608258934:1"
15) "bull:creator-node-sync-queue-1604608258934:1:lock"
16) "bull:image-processing-queue:1"
17) "SESSION.bhvwyKcnwNTMCMGdgzjw2UU4G7uFAqxnv_LsCGqSm3pPB-LYOtK73g"
18) "bull:image-processing-queue:1:lock"
19) "trackReqLimiter::ffff:172.18.0.1"
20) "bull:transcoding-queue:2"
21) "bull:rehydrateIpfs:stalled-check"
22) "rate:/users/login/challenge::ffff:172.18.0.1"
23) "bull:transcoding-queue:stalled"
24) "bull:transcoding-queue:2:lock"
25) "bull:transcoding-queue:1:lock"
26) "bull:creator-node-sync-queue-1604608258934:active"
27) "rate:/track_content::ffff:172.18.0.1"
28) "bull:image-processing-queue:active"
29) "rate:/audius_users/metadata::ffff:172.18.0.1"
30) "rate:/image_upload::ffff:172.18.0.1"

After secondary sync

127.0.0.1:6379> KEYS *
 1) "bull:creator-node-sync-queue-1604608258934:id"
 2) "userReqLimiter::ffff:172.18.0.1"
 3) "rate:/tracks/metadata::ffff:172.18.0.1"
 4) "rate:/users::ffff:172.18.0.1"
 5) "rate:/sync::ffff:172.18.0.1"
 6) "imageReqLimit::ffff:172.18.0.1"
 7) "bull:image-processing-queue:id"
 8) "bull:transcoding-queue:id"
 9) "rate:/audius_users::ffff:172.18.0.1"
10) "bull:image-processing-queue:stalled"
11) "SESSION.bhvwyKcnwNTMCMGdgzjw2UU4G7uFAqxnv_LsCGqSm3pPB-LYOtK73g"
12) "trackReqLimiter::ffff:172.18.0.1"
13) "rate:/users/login/challenge::ffff:172.18.0.1"
14) "bull:transcoding-queue:stalled"
15) "rate:/tracks::ffff:172.18.0.1"
16) "rate:/track_content::ffff:172.18.0.1"
17) "rate:/audius_users/metadata::ffff:172.18.0.1"
18) "rate:/image_upload::ffff:172.18.0.1"

Although some aggregate keys are still left in Redis, the queues are definitely cleaning up after themselves more than before.

@SidSethi SidSethi (Contributor) left a comment

big dubs

@piazzatron piazzatron (Contributor) left a comment

It looks like Bull lets you pass removeOnComplete as a default option to the queue constructor, which means we don't have to worry about accidentally forgetting to pass it to some call to add. Maybe a bit safer?

OptimalBits/bull#842

(Randomly, the original GH issue for this + PR was by the React "celebrity dev" Max Stoiber!)
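
A rough sketch of the queue-level default described above, assuming Bull's defaultJobOptions queue option; the queue name and Redis URL are placeholders:

const Bull = require('bull')

const transcodingQueue = new Bull('transcoding-queue', 'redis://127.0.0.1:6379', {
  // Every job added to this queue inherits these options, so individual
  // add() calls can't accidentally forget the cleanup flags.
  defaultJobOptions: { removeOnComplete: true, removeOnFail: true }
})

transcodingQueue.add({ fileName: 'track.mp3' }) // illustrative payload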

@piazzatron piazzatron (Contributor) commented

Actually looks like they made this the default behavior of Bull at some point - maybe we just have to upgrade?
atlassian/github-for-jira#337

@dmanjunath dmanjunath (Contributor, Author) commented Nov 5, 2020

@piazzatron I'm trying that and it seems to work, but somehow the test you wrote for manual syncs being triggered first broke when I made it a queue default instead of a job option. Trying to fix that now, which is why this is taking some time.
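
For context, a hedged sketch of the queue-default vs. per-job-option interaction being debugged here; the queue name, payload, and the use of priority for manual syncs are illustrative assumptions, not the actual sync queue code. As far as I know, Bull merges the options passed to add() on top of defaultJobOptions rather than replacing them:

const Bull = require('bull')

const syncQueue = new Bull('creator-node-sync-queue', 'redis://127.0.0.1:6379', {
  defaultJobOptions: { removeOnComplete: true, removeOnFail: true }
})

// Per-job options such as priority can still be passed; the cleanup defaults
// above are merged in by Bull rather than dropped.
syncQueue.add({ wallet: '0xabc', syncType: 'manual' }, { priority: 1 })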

@SidSethi SidSethi self-requested a review November 5, 2020 23:30
@dmanjunath dmanjunath (Contributor, Author) commented

@piazzatron user error. can you look again

@piazzatron piazzatron (Contributor) left a comment

Beautiful

@hareeshnagaraj hareeshnagaraj (Contributor) left a comment

let's get it

@dmanjunath dmanjunath merged commit ea99582 into master Nov 6, 2020
@dmanjunath dmanjunath deleted the dm-bull-remove-jobs branch November 6, 2020 03:26