
Allow to remove offline workers from dashboard #852

Merged · 10 commits · Jun 12, 2020

Conversation

@bstiel (Contributor) commented Nov 1, 2018

Offline workers create a lot of noise in the Flower dashboard when running Celery in a containerised environment like Kubernetes. See also #840.

I've added a new option purge_offline_workers (--purge_offline_workers / FLOWER_PURGE_OFFLINE_WORKERS) that removes offline workers from the flower dashboard.

purge_offline_workers is optional and defaults to False so that it does not impact default behaviour.
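For illustration, a minimal sketch of enabling the option via Flower's configuration file, assuming the documented flowerconfig.py convention; the CLI flag and env var named above are the equivalent channels:

# flowerconfig.py -- sketch only; assumes Flower loads this file from
# the working directory, per its config-file convention.
# Equivalent to passing --purge_offline_workers on the command line
# or setting FLOWER_PURGE_OFFLINE_WORKERS in the environment.
purge_offline_workers = True  # optional; defaults to False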

@dejlek commented Nov 6, 2018

Awesome!

@jheld (Contributor) commented Nov 16, 2018

Ideally there would be a way to target a single worker or a group of workers, as opposed to all offline workers. But this is still great. Unfortunately, that would be a departure from the implementation here, which isn't programmatic.

@jheld (Contributor) commented Nov 16, 2018

I wonder if there could be a middle ground on forcing the clearing, between these two PRs: #709's programmatic logic, and this one's "turn it on all the time if you want it" logic.

@dejlek commented Nov 20, 2018

I understood this PR as a config option for what you refer to as "turn it on all the time if you want it". For "on demand" cleanup you would need to modify the frontend as well, I guess, so it would require another PR...

@dejlek left a comment

Looks good to me.

@aidaerfanian commented

I am using the 'mher/flower' image in a Kubernetes pod. I'd like to use this new option to remove my offline workers from my Flower dashboard. But I am not sure how to use the option. I've done the following, but I am still seeing all offline and online workers.

spec:
  containers:
  - command:
    - celery
    - flower
    - --purge_offline_workers=True
    image: mher/flower
    imagePullPolicy: Always
    name: flower

@jheld (Contributor) commented Feb 5, 2019

Correct. The internal logic for that command is just to "get" info from the connected workers. It doesn't actually purge the original data set.

@aidaerfanian commented

Well, I am looking for a way to filter the offline workers out of the dashboard. Currently, I have to look through a long list (mostly offline workers) to find my online workers! Is there any solution to this?

@jheld (Contributor) commented Feb 7, 2019

You can sort by Active, but otherwise there is no direct way to simply remove them from the list at this point.

@schnie commented Mar 13, 2019

Thanks for this @bstiel, this is exactly what we're looking for. I just tested it out in Kubernetes. One thing I noticed: I had a group of 5 worker pods that I restarted. They all disappeared from Flower, then 5 new workers showed up as expected. Perfect.

A few minutes later, the old worker pods ended up showing back up after they had finished processing their tasks during their "warm shutdown" in Celery. I'm guessing that when they reported back with their task statuses, Flower somehow picked them back up and started showing them again.

@jj-ookla commented

Is there any movement on this PR?

@dejlek commented Aug 6, 2019

I really hope this gets merged in the foreseeable future! @mher, what needs to be done for this to be merged?

@Jamim (Contributor) left a comment

Hello @bstiel,

Thank you for implementing a great feature!
Could you please make some minor changes and resolve conflicts?

Regards!

@@ -1,4 +1,4 @@
-======================================
+update_workers======================================

@Jamim (Contributor) commented on the diff:

I believe this line was changed by mistake. Wasn't it?

Suggested change:
-update_workers======================================
+======================================

For some reason, GitHub renders this suggestion incorrectly.


from tests.unit import AsyncHTTPTestCase
from tests.unit.utils import task_succeeded_events, task_failed_events
from tests.unit.utils import HtmlTableParser

if sys.version_info >= (2, 7):
@Jamim (Contributor) commented Aug 15, 2019

sys.version_info >= (2, 7) is False only for Python 2.6 and older, and those versions are not supported.
Also, I'm quite sure that from unittest.mock import patch, PropertyMock doesn't work for them anyway.
So you should remove the try/except and keep only from mock import patch, PropertyMock.
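A sketch of the suggested simplification (the surrounding test code is assumed):

# The standalone `mock` package provides patch and PropertyMock on both
# Python 2.7 and Python 3, so the version-guarded try/except import can
# collapse into one unconditional import:
from mock import patch, PropertyMock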

@Jamim (Contributor) commented Sep 17, 2019

ping @bstiel

@dejlek commented Oct 24, 2019

@mher - could you please review this and let us know what is stopping it from being merged, so that we can fix any remaining issues? I really need this in Flower. At the moment I am maintaining my own branch with this PR merged in so that I can use it in production (otherwise Flower is useless to us).

@naitik-aubergine commented

@dejlek @mher, is there any reason we are not merging this PR? It would really help us. Thanks!

@rns4731 commented Jan 24, 2020

+1 it'd be great to have this PR merged @mher

@jheld (Contributor) commented Feb 6, 2020

There are a number of major to critical PRs open against flower, many of which it seems are being maintained by developers for their own projects/companies.

There are probably 3-4 PRs of this type (removal of offline workers).

@mher it looks like there are occasional light-weight PRs being merged into master. PyPI is lagging. From reading (and creating some) issues & PRs on this project for close to 2 years, it seems that you and the project could use additional maintainers.

Merging stable (and correct) and maintainable changes is critical for a system like this (one big breakage could make the app nearly useless for many people), but if there are PRs that are in good standing, it seems problematic that they do not get merged.

A couple of others and I have offered to help with triaging, merging, and releasing. Definitely still interested.

Please let us know how we can help.

@jheld (Contributor) commented Apr 18, 2020

@bstiel can you rebase?

@jheld (Contributor) commented Apr 24, 2020

@mher if the conflicts and other discussions are resolved (this PR or one based on it), would you merge this?

I would be up for having an optional timer value (remove an offline worker after X seconds), but I'd rather see this merged and then build that in as a follow-up PR.

@bstiel (Contributor, Author) commented Apr 24, 2020

Apologies, I've only just seen the rebase question. I'll do that over the weekend.

@engin-bulut commented

@bstiel, can you rebase? We really need this feature :)

@belek commented May 4, 2020

Need this feature ASAP!

bstiel and others added 2 commits on May 4, 2020, co-authored by Aliaksei Urbanski <mimworkmail@gmail.com>.
@bstiel (Contributor, Author) commented May 4, 2020

Apologies for the slow response. I've merged master in and implemented @Jamim's code-styling suggestions. Please let me know if you need anything else from me.

@jheld (Contributor) commented May 4, 2020

@mher can you please review?

@mher (Owner) commented May 8, 2020

I think this PR needs a little rework. It would be better to filter offline workers older than some configurable interval.

If offline-worker-interval is not None and now() - worker.lastHeartbeatTime > offline-worker-interval, the worker will be invisible; otherwise it will be visible.
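In Python, the proposed check could look like the sketch below; offline_worker_interval and last_heartbeat are illustrative names, not Flower's actual attributes:

import time

def is_visible(last_heartbeat, offline_worker_interval=None):
    """Decide whether a worker should appear on the dashboard.

    last_heartbeat: UNIX timestamp of the worker's last heartbeat.
    offline_worker_interval: seconds of silence after which an offline
        worker is hidden; None disables filtering (current behaviour).
    """
    if offline_worker_interval is None:
        return True  # no filtering configured: always visible
    return time.time() - last_heartbeat <= offline_worker_interval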

@jheld (Contributor) commented May 8, 2020

@mher totally agree.

I had mentioned something similar to that here a couple weeks ago.

@bstiel do you think you will be able to implement the proposed improvement within the next couple weeks?

I still think there is merit in some sort of removal of old workers, but perhaps that could be a downstream PR, again using time (even an API would be a good stopgap, and there is at least one open PR which implements said feature).

Related (I don't think required at this point) note:

As such, I agree with Mher that time-based filtering is less brutal for this kind of feature. I might add one more request: allow passing a query param on the URL to enable/disable this per request. That way, if there is a bug or confusion on a team using this, they can simply adjust the URL.

Again, I see this being a good candidate for a downstream PR depending on how much time you have.

@bstiel (Contributor, Author) commented May 11, 2020

@jheld - yes, I will implement the proposed improvement this month.

@dejlek commented May 14, 2020

@bstiel - Thanks for your patience! A year and a half has passed since the PR was created...

@jheld (Contributor) commented Jun 5, 2020

@bstiel, a friendly reminder that we're coming up on a month (life gets busy, totally understand). Do you think you'll have time to finish this within the upcoming week?

@bstiel (Contributor, Author) commented Jun 10, 2020

I've just pushed a new commit. I've refactored the --purge_offline_workers / FLOWER_PURGE_OFFLINE_WORKERS option/env var so that it now specifies the number of seconds since the last heartbeat after which an offline worker is removed from the dashboard, e.g. --purge_offline_workers=10. If --purge_offline_workers is omitted, it defaults to the current behaviour, i.e. offline workers are not removed from the dashboard.
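A hedged sketch of the refactored semantics applied to a whole worker mapping; the data shape here is an assumption for illustration, not Flower's actual internals:

import time

def purge_offline(workers, purge_offline_workers=None):
    """Drop workers whose last heartbeat is older than the threshold.

    workers: dict mapping worker name -> {'heartbeat': unix_timestamp}.
    purge_offline_workers: threshold in seconds (e.g. 10); None keeps
        every worker, preserving the pre-existing behaviour.
    """
    if purge_offline_workers is None:
        return workers
    now = time.time()
    return {
        name: info
        for name, info in workers.items()
        if now - info['heartbeat'] <= purge_offline_workers
    }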

@jheld (Contributor) left a comment

Looks good! This gives a good standard improvement for the cloud world, and sets us up to improve later with more advanced configuration/usage if we need to.

Thank you @bstiel!

@jheld (Contributor) commented Jun 10, 2020

@mher can you review again?

@bstiel (Contributor, Author) commented Jun 11, 2020

I added the line break at the end of the CONTRIBUTORS file. Please let me know if you need anything else from me. It would be great to finally get this PR closed. Thanks, guys!

@encryptblockr commented

So what env variable can we use to control removing the offline workers? Are there any docs on how to use this in a .env file?

@encryptblockr commented

> I've just pushed a new commit. I've refactored the --purge_offline_workers / FLOWER_PURGE_OFFLINE_WORKERS option/env var so that it now specifies the number of seconds since the last heartbeat after which an offline worker is removed from the dashboard, e.g. --purge_offline_workers=10. If --purge_offline_workers is omitted, it defaults to the current behaviour, i.e. offline workers are not removed from the dashboard.

Where is the documentation on how to use this?

@encryptblockr commented Sep 2, 2021

I'm guessing, in a .env file:

FLOWER_PURGE_OFFLINE_WORKERS=10

or

--purge_offline_workers=10

Is this correct?

wmcbroomd2d pushed a commit to Draft2Digital/flower that referenced this pull request Jul 29, 2024
* New option (--purge_offline_workers / FLOWER_PURGE_OFFLINE_WORKERS) to remove offline workers from the dashboard

* Updated docs

* Added unittest for purge_offline_workers option

* Fix tests for Python 2.7

* Added myself to list

* Update tests/unit/views/test_dashboard.py

Co-authored-by: Aliaksei Urbanski <mimworkmail@gmail.com>

* Update docs/config.rst

Co-authored-by: Aliaksei Urbanski <mimworkmail@gmail.com>

* Refactor purging of offline workers

* Add trailing line break

Co-authored-by: Aliaksei Urbanski <mimworkmail@gmail.com>