Dark Reaper isn’t exiting early if there are no quarantined replicas #3952

Closed
dchristidis opened this issue Aug 25, 2020 · 4 comments

@dchristidis
Contributor

Motivation

This regression was introduced with commit f557841. If there are no quarantined replicas, list_rses() returns an empty list, but the execution continues. This leads to the Dark Reaper crashing with errors such as this:

Exception in thread Worker: 0, Total_Workers: 20:
Traceback (most recent call last):
  File "/usr/lib64/python2.7/threading.py", line 812, in __bootstrap_inner
    self.run()
  File "/usr/lib64/python2.7/threading.py", line 765, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/usr/lib/python2.7/site-packages/rucio/daemons/reaper/dark_reaper.py", line 78, in reaper
    logging.info('Starting Dark Reaper %s-%s: Will work on RSEs: %s', worker_number, total_workers, ', '.join([rse['rse'] for rse in rses]))
TypeError: 'NoneType' object is not iterable

Modification

@patrick-austin
Contributor

I've changed the logic so that it exits early when quarantined_replica.list_rses() evaluates to False, but I'm not sure how to make the PR for this. I also made another change (#3943) that removed the --all-rses option (the Boolean CLI option), because it was being overwritten with the list of RSEs before being evaluated. That change was tagged as a feature, so it is in the master branch but not in 1.23.4, whereas the change here (#3888) was a patch and so is in 1.23.4.

So if I create a PR for this fix against master, it will conflict with the more recent #3943 PR, which changed the same area of code. How can I make the fix for 1.23.4 only? For reference, I don't think this should be an issue with #3943, as that should check whether there are any RSEs and exit early if needed.
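
For readers following along, here is a minimal sketch of the kind of early-exit guard described above. It is not the actual Rucio patch: `select_rses`, `reaper_once`, and the `list_quarantined_rses` callable are hypothetical names standing in for the daemon code and for quarantined_replica.list_rses(). The only assumption is that the lookup may return a list, an empty list, or None.

```python
import logging


def select_rses(list_quarantined_rses):
    """Return the RSEs to work on, or None if there is nothing to do.

    `list_quarantined_rses` is a hypothetical callable standing in for
    quarantined_replica.list_rses().
    """
    rses = list_quarantined_rses()
    if not rses:
        # Covers both None and [] so the caller can stop before iterating.
        logging.info('Dark Reaper: no RSEs with quarantined replicas, exiting.')
        return None
    return rses


def reaper_once(list_quarantined_rses, worker_number=0, total_workers=1):
    """Run one (illustrative) pass; return False if it exited early."""
    rses = select_rses(list_quarantined_rses)
    if rses is None:
        return False
    # Safe to iterate here: rses is a non-empty list of dicts with an 'rse' key.
    logging.info('Starting Dark Reaper %s-%s: Will work on RSEs: %s',
                 worker_number, total_workers,
                 ', '.join(rse['rse'] for rse in rses))
    return True
```

Testing truthiness with `not rses` rather than comparing against an empty list handles the None case from the traceback as well, so the `', '.join(...)` line is never reached without something to iterate over.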

@bziemons
Member

bziemons commented Aug 27, 2020

@patrick-austin if your PR against master fails the merging/testing against the latest release branch, @bari12 will see if he can easily solve the conflicts and otherwise you open a second PR against the release branch. Just wait on the go from Martin for that.

Edit: if it is a fix only for a supported release and not master, you can directly open a PR against the release branch.

@patrick-austin
Contributor

Thanks for the advice. It should only be needed for the 1.23.4 release, so I've rebased and will make the PR against 1.23-LTS.

@bari12
Member

bari12 commented Aug 31, 2020

👍 Thanks @patrick-austin and @bziemons 😃

bari12 added a commit that referenced this issue Sep 1, 2020
…s_are_None

Consistency checks, bug: dark reaper early exit, #3952
bari12 added this to the 1.23.5 milestone Sep 1, 2020
bari12 closed this as completed Sep 1, 2020