Conveyor breaks if no scheme is specified #2402

Closed
vingar opened this issue Mar 27, 2019 · 9 comments · Fixed by #2669

Comments

@vingar
Contributor

vingar commented Mar 27, 2019

Motivation

When no scheme is specified in the configuration, the conveyor fails to find a common scheme for the source and destination files, here gsiftp:

2019-03-27 08:51:47,386	3214	CRITICAL	Exception happened when trying to get transfer for request 5d9c6d99ebf04465ad3bef4d8078219c: Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/rucio/core/transfer.py", line 624, in get_transfer_requests_and_source_replicas
    'schemes': __add_compatible_schemes(schemes=[matching_scheme[0]], allowed_schemes=current_schemes),
  File "/usr/local/lib/python2.7/dist-packages/rucio/core/transfer.py", line 923, in __add_compatible_schemes
    if scheme in allowed_schemes:
TypeError: argument of type 'NoneType' is not iterable
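
The failure can be reproduced outside Rucio with a minimal sketch. The function below is a hypothetical stand-in for `__add_compatible_schemes`, not the Rucio implementation; it only assumes, based on the traceback, that `allowed_schemes` arrives as `None` when no scheme is configured.

```python
def add_compatible_schemes(schemes, allowed_schemes):
    """Hypothetical stand-in: keep only the schemes found in allowed_schemes."""
    result = []
    for scheme in schemes:
        # Membership test on None raises TypeError, as in the traceback.
        if scheme in allowed_schemes:
            result.append(scheme)
    return result


def reproduce():
    """Return the TypeError message seen when allowed_schemes is None."""
    try:
        add_compatible_schemes(schemes=['gsiftp'], allowed_schemes=None)
    except TypeError as error:
        return str(error)
    return None


if __name__ == '__main__':
    print(reproduce())
```

With a real list such as `['gsiftp', 'davs']` the same call returns normally, which suggests the missing configuration value is the only trigger.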

The respective rules stay in the STUCK state when asked for reevaluation:

2019-03-27 08:49:23,677	3213	INFO	rule_repairer[0/0]: Repairing rule 32d6775ead104b2c8be50cc946f01782
2019-03-27 08:49:23,749	3213	INFO	Rule 32d6775ead104b2c8be50cc946f01782 [0/0/1] state=STUCK
@bari12
Member

bari12 commented Mar 27, 2019

The conveyor should probably refuse to start, with an error, if no scheme is specified.
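
A fail-fast check along these lines could be sketched as follows. The function name and the assumption that the schemes are configured as a comma-separated string are hypothetical and chosen for illustration only; they are not taken from the Rucio code base.

```python
def validate_configured_schemes(configured):
    """Hypothetical startup check: return the list of configured schemes.

    Assumes `configured` is a comma-separated string or None.
    Raises ValueError when no scheme is configured, so the daemon
    can stop before it picks up any request.
    """
    if configured is None:
        raise ValueError('No transfer scheme configured; refusing to start.')
    schemes = [item.strip() for item in configured.split(',') if item.strip()]
    if not schemes:
        raise ValueError('Transfer scheme option is empty; refusing to start.')
    return schemes


if __name__ == '__main__':
    print(validate_configured_schemes('gsiftp, davs'))
```

Failing at startup makes the misconfiguration visible immediately, instead of surfacing later as a per-request traceback.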

@bari12 bari12 changed the title from "traceback error in conveyor-submitter and judge*" to "Conveyor breaks if no scheme is specified" Mar 27, 2019
@vingar
Contributor Author

vingar commented Mar 27, 2019

I would have expected the conveyor to figure out automatically which schemes/protocols should be used for third-party transfers out of the ones available for the replicas: here these were gsiftp, davs and srm.

Note also the unexpected behaviour of the judge-repairer after the configuration was corrected to use gsiftp.

@bari12
Member

bari12 commented Mar 27, 2019

That's also an option: if no scheme is given, just use all compatible ones.

For the judge-repairer I don't think anything has to be changed. The issue here is in the conveyor, as the request probably stays in its state forever and never gets failed. Thus this is never reported back to the rule. To the judge it looks like the request is still in submission.
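
This fallback could look roughly like the sketch below. The function and its parameters are hypothetical, and "compatible" is simplified here to mean "offered by both the source and the destination"; any real compatibility mapping between different schemes is left out.

```python
def resolve_allowed_schemes(configured, source_schemes, destination_schemes):
    """Hypothetical fallback: choose the schemes usable for a transfer.

    Simplification: a scheme counts as compatible when both the source
    and the destination offer it. If nothing is configured, all such
    schemes are allowed; otherwise only the configured ones.
    """
    common = [s for s in source_schemes if s in destination_schemes]
    if not configured:
        return common
    return [s for s in common if s in configured]


if __name__ == '__main__':
    source = ['gsiftp', 'davs', 'srm']
    destination = ['davs', 'gsiftp']
    print(resolve_allowed_schemes(None, source, destination))
    print(resolve_allowed_schemes(['gsiftp'], source, destination))
```

Because the function always returns a list, a later membership test such as `scheme in allowed_schemes` can no longer hit `None`.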

@vingar
Contributor Author

vingar commented Mar 27, 2019

One observation after such an error: the request stays in the N (NO_SOURCES) state and then disappears. The judge-repairer keeps it in the S state forever.

@bari12
Member

bari12 commented Mar 27, 2019

Is it possible that there is no conveyor-finisher running? The finisher should process the NO_SOURCES ones and mark them as failed.

reqs = request_core.get_next(request_type=[RequestType.TRANSFER, RequestType.STAGEIN, RequestType.STAGEOUT],

@vingar
Contributor Author

vingar commented Mar 27, 2019

The finisher runs too.

@bari12
Member

bari12 commented Mar 27, 2019

Can you track whether the replicas affected by this bug are actually handled by the finisher?
In any case, this has to be fixed in the submitter; then the workflow through the finisher/repairer is fine again.

@vingar
Contributor Author

vingar commented Mar 27, 2019

I'm seeing some messages in the finisher about requeueing, etc. Don't you have enough information to reproduce the problem and investigate further from your side?

@bari12
Member

bari12 commented Mar 27, 2019

I only tested the NO_SOURCES case, which works fine in my finisher; I haven't tested the full error yet. But yes, the information is sufficient 👍

@bari12 bari12 modified the milestones: 1.19.5, 1.19.6 Apr 2, 2019
@bari12 bari12 modified the milestones: 1.19.6, 1.19.7 Apr 16, 2019
@bari12 bari12 modified the milestones: 1.19.7, 1.19.8 Apr 29, 2019
@bari12 bari12 modified the milestones: 1.19.8, 1.19.9 May 13, 2019
@bari12 bari12 modified the milestones: 1.19.9, 1.20.0 "Wonder Donkey" LTS, 1.20.0, 1.20.1 May 27, 2019
@bari12 bari12 removed this from the 1.20.1 milestone Jun 18, 2019
@bari12 bari12 added this to the 1.20.2 milestone Jun 18, 2019
cserf added a commit to cserf/rucio that referenced this issue Jun 19, 2019
bari12 added a commit that referenced this issue Jun 20, 2019
…scheme_is_specified

Conveyor breaks if no scheme is specified : Closes #2402