Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix multiple retries despite planner_max_retries=0 (#287) #288

Merged
merged 3 commits into from
Jan 17, 2022
Merged

Fix multiple retries despite planner_max_retries=0 (#287) #288

merged 3 commits into from
Jan 17, 2022

Conversation

ChristofDubs
Copy link
Collaborator

fix for #287

As described in the issue, if planner_max_retries is set to 0 and makePlan() fails before the patience is exceeded, the state should be set to NO_PLAN_FOUND immediately, and no retries should occur (just like in the case where planner_patience is set to 0).

I also added a test for this fix (and they fail without the fix), and took the opportunity to add a few other cases that didn't have tests yet.

@corot corot requested review from dorezyuk and corot December 25, 2021 09:52
Copy link
Collaborator

@corot corot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm; I have added the explanation about how the different combinations on max_retries and patience work:
http://wiki.ros.org/move_base_flex#:~:text=mbf_simple_nav%20for%20details.-,Common%20Parameters,-~%3Cname%3E/planner_frequency%20(double

Actually, I have realized that for controller it works as without this PR!
https://github.com/magazino/move_base_flex/blob/53930c17632d45c0362eca14ca6a60cb83cc9ff2/mbf_abstract_nav/src/abstract_controller_execution.cpp#L415
I guess we should change it to make both match and also match to the documented behavior
@dorezyuk , @spuetz , do u agree?

@dorezyuk
Copy link
Contributor

Thanks for the tests and the pr.

I think the base logic we want to implement here is

Patience Retries Effect On Failure
Enabled Enabled Both PAT_EXCEEDED/MAX_RETRIES
Disabled Disabled No retries NO_PLAN_FOUND
Enabled Disabled Patience PAT_EXCEEDED
Disabled Enabled Retries MAX_RETRIES

The only question is if we say the "retry logic" is disabled at max_retries < 0 or max_retries == 0. Both is fine for me, but I thought it might be confusing if we disable the "patience logic" at zero and the "retry logic" at max_retries < 0. On one hand the max_retries < 0 would be more consistent with move_base. On the other hand - move_base does not support the disabling of the "patience logic" - so they don't have this inconsistency... But if you guys feel its more intuitive to disable the "retry logic" at max_retries < 0, I'm fine with that.

I don't entirely understand why we should return NO_PLAN_FOUND if we reach out max-retries (if its set to zero). For me this seems rather counter-intuitive.

The biggest problem with this pr is that it creates a dead lock for the second row in the table above. If we say "retry logic" is disabled at max_retries < 0 and if the also disable the patience (patience.isZero == True), the code will never exit the loop

@ChristofDubs
Copy link
Collaborator Author

ChristofDubs commented Dec 29, 2021

The only question is if we say the "retry logic" is disabled at max_retries < 0 or max_retries == 0. Both is fine for me, but I thought it might be confusing if we disable the "patience logic" at zero and the "retry logic" at max_retries < 0. On one hand the max_retries < 0 would be more consistent with move_base. On the other hand - move_base does not support the disabling of the "patience logic" - so they don't have this inconsistency... But if you guys feel its more intuitive to disable the "retry logic" at max_retries < 0, I'm fine with that.

I'd be in favor of infinite attempts for the max_retries < 0 case:

  • The user should be able to configure exactly one planning attempt (i.e. one attempt, no retries)
  • I would expect that based on the parameter's name, max_retries == 0 should mean 0 retries, which means one attempt and zero retries.

It seems that the parameter name max_retries is what is causing the inconsistency to begin with: If it was instead called max_attempts, then max_attempts = 1 would mean no retries (max_retries == 0) and max_attempts = 0 would be infinite retries (max_retries = -1); this would then be consistent with patience disabling.

I don't entirely understand why we should return NO_PLAN_FOUND if we reach out max-retries (if its set to zero). For me this seems rather counter-intuitive.

It just returns NO_PLAN_FOUND if it failed and it is not allowed to do any retries (which it already did before this PR); I was assuming that MAX_RETRIES should not be returned since it didn't do any retries:

Patience Retries Effect On Failure
Enabled > 0 Retries + Patience PAT_EXCEEDED / MAX_RETRIES
Disabled > 0 Retries MAX_RETRIES
Enabled 0 1 attempt + Patience PAT_EXCEEDED / NO_PLAN_FOUND
Disabled 0 1 attempt NO_PLAN_FOUND
Enabled < 0 Patience PAT_EXCEEDED
Disabled < 0 infinite retries -

My personal opinion is that the two NO_PLAN_FOUND and MAX_RETRIES return codes are a redundant to begin with; I just based my changes on what was already there to not break any existing code.

Assuming we used NO_PLAN_FOUND regardless of how many retries were configured, then above table would simplify to:

Patience Retries Effect On Failure
Enabled >= 0 finite attempts + Patience PAT_EXCEEDED / NO_PLAN_FOUND
Disabled >= 0 finite attempts NO_PLAN_FOUND
Enabled < 0 Patience PAT_EXCEEDED
Disabled < 0 infinite attempts -

The biggest problem with this pr is that it creates a dead lock for the second row in the table above. If we say "retry logic" is disabled at max_retries < 0 and if the also disable the patience (patience.isZero == True), the code will never exit the loop

Setting max_retries < 0 and patience == 0 would already have this effect without this PR? So it's not a new problem that was introduced.

I also think this is a valid configuration, and exactly the behavior one would expect by giving infinite patience and infinite retires: It means the planner should never give up. I wouldn't recommend such a configuration, but it might be what the user wants if there are no alternative goals to target and no recovery behaviors. Or, the user has his own logic of when to cancel and re-send the action.

@dorezyuk
Copy link
Contributor

Ah ok, then the misunderstanding I had was that I considered the "infinite retries" as a bug.. So the question we should answer is if we want to allow something like a infinite retry for the user... @corot @spuetz What do you think?

@corot
Copy link
Collaborator

corot commented Jan 11, 2022

Ah ok, then the misunderstanding I had was that I considered the "infinite retries" as a bug.. So the question we should answer is if we want to allow something like a infinite retry for the user... @corot @spuetz What do you think?

Inf retries is a valid option; 👍 for merging this as it is (and do the same for exe_path, see my comment above)

@spuetz
Copy link
Member

spuetz commented Jan 11, 2022

Ah ok, then the misunderstanding I had was that I considered the "infinite retries" as a bug.. So the question we should answer is if we want to allow something like a infinite retry for the user... @corot @spuetz What do you think?

Yes, inf retries as an option makes sense... as the action and by that the execution can be canceled.

@corot corot requested a review from spuetz January 16, 2022 11:50
@corot
Copy link
Collaborator

corot commented Jan 16, 2022

Yes, inf retries as an option makes sense... as the action and by that the execution can be canceled.

So we all agree with this solution? if so,,, @spuetz, @dorezyuk, please approve

@spuetz spuetz merged commit 8bd407d into naturerobots:master Jan 17, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants