
[Improvement] Avoid unnecessary retry in some access checker when using DelegationShuffleManager #151

Closed
zuston opened this issue Aug 9, 2022 · 14 comments


@zuston
Member

zuston commented Aug 9, 2022

The retry mechanism was introduced by #127. But some access checkers have no need to retry, like the candidates checker, while the health checker may need it.

So I think we should introduce a NON_TRANSIENT_ACCESS_DENIED result to skip the retry for such checkers and reduce wasted time.
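A minimal sketch of what such a result could look like (illustrative names only, not the actual Uniffle code). The key point is that a non-transient denial tells the client that waiting and retrying cannot change the answer:

```java
// Illustrative sketch only, not the actual Uniffle code.
public enum AccessStatus {
  ACCESS_GRANTED,              // client may use the remote shuffle service
  ACCESS_DENIED,               // transient denial (e.g. cluster overloaded): retry may help
  NON_TRANSIENT_ACCESS_DENIED  // permanent denial (e.g. app not in candidates list): don't retry
}
```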

@zuston
Member Author

zuston commented Aug 9, 2022

cc @jerqi

@jerqi
Contributor

jerqi commented Aug 9, 2022

@smallzhongfeng What do you think?

@smallzhongfeng
Contributor

smallzhongfeng commented Aug 9, 2022

I think this is a matter of parameter settings. If retry interval * retry times is greater than the heartbeat interval at which the server reports to the coordinator, the check of AccessCandidatesChecker is meaningful. If retry interval * retry times is less than the server's heartbeat interval, can this be solved by setting the number of retries to 0? @zuston
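To make the timing argument concrete with made-up numbers: a 2 s retry interval and 3 retries keep the client asking for 6 s, so if servers heartbeat to the coordinator every 5 s, the coordinator's view can change inside the retry window and retrying can pay off. A sketch of the comparison:

```java
// Illustrative numbers only; the real values come from configuration.
public class RetryWindowCheck {
  public static void main(String[] args) {
    long retryIntervalMs = 2_000; // client-side interval between access checks
    int retryTimes = 3;           // client-side retry count
    long heartbeatMs = 5_000;     // server -> coordinator heartbeat period

    long retryWindowMs = retryIntervalMs * retryTimes; // 6_000 ms
    // If the retry window outlasts a heartbeat, the coordinator's view can
    // change while the client is still retrying; otherwise retryTimes = 0
    // gives the same outcome with less waiting.
    System.out.println(retryWindowMs > heartbeatMs
        ? "retrying can be meaningful"
        : "setting retry times to 0 is equivalent");
  }
}
```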

@zuston
Member Author

zuston commented Aug 9, 2022

Firstly, I think your PR is meaningful for the cluster load access checker.

But for other access checkers we sometimes needn't retry. So I will introduce a special access result of NON_TRANSIENT_ACCESS_DENIED to avoid retrying in some checkers.

For example, if we enable two checkers:

  1. when the cluster-load check denies but the candidates check passes, we need to retry
  2. when the cluster-load check passes but the candidates check denies, retry isn't needed
  3. when both checks deny, retry isn't needed (combination logic sketched below)
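A sketch of how the coordinator could combine the results (hypothetical names, building on the AccessStatus sketch above). Any non-transient denial wins, so the combined answer tells the client whether retrying can help:

```java
import java.util.List;

// Hypothetical combination logic, not the actual Uniffle implementation.
public class CombinedAccessCheck {
  interface AccessChecker {
    AccessStatus check(String accessId); // AccessStatus from the sketch above
  }

  static AccessStatus checkAll(List<AccessChecker> checkers, String accessId) {
    AccessStatus combined = AccessStatus.ACCESS_GRANTED;
    for (AccessChecker checker : checkers) {
      AccessStatus status = checker.check(accessId);
      if (status == AccessStatus.NON_TRANSIENT_ACCESS_DENIED) {
        return status; // cases 2 and 3 above: no amount of retrying can help
      }
      if (status == AccessStatus.ACCESS_DENIED) {
        combined = status; // case 1 above: transient, the client may retry
      }
    }
    return combined;
  }
}
```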

@smallzhongfeng
Contributor

smallzhongfeng commented Aug 10, 2022

I got your point. If you raise a PR, I'm glad to review it. @zuston WDYT? @jerqi

@jerqi
Contributor

jerqi commented Aug 10, 2022

A little complex. Are there similar mechanisms in other systems? In my opinion, we shouldn't retry when we use the candidates checker; when we only use the health checker, we can retry, and we can scale out our RSS cluster at the same time. I doubt whether we need this mechanism.

@zuston
Member Author

zuston commented Aug 10, 2022

Without this mechanism, how do we handle retries with multiple checkers?

In our internal env, we use multiple checkers, including a health checker (needs to retry) and a customized checker (no need to retry).
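For illustration, a client-side loop under this proposal might look like the following (hypothetical names; they do not match the real DelegationShuffleManager code). Only transient denials are retried; a non-transient denial falls back immediately:

```java
// Hypothetical client-side sketch, not the actual Uniffle client.
public class AccessWithRetry {
  interface CoordinatorClient {
    AccessStatus checkAccess(String accessId); // AccessStatus from the sketch above
  }

  static boolean tryAccess(CoordinatorClient client, String accessId,
      int retryTimes, long retryIntervalMs) throws InterruptedException {
    for (int attempt = 0; attempt <= retryTimes; attempt++) {
      AccessStatus status = client.checkAccess(accessId);
      if (status == AccessStatus.ACCESS_GRANTED) {
        return true;  // use the remote shuffle service
      }
      if (status == AccessStatus.NON_TRANSIENT_ACCESS_DENIED) {
        return false; // e.g. the customized checker denied: fall back immediately
      }
      if (attempt < retryTimes) {
        Thread.sleep(retryIntervalMs); // transient denial: wait, then retry
      }
    }
    return false; // retries exhausted: fall back to the local shuffle manager
  }
}
```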

@jerqi
Contributor

jerqi commented Aug 10, 2022

> Without this mechanism, how do we handle retries with multiple checkers?
>
> In our internal env, we use multiple checkers, including a health checker (needs to retry) and a customized checker (no need to retry).

You can choose not to retry.

@zuston
Member Author

zuston commented Aug 10, 2022

> > Without this mechanism, how do we handle retries with multiple checkers? In our internal env, we use multiple checkers, including a health checker (needs to retry) and a customized checker (no need to retry).
>
> You can choose not to retry.

No retry is OK. But that will not solve the problem described in issue #127.

@smallzhongfeng
Contributor

smallzhongfeng commented Aug 10, 2022

Maybe you could choose not to retry when you use multiple checkers, because in your description the multiple-checkers scenario seems to depend mostly on the result of the candidates checker, so retrying there is not meaningful enough. But the default checker is only AccessClusterLoadChecker, and #127 that I proposed is mainly adapted to this checker; when you use it, you can choose to retry.

@zuston
Member Author

zuston commented Aug 10, 2022

But when having multiple checkers and the apps are in the candidates list, these apps still need to retry.

@zuston
Member Author

zuston commented Aug 15, 2022

So do we need this feature? cc @jerqi

@jerqi
Contributor

jerqi commented Aug 15, 2022

> So do we need this feature? cc @jerqi

I don't think we need such a complex retry mechanism ...

@zuston
Member Author

zuston commented Aug 17, 2022

OK. Closing it.

@zuston zuston closed this as completed Aug 17, 2022