Make RADIUS timeout value configurable #3185

aLTeReGo-SWI · 2023-05-22T15:15:40Z

⚠️ Please verify that this feature request has NOT been suggested before.

I checked and didn't find similar feature request

🏷️ Feature Request Type

UI Feature

🔖 Feature description

The RADIUS monitor appears to have a hard-coded 2500 ms timeout, though it may instead be two 1-second timeouts and another 30-second timeout.

We have instances where RADIUS requests can take as much as 10 seconds to respond. It's not performant, but it isn't 'down' either. Making this value configurable would alleviate a lot of the false positives I'm seeing.

✔️ Solution

Add a new UI element to the monitor that allows a user-defined integer timeout value to be entered.
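As a rough illustration of what validating such an input might involve, here is a minimal sketch. The function name `parseTimeoutSeconds`, the choice of seconds as the unit, and the 1 to 120 second bounds are all assumptions made for this example; none of them come from the project.

```typescript
// Hypothetical bounds, chosen only for illustration.
const MIN_TIMEOUT_SECONDS = 1;
const MAX_TIMEOUT_SECONDS = 120;

type TimeoutParseResult =
    | { ok: true; seconds: number }
    | { ok: false; error: string };

// Turn the raw text from a form field into a whole number of seconds.
function parseTimeoutSeconds(raw: string): TimeoutParseResult {
    const trimmed = raw.trim();
    // Accept digits only, so "2.5", "-3" and "10s" are all rejected.
    if (!/^\d+$/.test(trimmed)) {
        return { ok: false, error: "Timeout must be a whole number of seconds" };
    }
    const seconds = Number(trimmed);
    if (seconds < MIN_TIMEOUT_SECONDS || seconds > MAX_TIMEOUT_SECONDS) {
        return {
            ok: false,
            error: `Timeout must be between ${MIN_TIMEOUT_SECONDS} and ${MAX_TIMEOUT_SECONDS} seconds`,
        };
    }
    return { ok: true, seconds };
}
```

With these assumed bounds, an input of `"10"` would be accepted, while `"2.5"` and `"0"` would be rejected with an error message.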

❓ Alternatives

Increase the hard-coded timeout values. Not a good solution, but it is an alternative.

📝 Additional Context

No response

CommanderStorm · 2023-05-22T15:26:51Z

Could you explain further how such a high ping could happen?
For a user who has nothing to do with RADIUS: is such abnormally high latency expected behaviour?

I71d0r · 2023-05-23T07:19:34Z

@CommanderStorm for basic scenarios the Radius will verify access quickly using internal means.
However, the Radius implementation allows more advanced scenarios to verify identity against external services like Active Directory, Okta, Google Workspace etc. Typically such information would be cached, but the cache may be expired or invalidated on purpose.
This may cause spikes that are evaluated as failures, although the requests would eventually succeed after a delay. To distinguish whether the service is sluggish or not working at all, fine-tuning the request timeout is essential to minimize false positives.

CommanderStorm · 2023-05-23T08:24:20Z

So basically the avg number you would expect for latency is below the current value, right?

Is the use case you are talking about not better solved via the Retries option?
Would what you say make a good help text in the monitor setup to distinguish between Timeout and Retries?

aLTeReGo-SWI · 2023-05-24T16:35:55Z

@I71d0r is 100% spot on. While most RADIUS requests should take less than 2.5 seconds to complete, there are instances where this simply takes more time. It's not 'Down', as the response is eventually sent. Sometimes that takes as much as 10 seconds, but this is normal and expected behavior, even if it's not optimal.

That means you shouldn't receive an alert for something that is normal/expected behavior. That's what causes alert fatigue and causes people to ignore alerts because they're not confident they are accurate.

Retries, as I understand them, aren't going to solve the problem if the response is going to take 10 seconds to complete. What retries do is 'continue retrying X number of times, or until the response takes only 2.5 seconds'. That's not the same thing as a configurable timeout value, especially for other instances where the normal average response time is greater than 2.5 seconds. You could retry forever, but it might never complete in 2.5 seconds.
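To make the difference concrete, here is a small model of the argument above. It is only a sketch: `firstSuccessfulAttempt` is a hypothetical helper, the latencies are made-up numbers matching the figures in this thread, and it does not reflect how the monitor is actually implemented.

```typescript
// Returns the 1-based number of the first attempt whose response arrived
// within the timeout, or null if every attempt timed out.
function firstSuccessfulAttempt(latenciesMs: number[], timeoutMs: number): number | null {
    const index = latenciesMs.findIndex((latencyMs) => latencyMs <= timeoutMs);
    return index === -1 ? null : index + 1;
}

// An uncached request that needs about 10 s on every attempt (assumed values).
const uncachedLatenciesMs = [10000, 10000, 10000, 10000];

// With a 2.5 s timeout, no number of retries helps: every attempt times out.
const withShortTimeout = firstSuccessfulAttempt(uncachedLatenciesMs, 2500);

// With a 15 s timeout, the very first attempt succeeds and no retry is needed.
const withLongTimeout = firstSuccessfulAttempt(uncachedLatenciesMs, 15000);

console.log(withShortTimeout, withLongTimeout);
```

In this model retries only help when a later attempt happens to be faster, for example once the cache has been repopulated. They do nothing for a service whose normal response time is above the timeout.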

CommanderStorm · 2023-05-25T09:57:20Z

@aLTeReGo-SWI please answer all my questions

So basically, the avg number you would expect for Latency of RADIUS is below the current value, right? (as in Latency>2.5s is the absolute exception?)
Is the use case (cache miss => long latency) you are talking about not better solved via the Retries option?

Would what you say make a good help text in the monitor setup to distinguish between Timeout and Retries?

aLTeReGo-SWI · 2023-05-29T04:52:50Z

@CommanderStorm Latency varies based on the request. If a request comes in that has cached data, the response is relatively quick: less than a second on average. Requests that are not cached take longer to be served, upwards of ~10 seconds.

Increasing the retries simply results in hammering the server with the same request, stacking these requests up in the queue and delaying the response even further.

A 'retry' is: this was down, i.e. it exceeded the timeout value (right now 2.5 seconds), but it might come back, so try again.

A 'timeout' is: how long should I wait for a response to my request before giving up (and retrying, if a retry value is configured).

Also, I may be mistaken, but the little bits of yellow on my availability charts suggest that retries count against overall availability. An extended timeout should not, provided the request is serviced within the user-definable timeout period.

CommanderStorm · 2023-05-30T08:27:22Z

Linking a few PRs/Issues:

Current state of timeouts: #2142
Timeouts are generally tracked in #877

⇒ once #2142 and #3188 are merged, adding a timeout to RADIUS is easy

aLTeReGo-SWI added the feature-request Request for new features to be added label May 22, 2023

chakflying mentioned this issue May 23, 2023

Fix: Set radius connection timeout to monitor default #3188

Merged

7 tasks

louislam closed this as completed in #3188 Jul 9, 2023
