ONVIF Discovery Handler Optimizations #351
Conversation
@bfjelds I made some changes to make the device calls more concurrent, further improving performance. A follow-up to this PR should address the verbose logging: trace and debug logging from the ONVIF Discovery Handler's dependencies is appearing in the Discovery Handler pod logs.
```diff
@@ -415,11 +415,11 @@ onvif:
   memoryRequest: 11Mi
   # cpuRequest defines the minimum amount of CPU that must be available to this Pod
   # for it to be scheduled by the Kubernetes Scheduler
-  cpuRequest: 300m
+  cpuRequest: 10m
```
Nice!
```rust
if !cameras.contains(camera) {
trace!("discover - discovered:{:?}", &latest_cameras);
// Remove cameras that have gone offline
previous_cameras.iter().for_each(|c| {
```
Does this logic need some retry timeout before removing the cameras or is that handled elsewhere?
The retry timeout is effectively 10 seconds, since that is the discovery interval. If a camera comes back online by the next discovery check, it will once again be reported to the Agent.
So if at second X camera A is present, at second X+9 it disappears, at X+10 another scan does not find it and removes it from the Akri Instance, and at X+11 the camera comes back (i.e. the camera had intermittent connectivity for about 2 seconds), do we still end up evicting pods that were using the camera, or is my understanding wrong?
Your scenario would not occur. The Discovery Handler logic is different from the Agent logic. The Discovery Handler reports to the Agent whenever there are changes in device visibility, checking for changes every 10 seconds. The Agent then decides what to do with that information. Currently, for shared devices, it allows a 5 minute timeout before deleting Instances, which triggers the Controller to bring down pods.
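The Agent-side grace period described above can be sketched as follows. This is a hypothetical, simplified illustration (the type and function names are invented for this sketch, not Akri's actual code): a shared device's Instance is only deleted once it has gone unreported for the full 5 minute window, so a 2 second blip never evicts pods.

```rust
use std::collections::HashMap;
use std::time::Duration;

// Hypothetical sketch of the Agent's grace period for shared devices:
// an Instance is deleted only after the Discovery Handler has not
// reported the device for 5 minutes.
const GRACE_PERIOD: Duration = Duration::from_secs(300);

struct InstanceState {
    // How long the Discovery Handler has gone without reporting this device.
    unreported_for: Duration,
}

// Returns the ids of Instances whose devices have been unreported past the
// grace period; only these would trigger the Controller to bring down pods.
fn instances_to_delete(states: &HashMap<String, InstanceState>) -> Vec<String> {
    let mut doomed: Vec<String> = states
        .iter()
        .filter(|(_, s)| s.unreported_for >= GRACE_PERIOD)
        .map(|(id, _)| id.clone())
        .collect();
    doomed.sort();
    doomed
}

fn main() {
    let mut states = HashMap::new();
    // Camera that blipped offline for 2 seconds: well inside the grace period.
    states.insert(
        "camera-a".to_string(),
        InstanceState { unreported_for: Duration::from_secs(2) },
    );
    // Camera gone for 6 minutes: past the grace period, so it is deleted.
    states.insert(
        "camera-b".to_string(),
        InstanceState { unreported_for: Duration::from_secs(360) },
    );
    println!("{:?}", instances_to_delete(&states)); // prints ["camera-b"]
}
```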
> Does this logic need some retry timeout before removing the cameras or is that handled elsewhere?
To summarize: the removal of Instances and brokers is handled elsewhere, by the Agent and Controller respectively, after a 5 minute period of the Discovery Handler not reporting the device.
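The Discovery Handler's report-on-change behavior can be sketched as a set diff between scans. A toy, synchronous version under assumed names (the real handler is async and streams results to the Agent):

```rust
use std::collections::HashSet;

// Toy sketch of the per-interval check: the handler only reports when the
// set of visible cameras changed since the previous 10 second scan.
fn visibility_changed(previous: &HashSet<String>, latest: &HashSet<String>) -> bool {
    previous != latest
}

// Cameras present in the previous scan but missing from the latest one;
// these are reported as gone, and the Agent applies its own grace period.
fn offline_cameras(previous: &HashSet<String>, latest: &HashSet<String>) -> Vec<String> {
    let mut gone: Vec<String> = previous.difference(latest).cloned().collect();
    gone.sort();
    gone
}

fn main() {
    let previous: HashSet<String> =
        ["camera-a", "camera-b"].iter().map(|s| s.to_string()).collect();
    let latest: HashSet<String> =
        ["camera-b", "camera-c"].iter().map(|s| s.to_string()).collect();
    assert!(visibility_changed(&previous, &latest));
    println!("{:?}", offline_cameras(&previous, &latest)); // prints ["camera-a"]
}
```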
```diff
@@ -295,18 +258,16 @@ mod tests {
     #[tokio::test]
     async fn test_apply_filters_no_filters() {
         let mock_uri = "device_uri";
         let mock_ip = "mock.ip";
```
Curious, why is one with . and the other with :?
I think we were aiming at mimicking the format of an IP address vs. a MAC address, with `.` and `:` separators, respectively.
What this PR does / why we need it:
This PR optimizes our ONVIF discovery handler by:
- making the `GetScopes` call to the cameras during `apply_filters` concurrent
- first checking each camera's responsiveness with `GetSystemDateAndTime`. This prevents further filtering calls from being made to unresponsive cameras and ensures that the camera is responsive.

Via some local testing, it appears that the CPU requests and limits can be reduced as a result of the optimizations. Previously, discovery was getting stuck on one thread if a connection could not be made to a camera.
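The concurrency optimization can be sketched as probing every camera at once with a per-call timeout, instead of serially (where one dead camera stalls the scan). This is a hypothetical, thread-based illustration with invented names; the real handler makes async ONVIF calls, and a plain closure stands in here for the `GetSystemDateAndTime` probe:

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

// Hypothetical sketch: probe all cameras concurrently and keep only those
// that answer within `timeout`. `probe` stands in for an ONVIF call such as
// GetSystemDateAndTime; unresponsive cameras are silently dropped.
fn responsive_cameras<F>(uris: Vec<String>, probe: F, timeout: Duration) -> Vec<String>
where
    F: Fn(&str) -> bool + Send + Clone + 'static,
{
    // Spawn one probe thread per camera; each reports back over a channel.
    let receivers: Vec<mpsc::Receiver<String>> = uris
        .into_iter()
        .map(|uri| {
            let probe = probe.clone();
            let (tx, rx) = mpsc::channel();
            thread::spawn(move || {
                if probe(uri.as_str()) {
                    let _ = tx.send(uri);
                }
            });
            rx
        })
        .collect();
    // Wait up to `timeout` for each probe result. Because the probes run in
    // parallel, one unreachable camera no longer blocks the others.
    let mut alive: Vec<String> = receivers
        .into_iter()
        .filter_map(|rx| rx.recv_timeout(timeout).ok())
        .collect();
    alive.sort();
    alive
}

fn main() {
    let alive = responsive_cameras(
        vec!["camera-a".to_string(), "camera-b".to_string()],
        |uri| uri != "camera-b", // pretend camera-b is offline
        Duration::from_millis(100),
    );
    println!("{:?}", alive); // prints ["camera-a"]
}
```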
During this PR, also learned that:
- An `"Invalid argument"` error can occur due to an interface not being specified. A later PR should address that. For now, this PR fixed the previous behavior of panicking upon this error: closes ONVIF Camera discovery issue #249.
- Authenticated cameras do not respond to `GetNetworkInterfaces` calls, so they are filtered out due to no response. A later PR should add support for obtaining the MAC address from authenticated cameras, possibly via ARP.

Special notes for your reviewer:
If applicable:
- formatted code (`cargo fmt`)
- built (`cargo build`)
- linted (`cargo clippy`)
- tested (`cargo test`)
- updated the version (`./version.sh`)