ROX-15904: migrate caching node scanner #1130

Maddosaurus · 2023-03-22T17:44:33Z

This PR updates the Node Scanner introduced in #1116 to a new implementation that caches scan results in an EmptyDir volume, so the Node is not rescanned while a sufficiently recent cached scan is available.
The implementation was discussed and developed in stackrox/stackrox#4701; it needed only minimal changes to fit its new location.

Limitations

Because Scanner does not currently have a Duration-type environment setting, all Caching Scanner settings that were previously env vars are hardcoded for now (see the TODO in service.go).
This will be addressed in a follow-up, ROX-16095, to keep these PR reviews short, isolated, and quick 😃

Testing Performed

I deployed this version on an OpenShift 4.12 cluster and verified that the caching functionality still works.
Until stackrox/stackrox/pull/5292 is merged, the Collector DaemonSet must be changed manually to use this container.
After deploying ACS via ./deploy/openshift/deploy.sh, apply the following changes.
Add this container to the Collector DaemonSet under spec/template/spec/containers:

      - name: node-inventory
        image: quay.io/mmeiding/playground:scanner-8a7f5bf983be
        command: [ "/scanner", "--nodeinventory", "--config=", "" ]
        ports:
        - containerPort: 8444
        env:
        - name: ROX_NODE_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: spec.nodeName
        volumeMounts:
        - mountPath: /host
          name: host-root-ro
          readOnly: true
        - mountPath: /tmp/
          name: tmp-volume
        - mountPath: /cache
          name: cache-volume

Afterwards, set the following env vars to shorten the rescan intervals for testing:
kubectl -n stackrox set env daemonsets/collector --containers="compliance" ROX_NODE_SCANNING_INTERVAL="20s" ROX_NODE_SCANNING_INTERVAL_DEVIATION="1s" ROX_NODE_SCANNING_MAX_INITIAL_WAIT="10s" LOGLEVEL="DEBUG"

Tests

If a cached scan is used, the debug logs say so.
To test the backoff, get a shell on node-inventory and execute:
echo -n "{\"CacheValidUntil\":\"0001-01-01T00:00:00Z\",\"RetryBackoffDuration\":\"10s\",\"CachedInventory\":\"\"}" > /cache/inventory-cache
On the next scan, node-scanner should log that it found a backoff and will wait for 10 seconds.
To test a corrupted meta cache, get a shell and execute:
echo -n "{\"CacheValidUntil\":\"42\",\"RetryBackoffDuration\":\"noDuration\",\"CachedInventory\":\"\"}" > /cache/inventory-cache
The failsafe should kick in and delay the next scan by 300 seconds.
To test a corrupted inventory, execute:
echo -n "{\"CacheValidUntil\":\"2023-04-01T00:00:00Z\",\"RetryBackoffDuration\":\"0s\",\"CachedInventory\":\"noInventory\"}" > /cache/inventory-cache
The failsafe should again delay the next scan by 300 seconds.

roxbot · 2023-03-22T18:20:02Z

Images are ready for the commit at 218b500.

To use the images, use the tag 2.28.x-47-g218b500bc7.

jvdm

LGTM.

Maddosaurus added 3 commits March 22, 2023 15:20

WIP: Move caching scanner, adding duration setting

2a2e173

WIP: Move caching scanner

312d6f5

Fix tests, switch to human readable cache

218b500

Maddosaurus requested review from vikin91, jvdm, fredrb, RTann and jschnath March 22, 2023 17:44

jvdm approved these changes Mar 22, 2023

vikin91 approved these changes Mar 23, 2023

Maddosaurus merged commit 8129118 into master Mar 23, 2023

Maddosaurus deleted the mm/ROX-15904-migrate-caching-scanner branch March 23, 2023 09:01

Maddosaurus mentioned this pull request Mar 23, 2023

ROX-15904: Remove migrated code stackrox/stackrox#5361

Merged

5 tasks
