
VAST 2021.11.18

@lava lava released this 18 Nov 14:22
42a6d1f

Dear users, we are happy to announce VAST 2021.11.18! This month we introduce the query backlog, a new feature for VAST.

As you might have noticed, there was no October release of VAST. We run an extensive suite of performance tests in preparation for every release. During this testing, we discovered a performance regression late in the last cycle that we could not explain. As a consequence, we decided to delay the release until we had identified the root cause. It turned out to be an update to the Suricata schema definitions that increased ingest times, an effect that became observable when running large numbers of export queries in parallel. Tracking it down ultimately took us two weeks, and given the date, we decided to skip that release completely and turn it into the November release.

We have since improved our testing infrastructure and processes to ensure a more efficient bisect process and earlier alerts, so we hope to avoid this kind of delay in the future.

Query Backlog

We noticed that issuing thousands of point queries in one shot can degrade query performance to the point that VAST becomes unresponsive. Such query patterns occur during automated retro matching, especially when threat feeds update at coarse intervals (e.g., hourly or even daily), spawning a surge of queries for everything since the last update, all at once. Unless something is seriously wrong, the majority of such queries should be true negatives. Ideally, VAST spends no time on such queries. However, due to the probabilistic nature of the indexing, false positives can occur. It turns out that the sheer number of queries triggers a non-negligible number of false positives, with problematic performance implications.
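To make the scaling argument concrete, here is a back-of-the-envelope sketch in Python. It is not taken from VAST: the function name, the independence assumption, and all numbers are hypothetical and only illustrate why a false-positive rate that is harmless for one query can add up during a surge.

```python
def expected_false_positive_lookups(
    num_queries: int, num_partitions: int, fp_rate: float
) -> float:
    """Expected number of partition lookups caused purely by index false positives.

    Simplifying assumptions (ours, not VAST's): every query is a true negative
    for every partition, and each index check independently reports a false
    positive with probability `fp_rate`.
    """
    return num_queries * num_partitions * fp_rate


# Hypothetical numbers, not measurements from VAST.
single = expected_false_positive_lookups(1, 500, 0.01)
surge = expected_false_positive_lookups(10_000, 500, 0.01)
print(f"one query: ~{single:.0f} wasted lookups; surge: ~{surge:.0f}")
```

Under these assumptions the wasted work grows linearly with the number of queries issued at once, which is why a surge can hurt even though each individual query is cheap.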

We have several plans to remedy this. In addition to tuning the probabilistic structures, this release of VAST now places queries that are yet to be executed in a backlog. Internally, the query backlog consists of two queues: a normal-priority queue and a low-priority queue. VAST processes the low-priority queue only if the normal-priority queue is empty. A query in the low-priority queue will thus not affect the performance of queries in the normal-priority queue, and we recommend using it for automated mass exports to avoid affecting the performance of user-issued queries. Pass --low-priority to the vast export command to mark a query as low priority. This month’s VAST Threat Bus release already makes use of this feature.
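The two-queue rule described above can be sketched in a few lines of Python. This is a toy model, not VAST's implementation: the class `QueryBacklog`, its methods, and the query names are hypothetical, and the only behavior it encodes is "serve the low-priority queue only when the normal-priority queue is empty".

```python
from collections import deque
from typing import Deque, List, Optional


class QueryBacklog:
    """Toy two-queue backlog (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self._normal: Deque[str] = deque()
        self._low: Deque[str] = deque()

    def submit(self, query: str, low_priority: bool = False) -> None:
        # Mirrors the idea of marking a query as low priority on submission.
        (self._low if low_priority else self._normal).append(query)

    def next_query(self) -> Optional[str]:
        # Low-priority queries are only served when no normal query waits.
        if self._normal:
            return self._normal.popleft()
        if self._low:
            return self._low.popleft()
        return None


backlog = QueryBacklog()
backlog.submit("bulk-1", low_priority=True)
backlog.submit("user-1")
backlog.submit("bulk-2", low_priority=True)
backlog.submit("user-2")

order: List[str] = []
while (query := backlog.next_query()) is not None:
    order.append(query)
print(order)
```

In this sketch the user-issued queries are drained first, in submission order, before any of the bulk queries run, regardless of when the bulk queries were submitted.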

With this, the meaning of the option vast.max-workers changed: it no longer controls the degree of parallelism VAST uses to work on all queries, but instead limits the number of queries that are being worked on at the same time. This also means that new queries no longer impact the performance of already running queries if the number of existing queries already exceeds vast.max-workers.
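One way to picture the new meaning of vast.max-workers is as a cap on concurrently active queries, with everything else waiting in the backlog. The following Python sketch assumes exactly that and nothing more; `Scheduler` and its methods are hypothetical names, and the real scheduling inside VAST may differ.

```python
from collections import deque
from typing import Deque, Set


class Scheduler:
    """Toy scheduler: at most `max_workers` queries are active at once."""

    def __init__(self, max_workers: int) -> None:
        self.max_workers = max_workers
        self.active: Set[str] = set()
        self.backlog: Deque[str] = deque()

    def submit(self, query: str) -> None:
        # New queries always enter the backlog first.
        self.backlog.append(query)
        self._dispatch()

    def finish(self, query: str) -> None:
        # A finished query frees a slot for the next waiting one.
        self.active.remove(query)
        self._dispatch()

    def _dispatch(self) -> None:
        # Start waiting queries only while there is a free slot.
        while self.backlog and len(self.active) < self.max_workers:
            self.active.add(self.backlog.popleft())


scheduler = Scheduler(max_workers=2)
for name in ("q1", "q2", "q3"):
    scheduler.submit(name)
print(sorted(scheduler.active), list(scheduler.backlog))

scheduler.finish("q1")
print(sorted(scheduler.active), list(scheduler.backlog))
```

Here the third query does not start until one of the first two finishes, so it cannot compete with them for resources while it waits.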

Smaller Things

This month brings a lot of smaller changes. Here’s a selection:

  • VAST now depends on xxHash instead of vendoring it. This will allow for performance gains in a future release by switching our filter data structures to use XXH3 internally. Early experiments suggest a 2x throughput increase over the double hashing variant.
  • VAST gained the ability to apply transforms to entire partitions at once, which can be used by plugin developers.
  • Building VAST now requires CMake 3.18+.

And, of course, a whole lot of bug fixes and stability improvements. Read the full CHANGELOG here.