VAST 2021.01.28

@dominiklohmann dominiklohmann released this 28 Jan 09:01
9db6f36

We’re happy to announce the monthly release 2021.01.28. This year begins with some exciting changes. As an open-source telemetry engine, VAST now reinforces its open-platform character with a new plugin framework that offers multiple customization points.

VAST’s new plugin framework makes it possible to ship third-party extensions, in source or binary form, along with an existing VAST deployment. We also use this functionality for our closed-source add-ons, e.g., live threat intel matching and NetFlow parsing. We believe that a fully open platform with extension points for custom functionality is the sweet spot for an open-core business model.

We also increased JSON parsing performance by 5x by switching to a SIMD-based implementation. Our Docker build now relies on BuildKit to support optional layers, and images are based on Debian Buster. Multiple bug fixes and robustness improvements also made it into this release. Enjoy.

Plugin Framework

VAST now offers an experimental plugin framework to support efficient customization points at various places of the data processing pipeline. Several base classes each define an interface, e.g., for adding a new command or for spawning a new actor that processes the incoming stream of data.

The documentation page gives an overview of the plugin framework, which is still under active development.

JSON improvements

With the new vast import zeek-json command, VAST now natively supports Zeek logs as line-delimited JSON objects, as produced by the json-streaming-logs package.

Thanks to @ngrodzitski, VAST now has experimental support for relying on simdjson for parsing JSON objects. This brings substantial gains in throughput, and shifts the bottleneck of the ingest path from parsing input to indexing at the node. To use the feature, add the --simdjson flag to the following import commands: json, suricata, and zeek-json. We will stabilize this feature in the near future and make it the default option, replacing our legacy NDJSON parser entirely.

Additionally, VAST no longer flattens imported data that contains nested records on ingestion. This is most noticeable with imports in the JSON format, but actually applies to all formats under the hood. This means that VAST now fully preserves nested JSON objects and exports them in the same structure as they were ingested. To restore the old export behavior, use vast export json --flatten.
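
To make the difference concrete, the following Python sketch contrasts a nested event with a flattened version of it. It is an illustration only, not VAST's implementation: the flatten helper, the dot as key separator, and the sample event are assumptions made for this example.

```python
import json


def flatten(record, prefix=""):
    """Collapse nested dicts into one level, joining keys with dots."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


# Hypothetical event with a nested record (not taken from real VAST output).
event = {"ts": "2021-01-28T09:01:00", "id": {"orig_h": "10.0.0.1", "resp_p": 443}}

nested_output = json.dumps(event)              # keeps "id" as a JSON object
flattened_output = json.dumps(flatten(event))  # uses keys such as "id.orig_h"
```

Here nested_output corresponds to the new default of preserving structure, while flattened_output approximates the shape that --flatten restores.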

Changelog Highlights

As always, you can find the full technical scoop in our changelog.

⚡️ Breaking Changes

  • The GitHub CI changed to Debian Buster and produces Debian artifacts instead of Ubuntu artifacts. Similarly, the Docker images we provide on Docker Hub use Debian Buster as the base image. To build Docker images locally, users must set DOCKER_BUILDKIT=1 in the build environment. #1294

  • The new short options -v, -vv, -vvv, -q, -qq, and -qqq map onto the existing verbosity levels. The existing short syntax, e.g., -v debug, no longer works. #1244

⚠️ Changes

  • The option vast.schema-paths is renamed to vast.schema-dirs. The old option is deprecated and will be removed in a future release. #1287

  • VAST preserves nested JSON objects in events instead of formatting them in a flattened form when exporting data with vast export json. The old behavior can be enabled with vast export json --flatten. #1257 #1289

🧬 Experimental Features

  • VAST can optionally rely on simdjson for JSON parsing. The substantial gains in throughput shift the bottleneck of the ingest path from parsing input to indexing at the node. To use the (still experimental) feature, use vast import json|suricata|zeek-json --simdjson. #1230 #1246 #1281 #1314 #1315 @ngrodzitski

  • VAST features a new plugin framework to support efficient customization points at various places of the data processing pipeline. Several base classes each define an interface, e.g., for adding new commands or for spawning a new actor that processes the incoming stream of data. The directory examples/plugins/example contains an example plugin. #1208 #1264 #1275 #1282 #1285 #1287 #1302 #1307 #1316

🎁 Features

  • The output of vast status contains detailed memory usage information about active and cached partitions. #1297

  • The new import zeek-json command allows for importing line-delimited Zeek JSON logs as produced by the json-streaming-logs package. Unlike stock Zeek JSON logs, where one file contains exactly one log type, the streaming format contains different log event types in a single stream and uses an additional _path field to disambiguate the log type. For stock Zeek JSON logs, use the existing import json with the -t flag to specify the log type. #1259
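
The role of the _path field can be sketched in a few lines of Python. The snippet groups a mixed stream of line-delimited JSON events by _path. The split_by_path helper and the sample lines are made up for illustration and do not reflect how VAST implements its reader.

```python
import json
from collections import defaultdict


def split_by_path(lines):
    """Group line-delimited JSON events by their _path field."""
    by_type = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        event = json.loads(line)
        # _path names the log type, e.g. "conn" or "dns"; events that
        # lack the field are collected under the key None.
        log_type = event.pop("_path", None)
        by_type[log_type].append(event)
    return dict(by_type)


# Hypothetical stream that mixes two log types in one input.
stream = [
    '{"_path": "conn", "uid": "C1", "proto": "tcp"}',
    '{"_path": "dns", "uid": "C2", "query": "example.org"}',
    '{"_path": "conn", "uid": "C3", "proto": "udp"}',
]

grouped = split_by_path(stream)
```

A stock Zeek JSON log carries no such field, which is why the log type has to be supplied from the outside in that case.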

🐞 Bug Fixes

  • Disk monitor quota settings not ending in a 'B' are no longer silently discarded. #1278

  • Values in JSON fields that can't be converted to the type specified in the schema no longer cause the containing event to be dropped. #1250

  • Manually specified configuration files may reside in the default location directories. Configuration files can be symlinked. #1248