Big Ideas

Christian Kreibich edited this page Mar 1, 2024 · 10 revisions

This document is a list of ideas collected over many years by the Zeek team for features, changes, research, and other tasks. Sometimes, ideas from this list end up as new issues and/or pull requests on GitHub. We do our best to link to those where possible. Completed tasks are listed separately, below.

NOTE: This is not an implementation roadmap, but a collection of ideas to track over time.

See the discussion at https://github.com/zeek/zeek/discussions/2850 for more.

  • Infrastructure & Resources
    • Retire the machine behind old.zeek.org
    • New Package website (Tim)
    • Website revamp, minimizing the site to resolve:
      • Lots of outdated information
      • 170+ broken links
      • Blogging in limbo between Discourse and Wordpress
  • Documentation
    • Complete push to cover packet capture more comprehensively
    • Write in-depth plugin concepts guide
    • Explain the full set of environment variables and what they do; the output of zeek --help is partial and arguably not the right place
  • Code Base
    • Subsystems
    • Code quality and maintenance
    • Analyzers
      • Multipath TCP (recognize at least)
      • SCTP
      • Better VLAN visibility
    • Scripting language
      • async statement, or more generally: coroutines?
      • Event dependencies
      • String extraction
      • Review & overhaul handling of text vs binary data (includes UTF-8 work from #596)
      • String interpolation
      • Overhaul name spacing
      • LSP server (Benjamin)
      • New types
        • nullable/optional
        • result/error
      • Provide functionality to operate on JSON data
        • JSON reader
      • Script formatter (Christian)
      • Script linter (need a place to collect ideas for this)
      • Disable event handlers (or groups of event handlers) on a per-connection basis
        • If there's no handler anymore, stop processing connection
    • Scripts, Frameworks, Packages
      • Package curation for commonly desired infrastructure
        • List recommended packages prominently, e.g. for data export options
        • Also curate most-wanted packages/plugins
      • General direction
      • Logging
        • Zeek log schemas as a release-level deliverable:
          • Standardize a JSON-level rendering of what Zeek logs look like
          • Produce such a log schema with each release, to allow tracking changes in one place and via tooling
          • Possibly expand tooling to capture how packages adapt this log schema by adding columns, logs, etc.
        • Delayed logging, with a generalized notion of additional log annotations
        • Universal log format alterations (based on a predicate, say — anything that contains a conn_id changes in a particular way).
          • Could relate to delayed logging, as a generalized “log alteration framework”
      • New frameworks
        • High-level cluster state sharing framework
        • Persistence / data sharing framework
          • SQLite-backed datastores are the only out-of-the-box persistence option in Zeek right now, and are not designed to address any and all use cases
          • Could initially explore alternatives via dedicated packages and low-level BiFs, to shape into a framework at a later stage
          • Contenders: generic SQL backend, Redis, memcached
        • Authentication framework
          • Track host/IP/user mapping
          • LDAP/AD integration
        • Alerting framework (Replace notice framework with something more generic than interfaces to Slack etc.)
        • Asset database framework (aggregate information from different sources: network, zeek-agent, Suricata)
      • Port Sumstats to new Broker ALM functionality [ALM is on hold]
      • Higher-level logs: create narratives that can be understood without detailed protocol knowledge
      • Provide extended reflection capabilities
        • TODO: need specific use cases
      • Zeek profiles: create packages that configure Zeek for a specific use case, e.g.
        • Minimum log volume
        • Comprehensive logging, record whatever you can
      • Configuration wizard
        • Select basic profile, tune common options
    • Performance
      • Script compiler (Vern)
        • ZAM
        • C++
      • CPU & memory profiling (Tim)
        • Scripts
        • Interpreter
        • Event engine
        • Redundancy of file analysis
      • Timer buckets (Tim)
    • Build system & deployment
      • New install/RPM structure (after supervisor)
      • "Easy install": One-click package installation
      • Rewrite Cirrus config to use Starlark build system
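
To make the log schema idea above more concrete, here is a minimal sketch of what tooling around a JSON-level schema could look like. Everything in it is an assumption for illustration: the schema layout (log name mapped to field names and types), the sample logs and fields, and the diff_schemas helper are hypothetical and do not reflect an existing Zeek format or tool.

```python
import json

# Hypothetical schema renderings for two releases. The layout
# (log name -> field name -> type) is an assumption, not a Zeek format.
SCHEMA_OLD = json.loads("""
{
  "conn": {"ts": "time", "uid": "string", "proto": "enum"},
  "dns":  {"ts": "time", "uid": "string", "query": "string"}
}
""")

SCHEMA_NEW = json.loads("""
{
  "conn": {"ts": "time", "uid": "string", "proto": "enum", "community_id": "string"},
  "dns":  {"ts": "time", "uid": "string", "query": "string"},
  "quic": {"ts": "time", "uid": "string"}
}
""")


def diff_schemas(old, new):
    """Return a sorted list of (change, log, field) tuples between two schemas."""
    changes = []
    for log in sorted(set(old) | set(new)):
        if log not in old:
            changes.append(("log added", log, None))
            continue
        if log not in new:
            changes.append(("log removed", log, None))
            continue
        old_fields, new_fields = old[log], new[log]
        for field in sorted(set(old_fields) | set(new_fields)):
            if field not in old_fields:
                changes.append(("field added", log, field))
            elif field not in new_fields:
                changes.append(("field removed", log, field))
            elif old_fields[field] != new_fields[field]:
                changes.append(("type changed", log, field))
    return changes


for change in diff_schemas(SCHEMA_OLD, SCHEMA_NEW):
    print(change)
```

The same comparison could in principle be run against a schema produced with packages loaded, which is one way to capture how packages add columns or logs.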

Past efforts

Kept largely for reference, since we often need to recall the decision-making around past work. This list is most useful for items that link to past discussions, tickets, etc.

  • Infrastructure & Resources
    • Move docs to RTD (Jon)
    • Move code to GitHub (Jon)
    • Move JIRA tickets to GitHub (Jon)
    • Website redesign (Amber)
    • Testlab setup for performance benchmarking (Tim)
    • Move mailing lists to Discourse (Johanna)
  • Documentation
    • New User Manual (Richard)
  • Code Base
    • Install headers (Jon)
    • Bro-to-Zeek rename
    • README redesign and modernization (Zeke)
    • Subsystems
    • DPDify transport-layer (Jan, Tim)
    • Support for non-IP protocols (Tim)
    • New I/O loop (Tim)
    • Broker
      • Port cluster framework & BroControl (Jon)
      • New/real network protocol (Dominik)
      • Windows support (Dominik)
      • Synchronize data stores with script tables (Johanna)
      • Add Websocket API for clients to easily subscribe/publish (Dominik)
    • Spicy & HILTI (Robin/Benjamin)
    • Code quality and maintenance
      • Define & implement coding conventions (Tim)
      • Remove 2.x legacy functionality (Johanna)
      • Unit testing using doctest
      • Code Modernization
        • Convert internal container types to C++ templates
          • Add proper iterator support
          • Remove loop_over_list/loop_over_queue methods
        • Extend iterator support to cover more std algorithms like std::sort
        • Add support for standard for loops to Dict to allow looping over std::pair for the keys and values
        • Convert the manual ref counting to use IntrusivePtr<> (Jon)
    • Analyzers
      • QUIC (Jon)
      • MQTT (Seth)
    • Scripting language
    • Scripts, Frameworks, Packages
      • Logging
      • Configuration framework (Johanna)
        • Port existing options
        • Integrate with BroControl/Broker
      • Telemetry on Zeek's operation (Arne/Dominik)
      • Community ID (Christian)
    • JavaScript
      • Include ZeekJS as an external builtin plugin that is activated when libnode-dev is installed. Starting with Ubuntu 22.10 and Debian Bookworm, this package is an apt-get install away. Fedora has shipped one for much longer.
      • Basic motivation: Open access to a vast and active ecosystem of third-party libraries and tools. Possibly attract JavaScript developers. "Let's not build a replacement for ActiveHTTP".