botmon

botmon monitors sandboxed malware or bot traffic and emits structured alerts whenever observed network behavior crosses configurable thresholds. It ingests either a live interface or a pre-recorded PCAP file, groups packets into fixed time windows, and writes Suricata-compatible Eve JSON for each window that shows suspicious activity.

The tool is designed for use inside automated malware-analysis pipelines. It integrates directly with bottle and bottle-warden, and it also runs standalone on any host with libpcap.

How it works

botmon runs an analysis loop over captured packets, processing one fixed-length window at a time.

Packets (live interface or PCAP)
  │
  ▼
FlowCollector          — groups packets into bidirectional flows per window
  │
  │  WindowStats (packet counts per flow, start time, duration)
  ▼
BehaviorClassifier     — applies thresholds, assigns local and global behaviors
  │
  │  LocalBehaviors (per-flow), GlobalBehavior (window-level)
  ▼
Eve JSON output        — Suricata-compatible alert or stats record per event

Each window produces:

One global behavior record reflecting the overall character of the window (scanning or idle).
Zero or more local behavior records, one per distinct bidirectional flow that crossed a threshold.

When the window is empty (no packets at all), cross-window state (flow IDs, previously seen hosts) is reset so that a resumed session is not contaminated by a previous one.

Classification model

Local behaviors (per-flow)

Every bidirectional flow observed in a window is classified as one of:

Class	Condition	Eve `event_type`
`attack`	Packet rate exceeds `--packet-threshold` and a C2 IP (`-c/--c2-ip`) is set	`alert`
`outbound_connection`	Packet rate at or below threshold, or no C2 IP is configured	`alert`

The attack classification deliberately requires a known C2 IP because a high packet rate alone is ambiguous (e.g. bulk file transfer). If no -c/--c2-ip is provided, every flow is logged as outbound_connection regardless of rate.

Packet rate is computed as total packets in both directions divided by the window duration in seconds.
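A minimal sketch of the rule in the table above, assuming a hypothetical helper named `classifyFlow` (botmon's own function names and signatures may differ):

```go
package main

import "fmt"

// classifyFlow labels a flow as "attack" only when its packet rate exceeds
// the threshold AND a C2 IP is configured; otherwise "outbound_connection".
// The rate is total packets in both directions divided by the window length.
func classifyFlow(srcToDst, dstToSrc int, windowSeconds, threshold float64, c2Configured bool) (string, float64) {
	rate := float64(srcToDst+dstToSrc) / windowSeconds
	if c2Configured && rate > threshold {
		return "attack", rate
	}
	return "outbound_connection", rate
}

func main() {
	class, rate := classifyFlow(900, 519, 30, 20, true)
	fmt.Printf("%s at %.1f pkt/s\n", class, rate)

	// Same traffic, but no C2 IP configured.
	class, rate = classifyFlow(900, 519, 30, 20, false)
	fmt.Printf("%s at %.1f pkt/s\n", class, rate)
}
```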

Direction stats are included in every local behavior record: src_to_dst_packets, dst_to_src_packets, src_to_dst_rate, dst_to_src_rate, src_to_dst_bytes, dst_to_src_bytes, and an amplification factor (dst_to_src_bytes / src_to_dst_bytes). An amplification factor greater than 1 indicates the destination replied with more data than it received, which is the defining signal for reflection/amplification attacks (a packet-count ratio would miss high-bandwidth amplifiers that inflate response size rather than response count).
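The amplification factor is a plain byte ratio. The sketch below uses a hypothetical `amplificationFactor` helper. Returning 0 when the source sent no bytes is a choice made for this example only; the source text does not say how botmon handles that case.

```go
package main

import "fmt"

// amplificationFactor returns dst_to_src_bytes / src_to_dst_bytes.
// Returning 0 for an empty request direction is an assumption of this sketch.
func amplificationFactor(srcToDstBytes, dstToSrcBytes uint64) float64 {
	if srcToDstBytes == 0 {
		return 0
	}
	return float64(dstToSrcBytes) / float64(srcToDstBytes)
}

func main() {
	// Byte counts from the sample local behavior record: no amplification.
	fmt.Println(amplificationFactor(180000, 90000))
	// A small request answered by a much larger response.
	fmt.Println(amplificationFactor(60, 3000))
}
```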

The "source" of a flow is determined by a priority chain:

1. Whichever side matches <bot-ip>.
2. Whichever side is RFC 1918 (private address space).
3. Whichever side sent fewer packets in the window (heuristic for scanner/initiator vs. responder).
4. Canonical tie-break (lower IP address / lower port).

Global behaviors (window-level)

The global behavior summarizes the entire window:

Class	Condition	Eve `event_type`
`scanning`	A horizontal or vertical scan pattern is detected (see below)	`alert`
`idle`	No scan pattern detected (emitted only with `--show-idle`)	`stats`

Scan detection

botmon detects scanning incrementally during packet collection, before the window closes. Two patterns are recognised:

Pattern	Trigger
Horizontal scan	The bot contacts more than `T` distinct destination hosts on the same port within one window (e.g. SSH sweeping).
Vertical scan	The bot contacts more than `T` distinct destination ports on the same host within one window (e.g. service enumeration).

The threshold T is computed as:

T = max(1, min(maxFlows/2, int(--destination-threshold × --window)))

where maxFlows is the internal flow-map capacity (1024). With the defaults (-d 10, -w 30) this gives T = min(512, 300) = 300.
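The formula translates directly into Go. The function name `scanThreshold` is hypothetical, and the sketch relies on the built-in `min` and `max` available since Go 1.21:

```go
package main

import "fmt"

const maxFlows = 1024 // flow-map capacity stated above

// scanThreshold computes T = max(1, min(maxFlows/2, int(d * window))).
func scanThreshold(destinationThreshold float64, windowSeconds int) int {
	return max(1, min(maxFlows/2, int(destinationThreshold*float64(windowSeconds))))
}

func main() {
	fmt.Println(scanThreshold(10, 30))   // defaults: min(512, 300) = 300
	fmt.Println(scanThreshold(25, 30))   // int(750) is capped at 512
	fmt.Println(scanThreshold(0.01, 30)) // int(0.3) = 0 is raised to 1
}
```

The upper cap keeps T reachable before the flow map fills up, and the lower bound prevents very small factors from producing a threshold of zero.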

When a scan threshold is crossed, all matching flows are immediately removed from the flow statistics map and recorded as scan flows. Subsequent packets on any already-flagged port or host are skipped entirely, preventing the map from being saturated by scan traffic. This ensures that concurrent non-scan flows (e.g. C2 beaconing or file transfers) still have capacity in the map and continue to produce outbound_connection or attack local behavior events as normal.

The C2 IP supplied via -c/--c2-ip, along with any IPs supplied via -x/--ignore-dst, are excluded from the flow map before scan detection runs. They cannot contribute to scan counts.
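A simplified sketch of incremental scan detection, assuming a hypothetical `scanTracker` type. It only shows the counting of distinct hosts per port and distinct ports per host; the removal of matching flows from the statistics map, the skipping of later packets, and the exclusion of ignored IPs are left out.

```go
package main

import "fmt"

// scanTracker is a hypothetical structure for incremental scan detection.
type scanTracker struct {
	threshold   int
	hostsByPort map[uint16]map[string]struct{} // dst port -> distinct dst hosts
	portsByHost map[string]map[uint16]struct{} // dst host -> distinct dst ports
}

func newScanTracker(threshold int) *scanTracker {
	return &scanTracker{
		threshold:   threshold,
		hostsByPort: make(map[uint16]map[string]struct{}),
		portsByHost: make(map[string]map[uint16]struct{}),
	}
}

// observe records one outbound packet and reports which pattern, if any,
// now exceeds the threshold: "horizontal", "vertical", or "".
func (t *scanTracker) observe(dstHost string, dstPort uint16) string {
	if t.hostsByPort[dstPort] == nil {
		t.hostsByPort[dstPort] = make(map[string]struct{})
	}
	t.hostsByPort[dstPort][dstHost] = struct{}{}
	if t.portsByHost[dstHost] == nil {
		t.portsByHost[dstHost] = make(map[uint16]struct{})
	}
	t.portsByHost[dstHost][dstPort] = struct{}{}

	switch {
	case len(t.hostsByPort[dstPort]) > t.threshold:
		return "horizontal"
	case len(t.portsByHost[dstHost]) > t.threshold:
		return "vertical"
	}
	return ""
}

func main() {
	t := newScanTracker(3)
	// Four distinct hosts on port 22: the fourth one exceeds T = 3.
	for i := 1; i <= 4; i++ {
		host := fmt.Sprintf("198.51.100.%d", i)
		fmt.Printf("%s:22 -> %q\n", host, t.observe(host, 22))
	}
}
```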

Attack type classification

Every attack local behavior is further labelled with an attack_type string in the metadata.botmon object (e.g. "tcp_flood", "udp_flood", "dns_amplification"). The classification methodology is based on the NLADC DDoS Dissector [1], adapted for an attacker-side vantage point: the dissector observes traffic arriving at a victim and groups flows by source port (the reflector's service port); botmon observes traffic leaving the bot and groups flows by destination port (the same service port, from the other side).

How attack types are determined

Within each analysis window, all flows are grouped by (protocol, destination_port) from the bot's perspective. Any group that accounts for more than 5% of total bytes in the window is considered dominant. Bytes are used rather than packets because amplification attacks produce a small number of spoofed requests but a large volume of amplified response bytes, making byte fraction a more reliable dominance signal. For each dominant group, a flow is classified as "<service>_amplification" only when all three conditions hold:

Protocol is UDP. Amplification is a UDP-only phenomenon; TCP traffic to the same service ports is not a reflection vector.
The bot IP is absent from both flow endpoints. In an amplification attack the bot spoofs the victim's IP as the UDP source, so the reflector replies to the victim — the flow captured on the sandbox interface is between the victim and the reflector, and the bot's own IP never appears. If the bot IP is present as either endpoint the traffic is a direct flood. If the bot IP is not configured (<bot-ip> unset), absence cannot be verified and the flow is conservatively treated as a flood.
The destination port appears in the amplification services table.

If any condition fails the flow is classified as "<protocol>_flood" instead (e.g. "tcp_flood", "udp_flood"). Flows whose group is not dominant also default to "<protocol>_flood".

Amplification services table

The port-to-service mapping is taken verbatim from the dissector's AMPLIFICATION_SERVICES dictionary [1]:

Port	Service
17	Quote of the Day
19	Chargen
53	DNS
69	TFTP
111	TPC
123	NTP
137	NetBios
161	SNMP
177	XDMCP
389	LDAP
500	ISAKMP
520	RIPv1
623	IPMI
1434	MS SQL
1900	SSDP
3283	Apple Remote Desktop
3389	Windows Remote Desktop
3702	WS-Discovery
5093	Sentinel
5351	NAT-PMP
5353	mDNS
5683	CoAP
10074	Mitel MiColab (CVE-2022-26143)
11211	MEMCACHED
27015	Steam
32414	Plex Media
33848	Jenkins
37810	DHDiscover

Comparison with the NLADC dissector

The tables below show how each classification step in the dissector [1] maps to botmon's current implementation and what remains planned.

Structural grouping (attack vector identification)

Step	Dissector	botmon	Status
Group by `(protocol, source_port)` / `(Protocol, DstPort)`	5%, no z-score	5%, no z-score	Implemented ✓
Min vector fraction to keep	5% of bytes	5% of bytes	Implemented ✓
Second pass on remainder traffic: group by `(protocol, destination_port)`	10%, no z-score	—	Planned

The remainder pass catches random-source-port floods that have no dominant source port. botmon does not run this pass yet. The gap is smaller than it appears, though: when an attack randomises source ports, its dominant signal is the destination port, which botmon's primary grouping already captures from the bot's perspective.

Per-vector L3 attributes

Field	Dissector	botmon	Status
Destination port distribution	10%, no z-score	—	Planned
Source IP list	All distinct IPs	—	Planned
Frame length	5%, z-score on	—	Planned
Ethernet type	5%, z-score on	—	Planned
IP fragmentation offset	10%, z-score on	—	Planned
IP TTL	10%, z-score on	—	Planned

Per-vector L4 attributes

Field	Dissector	botmon	Status
TCP flags distribution	10%, z-score on	—	Planned
ICMP type distribution	10%, z-score on	—	Planned

Per-vector L7 attributes

Field	Dissector	botmon	Status
DNS query name	10%, z-score on	—	Out of scope (for now)
DNS query type	10%, z-score on	—	Out of scope (for now)
HTTP URI	5%, z-score on	—	Out of scope (for now)
HTTP method	10%, z-score on	—	Out of scope (for now)
HTTP user-agent	5%, z-score on	—	Out of scope (for now)
NTP requestcode	10%, z-score on	—	Out of scope (for now)

Other

Step	Dissector	botmon	Status
Target inference (dominant destination address)	50%, no z-score	N/A — bot IP known from config	Not applicable
Fragmentation vector handling	Separate pass, restricted to known attacker IPs	—	Planned
Multi-vector merging (same service+protocol)	Deduplicated into one vector	—	Planned

Z-score pattern: the dissector uses absolute threshold only (no z-score) for the high-level structural decisions (which protocol+port combinations dominate), and enables z-score for secondary per-vector attribute extraction (TTL, frame length, flags, etc.). The intuition is that z-score adds noise at the grouping stage but surfaces meaningful peaks within an already-identified vector where distributions can be multimodal. botmon follows the same rule for the grouping stage; z-score will be opt-in when per-vector attribute extraction is added.

Output format

botmon writes one JSON object per line (NDJSON), compatible with Suricata's Eve schema. Standard Eve fields are present (timestamp, event_type, src_ip, dest_ip, src_port, dest_port, proto, flow_id, alert), with botmon-specific detail under metadata.botmon.

Local behavior record (attack or outbound_connection)

{
  "timestamp": "2024-11-01T12:00:15.000000Z",
  "event_type": "alert",
  "src_ip": "10.0.0.5",
  "dest_ip": "1.2.3.4",
  "src_port": 54321,
  "dest_port": 80,
  "proto": "tcp",
  "flow_id": 12345678901234567,
  "host": "my-sample-42",
  "alert": {
    "action": "allowed",
    "gid": 5,
    "signature_id": 2100001,
    "rev": 1,
    "signature": "botmon high packet-rate to single host",
    "category": "attack",
    "severity": 2
  },
  "metadata": {
    "botmon": {
      "scope": "local",
      "context": {
        "sample_id": "my-sample-42",
        "bot_ip": "10.0.0.5",
        "c2_ip": "203.0.113.4"
      },
      "attack_type": "tcp_flood",
      "packet_rate": 47.3,
      "packet_threshold": 20.0,
      "src_to_dst_packets": 946,
      "dst_to_src_packets": 473,
      "src_to_dst_rate": 31.5,
      "dst_to_src_rate": 15.8,
      "src_to_dst_bytes": 180000,
      "dst_to_src_bytes": 90000,
      "amplification_factor": 0.5
    }
  }
}

Global behavior record (scanning)

{
  "timestamp": "2024-11-01T12:00:15.000000Z",
  "event_type": "alert",
  "src_ip": "10.0.0.5",
  "dest_ip": "0.0.0.0",
  "flow_id": 98765432109876543,
  "alert": {
    "signature_id": 2100002,
    "signature": "botmon horizontal scan host-rate exceeded",
    "category": "scan",
    "severity": 3
  },
  "metadata": {
    "botmon": {
      "scope": "global",
      "context": { "sample_id": "my-sample-42", "bot_ip": "10.0.0.5" },
      "packet_rate": 210.0,
      "packet_threshold": 20.0,
      "destination_rate": 35.0,
      "destination_rate_threshold": 10.0
    }
  }
}

The list of scanned destination IPs is available through the corresponding local behavior records emitted in the same window. Each local record's dest_ip is one flow target.

Idle record

Emitted only with --show-idle. Uses event_type: stats (not alert) and carries the measured rates even though no threshold was crossed. Useful for correlating quiet periods against sandbox execution state.

Signature IDs

Behavior	`signature_id`
`attack`	2100001
`scanning`	2100002
`outbound_connection`	2100003

These IDs are stable across versions and can be used as filters in downstream processing (e.g. jq 'select(.alert.signature_id == 2100001)').

Flow IDs

Each behavior is assigned a stable flow_id that persists across consecutive windows as long as the same flow remains active. A gap (empty window) resets continuity. This lets you reconstruct the timeline of a single flow across multiple windows by grouping on flow_id.

Installation

Requirements:

Linux with libpcap headers (libpcap-dev on Debian/Ubuntu) for live capture.
Root or appropriate group membership to open a network interface. PCAP files work without elevated privileges.
Go 1.24+ for building from source.

# Install the latest version with the Go toolchain
go install github.com/cochaviz/botmon@latest

# Or build from source
git clone https://github.com/cochaviz/botmon.git
cd botmon
go build -o botmon .

Confirm installation:

botmon --version
# botmon version v0.3.0

Usage

botmon <input> <bot-ip> [flags]

Argument	Description
`<input>`	Network interface name (e.g. `eth0`, `vnet0`) or path to a `.pcap` / `.pcapng` file.
`<bot-ip>`	IPv4 address of the monitored host (the bot). Used to orient flows and populate `bot_ip` in output.

Analyze a PCAP file (offline)

botmon sample.pcap 10.0.0.5 \
  -w 15 -p 20 -d 25 \
  -c 203.0.113.4 \
  -s my-sample-42 \
  -o /tmp/my-sample-42.eve.json

Monitor a live interface

sudo botmon vnet0 10.10.0.20 \
  -c 203.0.113.4 \
  -s beacon-42 \
  --save-packets 100 \
  --capture-dir /var/log/botmon/captures \
  --show-idle

Exclude infrastructure IPs from metrics

Use -x / --ignore-dst to drop known-benign endpoints (DNS resolver, gateway, sandbox controller) from flow counts so they do not inflate rates or appear in scan lists. The flag is repeatable:

botmon sample.pcap 10.0.0.5 \
  -x 10.0.0.1 \
  -x 8.8.8.8 \
  -x 192.168.1.254

The C2 IP supplied via -c/--c2-ip is automatically added to this exclusion list.

All flags

Flag	Short	Default	Description
`--window`	`-w`	`30`	Analysis window size in seconds.
`--packet-threshold`	`-p`	`5`	Packets per second per flow before the flow is classified as `attack` (requires `-c/--c2-ip`).
`--destination-threshold`	`-d`	`10`	Scan threshold factor: `T = max(1, min(maxFlows/2, int(d × window)))` unique flows per port or host before classifying the window as `scanning`.
`--c2-ip`	`-c`	(unset)	Known C2 server IP. Required for `attack` classification; automatically excluded from metrics.
`--sample-id`	`-s`	(unset)	Free-form identifier written into every output record's `context.sample_id`.
`--ignore-dst`	`-x`	(none)	Destination IPs to exclude from all metrics. Repeatable.
`--eve-log-path`	`-o`	(stdout)	File path for Eve JSON output. Appends if the file already exists.
`--log-level`	`-l`	`info`	Verbosity of the operational log written to stderr: `debug`, `info`, `warn`, `error`.
`--show-idle`		`false`	Emit a `stats` record for every window that produces no alerts.
`--save-packets`		`0` (off)	Number of most recent packets per attack destination to write as a PCAP artifact when an `attack` alert fires.
`--capture-dir`		`./captures`	Directory for packet capture artifacts. Created if it does not exist.
`--version`			Print the binary version and exit.

Memory note: Flow tracking is capped at 1024 unique bidirectional flows per window. Windows that hit the cap log a warning; flows beyond the cap are not counted. In practice this limit is not reached in single-sandbox scenarios.

Integration with bottle / bottle-warden

botmon is designed as a sidecar to the bottle sandbox instrumentation framework. Add it to a bottle profile so every sandbox run inherits consistent thresholds:

cli:
  - command: >
      botmon {{ .VmInterface }} {{ .VmIp }}
      {{- if .C2Ip }} -c {{ .C2Ip }}{{ end }}
      -s {{ .SampleName }}
      -w 30 -p 20 -d 25
      --save-packets 100
      --capture-dir {{ .LogDir }}/captures
      -o {{ .LogDir }}/{{ .SampleName }}.eve.json
    output: file

bottle-warden can tail the Eve file to detect when beaconing stops or spikes, using standard Suricata tooling or plain jq queries.

Reproducibility notes for research

Configuration provenance

At startup, botmon prints every active flag value to stderr before processing begins:

Configuration:
  version: v0.3.0
  input: sample.pcap
  bot-ip: 10.0.0.5
  window: 30s
  packet-threshold: 20.00
  destination-threshold: 25.00
  ...

Capturing this output alongside Eve logs gives a complete record of the parameters used for a run.

Threshold selection

The default thresholds (-p 5, -d 10) are intentionally conservative. For publication, report the thresholds used and the window size explicitly, since both affect what is and is not classified as suspicious.

Offline vs. live capture

PCAP file analysis is fully deterministic: the same file with the same flags always produces the same output. Live interface capture is not deterministic (packet ordering depends on the OS scheduler). For reproducible experiments, record traffic first with tcpdump or Wireshark and replay offline.

Packet artifacts

When --save-packets N is set, botmon writes a .pcap file to --capture-dir for each attack alert, named after the sample ID, destination IP, and timestamp. These artifacts let you inspect the exact packets that triggered an alert without re-running the sandbox.

Development

# Run all tests
go test ./...

# Run with verbose window accounting
botmon sample.pcap 10.0.0.5 -l debug

# Build a local binary
go build -o botmon .

Module path: github.com/cochaviz/botmon

Source layout:

Path	Contents
`cmd/cli.go`	CLI entry point, flag definitions, startup banner
`internal/collector.go`	`FlowCollector` — packet accumulation and windowing
`internal/classifier.go`	`BehaviorClassifier` — threshold application, flow ID continuity
`internal/analysis.go`	`AnalysisConfiguration` — orchestration, Eve output, capture
`internal/behavior.go`	Data types: `LocalBehavior`, `GlobalBehavior`, `BehaviorFlow`
`internal/flows.go`	Flow key types, canonical orientation, `normalizedFlowCounts`
`internal/eve_logger.go`	Suricata Eve JSON serialization
`internal/utils.go`	Packet counting, RFC 1918 checks, source endpoint heuristics

References

[1] Nederlandse anti-DDoS Coalitie (NLADC). DDoS Dissector [Software]. SIDN Labs / NBIP. https://github.com/NLADC/dissector
