botmon

botmon monitors sandboxed malware or bot traffic and emits structured alerts whenever observed network behavior crosses configurable thresholds. It ingests either a live interface or a pre-recorded PCAP file, groups packets into fixed time windows, and writes Suricata-compatible Eve JSON for each window that shows suspicious activity.

The tool is designed for use inside automated malware-analysis pipelines. It integrates directly with bottle and bottle-warden, but runs standalone on any host with libpcap.

How it works

botmon runs a fixed-window analysis loop over captured packets.

Packets (live interface or PCAP)
  │
  ▼
FlowCollector          — groups packets into bidirectional flows per window
  │
  │  WindowStats (packet counts per flow, start time, duration)
  ▼
BehaviorClassifier     — applies thresholds, assigns local and global behaviors
  │
  │  LocalBehaviors (per-flow), GlobalBehavior (window-level)
  ▼
Eve JSON output        — Suricata-compatible alert or stats record per event

Each window produces:

  • One global behavior record reflecting the overall character of the window (scanning or idle).
  • Zero or more local behavior records, one per distinct bidirectional flow that crossed a threshold.

When the window is empty (no packets at all), cross-window state (flow IDs, previously seen hosts) is reset so that a resumed session is not contaminated by a previous one.

Classification model

Local behaviors (per-flow)

Every bidirectional flow observed in a window is classified as one of:

| Class | Condition | Eve event_type |
| --- | --- | --- |
| attack | Packet rate exceeds --packet-threshold and a C2 IP (-c/--c2-ip) is set | alert |
| outbound_connection | Packet rate below threshold, or no C2 IP is configured | alert |

The attack classification deliberately requires a known C2 IP because a high packet rate alone is ambiguous (e.g. bulk file transfer). If no -c/--c2-ip is provided, every flow is logged as outbound_connection regardless of rate.

Packet rate is computed as total packets in both directions divided by the window duration in seconds.
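
A minimal sketch of this rule, assuming the rate must strictly exceed the threshold; `classifyLocal` and its parameters are illustrative names, not botmon's actual API.

```go
package main

import "fmt"

// classifyLocal applies the per-flow rule described above (hypothetical helper).
// totalPackets counts packets in both directions within one window.
func classifyLocal(totalPackets int, windowSeconds, packetThreshold float64, c2Configured bool) (string, float64) {
	rate := float64(totalPackets) / windowSeconds
	// A high rate alone is ambiguous, so "attack" also requires a known C2 IP.
	if c2Configured && rate > packetThreshold {
		return "attack", rate
	}
	return "outbound_connection", rate
}

func main() {
	class, rate := classifyLocal(1800, 30, 20, true)
	fmt.Println(class, rate)

	// Same traffic without -c/--c2-ip is only an outbound connection.
	class, rate = classifyLocal(1800, 30, 20, false)
	fmt.Println(class, rate)
}
```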

Direction stats are included in every local behavior record: src_to_dst_packets, dst_to_src_packets, src_to_dst_rate, dst_to_src_rate, src_to_dst_bytes, dst_to_src_bytes, and an amplification factor (dst_to_src_bytes / src_to_dst_bytes). An amplification factor greater than 1 indicates the destination replied with more data than it received, which is the defining signal for reflection/amplification attacks (a packet-count ratio would miss high-bandwidth amplifiers that inflate response size rather than response count).
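
As an illustration, the amplification factor can be computed as below. The type and function names are hypothetical, and returning 0 when no request bytes were seen is an assumption: the source does not say how botmon handles a zero denominator.

```go
package main

import "fmt"

// directionStats holds the byte counters of one flow (hypothetical type).
type directionStats struct {
	SrcToDstBytes uint64
	DstToSrcBytes uint64
}

// amplificationFactor returns dst_to_src_bytes / src_to_dst_bytes.
// Assumption: a flow with no source-to-destination bytes yields 0.
func amplificationFactor(s directionStats) float64 {
	if s.SrcToDstBytes == 0 {
		return 0
	}
	return float64(s.DstToSrcBytes) / float64(s.SrcToDstBytes)
}

func main() {
	// One small request answered by a large response: a byte ratio of 50,
	// even though the packet ratio could be as low as 1:1.
	reflected := directionStats{SrcToDstBytes: 60, DstToSrcBytes: 3000}
	fmt.Println(amplificationFactor(reflected))

	// An ordinary flow where the destination replies with less data.
	ordinary := directionStats{SrcToDstBytes: 180000, DstToSrcBytes: 90000}
	fmt.Println(amplificationFactor(ordinary))
}
```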

The "source" of a flow is determined by a priority chain:

  1. Whichever side matches <bot-ip>.
  2. Whichever side is RFC 1918 (private address space).
  3. Whichever side sent fewer packets in the window (heuristic for scanner/initiator vs. responder).
  4. Canonical tie-break (lower IP address / lower port).

Global behaviors (window-level)

The global behavior summarizes the entire window:

| Class | Condition | Eve event_type |
| --- | --- | --- |
| scanning | A horizontal or vertical scan pattern is detected (see below) | alert |
| idle | No scan pattern detected (emitted only with --show-idle) | stats |

Scan detection

botmon detects scanning incrementally during packet collection, before the window closes. Two patterns are recognised:

| Pattern | Trigger |
| --- | --- |
| Horizontal scan | The bot contacts more than T distinct destination hosts on the same port within one window (e.g. SSH sweeping). |
| Vertical scan | The bot contacts more than T distinct destination ports on the same host within one window (e.g. service enumeration). |

The threshold T is computed as:

T = max(1, min(maxFlows/2, int(--destination-threshold × --window)))

where maxFlows is the internal flow-map capacity (1024). With the defaults (-d 10, -w 30) this gives T = min(512, 300) = 300.
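
The formula is easy to check with a few values. This is a sketch that takes the flag values as plain numbers; `scanThreshold` is an illustrative name rather than a function from botmon's source.

```go
package main

import "fmt"

// maxFlows is the flow-map capacity stated above.
const maxFlows = 1024

// scanThreshold computes T = max(1, min(maxFlows/2, int(d × window))).
func scanThreshold(destinationThreshold, windowSeconds float64) int {
	t := int(destinationThreshold * windowSeconds)
	if t > maxFlows/2 {
		t = maxFlows / 2 // never let scan tracking claim more than half the map
	}
	if t < 1 {
		t = 1 // always require at least one distinct destination
	}
	return t
}

func main() {
	fmt.Println(scanThreshold(10, 30))   // defaults: -d 10, -w 30
	fmt.Println(scanThreshold(25, 30))   // 750 is capped at maxFlows/2
	fmt.Println(scanThreshold(0.01, 30)) // rounds down to 0, raised to 1
}
```

With -d 25 and -w 30 the product is 750, so the cap of 512 applies; very small factors bottom out at 1.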

When a scan threshold is crossed, all matching flows are immediately removed from the flow statistics map and recorded as scan flows. Subsequent packets on any already-flagged port or host are skipped entirely, preventing the map from being saturated by scan traffic. This ensures that concurrent non-scan flows (e.g. C2 beaconing or file transfers) still have capacity in the map and continue to produce outbound_connection or attack local behavior events as normal.

The C2 IP supplied via -c/--c2-ip, along with any IPs supplied via -x/--ignore-dst, are excluded from the flow map before scan detection runs. They cannot contribute to scan counts.

Attack type classification

Every attack local behavior is further labelled with an attack_type string in the metadata.botmon object (e.g. "tcp_flood", "udp_flood", "dns_amplification"). The classification methodology is based on the NLADC DDoS Dissector [1], adapted for an attacker-side vantage point: the dissector observes traffic arriving at a victim and groups flows by source port (the reflector's service port); botmon observes traffic leaving the bot and groups flows by destination port (the same service port, from the other side).

How attack types are determined

Within each analysis window, all flows are grouped by (protocol, destination_port) from the bot's perspective. Any group that accounts for more than 5% of total bytes in the window is considered dominant. Bytes are used rather than packets because amplification attacks produce a small number of spoofed requests but a large volume of amplified response bytes, making byte fraction a more reliable dominance signal. For each dominant group, a flow is classified as "<service>_amplification" only when all three conditions hold:

  1. Protocol is UDP. Amplification is a UDP-only phenomenon; TCP traffic to the same service ports is not a reflection vector.
  2. The bot IP is absent from both flow endpoints. In an amplification attack the bot spoofs the victim's IP as the UDP source, so the reflector replies to the victim — the flow captured on the sandbox interface is between the victim and the reflector, and the bot's own IP never appears. If the bot IP is present as either endpoint the traffic is a direct flood. If the bot IP is not configured (<bot-ip> unset), absence cannot be verified and the flow is conservatively treated as a flood.
  3. The destination port appears in the amplification services table.

If any condition fails the flow is classified as "<protocol>_flood" instead (e.g. "tcp_flood", "udp_flood"). Flows whose group is not dominant also default to "<protocol>_flood".
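
The decision procedure above might look like the sketch below. All names are hypothetical, the services map is a short excerpt, and lower-casing the service name to build the label is an assumption based on the "dns_amplification" example; how multi-word service names are rendered is not documented here.

```go
package main

import (
	"fmt"
	"strings"
)

// flow is a simplified per-window flow record (hypothetical type).
type flow struct {
	Proto        string // "tcp" or "udp"
	SrcIP, DstIP string
	DstPort      uint16
	Bytes        uint64
}

// Excerpt of the amplification services table.
var amplificationServices = map[uint16]string{53: "DNS", 123: "NTP", 1900: "SSDP"}

type groupKey struct {
	Proto   string
	DstPort uint16
}

// attackTypes labels each flow, in input order, with its attack_type.
func attackTypes(flows []flow, botIP string) []string {
	var total uint64
	byGroup := map[groupKey]uint64{}
	for _, f := range flows {
		total += f.Bytes
		byGroup[groupKey{f.Proto, f.DstPort}] += f.Bytes
	}
	labels := make([]string, len(flows))
	for i, f := range flows {
		labels[i] = f.Proto + "_flood" // default when any condition fails
		if total == 0 {
			continue
		}
		// Dominant: the group carries more than 5% of the window's bytes.
		groupBytes := byGroup[groupKey{f.Proto, f.DstPort}]
		dominant := float64(groupBytes) > 0.05*float64(total)
		service, known := amplificationServices[f.DstPort]
		// With no bot IP configured, absence cannot be verified.
		botAbsent := botIP != "" && f.SrcIP != botIP && f.DstIP != botIP
		if dominant && f.Proto == "udp" && botAbsent && known {
			labels[i] = strings.ToLower(service) + "_amplification"
		}
	}
	return labels
}

func main() {
	flows := []flow{
		{"udp", "192.0.2.10", "198.51.100.7", 53, 90000}, // spoofed victim to reflector
		{"tcp", "10.0.0.5", "203.0.113.9", 80, 10000},    // TCP is never amplification
		{"udp", "10.0.0.5", "198.51.100.8", 53, 1000},    // bot IP present: direct flood
	}
	fmt.Println(attackTypes(flows, "10.0.0.5"))
}
```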

Amplification services table

The port-to-service mapping is taken verbatim from the dissector's AMPLIFICATION_SERVICES dictionary [1]:

| Port | Service |
| --- | --- |
| 17 | Quote of the Day |
| 19 | Chargen |
| 53 | DNS |
| 69 | TFTP |
| 111 | TPC |
| 123 | NTP |
| 137 | NetBios |
| 161 | SNMP |
| 177 | XDMCP |
| 389 | LDAP |
| 500 | ISAKMP |
| 520 | RIPv1 |
| 623 | IPMI |
| 1434 | MS SQL |
| 1900 | SSDP |
| 3283 | Apple Remote Desktop |
| 3389 | Windows Remote Desktop |
| 3702 | WS-Discovery |
| 5093 | Sentinel |
| 5351 | NAT-PMP |
| 5353 | mDNS |
| 5683 | CoAP |
| 10074 | Mitel MiColab (CVE-2022-26143) |
| 11211 | MEMCACHED |
| 27015 | Steam |
| 32414 | Plex Media |
| 33848 | Jenkins |
| 37810 | DHDiscover |

Comparison with the NLADC dissector

The table below shows how each classification step in the dissector [1] maps to botmon's current implementation and what remains planned.

Structural grouping (attack vector identification)

| Step | Dissector | botmon | Status |
| --- | --- | --- | --- |
| Group by (protocol, source_port) / (Protocol, DstPort) | 5%, no z-score | 5%, no z-score | Implemented ✓ |
| Min vector fraction to keep | 5% of bytes | 5% of bytes | Implemented ✓ |
| Second pass on remainder traffic: group by (protocol, destination_port) | 10%, no z-score | | Planned |

The dissector's remainder pass catches floods that use randomised source ports and therefore have no dominant source port. botmon does not implement this pass yet. The gap is smaller than it appears, though: for such an attack the dominant signal is the destination port, which botmon's primary grouping already captures from the bot's perspective.

Per-vector L3 attributes

| Field | Dissector | botmon | Status |
| --- | --- | --- | --- |
| Destination port distribution | 10%, no z-score | | Planned |
| Source IP list | All distinct IPs | | Planned |
| Frame length | 5%, z-score on | | Planned |
| Ethernet type | 5%, z-score on | | Planned |
| IP fragmentation offset | 10%, z-score on | | Planned |
| IP TTL | 10%, z-score on | | Planned |

Per-vector L4 attributes

| Field | Dissector | botmon | Status |
| --- | --- | --- | --- |
| TCP flags distribution | 10%, z-score on | | Planned |
| ICMP type distribution | 10%, z-score on | | Planned |

Per-vector L7 attributes

| Field | Dissector | botmon | Status |
| --- | --- | --- | --- |
| DNS query name | 10%, z-score on | | Out of scope (for now) |
| DNS query type | 10%, z-score on | | Out of scope (for now) |
| HTTP URI | 5%, z-score on | | Out of scope (for now) |
| HTTP method | 10%, z-score on | | Out of scope (for now) |
| HTTP user-agent | 5%, z-score on | | Out of scope (for now) |
| NTP requestcode | 10%, z-score on | | Out of scope (for now) |

Other

| Step | Dissector | botmon | Status |
| --- | --- | --- | --- |
| Target inference (dominant destination address) | 50%, no z-score | N/A (bot IP known from config) | Not applicable |
| Fragmentation vector handling | Separate pass, restricted to known attacker IPs | | Planned |
| Multi-vector merging (same service+protocol) | Deduplicated into one vector | | Planned |

Z-score pattern: the dissector uses absolute threshold only (no z-score) for the high-level structural decisions (which protocol+port combinations dominate), and enables z-score for secondary per-vector attribute extraction (TTL, frame length, flags, etc.). The intuition is that z-score adds noise at the grouping stage but surfaces meaningful peaks within an already-identified vector where distributions can be multimodal. botmon follows the same rule for the grouping stage; z-score will be opt-in when per-vector attribute extraction is added.

Output format

botmon writes one JSON object per line (NDJSON), compatible with Suricata's Eve schema. Standard Eve fields are present (timestamp, event_type, src_ip, dest_ip, src_port, dest_port, proto, flow_id, alert), with botmon-specific detail under metadata.botmon.

Local behavior record (attack or outbound_connection)

{
  "timestamp": "2024-11-01T12:00:15.000000Z",
  "event_type": "alert",
  "src_ip": "10.0.0.5",
  "dest_ip": "1.2.3.4",
  "src_port": 54321,
  "dest_port": 80,
  "proto": "tcp",
  "flow_id": 12345678901234567,
  "host": "my-sample-42",
  "alert": {
    "action": "allowed",
    "gid": 5,
    "signature_id": 2100001,
    "rev": 1,
    "signature": "botmon high packet-rate to single host",
    "category": "attack",
    "severity": 2
  },
  "metadata": {
    "botmon": {
      "scope": "local",
      "context": {
        "sample_id": "my-sample-42",
        "bot_ip": "10.0.0.5",
        "c2_ip": "203.0.113.4"
      },
      "attack_type": "tcp_flood",
      "packet_rate": 60.0,
      "packet_threshold": 20.0,
      "src_to_dst_packets": 1200,
      "dst_to_src_packets": 600,
      "src_to_dst_rate": 40.0,
      "dst_to_src_rate": 20.0,
      "src_to_dst_bytes": 180000,
      "dst_to_src_bytes": 90000,
      "amplification_factor": 0.5
    }
  }
}

Global behavior record (scanning)

{
  "timestamp": "2024-11-01T12:00:15.000000Z",
  "event_type": "alert",
  "src_ip": "10.0.0.5",
  "dest_ip": "0.0.0.0",
  "flow_id": 98765432109876543,
  "alert": {
    "signature_id": 2100002,
    "signature": "botmon horizontal scan host-rate exceeded",
    "category": "scan",
    "severity": 3
  },
  "metadata": {
    "botmon": {
      "scope": "global",
      "context": { "sample_id": "my-sample-42", "bot_ip": "10.0.0.5" },
      "packet_rate": 210.0,
      "packet_threshold": 20.0,
      "destination_rate": 35.0,
      "destination_rate_threshold": 10.0
    }
  }
}

The list of scanned destination IPs is available through the corresponding local behavior records emitted in the same window. Each local record's dest_ip is one flow target.

Idle record

Emitted only with --show-idle. Uses event_type: stats (not alert) and carries the measured rates even though no threshold was crossed. Useful for correlating quiet periods against sandbox execution state.

Signature IDs

| Behavior | signature_id |
| --- | --- |
| attack | 2100001 |
| scanning | 2100002 |
| outbound_connection | 2100003 |

These IDs are stable across versions and can be used as filters in downstream processing (e.g. jq 'select(.alert.signature_id == 2100001)').

Flow IDs

Each behavior is assigned a stable flow_id that persists across consecutive windows as long as the same flow remains active. A gap (empty window) resets continuity. This lets you reconstruct the timeline of a single flow across multiple windows by grouping on flow_id.
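
Grouping on flow_id could look like the following sketch. It assumes records have already been decoded and arrive in log order (the Eve file is appended chronologically); `behaviorRecord` and `groupByFlowID` are illustrative names.

```go
package main

import "fmt"

// behaviorRecord is a decoded local behavior record (hypothetical type).
type behaviorRecord struct {
	FlowID     uint64
	Timestamp  string
	PacketRate float64
}

// groupByFlowID builds one timeline per flow, preserving input order.
func groupByFlowID(records []behaviorRecord) map[uint64][]behaviorRecord {
	timelines := make(map[uint64][]behaviorRecord)
	for _, r := range records {
		timelines[r.FlowID] = append(timelines[r.FlowID], r)
	}
	return timelines
}

func main() {
	records := []behaviorRecord{
		{FlowID: 1, Timestamp: "2024-11-01T12:00:30Z", PacketRate: 4.0},
		{FlowID: 2, Timestamp: "2024-11-01T12:00:30Z", PacketRate: 60.0},
		{FlowID: 1, Timestamp: "2024-11-01T12:01:00Z", PacketRate: 5.5},
	}
	// Print how flow 1 evolved across consecutive windows.
	for _, r := range groupByFlowID(records)[1] {
		fmt.Println(r.Timestamp, r.PacketRate)
	}
}
```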

Installation

Requirements:

  • Linux with libpcap headers (libpcap-dev on Debian/Ubuntu) for live capture.
  • Root or appropriate group membership to open a network interface. PCAP files work without elevated privileges.
  • Go 1.24+ for building from source.

# Install from the module registry
go install github.com/cochaviz/botmon@latest

# Or build from source
git clone https://github.com/cochaviz/botmon.git
cd botmon
go build -o botmon ./cmd/botmon

Confirm installation:

botmon --version
# botmon version v0.3.0

Usage

botmon <input> <bot-ip> [flags]

| Argument | Description |
| --- | --- |
| <input> | Network interface name (e.g. eth0, vnet0) or path to a .pcap / .pcapng file. |
| <bot-ip> | IPv4 address of the monitored host (the bot). Used to orient flows and populate bot_ip in output. |

Analyze a PCAP file (offline)

botmon sample.pcap 10.0.0.5 \
  -w 15 -p 20 -d 25 \
  -c 203.0.113.4 \
  -s my-sample-42 \
  -o /tmp/my-sample-42.eve.json

Monitor a live interface

sudo botmon vnet0 10.10.0.20 \
  -c 203.0.113.4 \
  -s beacon-42 \
  --save-packets 100 \
  --capture-dir /var/log/botmon/captures \
  --show-idle

Exclude infrastructure IPs from metrics

Use -x / --ignore-dst to drop known-benign endpoints (DNS resolver, gateway, sandbox controller) from flow counts so they do not inflate rates or appear in scan lists. The flag is repeatable:

botmon sample.pcap 10.0.0.5 \
  -x 10.0.0.1 \
  -x 8.8.8.8 \
  -x 192.168.1.254

The C2 IP supplied via -c/--c2-ip is automatically added to this exclusion list.

All flags

| Flag | Short | Default | Description |
| --- | --- | --- | --- |
| --window | -w | 30 | Analysis window size in seconds. |
| --packet-threshold | -p | 5 | Packets per second per flow before the flow is classified as attack (requires -c/--c2-ip). |
| --destination-threshold | -d | 10 | Scan threshold factor: T = max(1, min(maxFlows/2, int(d × window))) unique flows per port or host before classifying the window as scanning. |
| --c2-ip | -c | (unset) | Known C2 server IP. Required for attack classification; automatically excluded from metrics. |
| --sample-id | -s | (unset) | Free-form identifier written into every output record's context.sample_id. |
| --ignore-dst | -x | (none) | Destination IPs to exclude from all metrics. Repeatable. |
| --eve-log-path | -o | (stdout) | File path for Eve JSON output. Appends if the file already exists. |
| --log-level | -l | info | Verbosity of the operational log written to stderr: debug, info, warn, error. |
| --show-idle | | false | Emit a stats record for every window that produces no alerts. |
| --save-packets | | 0 (off) | Number of most recent packets per attack destination to write as a PCAP artifact when an attack alert fires. |
| --capture-dir | | ./captures | Directory for packet capture artifacts. Created if it does not exist. |
| --version | | | Print the binary version and exit. |

Memory note: Flow tracking is capped at 1024 unique bidirectional flows per window. Windows that hit the cap log a warning; flows beyond the cap are not counted. In practice this limit is not reached in single-sandbox scenarios.

Integration with bottle / bottle-warden

botmon is designed as a sidecar to the bottle sandbox instrumentation framework. Add it to a bottle profile so every sandbox run inherits consistent thresholds:

cli:
  - command: >
      botmon {{ .VmInterface }} {{ .VmIp }}
      {{- if .C2Ip }} -c {{ .C2Ip }}{{ end }}
      -s {{ .SampleName }}
      -w 30 -p 20 -d 25
      --save-packets 100
      --capture-dir {{ .LogDir }}/captures
      -o {{ .LogDir }}/{{ .SampleName }}.eve.json
    output: file

bottle-warden can tail the Eve file to detect when beaconing stops or spikes, using standard Suricata tooling or plain jq queries.

Reproducibility notes for research

Configuration provenance

At startup, botmon prints every active flag value to stderr before processing begins:

Configuration:
  version: v0.3.0
  input: sample.pcap
  bot-ip: 10.0.0.5
  window: 30s
  packet-threshold: 20.00
  destination-threshold: 25.00
  ...

Capturing this output alongside Eve logs gives a complete record of the parameters used for a run.

Threshold selection

The default thresholds (-p 5, -d 10) are intentionally conservative. For publication, report the thresholds used and the window size explicitly, since both affect what is and is not classified as suspicious.

Offline vs. live capture

PCAP file analysis is fully deterministic: the same file with the same flags always produces the same output. Live interface capture is not deterministic (packet ordering depends on the OS scheduler). For reproducible experiments, record traffic first with tcpdump or Wireshark and replay offline.

Packet artifacts

When --save-packets N is set, botmon writes a .pcap file to --capture-dir for each attack alert, named after the sample ID, destination IP, and timestamp. These artifacts let you inspect the exact packets that triggered an alert without re-running the sandbox.

Development

# Run all tests
go test ./...

# Run with verbose window accounting
botmon sample.pcap 10.0.0.5 -l debug

# Build and install locally
go build -o botmon ./cmd/botmon

Module path: github.com/cochaviz/botmon

Source layout:

| Path | Contents |
| --- | --- |
| cmd/cli.go | CLI entry point, flag definitions, startup banner |
| internal/collector.go | FlowCollector: packet accumulation and windowing |
| internal/classifier.go | BehaviorClassifier: threshold application, flow ID continuity |
| internal/analysis.go | AnalysisConfiguration: orchestration, Eve output, capture |
| internal/behavior.go | Data types: LocalBehavior, GlobalBehavior, BehaviorFlow |
| internal/flows.go | Flow key types, canonical orientation, normalizedFlowCounts |
| internal/eve_logger.go | Suricata Eve JSON serialization |
| internal/utils.go | Packet counting, RFC 1918 checks, source endpoint heuristics |

References

[1] Nederlandse anti-DDoS Coalitie (NLADC). DDoS Dissector [Software]. SIDN Labs / NBIP. https://github.com/NLADC/dissector
