Rust library and CLI for parsing FreeSWITCH mod_sofia SIP trace dump files.
```sh
cargo run --features cli -- [OPTIONS] [FILES...]
```

```toml
[dependencies]
freeswitch-sofia-trace-parser = "0"
```

FreeSWITCH logs SIP traffic to dump files at
`/var/log/freeswitch/sip_traces/{profile}/{profile}.dump` (rotated as `.dump.1.xz`, etc.).
This library provides a streaming, multi-level parser:
- Level 1 (Frames): Split raw bytes on `\x0B\n` boundaries, parse frame headers
- Level 2 (Messages): Reassemble TCP segments, split aggregated messages by Content-Length
- Level 3 (Parsed SIP): Extract method/status, headers, body, and multipart MIME parts
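To make the Level 1 step concrete, here is a minimal sketch of boundary splitting using only the standard library. The function name `split_frames` is hypothetical and is not part of the crate's API. It also simplifies one point: it treats every `\x0B\n` as a boundary, whereas the parser described here additionally requires a valid frame header after the boundary.

```rust
/// Split a raw dump buffer into frames on the `\x0B\n` boundary.
/// Simplified sketch: no frame-header validation is performed.
fn split_frames(data: &[u8]) -> Vec<&[u8]> {
    const BOUNDARY: &[u8] = b"\x0B\n";
    let mut frames = Vec::new();
    let mut start = 0;
    let mut i = 0;
    while i + BOUNDARY.len() <= data.len() {
        if &data[i..i + BOUNDARY.len()] == BOUNDARY {
            // Emit the bytes collected since the previous boundary.
            if i > start {
                frames.push(&data[start..i]);
            }
            i += BOUNDARY.len();
            start = i;
        } else {
            i += 1;
        }
    }
    // EOF without a trailing boundary: keep the remainder as a final frame.
    if start < data.len() {
        frames.push(&data[start..]);
    }
    frames
}
```

A lone `\x0B` that is not followed by `\n` stays inside the frame, which is the behavior needed for XML or binary payloads that happen to contain that byte.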
```rust
use std::fs::File;
use freeswitch_sofia_trace_parser::{MessageIterator, SipMessage};

let file = File::open("profile.dump")?;
for result in MessageIterator::new(file) {
    let msg: SipMessage = result?;
    println!("{} {} {}:{} ({} frames, {} bytes)",
        msg.timestamp, msg.direction, msg.transport, msg.address,
        msg.frame_count, msg.content.len());
}
```

```rust
use std::fs::File;
use freeswitch_sofia_trace_parser::ParsedMessageIterator;

let file = File::open("profile.dump")?;
for result in ParsedMessageIterator::new(file) {
    let msg = result?;
    println!("{} {} {} call-id={}",
        msg.timestamp, msg.direction, msg.message_type,
        msg.call_id().unwrap_or("-"));
}
```

```rust
use std::fs::File;
use freeswitch_sofia_trace_parser::ParsedMessageIterator;

let file = File::open("profile.dump")?;
for result in ParsedMessageIterator::new(file) {
    let msg = result?;
    if let Some(parts) = msg.body_parts() {
        for part in &parts {
            println!("  part: {} ({} bytes)",
                part.content_type().unwrap_or("(none)"),
                part.body.len());
        }
    }
}
```

`ParsedSipMessage` provides three methods for body access:

- `body_data()`: raw bytes as UTF-8 (no processing, exact wire representation)
- `body_text()`: for JSON content types, unescapes RFC 8259 string escape sequences (`\r\n` → CRLF, `\t` → tab, `\"` → `"`, `\uXXXX` → Unicode including surrogate pairs); passthrough for all other content types
- `json_field(key)`: parses body as JSON, returns unescaped string value for a top-level key; returns `None` if content type is not JSON, body is invalid, key is missing, or value is not a string

JSON-aware behavior activates for application/json and any application/*+json subtype (e.g., application/emergencyCallData.AbandonedCall+json). Matching is case-insensitive; media type parameters like charset=utf-8 are ignored.
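The matching rule above can be illustrated with a short standalone function. This is a sketch of the stated rule only; the name `is_json_content_type` is hypothetical and the crate's internal check may be implemented differently.

```rust
/// Return true for `application/json` and any `application/*+json` subtype.
/// Matching is case-insensitive and ignores media type parameters.
fn is_json_content_type(content_type: &str) -> bool {
    // Drop parameters such as `; charset=utf-8`, then normalize case.
    let media_type = content_type
        .split(';')
        .next()
        .unwrap_or("")
        .trim()
        .to_ascii_lowercase();
    match media_type.strip_prefix("application/") {
        Some(subtype) => {
            subtype == "json"
                || (subtype.len() > "+json".len() && subtype.ends_with("+json"))
        }
        None => false,
    }
}
```

With this rule, `Application/JSON; charset=utf-8` and `application/emergencyCallData.AbandonedCall+json` both count as JSON, while `application/sdp` and `text/json` do not.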
```rust
use std::fs::File;
use freeswitch_sofia_trace_parser::ParsedMessageIterator;

let file = File::open("profile.dump")?;
for result in ParsedMessageIterator::new(file) {
    let msg = result?;

    // Extract embedded INVITE from NG9-1-1 AbandonedCall JSON NOTIFY
    if let Some(invite) = msg.json_field("invite") {
        println!("{}", invite); // actual CRLF, not literal \r\n
    }

    // body_text() unescapes JSON, so the result is greppable with regex
    let text = msg.body_text();
    if text.contains("urn:service:sos") {
        println!("Emergency call: {}", msg.call_id().unwrap_or("-"));
    }
}
```

```rust
use std::process::{Command, Stdio};
use freeswitch_sofia_trace_parser::MessageIterator;

let child = Command::new("xzcat")
    .arg("profile.dump.1.xz")
    .stdout(Stdio::piped())
    .spawn()?;

for msg in MessageIterator::new(child.stdout.unwrap()) {
    let msg = msg?;
    // process message...
}
```

```rust
use std::fs::File;
use freeswitch_sofia_trace_parser::FrameIterator;

let f1 = File::open("profile.dump.2")?;
let f2 = File::open("profile.dump.1")?;
let chain = std::io::Read::chain(f1, f2);

for frame in FrameIterator::new(chain) {
    let frame = frame?;
    // Truncated first frames at file boundaries are handled automatically
}
```

- Truncated first frame (rotated files, `xzgrep` extracts, pipe mid-stream)
- `\x0B` in XML/binary content (not a boundary unless followed by valid header)
- Multiple SIP messages aggregated in one TCP read
- TCP segment reassembly (consecutive same-direction same-address frames)
- File concatenation (`cat dump.2 dump.1 | parser`)
- Non-UTF-8 content (works on `&[u8]`)
- EOF without trailing `\x0B\n`
- Multipart MIME bodies (SDP + PIDF/EIDO splitting for NG-911)
- JSON body unescaping for `application/json` and `application/*+json` content types
- TLS keep-alive whitespace (RFC 5626 CRLF probes, sofia-sip bare `\n`)
- Logrotate replay detection (partial frame re-written at start of new file)
- Incomplete frames at EOF (byte_count exceeds available content)
- Byte-level input coverage tracking (`ParseStats` with unparsed region reporting)

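The "multiple SIP messages aggregated in one TCP read" case from the list above can be sketched as follows. This is an illustrative standalone version with hypothetical helper names (`find`, `content_length`, `split_messages`), not the crate's implementation. It assumes headers end at the first blank line and treats a missing or unparsable Content-Length as zero.

```rust
/// Find the first occurrence of `needle` in `haystack`.
fn find(haystack: &[u8], needle: &[u8]) -> Option<usize> {
    haystack.windows(needle.len()).position(|w| w == needle)
}

/// Read the Content-Length value from a header block (0 if absent or unparsable).
fn content_length(headers: &[u8]) -> usize {
    for line in headers.split(|&b| b == b'\n') {
        let line = String::from_utf8_lossy(line);
        if let Some((name, value)) = line.split_once(':') {
            let name = name.trim();
            // "l" is the compact form of Content-Length (RFC 3261).
            if name.eq_ignore_ascii_case("content-length") || name.eq_ignore_ascii_case("l") {
                return value.trim().parse().unwrap_or(0);
            }
        }
    }
    0
}

/// Split a reassembled byte stream into complete SIP messages.
/// A trailing message whose body is not fully present is left out.
fn split_messages(mut buf: &[u8]) -> Vec<&[u8]> {
    let mut messages = Vec::new();
    while let Some(header_end) = find(buf, b"\r\n\r\n") {
        let body_start = header_end + 4;
        let end = body_start + content_length(&buf[..header_end]);
        if end > buf.len() {
            break; // body incomplete: more TCP segments are needed
        }
        messages.push(&buf[..end]);
        buf = &buf[end..];
    }
    messages
}
```

The early `break` is the point where a reassembling parser would keep the remaining bytes buffered and wait for the next same-direction, same-address frame.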
Tested against 83 production dump files (~12GB) from FreeSWITCH NG-911 infrastructure:
| Profile | Files | Frames | Messages | Multi-frame | byte_count mismatches |
|---|---|---|---|---|---|
| TCP IPv4 | 14 | 6.2M | 6.0M | 21,492 (max 7) | 0 |
| UDP IPv4 | 13 | 4.8M | 4.8M (1:1) | 0 | 0 |
| TLS IPv6 | 18 | 5.9M | 5.9M | 108 | 0 |
| TLS IPv4 | 5 | 660K | 660K | 70 | 0 |
| TCP IPv6 | 3 | 327K | 327K | - | 0 |
| UDP IPv6 | 3 | 301K | 301K (1:1) | 0 | 0 |
| Internal TCP v4 | 13 | 723K | - | - | 0 |
| Internal TCP v6 | 13 | 836K | - | - | 0 |
- Zero byte_count mismatches across all frames
- 99.99%+ of reassembled messages start with a valid SIP request/response line
- Level 3 SIP parsing: 100% on all tested profiles (TCP, UDP, TLS)
- Multipart body splitting: 1,223 multipart messages, 2,446 parts (SDP + PIDF), 0 failures
- File concatenation (`cat dump.29 dump.28 |`): 965,515 frames, zero mismatches
Every sample file is verified for byte-level parse coverage. Each unparsed region is
classified by SkipReason:
- `PartialFirstFrame`: truncated frame at start of file (logrotate, pipe, grep extract), capped at 65535 bytes
- `OversizedFrame`: skipped region exceeds 65535 bytes (corrupt or non-dump content)
- `ReplayedFrame`: logrotate wrote a partial frame tail at the start of the new file
- `MidStreamSkip`: unrecoverable bytes skipped mid-stream (e.g., TCP reassembly edge case)
- `IncompleteFrame`: frame at EOF with fewer bytes than declared in the header
- `InvalidHeader`: data starts with `recv`/`sent` but header fails to parse

`ParseStats` exposes `bytes_read`, `bytes_skipped`, and detailed `UnparsedRegion` records
with offset, length, and skip reason for each region.
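As a rough illustration of how these numbers relate, the sketch below uses local stand-in types that mirror the names mentioned above. They are assumptions for the example, not the crate's definitions: the `regions` field name, the integer types, and the helper functions `classify_leading_skip` and `coverage_percent` are all hypothetical.

```rust
/// Stand-in for the crate's `SkipReason`; variant names come from the list above.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SkipReason {
    PartialFirstFrame,
    OversizedFrame,
    ReplayedFrame,
    MidStreamSkip,
    IncompleteFrame,
    InvalidHeader,
}

/// Size cap quoted in the descriptions above.
const MAX_PARTIAL_FRAME: u64 = 65535;

/// Stand-in record for one unparsed region.
#[allow(dead_code)]
struct UnparsedRegion {
    offset: u64,
    length: u64,
    reason: SkipReason,
}

/// Stand-in for `ParseStats` (the `regions` field name is an assumption).
#[allow(dead_code)]
struct ParseStats {
    bytes_read: u64,
    bytes_skipped: u64,
    regions: Vec<UnparsedRegion>,
}

/// Classify a skipped region at the start of a file by its size.
fn classify_leading_skip(length: u64) -> SkipReason {
    if length <= MAX_PARTIAL_FRAME {
        SkipReason::PartialFirstFrame
    } else {
        SkipReason::OversizedFrame
    }
}

/// Percentage of input bytes that were parsed rather than skipped.
fn coverage_percent(stats: &ParseStats) -> f64 {
    if stats.bytes_read == 0 {
        return 100.0;
    }
    let parsed = stats.bytes_read.saturating_sub(stats.bytes_skipped);
    100.0 * parsed as f64 / stats.bytes_read as f64
}
```

A check such as "at most one partial first frame per file and no invalid header skips" can then be expressed as a filter over the region records.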
The parser is designed for constant-memory streaming of arbitrarily large inputs,
including multi-day dump file chains (50GB+). Memory behavior was validated using
jemalloc heap profiling (_RJEM_MALLOC_CONF=prof:true) and gdb inspection of live
data structures during processing of 50+ chained dump files.
Parser internals at runtime (gdb-verified):
- `FrameIterator::buf`: 64KB capacity, ~200 bytes used (single read buffer, never grows)
- `MessageIterator::buffers`: 0 entries (TCP reassembly buffers evicted after message extraction)
- `MessageIterator::ready`: 0 entries, capacity 10 (drained each iteration)

Design choices that maintain constant memory:
- `SkipTracking` defaults to `CountOnly`: no allocation for unparsed region tracking unless opted in
- TCP connection buffers are eagerly removed after complete message extraction
- Stale buffers (>2h inactive) are evicted via time-based sweep to handle TLS ephemeral port accumulation
- `flush_all()` clears the entire buffer map at EOF

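The time-based sweep mentioned in the list above might look roughly like this. The sketch is an assumption about shape only: `ConnBuffer`, the `String` connection key, and representing frame timestamps as a `Duration` offset are all hypothetical choices made for the example.

```rust
use std::collections::HashMap;
use std::time::Duration;

/// Hypothetical per-connection reassembly buffer.
#[allow(dead_code)]
struct ConnBuffer {
    data: Vec<u8>,
    /// Timestamp of the last frame seen on this connection,
    /// expressed as an offset from an arbitrary epoch.
    last_seen: Duration,
}

/// Remove buffers that have been idle longer than `max_idle`.
/// Returns the number of evicted entries.
fn evict_stale(
    buffers: &mut HashMap<String, ConnBuffer>,
    now: Duration,
    max_idle: Duration,
) -> usize {
    let before = buffers.len();
    buffers.retain(|_, buf| now.saturating_sub(buf.last_seen) <= max_idle);
    before - buffers.len()
}
```

Driving the sweep from timestamps in the trace rather than from wall-clock time keeps the behavior the same whether a dump is replayed in seconds or tailed live; whether the crate does this is not stated here.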
Consumers processing many files should open files lazily (one at a time) rather
than using Read::chain() upfront, which keeps all file handles and decompression
state alive for the entire run. With 50+ XZ-compressed dump files, eager chaining
consumed 172MB of LZMA decoder state alone.
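One way to open files lazily is a small reader that holds a single handle at a time. The `LazyChain` type below is a hypothetical sketch built on the standard library, not something the crate is known to provide. It handles plain files only; for compressed inputs, the decompressing reader would be constructed at the same point where `File::open` is called.

```rust
use std::fs::File;
use std::io::{self, Read};
use std::path::PathBuf;

/// Reads a sequence of files back to back, opening each one only
/// when the previous file has been fully consumed.
struct LazyChain {
    pending: std::vec::IntoIter<PathBuf>,
    current: Option<File>,
}

impl LazyChain {
    fn new(paths: Vec<PathBuf>) -> Self {
        LazyChain { pending: paths.into_iter(), current: None }
    }
}

impl Read for LazyChain {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        if buf.is_empty() {
            return Ok(0);
        }
        loop {
            if self.current.is_none() {
                match self.pending.next() {
                    Some(path) => self.current = Some(File::open(path)?),
                    None => return Ok(0), // all files consumed
                }
            }
            let n = self.current.as_mut().unwrap().read(buf)?;
            if n > 0 {
                return Ok(n);
            }
            // Current file is exhausted: drop its handle before opening the next.
            self.current = None;
        }
    }
}
```

Because `LazyChain` implements `Read`, it can be passed anywhere a file or a `Read::chain()` result is accepted, and at most one file handle is open at any moment.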
OPTIONS keepalives are excluded by default (use --all-methods to include them).
```sh
# One-line summary (OPTIONS excluded by default)
freeswitch-sofia-trace-parser profile.dump

# Pipe from xzcat
xzcat profile.dump.1.xz | freeswitch-sofia-trace-parser

# Filter by method: shows INVITE requests and their 100/180/200 responses
freeswitch-sofia-trace-parser -m INVITE profile.dump

# Filter by Call-ID regex
freeswitch-sofia-trace-parser -c '6fba3e7e-dddf' profile.dump

# Header regex: all sent INVITEs from a specific extension
freeswitch-sofia-trace-parser -m INVITE -d sent -H 'From=Extension 1583' profile.dump

# Grep for a string anywhere in the SIP message (headers + body)
freeswitch-sofia-trace-parser -g '15551234567' profile.dump

# Body grep: match only in message body (SDP, EIDO XML, etc.)
freeswitch-sofia-trace-parser -b 'conference-info' -m NOTIFY --body profile.dump

# Extract SDP body from a specific call's INVITEs
freeswitch-sofia-trace-parser -c '6fba3e7e' -m INVITE -d sent --body profile.dump

# Full SIP message output
freeswitch-sofia-trace-parser -c '6fba3e7e' --full profile.dump

# Statistics: method and status code distribution
freeswitch-sofia-trace-parser --stats profile.dump

# Multiple files (concatenated in order)
freeswitch-sofia-trace-parser profile.dump.2 profile.dump.1 profile.dump

# Raw frames (level 1) or reassembled messages (level 2)
freeswitch-sofia-trace-parser --frames profile.dump
freeswitch-sofia-trace-parser --raw profile.dump
```

Use `-D` to expand matched messages to full Call-ID conversations. When any message
matches, all messages sharing its Call-ID are output. This is done in a single pass, so it works with stdin/pipes.

```sh
# Find dialogs containing INVITEs, show full call flow
freeswitch-sofia-trace-parser -D -m INVITE profile.dump

# Find all dialogs related to an incident ID (across profiles)
freeswitch-sofia-trace-parser -D -H 'Call-Info=abc123def456' \
    esinet1-v4-tcp.dump.* esinet1-v6-tcp.dump.*

# Find dialogs by phone number anywhere in message
freeswitch-sofia-trace-parser -D -g '15551234567' profile.dump.*

# Find dialogs by body content (EIDO XML, PIDF)
freeswitch-sofia-trace-parser -D -b 'Moncton' --full profile.dump.*

# Works with stdin/pipes
xzcat profile.dump.1.xz | freeswitch-sofia-trace-parser -D -m INVITE
```

Terminated dialogs (BYE + 200 OK) that never matched are pruned during processing to limit memory usage. Unmatched Call-IDs with only OPTIONS traffic are never buffered.

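The single-pass dialog expansion described above can be pictured with the following sketch. It is a simplified model under stated assumptions, not the CLI's actual code: `Msg` and `expand_dialogs` are hypothetical names, the output order shown here (buffered messages flushed when the first match arrives) is a guess, and the pruning of terminated dialogs is omitted.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical minimal message: a Call-ID plus a printable summary line.
struct Msg {
    call_id: String,
    line: String,
}

/// Emit every message that belongs to a Call-ID with at least one match.
fn expand_dialogs<I, F>(messages: I, is_match: F) -> Vec<String>
where
    I: IntoIterator<Item = Msg>,
    F: Fn(&Msg) -> bool,
{
    let mut matched: HashSet<String> = HashSet::new();
    let mut pending: HashMap<String, Vec<String>> = HashMap::new();
    let mut output = Vec::new();

    for msg in messages {
        if matched.contains(&msg.call_id) {
            // Dialog already selected: pass the message straight through.
            output.push(msg.line);
        } else if is_match(&msg) {
            // First match for this Call-ID: flush what was buffered so far.
            if let Some(earlier) = pending.remove(&msg.call_id) {
                output.extend(earlier);
            }
            matched.insert(msg.call_id);
            output.push(msg.line);
        } else {
            // Not matched yet: buffer in case a later message matches.
            pending.entry(msg.call_id).or_default().push(msg.line);
        }
    }
    output
}
```

The `pending` map is the part that grows with unmatched traffic, which is why pruning terminated dialogs and never buffering OPTIONS-only Call-IDs matters for memory use.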
| Flag | Description |
|---|---|
| `-m, --method <VERB>` | Include method (request + responses via CSeq), repeatable |
| `-x, --exclude <VERB>` | Exclude method (request + responses), repeatable |
| `-c, --call-id <REGEX>` | Match Call-ID by regex |
| `-d, --direction <DIR>` | Filter by direction (recv/sent) |
| `-a, --address <REGEX>` | Match address by regex |
| `-H, --header <NAME=REGEX>` | Match header value by regex, repeatable |
| `-g, --grep <REGEX>` | Match regex against full reconstructed SIP message |
| `-b, --body-grep <REGEX>` | Match regex against message body only |
| `-D, --dialog` | Expand matches to full Call-ID conversations |
| `--all-methods` | Include OPTIONS (excluded by default) |

| Flag | Description |
|---|---|
| (default) | One-line summary per message |
| `--full` | Full SIP message with metadata header |
| `--headers` | Headers only, no body |
| `--body` | Body only (for SDP/PIDF extraction) |
| `--raw` | Raw reassembled bytes (level 2) |
| `--frames` | Raw frames (level 1) |
| `--stats` | Method and status code distribution + input coverage |
| `--unparsed` | Report unparsed input regions to stderr (combinable with any mode) |

See docs/freeswitch-setup.md for the required patches, SIP profile configuration, and log rotation setup.
```sh
cargo build --release
```

```sh
# Unit tests (no external files needed)
cargo test --lib

# Integration tests (requires production samples in samples/)
cargo test --test level1_samples -- --nocapture  # Frame parsing
cargo test --test level2_samples -- --nocapture  # TCP reassembly, Content-Length splitting
cargo test --test level3_samples -- --nocapture  # SIP parsing, multipart, method extraction
```

Integration tests validate at each parser level:
- Level 1: Frame parsing, transport detection, address format, byte_count accuracy, and parse stats coverage (max 1 partial first frame per file, zero invalid header skips)
- Level 2: TCP reassembly, UDP pass-through, interleaved multi-address reassembly, frame accounting, and parse stats delegation
- Level 3: SIP request/response parsing, Call-ID/CSeq extraction, multipart MIME splitting, method distribution, and parse stats delegation
The all_samples_consistent_frame_counts test iterates all sample files per profile and asserts parse stats on each individually.
See CLAUDE.md for test architecture details.
LGPL-2.1-or-later