Sound File Analysis
analysis by Claude
Technical reference for the sound bank and stream file format used in
Star Wars Battlefront (2004) and Star Wars Battlefront II (2005),
reverse-engineered from SoundFLMunge.exe, the VB ripper source,
and binary analysis of Xbox hot.lvl.
| Property | Value |
|---|---|
| File | tools/SoundFLMunge.exe |
| Size | 176 KB |
| Compiled | Wed Jul 13 17:29:59 2005 |
| Architecture | x86 PE32, Windows CUI (console) |
| PDB path | e:\Battlefront2\main\ToolsFL\SoundFLMunge\Release\SoundFLMunge.pdb |
| Source tree | e:\Battlefront2\main\ToolsFL\SoundFLMunge\ |
SoundFLMunge.exe ("Sound FL Munge") is the build-time tool that packages
individual WAV files into the binary .bnk (sample bank) and .str (stream)
files. Those files are then further wrapped into .lvl files by the level
packer. Its help text and disassembly reveal the complete internal format.
| Flag | Description | Platform |
|---|---|---|
| `pcm8` | 8-bit PCM | PC |
| `pcm16` | 16-bit PCM | PC |
| `imaadpcm` | IMA ADPCM (WAV format 0x0011) | PC |
| `xadpcm` | Xbox ADPCM (WAV format 0x0069) | Xbox |
| `vag` | PS2 VAG ADPCM | PS2 |
| Extension | Contents |
|---|---|
| `.bnk` | Sample bank: short, one-shot sounds |
| `.str` | Stream: long audio segments played sequentially (music, ambience) |
Both are wrapped in UCF binary chunks and stored inside .lvl files.
All game data files use the UCF binary format (the engine refers to it as "UCF", likely Universal Container File). It follows the same ID / size / body chunk pattern as RIFF, but child chunks are aligned to 4 bytes rather than RIFF's 2, and the chunk IDs carry different semantics.
Every chunk has an 8-byte header:
Offset Size Field
0 4 ID — four bytes (often ASCII, sometimes FNV-1a hash)
4 4 Size — uint32 LE, byte count of the chunk body
8 Size Body — chunk data
Children in a parent chunk are read sequentially. Each child starts at the next 4-byte-aligned offset after its predecessor:
next_child_offset = align4(current_child_offset + 8 + current_child.size)
Data chunks hold raw bytes interpreted by their ID and context.
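The header layout and alignment rule above can be sketched as a small chunk walker. This is a minimal illustration, not the engine's or the ripper's actual code; the names `align4` and `iter_chunks` are made up for this example, and offsets are assumed to be absolute file offsets.

```python
import struct


def align4(n):
    """Round n up to the next multiple of 4."""
    return (n + 3) & ~3


def iter_chunks(buf, start=0, end=None):
    """Yield (chunk_id, body_offset, size) for sibling chunks in buf[start:end].

    chunk_id is the 4 ID bytes read as a little-endian uint32.
    """
    end = len(buf) if end is None else end
    pos = start
    while pos + 8 <= end:
        chunk_id, size = struct.unpack_from("<II", buf, pos)
        body = pos + 8
        if body + size > end:
            raise ValueError("chunk at 0x%x overruns its parent" % pos)
        yield chunk_id, body, size
        # The next sibling starts at the next 4-byte-aligned offset.
        pos = align4(body + size)
```

To descend into a container chunk, call `iter_chunks(buf, body_offset, body_offset + size)` on its body.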
ucfb (root)
└── 0x5bb97f21 (wrapper; purpose not fully determined)
└── emo_ (sound module; one per .lvl sound file)
├── 0x0fb40705 (bank metadata, bank 1)
├── 0xd872e2a5 (bank audio data, bank 1)
├── 0x0fb40705 (bank metadata, bank 2) ← if multi-bank
└── 0xd872e2a5 (bank audio data, bank 2)
Chunk IDs in the tree are stored as little-endian uint32, so emo_
(ASCII 65 6d 6f 5f) is the integer 0x5f6f6d65.
| Chunk ID | ASCII | Role |
|---|---|---|
| `0x62666375` | `ucfb` | Root container |
| `0x5f6f6d65` | `emo_` | Sound module container |
| `0x0fb40705` | n/a | Bank metadata chunk |
| `0xd872e2a5` | n/a | Bank audio data chunk |
| `0x809608b6` | n/a | Block-alignment padding chunk |
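Because the IDs are listed as little-endian integers, it is easy to mix up the integer form and the on-disk byte order. The following sketch converts between the two; the helper names are hypothetical.

```python
import struct


def id_from_bytes(raw):
    """Interpret 4 on-disk ID bytes as the little-endian uint32 used in the tables."""
    (value,) = struct.unpack("<I", raw)
    return value


def id_to_bytes(value):
    """Return the 4 bytes as they appear in the file for a given integer ID."""
    return struct.pack("<I", value)
```

For example, `id_from_bytes(b"emo_")` gives `0x5f6f6d65`, and `id_to_bytes(0xd872e2a5)` gives the byte sequence `a5 e2 72 d8` to search for in a hex editor.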
The emo_ identifier is most likely the literal ASCII bytes of an internal
module name (65 6d 6f 5f) rather than a hash: unlike the hashed IDs elsewhere
in the tree, it is fully printable. Its presence immediately identifies the
file as a sound bank.
The metadata chunk precedes its paired audio data chunk within emo_.
Its body is a flat sequence of tagged field pairs — each field is 8 bytes:
[tag : uint32 LE][value : uint32 LE]
where tag is the FNV-1a hash of the field name (same hash used for sound
name lookup), and value is the field data.
The metadata chunk opens with a bank-level header (40 bytes = 5 pairs):
| Offset | Tag (LE) | Meaning | Example (hot.lvl bank 1) |
|---|---|---|---|
| +0 | `0x8d39bde6` | (dynamic; possibly file offset or sequence number) | varies |
| +8 | `0xb99d8552` | (unknown; value consistently 4, possibly bank format version) | 4 |
| +16 | `0x7816084b` | Channel count (1=mono, 2=stereo) | 2 |
| +24 | `0x182fd58d` | (unknown; value consistently 4) | 4 |
| +32 | `0x40fbdebd` | Sample/segment count in this bank | 3 |
Immediately after the bank header (starting at byte 40 of the chunk body):
| Offset | Tag (LE) | Meaning | Example |
|---|---|---|---|
| +40 | `0x23a0d95c` | Total audio data size (bytes in paired data chunk) | 0x02AE4000 |
| +48 | `0x7aaf1a1c` | Substream count (1 = normal; 2 = 4-ch ambient interleaved as 2×stereo) | 2 |
| +56 | `0x740fdb0c` | Substream interleave size (bytes per interleave block, e.g. 32768) | 0x9000 |
| +64 | `0xb969be96` | (unknown trailing header field) | varies |
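As a rough sketch of how the nine tag/value pairs of the bank header could be read, the function below looks fields up by tag rather than by fixed position. The dictionary key names are illustrative and do not come from the tool; only the tag values are taken from the tables above.

```python
import struct

# Tags with a known meaning; key names are made up for this example.
BANK_HEADER_TAGS = {
    0x7816084B: "channels",
    0x40FBDEBD: "entry_count",
    0x23A0D95C: "total_data_size",
    0x7AAF1A1C: "substream_count",
    0x740FDB0C: "substream_interleave",
}
BANK_HEADER_PAIRS = 9  # 72 bytes / 8 bytes per [tag][value] pair


def parse_bank_header(body):
    """Return the known header fields found in the first 72 bytes of the body."""
    fields = {}
    for i in range(BANK_HEADER_PAIRS):
        tag, value = struct.unpack_from("<II", body, i * 8)
        name = BANK_HEADER_TAGS.get(tag)
        if name is not None:
            fields[name] = value
    return fields
```

Matching on the tag instead of the offset means a header that lacks one of the pairs simply yields no entry for it, instead of a value read from the wrong field.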
Ambient sound encoding note: When substream count is 2, the audio data chunk contains two interleaved stereo streams representing front and back channel pairs of a 4-channel ambient sound. The deinterlacer must split these into separate stereo files. See §6.
After the bank header (72 bytes), the metadata chunk contains one 48-byte record per sound, packed sequentially (6 tag+value pairs each):
| Offset within record | Tag (LE) | Meaning |
|---|---|---|
| +0 | `0x37386ae0` | Name hash (FNV-1a of sound filename, see §5) |
| +8 | `0x2fb31c01` | Sample rate (exact Hz, e.g. 44100, 22050) |
| +16 | `0x23a0d95c` | Audio data size (bytes of raw audio for this entry) |
| +24 | `0x1d48feff` | (unknown; stream format flags?) |
| +32 | `0x809608b6` | Post-data padding (bytes from end of raw audio to next substream-interleave boundary; 0 for sample banks) |
| +40 | `0x2e789fb4` | Skip flag / reference flag |
The skip flag value 0x7D268157 means the actual audio data is stored in
a separate file (e.g. common.bnk on PC). Entries with this value should be
skipped during extraction from this file.
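A minimal sketch of reading the 48-byte per-sound records, assuming the 72-byte header and the tag order described above. The function and key names are hypothetical; the tag check is an extra safety measure added for this example.

```python
import struct

HEADER_SIZE = 72
ENTRY_SIZE = 48
SKIP_EXTERNAL = 0x7D268157  # audio lives in another file (e.g. common.bnk)
ENTRY_TAGS = (0x37386AE0, 0x2FB31C01, 0x23A0D95C, 0x1D48FEFF, 0x809608B6, 0x2E789FB4)


def parse_entries(body, count):
    """Parse `count` sound records that follow the bank header."""
    entries = []
    for n in range(count):
        words = struct.unpack_from("<12I", body, HEADER_SIZE + n * ENTRY_SIZE)
        tags, values = words[0::2], words[1::2]
        if tags != ENTRY_TAGS:
            raise ValueError("unexpected tag order in entry %d" % n)
        entries.append({
            "name_hash": values[0],
            "sample_rate": values[1],
            "data_size": values[2],
            "flags": values[3],
            "block_padding": values[4],
            "external": values[5] == SKIP_EXTERNAL,
        })
    return entries
```

An extractor would then skip every entry whose `external` field is true.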
VB ripper note: SoundRipperVB identifies entries by scanning for the byte sequence `5C D9 A0 23` (the little-endian bytes of tag `0x23a0d95c`). With `i` as the offset of that tag, it reads: name hash at `i−12`, sample rate at `i−4`, data size at `i+4`, and skip flags at `i+0x14` / `i+0x18`. This works because the 16 bytes before the data-size tag are always the name-hash and sample-rate pairs.
The audio data chunk immediately follows the metadata chunk within emo_.
Its body contains the raw encoded audio for all sounds in the bank,
stored sequentially with block alignment.
Two distinct alignment concepts exist and must not be confused:
| Concept | Purpose | Block size |
|---|---|---|
| WAV output size (audioReadSize) | Rounds raw data size up so the output WAV is 2048-aligned | 2048 bytes |
| Entry stride (offset to next entry) | Determined by the per-entry block_padding field; aligns to the substream interleave boundary | substream interleave size (e.g. 36864 bytes for Xbox streams) |
For sample banks (substream count = 1), block_padding is 0 and both
concepts collapse to the same 2048-byte alignment.
For stream banks with substream count > 1, the entry stride must use
block_padding, not the 2048-aligned WAV size:
entry_offset[0] = data_chunk.body_offset
entry_offset[n] = entry_offset[n-1] + raw_size[n-1] + block_padding[n-1]
The block_padding value (tag 0x809608b6, field at i+0x14 relative to
the SearchStart tag) is the exact number of bytes between the end of raw audio
and the next substream-interleave boundary. For hot.lvl Xbox bank 1 with
substream interleave = 36864 (0x9000):
| Entry | raw_size | block_padding | stride |
|---|---|---|---|
| hot_amb_wind | 18,931,104 | 16,992 | 18,948,096 (= 514 × 36,864) |
| hot_amb_icecave | 9,425,520 | 11,664 | 9,437,184 (= 256 × 36,864) |
| hot_amb_hangar | 16,552,944 | 35,856 | 16,588,800 (= 450 × 36,864) |
The 2048-aligned WAV output sizes (18,931,712 / 9,426,944 / 16,553,984) are only used as the byte count to read and write into the output WAV file. They do NOT determine where the next entry starts.
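The difference between the two alignment concepts can be checked with a short worked example using the hot.lvl bank 1 figures. This is only a sketch; `entry_offsets` and `align_up` are names invented here, and the data start of 0x800 is the file offset quoted for this particular file.

```python
def align_up(n, block):
    """Round n up to the next multiple of block."""
    return -(-n // block) * block


def entry_offsets(data_start, entries):
    """entries is a list of (raw_size, block_padding); returns each entry's offset."""
    offsets = []
    pos = data_start
    for raw_size, block_padding in entries:
        offsets.append(pos)
        pos += raw_size + block_padding  # stride, NOT the 2048-aligned size
    return offsets


# (raw_size, block_padding) for hot_amb_wind, hot_amb_icecave, hot_amb_hangar
HOT_BANK1 = [(18931104, 16992), (9425520, 11664), (16552944, 35856)]

offsets = entry_offsets(0x800, HOT_BANK1)
read_sizes = [align_up(raw, 2048) for raw, _ in HOT_BANK1]
```

Here `offsets` comes out as 0x800, 0x1212800 and 0x1B12800, while `read_sizes` holds the 2048-aligned byte counts used only for the WAV data size.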
VB ripper bug: SoundRipperVB uses the 2048-aligned size as the stride, placing hot_amb_icecave at 0x120E800 and hot_amb_hangar at 0x1B0C000. Both land in the middle of an interleave block where ADPCM step indices are invalid (e.g. L=202, R=131), making those files silent or undecodable. The correct offsets (verified by ADPCM block header inspection) are 0x1212800 and 0x1B12800 respectively.
Bank 1 is a stream bank: substream count = 2, interleave = 36864 bytes. The entire data chunk is 1,220 consecutive 36864-byte interleave blocks (all containing valid ADPCM data). Entry offsets are interleave-aligned:
File offset 0x000800 — Data bank 1 starts (3 stereo Xbox ADPCM streams)
hot_amb_wind: offset 0x000800 raw 18,931,104 B readSize 18,931,712 B
hot_amb_icecave: offset 0x1212800 raw 9,425,520 B readSize 9,426,944 B
hot_amb_hangar: offset 0x1B12800 raw 16,552,944 B readSize 16,553,984 B
File offset 0x2AE4800 — Metadata bank 2 starts
File offset 0x2AEB000 — Data bank 2 starts (512 mono sample entries, no padding)
audioReadSize (2048-aligned) is written into the WAV data chunk size field.
The file offset of each entry is determined by the interleave-aligned stride.
Sound file names are stored as FNV-1a hashes. The algorithm is:
int hashString(String input) {
const int FNV_prime = 16777619;
const int offset = 2166136261; // FNV offset basis
int hash = offset;
for (int i = 0; i < input.length; i++) {
int c = input.codeUnitAt(i) | 0x20; // force lowercase
hash ^= c;
hash = (hash * FNV_prime) & 0xFFFFFFFF; // 32-bit overflow
}
return hash;
}

The dictionary (source_code/SoundRipperVB/dictionary.txt) contains
thousands of known sound names. When a hash cannot be resolved, fall back
to the hex string (e.g. 0x7a849cde).
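For tools written in Python, the same hash and the hex-string fallback might look like the sketch below. It assumes sound names are plain ASCII; `hash_name` and `label_for` are hypothetical helper names, and `dictionary` is assumed to be a dict mapping hash to name built from the dictionary file.

```python
FNV_PRIME = 16777619
FNV_OFFSET_BASIS = 2166136261


def hash_name(name):
    """32-bit FNV-1a over the name, with 0x20 OR'd into every byte.

    The OR lowercases ASCII letters; it also changes a few non-letters
    (for example '_' becomes 0x7F), which is part of the hash as described.
    """
    h = FNV_OFFSET_BASIS
    for byte in name.encode("ascii"):
        h ^= byte | 0x20
        h = (h * FNV_PRIME) & 0xFFFFFFFF
    return h


def label_for(h, dictionary):
    """Return the known name for a hash, or its hex string if unresolved."""
    return dictionary.get(h, "0x%08x" % h)
```

Building the dictionary is then a matter of hashing each known name once: `{hash_name(n): n for n in names}`.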
Ambient sound files (names containing _amb_) encode four audio channels
as two interleaved stereo streams:
- Substream 0: front-left / front-right channels
- Substream 1: back-left / back-right channels
The munge tool writes them with -substream 2 <interleave_size>. The bank
metadata reflects this: substream count = 2.
In the data chunk the streams are interleaved in blocks of substream_interleave_size bytes:
[substream0 block 0 (N bytes)]
[substream1 block 0 (N bytes)]
[substream0 block 1 (N bytes)]
[substream1 block 1 (N bytes)]
...
where N = substream interleave size from the metadata (tag 0x740fdb0c).
The zig deinterlacer (source_code/audio-deinterlacer/) uses
interlace_sample_count = 33280 as the interleave block size in samples.
To extract correctly: read the full audio blob, then split into two stereo files by deinterleaving at the block boundary.
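The block layout above suggests a simple round-robin split. The sketch below assumes the block size is the byte value from the metadata (tag 0x740fdb0c) and that the blob starts on a substream 0 block; it is not a port of the zig deinterlacer, which counts in samples.

```python
def deinterleave(blob, block_size, substreams=2):
    """Split an interleaved blob into one byte string per substream."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    streams = [bytearray() for _ in range(substreams)]
    for index, start in enumerate(range(0, len(blob), block_size)):
        # Block 0 goes to substream 0, block 1 to substream 1, and so on.
        streams[index % substreams] += blob[start:start + block_size]
    return [bytes(s) for s in streams]
```

For a 4-channel ambient sound, `front, back = deinterleave(blob, 0x9000)` would give the two stereo streams, each of which still needs its own WAV header.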
"RIFF" [file_size - 8 : u32]
"WAVE"
"fmt " [chunk_size=20 : u32]
format_tag = 0x0069 (105)
channels = 1 or 2
sample_rate = [from metadata : u32]
byte_rate = sample_rate / 2
block_align = 36 (mono) or 72 (stereo)
bits_per_sample = 4
extra_size = 2
extra_data = 0x0040
"data" [data_size : u32]
[raw audio bytes]
Identical to Xbox ADPCM header except:
- format_tag = 0x0011 (17)
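Both ADPCM headers can be produced by one helper that takes the format tag as a parameter. This is a sketch that follows the field values listed above literally (including byte_rate = sample_rate / 2); `adpcm_wav_header` is a made-up name.

```python
import struct


def adpcm_wav_header(channels, sample_rate, data_size, format_tag=0x0069):
    """Build the 48-byte RIFF/WAVE header that precedes the raw ADPCM bytes."""
    fmt = struct.pack(
        "<HHIIHHHH",
        format_tag,        # 0x0069 Xbox ADPCM, 0x0011 IMA ADPCM
        channels,          # 1 or 2
        sample_rate,       # from the per-entry metadata
        sample_rate // 2,  # byte_rate as listed above
        36 * channels,     # block_align: 36 mono, 72 stereo
        4,                 # bits_per_sample
        2,                 # extra_size
        0x0040,            # extra_data
    )
    riff_size = 4 + (8 + len(fmt)) + (8 + data_size)  # file size minus 8
    return (b"RIFF" + struct.pack("<I", riff_size) + b"WAVE"
            + b"fmt " + struct.pack("<I", len(fmt)) + fmt
            + b"data" + struct.pack("<I", data_size))
```

Writing `adpcm_wav_header(...) + audio_bytes` to disk yields a file that a WAV-aware decoder can open, provided `data_size` equals the number of audio bytes that follow.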
"RIFF" [file_size - 8 : u32]
"WAVE"
"fmt " [chunk_size=16 : u32]
format_tag = 0x0001
channels = 1 or 2
sample_rate = [from metadata : u32]
byte_rate = sample_rate * channels * 2
block_align = channels * 2 (2 for mono, 4 for stereo)
bits_per_sample = 16
"data" [data_size : u32]
[raw audio bytes]
VAG is the native PS2 ADPCM format. Header is 48 bytes:
Offset Size Field
0 4 Magic = "VAGp" (0x56414770)
4 4 Version = 0x00000002 (big-endian)
8 4 Reserved = 0
12 4 Data size (big-endian)
16 4 Sample rate (big-endian)
20 8 Reserved (volume / pitch / ADSR fields)
28 4 Reserved
32 16 Track name (ASCII, zero-padded)
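A sketch of writing this 48-byte header with big-endian fields. Bytes 20 to 31 are treated as one zero-filled reserved region, and the version default simply repeats the value in the layout above; `vag_header` is a hypothetical name.

```python
import struct


def vag_header(data_size, sample_rate, name, version=0x00000002):
    """Build a 48-byte VAGp header; all integer fields are big-endian."""
    return struct.pack(
        ">4sIIII12s16s",
        b"VAGp",
        version,
        0,                          # reserved
        data_size,
        sample_rate,
        b"",                        # reserved bytes 20..31, zero-filled
        name.encode("ascii")[:16],  # track name, zero-padded to 16 bytes
    )
```

The `s` format pads short byte strings with zero bytes, which gives the zero-padded name field for free.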
| Platform | Stream encoding | Sample bank encoding | Block size | Bank ext |
|---|---|---|---|---|
| Xbox | Xbox ADPCM (`0x0069`) | Xbox ADPCM (`0x0069`) | 2048 | `.bnk` / `.str` |
| PC | Xbox ADPCM (`0x0069`) | PCM16 (`0x0001`) | 2048 | `.bnk` / `.str` |
| PS2 | VAG ADPCM | VAG ADPCM | 16384 | `.bnk` / `.str` |
| PSP | ATRAC3plus (WAVE_FORMAT_EXTENSIBLE, GUID {E923AABF-...}) | VAG ADPCM | n/a | .lvl |
PC stream format note: PC stream banks (.lvl / .str) use the same
Xbox ADPCM encoding (WAV format 0x0069, adpcm_ima_wav) as the Xbox
version — identical encoding, identical file sizes for the same sound.
Only PC sample banks (.bnk, e.g. common.bnk) use raw PCM16.
Discovery: This was verified by probing header bytes and running ffmpeg: Xbox ADPCM headers on PC stream data produce correct ~7-minute decodes (exit 0, 352 kb/s); PCM16 headers produce a spurious 107-second decode at 1411 kb/s; all IMA ADPCM block sizes (512, 1024, 2048) fail with "Invalid data found".
PC file layout: PC .lvl files contain only stream banks. All short
one-shot sounds are stored in a shared common.bnk file (1567+ entries on
BF2). The per-level .lvl files only hold the level-specific ambient streams.
The skip flag 0x7D268157 is used in PC files that reference common.bnk.
PSP audio encoding: PSP uses two distinct formats depending on bank type:
- Stream banks (long audio: music, ambience, VO): ATRAC3plus wrapped in complete `WAVE_FORMAT_EXTENSIBLE` RIFF/WAV containers. Each entry's audio data is a self-contained `.wav` file, so no header construction is needed for extraction. SubFormat GUID: `{E923AABF-CB58-4471-A119-FFFA01E4CE62}`. Block align = 744 (ATRAC3plus ~132 kbps frames). Confirmed decodable by ffmpeg (codec_name=atrac3p).
- Sample banks (short one-shots): Raw VAG ADPCM 16-byte blocks, identical to PS2. Always mono. The bank header channel count field reads garbage for PSP sample banks (field absent or at a different offset); the channel count is assumed to be 1.
PSP sample rate downsampling: Pandemic aggressively reduced sample rates for the PSP version to fit within UMD/RAM constraints. The sample rate stored in the per-entry metadata is the actual playback rate — parsing is correct even when the value looks unusual. Observed rates across PSP BF2 sample banks:
| Rate (Hz) | Typical use |
|---|---|
| 3016 | Extreme cases (e.g. looping background engine sounds) |
| 8000–8025 | Command post tones, UI, droid chatter |
| 11025–11057 | Most weapon and creature effects |
| 11808 | Some sounds |
| 12012 | Occasional outliers |
| 22050 | Stream banks (standard) |
| 44100 | Stream banks (high quality) |
For comparison, the same sounds on PS2 are typically 22050 Hz. PSP sample rates are often half (11025) or much lower (3016–8025) than their PS2 equivalents, trading audio fidelity for space. Audio decoded at these rates will sound lo-fi but is correct — it reflects how the game shipped.
The UCF container, tagged-field bank header, and audio encoding are identical between Battlefront (BF1, 2004) and Battlefront II (BF2, 2005). The same parser handles both without modification. One structural difference was found during analysis of BF1 files across all three platforms:
The substream interleave tag (0x740fdb0c, bank data descriptor offset +56)
is not always present in BF1 bank headers. In BF2, this field is present
in every stream bank. In BF1, only some banks include it.
Observable symptom: Reading offset bankI + 20 (the value position of the
substream interleave tag, relative to the first SearchStart) returns 0x4
instead of the expected 0x9000 (Xbox) or 0x4000 (PS2). The value 0x4 is
not a real interleave size — it is the byte count of the preceding field's
value (i.e., the parser is reading the wrong field).
Affected banks (probe of BF1 Xbox test files):
| File | Banks with `0x4` | Banks with `0x9000` |
|---|---|---|
| bes.lvl | 1–4 (stream banks) | 5–6 |
| cw.lvl | 1–2 (stream banks) | none |
| gcw.lvl | 1–2 (stream banks) | none |
| shell.lvl | 1–2 (stream banks) | none |
| end.lvl (PC) | 1 (stream bank) | 2–3 |
| hot.lvl (PC) | 1 (stream bank) | 2 |
BF1 PS2 files are unaffected: KAM.LVL has the correct 0x4000 interleave, and
CW.LVL / GCW.LVL are mono streams where the interleave value is irrelevant.
Why extraction still works: The substreamInterleave field is only used by
the PS2 VAG stereo decoder to deinterleave L/R channel blocks. Xbox and PC
stream extraction (Xbox ADPCM) does not use this field at all — audio is read
sequentially using audioOffset and audioReadSize. All BF1 files pass ffmpeg
validation regardless.
Root cause hypothesis: The substream interleave tagged pair was added to the
bank header format during BF2 development. BF1 banks written with an older
version of SoundFLMunge omit it, leaving the field position occupied by
whatever follows in the header (the value 4 from an adjacent field). BF2
standardised on always writing this field, explaining why some BF1 files
(bes.lvl banks 5–6, end.lvl banks 2–3) already include it — those banks
were likely built with a newer toolchain version.
Parser note: The current parser reads bankI + 20 unconditionally. A
defensive implementation would verify the tag ID at bankI + 16 equals
0x740fdb0c before trusting the value at bankI + 20, and default to 0
(or the platform default) if the tag is absent.
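The defensive check described in the parser note could look like the sketch below. `bank_i` stands for the offset of the first SearchStart tag (0x23a0d95c) within the metadata body; the function name and the `default` parameter are assumptions made for this example.

```python
import struct

TAG_SUBSTREAM_INTERLEAVE = 0x740FDB0C


def read_interleave(body, bank_i, default=0):
    """Return the interleave size only if the expected tag precedes it."""
    tag, value = struct.unpack_from("<II", body, bank_i + 16)
    if tag != TAG_SUBSTREAM_INTERLEAVE:
        # Tag absent (seen in some BF1 banks): do not trust the value slot.
        return default
    return value
```

A caller could pass the platform's usual interleave (for example 0x9000 on Xbox or 0x4000 on PS2) as `default` when a non-zero fallback is preferred.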
The .sfx and .stm source files consumed by the tool follow this format:
# Comment
path\to\sound.wav [optional_sample_id] [-resample xbox 22050] [-alias ps2 other_id]
#ifplatform xbox
path\to\xbox_only.wav ...
#endifplatform xbox
Key directives:
- `-resample <platform> <hz>`: target sample rate for the platform
- `-alias <platform> <id>`: use a different sample ID on that platform
- `-substream 2 32768`: encode as 2-substream interleaved (4-channel ambient)
soundflmunge.exe
-banklistinput <file.txt> [file2.txt ...] Input bank list file(s)
-bankoutput <out.bnk|out.str> Output file path
-platform pc|xbox|ps2 Target platform
-stream Output as .str stream
-substream <count> <interleave_bytes> Multi-substream interleaving
-resample audioframe|substream Resample to frame-align
-sampleformat pcm8|pcm16|vag|xadpcm|imaadpcm Override output format
-compact Deduplicate samples
-template Write header-only stub bank
-stub <file.wav> Replace all samples with stub
-relativepath Prefix bank list paths
-nowarning Suppress duplicate warnings
-checkid [noabort] Check for duplicate IDs
-verbose Detailed output with hashed IDs
-leavetempfiles Keep intermediate converted files
| Value (LE) | Bytes | Role |
|---|---|---|
| `0x62666375` | `ucfb` | Root UCF chunk |
| `0x5f6f6d65` | `emo_` | Sound module chunk |
| `0x0fb40705` | n/a | Bank metadata chunk |
| `0xd872e2a5` | `\xa5\xe2\x72\xd8` | Bank audio data chunk |
| `0x809608b6` | n/a | Block-alignment padding chunk |
| `0x23a0d95c` | `\x5c\xd9\xa0\x23` | Tag: audio data size (VB "SearchStart") |
| `0x40fbdebd` | n/a | Tag: sample count in bank |
| `0x7816084b` | n/a | Tag: channel count |
| `0x37386ae0` | n/a | Tag: name hash |
| `0x2fb31c01` | n/a | Tag: sample rate |
| `0x7aaf1a1c` | n/a | Tag: substream count |
| `0x740fdb0c` | n/a | Tag: substream interleave size |
| `0x7D268157` | n/a | Skip-flag value: audio in external file |
| `0x5bb97f21` | n/a | Wrapper chunk inside ucfb (purpose TBD) |