Skip to content

fix(importer): cap chord name read at field width in GP3-5 binary parser#2669

Open
kaizenman wants to merge 1 commit intoCoderLine:developfrom
kaizenman:fix/gp5-chord-name-overflow
Open

fix(importer): cap chord name read at field width in GP3-5 binary parser#2669
kaizenman wants to merge 1 commit intoCoderLine:developfrom
kaizenman:fix/gp5-chord-name-overflow

Conversation

@kaizenman
Copy link
Copy Markdown

@kaizenman kaizenman commented Apr 24, 2026

Issues

Fixes #2676

Proposed changes

GpBinaryHelpers.gpReadStringByteLength in Gp3To5Importer interprets the length-hint byte as the actual byte count to read:

const stringLength: number = data.readByte();
const s: string = GpBinaryHelpers.gpReadString(data, stringLength, encoding);
if (stringLength < length) {
    data.skip(length - stringLength);
}

Real-world Guitar Pro 5 files contain chord-name fields where the length hint exceeds the field width — Guitar Pro's editor lets users type chord-comment strings longer than the 22-byte on-disk field, and stores the original length byte verbatim while truncating the data. PyGuitarPro and TuxGuitar (the two other open-source GP parsers) both treat this field as fixed width, decoding only min(stringLength, length) of the always-length-bytes payload.

When AlphaTab encounters such a field (for example a chord with stringLength=32 in a 21-byte field), it advances the stream by an extra stringLength - length bytes, mis-aligning the rest of the beat parse. The misalignment lands in readBend, which reads a 4-byte pointCount from garbage bytes (observed value: 84 097 000), enters a tight loop reading 9 bytes per iteration, and burns the V8 heap. In Node this OOM-crashes after a few seconds; in the browser it deadlocks the tab at 100% CPU.

This is a practical DoS vector against any page running AlphaTab on user-supplied files.

Reproducer

test-data/guitarpro5/chord-name-overflow.gp5 — a published Guitar Pro 5.10 file with a chord-name length hint of 32 in a 22-byte field. Pre-fix: readBend enters the runaway loop and the worker hangs. Post-fix: parses cleanly in ~17 ms.

Fix

Read a fixed-width field (length bytes) and decode up to min(stringLength, length) bytes — matching PyGuitarPro's readByteSizeString(count) and TuxGuitar's readStringByte(size):

const stringLength: number = data.readByte();
const fieldBytes: Uint8Array = new Uint8Array(length);
data.read(fieldBytes, 0, length);
const effectiveLength: number = Math.max(0, Math.min(stringLength, length));
const decoded: Uint8Array = new Uint8Array(effectiveLength);
decoded.set(fieldBytes.subarray(0, effectiveLength));
return IOHelper.toString(decoded, encoding);

The buffer is explicitly sized down before passing to IOHelper.toString because TextDecoder.decode(arr.buffer) ignores Uint8Array byte offsets — without the copy the decoded string includes padding NULs from the wider field buffer.

Tests

Two regression tests in Gp5ImporterTest:

  1. chord-name-overflow — end-to-end: loads the fixture, asserts tracks.length === 1, masterBars.length === 193. Without the fix this hangs the test runner; with the fix it parses in <50 ms.
  2. gpReadStringByteLength caps consumption at field width — synthetic unit test: feeds an oversized length hint (32) to a 21-byte field, asserts the function consumes exactly length + 1 bytes and that the next byte (a sentinel) survives unread.

All 41 GP5 importer tests pass (full GP-format importer suite: 174 — 16 GP3 + 21 GP4 + 41 GP5 + 42 GP7 + 26 GP8 + 28 GPX).

Validation

The fixture's beat-by-beat byte position matches PyGuitarPro after the fix: 1217/1217 beats land at identical cursor positions in both parsers. TuxGuitar's readString(size, len) follows the same fixed-width semantic.

Note

One of three independent fixes in the same area. The others address distinct root causes:

Checklist

  • I consent that this change becomes part of alphaTab under it's current or any future open source license
  • Changes are implemented
  • New tests were added

Further details

  • This is a breaking change
  • This change will require update of the documentation/website

GpBinaryHelpers.gpReadStringByteLength interpreted the length-hint byte
as the actual read count, advancing the stream by stringLength + 1
bytes. The field is fixed-width (length + 1 bytes) and the hint is just
how many of those bytes are the active string — Guitar Pro itself,
PyGuitarPro and TuxGuitar all read this way. When stringLength exceeds
the field width (real-world GP5 chord names where Guitar Pro stored a
hint of 32 in a 22-byte field) AlphaTab over-consumed the stream,
mis-aligned the parse, and eventually attempted an unbounded readBend
loop on garbage bytes — practical DoS against any page running AlphaTab.

Read a fixed-width field (length + 1 bytes) and decode up to
min(stringLength, length) bytes, matching the reference parsers.
The buffer is also explicitly sized down before passing to
IOHelper.toString since TextDecoder.decode(arr.buffer) ignores
TypedArray byte offsets.

The fixture test-data/guitarpro5/chord-name-overflow.gp5 has a chord
diagram with a 32-byte name hint in a 22-byte field that previously
triggered the runaway loop. Two regression tests:
  - end-to-end: file parses cleanly with expected track/measure counts
  - synthetic: gpReadStringByteLength stops at field width, leaving
    the stream pointer at length + 1 bytes regardless of the hint
@Danielku15
Copy link
Copy Markdown
Member

Thanks for the fixes, can you open counterpart bug reports following the template so we have proper tracking of these items? Also: the issue and PR templates are not optional, please be sure to update your description accordingly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

gpReadStringByteLength misaligns parser on oversized chord-name length hint (DoS)

2 participants