fix(importer): cap chord name read at field width in GP3-5 binary parser#2669
Open
kaizenman wants to merge 1 commit intoCoderLine:developfrom
Open
fix(importer): cap chord name read at field width in GP3-5 binary parser#2669kaizenman wants to merge 1 commit intoCoderLine:developfrom
kaizenman wants to merge 1 commit intoCoderLine:developfrom
Conversation
GpBinaryHelpers.gpReadStringByteLength interpreted the length-hint byte
as the actual read count, advancing the stream by stringLength + 1
bytes. The field is fixed-width (length + 1 bytes) and the hint is just
how many of those bytes are the active string — Guitar Pro itself,
PyGuitarPro and TuxGuitar all read this way. When stringLength exceeds
the field width (real-world GP5 chord names where Guitar Pro stored a
hint of 32 in a 22-byte field) AlphaTab over-consumed the stream,
mis-aligned the parse, and eventually attempted an unbounded readBend
loop on garbage bytes — practical DoS against any page running AlphaTab.
Read a fixed-width field (length + 1 bytes) and decode up to
min(stringLength, length) bytes, matching the reference parsers.
The buffer is also explicitly sized down before passing to
IOHelper.toString since TextDecoder.decode(arr.buffer) ignores
TypedArray byte offsets.
The fixture test-data/guitarpro5/chord-name-overflow.gp5 has a chord
diagram with a 32-byte name hint in a 22-byte field that previously
triggered the runaway loop. Two regression tests:
- end-to-end: file parses cleanly with expected track/measure counts
- synthetic: gpReadStringByteLength stops at field width, leaving
the stream pointer at length + 1 bytes regardless of the hint
5 tasks
Member
|
Thanks for the fixes, can you open counterpart bug reports following the template so we have proper tracking of these items? Also: the issue and PR templates are not optional, please be sure to update your description accordingly. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Issues
Fixes #2676
Proposed changes
GpBinaryHelpers.gpReadStringByteLengthinGp3To5Importerinterprets the length-hint byte as the actual byte count to read:Real-world Guitar Pro 5 files contain chord-name fields where the length hint exceeds the field width — Guitar Pro's editor lets users type chord-comment strings longer than the 22-byte on-disk field, and stores the original length byte verbatim while truncating the data. PyGuitarPro and TuxGuitar (the two other open-source GP parsers) both treat this field as fixed width, decoding only
min(stringLength, length)of the always-length-bytes payload.When AlphaTab encounters such a field (for example a chord with
stringLength=32in a 21-byte field), it advances the stream by an extrastringLength - lengthbytes, mis-aligning the rest of the beat parse. The misalignment lands inreadBend, which reads a 4-bytepointCountfrom garbage bytes (observed value: 84 097 000), enters a tight loop reading 9 bytes per iteration, and burns the V8 heap. In Node this OOM-crashes after a few seconds; in the browser it deadlocks the tab at 100% CPU.This is a practical DoS vector against any page running AlphaTab on user-supplied files.
Reproducer
test-data/guitarpro5/chord-name-overflow.gp5— a published Guitar Pro 5.10 file with a chord-name length hint of 32 in a 22-byte field. Pre-fix:readBendenters the runaway loop and the worker hangs. Post-fix: parses cleanly in ~17 ms.Fix
Read a fixed-width field (
lengthbytes) and decode up tomin(stringLength, length)bytes — matching PyGuitarPro'sreadByteSizeString(count)and TuxGuitar'sreadStringByte(size):The buffer is explicitly sized down before passing to
IOHelper.toStringbecauseTextDecoder.decode(arr.buffer)ignoresUint8Arraybyte offsets — without the copy the decoded string includes padding NULs from the wider field buffer.Tests
Two regression tests in
Gp5ImporterTest:chord-name-overflow— end-to-end: loads the fixture, assertstracks.length === 1,masterBars.length === 193. Without the fix this hangs the test runner; with the fix it parses in <50 ms.gpReadStringByteLength caps consumption at field width— synthetic unit test: feeds an oversized length hint (32) to a 21-byte field, asserts the function consumes exactlylength + 1bytes and that the next byte (a sentinel) survives unread.All 41 GP5 importer tests pass (full GP-format importer suite: 174 — 16 GP3 + 21 GP4 + 41 GP5 + 42 GP7 + 26 GP8 + 28 GPX).
Validation
The fixture's beat-by-beat byte position matches PyGuitarPro after the fix: 1217/1217 beats land at identical cursor positions in both parsers. TuxGuitar's
readString(size, len)follows the same fixed-width semantic.Note
One of three independent fixes in the same area. The others address distinct root causes:
Checklist
Further details