Check for overflow of next_code when adding burst_size #30
break;
}
} else {
// next_code overflowed
Hm. That's an odd bug. This overflow would imply there's a code to be assigned that is larger than max_code. In particular, we will assign one later on. Since codes are at most 12 bits wide, not 16, reaching the overflow should be impossible unless codes were incorrectly assumed before that point.
The issue I was seeing was that this code was being called in a loop, and next_code just eventually stepped its way past 64K (it started at 18 using the provided repro).
[/.../weezl/src/decode.rs:765] &self.next_code = 18
[/.../weezl/src/decode.rs:765] &burst_size = 0
...
[/.../weezl/src/decode.rs:765] &self.next_code = 65535
[/.../weezl/src/decode.rs:765] &burst_size = 0
[/.../weezl/src/decode.rs:765] &self.next_code = 65535
[/.../weezl/src/decode.rs:765] &burst_size = 1
[/.../weezl/src/decode.rs:765] &self.next_code = 65535
[/.../weezl/src/decode.rs:765] &burst_size = 0
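The trace above ends with next_code pinned at 65535 and a burst_size of 1, which is the step where a plain 16-bit addition would overflow. A minimal sketch of a checked advance follows; `advance_next_code` is a hypothetical helper written for illustration and is not taken from weezl, and the `u16`/`usize` types are assumptions based on the values in the trace.

```rust
/// Hypothetical helper (not part of weezl): advance a 16-bit code counter
/// by a burst size, reporting overflow instead of wrapping or panicking.
fn advance_next_code(next_code: u16, burst_size: usize) -> Option<u16> {
    // Do the addition in usize so the sum itself cannot silently wrap at 16 bits.
    let sum = usize::from(next_code).checked_add(burst_size)?;
    if sum <= usize::from(u16::MAX) {
        Some(sum as u16)
    } else {
        // The counter would step past 64K: the caller should treat the
        // stream as invalid rather than continue.
        None
    }
}

fn main() {
    // Values taken from the trace: the counter starts at 18 ...
    assert_eq!(advance_next_code(18, 0), Some(18));
    // ... sits at 65535 with an empty burst ...
    assert_eq!(advance_next_code(65535, 0), Some(65535));
    // ... and overflows on the next non-empty burst.
    assert_eq!(advance_next_code(65535, 1), None);
}
```

Returning `Option` leaves it to the caller to decide how the overflow is surfaced, for example as a decoding error.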
In the non-optimized code path the increment itself is gated behind !self.table.is_full() (https://github.com/image-rs/lzw/blob/d55c610f1bff4ef0ed2f72af2c823b4a244d51a3/src/decode.rs#L893-L904), aka self.inner.len() < MAX_ENTRIES where MAX_ENTRIES = 1 << 12.
I'm decently sure that if we add a debug assertion to Table::derive, which asserts this invariant, then we will come across a violation much sooner.
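To illustrate the invariant being discussed, here is a toy model of a code table whose insertions are gated behind an is_full check, with a debug assertion inside derive. `ToyTable` and `try_derive` are hypothetical stand-ins invented for this sketch; only the names `is_full`, `derive`, `inner` and `MAX_ENTRIES = 1 << 12` mirror the discussion, and the real `Table` stores links rather than bytes.

```rust
const MAX_ENTRIES: usize = 1 << 12; // 4096 entries for 12-bit codes

/// Toy stand-in for the decoder's code table (not weezl's real `Table`).
struct ToyTable {
    inner: Vec<u8>,
}

impl ToyTable {
    fn new() -> Self {
        ToyTable { inner: Vec::new() }
    }

    fn is_full(&self) -> bool {
        self.inner.len() >= MAX_ENTRIES
    }

    fn derive(&mut self, byte: u8) {
        // The invariant: an entry is only ever derived while there is room.
        debug_assert!(
            self.inner.len() < MAX_ENTRIES,
            "table already holds {} entries",
            self.inner.len()
        );
        self.inner.push(byte);
    }
}

/// Gated insertion: the counter only advances when an entry was really added.
fn try_derive(table: &mut ToyTable, next_code: &mut u16, byte: u8) -> bool {
    if table.is_full() {
        return false;
    }
    table.derive(byte);
    *next_code += 1;
    true
}

fn main() {
    let mut table = ToyTable::new();
    let mut next_code: u16 = 0;
    let mut accepted = 0usize;
    // Offer far more entries than the table can hold.
    for i in 0..10_000u32 {
        if try_derive(&mut table, &mut next_code, (i % 256) as u8) {
            accepted += 1;
        }
    }
    // With the gate in place the counter stops at MAX_ENTRIES, far below 64K.
    assert_eq!(accepted, MAX_ENTRIES);
    assert_eq!(usize::from(next_code), MAX_ENTRIES);
}
```

In this model the counter can never drift away from the table size, so an assertion failure in derive would point directly at a code path that skips the gate.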
I've thrown together a patch to put them in:
Patch file
From ac575ce26fb081883092536e0fcbf00c2af59cc2 Mon Sep 17 00:00:00 2001
From: Andreas Molzer <andreas.molzer@gmx.de>
Date: Tue, 19 Apr 2022 21:29:13 +0200
Subject: [PATCH] Add debug assertions on internal invariants
---
src/decode.rs | 12 +++++++++++-
1 file changed, 11 insertions(+), 1 deletion(-)
diff --git a/src/decode.rs b/src/decode.rs
index 283e31f..49f3bfd 100644
--- a/src/decode.rs
+++ b/src/decode.rs
@@ -711,7 +711,7 @@ impl<C: CodeBuffer> Stateful for DecodeState<C> {
Some(tup) => {
status = Ok(LzwStatus::Ok);
code_link = Some(tup)
- },
+ }
};
// Track an empty `burst` (see below) means we made no progress.
@@ -827,6 +827,7 @@ impl<C: CodeBuffer> Stateful for DecodeState<C> {
// the case of requiring an allocation (which can't occur in practice).
let new_link = self.table.derive(&link, cha, code);
self.next_code += 1;
+ debug_assert!(self.next_code as usize <= MAX_ENTRIES);
code = burst;
link = new_link;
}
@@ -918,6 +919,8 @@ impl<C: CodeBuffer> Stateful for DecodeState<C> {
}
self.next_code += 1;
+ debug_assert!(self.next_code as usize <= MAX_ENTRIES);
+
new_link = link;
} else {
// It's actually quite likely that the next code will be a reset but just in case.
@@ -1203,6 +1206,13 @@ impl Table {
}
fn derive(&mut self, from: &Link, byte: u8, prev: Code) -> Link {
+ debug_assert!(
+ self.inner.len() < MAX_ENTRIES,
+ "Invalid code would be created {:?} {} {:?}",
+ from.prev,
+ byte,
+ prev
+ );
let link = from.derive(byte, prev);
let depth = self.depths[usize::from(prev)] + 1;
self.inner.push(link.clone());
--
2.35.1

The trace of running the decoder with those assertions suggests that the comparison itself relies on an incorrect assumption. Since it uses [...] But when that is the exact moment that we enter a burst, as is the case with the provided file, then it will advance [...]
I'll measure whether that leads to too much of a performance loss due to executing less of the simple code reconstruction.
Deliberation: this fixes the immediate issue. None of the incorrectly created codes can be, or are, actually used during decoding. While it's not optimal that this waste is created during decoding, it's also not critically buggy.
So, I'll merge the fix and treat the underlying cause as a separate issue (opening another issue for it).
This addresses a problem in the GIF parser detected during fuzzing of a crate that utilizes image-rs. The issue can be reproduced using this short test program and the attached file (too large to include in the source -- unzip first): 5220731288420352.gif.zip
After the fix, loading the malformed image produces a usable Err result: