
Check for overflow of next_code when adding burst_size #30

Merged

Conversation

@shutton shutton (Contributor) commented Apr 19, 2022

This addresses a problem in the GIF parser detected during fuzzing of a crate that utilizes image-rs. The issue can be reproduced using this short test program and the attached file (too large to include in the source -- unzip first).

const BAD_GIF: &[u8] = include_bytes!(concat!(
    env!("CARGO_MANIFEST_DIR"),
    "/5220731288420352.gif"
));

fn main() {
    if let Err(e) = image::load_from_memory(BAD_GIF) {
        eprintln!("failed to load: {e}");
    }
}

5220731288420352.gif.zip

After the fix, loading the malformed image produces a usable Err result:

Format error decoding Gif: unexpected EOF
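The shape of such an overflow guard can be sketched with `checked_add`; the `advance` function below is a hypothetical, standalone simplification (the names `next_code` and `burst_size` mirror the fields discussed in this PR, but this is not the actual weezl code):

```rust
// Hypothetical sketch of guarding a 16-bit code counter against overflow.
// `next_code` and `burst_size` mirror the field names discussed in this PR,
// but this is a standalone illustration, not the decoder's real internals.
fn advance(next_code: u16, burst_size: u16) -> Result<u16, &'static str> {
    // checked_add returns None if the sum would wrap past u16::MAX,
    // letting the caller report a decode error instead of wrapping
    // silently (release) or panicking (debug).
    next_code
        .checked_add(burst_size)
        .ok_or("corrupt stream: next_code overflowed")
}

fn main() {
    assert_eq!(advance(18, 1), Ok(19));
    assert!(advance(u16::MAX, 1).is_err());
}
```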

Review comment on the `else` branch marked `// next_code overflowed`:

Member:

Hm. That's an odd bug. This overflow would imply there's a code to be assigned larger than max_code; in particular, we will assign one later on. Since codes are at most 12 bits, not 16, reaching the overflow should be impossible without incorrectly assumed codes before that point.

@shutton shutton (Contributor, Author) Apr 19, 2022
The issue I was seeing was that this code was being called in a loop, and next_code just eventually stepped its way past 64K (it started at 18 using the provided repro).

@shutton shutton (Contributor, Author) Apr 19, 2022

[/.../weezl/src/decode.rs:765] &self.next_code = 18
[/.../weezl/src/decode.rs:765] &burst_size = 0
...
[/.../weezl/src/decode.rs:765] &self.next_code = 65535
[/.../weezl/src/decode.rs:765] &burst_size = 0
[/.../weezl/src/decode.rs:765] &self.next_code = 65535
[/.../weezl/src/decode.rs:765] &burst_size = 1
[/.../weezl/src/decode.rs:765] &self.next_code = 65535
[/.../weezl/src/decode.rs:765] &burst_size = 0

Member:
In the non-optimized code path the increment itself is gated behind !self.table.is_full() (https://github.com/image-rs/lzw/blob/d55c610f1bff4ef0ed2f72af2c823b4a244d51a3/src/decode.rs#L893-L904), i.e. self.inner.len() < MAX_ENTRIES where MAX_ENTRIES = 1 << 12.

I'm decently sure that if we add a debug assertion to Table::derive, which asserts this invariant, then we'll hit a violation much sooner.
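The gating described above can be sketched as follows; the `Table` type and the `step` helper here are simplified stand-ins, with only `MAX_ENTRIES = 1 << 12` taken from the linked source:

```rust
// Simplified sketch of the sequential path's gate: a new table entry and the
// next_code bump only happen while the table has room, so next_code can
// never pass MAX_ENTRIES on this path. Types are illustrative, not weezl's.
const MAX_ENTRIES: usize = 1 << 12;

struct Table {
    inner: Vec<u16>,
}

impl Table {
    fn is_full(&self) -> bool {
        self.inner.len() >= MAX_ENTRIES
    }
}

fn step(table: &mut Table, next_code: &mut u16) {
    if !table.is_full() {
        // Stand-in for derive(): record an entry, then advance the counter.
        table.inner.push(*next_code);
        *next_code += 1;
    }
}

fn main() {
    let mut table = Table { inner: Vec::new() };
    let mut next_code: u16 = 0;
    // Even with far more iterations than table slots, next_code saturates
    // at MAX_ENTRIES instead of overflowing.
    for _ in 0..100_000 {
        step(&mut table, &mut next_code);
    }
    assert_eq!(next_code as usize, MAX_ENTRIES);
}
```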

@HeroicKatora (Member):
I've thrown together a patch to put in debug_asserts for the actual invariants that the code is working under:

Patch file
From ac575ce26fb081883092536e0fcbf00c2af59cc2 Mon Sep 17 00:00:00 2001
From: Andreas Molzer <andreas.molzer@gmx.de>
Date: Tue, 19 Apr 2022 21:29:13 +0200
Subject: [PATCH] Add debug assertions on internal invariants

---
 src/decode.rs | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/src/decode.rs b/src/decode.rs
index 283e31f..49f3bfd 100644
--- a/src/decode.rs
+++ b/src/decode.rs
@@ -711,7 +711,7 @@ impl<C: CodeBuffer> Stateful for DecodeState<C> {
             Some(tup) => {
                 status = Ok(LzwStatus::Ok);
                 code_link = Some(tup)
-            },
+            }
         };
 
         // Track an empty `burst` (see below) means we made no progress.
@@ -827,6 +827,7 @@ impl<C: CodeBuffer> Stateful for DecodeState<C> {
                 // the case of requiring an allocation (which can't occur in practice).
                 let new_link = self.table.derive(&link, cha, code);
                 self.next_code += 1;
+                debug_assert!(self.next_code as usize <= MAX_ENTRIES);
                 code = burst;
                 link = new_link;
             }
@@ -918,6 +919,8 @@ impl<C: CodeBuffer> Stateful for DecodeState<C> {
                 }
 
                 self.next_code += 1;
+                debug_assert!(self.next_code as usize <= MAX_ENTRIES);
+
                 new_link = link;
             } else {
                 // It's actually quite likely that the next code will be a reset but just in case.
@@ -1203,6 +1206,13 @@ impl Table {
     }
 
     fn derive(&mut self, from: &Link, byte: u8, prev: Code) -> Link {
+        debug_assert!(
+            self.inner.len() < MAX_ENTRIES,
+            "Invalid code would be created {:?} {} {:?}",
+            from.prev,
+            byte,
+            prev
+        );
         let link = from.derive(byte, prev);
         let depth = self.depths[usize::from(prev)] + 1;
         self.inner.push(link.clone());
-- 
2.35.1

The trace of running decoding with those suggests that the comparison itself relies on an incorrect assumption. Since it uses ==, it relies on self.next_code <= self.code_buffer.max_code(), but that doesn't hold. Once we reach 12 bits, the code buffer does not get any larger and max_code() remains at 4095. At the same time, next_code will advance to 4096 (and never beyond, in the sequential code path), a code that will never be created and thus works correctly with the rest of the logic.

But when that is exactly the moment we enter a burst, as is the case with the provided file, the burst advances next_code beyond that point without noticing that the maximum code has been reached. An easy fix would be to adjust the condition:

if potential_code >= self.code_buffer.max_code() - Code::from(self.is_tiff) {

I'll measure whether that leads to too much of a performance loss from executing less of the simple code reconstruction.
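The off-by-one described above can be demonstrated in isolation. MAX_CODE = 4095 follows the 12-bit limit from the thread, and the two predicates are simplified stand-ins for the real condition (the TIFF adjustment is omitted):

```rust
// Why `==` misses the limit during a burst while `>=` does not.
// At 12 bits the code buffer stops growing and max_code() stays at 4095;
// MAX_CODE here is a constant stand-in for that call.
const MAX_CODE: u16 = 4095;

// The old condition: fires only when the potential code lands exactly
// on the maximum.
fn limit_reached_eq(potential_code: u16) -> bool {
    potential_code == MAX_CODE
}

// The fixed condition: also fires once a burst has jumped past it.
fn limit_reached_ge(potential_code: u16) -> bool {
    potential_code >= MAX_CODE
}

fn main() {
    // Sequential path: next_code approaches the limit one step at a time,
    // so the equality check fires in time.
    assert!(limit_reached_eq(4095));
    // Burst path: next_code can step over 4095 in a single addition, after
    // which the equality check never fires again...
    assert!(!limit_reached_eq(4096));
    // ...but the >= check still detects that the maximum was reached.
    assert!(limit_reached_ge(4096));
}
```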

@HeroicKatora (Member) left a review comment:

Deliberation: this fixes the immediate issue. All codes that are incorrectly created cannot be, and are not, actually used during decoding. While it's not optimal that this waste is created during decoding, it's also not critically buggy.

So, I'll merge the fix and treat the underlying cause as a separate issue (opening another issue for it).
