Support streaming decompression #25

@LordMike

The problem

Today, the OD firmware allocates a static buffer of about 50 KB. Because the buffer is always allocated, some build targets cannot support compression at all. When a client starts an image transfer, it streams the compressed image data into this buffer. The buffer holds the full compressed image until the transfer completes; the firmware then decompresses the entire buffer in one go and streams the result to the panel IC.

The reason is mostly the API: the uzlib implementation does not offer an easy way to "push" data (as we receive it from the BLE connection) through decompression and into the panel.

This issue is to gather information on what can be done. I will group options into two:

  • Keep the existing zlib approach. This can mostly be done without protocol changes, as the change is purely on the MCU side.
  • Introduce new compression algorithms.

The Zlib approach

We keep zlib as is and improve the decoder. This is protocol compatible; only the MCU side changes. Clients (like the HA integration) that decide how much compressed data to send will have to change a bit, since they can now send more compressed data to OD than before. There is a slight risk that older MCU firmware, which still has to fit the whole compressed image in its buffer, will fail on such transfers.

All zlib-compatible options have the same core memory constraint: Deflate can refer back into a sliding history window. Normal zlib streams often use 32 KB (windowBits=15), but the client can compress with a smaller window:

windowBits   History window
9            512 B
10           1 KB
11           2 KB
14           16 KB
15           32 KB

If we exposed windowBits (a protocol change, either as new algorithms like ZIP_SMALL, or as a parameter such as compressionBuffer=9), the client could scale the window to what each device can hold and still get the best compression available for that window. Larger windows should give better compression ratios.
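To get a feel for the window trade-off, here is a minimal host-side sketch using Python's standard `zlib` module (the `wbits` argument is the same windowBits value). The payload is synthetic and chosen so that its only redundancy lies 4096 bytes back; it is not a real OD image, and `compress_with_window` is a hypothetical helper name. Real payloads need to be measured separately.

```python
import random
import zlib


def compress_with_window(data: bytes, window_bits: int) -> bytes:
    """Produce a normal zlib stream whose back-references fit a 2**window_bits window."""
    comp = zlib.compressobj(level=9, method=zlib.DEFLATED, wbits=window_bits)
    return comp.compress(data) + comp.flush()


# Synthetic payload: one 4 KB block of noise repeated four times, so the
# repetition can only be exploited by a window larger than 4 KB.
rng = random.Random(0)
block = bytes(rng.getrandbits(8) for _ in range(4096))
payload = block * 4

sizes = {}
for window_bits in (9, 11, 15):
    stream = compress_with_window(payload, window_bits)
    # A decoder configured with the same small window can still read the stream.
    decoder = zlib.decompressobj(wbits=window_bits)
    assert decoder.decompress(stream) + decoder.flush() == payload
    sizes[window_bits] = len(stream)
    print(f"windowBits={window_bits:2d}  window={1 << window_bits:5d} B  "
          f"compressed={len(stream)} B")
```

On this deliberately extreme input the small windows cannot see the repeated block at all, while windowBits=15 can. Typical e-paper images have much shorter-range redundancy, so the gap there is probably smaller.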

uzlib

  • Pro: Already integrated and MCU-friendly.
  • Pro: Small fixed decompressor state, roughly 1.3 KB, plus the selected dictionary/window.
  • Con: Pull-style API; no clean “need more input” result.
  • Con: True packet-driven streaming likely needs awkward buffering/task logic or a small uzlib fork.

Best fit if we prioritize low fixed RAM and want to keep the current dependency. For clean streaming from BLE/WiFi packets, we would likely need to patch/wrap uzlib so temporary input exhaustion is resumable instead of being treated as EOF/error.
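As a rough budget, the arithmetic behind "fixed state plus window" can be written down as below. The 1.3 KB and 50 KB figures are the approximate numbers from this issue, the helper name is made up, and any input staging buffer needed for packet-driven streaming is not included.

```python
def streaming_ram_bytes(window_bits: int, fixed_state_bytes: int = 1300) -> int:
    """Approximate RAM: fixed decompressor state plus a 2**window_bits history window."""
    return fixed_state_bytes + (1 << window_bits)


CURRENT_STATIC_BUFFER = 50 * 1024  # "about 50 KB" allocated today

for window_bits in (9, 10, 11, 14, 15):
    total = streaming_ram_bytes(window_bits)
    print(f"windowBits={window_bits:2d}: ~{total / 1024:.1f} KB "
          f"({total / CURRENT_STATIC_BUFFER:.0%} of the current buffer)")
```

Even with the full 32 KB window the estimate stays below the current static buffer, and with small windows it drops to a few KB.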

Miniz / tinfl

  • Pro: Real push-style incremental API with “needs input” / “has output” states.
  • Pro: Maps cleanly to BLE/WiFi packets and direct panel SPI output.
  • Con: Larger fixed decompressor state, roughly 8-11 KB.
  • Con: Likely too heavy for the 24 KB RAM target unless combined with small-window streams.

Best fit if we want the cleanest implementation for normal nRF/ESP targets. We can feed packet bytes into tinfl, write produced bytes directly to the panel, and remove the full compressed-image buffer.
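The packet-driven flow can be sketched on the host with Python's `zlib.decompressobj`, which is also a push-style API. This is only a stand-in to show the shape of the loop; it is not the tinfl API, and `stream_to_panel`, the 244-byte packet size, and the synthetic frame are all assumptions for illustration.

```python
import random
import zlib


def stream_to_panel(packets, write_to_panel, window_bits=15):
    """Decompress packet by packet; return the largest compressed chunk held at once."""
    decoder = zlib.decompressobj(wbits=window_bits)
    peak_input = 0
    for packet in packets:
        peak_input = max(peak_input, len(packet))
        produced = decoder.decompress(packet)  # may be empty: decoder needs more input
        if produced:
            write_to_panel(produced)           # forward output immediately
    tail = decoder.flush()
    if tail:
        write_to_panel(tail)
    if not decoder.eof:
        raise ValueError("compressed stream ended early")
    return peak_input


# Made-up frame: 48,000 bytes drawn from three byte values, split into packets.
rng = random.Random(1)
frame = bytes(rng.choice((0x00, 0x0F, 0xFF)) for _ in range(48_000))
compressed = zlib.compress(frame, 9)
packets = [compressed[i:i + 244] for i in range(0, len(compressed), 244)]

panel = bytearray()  # stands in for the panel SPI write
peak = stream_to_panel(packets, panel.extend)
assert bytes(panel) == frame
print(f"{len(packets)} packets, at most {peak} compressed bytes held at once")
```

The point is that the compressed image never exists in memory as a whole; only one packet plus the decoder's own state and window are live. On the MCU the output per call would also need to be bounded (Python offers `max_length` for that), which this sketch skips.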

G5 / Group5

  • Pro: Very small decoder target; designed for constrained MCUs and no 32 KB Deflate window.
  • Pro: Supports 1bpp directly and 4-color output as two separately compressed 1bpp planes.
  • Con: Not push-style as-is; current decoder takes a full compressed buffer pointer and decodes one output line at a time.
  • Con: Does not support native 4bpp / 16-color images; its “4 color” and 4GRAY modes are effectively 2bpp split into two 1bpp planes.

bitbank2 (who also makes the bb_epaper library) has experimented with image compression and touts some impressive numbers. Best fit as a new low-memory compression mode, especially for the 24 KB target. We would need protocol/client changes and a firmware decoder adaptation that reads compressed bytes from a small input ring instead of a full buffer.

Notes from the repo: README says G5 is a lossless 1bpp format, with three/four-color support implemented as two compressed 1bpp images. The code exposes decodeLine(), stores current/reference line flip arrays, and the converter modes are BW, BWR, BWYR, and 4GRAY.

heatshrink

  • Pro: Designed for embedded streaming; API is explicitly sink input, poll output, then finish.
  • Pro: Very small decoder state; RAM is mostly 2^window_sz2 history plus a configurable input buffer.
  • Pro: Works on arbitrary bytes, so it can handle 1bpp, 2bpp, 4bpp, etc. without image-specific changes.
  • Con: Not zlib-compatible, so this requires a new protocol mode and client-side encoder support.
  • Con: Compression ratio is likely worse than zlib on many images; needs testing with real OD payloads.

Best fit for the smallest targets and for clean packet-driven decompression. Firmware would feed BLE/WiFi chunks into heatshrink_decoder_sink(), drain output with heatshrink_decoder_poll(), and write produced bytes directly to panel SPI. Encoder/decoder settings like window_sz2 and lookahead_sz2 must be agreed by protocol/capability negotiation.
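One possible shape for the negotiation, as a sketch only: the device reports how much RAM it can spare, and the client picks the largest window that fits using the "2^window_sz2 plus input buffer" estimate above. The function names and the returned dictionary are hypothetical. The 4..15 range for window_sz2 and the rule that lookahead_sz2 must be smaller than window_sz2 are my understanding of heatshrink's limits and should be checked against its headers.

```python
def pick_window_sz2(ram_budget_bytes, input_buffer_bytes, max_sz2=15, min_sz2=4):
    """Largest window_sz2 whose history (2**window_sz2) plus input buffer fits the budget."""
    for window_sz2 in range(max_sz2, min_sz2 - 1, -1):
        if (1 << window_sz2) + input_buffer_bytes <= ram_budget_bytes:
            return window_sz2
    return None  # nothing fits


def pick_settings(ram_budget_bytes, input_buffer_bytes, preferred_lookahead_sz2=4):
    """Settings both sides would agree on, or None if heatshrink cannot be used."""
    window_sz2 = pick_window_sz2(ram_budget_bytes, input_buffer_bytes)
    if window_sz2 is None:
        return None
    # Assumption: lookahead must stay strictly below the window size.
    lookahead_sz2 = min(preferred_lookahead_sz2, window_sz2 - 1)
    return {"window_sz2": window_sz2, "lookahead_sz2": lookahead_sz2}


# Example: a device that can spare 4 KB, with a 64-byte input buffer.
print(pick_settings(ram_budget_bytes=4096, input_buffer_bytes=64))
```

If nothing fits, the client would have to fall back to another mode, for example an uncompressed transfer.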
