@turian turian commented Nov 6, 2025

Related to #12596

Purpose

Allow users to optionally export 16-bit evaluation samples (PNG) from the unconditional training example while keeping TensorBoard previews 8-bit and avoiding silent dtype conversions.

Changes

  • Adds helpers:
    • _prepare_sample_images: clamps NHWC float images to [0, 1] and scales them to uint8 or uint16 arrays; also builds an 8-bit preview for TensorBoard.
    • _log_sample_images: logs previews to TensorBoard and uploads true 16-bit PNGs to W&B via file paths.
  • Introduces --image_bit_depth with choices {8, 16} (default 8). Explicit guards raise ValueError for unsupported depths/dtypes.
  • W&B: replaces in-memory buffer usage with temporary file paths, so the uploaded artifact is the exact 16-bit PNG. Notes:
    • Pillow is used to encode 16-bit PNGs. BytesIO is avoided because wandb.Image does not reliably accept it.
    • For RGB(A) 16-bit, uses raw mode "RGB;16B"/"RGBA;16B" and byteswap to satisfy Pillow’s big-endian expectation.
  • Aligns image logging steps to global_step (matches scalar metrics).
  • Docs: README documents the flag and clarifies preview behavior (TensorBoard stays 8-bit; W&B previews are typically 8-bit, but uploaded files keep 16-bit).
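
For readers who want a concrete picture of the conversion step, here is a minimal sketch of what a helper like _prepare_sample_images could do. It is not the code in this PR: the function name `prepare_sample_images_sketch`, the rounding rule, and the return shape (full-depth array, 8-bit preview) are assumptions made for illustration.

```python
import numpy as np


def prepare_sample_images_sketch(images, bit_depth=8):
    """Hypothetical stand-in for the PR's helper, not the actual implementation.

    Takes NHWC float images nominally in [0, 1] and returns a tuple
    (full_depth_array, preview_uint8).
    """
    if bit_depth not in (8, 16):
        raise ValueError(f"Unsupported bit depth: {bit_depth}")

    images = np.asarray(images)
    if images.ndim != 4:
        raise ValueError(f"Expected NHWC input, got shape {images.shape}")
    if not np.issubdtype(images.dtype, np.floating):
        raise ValueError(f"Expected a float array, got dtype {images.dtype}")

    # Clamp first so out-of-range values cannot wrap around after the cast.
    clipped = np.clip(images, 0.0, 1.0)

    # The 8-bit preview is always built, whatever depth was requested.
    preview = np.round(clipped * 255.0).astype(np.uint8)
    if bit_depth == 8:
        return preview, preview

    # 16-bit export: same clamp, larger scale (rounding rule is an assumption).
    full = np.round(clipped * 65535.0).astype(np.uint16)
    return full, preview
```

Clamping before the cast matters because casting an out-of-range float to an unsigned integer type does not saturate reliably.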

Behavior/UX

  • Default behavior unchanged (8-bit).
  • When --image_bit_depth 16 is set:
    • TensorBoard shows 8-bit previews for usability.
    • W&B receives the true 16-bit PNG files; previews may look 8-bit in the UI, but the stored files are 16-bit and can be downloaded.
  • Explicit dtype/shape checks fail fast if inputs are not uint8/uint16 or are in an unsupported layout.
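
The flag and the fail-fast guard described above can be sketched roughly as follows. The flag name, choices, and default come from this description; the help text and the `check_bit_depth` function are hypothetical and only illustrate the pattern.

```python
import argparse


def build_parser():
    """Minimal parser with only the flag described in this PR."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--image_bit_depth",
        type=int,
        choices=[8, 16],
        default=8,
        help="Bit depth of exported sample images (help text is illustrative).",
    )
    return parser


def check_bit_depth(bit_depth):
    """Hypothetical guard for callers that bypass the CLI parser."""
    if bit_depth not in (8, 16):
        raise ValueError(f"Unsupported image bit depth: {bit_depth}")
    return bit_depth
```

With `choices` set, argparse rejects other values at parse time; the separate ValueError guard covers code paths that call the helpers directly.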

Limitations (intentional to keep scope small)

  • 32-bit TIFFs are not included (would require broader encoding and logger plumbing).
  • Pillow is required at runtime for W&B 16-bit encoding. The import stays scoped, and Pillow is not added to the example’s requirements so that the PR stays small; torchvision typically brings Pillow in.

Notes for reviewers

  • No change to training semantics; this only touches evaluation/sample logging.
  • Helper functions localize the conversion and logging behavior for future extensions (additional bit depths can be added by extending the helpers and CLI choices).
  • Endianness is handled explicitly for 16-bit RGB(A) via ";16B" and byteswap.
  • Temporary file paths are unlinked after logging.
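
To show why byte order matters and how the temporary file lifecycle can look, here is a self-contained sketch. It deliberately swaps Pillow for a tiny standard-library PNG writer so the big-endian step is visible; the PNG format stores 16-bit samples most significant byte first. The names `encode_png16_rgb` and `log_via_temp_file` are hypothetical, and `log_fn` stands in for whatever logger call receives the file path.

```python
import os
import struct
import tempfile
import zlib

import numpy as np

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def _chunk(tag, data):
    """Frame one PNG chunk: length, tag, payload, CRC over tag + payload."""
    crc = zlib.crc32(tag + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + tag + data + struct.pack(">I", crc)


def encode_png16_rgb(image):
    """Encode an HWC uint16 RGB array as a 16-bit PNG (illustrative only)."""
    if image.dtype != np.uint16 or image.ndim != 3 or image.shape[2] != 3:
        raise ValueError("Expected an HWC uint16 array with 3 channels")
    height, width, _ = image.shape

    # PNG wants big-endian samples; most hosts are little-endian, so convert.
    big_endian = image.astype(">u2")

    # Each scanline starts with filter type 0 (no filtering).
    scanlines = b"".join(
        b"\x00" + big_endian[row].tobytes() for row in range(height)
    )
    # IHDR: width, height, bit depth 16, color type 2 (truecolor), then
    # compression, filter, and interlace methods all set to 0.
    ihdr = struct.pack(">IIBBBBB", width, height, 16, 2, 0, 0, 0)
    return (
        PNG_SIGNATURE
        + _chunk(b"IHDR", ihdr)
        + _chunk(b"IDAT", zlib.compress(scanlines))
        + _chunk(b"IEND", b"")
    )


def log_via_temp_file(png_bytes, log_fn):
    """Write bytes to a temp file, pass the path to log_fn, always unlink."""
    fd, path = tempfile.mkstemp(suffix=".png")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(png_bytes)
        log_fn(path)
    finally:
        os.unlink(path)
```

Putting the unlink in a `finally` block means the temporary file is removed even if the logging call raises.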

Follow-ups (out of scope here)

  • Optional: support 32-bit float/TIFF and a generic bit-depth pipeline.

turian commented Nov 6, 2025

Closing for now, I need to test this more.

@turian turian closed this Nov 6, 2025