Summary
iscc_sci.preprocess_image applies trim_border before remove_transparency, which deviates from the normative image-normalization order in IEP-0004 (Content-Code Image) and from iscc-sdk's image_normalize. The two libraries should agree on the preflight steps they share; today they do not. This issue proposes a one-function fix in code_semantic_image.py that aligns the order without requiring model retraining.
Current behavior
iscc_sci/code_semantic_image.py, preprocess_image (lines ~127–138):
    def preprocess_image(image):
        with sci.metrics(name="Image preprocessing time {seconds:.4f} seconds"):
            image = ImageOps.exif_transpose(image)
            image = trim_border(image)  # ← trim BEFORE fill
            image = image.resize((512, 512), Resampling.BILINEAR)
            image = remove_transparency(image)  # ← fill AFTER resize
            image = np.array(image, dtype=np.float32)
            image /= 255.0
            mean = np.array([0.5, 0.5, 0.5], dtype=np.float32)
            std = np.array([0.5, 0.5, 0.5], dtype=np.float32)
            image = (image - mean) / std
            image = np.expand_dims(np.transpose(image, (2, 0, 1)), axis=0)
            return image.astype(np.float32)
Order: exif → trim → resize → fill → normalize.
Expected behavior (per IEP-0004)
IEP-0004 §Processing documents the normative Image-Code preflight order:
1. EXIF Transpose
2. Alpha Compositing ("Add white background to image if it contains alpha transparency.")
3. Border Trimming ("Crop uniformly colored borders if applicable.")
4. Grayscale Conversion (Content-Code only, not relevant here)
5. Resize (Content-Code 32×32, Semantic-Code 512×512)
iscc_sdk/image.py:image_normalize implements steps 1–3 in that order:
    # iscc_sdk/iscc_sdk/image.py:47-56
    if idk.sdk_opts.image_exif_transpose:
        img = image_exif_transpose(img)
    if idk.sdk_opts.image_fill_transparency:
        img = image_fill_transparency(img)
    if idk.sdk_opts.image_trim_border:
        img = image_trim_border(img)
iscc-sci should match steps 1–3.
Why this matters
- Deterministic trim reference. trim_border uses img.getpixel((0,0)) as the reference color. For a fully transparent corner pixel, PIL exposes whatever the encoder wrote into the RGB channels. That value is not pinned down by the format, and different PNG encoders write different RGB values under alpha=0. Running remove_transparency first normalizes transparent pixels to (255, 255, 255), so the trim reference is a fixed, encoder-independent color. Today, two encoders that disagree on the corner pixel's hidden RGB values can produce different Semantic-Codes for the same visible image.
- No alpha-edge resize halos. Bilinear interpolation across an RGBA edge with straight (non-premultiplied) alpha can pull in arbitrary RGB values from under transparent pixels, producing visible color halos. Filling first means resize operates on opaque RGB and cannot produce halos.
- Alignment with the standard. IEP-0004 is the normative spec for Image-Code preprocessing. iscc-sdk follows it; iscc-sci should too for the steps both pipelines share. A shared, IEP-0004-compliant prefix simplifies downstream tooling (iscc-gen, iscc_sdk.code_iscc(experimental=True)) that wants to run one preflight feeding both pipelines.
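The first two points can be illustrated numerically with small NumPy arrays. This is a sketch under stated assumptions: fill_white and framed are hypothetical helpers, and the "resize" is a naive channel-wise average of two neighboring pixels on straight alpha, used only to show the halo mechanism. It is not a model of Pillow's resampler.

```python
import numpy as np


def fill_white(rgba):
    """Composite a straight-alpha uint8 RGBA array (H, W, 4) over white."""
    rgb = rgba[..., :3].astype(np.float32)
    alpha = rgba[..., 3:4].astype(np.float32) / 255.0
    return np.rint(rgb * alpha + 255.0 * (1.0 - alpha)).astype(np.uint8)


def framed(hidden_rgb):
    """4x4 RGBA image: transparent frame hiding `hidden_rgb`, opaque red center."""
    img = np.zeros((4, 4, 4), dtype=np.uint8)
    img[..., :3] = hidden_rgb            # RGB stored under alpha=0
    img[1:3, 1:3] = (200, 30, 30, 255)   # visible content
    return img


a, b = framed((0, 0, 0)), framed((255, 0, 255))

# 1. Trim reference: the corner pixel read before vs. after filling.
corner_before = (a[0, 0, :3].tolist(), b[0, 0, :3].tolist())
corner_after = (fill_white(a)[0, 0].tolist(), fill_white(b)[0, 0].tolist())
print("before fill:", corner_before)  # encoder-dependent
print("after fill: ", corner_after)   # both white

# 2. Halo: an opaque white pixel next to a transparent pixel hiding black.
pair = np.array([[[255, 255, 255, 255], [0, 0, 0, 0]]], dtype=np.uint8)
averaged = np.rint(pair.astype(np.float32).mean(axis=1, keepdims=True))
resize_then_fill = fill_white(averaged.astype(np.uint8))[0, 0].tolist()
filled_mean = fill_white(pair).astype(np.float32).mean(axis=1)
fill_then_resize = np.rint(filled_mean)[0].astype(np.uint8).tolist()
print("resize then fill:", resize_then_fill)  # darker than white
print("fill then resize:", fill_then_resize)  # stays white
```

In the second case the hidden black leaks into the averaged pixel when the fill runs last, while filling first keeps the visually white edge white.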
Why retraining is not required
The ONNX model is expected to be robust to this preprocessing-order change:
- The ISC21-derived training distribution uses standard heavy augmentation (crops, rotations, color jitter). trim_border and remove_transparency are inference-time normalizations layered on top; they were never part of the training distribution, regardless of order.
- For opaque inputs (the vast majority: photographs, JPEGs, opaque PNGs), the tensor at the model input is bit-identical before and after the swap.
- For inputs with alpha, and especially those that also have a uniform-colored border, the tensors differ slightly, but the differences are expected to stay well within the variation the augmentation already covers, so embedding-space similarity should be unchanged in practical use.
What does change: the exact embedding-bit output for inputs that carry alpha, most notably the alpha+border subset. Per the project README, this kind of breakage is sanctioned pre-1.0:
"All releases with version numbers below v1.0.0 may break backward compatibility and produce incompatible Semantic Image-Codes."
Proposed patch
     def preprocess_image(image):
         """Preprocess image for inference."""
         with sci.metrics(name="Image preprocessing time {seconds:.4f} seconds"):
             image = ImageOps.exif_transpose(image)
    -        image = trim_border(image)
    -        image = image.resize((512, 512), Resampling.BILINEAR)
             image = remove_transparency(image)
    +        image = trim_border(image)
    +        image = image.resize((512, 512), Resampling.BILINEAR)
             image = np.array(image, dtype=np.float32)
             image /= 255.0
             mean = np.array([0.5, 0.5, 0.5], dtype=np.float32)
             std = np.array([0.5, 0.5, 0.5], dtype=np.float32)
             image = (image - mean) / std
             image = np.expand_dims(np.transpose(image, (2, 0, 1)), axis=0)
             return image.astype(np.float32)
remove_transparency is already a no-op for mode == "RGB" inputs (early-return at the top of the function), so the new ordering is safe for images that don't carry alpha.
Output-byte impact
| Input class | Tensor diff vs. current | Embedding diff | Code diff |
| --- | --- | --- | --- |
| Opaque (RGB, no alpha) | none | none | none |
| Alpha, no uniform border | resize quality (no halos) | small | possible, usually 0–2 bits |
| Alpha + uniform border | trim bbox + resize | small–moderate | possible, usually 0–4 bits |
Backward compatibility is not a constraint here — the project is pre-1.0 and explicitly allows incompatible Semantic-Codes between releases.
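The "Code diff" column counts differing bits between the code produced before and after the change. A minimal sketch of that measurement follows, assuming the two codes are already available as equal-length raw digest bytes. hamming_bits is a hypothetical helper and the hex values are made up for illustration; extracting the digest from an ISCC string is not covered here.

```python
def hamming_bits(code_a: bytes, code_b: bytes) -> int:
    """Count the bit positions in which two equal-length digests differ."""
    if len(code_a) != len(code_b):
        raise ValueError("codes must have equal length")
    return sum(bin(x ^ y).count("1") for x, y in zip(code_a, code_b))


# Made-up digests standing in for the old and new code of the same image.
old_digest = bytes.fromhex("a5f0c3")
new_digest = bytes.fromhex("a5f0c1")
print(hamming_bits(old_digest, new_digest))
```

Running this over a fixture set of alpha and alpha+border images before and after the patch would be one way to check the bit ranges estimated in the table.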
Acceptance criteria
- preprocess_image matches IEP-0004 step order for steps 1–3 (exif_transpose → remove_transparency → trim_border).
- Existing tests pass; regenerate fixtures for any alpha+border cases.
References
- iscc-sdk/iscc_sdk/image.py:37-67 (image_normalize, reference order)
- iscc-sci/iscc_sci/code_semantic_image.py:123-151 (preprocess_image, current order)