Skip to content

ref(preprod): Revert odiff CLI-mode workaround after v4.3.8 fix#113335

Open
NicoHinderling wants to merge 1 commit intomasterfrom
nico/revert-odiff-workaround
Open

ref(preprod): Revert odiff CLI-mode workaround after v4.3.8 fix#113335
NicoHinderling wants to merge 1 commit intomasterfrom
nico/revert-odiff-workaround

Conversation

@NicoHinderling
Copy link
Copy Markdown
Contributor

Reverts the odiff CLI-mode workaround (#112829) now that odiff-bin v4.3.8 ships the upstream fix for the server-mode stdin buffer reuse bug (dmtrKovalenko/odiff#170). Snapshot comparison goes back to using server mode — one persistent subprocess per batch instead of spawning one per comparison.

Dependency pin

odiff-bin is pinned to exact 4.3.8 (no caret) in package.json, pnpm-lock.yaml, and the CI workflow. Exact pin keeps dev / prod / CI in lockstep with minimal machinery.

Bonus fix — binary resolution

_find_odiff_binary() now points at node_modules/odiff-bin/bin/odiff.exe instead of raw_binaries/odiff-<platform>. v4.3.8's post_install.js only sets the execute bit on bin/odiff.exe; the raw binaries ship without +x, which was causing PermissionError on fresh installs. Despite the .exe suffix, this is the package's cross-platform entrypoint on all OSes.

Minor cleanup

Extracted _fetch_batch_images helper in tasks.py so the per-batch closure / lock / cache setup lives in one place instead of being rebuilt inside every loop iteration.

odiff-bin v4.3.8 ships the upstream fix for the server-mode stdin buffer
reuse bug (dmtrKovalenko/odiff#170), so we can
drop the CLI-mode fallback introduced in #112829 and go back to using
server mode.

Also:
- Pin odiff-bin to exact 4.3.8 (no caret) in package.json and CI
- Point _find_odiff_binary() at node_modules/odiff-bin/bin/odiff.exe;
  v4.3.8's post-install only chmods this path, while raw_binaries/ ship
  without the execute bit
- Extract _fetch_batch_images helper in tasks.py to avoid per-iteration
  closure rebinding inside the batch loop

Co-Authored-By: Claude <noreply@anthropic.com>
@NicoHinderling NicoHinderling marked this pull request as ready for review April 17, 2026 18:17
@NicoHinderling NicoHinderling requested review from a team as code owners April 17, 2026 18:17
@github-actions github-actions bot added Scope: Frontend Automatically applied to PRs that change frontend components Scope: Backend Automatically applied to PRs that change backend components labels Apr 17, 2026
@github-actions
Copy link
Copy Markdown
Contributor

🚨 Warning: This pull request contains Frontend and Backend changes!

It's discouraged to make changes to Sentry's Frontend and Backend in a single pull request. The Frontend and Backend are not atomically deployed. If the changes are interdependent of each other, they must be separated into two pull requests and be made forward or backwards compatible, such that the Backend or Frontend can be safely deployed independently.

Have questions? Please ask in the #discuss-dev-infra channel.

Comment on lines +533 to +534
with OdiffServer() as server:
for batch in batches:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: The OdiffServer is started unconditionally, causing the compare_snapshots task to fail if the odiff binary is unavailable, even when no images need diffing.
Severity: HIGH

Suggested Fix

The OdiffServer should only be started if there are batches to process. Wrap the with OdiffServer() as server: block in a conditional check, such as if batches:, to ensure it only executes when there is diffing work to be done. This restores the previous behavior of avoiding subprocess creation for zero-diff scenarios.

Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's
not valid.

Location: src/sentry/preprod/snapshots/tasks.py#L533-L534

Potential issue: The `OdiffServer` is initialized unconditionally via a `with` statement
before iterating through image batches. In scenarios where no images require diffing,
such as when all images are identical, the `eligible` and `batches` lists will be empty.
Despite having no work to do, the `OdiffServer` still attempts to start its subprocess.
If the `odiff` binary is missing or has permission issues, this raises an unhandled
exception, causing the entire `compare_snapshots` task to fail. Previously, the server
was only started if there were batches to process, avoiding this failure.

Did we get this right? 👍 / 👎 to inform future reviews.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Scope: Backend Automatically applied to PRs that change backend components Scope: Frontend Automatically applied to PRs that change frontend components

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants