fix(version): heal ContentType for legacy octet-stream files on labelled version#288

Merged
kptdobe merged 3 commits into main from verfix on May 29, 2026

Conversation

@kptdobe
Contributor

@kptdobe kptdobe commented May 29, 2026

Summary (revised after board pushback)

POST /versionsource returned a silent 500 for legacy imports whose source object was stored with ContentType: application/octet-stream (or with no ContentType at all). The diagnostic from #284 confirmed 9 occurrences in 24 hours, all with that identical fingerprint.

Fix (was A, now B)

Original proposal (A) was to widen the gate: createVersion = shouldCreateVersion(contentType) || update.label != null. Board flagged that we should fix the underlying ContentType rather than let bad metadata through.

This PR ships fix B: in postObjectVersionWithLabel, infer the correct mime from daCtx.ext when the stored ContentType is missing or octet-stream, then pass that inferred type through update.type.

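// src/storage/version/put.js
// Returns a versionable mime type for the labelled-version path when one can be inferred.
function inferVersionableType(contentType, ext) {
  // Already versionable: keep the stored type as-is.
  if (shouldCreateVersion(contentType)) return contentType;
  // Otherwise (e.g. missing or application/octet-stream), infer from the extension.
  if (ext === 'html') return 'text/html';
  if (ext === 'json') return 'application/json';
  // No known mapping: return the stored type so the gate still rejects it.
  return contentType;
}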
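
For illustration, here is a minimal standalone sketch of how the helper behaves on the inputs discussed in this PR. The `shouldCreateVersion` below is an assumed stand-in (only `text/html` and `application/json` pass, as the commit message describes), not the project's real implementation.

```javascript
// Assumed stand-in for the project's gate: only html and json are versionable.
const VERSIONABLE = ['text/html', 'application/json'];
function shouldCreateVersion(contentType) {
  return VERSIONABLE.includes(contentType);
}

// Same logic as the helper in this PR.
function inferVersionableType(contentType, ext) {
  if (shouldCreateVersion(contentType)) return contentType;
  if (ext === 'html') return 'text/html';
  if (ext === 'json') return 'application/json';
  return contentType;
}

// Legacy import stored as octet-stream: healed from the extension.
const healedHtml = inferVersionableType('application/octet-stream', 'html');
// Missing ContentType on a .json file: healed from the extension.
const healedJson = inferVersionableType(undefined, 'json');
// Binary with no known mapping: returned unchanged, so the gate still rejects it.
const untouched = inferVersionableType('image/jpeg', 'jpg');

console.log(healedHtml, healedJson, untouched);
```

Because unmapped extensions fall through unchanged, the existing gate keeps the final say on what is versioned.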

Effects:

  • Legacy .html files stored as octet-stream now get a labelled version. The version snapshot is correctly tagged text/html.
  • The main object's PUT (the one that always runs in this code path) writes the inferred ContentType into S3 metadata, so the file is self-healed for all future requests. Auto-versioning on the next plain PUT now works because shouldCreateVersion('text/html') returns true.
  • The shouldCreateVersion gate is unchanged. Binary files (jpg/pdf/mp4/etc.) still cannot be labelled-versioned, matching the project's existing "binaries do not version" semantics that JPEG/PNG/PDF/MP4/SVG/ZIP tests already assert.
  • The diagnostic log from #284 (fix(version): log diagnostics on POST /versionsource silent 500) stays in place and is extended with inferredType + ext, so any future unhealed pattern is still observable in Cloudflare Logs.
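
To make the difference between proposal A and fix B concrete, the sketch below evaluates both decisions for a legacy HTML file and a JPEG when a label is supplied. `gateA`, `gateB`, the sample file names, and the stand-in `shouldCreateVersion` are hypothetical names used only for this comparison.

```javascript
// Assumed stand-in for the project's gate: only html and json are versionable.
function shouldCreateVersion(contentType) {
  return contentType === 'text/html' || contentType === 'application/json';
}

function inferVersionableType(contentType, ext) {
  if (shouldCreateVersion(contentType)) return contentType;
  if (ext === 'html') return 'text/html';
  if (ext === 'json') return 'application/json';
  return contentType;
}

// Proposal A (not shipped): any explicit label bypasses the gate.
function gateA(contentType, label) {
  return shouldCreateVersion(contentType) || label != null;
}

// Fix B (shipped): heal the type first, then apply the unchanged gate.
function gateB(contentType, ext) {
  return shouldCreateVersion(inferVersionableType(contentType, ext));
}

const cases = [
  { name: 'legacy.html', contentType: 'application/octet-stream', ext: 'html' },
  { name: 'photo.jpg', contentType: 'image/jpeg', ext: 'jpg' },
];

const results = cases.map((c) => ({
  name: c.name,
  proposalA: gateA(c.contentType, 'v1'),
  fixB: gateB(c.contentType, c.ext),
}));

console.table(results);
```

Under these assumptions both approaches version the legacy HTML file, but only proposal A would also version the labelled JPEG, which is the behavior the board pushed back on.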

Tests

  • heals legacy octet-stream HTML on labelled version: snapshot + main object both repaired - asserts the version snapshot has ContentType: text/html, the main object PUT writes ContentType: text/html, the audit entry is emitted with versionLabel + versionId.
  • logs diagnostics and returns 500 when labelled version requested on non-versionable ext - asserts the silent-500 path still 500s for unknown extensions, and the diagnostic log captures contentType / inferredType / ext / hadLabel / currentStatus.
  • plain PUT (no label) still skips auto-version for non-html/json contentType - companion to confirm the labelled-path mime inference does not bleed into auto-versioning.
  • All existing tests pass; binary-never-version tests untouched.
npx eslint src test  ->  clean
npm test             ->  393 passing (audit timeout flake unrelated)
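
The shape of the first two tests can be sketched without the real storage layer. Everything below is a simplified model, not the project's test code: `labelVersionSketch`, the in-memory `Map` store, the `key@label` version id, and the `check` helper are all hypothetical, and only the `{ error: 'Version was not created' }` result is taken from the commit message.

```javascript
// Assumed stand-in for the project's gate.
const shouldCreateVersion = (t) => t === 'text/html' || t === 'application/json';

function inferVersionableType(contentType, ext) {
  if (shouldCreateVersion(contentType)) return contentType;
  if (ext === 'html') return 'text/html';
  if (ext === 'json') return 'application/json';
  return contentType;
}

// Hypothetical model of the labelled-version path over an in-memory store.
function labelVersionSketch(store, key, ext, label) {
  const current = store.get(key);
  const type = inferVersionableType(current.contentType, ext);
  if (!shouldCreateVersion(type)) return { error: 'Version was not created' };
  const versionId = `${key}@${label}`; // made-up id scheme for the sketch
  store.set(versionId, { body: current.body, contentType: type }); // version snapshot
  store.set(key, { body: current.body, contentType: type }); // main object PUT heals metadata
  return { versionId };
}

// Tiny framework-free assertion helper.
function check(condition, message) {
  if (!condition) throw new Error(`FAILED: ${message}`);
}

const store = new Map([
  ['page.html', { body: '<p>hi</p>', contentType: 'application/octet-stream' }],
  ['clip.mp4', { body: 'binary', contentType: 'video/mp4' }],
]);

// Heals legacy octet-stream HTML: snapshot and main object both repaired.
const healed = labelVersionSketch(store, 'page.html', 'html', 'v1');
check(store.get(healed.versionId).contentType === 'text/html', 'snapshot tagged text/html');
check(store.get('page.html').contentType === 'text/html', 'main object healed');

// Non-versionable extension: no version, stored metadata untouched.
const rejected = labelVersionSketch(store, 'clip.mp4', 'mp4', 'v1');
check(rejected.error === 'Version was not created', 'binary is rejected');
check(store.get('clip.mp4').contentType === 'video/mp4', 'binary metadata unchanged');

console.log('all checks passed');
```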

Risk

  • Mime inference is narrow (only html/json). Files with truly unknown extensions still 500, which is consistent with current behavior.
  • Storage cost increases for the labelled-version writes that previously failed (bounded by the actual call rate, <=9/24h).
  • The S3 ContentType is now mutated when we heal a legacy object. Mutation only happens on the labelled-version code path, which already PUTs the main object as part of its normal flow. No new write.

Refs

kptdobe and others added 2 commits May 29, 2026 10:04
POST /versionsource returned a silent 500 for legacy imports whose
source object was stored with ContentType: application/octet-stream.
shouldCreateVersion gates only text/html and application/json, so even
when an explicit label was supplied the version write was skipped and
postObjectVersionWithLabel returned { error: 'Version was not created' }.

The diagnostic added in #284 confirmed 9 occurrences/24h with an
identical fingerprint (octet-stream, hadLabel, currentStatus=200).

When the caller passes an explicit label, treat the version as
requested-by-name and create it regardless of contentType. Auto-version
on plain PUT still gates to html/json, so storage cost is bounded to
the labelled call rate.

Refs: COR-55, COR-46

Co-Authored-By: Paperclip <noreply@paperclip.ing>

Project convention: ticket IDs belong in commit messages and PR
descriptions, not source code (they rot as tickets are renumbered or
deleted).

Co-Authored-By: Paperclip <noreply@paperclip.ing>
@codecov

codecov Bot commented May 29, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

…e gate

Pivot from the gate-widening approach (createVersion || label != null) to
healing the underlying metadata on the labelled-version path.

postObjectVersionWithLabel now derives a versionable mime from daCtx.ext
when the stored ContentType is missing or application/octet-stream:

  html  -> text/html
  json  -> application/json

The inferred type is passed via update.type. shouldCreateVersion sees the
healed type, the version snapshot stores ContentType: text/html (or json),
and the main object's PUT overwrites the stale ContentType in S3 metadata
so the file is self-healed for all future requests.

Binary files (jpg/pdf/etc.) still cannot be labelled-versioned, matching
the project's "binaries do not version" semantics. The diagnostic log from
#284 is retained and extended with inferredType + ext, so the unhealed
path is observable.

Tests updated:
- new: legacy octet-stream HTML labelled version heals snapshot + main
- new: labelled version on non-versionable ext still 500s with diagnostic
- companion: plain PUT auto-version gate intact (no leak from labelled path)

Co-Authored-By: Paperclip <noreply@paperclip.ing>
@kptdobe kptdobe changed the title from "fix(version): allow named /versionsource on non-html/json content types" to "fix(version): heal ContentType for legacy octet-stream files on labelled version" May 29, 2026
@kptdobe kptdobe merged commit bf86d40 into main May 29, 2026
5 checks passed
@kptdobe kptdobe deleted the verfix branch May 29, 2026 09:04
adobe-bot pushed a commit that referenced this pull request May 29, 2026
## [1.9.3](v1.9.2...v1.9.3) (2026-05-29)

### Bug Fixes

* **version:** heal ContentType for legacy octet-stream files on labelled version ([#288](#288)) ([bf86d40](bf86d40)), closes [#284](#284) [#284](#284)
@adobe-bot
Collaborator

🎉 This PR is included in version 1.9.3 🎉

The release is available on:

Your semantic-release bot 📦🚀
