Canonicalize [Content_Types].xml Default/Override split #111

Merged: SimonCropp merged 1 commit into main from Canonicalize-Content_Types].xml-Default/Override-split on Jul 3, 2026

@SimonCropp (Owner)

System.IO.Packaging (via document.Clone) and Aspose's own writer each decide,
through internal collection ordering, which content type becomes the
per-extension <Default> and which parts get a per-part <Override>. For an
extension shared by several content types (notably xml:
workbook/worksheet/styles/sharedStrings), that choice, and thus which parts
become Overrides, was unstable across runs and producers. ContentTypesPatcher
only sorted the entries, so the flap survived and Aspose.Cells xlsx files
serialized non-deterministically from run to run.
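
To illustrate why the split can flap without changing the package's meaning, here is a minimal Python sketch (not this repository's .NET code). It assumes the usual OPC resolution rule: a part's Override wins, otherwise the Default for its extension applies. The helper `effective_type` and the sample part list are hypothetical.

```python
# Sketch of OPC content-type resolution. Names here are illustrative only.
SSML = "application/vnd.openxmlformats-officedocument.spreadsheetml."


def effective_type(part, defaults, overrides):
    """Return the content type a part resolves to, or None if unmapped.

    Simplification: part names are matched exactly; real OPC matching
    is case-insensitive.
    """
    if part in overrides:
        return overrides[part]
    extension = part.rsplit(".", 1)[-1].lower()
    return defaults.get(extension)


parts = [
    "/xl/workbook.xml",
    "/xl/worksheets/sheet1.xml",
    "/xl/worksheets/sheet2.xml",
]

# Split A: worksheet is the Default for xml; the workbook is an Override.
split_a = (
    {"xml": SSML + "worksheet+xml"},
    {"/xl/workbook.xml": SSML + "sheet.main+xml"},
)

# Split B: workbook is the Default for xml; both worksheets are Overrides.
split_b = (
    {"xml": SSML + "sheet.main+xml"},
    {
        "/xl/worksheets/sheet1.xml": SSML + "worksheet+xml",
        "/xl/worksheets/sheet2.xml": SSML + "worksheet+xml",
    },
)

resolved_a = {p: effective_type(p, *split_a) for p in parts}
resolved_b = {p: effective_type(p, *split_b) for p in parts}

# Different XML on disk, identical meaning for every part.
print(split_a != split_b, resolved_a == resolved_b)  # True True
```

Both splits are valid and equivalent, which is why sorting the entries alone cannot make the output byte-stable.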

Canonicalize the split instead of just sorting: compute each part's effective
content type, pick every extension's Default as its most-common content type
(Ordinal tiebreak), and emit an explicit <Override> for each part that differs.
This needs the full part list, so Convert now threads part names through to the
patcher set. OPC-preserving: every part still resolves to its original content
type (guarded by validator tests). Also drops a stale <Default Extension="psmdcp">
left behind when its part is removed.
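
The canonicalization described above can be sketched roughly as follows. This is a Python illustration under stated assumptions, not the actual ContentTypesPatcher code: the function name `canonicalize`, the input shape (a mapping of part name to effective content type), and the tiebreak direction (smallest string by code-point order) are all hypothetical. It also assumes every part name has an extension.

```python
from collections import Counter


def extension_of(part):
    """Lower-cased extension of a part name (assumes one is present)."""
    return part.rsplit(".", 1)[-1].lower()


def canonicalize(effective):
    """Derive a stable (defaults, overrides) split.

    effective: dict mapping part name -> effective content type.
    """
    counts_by_extension = {}
    for part, content_type in effective.items():
        counter = counts_by_extension.setdefault(extension_of(part), Counter())
        counter[content_type] += 1

    defaults = {}
    for extension, counts in counts_by_extension.items():
        # Most common type wins; ties fall back to code-point string order
        # (an assumed stand-in for the Ordinal tiebreak).
        defaults[extension] = min(counts, key=lambda t: (-counts[t], t))

    # Every part whose type differs from its extension's Default
    # gets an explicit Override.
    overrides = {
        part: content_type
        for part, content_type in effective.items()
        if defaults[extension_of(part)] != content_type
    }

    # Sort so the emitted order does not depend on input order.
    return dict(sorted(defaults.items())), dict(sorted(overrides.items()))


effective = {
    "/xl/workbook.xml": "sheet.main+xml",
    "/xl/styles.xml": "styles+xml",
    "/xl/worksheets/sheet1.xml": "worksheet+xml",
    "/xl/worksheets/sheet2.xml": "worksheet+xml",
    "/_rels/.rels": "relationships+xml",
}
defaults, overrides = canonicalize(effective)
print(defaults["xml"])    # worksheet+xml
print(sorted(overrides))  # ['/xl/styles.xml', '/xl/workbook.xml']
```

Because Defaults are computed only from parts that exist, an extension with no remaining parts gets no Default in this sketch, which matches the stale-entry cleanup mentioned above.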