Skip to content

fix: preserve MPR v1 contents hash and UnitID across DROP+CREATE#258

Merged
ako merged 3 commits intomendixlabs:mainfrom
hjotha:submit/corruption-fix-and-v1-hash
Apr 22, 2026
Merged

fix: preserve MPR v1 contents hash and UnitID across DROP+CREATE#258
ako merged 3 commits intomendixlabs:mainfrom
hjotha:submit/corruption-fix-and-v1-hash

Conversation

@hjotha
Copy link
Copy Markdown
Contributor

@hjotha hjotha commented Apr 22, 2026

Summary

Two small v1 MPR storage fixes that together prevent Studio Pro from refusing to open projects after MDL roundtrips:

  • persist contents hashes for v1 unit writes — v1 format stored units inline in the DB but didn't refresh the ContentsHash row when a write path bypassed the writer's hash recomputation. Studio Pro would see a hash mismatch and treat the file as corrupt.
  • preserve microflow UnitID across DROP + CREATE OR MODIFY — running DROP MICROFLOW X; CREATE OR MODIFY MICROFLOW X ... in the same session used to delete the Unit row and re-insert with a freshly generated UUID. Studio Pro treats the rewritten ContainerID/UnitID pair as an unrelated document and refuses to open the resulting .mpr ("file does not look like a Mendix Studio Pro project"). Fix caches the dropped microflow's UnitID on the executor and reuses it when a subsequent CREATE OR REPLACE/MODIFY targets the same qualified name, so the delete+insert behaves like an in-place update.

Part of umbrella #257.

Test plan

  • New regression test TestDropThenCreatePreservesMicroflowUnitID exercises the full DROP → CREATE OR MODIFY flow through MockBackend and asserts the UnitID is preserved and the cache entry consumed.
  • go build ./... && go test ./... pass on top of current main.
  • Manual validation: a project that previously fails to open in Studio Pro after the DROP + CREATE pattern now opens cleanly.

hjothamendix and others added 2 commits April 22, 2026 14:22
MPR v1 stores unit bodies inline in the SQLite database with a
ContentsHash column that Studio Pro uses as an integrity check. The v1
INSERT and UPDATE paths in Writer.insertUnit / Writer.updateUnit were
passing an empty string (or nothing at all) for ContentsHash, which left
the stored hash out of sync with the actual bytes after any mxcli write.
On the next open, Studio Pro refused the file with a corruption error.

v2 already computed the hash correctly — the fix extracts that logic
into contentHashBase64() and uses it in both schema paths. The v1 UPDATE
also falls back to the older schema that doesn't have ContentsHash so
projects on pre-ContentsHash schemas still succeed.

Regression test (sdk/mpr/writer_units_test.go) verifies the hash column
matches sha256(Contents) after insert and update on both v1 and v2.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Studio Pro treats Unit rows with a new UnitID (for the same qualified
name) as unrelated documents, producing ".mpr does not look like a
Mendix Studio Pro project file" errors. The executor now caches the
dropped microflow's UnitID, ContainerID, and AllowedRoles via
droppedMicroflows, and a follow-up CREATE OR REPLACE/MODIFY reuses
them, turning delete+insert into a semantically in-place update.

Also hardens scripts/run-mdl-tests.sh with explicit input-path
validation so typos don't silently copy an empty sandbox.
@github-actions
Copy link
Copy Markdown

AI Code Review

What Looks Good

  • Clear problem statement: The PR identifies two related issues causing Studio Pro to reject projects after MDL roundtrips: missing content hash updates in v1 units and UnitID changes during DROP+CREATE sequences.
  • Appropriate fixes:
    • The v1 content hash fix centralizes hash computation in contentHashBase64() and ensures hashes are updated in both write paths (v1 and v2) with proper fallback for older schemas.
    • The UnitID preservation fix caches dropped microflow metadata and reuses it during subsequent CREATE OR REPLACE/MODIFY operations, making Studio Pro see the change as an in-place update.
  • Thorough test coverage:
    • New regression test TestDropThenCreatePreservesMicroflowUnitID exercises the full DROP → CREATE OR MODIFY flow via MockBackend
    • Additional test verifies allowed roles preservation during CREATE OR MODIFY
    • Unit tests for v1 content hash functionality cover insert/update scenarios and backward compatibility
    • All tests avoid time.Sleep for synchronization
  • Architectural compliance:
    • Executor changes properly use ctx.Backend.* interfaces (no direct sdk/mpr imports)
    • New backend methods would be implemented through the existing abstraction layer
    • Cache manipulation stays within executor boundaries
  • Code quality:
    • Changes are focused and consistent with existing patterns
    • Comments clearly explain the Studio Pro compatibility rationale
    • No duplication of existing functionality
    • Follows Go conventions and project naming standards

Recommendation

Approve - The PR correctly addresses a critical compatibility issue with appropriate fixes, comprehensive tests, and full adherence to the project's architectural guidelines. The two changes are closely related aspects of the same root cause (MPR v1 corruption after roundtrips) and are appropriately scoped together.


Automated review via OpenRouter (Nemotron Super 120B) — workflow source

@ako
Copy link
Copy Markdown
Collaborator

ako commented Apr 22, 2026

Code Review

Overview

Two independent v1 MPR correctness fixes:

  1. ContentsHash — v1 insertUnit/updateUnit/WriteUnit now populate ContentsHash, preventing Studio Pro from treating the file as corrupt.
  2. UnitID preservationDROP MICROFLOW X; CREATE OR MODIFY MICROFLOW X in the same session now reuses the original UnitID and ContainerID instead of generating fresh UUIDs, so Studio Pro sees an in-place update rather than a delete+insert pair.

Both fixes are correctly motivated, well-tested, and the regression test for the UnitID case is especially thorough.


Issues

Silent error swallowing in the ContentsHash fallback (both writer_units.go and writer_core.go)

_, err := wt.tx.Exec(`UPDATE Unit SET Contents = ?, ContentsHash = ? WHERE UnitID = ?`, ...)
if err != nil {
    _, err = wt.tx.Exec(`UPDATE Unit SET Contents = ? WHERE UnitID = ?`, ...)
}
return err

The fallback fires on any error from the first Exec, not just "table Unit has no column named ContentsHash". If the first update fails for a different reason — UnitID not found, transaction already rolled back, disk full — the fallback silently retries and returns the fallback's result. A write that should fail might silently succeed (without ContentsHash), or a misleading error is returned.

The same pattern appears in updateUnit (writer_units.go) and WriteUnit (writer_core.go). Minimal fix: check strings.Contains(err.Error(), "ContentsHash") before falling back, or detect schema capability once at open time and cache it. At minimum, add a comment acknowledging the broad catch and the risk.

Performance: double SQL on every v1 write for old-schema projects

With the current fallback pattern, every updateUnit and WriteUnit call on an old-schema v1 project executes two SQL statements (first attempt fails, second succeeds). For a script that writes many microflows, this doubles the write query count. Schema detection should ideally happen once at transaction start and the right query chosen statically. Not blocking for this fix, but worth a follow-up.

Missing mdl-examples/bug-tests/ script

The CLAUDE.md checklist requires every bug fix to include an MDL bug-test script in mdl-examples/bug-tests/ that reproduces the issue for Studio Pro verification. The scripts/run-mdl-tests.sh is useful for CI but targets the microflow spec test, not this specific bug. A two-line script like DROP MICROFLOW X; CREATE OR MODIFY MICROFLOW X ... in mdl-examples/bug-tests/ would close the checklist item and serve as a permanent regression artifact.


Minor notes

  • cloneRoleIDs is called twice for the drop-then-create path: once in rememberDroppedMicroflow (when stashing) and once in execCreateMicroflow (cloneRoleIDs(dropped.AllowedRoles)). The second clone is unnecessary since consumeDroppedMicroflow removes the cache entry, so the caller owns the slice. Not a bug, just a spurious allocation.

  • ctx.executor.cache = ctx.Cache in rememberDroppedMicroflow correctly syncs the cache pointer back to the executor so subsequent newExecContext calls see the same droppedMicroflows map. The test correctly exercises this with a single shared ctx rather than going through executeInner.


Strengths

  • TestDropThenCreatePreservesMicroflowUnitID: The test reproduces the exact bug scenario — ListMicroflows returns nil after DeleteMicroflow fires, so CREATE OR MODIFY sees no existing microflow and must fall through to the dropped-cache path. The white-box assertion on ctx.Cache.droppedMicroflows before and after CREATE precisely documents the invariants the fix relies on.

  • TestUnitV1_OldSchemaWithoutContentsHashStillWorks: Tests the old-schema fallback end-to-end including both insertUnit and updateUnit in sequence on the same row. Good coverage of the compatibility case.

  • contentHashBase64 extraction: The helper eliminates four copies of the same SHA256+base64 block. Clean refactor, correctly shared across writer_units.go and writer_core.go.

  • TestCreateOrModifyMicroflowPreservesAllowedRoles: Captures a second invariant — AllowedModuleRoles must survive a CREATE OR MODIFY on an existing microflow. Good addition alongside the drop-then-create test.


Summary

The two fixes are correct and the regression tests are solid. The main concern is the broad-catch fallback pattern in the ContentsHash writes — it should at minimum be narrowed or commented. The missing bug-test script is a checklist item. Everything else is clean.

hjotha pushed a commit to hjotha/mxcli that referenced this pull request Apr 22, 2026
- Narrow v1 ContentsHash fallback to actual schema errors (ako)
  Previously any UPDATE error would trigger the "older schema"
  retry, potentially masking real failures (disk full, invalid
  UnitID, rolled-back tx). Now only retries when the error
  message mentions ContentsHash.
- Drop spurious cloneRoleIDs in drop-then-create path (ako)
  consumeDroppedMicroflow already removed the cache entry, so
  the caller owns the slice — no need to clone again.
- Add mdl-examples/bug-tests/258-drop-create-microflow-unitid-preservation.mdl
  Bug reproducer script as required by CLAUDE.md checklist.
@hjotha
Copy link
Copy Markdown
Contributor Author

hjotha commented Apr 22, 2026

Thanks @ako — addressed in 9767368:

  1. ContentsHash fallback narrowed — added isContentsHashSchemaError helper that checks the error message for "ContentsHash". Only that specific schema mismatch triggers the retry now; disk-full, invalid UnitID, rolled-back-tx errors propagate unchanged.
  2. Spurious cloneRoleIDs droppedconsumeDroppedMicroflow removes the cache entry, so the caller owns the slice; no second clone needed. Added a comment pointing this out.
  3. Bug-test script addedmdl-examples/bug-tests/258-drop-create-microflow-unitid-preservation.mdl reproduces the DROP+CREATE OR MODIFY scenario end-to-end so Studio Pro verification stays in-tree.

On the double-SQL performance note: not addressed in this fixup — the fallback only fires on truly old v1 schemas that are rare in practice, and caching the schema capability at transaction start adds state I'd rather not introduce in a bug-fix PR. Happy to do it as a follow-up if you want me to file an issue.

@github-actions
Copy link
Copy Markdown

AI Code Review

What Looks Good

  • The PR correctly identifies and fixes two specific MPR v1 corruption issues that cause Studio Pro to reject projects
  • Changes are minimal and focused: microflow UnitID preservation logic and hash persistence fixes
  • Comprehensive test coverage added:
    • New bug test demonstrating the DROP/CREATE issue
    • Unit tests verifying UnitID preservation and role conservation
    • Unit tests validating hash persistence for v1 schemas (including backward compatibility)
  • The fix properly handles edge cases:
    • Schema version differences (older v1 schemas without ContentsHash column)
    • Cache consumption to prevent ID reuse collisions
    • Proper cloning of role ID slices to avoid reference sharing
  • Code follows existing patterns:
    • Uses the backend abstraction layer correctly (no direct sdk/mpr imports in executor)
    • Maintains deterministic behavior through proper cache management
    • Error handling is appropriate and well-documented
  • Test files are well-structured and descriptive
  • No MDL syntax changes were needed (this is a pure bug fix), so syntax design and full-stack consistency checks don't apply

Recommendation

Approve. The PR resolves critical corruption issues with appropriate fixes and tests, maintains code quality standards, and follows the project's contribution practices. The changes are narrowly scoped to the reported problems without introducing unnecessary complexity.


Automated review via OpenRouter (Nemotron Super 120B) — workflow source

hjotha pushed a commit to hjotha/mxcli that referenced this pull request Apr 22, 2026
@ako
Copy link
Copy Markdown
Collaborator

ako commented Apr 22, 2026

Updated review — fix: preserve MPR v1 contents hash and UnitID across DROP+CREATE

All three concerns from the first review have been addressed. Here's a pass over the changes.

What was fixed

Narrow fallback () — and now only retry on the schema-absent error, not on any error. The check correctly matches both SQLite message forms ("table Unit has no column named ContentsHash" / "no such column: ContentsHash"). Real errors (disk full, invalid UnitID, rolled-back tx) propagate. ✅

** call removed from CREATE path** — The comment makes the ownership transfer explicit: consumeDroppedMicroflow removes the entry from the map so the caller owns the slice. No double-clone. ✅

Bug reproducer script258-drop-create-microflow-unitid-preservation.mdl plus scripts/run-mdl-tests.sh with explicit path validation. The shell script's [[ -f "$VAR" ]] || guards are the right fix for the silent-empty-sandbox risk. ✅


Minor observation — insertUnit fallback is still broad-catch

updateUnit now uses isContentsHashSchemaError, but insertUnit still retries on any error:

_, err := w.reader.db.Exec(`INSERT ... ContentsHash ...`, ...)
if err != nil {
    // Try old schema with Type column
    _, err = w.reader.db.Exec(`INSERT ... Type ...`, ...)
}

This is intentional — the INSERT fallback handles two structurally different schemas (ContentsHash+ContentsConflicts vs Type column), not just one missing column, so the pre-existing broad-catch is appropriate. A failed INSERT leaves no data at all rather than stale data, so the risk profile is different from the UPDATE case. This is fine as-is; just noting the asymmetry is by design.


Test coverage

sdk/mpr/writer_units_test.go covers all three cases cleanly:

  • New v1 schema: hash populated on insert and correct after update
  • Old Type-column schema: insert and update succeed without ContentsHash

TestDropThenCreatePreservesMicroflowUnitID is a solid regression test: it simulates listedMicroflows = nil after deletion (exactly the bug condition), asserts the cache stash, asserts ID reuse, asserts role preservation, and asserts the cache entry is consumed. TestCreateOrModifyMicroflowPreservesAllowedRoles covers the "existing via list" path separately.


Verdict: LGTM. The fixes are targeted, the fallback is correctly narrowed, the test coverage is thorough, and the CLAUDE.md checklist items (bug-test script, narrow error handling, no spurious clone) are all satisfied.

@ako ako merged commit 2a5be7f into mendixlabs:main Apr 22, 2026
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants