Remove insert-vs-alter transaction conflict check#2

Open
fuziontech wants to merge 1 commit into main from fix/allow-concurrent-insert-alter

@fuziontech
Member

Summary

  • Remove the overly conservative conflict check that prevents concurrent INSERT and ALTER TABLE operations on the same table
  • Add a test covering concurrent INSERT + ADD COLUMN and INSERT + RENAME COLUMN scenarios

Problem

Tools like sqlmesh perform schema evolution (ALTER TABLE ADD COLUMN, RENAME COLUMN) as part of model materialization while concurrent data pipelines INSERT into the same tables. This causes:

    Transaction conflict - attempting to insert into table with index "25667" - but another transaction has altered it

DuckLake retries internally (up to 100 times), but the conflict is permanent: the check always compares against the transaction's original snapshot, so every retry sees the same ALTER and fails the same way.
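
The retry behaviour can be illustrated with a toy model. This is a minimal sketch, not DuckLake's code: `try_commit`, `commit_with_retries`, and the integer snapshot ids are hypothetical names chosen for illustration. It assumes only what is stated above, namely that the check compares the latest ALTER against the transaction's original snapshot on every attempt.

```python
from typing import Tuple

# Toy model of the retry loop; all names are hypothetical, not DuckLake's API.


def try_commit(start_snapshot: int, last_alter_snapshot: int) -> bool:
    """Old-style check: fail if the table was altered after the transaction began."""
    return last_alter_snapshot <= start_snapshot


def commit_with_retries(start_snapshot: int, last_alter_snapshot: int,
                        max_retries: int = 100) -> Tuple[bool, int]:
    """Retry the commit and return (succeeded, attempts made)."""
    attempts = 0
    for _ in range(max_retries):
        attempts += 1
        # start_snapshot never advances between attempts, so the outcome
        # of the check is identical every time.
        if try_commit(start_snapshot, last_alter_snapshot):
            return True, attempts
    return False, attempts


# INSERT began at snapshot 5; a concurrent ALTER committed at snapshot 6.
print(commit_with_retries(start_snapshot=5, last_alter_snapshot=6))
```

In this model the loop burns through every attempt without any chance of success, which matches the "after exhausting all retries" failure reported here.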

Why this is safe

Each data file carries its own mapping_id (stored in ducklake_data_file) that describes the column layout at write time. The DuckLakeMultiFileReader::CreateMapping function looks up the file's mapping and creates the correct column mapping for that specific file, regardless of the current table schema.

When an INSERT commits after an ALTER:

  1. The INSERT's data files have a mapping_id describing the pre-ALTER schema
  2. The reader uses this mapping to correctly read the file (missing columns → NULL)
  3. The ducklake_column_mapping and ducklake_name_mapping tables preserve the mapping
  4. Schema versioning in ducklake_schema_versions tracks which version applies at each snapshot
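
The steps above can be sketched with a small in-memory model. This is an illustrative sketch only: the dictionaries, field ids, and the `read_file` helper are hypothetical stand-ins, not the layout of `ducklake_column_mapping` or the logic of `DuckLakeMultiFileReader::CreateMapping`. It assumes columns are matched through a stable field id recorded in the file's mapping, so a file written before ADD COLUMN or RENAME COLUMN still resolves against the current schema.

```python
from typing import Any, Dict, List

# Hypothetical per-file mappings: mapping_id -> {column name in file: field id}.
FILE_MAPPINGS: Dict[int, Dict[str, int]] = {
    1: {"id": 1, "name": 2},  # layout before any ALTER
}

# Hypothetical current schema after RENAME COLUMN name -> full_name
# and ADD COLUMN age. Maps field id -> current column name.
CURRENT_SCHEMA: Dict[int, str] = {1: "id", 2: "full_name", 3: "age"}


def read_file(rows: List[Dict[str, Any]], mapping_id: int) -> List[Dict[str, Any]]:
    """Project rows from one data file onto the current schema."""
    file_mapping = FILE_MAPPINGS[mapping_id]
    # Invert the mapping: field id -> column name as written in the file.
    name_in_file = {field_id: name for name, field_id in file_mapping.items()}
    result = []
    for row in rows:
        projected = {}
        for field_id, current_name in CURRENT_SCHEMA.items():
            source = name_in_file.get(field_id)
            # A column the file does not contain reads as NULL (None here).
            projected[current_name] = row[source] if source is not None else None
        result.append(projected)
    return result


# A file written by an INSERT planned against the pre-ALTER schema.
old_file_rows = [{"id": 1, "name": "ada"}]
print(read_file(old_file_rows, mapping_id=1))
```

The renamed column is found through its field id and the added column comes back as NULL, which is the behaviour the safety argument relies on.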

The insert-vs-drop conflict check is preserved since a dropped table is genuinely gone.
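
The resulting rule can be summarized in a few lines. This is a hedged sketch of the decision described in this PR, not the actual check in DuckLake; `insert_conflicts` and the change labels are hypothetical names.

```python
def insert_conflicts(concurrent_change: str) -> bool:
    """Return True if a pending INSERT must fail given a concurrent change.

    concurrent_change is a hypothetical label for what another transaction
    committed against the same table: "altered", "dropped", or "none".
    """
    if concurrent_change == "dropped":
        # The target table no longer exists, so the insert cannot commit.
        return True
    # "altered" no longer conflicts: per-file mappings keep old files readable.
    return False


for change in ("none", "altered", "dropped"):
    print(change, insert_conflicts(change))
```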

Test plan

  • New test transaction_insert_alter_no_conflict.test covering:
    • Concurrent INSERT + ADD COLUMN (both orderings)
    • Concurrent INSERT + RENAME COLUMN
    • Verification that data is readable with correct values after concurrent operations
  • Existing transaction conflict tests still pass (DROP, CREATE same name, etc.)

🤖 Generated with Claude Code

Concurrent INSERT and ALTER TABLE operations (e.g., ADD COLUMN, RENAME
COLUMN) on the same table should not conflict. Each data file carries
its own mapping_id that describes the column layout at write time. The
multi-file reader uses this mapping to correctly read files written
under older schemas, so an INSERT that was planned before an ALTER
can safely commit after it.

This is needed for tools like sqlmesh that perform schema evolution
(ALTER TABLE) on target tables while concurrent data pipelines INSERT
into those same tables. Previously, the INSERT would fail with
"Transaction conflict - attempting to insert into table - but another
transaction has altered it" after exhausting all retries.

The insert-vs-drop conflict check is preserved since a dropped table
is genuinely gone.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>