feat(versioning): capture and expose version history for charts, dashboards, and datasets#39603
Draft
mikebridge wants to merge 44 commits into
mikebridge pushed a commit to mikebridge/superset that referenced this pull request Apr 23, 2026
Phase 1 of versioning added ``sqlalchemy-continuum==1.6.0`` to ``requirements/base.txt`` directly, but the pin was missing from ``pyproject.toml``'s ``[project.dependencies]``. CI's ``check-python-deps`` job regenerates the pinned files from the ``.in`` sources via ``scripts/uv-pip-compile.sh``; without the pyproject declaration, regeneration strips the pin out, causing:
    ModuleNotFoundError: No module named 'sqlalchemy_continuum'
…on every Python-based job (test-sqlite, test-postgres, test-mysql, unit-tests, test-postgres-hive, test-postgres-presto, test-load-examples, docker-build) because ``superset/extensions/__init__.py`` unconditionally imports from it at module load time.
Adds ``"sqlalchemy-continuum>=1.6.0, <2.0.0"`` to pyproject and re-runs ``uv-pip-compile.sh`` to sync ``base.txt`` and ``development.txt``. One package regenerates in place; the only other diffs are uv-resolver comment-graph updates (numpy's ``# via`` list) which CI's filter ignores.
Fixes CI failures on PR apache#39603.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Codecov Report
@@ Coverage Diff @@
## master #39603 +/- ##
==========================================
- Coverage 64.41% 64.00% -0.41%
==========================================
Files 2567 2578 +11
Lines 134411 135679 +1268
Branches 31203 31381 +178
==========================================
+ Hits 86584 86846 +262
- Misses 46330 47277 +947
- Partials 1497 1556 +59
Adds a scheduled Celery task that prunes version history older than ``SUPERSET_VERSION_HISTORY_RETENTION_DAYS`` (default 30; settable via env var; ``0`` disables retention entirely).
**Task** (``superset.tasks.version_history_retention.prune_old_versions``):
1. Computes ``cutoff = utcnow() - timedelta(days=N)``.
2. Selects ``version_transaction.id`` rows with ``issued_at < cutoff`` and filters out any tx whose parent shadow includes a live row (``end_transaction_id IS NULL``). The live row is the only preservation rule: closed historical rows, including the baseline (``operation_type=0``), age out. Per-entity minimum-history floor is an open question tracked in ``future-work.md``.
3. Deletes rows owned by surviving txs in each parent shadow table (``dashboards_version`` / ``slices_version`` / ``tables_version``).
4. Deletes child-shadow rows for the same transactions (``table_columns_version`` / ``sql_metrics_version`` / ``dashboard_slices_version``).
5. Drops the surviving ``version_transaction`` rows. The ``version_changes`` rows cascade via the FK from the previous commit.
Idempotent and safely retried on partial failure.
**Schedule**: ``superset/config.py`` adds the task to the default ``CeleryConfig.beat_schedule`` (nightly at 03:00). Operators who override ``CeleryConfig`` in their ``superset_config.py`` need to merge this entry; see UPDATING.md.
Also adds ``"expose_headers": ["ETag"]`` to the default ``CORS_OPTIONS`` so cross-origin browser clients can read the ``ETag`` header introduced in the next commit. (Co-located here because both touch ``superset/config.py``; the ETag mechanism itself ships in the next commit.)
**Auto-discovery**: ``superset/tasks/celery_app.py`` adds ``version_history_retention`` to its late-imports so Celery's auto-discovery picks up the task.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
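The selection rule in steps 1 and 2 can be sketched against a deliberately simplified schema. This is an illustration only: it uses the standard-library ``sqlite3`` module rather than Superset's ORM, models a single parent shadow table, and the column layout and the ``demo`` helper are assumptions, not the PR's actual code.

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def prune_old_versions(conn, retention_days, now=None):
    """Delete closed version history older than the retention window."""
    if retention_days <= 0:  # 0 disables pruning entirely
        return []
    now = now or datetime.now(timezone.utc)
    cutoff = (now - timedelta(days=retention_days)).isoformat()
    # Old transactions, minus any that still own a live shadow row.
    rows = conn.execute(
        """
        SELECT id FROM version_transaction
        WHERE issued_at < ?
          AND id NOT IN (
              SELECT transaction_id FROM slices_version
              WHERE end_transaction_id IS NULL
          )
        ORDER BY id
        """,
        (cutoff,),
    ).fetchall()
    tx_ids = [row[0] for row in rows]
    for tx_id in tx_ids:
        # Shadow rows first, then the transaction row itself.
        conn.execute("DELETE FROM slices_version WHERE transaction_id = ?", (tx_id,))
        conn.execute("DELETE FROM version_transaction WHERE id = ?", (tx_id,))
    return tx_ids


def demo():
    """Build a tiny in-memory history and prune it with a 30-day window."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE version_transaction (id INTEGER PRIMARY KEY, issued_at TEXT);
        CREATE TABLE slices_version (
            id INTEGER, transaction_id INTEGER, end_transaction_id INTEGER
        );
        """
    )
    now = datetime(2026, 5, 1, tzinfo=timezone.utc)
    old = (now - timedelta(days=60)).isoformat()
    recent = (now - timedelta(days=1)).isoformat()
    conn.executemany(
        "INSERT INTO version_transaction VALUES (?, ?)",
        [(1, old), (2, old), (3, recent)],
    )
    conn.executemany(
        "INSERT INTO slices_version VALUES (?, ?, ?)",
        [
            (10, 1, 3),     # chart 10: old row, closed by tx 3
            (10, 3, None),  # chart 10: recent live row
            (11, 2, None),  # chart 11: old row that is still live
        ],
    )
    return prune_old_versions(conn, 30, now=now), conn
```

In this toy history only transaction 1 is both older than the cutoff and fully closed, so it is the only one removed; transaction 2 is old but still backs a live row, which matches the "live row is the only preservation rule" wording.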
Helper module that derives the strong-validator ``ETag`` value from an entity's current live ``version_uuid`` and attaches it to a Flask response. Two functions:
- ``set_version_etag(response, version_uuid)``: direct path used by PUT handlers that already compute ``new_version_uuid`` (see the REST API commit two prior). Cheap; no extra query.
- ``set_version_etag_by_uuid(response, model_cls, entity_uuid)``: used by version endpoints that operate on ``entity_uuid``; looks up ``entity_id`` then derives ``version_uuid`` via ``VersionDAO``. Costs one extra ``SELECT id WHERE uuid = ?``; documented in the docstring so callers prefer the cheap variant when they have the id already.
Integration tests cover all three entity types and four endpoint shapes (entity GET, save PUT, version-list GET, single-version GET) plus the entity-with-no-versions edge case (header is correctly absent).
The ETag is wired into the API endpoints in the REST-API commit (group 3) and the CORS ``expose_headers: ["ETag"]`` ships with the retention commit (group 4) since both touch ``superset/config.py``.
Locking enforcement (``If-Match`` → 412) is explicitly NOT in this change; it is deferred to the follow-up UI SIP per Open Question §7. ``ETag`` is informational in v1.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
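A rough sketch of the cheap variant follows. The commit message does not show how the header value is formatted, so the quoted-UUID form below is an assumption (it follows the usual strong-validator syntax: quoted, no ``W/`` prefix), and ``FakeResponse`` is a hypothetical stand-in for a Flask response so the example runs without Flask installed.

```python
from dataclasses import dataclass, field
from typing import Optional
from uuid import UUID


@dataclass
class FakeResponse:
    """Hypothetical stand-in for a Flask response; only headers matter here."""

    headers: dict = field(default_factory=dict)


def set_version_etag(response, version_uuid: Optional[UUID]):
    """Attach a strong ETag derived from version_uuid; leave it absent if None."""
    if version_uuid is not None:
        # Assumed format: the UUID wrapped in double quotes, no weak prefix.
        response.headers["ETag"] = f'"{version_uuid}"'
    return response
```

Skipping the header when there is no version mirrors the "entity-with-no-versions" edge case the integration tests describe.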
Locks in the no-op-suppression behavior implemented by ``SkipUnmodifiedPlugin`` (which lives in ``superset/versioning/factory.py`` shipping with the foundation commit). Five integration tests:
1. Owners-only edit doesn't mint a version row: exercises the case where every dirty column is an excluded relationship.
2. Re-save with identical scalar values doesn't mint a row: exercises the json_metadata re-serialise path where ``set_dash_metadata`` rewrites the column to a different byte sequence with identical parsed content; the plugin must compare post-flush values against the prior shadow row to detect this.
3. Real scalar change DOES mint a row: guards against the plugin over-suppressing.
4. Same assertion on a Slice (covers the ``String`` column path on a different entity type).
5. ``json_metadata`` sub-key edit DOES mint a row: covers the ``MediumText`` column path past the plugin's content-equality check.
Tests are designed so a column-type change in the parent entities (e.g. flipping ``json_metadata`` from ``MediumText`` to ``JSON``) will fail one of these if the plugin's Python ``!=`` comparison breaks for the new type.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds debug-only ``VersionHistoryDropdown`` widgets to the chart,
dashboard, and dataset list pages so the version surface can be
exercised from the UI during the spike. Each row's actions column
gets a clock-icon dropdown that fetches ``/api/v1/{resource}/<uuid>/
versions/`` on click, lists the ten most recent versions with a
formatted change-log summary, and offers per-version restore via
``POST .../versions/<uuid>/restore``.
Strings are wrapped in ``t('...')`` with placeholder formatting
(e.g. ``t('Added %(kind)s "%(name)s"', { kind, name })``) so
translators can reorder verbs and nouns rather than concatenating
fragments. ``KIND_LABELS`` is a static map keying English layout
kinds (``chart``, ``row``, ``column``, ``tab``, ``markdown``, etc.)
to ``t(...)``-extractable labels. Empty change lists render as
"Baseline" rather than "No changes recorded" since the empty case
is overwhelmingly the ``operation_type=0`` baseline row.
Locale-aware date rendering: ``new Date(iso).toLocaleString(lang)``
where ``lang`` comes from ``document.documentElement.lang`` (set
by ``src/views/App.tsx`` from the bootstrap ``locale``), so dates
follow the user's chosen Superset locale rather than the browser's.
French translations for the new strings are appended to
``superset/translations/fr/LC_MESSAGES/messages.po`` (Ajouté,
Supprimé, Modifié, Version initiale, kind labels, …). Run
``npm run build-translation`` and ``pybabel compile -l fr`` to
regenerate the JSON / MO packs.
This commit is **demo-only** per ADR-005 (V1 is backend-only). It
is intentionally marked ``temp`` so it can be reverted before the
PR splits — the production V1 ships without UI.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The v1 import pipeline previously wrote dashboard ↔ chart membership via raw Core DML (``db.session.execute(delete(dashboard_slices)…)`` + ``db.session.execute(insert(dashboard_slices)…)``). With Continuum's M2M tracker enabled by the versioning feature, those Core writes emit malformed shadow INSERTs into ``dashboard_slices_version``: the tracker can't see the composite-PK columns through the Core layer and produces rows with only ``(transaction_id, operation_type)`` populated, triggering a ``NOT NULL`` violation on ``(dashboard_id, slice_id)``.
Rewrites both import paths (``ImportAssetsCommand._import`` in ``commands/importers/v1/assets.py`` and ``ImportDashboardsCommand._import`` in ``commands/dashboard/importers/v1/__init__.py``) to use ORM-level ``dashboard.slices = [...]`` reassignment followed by an explicit ``db.session.flush()``. The explicit flush is necessary to land the M2M rows before any subsequent autoflush fires an inner-flush event handler that would reset the relationship change (cf. the SAWarning ``Attribute history events accumulated on N previously clean instances within inner-flush event handlers have been reset``).
The unit tests previously called ``_import`` directly twice in the same session; production wraps ``run()`` in ``@transaction`` so each invocation gets its own DB+Continuum transaction. Added ``db.session.commit()`` between calls in ``test_import_adds_dashboard_charts``, ``test_import_removes_dashboard_charts``, and ``test_dashboard_import_with_overwrite_replaces_charts`` so the tests mirror production semantics; otherwise the second call's M2M shadow inserts conflict with the first call's on ``UNIQUE (dashboard_id, slice_id, transaction_id)``.
…ard.json_metadata
Continuum's no-op suppression compared post-flush column values byte-for-byte against the previous live shadow row. For ``Dashboard.json_metadata`` that produced false-positive version rows on saves where the user authored nothing: the frontend re-stamps ``map_label_colors`` (regenerated from the ``LabelsColorMap`` singleton) on every save, plus ``chart_configuration`` / ``global_chart_configuration`` / ``show_chart_timestamps`` / ``color_namespace`` (derived from the current chart set), so two consecutive identical saves produce different bytes for the column. The diff engine already excluded those keys via ``DASHBOARD_JSON_METADATA_AUDIT_KEYS`` when computing change records; the skip-plugin diverged.
Adds a ``_COLUMN_NORMALIZERS`` registry keyed on ``(class_name, column_name)`` that maps to a per-column normalizer applied to both pre- and post-image before equating. The first entry parses ``Dashboard.json_metadata`` as JSON and drops the audit-key set before comparing. The same registry is the extension point for analogous transient fields on charts and datasets.
Promotes ``_DASHBOARD_JSON_METADATA_AUDIT_KEYS`` to a public name (``DASHBOARD_JSON_METADATA_AUDIT_KEYS``) so the skip-plugin can import it from ``superset.versioning.diff`` without reaching across a leading-underscore boundary.
Integration coverage: ``test_map_label_colors_only_change_does_not_create_version``.
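The normalizer registry might look roughly like the sketch below. The contents of the audit-key set are assumed from the keys this commit message names, and ``values_equal`` and ``_normalize_dashboard_json_metadata`` are illustrative helper names rather than functions confirmed by the PR.

```python
import json

# Assumed contents: the keys the commit message says the frontend re-stamps.
DASHBOARD_JSON_METADATA_AUDIT_KEYS = frozenset(
    {
        "map_label_colors",
        "chart_configuration",
        "global_chart_configuration",
        "show_chart_timestamps",
        "color_namespace",
    }
)


def _normalize_dashboard_json_metadata(raw):
    """Parse the column as JSON and drop transient keys before comparing."""
    try:
        parsed = json.loads(raw)
    except (TypeError, ValueError):
        return raw  # not parseable: fall back to comparing the raw value
    if not isinstance(parsed, dict):
        return parsed
    return {
        key: value
        for key, value in parsed.items()
        if key not in DASHBOARD_JSON_METADATA_AUDIT_KEYS
    }


# Registry keyed on (class_name, column_name), as described above.
_COLUMN_NORMALIZERS = {
    ("Dashboard", "json_metadata"): _normalize_dashboard_json_metadata,
}


def values_equal(class_name, column_name, before, after):
    """Compare pre- and post-image after applying any registered normalizer."""
    normalize = _COLUMN_NORMALIZERS.get((class_name, column_name), lambda v: v)
    return normalize(before) == normalize(after)
```

Comparing parsed dictionaries also makes key order and whitespace irrelevant, which is the same property the re-serialise test case relies on; columns without a registered normalizer keep the plain equality check.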
SQLAlchemy doesn't mark a parent as dirty when only its children (``TableColumn`` / ``SqlMetric`` on ``SqlaTable``) are modified. Continuum's UnitOfWork only creates operations for entities in ``session.dirty``, so a column-only edit produces shadow rows in ``table_columns_version`` but no parent shadow row in ``tables_version``. ``VersionDAO.list_versions`` queries the parent shadow, so the version dropdown is empty for child-only saves, exactly the failure mode reported when "I edited a column description but no version appeared."
Extends ``register_baseline_listener`` with a new before-flush hook ``_force_parent_dirty_on_child_change`` that walks the existing ``_child_to_parent_registry`` and ``attributes.flag_modified(parent, <first non-excluded versioned column>)`` whenever a versioned child is dirty / new / deleted but the parent's own scalars haven't been touched. The flag puts the parent in ``session.dirty`` so Continuum's UoW creates a parent UPDATE operation; the resulting shadow row's scalar columns mirror the previous version (only the children actually changed), and the row exists to anchor the transaction in the parent's version chain.
``SkipUnmodifiedPlugin._is_no_op_update`` is updated in this commit's predecessor to recognize the "scalars match but children dirty" case via ``_has_dirty_versioned_children`` so the forced parent UPDATE isn't skipped.
Integration coverage: ``test_dataset_column_edit_creates_parent_version``.
…ert restore
VersionDAO.restore_version previously called Continuum's Reverter once per relation in a split-revert loop with flush + expire between calls. That closed an autoflush race in the Reverter when multiple relations were reverted at once, but split one logical restore across multiple Continuum transactions, and once the change-records listener was wired up, the listener's tx-dedup guard skipped the second pass, silently dropping child-addition records from version_changes. A restore that re-added a calculated column would render as an empty "Baseline" entry in the dropdown.
Replaces the split-revert with a single ``target_version.revert(relations=relations)`` call wrapped in a new ``single_flush_scope(db.session)`` context manager (``superset/versioning/utils.py``). The context manager suppresses autoflush inside the block and issues one trailing flush on clean exit; on exception, the trailing flush is skipped so the session's normal rollback path handles cleanup.
Same autoflush window closed, one Continuum transaction instead of N, the change-records listener sees the complete shadow state in one after_flush pass.
The wrapper carries the full autoflush-race / cascade-add rationale in its docstring so the restore_version call site can be a short 6-line block referencing it.
Integration coverage: ``test_restore_emits_full_child_diff_in_one_transaction``.
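The behaviour described for ``single_flush_scope`` (autoflush off inside the block, one trailing flush on clean exit, no flush on exception) can be sketched as follows. The body is a guess at the shape, not the PR's implementation; ``FakeSession`` and ``run_scope`` are hypothetical helpers that mimic the two SQLAlchemy ``Session`` members used (``no_autoflush`` and ``flush``) so the sketch runs without a database.

```python
from contextlib import contextmanager


@contextmanager
def single_flush_scope(session):
    """Suppress autoflush inside the block, then flush exactly once."""
    with session.no_autoflush:
        yield session
    # Only reached on clean exit; an exception propagates past this line.
    session.flush()


class FakeSession:
    """Hypothetical stand-in exposing the Session members used above."""

    def __init__(self):
        self.autoflush = True
        self.flush_count = 0

    @property
    def no_autoflush(self):
        return self._no_autoflush()

    @contextmanager
    def _no_autoflush(self):
        previous, self.autoflush = self.autoflush, False
        try:
            yield self
        finally:
            self.autoflush = previous

    def flush(self):
        self.flush_count += 1


def run_scope(session, fail=False):
    """Return (autoflush seen inside the block, flushes issued afterwards)."""
    inside = None
    try:
        with single_flush_scope(session):
            inside = session.autoflush
            if fail:
                raise RuntimeError("revert failed")
    except RuntimeError:
        pass
    return inside, session.flush_count
```

Placing the flush after the ``with`` block rather than in a ``finally`` clause is what leaves the failure path to the session's normal rollback handling.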
…sion bug
The full-Continuum spike (ADR-004 revised) replaced the JSON-snapshot restore path with Continuum's native Reverter and removed the ``dataset_snapshots`` / ``dashboard_snapshots`` tables from the migration chain. Seven VersionDAO methods and two module-level helpers that read/wrote those tables stayed in the code anyway and went unused: dead code that looked live.
Worse, ``VersionDAO.get_version`` still read from ``dataset_snapshots`` in its SqlaTable branch. On any environment where the snapshot tables don't exist (current production behavior), ``GET /api/v1/dataset/<uuid>/versions/<version_uuid>/`` raised ``OperationalError``. The branch is rewritten to read column and metric state from Continuum's child shadow tables (``table_columns_version`` / ``sql_metrics_version``) via the existing ``_shadow_rows_valid_at`` helper.
Deleted:
- ``_deserialize_snapshot_value`` (module helper)
- ``_coerce_snapshot_list`` (module helper)
- ``RESTORE_EXCLUDE_FIELDS`` (constant, only referenced by deleted code and a docstring)
- ``VersionDAO._restore_dataset_children``
- ``VersionDAO._parse_slice_ids_json``
- ``VersionDAO._apply_dashboard_slices``
- ``VersionDAO._restore_dashboard_children``
- ``VersionDAO._apply_snapshot_children``
The corresponding ~17 unit tests in ``tests/unit_tests/daos/test_version_dao.py`` are removed alongside. Stale docstring references in ``versioning/changes.py`` and ``versioning/diff.py`` that pointed at the retired snapshot tables are also cleaned up.
Also strips an 8-line comment block in ``restore_version`` that duplicated the docstring of ``_stamp_audit_fields_for_restore``.
Net: −290 lines from ``daos/version.py``; a production-shape bug fixed; dead code that looked live is gone.
…store commands onto BaseRestoreVersionCommand
Two coupled clean-code review fixes:
(1) Rename ``VersionDAO._find_active_entity_by_uuid`` → ``find_active_by_uuid``. The leading-underscore + three ``# pylint: disable=protected-access`` suppressions in the restore commands were the smell of a wrongly-private API. The method is a perfectly reasonable public DAO operation; dropping the underscore removes the suppressions.
(2) Collapse ``RestoreChartVersionCommand``, ``RestoreDashboardVersionCommand``, ``RestoreDatasetVersionCommand`` onto a shared ``BaseRestoreVersionCommand`` (``superset/commands/version_restore.py``). The three classes were textbook copy-paste: identical except for the model class and three exception types. Each subclass now declares ``model_cls`` + ``not_found_exc`` + ``forbidden_exc`` and overrides ``run()`` with one ``@transaction(reraise=<failed_exc>)``-decorated line delegating to ``self._do_restore()``. ~80 lines per file → ~45 lines per file; one shared workflow instead of three drift sources.
The api.py imports of ``RestoreChartVersionCommand`` / ``RestoreDashboardVersionCommand`` / ``RestoreDatasetVersionCommand`` are unchanged; the public class names are preserved.
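The shape of that collapse can be illustrated with a stripped-down sketch. Only the attribute names ``model_cls``, ``not_found_exc``, ``forbidden_exc`` and the ``run()`` / ``_do_restore()`` split come from the commit message; the injected ``find_entity`` / ``can_write`` callables, ``FakeChart``, the exception classes, and ``outcome`` are hypothetical, and the ``@transaction`` decorator is omitted so the example stays self-contained.

```python
class FakeChart:
    """Hypothetical stand-in for a versioned model class."""


class ChartNotFoundError(Exception):
    pass


class ChartForbiddenError(Exception):
    pass


class BaseRestoreVersionCommand:
    """Shared restore workflow; subclasses only declare what differs."""

    model_cls = object
    not_found_exc = LookupError
    forbidden_exc = PermissionError

    def __init__(self, entity_uuid, version_uuid, find_entity, can_write):
        self._entity_uuid = entity_uuid
        self._version_uuid = version_uuid
        self._find_entity = find_entity  # assumed lookup hook
        self._can_write = can_write      # assumed permission hook

    def _do_restore(self):
        entity = self._find_entity(self.model_cls, self._entity_uuid)
        if entity is None:
            raise self.not_found_exc(self._entity_uuid)
        if not self._can_write(entity):
            raise self.forbidden_exc(self._entity_uuid)
        return {
            "model": self.model_cls.__name__,
            "entity_uuid": self._entity_uuid,
            "version_uuid": self._version_uuid,
        }


class RestoreChartVersionCommand(BaseRestoreVersionCommand):
    model_cls = FakeChart
    not_found_exc = ChartNotFoundError
    forbidden_exc = ChartForbiddenError

    def run(self):
        # In the PR this one-liner carries @transaction(reraise=<failed_exc>).
        return self._do_restore()


def outcome(find_entity, can_write):
    """Run the chart command and report the result or the exception name."""
    command = RestoreChartVersionCommand("e1", "v1", find_entity, can_write)
    try:
        return command.run()
    except Exception as exc:  # demo only: surface which exception was raised
        return type(exc).__name__
```

Keeping ``run()`` on each subclass lets the per-entity failure exception stay in the decorator argument while everything else lives in one place.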
… regen lockfile
DashboardList demo dropdown previously instructed the user to "Reload the page to see the change" after a restore. The URL the user returns to may still carry ``?native_filters_key=…`` / ``permalink_key`` / ``form_data_key`` from a prior session; those point at server-cached snapshots (in ``key_value`` and the filter-state cache) captured before the restore. On rehydration the cached state is merged on top of the restored ``json_metadata``, masking the rollback (e.g. dashboard-level colour-scheme restore appears not to take effect).
Replaces the alert + manual reload with a direct ``window.location.href`` navigation to ``/superset/dashboard/<uuid>/``, which drops all URL params and forces hydration from the freshly restored DB state.
Also regenerates ``package-lock.json`` to pick up the ``zod 4.4.1 → 4.4.3`` bump that master's ``package.json`` already reflects.
(``temp(versioning)`` prefix per the demo dropdown's status: this file is not part of V1 scope per ADR-005; the V2 UI SIP owns the actual restore UI surface.)
VersionDAO carried five distinct concerns under one class: UUID derivation, version metadata queries, change-record loading, single-version snapshot retrieval, and restore orchestration. Bob's "and" test (the clean-code review flagged this as the next structural fix after the dead-code purge) gives ~600 lines of "queries about versioned state of one entity AND the workflow that mutates it."
Splits the read and write sides into purpose-built modules:
- ``superset/versioning/queries.py``: UUID derivation (``VERSION_UUID_NAMESPACE``, ``derive_version_uuid``) + read-side helpers (``find_active_by_uuid``, ``current_version_number``, ``current_live_transaction_id``, ``current_live_version_uuid``, ``list_versions``, ``resolve_version_uuid``, ``get_version``, ``list_change_records_batch``). ~475 lines.
- ``superset/versioning/restore.py``: write-side (``restore_version``, ``_stamp_audit_fields_for_restore``, ``_RESTORE_RELATIONS``). ~140 lines. Depends only on ``queries.find_active_by_uuid`` and ``utils.single_flush_scope``.
- ``superset/daos/version.py``: collapsed to an ~85-line backward-compat façade that re-exports both modules under a single ``VersionDAO`` class via ``staticmethod`` aliases. The module also re-exports ``VERSION_UUID_NAMESPACE`` and ``derive_version_uuid`` at module level so the ~10 existing callers (api.py handlers, command classes, the ETag emitter, integration tests) don't have to change their imports. New code is encouraged to import from the sub-modules directly.
The functions themselves are unchanged byte-for-byte aside from internal call sites being rewritten from ``VersionDAO.foo`` to the bare function name (since they now live as module-level functions, not class methods).
One unit-test mock target moved: ``test_restore_version_returns_none_for_unknown_entity`` now patches ``superset.versioning.restore.find_active_by_uuid`` (the actual call site) instead of ``VersionDAO.find_active_by_uuid`` (which is now just an alias).
Each of the three modules now has one reason to change. When the sc-103157 soft-delete pass adds the ``deleted_at IS NULL`` filter to ``find_active_by_uuid``, it touches only ``queries.py``. When a per-entity-type restore Strategy replaces the string-keyed ``_RESTORE_RELATIONS`` dispatch, it touches only ``restore.py``.
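Why the mock target had to move can be seen in a single-file sketch of the façade pattern. The function bodies here are placeholders, not the PR's code; only the names ``find_active_by_uuid``, ``restore_version`` and ``VersionDAO`` are taken from the text, and ``restore_with_alias_patched`` is a hypothetical demo helper.

```python
from unittest import mock


# Module-level functions, standing in for queries.py / restore.py.
def find_active_by_uuid(entity_uuid):
    return {"uuid": entity_uuid}


def restore_version(entity_uuid, version_uuid):
    # Calls the module-level function directly, not the facade alias.
    entity = find_active_by_uuid(entity_uuid)
    if entity is None:
        return None
    return {"entity": entity["uuid"], "restored_to": version_uuid}


class VersionDAO:
    """Backward-compat facade: staticmethod aliases to the functions above."""

    find_active_by_uuid = staticmethod(find_active_by_uuid)
    restore_version = staticmethod(restore_version)


def restore_with_alias_patched():
    """Patch only the facade alias and show that restore is unaffected."""
    with mock.patch.object(VersionDAO, "find_active_by_uuid", return_value=None):
        return VersionDAO.restore_version("e1", "v1")
```

Patching ``VersionDAO.find_active_by_uuid`` replaces only the class attribute, so ``restore_version`` still finds the entity; a test that wants the "unknown entity" path has to patch the name at the call site's module instead.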
Cleanup pass from the SQLAlchemy + migration code review. Eight items,
all in the "warnings / suggestions" tier — no behaviour change visible
to the API, but each closes a real correctness, perf, or maintainability
concern surfaced in review.
baseline.py
- Delete unused ``_get_user_id`` (W1). The function wrapped a broad
``except Exception: # noqa: S110`` swallow that hid bugs; grep
confirmed no callers anywhere. The legitimate audit-field paths
(``row.get("changed_by_fk")`` etc.) already drive the
``version_transaction.user_id`` write.
- Batch ``_baseline_attached_slices`` from O(N) round-trips to
three queries (W2): one membership SELECT, one existing-shadow
SELECT, one bulk live-row SELECT for the missing ids. The previous
per-slice ``COUNT(*)`` + ``SELECT`` was a measurable first-save
hotspot on dashboards with many charts. Drops the now-unused
``_slice_has_shadow`` helper.
- Pick a stable column name for ``flag_modified`` in
``_force_parent_dirty_on_child_change`` (W3). ``uuid`` is on all
three versioned parent classes and excluded by none, so the
flagged attribute is deterministic across SQLAlchemy versions /
mapper-config orders instead of depending on
``versioned_column_properties(parent)[0]``. Falls back to the
first available column for forks that exclude ``uuid``.
changes.py
- Add ``Decimal`` handling to ``_jsonable`` (W4) — ``json.dumps``
rejects ``Decimal``, so any numeric column (e.g. ``SqlMetric.currency``
contents, or fork/plugin Decimal columns) would crash the bulk
insert. Stringify rather than ``float()`` to preserve precision;
the diff engine compares ``from_value`` / ``to_value`` by string
equality after this coercion so both sides round-trip identically.
queries.py
- Promote the inline ``{0: "baseline", 1: "update", 2: "delete"}``
dict to module-level ``_OP_TYPE_LABELS`` (W7). The literal was
duplicated across ``list_versions`` and ``get_version``; the third
caller is one bug fix away.
- Comment on ``resolve_version_uuid``'s Python-side ``derive_version_uuid``
loop (W8) — no portable SQL form for UUIDv5 across PostgreSQL /
MySQL / SQLite, iteration count is bounded by the retention
window. Flags the place to revisit if retention is ever disabled
(``=0``) on a heavily-edited entity.
migrations/2026-05-01_23-36 (composite-PK)
- Belt-and-braces guard in ``_downgrade_mysql_table`` (W6): asserts
``t.name in AFFECTED_TABLES`` before interpolating into the
backtick-quoted ALTER statements. The invariant was already
structurally implied (callers iterate ``AFFECTED_TABLES``), but
making it load-bearing means a future refactor can't slip an
arbitrary table name through.
(W5 was verified-no-change: grepped ``tests/`` for ``metadata.create_all``
callers that exercise versioning tables; none. The cascade-FK
gap on ``version_changes.transaction_id`` is already documented
in ``tests/integration_tests/versioning/change_records_tests.py:27-32``.)
62 versioning unit tests pass.
…t_version
After the SRP split (8c9cf36) put both functions in the same module ~150 lines apart, their overlap became visible: same JOIN of version_table → version_transaction → ab_user, same baseline-first ordering, same user-row → ``changed_by`` projection, same lookup ``_ENTITY_KIND_BY_CLASS_NAME.get(model_cls.__name__)``. About 30 lines of duplication.
Six small helpers extracted at the module top:
- ``_resolve_version_tables(model_cls)`` returns ``(ver_tbl, tx_tbl, user_tbl)``
- ``_version_with_tx_user_join(ver_tbl, tx_tbl, user_tbl)`` builds the join
- ``_baseline_first_ordering(ver_tbl)`` returns the order-by tuple
- ``_user_select_cols(user_tbl)`` returns the user-column list with ``user_id`` as the stable label (normalises the prior asymmetry where ``list_versions`` labelled it ``user_id`` and ``get_version`` labelled it ``_user_id`` to dodge a column-name collision; the ``user_id`` label collides with neither)
- ``_changed_by_from_row(row)`` projects user columns onto the API shape
- ``_entity_kind_for(model_cls)`` resolves the change-records taxonomy lookup
Both call sites get shorter and read what they do (build query / project user / build row) rather than how. Behavior unchanged; no test changes.
Also two small inline tidyings while in the file:
- Replace the ternary ``changes_by_tx = list_change_records_batch(...) if entity_kind else {}`` with an explicit two-line if-statement in both functions. The ternary buries the decision; the if-statement reads as one thought.
- Inline the one-shot ``meta_cols`` set declaration in ``get_version`` into the ``if col.name in {...}`` check that uses it three lines later.
Net: about 110 lines → about 80 lines across the two functions, plus a small helper section at the top.
baseline.py:_insert_baseline_row and changes.py:_read_pre_state both
issued the same "read a single row through ``session.connection()``
inside ``with session.no_autoflush:``" pattern. Same five-line block,
same intent ("read the pre-flush state without triggering the in-flight
edit's flush").
Promoted to ``superset.versioning.utils.read_row_outside_flush(session,
table, entity_id)``. Companion to ``single_flush_scope`` — they sit
next to each other in utils.py and frame the two directions of the
"don't autoflush mid-listener" pattern.
Returns ``dict[str, Any]`` (or ``None``) so callers can't accidentally
hold a cursor-bound ``RowMapping`` past the listener boundary. Both
call sites get shorter by ~5 lines.
Also picks up Decimal stringification in the changes.py docstring
update (was listed in the W4 commit but the docstring still said
"(datetime, UUID, bytes)" — now matches the implementation).
Behaviour unchanged. 96 unit tests pass.
…icle order)
Pure file shuffle, zero behaviour change. Reorders ``baseline.py`` so it
reads top-down by level of abstraction (newspaper-article rule): the
public entry point at the top, supporting helpers descending below.
Before: 14 private helpers, then ``register_baseline_listener`` at the
bottom. A reader opening the file met the leaf builders first and had
to accumulate context before finding the call site.
After (top-down):
- Entry point: ``register_baseline_listener`` + inner ``capture_baseline``
- High-level helpers used by ``capture_baseline``:
``_force_parent_dirty_on_child_change``,
``_collect_parents_to_baseline``,
``_child_to_parent_registry``,
``_version_table_for``,
``_shadow_row_count``,
``_insert_baseline_and_children``
- Mid-level builders:
``_insert_baseline_row``,
``_baseline_children_for_parent``
- Per-entity child handlers + their dispatch table:
``_baseline_dataset_children``,
``_baseline_dashboard_children``,
``_CHILD_BASELINE_HANDLERS``
- Leaf builders:
``_insert_child_baseline_rows``,
``_baseline_attached_slices``,
``_insert_synthetic_slice_baseline``
Three section-divider comments mark the abstraction levels. The
``_CHILD_BASELINE_HANDLERS`` dict literal stays after its referenced
handlers (module-level literals evaluate at import time and need names
already bound); a comment now flags this constraint.
Function bodies are byte-for-byte unchanged; ``git log -L`` on any
function shows only its relocation. 96 unit tests pass.
…phan version_transaction rows inline
Extends the existing docstring note ("the orphan is swept by retention")
with the reasoning behind not cleaning it up in the same flush. The
inline-delete is appealing in principle but would couple this plugin
to the change-records listener's buffer state via the ON DELETE
CASCADE on ``version_changes.transaction_id``: both listeners would
have to agree that the flush produced nothing before the version_transaction
row could be dropped safely. The orphan's ~40-byte storage cost +
retention's correct-by-construction handling (orphans have no parent
shadow, so they're never in the "preserve" set) make the coordination
overhead not worth it.
Captures the design decision in the file where the next reader will
look for it.
c2b6db7 to
9d5a459
added 7 commits
May 20, 2026 14:12
…/M3/M5)
Three small follow-ups surfaced by aminghadersohi's review of the SoftDeleteMixin PR (apache#39977) that apply equally here:
- H1: cache _child_to_parent_registry() with functools.cache. Called twice per save flush; mapping depends only on import-time model classes, so unbounded cache is the right shape (no invalidation).
- M5: tighten _CHILD_BASELINE_HANDLERS type from dict[str, Any] to dict[str, Callable[[Session, Any, int], None]] via a named alias. Mypy now catches a future broken handler signature.
- M3/M4: explain the inline-import pattern once in the module docstrings of baseline.py and changes.py. Both modules use pylint disable=import-outside-toplevel uniformly because they load during init_versioning() before mappers are configured; the per-callsite "why" comments would just repeat the same reason. Module-level explanation + a hint to comment unusual cases is the cleaner shape.
M6 (listener placement) doesn't apply: init_versioning() already runs inside init_app_in_ctx(). M8 (loose OpenAPI schema in */api.py docstrings) is real but its own change.
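The H1 and M5 follow-ups above can be illustrated with a minimal, standalone sketch. The model classes, the handler body, and the public (non-underscore) names are hypothetical stand-ins, and ``Any`` replaces SQLAlchemy's ``Session`` so the snippet runs without dependencies; it is not the PR's code.

```python
from collections.abc import Callable
from functools import cache
from typing import Any


# Hypothetical stand-ins for the ORM model classes.
class SqlaTable: ...
class TableColumn: ...
class SqlMetric: ...


BUILD_COUNT = 0  # only here to make the caching observable


@cache
def child_to_parent_registry() -> dict[type, type]:
    """Build the child-class -> parent-class mapping once per process.

    The mapping depends only on classes bound at import time, so an
    unbounded cache with no invalidation is sufficient.
    """
    global BUILD_COUNT
    BUILD_COUNT += 1
    return {TableColumn: SqlaTable, SqlMetric: SqlaTable}


# Named alias: a type checker can now reject a handler whose signature drifts.
# The commit describes (Session, Any, int); Any stands in for Session here.
ChildBaselineHandler = Callable[[Any, Any, int], None]


def baseline_dataset_children(session: Any, parent: Any, transaction_id: int) -> None:
    """Placeholder; a real handler would insert child baseline rows."""


# The dict literal sits after the handler it references, because module-level
# literals are evaluated at import time.
CHILD_BASELINE_HANDLERS: dict[str, ChildBaselineHandler] = {
    "SqlaTable": baseline_dataset_children,
}

first = child_to_parent_registry()
second = child_to_parent_registry()  # served from the cache
```

Both calls return the same dict object, so callers should treat the cached mapping as read-only.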
The force-parent-dirty listener was calling attributes.flag_modified on every parent reachable from a dirty child — including parents themselves in session.new (e.g. brand-new SqlaTable + brand-new TableColumns from POST /api/v1/dataset/). flag_modified rejects unloaded attributes, and a session.new SqlaTable's uuid (default=uuid4 fires at flush time) is unloaded until then. CI caught this with InvalidRequestError cascading into 422s across dataset creation / upload / Playwright dataset specs. The hook is only needed for the persistent-and-clean case (child edited, parent's own scalars untouched, dropdown otherwise empty). Anything in session.new will flush anyway; anything in session.dirty is already flagged; session.deleted shouldn't be touched. Short- circuit before the flag_modified call. Unblocks test-sqlite, test-mysql, test-postgres (previous), and playwright dataset specs.
…ions When one ORM flush touches multiple versioned entities (dashboard + slice + dataset all save at tx=X), each gets a shadow row sharing that tx. If only the dashboard is later edited at tx=Y, the dashboard row at tx=X is closed (end_tx=Y) while slice/dataset rows stay live at tx=X. Retention then preserves tx=X (slice/dataset are live there) and prunes tx=Y. The dashboard's closed row at tx=X survives step 1, then its end_transaction_id=Y trips the FK when step 2 deletes version_transaction row Y. Fix: extend the shadow-row delete to also match end_transaction_id IN tx_ids. Live rows have end_tx=NULL so they're never matched by either predicate. Closed rows that touch a pruned tx at either endpoint are pruned together — consistent with retention semantics (any tx in the row's lifespan is gone, so the row's chain is broken anyway). Unblocks test_retention_prunes_old_rows on sqlite, mysql, postgres.
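The FK failure and its fix can be reproduced in miniature with the standard-library ``sqlite3`` module. This is a sketch under assumptions: the two-table schema is heavily reduced, the three-transaction history is an illustrative variation on the scenario above (transaction 2 is the only one with no live row), and ``prune`` is a hypothetical helper rather than the PR's retention task.

```python
import sqlite3

SCHEMA = """
CREATE TABLE version_transaction (id INTEGER PRIMARY KEY);
CREATE TABLE dashboards_version (
    id INTEGER NOT NULL,
    transaction_id INTEGER NOT NULL REFERENCES version_transaction (id),
    end_transaction_id INTEGER REFERENCES version_transaction (id),
    PRIMARY KEY (id, transaction_id)
);
CREATE TABLE slices_version (
    id INTEGER NOT NULL,
    transaction_id INTEGER NOT NULL REFERENCES version_transaction (id),
    end_transaction_id INTEGER REFERENCES version_transaction (id),
    PRIMARY KEY (id, transaction_id)
);
"""
SHADOW_TABLES = ("dashboards_version", "slices_version")


def build_history() -> sqlite3.Connection:
    """tx 1 saves dashboard 10 and slice 20; tx 2 and tx 3 edit the dashboard."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO version_transaction (id) VALUES (?)", [(1,), (2,), (3,)])
    conn.executemany(
        "INSERT INTO dashboards_version VALUES (?, ?, ?)",
        [(10, 1, 2), (10, 2, 3), (10, 3, None)],  # closed, closed, live
    )
    conn.execute("INSERT INTO slices_version VALUES (20, 1, NULL)")  # live at tx 1
    conn.commit()
    return conn


def prune(conn: sqlite3.Connection, tx_ids: list[int], match_end_tx: bool) -> None:
    """Step 1: delete shadow rows. Step 2: delete the transaction rows."""
    marks = ",".join("?" for _ in tx_ids)
    predicate = f"transaction_id IN ({marks})"
    params = list(tx_ids)
    if match_end_tx:
        # The fix: also drop closed rows whose lifespan ends at a pruned tx.
        predicate += f" OR end_transaction_id IN ({marks})"
        params += list(tx_ids)
    for table in SHADOW_TABLES:
        conn.execute(f"DELETE FROM {table} WHERE {predicate}", params)
    conn.execute(f"DELETE FROM version_transaction WHERE id IN ({marks})", list(tx_ids))


old = build_history()
try:
    prune(old, [2], match_end_tx=False)
    old_outcome = "pruned"
except sqlite3.IntegrityError:
    old_outcome = "fk violation"  # row (10, 1, 2) still points at tx 2

new = build_history()
prune(new, [2], match_end_tx=True)
dashboards_left = new.execute(
    "SELECT id, transaction_id, end_transaction_id FROM dashboards_version"
).fetchall()
slices_left = new.execute(
    "SELECT id, transaction_id, end_transaction_id FROM slices_version"
).fetchall()
tx_left = [row[0] for row in new.execute("SELECT id FROM version_transaction ORDER BY id")]
```

Live rows carry ``end_transaction_id`` NULL, so neither predicate can match them; only closed rows that touch the pruned transaction at either endpoint are removed.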
- ruff: import sort + E501 reflow on the parent-state guard in baseline.py
- ruff format: function-signature collapse and join-chain reflow in queries.py
- auto-walrus: two ``entity_kind = …; if … is not None:`` patterns in queries.py converted to assignment-expressions
… catch
The previous attempt (d0520f6) was too aggressive: skipping when parent is in session.dirty/new/deleted bypassed the persistent-and-clean case the hook EXISTS for. Some upstream code paths put the dataset in session.dirty *before* this listener fires (API controllers touching audit fields, etc.), so the session-membership pre-check made us silently no-op on the very scenario the hook needs to handle. CI symptom: test_dataset_column_edit_creates_parent_version showed before=317, after=317 (parent shadow not written).
Restore the unconditional flag_modified and catch the specific InvalidRequestError that fires only for the session.new case (uuid default callable hasn't populated state yet). Other states fall through to the original behavior:
- persistent + clean → flag_modified succeeds, parent goes dirty, Continuum picks it up, SkipUnmodifiedPlugin keeps the row via _has_dirty_versioned_children. ✓
- persistent + dirty → flag_modified is harmless (already dirty).
- session.new → InvalidRequestError, skip (parent INSERTs anyway).
- session.deleted → flag_modified may or may not raise; if it does, we skip; if not, the delete dominates.
Should unblock test_dataset_column_edit_creates_parent_version, test_get_version_returns_historical_snapshot_with_children, and test_restore_with_column_edits_reverts_columns.
- factory.py: TID251 banned ``import json``; switch to ``from superset.utils import json`` (project convention).
- factory.py: ruff format reflow on _matches_previous_version.
- version_restore.py: ruff format collapse on restore_version call.
CI was pinning a different ruff version than my local uvx default; re-ran against ruff==0.9.7 (the version in requirements/development.txt) which surfaced these.
…rent-dirty flag_modified(parent, "uuid") was producing FK integrity failures via the column's BLOB/BINARY round-trip: SQLAlchemy logs the param as ``<memory at 0x…>`` and the UUID round-trip doesn't always match the in-memory value byte-for-byte. Symptom: in scenarios where the parent is already going to flush (Reverter applying historical state during restore, RLS test triggering autoflush during a query), our added ``uuid`` UPDATE column tripped the FK check. Pick ``description`` instead — plain Text column on all three versioned parent classes (Dashboard, Slice, SqlaTable), no TypeDecorator, no marshaling layer. Flagging it round-trips its current value safely. Fallback chain ``description → uuid → col_keys[0]`` keeps the original deterministic-pick property for forks/subclasses that excluded ``description``. Should unblock test_restore_applies_scalar_field and the test_rls_filter_alters_no_role_user_birth_names_query autoflush error.
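The ``description → uuid → col_keys[0]`` fallback chain is small enough to sketch directly. The function and constant names below are hypothetical, and ``col_keys`` is assumed to be a non-empty, deterministically ordered sequence of versioned column keys.

```python
from collections.abc import Sequence

# Preference order taken from the commit message: a plain Text column first,
# then the previous choice, then whatever column comes first.
PREFERRED_TOUCH_COLUMNS = ("description", "uuid")


def pick_touch_column(col_keys: Sequence[str]) -> str:
    """Return the column to flag as modified on the parent."""
    for name in PREFERRED_TOUCH_COLUMNS:
        if name in col_keys:
            return name
    # Forks or subclasses that excluded both preferred columns still get a
    # deterministic pick, provided col_keys has a stable order.
    return col_keys[0]


default_pick = pick_touch_column(["id", "uuid", "description"])
without_description = pick_touch_column(["id", "uuid"])
neither_present = pick_touch_column(["id", "table_name"])
```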
added 3 commits
May 21, 2026 15:55
…hanges
_force_parent_dirty_on_child_change was firing whenever ANY TableColumn or SqlMetric of the parent appeared in session.dirty / new / deleted, even when the child was there for non-content reasons:
- Lazy-load side effects when a relationship is touched
- M2M relationship-cascade artifacts (e.g. RLS setUp doing rls_entry.tables.extend([dataset]) triggers cascade behavior that pulls children into the session)
- AuditMixin auto-bumps from earlier code paths
- Reverter side passes during restore
Force-touching the parent in those cases produced an incidental UPDATE tables SET description=…, changed_on=…, changed_by_fk=… whose changed_by_fk value or autoflush ordering tripped FK integrity on some dialects. Symptoms:
- test_rls_filter_alters_no_role_user_birth_names_query → FK IntegrityError on autoflush during a query
- test_restore_applies_scalar_field → 422 "Dataset could not be updated" during restore
Fix: gate on Continuum's is_modified(child), which returns True only when a non-excluded versioned column on the child has SQLAlchemy attribute-history changes. New objects (session.new) and genuinely-modified rows still flag the parent; phantom-dirty rows do not.
The intended hook semantics ("child edit forces a parent shadow row") are preserved: a column-description edit through the dataset API still triggers is_modified True, still flags the parent. See test_dataset_column_edit_creates_parent_version.
Pre-commit (previous) flagged I001 unsorted-imports on the backward-compat façade. Two queries imports merged into one block (the aliased ``derive_version_uuid as _derive_version_uuid`` moves inline rather than living in its own block), and the restore-side names sorted: ``_RESTORE_RELATIONS``, ``_stamp_audit_fields_for_restore``, ``restore_version``. Pure mechanical reformatting; no behaviour change.
Previous fix (9c2391d) gated the force-parent-dirty hook on is_modified(child) for ALL session collections (dirty/new/deleted). That was over-restrictive: is_modified checks attribute history, and deletion is a state transition with no attribute history — so deleted children evaluated as not-modified and the parent wasn't flagged. The change-records listener then didn't see the deletion and no removal record was emitted. Symptom: test_restore_emits_full_child_diff_in_one_transaction failed expecting a column-removed change record after a restore that removed the column; instead only the parent's scalar fields appeared in observed paths. Refine: apply the is_modified filter ONLY to persistent rows in session.dirty. session.new (creation) and session.deleted (removal) are always real content changes by virtue of their session-collection membership — no is_modified check needed (and in deletion's case, the check returns the wrong answer).
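The refined gating rule reads more easily as code. This is a dependency-free sketch: the three collections stand in for ``session.dirty`` / ``session.new`` / ``session.deleted``, the ``is_modified`` callable stands in for Continuum's check, and the function name is hypothetical.

```python
from collections.abc import Callable, Iterable
from typing import Any


def children_with_content_changes(
    dirty: Iterable[Any],
    new: Iterable[Any],
    deleted: Iterable[Any],
    is_modified: Callable[[Any], bool],
) -> list[Any]:
    """Return the children whose change should flag the parent."""
    # Persistent rows can be "phantom dirty" (lazy loads, cascades, audit
    # bumps), so only these are filtered through the attribute-history check.
    changed = [child for child in dirty if is_modified(child)]
    # Creation and deletion are content changes by membership alone. Deletion
    # has no attribute history, so is_modified would give the wrong answer.
    changed.extend(new)
    changed.extend(deleted)
    return changed


# Illustrative data: strings stand in for TableColumn / SqlMetric instances.
really_edited = {"col_a"}
flagged = children_with_content_changes(
    dirty=["col_a", "col_phantom"],
    new=["col_new"],
    deleted=["col_removed"],
    is_modified=lambda child: child in really_edited,
)
```

Here ``col_phantom`` is dropped while the new and removed columns always pass through.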
SUMMARY
Adds backend plumbing to capture a version history for every save of a chart, dashboard, or dataset, and to expose that history via three new REST endpoints per entity (list, get, restore). No frontend in this PR. Second PR in the Versioning epic (sc-103156), depending on #39859 (composite-PK reshape on M2M association tables — sc-105349) and orthogonal to #39286 (sc-103157 soft-delete). See SIP-210 / issue #39492 for full design rationale.
🚧🚧🚧 This is still a draft/spike, not ready for final review. Branch contains two ``temp(*)`` commits (demo UI dropdowns + French i18n; URL-param stripping on restore navigation) that will revert before merge. 🚧🚧🚧
What changed:
Continuum wiring. Adds ``sqlalchemy-continuum`` as a base dependency. Wired in ``superset/extensions/__init__.py`` with the ``validity`` strategy; a custom ``VersionTransactionFactory`` renames the transaction table to ``version_transaction`` (the word ``transaction`` is reserved in several dialects) and a ``VersioningFlaskPlugin`` supplies the acting user via ``get_user_id()`` (not Flask-Login's ``current_user``) so CLI / Celery / JWT-auth API saves are all attributed correctly.
Six shadow tables, all Continuum-native:
- ``dashboards_version``, ``slices_version``, ``tables_version``
- ``table_columns_version``, ``sql_metrics_version``
- ``dashboard_slices_version``
Plus one ``version_transaction`` table (the per-flush "who/when/where" envelope) and one ``version_changes`` table (structured diff records, FK to ``version_transaction`` with ``ON DELETE CASCADE``).
Three endpoints per entity type (``/chart``, ``/dashboard``, ``/dataset``):
- ``GET /api/v1/<resource>/<uuid>/versions/``: list history (``version_number``, ``version_uuid``, ``issued_at``, ``changed_by``)
- ``GET /api/v1/<resource>/<uuid>/versions/<version_uuid>/``: one snapshot (scalar fields; plus ``columns`` / ``metrics`` for datasets, ``slices`` for dashboards)
- ``POST /api/v1/<resource>/<uuid>/versions/<version_uuid>/restore``: restore entity to that version
``<version_uuid>`` is a deterministic UUIDv5 (fixed namespace, derived from the entity UUID + Continuum transaction id). Stable across replicas and retention pruning: the same transaction always produces the same version uuid, so API consumers can cache references safely.
ETag headers (``ETag: W/"<version_uuid>"``) on all three GET endpoints + the live entity GET. Foundation for optimistic-locking enforcement on writes (Phase 2); not enforced in this PR.
Restore uses Continuum's native ``Reverter`` wrapped in a ``single_flush_scope`` context manager (suppresses autoflush inside the block, emits one trailing flush). The single-revert / single-flush shape was the spike outcome; earlier attempts at split-revert and JSON-snapshot tables were abandoned (see ``spike-continuum-restore.md`` and the revised ADR-004 in the spec folder).
Baseline capture. First save under versioning of an entity that pre-existed the migration inserts a synthetic ``operation_type=0`` row capturing the pre-edit state, attributed to the entity's existing ``changed_on`` / ``changed_by_fk``. Listener runs before Continuum's own ``before_flush`` so the baseline ``transaction_id`` is lower than the edit's (correct ordering).
No-op suppression. A ``SkipUnmodifiedPlugin`` marks Continuum Operations ``processed=True`` when post-flush column values are content-equal to the previous shadow row, including JSON-aware comparison for ``Dashboard.json_metadata`` that strips frontend-stamped audit sub-keys (``map_label_colors``, ``chart_configuration``, …) so saves that only re-stamp those don't pollute history.
Force-parent-dirty on child changes. A ``before_flush`` listener flags the versioned parent (``SqlaTable``) as dirty when only its versioned children (``TableColumn`` / ``SqlMetric``) changed, so child-only edits surface in the parent's version dropdown.
Structured change records. Every save writes per-field diff records to ``version_changes`` keyed to the same ``transaction_id``. Records carry ``kind`` / ``path`` / ``from_value`` / ``to_value``: the backbone for the Phase-2 UI's "Added column X" rendering, captured in V1 so the data is available from day one without a backfill.
Retention is time-based, run by a Celery beat task. ``SUPERSET_VERSION_HISTORY_RETENTION_DAYS`` (default ``90``; ``0`` or ``None`` disables versioning entirely). Deletes shadow rows older than the cutoff while preserving the live row regardless of age. ON DELETE CASCADE on ``version_changes.transaction_id`` keeps diffs in sync. No write-path overhead; the prune is asynchronous.
Composite PK reshape on M2M associations (sc-105349, PR #39859; required for Continuum's M2M tracker to populate ``dashboard_slices_version`` correctly). The PRs are intended to merge in order #39859 → this one; the migration is included on this branch because the rebased history depends on it.
Authorisation. Version endpoints reuse the resource's existing ``can_write`` permission. No new FAB permissions. Row-level access enforced via ``security_manager.raise_for_ownership(entity)`` in the restore command.
What is NOT versioned in v1 (see ``specs/sc-103156-entity-versioning/future-work.md``):
- ``position_json`` is versioned as an opaque blob (restored wholesale on dashboard restore); finer-grained layout versioning is Phase 2.
Coordination with #39286 (sc-103157 soft-delete): orthogonal in design; merge order can go either way. When sc-103157 merges, one small change hooks ``deleted_at`` into ``find_active_by_uuid()`` and the versioned models' Continuum ``exclude`` lists. Tracked as T043 in the spec.
BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
N/A (backend-only). No UI is wired to the new endpoints yet. The ``temp(*)`` commits add non-final demo dropdowns and i18n strings for manual testing only; they will revert before merge.
TESTING INSTRUCTIONS
Expect an ordered array with ``version_number``, ``version_uuid``, ``issued_at``, ``changed_by``. Response will include an ``ETag: W/"<version_uuid>"`` header for the most recent version.
Expect 200 + ETag header. A second request with ``If-None-Match: "<that-etag>"`` returns 304.
Expect 200. ``GET /api/v1/chart/$CHART_UUID`` should now reflect the restored state, and a new version row (the restore itself) appears in the version list.
Each change carries ``{kind, path, from_value, to_value}``.
Repeat for dashboards and datasets using ``/api/v1/dashboard/`` and ``/api/v1/dataset/``. Datasets exercise child shadows (columns/metrics); dashboards exercise the M2M shadow (slices).
Run the test suite:
    pytest tests/integration_tests/charts/version_history_tests.py \
        tests/integration_tests/dashboards/version_history_tests.py \
        tests/integration_tests/datasets/version_history_tests.py \
        tests/integration_tests/versioning/ -v
Asserts the three Success Criteria: list < 1 s, restore < 3 s, save p95 overhead < 50 ms.
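For readers checking the ETag steps by hand, here is a minimal sketch of how a deterministic version UUID and the conditional-GET check could fit together. Everything specific is an assumption: the namespace value is made up (the PR only says it is fixed), the ``"<entity_uuid>:<transaction_id>"`` name format is a guess, and the function names are hypothetical rather than the PR's helpers.

```python
import uuid
from typing import Optional

# Assumption: any fixed namespace works for the illustration; this one is invented.
VERSION_NAMESPACE = uuid.UUID("12345678-1234-5678-1234-567812345678")


def sketch_version_uuid(entity_uuid: uuid.UUID, transaction_id: int) -> uuid.UUID:
    """UUIDv5 over entity UUID + transaction id: same inputs, same output."""
    return uuid.uuid5(VERSION_NAMESPACE, f"{entity_uuid}:{transaction_id}")


def weak_etag(version_uuid: uuid.UUID) -> str:
    """Format the header value in the W/"<version_uuid>" shape."""
    return f'W/"{version_uuid}"'


def _opaque(tag: str) -> str:
    """Strip whitespace and the weak prefix, keeping the quoted opaque tag."""
    tag = tag.strip()
    return tag[2:] if tag.startswith("W/") else tag


def is_not_modified(if_none_match: Optional[str], current_etag: str) -> bool:
    """Weak comparison for If-None-Match; True means the server answers 304."""
    if not if_none_match:
        return False
    candidates = {_opaque(part) for part in if_none_match.split(",")}
    return "*" in candidates or _opaque(current_etag) in candidates


chart_uuid = uuid.UUID("0f8fad5b-d9cb-469f-a165-70867728950e")  # example value
version = sketch_version_uuid(chart_uuid, 42)
etag = weak_etag(version)
```

Because the UUID is derived rather than stored, two replicas computing it for the same entity and transaction agree without coordination, which is what makes the ETag safe to cache.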
ADDITIONAL INFORMATION
Migration list (in dependency order):
- ``2bee73611e32``: ``dashboard_slices`` + 7 other association tables (sc-105349 / #39859)
- ``56cd24c07170``: ``version_transaction`` + parent shadow tables (``dashboards_version``, ``slices_version``, ``tables_version``)
- ``e1f3c5a7b9d0``: ``version_changes`` (structured diff records, ON DELETE CASCADE FK to ``version_transaction``)
- ``f7a2b3c4d5e6``: child shadows (``table_columns_version``, ``sql_metrics_version``) + M2M shadow (``dashboard_slices_version``)
All migrations are additive on the pre-existing ``slices`` / ``dashboards`` / ``tables`` / child tables; no existing columns altered. The composite-PK migration (``2bee73611e32``) reshapes the M2M association tables; round-trip tested on PostgreSQL, MySQL, and SQLite, including the MySQL FK / ``AUTO_INCREMENT`` quirks that required raw SQL workarounds (commits ``56c36fde54``, ``65a3491861``).
Write cost per save:
- ``version_transaction``
- ``*_version`` parent shadow: … ``SkipUnmodifiedPlugin`` filtered the save; 1 if scalars changed
- ``version_changes``
No write-path retention overhead; pruning is asynchronous via Celery beat.
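Whether the parent shadow row is written depends on the no-op check, so a rough sketch of a JSON-aware comparison may help. It assumes only the two audit sub-keys named in the description (the real list is longer), uses the standard-library ``json`` for self-containment where the project would use ``superset.utils.json``, and the function names are hypothetical, not the plugin's code.

```python
import json
from typing import Any, Optional

# Assumption: only the two sub-keys named in the PR text; the actual set is longer.
AUDIT_SUBKEYS = frozenset({"map_label_colors", "chart_configuration"})


def normalised_metadata(raw: Optional[str]) -> Any:
    """Parse json_metadata and drop frontend-stamped audit sub-keys."""
    if not raw:
        return {}  # treat NULL / empty string like an empty object
    data = json.loads(raw)
    if not isinstance(data, dict):
        return data
    return {key: value for key, value in data.items() if key not in AUDIT_SUBKEYS}


def is_noop_save(previous: dict[str, Any], current: dict[str, Any]) -> bool:
    """True when the new column values are content-equal to the last shadow row."""
    for column in previous.keys() | current.keys():
        before, after = previous.get(column), current.get(column)
        if column == "json_metadata":
            before, after = normalised_metadata(before), normalised_metadata(after)
        if before != after:
            return False
    return True


last_shadow = {
    "dashboard_title": "Sales",
    "json_metadata": '{"color_scheme": "a", "map_label_colors": {"x": "#fff"}}',
}
restamped = {
    "dashboard_title": "Sales",
    "json_metadata": '{"map_label_colors": {"x": "#000"}, "color_scheme": "a"}',
}
recoloured = {
    "dashboard_title": "Sales",
    "json_metadata": '{"color_scheme": "b", "map_label_colors": {"x": "#fff"}}',
}
```

Comparing parsed objects rather than raw strings also makes key order irrelevant, as the ``restamped`` example shows.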
Performance:
The numbers below were captured before the ADR-004 reversal (JSON-snapshot → full Continuum). Architecture has changed since: child writes now go through Continuum shadows instead of ``dataset_snapshots`` / ``dashboard_snapshots`` JSON tables. Re-validation against the final architecture pending before review. Targets unchanged:
Harness: ``SUPERSET_PERF_VALIDATION=1 pytest tests/integration_tests/versioning/perf_validation_tests.py -v -s``.