fix(importexport): honor overwrite flag on /api/v1/assets/import #39502
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #39502 +/- ##
==========================================
- Coverage 63.88% 63.87% -0.02%
==========================================
Files 2583 2583
Lines 136602 136657 +55
Branches 31501 31514 +13
==========================================
+ Hits 87274 87284 +10
- Misses 47812 47856 +44
- Partials 1516 1517 +1
Code Review Agent Run #29c7ea
Actionable Suggestions - 1
- superset/commands/importers/v1/assets.py - 1
  - Inefficient UUID query · Line 237-239
Review Details
- Files reviewed - 4 · Commit Range: 4c8ce19..74c5ade
  - superset/commands/importers/v1/assets.py
  - superset/importexport/api.py
  - tests/unit_tests/commands/importers/v1/assets_test.py
  - tests/unit_tests/importexport/api_test.py
- Files skipped - 0
- Tools
  - Whispers (Secret Scanner) - ✔︎ Successful
  - Detect-secrets (Secret Scanner) - ✔︎ Successful
  - MyPy (Static Code Analysis) - ✔︎ Successful
  - Astral Ruff (Static Code Analysis) - ✔︎ Successful
    existing_uuids = {
        str(uuid) for (uuid,) in db.session.query(model_cls.uuid).all()
    }
The _prevent_overwrite_existing_assets method queries all UUIDs for each model, which can be inefficient with many existing assets. Optimize by filtering the query to only check UUIDs present in the import bundle.
Code Review Run #29c7ea
The assets import endpoint previously ignored the ``overwrite`` parameter and always overwrote existing assets. This threads an ``overwrite`` flag (default ``True`` for backwards compatibility) through ``ImportAssetsCommand`` to ``import_database``, ``import_saved_query``, ``import_dataset``, ``import_chart`` and ``import_dashboard``. When ``overwrite=false`` and any asset in the bundle already exists, the import now fails with a clear validation error listing the conflicting assets, matching the behavior of the per-resource import endpoints. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
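The guard described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `ValidationFailed`, `prevent_overwrite`, and the shape of `bundle` are hypothetical names chosen for the example, and only the message wording is taken from the PR description.

```python
class ValidationFailed(Exception):
    """Hypothetical error type carrying one message per conflicting asset."""

    def __init__(self, messages):
        super().__init__("; ".join(messages))
        self.messages = messages


def prevent_overwrite(existing_uuids, bundle, overwrite=True):
    """Raise if overwrite is disabled and any bundled asset already exists.

    `bundle` maps a file name to a (model name, uuid) pair; this layout is
    an assumption made for the sketch.
    """
    if overwrite:
        # Default behavior: overwriting is allowed, so nothing to validate.
        return
    messages = [
        f"{file_name}: {model} already exists and `overwrite=true` was not passed"
        for file_name, (model, uuid) in sorted(bundle.items())
        if uuid in existing_uuids
    ]
    if messages:
        raise ValidationFailed(messages)
```

Keeping the default at `True` means a caller that never passes the flag takes the early return and sees the old behavior.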
… parse_boolean_string
Address review feedback:
- Add ``"queries/": SavedQuery`` to ``_MODEL_BY_PREFIX`` so existing saved queries trigger a validation error when ``overwrite=false``. Previously ``import_saved_query`` would silently return the existing row, letting the endpoint appear to succeed despite the conflict.
- Use ``parse_boolean_string`` in the API instead of an ad-hoc ``.lower() == "true"`` check.
- Add tests for the saved-query prefix and for partial conflicts (some assets already exist, others are new).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
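To illustrate why a shared boolean parser is preferable to an ad-hoc string comparison, here is a small sketch. `parse_flag` is a hypothetical stand-in and is not Superset's `parse_boolean_string`; the set of accepted spellings and the `default` argument are assumptions made for the example.

```python
# Assumed set of truthy spellings; the real helper may accept a different set.
TRUTHY = {"1", "true", "t", "yes", "y", "on"}


def parse_flag(raw, default=True):
    """Parse a form-field string into a bool, falling back to `default` if absent."""
    if raw is None:
        return default
    return raw.strip().lower() in TRUTHY


def ad_hoc(raw):
    """The narrower check the commit replaces: only the literal 'true' counts."""
    return raw.lower() == "true"
```

With these definitions, `ad_hoc("1")` is `False` while `parse_flag("1")` is `True`, which is the kind of divergence a single shared parser avoids.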
74c5ade to 6c089ce
    existing_uuids = {
        str(uuid) for (uuid,) in db.session.query(model_cls.uuid).all()
    }
Suggestion: This validation does full-table UUID scans for every asset type (Database, SqlaTable, Slice, Dashboard, SavedQuery) on every import, even if the bundle contains only a few files. On large instances this can cause major latency and memory pressure. Build a per-prefix set of incoming UUIDs and query only matching rows with IN (...) instead of loading all UUIDs from each table. [possible bug]
Severity Level: Major ⚠️
- ⚠️ /api/v1/assets/import overwrite=false always scans all asset tables.
- ⚠️ Asset import latency grows linearly with total stored assets.
- ⚠️ Additional queries add memory pressure on metadata database.
Steps of Reproduction ✅
1. Call the bulk import API by POSTing to `/api/v1/assets/import/` (implemented in
`superset/importexport/api.py:95-201`) with a valid ZIP bundle and form field
`overwrite=false`, so that `ImportExportRestApi.import_` constructs
`ImportAssetsCommand(..., overwrite=False)` (`importexport/api.py:195-237`) and calls
`command.run()`.
2. Inside `ImportAssetsCommand.run` (`superset/commands/importers/v1/assets.py:205-215`),
`self.validate()` is invoked. `validate()` loads the bundle configs via `load_configs`
(`assets.py:59-21`) into `self._configs` and then calls
`_prevent_overwrite_existing_assets(exceptions)` (`assets.py:22`).
3. `_prevent_overwrite_existing_assets` (`assets.py:17-35`) first checks `if
self.overwrite: return` and, since `overwrite=False`, iterates over `_MODEL_BY_PREFIX`
(`assets.py:7-15`), which maps `"databases/"` → `Database`, `"datasets/"` → `SqlaTable`,
`"charts/"` → `Slice`, `"dashboards/"` → `Dashboard`, and `"queries/"` → `SavedQuery`.
4. For each prefix/model pair, it executes `db.session.query(model_cls.uuid).all()` and
builds `existing_uuids = {str(uuid) for (uuid,) in ...}` (`assets.py:27-30`), pulling
every UUID from each of the five tables into Python sets, regardless of how many files of
that type are actually present in `self._configs`. These full-table UUID scans run on
every import with `overwrite=false`, giving O(total stored assets) database and memory
work per import, rather than O(assets in the bundle).
Good catch — fixed in 38d6293. The validation now groups the bundle's (file_name, uuid) pairs by prefix in one pass over self._configs, and only issues one WHERE uuid IN (...) query per prefix that has entries. Prefixes with nothing in the bundle skip the database entirely, so cost scales with bundle size instead of the size of each asset table. Added test_prevent_overwrite_queries_only_bundle_uuids to lock in the behavior (asserts only the relevant model is queried for a single-prefix bundle).
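The approach described in this reply can be sketched as follows. This is not the code from 38d6293: it uses the standard-library `sqlite3` module as a stand-in for the SQLAlchemy session, and `TABLE_BY_PREFIX`, `find_conflicts`, `demo`, and the table names are hypothetical and chosen only for illustration.

```python
import sqlite3
from collections import defaultdict

# Illustrative prefix -> table mapping; the PR maps prefixes to ORM models.
TABLE_BY_PREFIX = {
    "databases/": "dbs",
    "datasets/": "tables",
    "charts/": "slices",
    "dashboards/": "dashboards",
    "queries/": "saved_query",
}


def find_conflicts(conn, configs):
    """Return bundle file names whose UUID already exists in the database."""
    # One pass over the bundle: group (file_name, uuid) pairs by prefix.
    by_prefix = defaultdict(list)
    for file_name, config in configs.items():
        for prefix in TABLE_BY_PREFIX:
            if file_name.startswith(prefix) and "uuid" in config:
                by_prefix[prefix].append((file_name, str(config["uuid"])))
                break

    conflicts = []
    # One IN (...) query per prefix that has entries; other tables are skipped.
    for prefix, pairs in by_prefix.items():
        table = TABLE_BY_PREFIX[prefix]
        placeholders = ", ".join("?" for _ in pairs)
        rows = conn.execute(
            f"SELECT uuid FROM {table} WHERE uuid IN ({placeholders})",
            [uuid for _, uuid in pairs],
        ).fetchall()
        existing = {row[0] for row in rows}
        conflicts.extend(name for name, uuid in pairs if uuid in existing)
    return sorted(conflicts)


def demo():
    """Build a tiny in-memory database and check a two-chart bundle against it."""
    conn = sqlite3.connect(":memory:")
    for table in TABLE_BY_PREFIX.values():
        conn.execute(f"CREATE TABLE {table} (uuid TEXT PRIMARY KEY)")
    conn.execute("INSERT INTO slices VALUES ('chart-1')")
    configs = {
        "charts/a.yaml": {"uuid": "chart-1"},
        "charts/b.yaml": {"uuid": "chart-2"},
        "metadata.yaml": {"version": "1.0.0"},
    }
    return find_conflicts(conn, configs)
```

In this sketch only the `slices` table is queried for the demo bundle, so the work depends on how many files the bundle contains and not on how many rows each table holds.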
The flagged issue is correct—the validation performs full-table UUID scans for all asset types on every import with overwrite=false, causing O(total stored assets) work instead of O(assets in bundle). To resolve, collect incoming UUIDs per prefix and query only matching rows with IN(...). I've implemented the concise fix below. No other comments found in this PR.
superset/commands/importers/v1/assets.py
Code Review Agent Run #3fc73c
Actionable Suggestions - 0
Before: ``_prevent_overwrite_existing_assets`` ran a full-table UUID scan against every asset model (``Database``, ``SqlaTable``, ``Slice``, ``Dashboard``, ``SavedQuery``) on every import with ``overwrite=false``, giving ``O(total stored assets)`` work per import regardless of how many files the bundle actually contains. Fix: collect the incoming UUIDs from ``self._configs`` per prefix, then issue one ``WHERE uuid IN (...)`` query per prefix that has entries — prefixes with no entries skip the database entirely. The cost now scales with the bundle size rather than with the size of the asset tables. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
✅ Deploy Preview for superset-docs-preview ready!
Code Review Agent Run #4d77f5
Actionable Suggestions - 0
…che#39502) Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
SUMMARY
The `/api/v1/assets/import` endpoint previously ignored the `overwrite` parameter and always overwrote existing assets. This PR threads an `overwrite` flag (defaulting to `true` for backwards compatibility) through `ImportAssetsCommand` to each of `import_database`, `import_saved_query`, `import_dataset`, `import_chart`, and `import_dashboard`, all of which were hard-coded to `overwrite=True`.

When `overwrite=false` and any asset in the bundle already exists, the import now fails with a clear validation error listing each conflicting asset (e.g. "Slice already exists and `overwrite=true` was not passed"), matching the behavior of the per-resource import endpoints (see `ImportModelsCommand._prevent_overwrite_existing_model`).

Because the default remains `True`, existing clients that omit the flag will see no behavior change.

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
N/A — backend-only change.
TESTING INSTRUCTIONS
1. Export a bundle with `GET /api/v1/assets/export/`.
2. Import the bundle (`overwrite=true`): succeeds and overwrites as before.
3. Import with `overwrite=false` while the assets exist: fails with a 422 and clear per-asset error messages.
4. Import with `overwrite=false` when none of the assets exist: succeeds.

Automated tests added under:
- `tests/unit_tests/commands/importers/v1/assets_test.py`: command-level behavior (default, flag threading, validation).
- `tests/unit_tests/importexport/api_test.py`: API plumbing for the new form field.

Run with:
ADDITIONAL INFORMATION