fix(cli): encrypt sqlalchemy_uri password on import_datasources (#31983) #40584

Merged
rusackas merged 3 commits into master from tdd/issue-31983-import-datasources-encrypt-password
Jun 2, 2026

@rusackas
Member

@rusackas rusackas commented Jun 1, 2026

SUMMARY

Fixes #31983.

superset import_datasources stored sqlalchemy_uri as cleartext (including the password) instead of encrypting it. A workaround using the REST API to re-save connections post-import existed, but the CLI should handle this transparently.

Root cause:
The superset import_datasources -p file.yaml CLI command dispatches to the legacy v0 ImportDatasetsCommand, which calls Database.import_from_dict(database, sync=sync). The generic import_from_dict helper in superset/models/helpers.py sets sqlalchemy_uri directly on the model via setattr, bypassing Database.set_sqlalchemy_uri(). That method is the only place that extracts the password, stores it in the encrypted password column, and replaces the password in the URI with XXXXXXXXXX.

Fix:
After calling Database.import_from_dict, immediately call db_obj.set_sqlalchemy_uri(db_obj.sqlalchemy_uri) on the returned object. This re-runs the password-extraction and masking logic, matching the behaviour of the v1 ZIP import path (ImportDatabasesCommand) in superset/commands/database/importers/v1/utils.py.

The v1 ZIP import path is not affected — it already calls database.set_sqlalchemy_uri(sqlalchemy_uri) explicitly.
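
The root cause and fix above can be illustrated with a minimal sketch. It is not Superset's code: `ToyDatabase` and `import_from_dict_sketch` are hypothetical stand-ins, the "encrypted" column is a plain attribute, and the standard library's `urlsplit` stands in for whatever URI parsing the real `set_sqlalchemy_uri` performs.

```python
from urllib.parse import urlsplit, urlunsplit

PASSWORD_MASK = "X" * 10  # the XXXXXXXXXX mask quoted in the PR description


class ToyDatabase:
    """Hypothetical stand-in for the Database model (not the real class)."""

    def __init__(self) -> None:
        self.sqlalchemy_uri = ""
        self.password = None  # stands in for the encrypted password column

    def set_sqlalchemy_uri(self, uri: str) -> None:
        # Extract the password, store it separately, and mask it in the URI.
        parts = urlsplit(uri)
        self.password = parts.password
        if parts.password:
            userinfo, _, hostport = parts.netloc.rpartition("@")
            username = userinfo.partition(":")[0]
            masked = f"{username}:{PASSWORD_MASK}@{hostport}"
            uri = urlunsplit(parts._replace(netloc=masked))
        self.sqlalchemy_uri = uri


def import_from_dict_sketch(obj: ToyDatabase, data: dict) -> ToyDatabase:
    # Generic helper: plain setattr, so set_sqlalchemy_uri() is never invoked.
    for key, value in data.items():
        setattr(obj, key, value)
    return obj


db_obj = import_from_dict_sketch(
    ToyDatabase(), {"sqlalchemy_uri": "postgresql://user:secret-password@host/db"}
)
leaked_uri = db_obj.sqlalchemy_uri  # still carries the cleartext password

# The fix described above: re-run the setter on the imported object.
db_obj.set_sqlalchemy_uri(db_obj.sqlalchemy_uri)
```

After the final call, `db_obj.sqlalchemy_uri` holds the masked URI and `db_obj.password` holds the credential, which mirrors the BEFORE / AFTER states described in this PR.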

BEFORE / AFTER

Before: database.sqlalchemy_uri stored postgresql://user:secret-password@host/db — plaintext password visible in the metadata DB.

After: database.sqlalchemy_uri stores postgresql://user:XXXXXXXXXX@host/db and database.password (encrypted column) holds the real credential so connections continue to work.

TESTING INSTRUCTIONS

pytest tests/unit_tests/databases/commands/importers/v1/import_test.py::test_import_datasources_cli_encrypts_password -v

The test was introduced as a TDD regression test on this same branch. It previously failed (confirming the bug), and passes after the fix.

ADDITIONAL INFORMATION

Regression for #31983: the CLI import path must route passwords
through the same EncryptedType field used by the REST API,
not store them as cleartext.

Closes #31983

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@bito-code-review
Contributor

bito-code-review Bot commented Jun 1, 2026

Code Review Agent Run #4b81fc

Actionable Suggestions - 0
Filtered by Review Rules

Bito filtered these suggestions based on rules created automatically for your feedback. Manage rules.

  • tests/unit_tests/databases/commands/importers/v1/import_test.py - 1
Review Details
  • Files reviewed - 1 · Commit Range: fba6002..fba6002
    • tests/unit_tests/databases/commands/importers/v1/import_test.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

@codecov

codecov Bot commented Jun 1, 2026

Codecov Report

❌ Patch coverage is 57.14286% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 63.94%. Comparing base (a33fcb0) to head (ad3c45f).
⚠️ Report is 6 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| superset/commands/dataset/importers/v0.py | 57.14% | 1 Missing and 2 partials ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #40584      +/-   ##
==========================================
- Coverage   63.97%   63.94%   -0.03%     
==========================================
  Files        2654     2658       +4     
  Lines      142768   143017     +249     
  Branches    32837    32868      +31     
==========================================
+ Hits        91329    91456     +127     
- Misses      49878    49996     +118     
- Partials     1561     1565       +4     
| Flag | Coverage Δ |
|---|---|
| hive | 39.75% <28.57%> (+0.03%) ⬆️ |
| mysql | 58.40% <57.14%> (-0.03%) ⬇️ |
| postgres | 58.47% <57.14%> (-0.03%) ⬇️ |
| presto | 41.36% <28.57%> (+0.02%) ⬆️ |
| python | 59.96% <57.14%> (-0.04%) ⬇️ |
| sqlite | 58.13% <57.14%> (-0.03%) ⬇️ |
| unit | 100.00% <ø> (ø) |


The v0 CLI import path called import_from_dict which used setattr
to set sqlalchemy_uri directly, bypassing set_sqlalchemy_uri().
That method is the only place that extracts the password, stores
it in the encrypted password column, and masks the URI.

Fixes #31983

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@rusackas rusackas changed the title from "test(cli): import_datasources must encrypt sqlalchemy_uri passwords (#31983)" to "fix(cli): encrypt sqlalchemy_uri password on import_datasources (#31983)" Jun 1, 2026
@rusackas rusackas requested review from Vitor-Avila and Copilot June 1, 2026 20:20
Contributor

Copilot AI left a comment

Pull request overview

This PR fixes a security/privacy bug in the legacy superset import_datasources (YAML/v0) import path where database sqlalchemy_uri values were persisted with plaintext passwords, by re-applying Database.set_sqlalchemy_uri() after import and adding a regression unit test.

Changes:

  • Update the v0 dataset importer to call set_sqlalchemy_uri() after Database.import_from_dict() so embedded URI passwords are extracted to the encrypted password column and masked in sqlalchemy_uri.
  • Add a unit test regression for #31983 verifying passwords are masked in sqlalchemy_uri and stored in Database.password after YAML import.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

| File | Description |
|---|---|
| superset/commands/dataset/importers/v0.py | Ensures imported database URIs are re-processed to mask/extract passwords after legacy dict-based import. |
| tests/unit_tests/databases/commands/importers/v1/import_test.py | Adds regression coverage to prevent plaintext sqlalchemy_uri storage via the legacy YAML import path. |

Comment thread superset/commands/dataset/importers/v0.py Outdated
@bito-code-review
Contributor

This question isn’t related to the pull request. I can only help with questions about the PR’s code or comments.

When importing a URI with no password segment (the common pattern when
users keep secrets out of YAML), the importer previously called
set_sqlalchemy_uri unconditionally, which set password = None and
clobbered any encrypted password stored from a prior import. Guard the
call so it only runs when the parsed URI carries a non-empty, non-masked
password. Adds a regression test for the no-password case.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
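
The guard described in this commit message can be sketched as a small predicate. This is an illustration under assumptions, not the merged code: `needs_password_extraction` is a hypothetical name, and the standard library's `urlsplit` replaces whatever URI parser the importer actually uses.

```python
from urllib.parse import urlsplit

PASSWORD_MASK = "X" * 10  # the XXXXXXXXXX mask quoted in the PR description


def needs_password_extraction(uri: str) -> bool:
    """Return True only for a non-empty password that is not already the mask.

    Hypothetical helper: when it returns False, the importer would skip
    set_sqlalchemy_uri() and leave the stored encrypted password alone.
    """
    password = urlsplit(uri).password
    return bool(password) and password != PASSWORD_MASK


# Cleartext password in the YAML: extract and mask it.
with_secret = needs_password_extraction("postgresql://user:secret-password@host/db")
# No password segment: skip, so a previously stored password survives.
without_password = needs_password_extraction("postgresql://user@host/db")
# Already masked (for example, a re-imported export): skip as well.
already_masked = needs_password_extraction("postgresql://user:XXXXXXXXXX@host/db")
```

Treating the masked value as "no password" matters because passing it to the setter would otherwise store the literal mask as the credential.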
@pull-request-size pull-request-size Bot added size/L and removed size/M labels Jun 1, 2026
@bito-code-review
Contributor

bito-code-review Bot commented Jun 1, 2026

Code Review Agent Run #b0ef56

Actionable Suggestions - 0
Additional Suggestions - 1
  • superset/commands/dataset/importers/v0.py - 1
    • Consistent implementation with v1 utils · Line 228-230
      The conditional check at lines 228-230 prevents calling `set_sqlalchemy_uri` when the imported URI has no password or already carries the mask, preserving the encrypted `password` column from prior imports.
Review Details
  • Files reviewed - 2 · Commit Range: fba6002..ad3c45f
    • superset/commands/dataset/importers/v0.py
    • tests/unit_tests/databases/commands/importers/v1/import_test.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

@rusackas rusackas requested a review from sadpandajoe June 1, 2026 22:39
Contributor

@aminghadersohi aminghadersohi left a comment

All 20 automated scans run against HEAD SHA ad3c45fff117bc9cf07643fef185c49527f6a428. No blockers, high, medium, or nit findings.

The fix is correct and minimal — calls set_sqlalchemy_uri() (the single authoritative place for password extraction and masking) immediately after Database.import_from_dict(), matching the existing v1 ZIP import path in superset/commands/database/importers/v1/utils.py. The guard if parsed.password and parsed.password != PASSWORD_MASK correctly prevents clobbering an existing encrypted password when re-importing with a URI that has no password segment — the inline comment explains the invariant well.

Both new tests are behavioral: test_import_datasources_cli_encrypts_password fails without the fix, and test_import_datasources_cli_no_password_does_not_clobber_existing pins the correctness of the guard.
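
For readers without the repository at hand, the behavior those two tests pin can be approximated with a toy model. Everything here is an assumption for illustration: `import_database_sketch` is hypothetical, the stored row is a plain dict, and the real tests exercise the v0 import command rather than a helper like this one.

```python
from urllib.parse import urlsplit, urlunsplit

PASSWORD_MASK = "X" * 10  # the XXXXXXXXXX mask quoted in the PR description


def import_database_sketch(stored: dict, incoming_uri: str) -> dict:
    """Toy re-import over a dict with 'sqlalchemy_uri' and 'password' keys."""
    parts = urlsplit(incoming_uri)
    stored["sqlalchemy_uri"] = incoming_uri  # what a setattr-style import does
    # Guarded post-processing: only touch the password when one is supplied.
    if parts.password and parts.password != PASSWORD_MASK:
        stored["password"] = parts.password
        netloc = parts.netloc.replace(
            f":{parts.password}@", f":{PASSWORD_MASK}@", 1
        )
        stored["sqlalchemy_uri"] = urlunsplit(parts._replace(netloc=netloc))
    return stored


def test_encrypts_password() -> None:
    row = import_database_sketch(
        {"sqlalchemy_uri": "", "password": None},
        "postgresql://user:secret-password@host/db",
    )
    assert row["sqlalchemy_uri"] == "postgresql://user:XXXXXXXXXX@host/db"
    assert row["password"] == "secret-password"


def test_no_password_does_not_clobber_existing() -> None:
    row = {
        "sqlalchemy_uri": "postgresql://user:XXXXXXXXXX@host/db",
        "password": "secret-password",
    }
    import_database_sketch(row, "postgresql://user@host/db")
    assert row["password"] == "secret-password"
```

Removing the `if` guard and always overwriting `stored["password"]` makes the second toy test fail, which is the failure mode the guard was added to prevent.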

@rusackas rusackas merged commit c6faa50 into master Jun 2, 2026
62 checks passed
@rusackas rusackas deleted the tdd/issue-31983-import-datasources-encrypt-password branch June 2, 2026 05:08
Development

Successfully merging this pull request may close these issues.

superset import_datasources does not encrypt DB password

3 participants