fix(cli): encrypt sqlalchemy_uri password on import_datasources (#31983)#40584
Conversation
Code Review Agent Run #4b81fcActionable Suggestions - 0Filtered by Review RulesBito filtered these suggestions based on rules created automatically for your feedback. Manage rules.
Review Details
Bito Usage GuideCommands Type the following command in the pull request comment and save the comment.
Refer to the documentation for additional commands. Configuration This repository uses Documentation & Help |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #40584 +/- ##
==========================================
- Coverage 63.97% 63.94% -0.03%
==========================================
Files 2654 2658 +4
Lines 142768 143017 +249
Branches 32837 32868 +31
==========================================
+ Hits 91329 91456 +127
- Misses 49878 49996 +118
- Partials 1561 1565 +4
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
The v0 CLI import path called import_from_dict which used setattr to set sqlalchemy_uri directly, bypassing set_sqlalchemy_uri(). That method is the only place that extracts the password, stores it in the encrypted password column, and masks the URI. Fixes #31983 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
This PR fixes a security/privacy bug in the legacy superset import_datasources (YAML/v0) import path where database sqlalchemy_uri values were persisted with plaintext passwords, by re-applying Database.set_sqlalchemy_uri() after import and adding a regression unit test.
Changes:
- Update the v0 dataset importer to call
set_sqlalchemy_uri()afterDatabase.import_from_dict()so embedded URI passwords are extracted to the encryptedpasswordcolumn and masked insqlalchemy_uri. - Add a unit test regression for #31983 verifying passwords are masked in
sqlalchemy_uriand stored inDatabase.passwordafter YAML import.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
superset/commands/dataset/importers/v0.py |
Ensures imported database URIs are re-processed to mask/extract passwords after legacy dict-based import. |
tests/unit_tests/databases/commands/importers/v1/import_test.py |
Adds regression coverage to prevent plaintext sqlalchemy_uri storage via the legacy YAML import path. |
|
This question isn’t related to the pull request. I can only help with questions about the PR’s code or comments. |
Importing a URI with no password segment — the common pattern when users keep secrets out of YAML — previously called set_sqlalchemy_uri unconditionally, which set password = None and clobbered any encrypted password stored from a prior import. Guard the call so it only runs when the parsed URI carries a non-empty, non-masked password. Adds a regression test for the no-password case. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Code Review Agent Run #b0ef56Actionable Suggestions - 0Additional Suggestions - 1
Review Details
Bito Usage GuideCommands Type the following command in the pull request comment and save the comment.
Refer to the documentation for additional commands. Configuration This repository uses Documentation & Help |
aminghadersohi
left a comment
There was a problem hiding this comment.
All 20 automated scans run against HEAD SHA ad3c45fff117bc9cf07643fef185c49527f6a428. No blockers, high, medium, or nit findings.
The fix is correct and minimal — calls set_sqlalchemy_uri() (the single authoritative place for password extraction and masking) immediately after Database.import_from_dict(), matching the existing v1 ZIP import path in superset/commands/database/importers/v1/utils.py. The guard if parsed.password and parsed.password != PASSWORD_MASK correctly prevents clobbering an existing encrypted password when re-importing with a URI that has no password segment — the inline comment explains the invariant well.
Both new tests are behavioral: test_import_datasources_cli_encrypts_password fails without the fix, and test_import_datasources_cli_no_password_does_not_clobber_existing pins the correctness of the guard.
SUMMARY
Fixes #31983.
superset import_datasourcesstoredsqlalchemy_urias cleartext (including the password) instead of encrypting it. A workaround using the REST API to re-save connections post-import existed, but the CLI should handle this transparently.Root cause:
The
superset import_datasources -p file.yamlCLI command dispatches to the legacy v0ImportDatasetsCommand, which callsDatabase.import_from_dict(database, sync=sync). The genericimport_from_dicthelper insuperset/models/helpers.pysetssqlalchemy_uridirectly on the model viasetattr, bypassingDatabase.set_sqlalchemy_uri(). That method is the only place that extracts the password, stores it in the encryptedpasswordcolumn, and replaces the password in the URI withXXXXXXXXXX.Fix:
After calling
Database.import_from_dict, immediately calldb_obj.set_sqlalchemy_uri(db_obj.sqlalchemy_uri)on the returned object. This re-runs the password-extraction and masking logic, matching the behaviour of the v1 ZIP import path (ImportDatabasesCommand) insuperset/commands/database/importers/v1/utils.py.The v1 ZIP import path is not affected — it already calls
database.set_sqlalchemy_uri(sqlalchemy_uri)explicitly.BEFORE / AFTER
Before:
database.sqlalchemy_uristoredpostgresql://user:secret-password@host/db— plaintext password visible in the metadata DB.After:
database.sqlalchemy_uristorespostgresql://user:XXXXXXXXXX@host/dbanddatabase.password(encrypted column) holds the real credential so connections continue to work.TESTING INSTRUCTIONS
The test was introduced as a TDD regression test on this same branch. It previously failed (confirming the bug), and passes after the fix.
ADDITIONAL INFORMATION
superset import_datasourcesdoes not encrypt DB password #31983