-
Notifications
You must be signed in to change notification settings - Fork 12.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(ssh-tunnelling): Setup SSH Tunneling Commands for Database Connections #21912
Conversation
Codecov Report
@@ Coverage Diff @@
## master #21912 +/- ##
===========================================
- Coverage 66.91% 55.91% -11.01%
===========================================
Files 1851 1859 +8
Lines 70709 71016 +307
Branches 7766 7764 -2
===========================================
- Hits 47316 39709 -7607
- Misses 21371 29285 +7914
Partials 2022 2022
Flags with carried forward coverage won't be shown. Click here to find out more.
📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
superset/databases/models.py
Outdated
database_id = sa.Column(sa.Integer, sa.ForeignKey("dbs.id"), nullable=False) | ||
database: Database = relationship( | ||
"Database", | ||
backref=backref("ssh_tunnel_credentials", cascade="all, delete-orphan"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a one-to-one, correct? If so, I think you'll need uselist=False
here as well
superset/databases/models.py
Outdated
app_config = app.config | ||
|
||
|
||
class SSHTunnelCredentials( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit, but naming-wise, I feel like a model contains credentials, therefore the name of the model can be more generic, like just SSHTunnel
. Open to other thoughts/ideas on that.
superset/databases/models.py
Outdated
database: Database = relationship( | ||
"Database", | ||
backref=backref("ssh_tunnel_credentials", cascade="all, delete-orphan"), | ||
foreign_keys=[database_id], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think you need this foreign_key here since you declared it in the database_id column
superset/databases/models.py
Outdated
|
||
# password protected pkey authentication | ||
private_key = sa.Column( | ||
EncryptedType(sa.String, app_config["SECRET_KEY"]), nullable=True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is more of a question than a suggestion, but I see that we're using encrypted_field_factory
for database passwords and encrypted_extra. Is there any benefit to using the EncryptedType here instead?
This looks great @hughhhh. If you wanted to write a test, one scenario that would be useful to see would be the test connection where you would create instances of the db and ssh_tunnel without saving them and be able to pull information out of both of the models. |
app_config = app.config | ||
|
||
|
||
def upgrade(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know that previously we had established the pattern that migration and logic should be in separate PRs. I thought that made sense, since it helps with reversing the migration in case something happens. Thoughts on doing that for this PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO, that pattern is hard to work with -- it's hard to make a single PR with migrations that are later depended upon by other code as the migrations tend to evolve as you work on the feature. Also, encapsulating related changes in a single PR makes more sense as the revert would remove the DB migrations along with the code that's depending on them :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@craig-rueda I think we're always going to have this discussion... :)
Having migrations in separate PRs really helps when you need to cherry-pick a PR with a migration, since when that happens you also need to cherry-pick every PR that has a migration in between, regardless of what it is. If those intermediary PRs are harmless DB migrations the cherry-pick is much easier.
My recommendation has been: work on a single PR, since as you said the migrations tend to evolve during development. When the PR is ready, split it into 2. This of course assumes that the DB migration can live independently from the code (eg, it adds a new table or a new column).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@betodealmeida, I see your point, but still think it's easier to reason and easier to revert like @craig-rueda said. Also is happens you don't get things right the first time, then you end up with a stream of db mig PR's. I think it's easier to reason on 1 PR 1 feature (at least on the backend), follow up PR's are fixes or frontend code. Has always there can be exceptions on really big features.
Just adding to the discussion :)
I think it might be better to just add this model into the PR you're planning down the road that adds the SSH handling. It's a little hard to grok what's going on out of context. Also, you almost ALWAYS end up making model tweaks as you implement a new feature, which will require another migration 😬 |
IMHO the best approach is to write a single PR with logic + migration (to prevent the issues you raised), but then split it in two for review. Having self-contained migrations really helps with cherry-picking, because if you need to cherry-pick a PR that has a migration you have to also cherry-pick all previous PRs with migrations. |
superset/models/core.py
Outdated
database_id=self.id | ||
): | ||
# if ssh_tunnel is available build engine with information | ||
url = make_url_safe(self.sqlalchemy_uri_decrypted) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This block should be managed by a SSHManager
that is configured via app_config. It's currently not possible for other folks to update the behavior of the tunneling mechanism
superset/models/core.py
Outdated
@@ -66,6 +66,7 @@ | |||
from superset.utils.memoized import memoized | |||
|
|||
config = app.config | |||
ssh_manager = config["SSH_TUNNEL_MANAGER"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be exposed via extensions
.
c346ae0
to
06e115b
Compare
superset/migrations/versions/2022-10-20_10-48_f3c2d8ec8595_create_ssh_tunnel_credentials_tbl.py
Outdated
Show resolved
Hide resolved
…ate_ssh_tunnel_credentials_tbl.py Co-authored-by: Elizabeth Thompson <eschutho@gmail.com>
superset/migrations/versions/2022-10-20_10-48_f3c2d8ec8595_create_ssh_tunnel_credentials_tbl.py
Outdated
Show resolved
Hide resolved
…ate_ssh_tunnel_credentials_tbl.py Co-authored-by: Elizabeth Thompson <eschutho@gmail.com>
…te-sshtunnelconfig-tbl
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
Co-authored-by: hughhhh <hughmil3s@gmail.com>
Ephemeral environment shutdown and build artifacts deleted. |
SUMMARY
Related to #21789
Creating table to store ssh tunnel credentials. This a starter ticket for allowing using to enable ssh tunneling when trying to connect to a database.
BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
TESTING INSTRUCTIONS
ADDITIONAL INFORMATION