Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(ssh-tunnelling): Setup SSH Tunneling Commands for Database Connections #21912

Merged
merged 98 commits into from
Jan 3, 2023

Conversation

hughhhh
Copy link
Member

@hughhhh hughhhh commented Oct 21, 2022

SUMMARY

Related to #21789

Creating table to store ssh tunnel credentials. This a starter ticket for allowing using to enable ssh tunneling when trying to connect to a database.

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

TESTING INSTRUCTIONS

ADDITIONAL INFORMATION

  • Has associated issue:
  • Required feature flags:
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API

@codecov
Copy link

codecov bot commented Oct 24, 2022

Codecov Report

Merging #21912 (9b09fc7) into master (a7a4561) will decrease coverage by 11.00%.
The diff coverage is 72.78%.

@@             Coverage Diff             @@
##           master   #21912       +/-   ##
===========================================
- Coverage   66.91%   55.91%   -11.01%     
===========================================
  Files        1851     1859        +8     
  Lines       70709    71016      +307     
  Branches     7766     7764        -2     
===========================================
- Hits        47316    39709     -7607     
- Misses      21371    29285     +7914     
  Partials     2022     2022               
Flag Coverage Δ
hive 52.46% <51.77%> (+<0.01%) ⬆️
mysql ?
postgres ?
presto 52.36% <51.77%> (+<0.01%) ⬆️
python 58.15% <72.78%> (-23.12%) ⬇️
sqlite ?
unit 51.46% <71.89%> (+0.27%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
superset/constants.py 100.00% <ø> (ø)
superset/databases/commands/update.py 24.13% <20.00%> (-55.87%) ⬇️
superset/utils/ssh_tunnel.py 30.00% <30.00%> (ø)
superset/databases/commands/create.py 60.60% <35.71%> (-26.19%) ⬇️
superset/models/core.py 77.27% <51.06%> (-13.35%) ⬇️
superset/extensions/ssh.py 65.71% <65.71%> (ø)
superset/config.py 90.93% <66.66%> (-0.56%) ⬇️
superset/databases/api.py 53.63% <66.66%> (-39.91%) ⬇️
superset/databases/commands/test_connection.py 64.10% <75.00%> (-34.57%) ⬇️
superset/databases/ssh_tunnel/commands/create.py 87.75% <87.75%> (ø)
... and 314 more

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

database_id = sa.Column(sa.Integer, sa.ForeignKey("dbs.id"), nullable=False)
database: Database = relationship(
"Database",
backref=backref("ssh_tunnel_credentials", cascade="all, delete-orphan"),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a one-to-one, correct? If so, I think you'll need uselist=False here as well

app_config = app.config


class SSHTunnelCredentials(
Copy link
Member

@eschutho eschutho Oct 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, but naming-wise, I feel like a model contains credentials, therefore the name of the model can be more generic, like just SSHTunnel. Open to other thoughts/ideas on that.

database: Database = relationship(
"Database",
backref=backref("ssh_tunnel_credentials", cascade="all, delete-orphan"),
foreign_keys=[database_id],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you need this foreign_key here since you declared it in the database_id column


# password protected pkey authentication
private_key = sa.Column(
EncryptedType(sa.String, app_config["SECRET_KEY"]), nullable=True
Copy link
Member

@eschutho eschutho Oct 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is more of a question than a suggestion, but I see that we're using encrypted_field_factory for database passwords and encrypted_extra. Is there any benefit to using the EncryptedType here instead?

@eschutho
Copy link
Member

This looks great @hughhhh. If you wanted to write a test, one scenario that would be useful to see would be the test connection where you would create instances of the db and ssh_tunnel without saving them and be able to pull information out of both of the models.

app_config = app.config


def upgrade():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know that previously we had established the pattern that migration and logic should be in separate PRs. I thought that made sense, since it helps with reversing the migration in case something happens. Thoughts on doing that for this PR?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, that pattern is hard to work with -- it's hard to make a single PR with migrations that are later depended upon by other code as the migrations tend to evolve as you work on the feature. Also, encapsulating related changes in a single PR makes more sense as the revert would remove the DB migrations along with the code that's depending on them :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@craig-rueda I think we're always going to have this discussion... :)

Having migrations in separate PRs really helps when you need to cherry-pick a PR with a migration, since when that happens you also need to cherry-pick every PR that has a migration in between, regardless of what it is. If those intermediary PRs are harmless DB migrations the cherry-pick is much easier.

My recommendation has been: work on a single PR, since as you said the migrations tend to evolve during development. When the PR is ready, split it into 2. This of course assumes that the DB migration can live independently from the code (eg, it adds a new table or a new column).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@betodealmeida, I see your point, but still think it's easier to reason and easier to revert like @craig-rueda said. Also is happens you don't get things right the first time, then you end up with a stream of db mig PR's. I think it's easier to reason on 1 PR 1 feature (at least on the backend), follow up PR's are fixes or frontend code. Has always there can be exceptions on really big features.

Just adding to the discussion :)

@hughhhh hughhhh marked this pull request as ready for review October 27, 2022 19:41
@hughhhh hughhhh requested a review from a team as a code owner October 27, 2022 19:41
@craig-rueda
Copy link
Member

I think it might be better to just add this model into the PR you're planning down the road that adds the SSH handling. It's a little hard to grok what's going on out of context. Also, you almost ALWAYS end up making model tweaks as you implement a new feature, which will require another migration 😬

@betodealmeida
Copy link
Member

I think it might be better to just add this model into the PR you're planning down the road that adds the SSH handling. It's a little hard to grok what's going on out of context. Also, you almost ALWAYS end up making model tweaks as you implement a new feature, which will require another migration 😬

IMHO the best approach is to write a single PR with logic + migration (to prevent the issues you raised), but then split it in two for review. Having self-contained migrations really helps with cherry-picking, because if you need to cherry-pick a PR that has a migration you have to also cherry-pick all previous PRs with migrations.

database_id=self.id
):
# if ssh_tunnel is available build engine with information
url = make_url_safe(self.sqlalchemy_uri_decrypted)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This block should be managed by a SSHManager that is configured via app_config. It's currently not possible for other folks to update the behavior of the tunneling mechanism

@@ -66,6 +66,7 @@
from superset.utils.memoized import memoized

config = app.config
ssh_manager = config["SSH_TUNNEL_MANAGER"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be exposed via extensions.

…ate_ssh_tunnel_credentials_tbl.py

Co-authored-by: Elizabeth Thompson <eschutho@gmail.com>
…ate_ssh_tunnel_credentials_tbl.py

Co-authored-by: Elizabeth Thompson <eschutho@gmail.com>
Copy link
Member

@craig-rueda craig-rueda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Co-authored-by: hughhhh <hughmil3s@gmail.com>
@hughhhh hughhhh merged commit ebaad10 into master Jan 3, 2023
@github-actions
Copy link
Contributor

github-actions bot commented Jan 3, 2023

Ephemeral environment shutdown and build artifacts deleted.

@mistercrunch mistercrunch added 🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels 🚢 2.1.0 and removed 🚢 2.1.3 labels Mar 13, 2024
@mistercrunch mistercrunch deleted the create-sshtunnelconfig-tbl branch March 26, 2024 16:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels size/XXL 🚢 2.1.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

9 participants