feat(ssh-tunnelling): Setup SSH Tunneling Commands for Database Connections #21912

Merged
merged 98 commits into from
Jan 3, 2023
Changes from 9 commits
Commits
830a283
save
hughhhh Oct 19, 2022
2c1e736
create migration
hughhhh Oct 20, 2022
f78df83
created schema and rename
hughhhh Oct 21, 2022
d482df4
linting
hughhhh Oct 21, 2022
9edb581
fix encrpytions
hughhhh Oct 21, 2022
da27d8f
remove map tabl
hughhhh Oct 24, 2022
773a6c8
fix linting
hughhhh Oct 25, 2022
2f2dda2
add constraint
hughhhh Oct 25, 2022
fd0d7f2
add fk to migration
hughhhh Oct 25, 2022
158da8d
init
hughhhh Oct 26, 2022
face73f
update all the examples
hughhhh Oct 26, 2022
95d079e
change remaining bits
hughhhh Oct 26, 2022
d5926e3
add id
hughhhh Oct 27, 2022
f7a6a41
use factory instead
hughhhh Oct 28, 2022
30e380a
Merge branch 'master' of https://github.com/apache/superset into crea…
hughhhh Oct 28, 2022
87c0d79
Merge branch 'master' into ref-get-sqla-engine-2
hughhhh Oct 28, 2022
11b240b
fix confict
hughhhh Oct 28, 2022
1bfdbda
fix conflict
hughhhh Oct 28, 2022
4146d5a
setup return value for contextmanager
hughhhh Oct 31, 2022
f8b877d
add sshtunnel pip
hughhhh Oct 31, 2022
54fc147
updates test
hughhhh Nov 1, 2022
fdc6ca3
fix linting
hughhhh Nov 2, 2022
66c0801
renaming function
hughhhh Nov 3, 2022
1f9ec5e
fix test
hughhhh Nov 5, 2022
c698cf4
Merge branch 'ref-get-sqla-engine-2' of https://github.com/apache/sup…
hughhhh Nov 7, 2022
41bd19b
add schema to test_connection api
hughhhh Nov 7, 2022
8811a99
fix get engine to return contextmanager
hughhhh Nov 7, 2022
82d7532
why
hughhhh Nov 7, 2022
1f829ac
yerp
hughhhh Nov 7, 2022
d53d116
update typing
hughhhh Nov 7, 2022
752161d
update comment
hughhhh Nov 7, 2022
8c4b081
Merge branch 'ref-get-sqla-engine-2' of https://github.com/apache/sup…
hughhhh Nov 7, 2022
1a19a97
save
hughhhh Nov 8, 2022
58b9cce
save
hughhhh Nov 8, 2022
0ac6fb1
Merge branch 'master' of https://github.com/apache/superset into ref-…
hughhhh Nov 8, 2022
31f3c1d
fix pylint
hughhhh Nov 8, 2022
a0b30e6
Merge branch 'ref-get-sqla-engine-2' of https://github.com/apache/sup…
hughhhh Nov 8, 2022
e089a8d
last one
hughhhh Nov 8, 2022
81b2f88
Merge branch 'ref-get-sqla-engine-2' of https://github.com/apache/sup…
hughhhh Nov 8, 2022
45686b7
update naming on ssh tunnel
hughhhh Nov 9, 2022
7ce5836
Merge branch 'master' into ref-get-sqla-engine-2
hughhhh Nov 9, 2022
d9c8d0d
Merge branch 'ref-get-sqla-engine-2' of https://github.com/apache/sup…
hughhhh Nov 9, 2022
ec27b80
fix renaming
hughhhh Nov 10, 2022
65e3e29
fix renaming 2
hughhhh Nov 10, 2022
9fa9db5
Merge branch 'master' of https://github.com/apache/superset into crea…
hughhhh Nov 10, 2022
1a11ff4
oops
hughhhh Nov 10, 2022
3f0dae1
fix linting errors
hughhhh Nov 10, 2022
2777807
feat(ssh_tunnel): DAO Changes for SSH Tunnel (#22120)
Antonio-RiveroMartnez Nov 15, 2022
8ed02cd
fix merge conflicts
hughhhh Nov 16, 2022
6a68147
Merge branch 'create-sshtunnelconfig-tbl' of https://github.com/apach…
hughhhh Nov 16, 2022
6bd32e8
feat(ssh_tunnel): Delete command & exceptions (#22131)
Antonio-RiveroMartnez Nov 16, 2022
8a3ee35
Merge branch 'master' of https://github.com/apache/superset into crea…
hughhhh Nov 16, 2022
bc89194
Merge branch 'create-sshtunnelconfig-tbl' of https://github.com/apach…
hughhhh Nov 16, 2022
adb9451
fix indenting for superset/databases/commands/validate.py
hughhhh Nov 17, 2022
16d960b
change tablename
hughhhh Nov 17, 2022
d2ab4a6
feat(ssh_tunnel): DELETE SSH Tunnels API (#22153)
Antonio-RiveroMartnez Nov 17, 2022
fb2acd0
Revert "feat(ssh_tunnel): DELETE SSH Tunnels API" (#22156)
hughhhh Nov 17, 2022
4d807c9
feat(ssh_tunnel): Update command & exceptions (#22132)
Antonio-RiveroMartnez Nov 17, 2022
dc0c848
forgot server_port
hughhhh Nov 17, 2022
21fcdf0
bind_port + bind_host :)
hughhhh Nov 17, 2022
68cb75f
oops
hughhhh Nov 17, 2022
44ca56b
fix linting
hughhhh Nov 17, 2022
7e1461e
feat(ssh_tunnel): SSH Tunnel updates from Code Review (#22182)
Antonio-RiveroMartnez Nov 21, 2022
92e41f1
Merge branch 'master' of https://github.com/apache/superset into crea…
hughhhh Nov 22, 2022
6c59663
feat(ssh_tunnel): Create command & exceptions (#22148)
hughhhh Nov 22, 2022
466703a
Update schemas.py
hughhhh Nov 28, 2022
554de53
Merge branch 'master' of https://github.com/apache/superset into crea…
hughhhh Nov 30, 2022
4448739
chore(ssh-tunnel): create `contextmanager` for sql.inspect (#22251)
hughhhh Nov 30, 2022
bb78055
fix lint
hughhhh Nov 30, 2022
f507385
fix migrations
hughhhh Nov 30, 2022
45aa022
Merge branch 'master' of https://github.com/apache/superset into crea…
hughhhh Dec 1, 2022
3d3b71b
Merge branch 'master' of https://github.com/apache/superset into crea…
hughhhh Dec 1, 2022
86436b6
Revert "chore(ssh-tunnel): create `contextmanager` for sql.inspect (#…
hughhhh Dec 1, 2022
54a8d7f
debugging
hughhhh Dec 1, 2022
3f6afec
fix(ssh_tunnel): Address Base PR comments from peer review (#22306)
Antonio-RiveroMartnez Dec 5, 2022
7625566
fix pre-commit
hughhhh Dec 5, 2022
0578a8e
working changes
hughhhh Dec 6, 2022
ec20429
refactor bind_host and bind_port
hughhhh Dec 6, 2022
1f57d4a
refactor create flow for temp ssh tunnels
hughhhh Dec 7, 2022
ed19a3e
remove logger
hughhhh Dec 8, 2022
852c8bb
chore(ssh_tunnel): Add extra tests to SSHTunnel commands (#22372)
Antonio-RiveroMartnez Dec 8, 2022
be5c005
add flush to allow database.id to be populated
hughhhh Dec 8, 2022
948f748
Merge branch 'create-sshtunnelconfig-tbl' of https://github.com/apach…
hughhhh Dec 8, 2022
c636ce7
make sure to use inspector with context
hughhhh Dec 9, 2022
908896f
remove id and database_id
hughhhh Dec 12, 2022
e3ef835
uselist
hughhhh Dec 12, 2022
c5c50ed
feat(ssh-tunnel): ssh manager config + feature flag (#22201)
hughhhh Dec 15, 2022
06e115b
update kwarg function name
hughhhh Dec 15, 2022
13ed50d
chore(ssh-tunnel): Move SSHManager to extensions pattern (#22433)
hughhhh Dec 16, 2022
54d51e2
add flag to indicate ssh tunneling is enabled for this engine
hughhhh Dec 16, 2022
53eaa63
Update superset/migrations/versions/2022-10-20_10-48_f3c2d8ec8595_cre…
hughhhh Dec 19, 2022
8f8faff
Update superset/migrations/versions/2022-10-20_10-48_f3c2d8ec8595_cre…
hughhhh Dec 19, 2022
607c682
fix linting
hughhhh Dec 20, 2022
1ea0e8b
Merge branch 'master' of https://github.com/apache/superset into crea…
hughhhh Dec 22, 2022
7cc7bc8
fix requirements
hughhhh Dec 22, 2022
7c539d2
Merge branch 'master' of https://github.com/apache/superset into crea…
hughhhh Jan 3, 2023
394afc1
get df with get_raw_connection function
hughhhh Jan 3, 2023
9b09fc7
feat(ssh_tunnel): APIs for SSH Tunnels (#22199)
Antonio-RiveroMartnez Jan 3, 2023
66 changes: 66 additions & 0 deletions superset/databases/models.py
@@ -0,0 +1,66 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.

import sqlalchemy as sa
from flask_appbuilder import Model
from sqlalchemy.orm import backref, relationship
from sqlalchemy_utils import EncryptedType

from superset import app
from superset.models.core import Database
from superset.models.helpers import (
    AuditMixinNullable,
    ExtraJSONMixin,
    ImportExportMixin,
)

app_config = app.config


class SSHTunnelCredentials(
Member

@eschutho eschutho Oct 25, 2022

Nit, but naming-wise: the model contains credentials, so the model's name can be more generic, like just SSHTunnel. Open to other thoughts/ideas on that.

    Model, AuditMixinNullable, ExtraJSONMixin, ImportExportMixin
):
    """
    A ssh tunnel configuration in a database.
    """

    __tablename__ = "ssh_tunnel_credentials"

    id = sa.Column(sa.Integer, primary_key=True)
    database_id = sa.Column(sa.Integer, sa.ForeignKey("dbs.id"), nullable=False)
    database: Database = relationship(
        "Database",
        backref=backref("ssh_tunnel_credentials", cascade="all, delete-orphan"),
Member

this is a one-to-one, correct? If so, I think you'll need uselist=False here as well

Do we need to rename this to be ssh_tunnel in this file?

        foreign_keys=[database_id],
Member

I don't think you need this foreign_key here since you declared it in the database_id column
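
The two relationship comments above (uselist=False for a one-to-one, and the redundant foreign_keys argument) can be illustrated with a minimal sketch. The Database and SSHTunnel classes, the table names, and the ssh_tunnel backref name are stand-ins chosen for illustration, not the PR's final code, and the sketch assumes SQLAlchemy 1.4 or later.

```python
import sqlalchemy as sa
from sqlalchemy.orm import backref, declarative_base, relationship

Base = declarative_base()


class Database(Base):
    # Stand-in for superset.models.core.Database (assumption for this sketch).
    __tablename__ = "dbs"
    id = sa.Column(sa.Integer, primary_key=True)


class SSHTunnel(Base):
    # Hypothetical model name and table name, used only for illustration.
    __tablename__ = "ssh_tunnels"
    id = sa.Column(sa.Integer, primary_key=True)
    database_id = sa.Column(sa.Integer, sa.ForeignKey("dbs.id"), nullable=False)
    # No foreign_keys=[...] argument: with a single ForeignKey between the
    # two tables, SQLAlchemy infers the join condition on its own.
    database = relationship(
        "Database",
        # uselist=False makes Database.ssh_tunnel a scalar instead of a list,
        # which is what a one-to-one relationship calls for.
        backref=backref("ssh_tunnel", uselist=False, cascade="all, delete-orphan"),
    )


db = Database()
tunnel = SSHTunnel(database=db)
# The backref is populated in memory as soon as the relationship is set.
print(db.ssh_tunnel is tunnel)
```

Without uselist=False, the backref on Database would be a list, and callers would have to index into it even though at most one tunnel is expected per database.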

    )

    server_address = sa.Column(EncryptedType(sa.String, app_config["SECRET_KEY"]))
    server_port = sa.Column(EncryptedType(sa.String, app_config["SECRET_KEY"]))

Should this be sa.Column(EncryptedType(sa.Integer, app_config["SECRET_KEY"])) ?

    username = sa.Column(EncryptedType(sa.String, app_config["SECRET_KEY"]))

    # basic authentication
    password = sa.Column(
        EncryptedType(sa.String, app_config["SECRET_KEY"]), nullable=True
    )

    # password protected pkey authentication
    private_key = sa.Column(
        EncryptedType(sa.String, app_config["SECRET_KEY"]), nullable=True
Member

@eschutho eschutho Oct 25, 2022

This is more of a question than a suggestion, but I see that we're using encrypted_field_factory for database passwords and encrypted_extra. Is there any benefit to using the EncryptedType here instead?

    )
    private_key_password = sa.Column(
        EncryptedType(sa.String, app_config["SECRET_KEY"]), nullable=True
    )
20 changes: 20 additions & 0 deletions superset/databases/schemas.py
@@ -676,6 +676,26 @@ class EncryptedDict(EncryptedField, fields.Dict):
    pass


class DatabaseSSHTunnelCredentials(Schema):
    id = fields.Integer()
    database_id = fields.Integer()

    server_address = fields.String()
    server_port = fields.Integer()
    username = fields.String()

    # Basic Authentication
    password = fields.String(required=False)

    # password protected private key authentication
    private_key = fields.String(required=False)
    private_key_password = fields.String(required=False)

    # remote binding port
    bind_host = fields.String()
    bind_port = fields.Integer()


def encrypted_field_properties(self, field: Any, **_) -> Dict[str, Any]:  # type: ignore
    ret = {}
    if isinstance(field, EncryptedField):
@@ -0,0 +1,80 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
"""create_ssh_tunnel_credentials_tbl

Revision ID: f3c2d8ec8595
Revises: deb4c9d4a4ef
Create Date: 2022-10-20 10:48:08.722861

"""

# revision identifiers, used by Alembic.
revision = "f3c2d8ec8595"
down_revision = "deb4c9d4a4ef"

from uuid import uuid4

import sqlalchemy as sa
from alembic import op
from sqlalchemy_utils import EncryptedType, UUIDType

from superset import app

app_config = app.config


def upgrade():
Member

I know that previously we had established the pattern that migration and logic should be in separate PRs. I thought that made sense, since it helps with reversing the migration in case something happens. Thoughts on doing that for this PR?

Member

IMO, that pattern is hard to work with -- it's hard to make a single PR with migrations that are later depended upon by other code as the migrations tend to evolve as you work on the feature. Also, encapsulating related changes in a single PR makes more sense as the revert would remove the DB migrations along with the code that's depending on them :)

Member

@craig-rueda I think we're always going to have this discussion... :)

Having migrations in separate PRs really helps when you need to cherry-pick a PR with a migration, since when that happens you also need to cherry-pick every PR that has a migration in between, regardless of what it is. If those intermediary PRs are harmless DB migrations the cherry-pick is much easier.

My recommendation has been: work on a single PR, since as you said the migrations tend to evolve during development. When the PR is ready, split it into 2. This of course assumes that the DB migration can live independently from the code (e.g., it adds a new table or a new column).

Member

@betodealmeida, I see your point, but I still think it's easier to reason about and easier to revert, like @craig-rueda said. Also, if it happens that you don't get things right the first time, you end up with a stream of DB migration PRs. I think it's easier to reason about 1 PR per feature (at least on the backend); follow-up PRs are fixes or frontend code. As always, there can be exceptions for really big features.

Just adding to the discussion :)
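
The cherry-pick concern raised in this thread can be sketched in plain Python. Alembic migrations form a chain through down_revision, so applying a later migration requires every revision between the deployed head and the target. Only the f3c2d8ec8595 / deb4c9d4a4ef pair below comes from this PR; the other revision IDs and the revisions_needed helper are hypothetical and exist only for illustration.

```python
# Each migration points at its parent through down_revision.
DOWN_REVISION = {
    "f3c2d8ec8595": "deb4c9d4a4ef",  # from this PR's migration file
    "aaaa00000001": "f3c2d8ec8595",  # hypothetical later migration
    "aaaa00000002": "aaaa00000001",  # hypothetical later migration
}


def revisions_needed(target, applied_head):
    """Return the revisions (oldest first) required to move from
    applied_head up to target by walking down_revision links."""
    chain = []
    rev = target
    while rev != applied_head:
        if rev is None:
            raise ValueError("applied_head is not an ancestor of target")
        chain.append(rev)
        rev = DOWN_REVISION.get(rev)
    return chain[::-1]


# Cherry-picking the PR that contains aaaa00000002 onto a release that is
# still at deb4c9d4a4ef drags in every migration in between.
needed = revisions_needed("aaaa00000002", "deb4c9d4a4ef")
print(needed)
```

If the intermediate revisions live in small, migration-only PRs, picking them up alongside the target is low risk; if they are bundled with feature code, each one brings that code with it.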

    op.create_table(
        "ssh_tunnel_credentials",
Member

Does this mean users will still be creating an ssh_tunnel_credentials table? Our tests are expecting ssh_tunnel, so they are failing.

        # AuditMixinNullable
        sa.Column("created_on", sa.DateTime(), nullable=True),
eschutho marked this conversation as resolved.
        sa.Column("changed_on", sa.DateTime(), nullable=True),
        sa.Column("created_by_fk", sa.Integer(), nullable=True),
        sa.Column("changed_by_fk", sa.Integer(), nullable=True),
        # ExtraJSONMixin
        sa.Column("extra_json", sa.Text(), nullable=True),
        # ImportExportMixin
        sa.Column("uuid", UUIDType(binary=True), primary_key=False, default=uuid4),
Member

should we put a uniqueness constraint on this column?

Member Author

@betodealmeida do we need to put a constraint on these? In the ImportExportMixin columns I didn't see a constraint being made, so I didn't feel like it was necessary.

Member

do you ever look up by uuid?

Member Author

Not right now, but this is mostly for export/import, in case we want to have SSH tunnels exported with the DB credentials.

hughhhh marked this conversation as resolved.
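
For context on the uniqueness question in this thread, here is a minimal sketch of what a unique constraint on a uuid column buys, using the standard-library sqlite3 module rather than Alembic. The ssh_tunnel_demo table, the index name, and the insert_tunnel helper are made up for illustration and are not part of the PR.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
# Hypothetical demo table; the uuid is stored as 16 raw bytes.
conn.execute("CREATE TABLE ssh_tunnel_demo (id INTEGER PRIMARY KEY, uuid BLOB)")
# The unique index rejects duplicate UUIDs and also serves lookups by uuid.
conn.execute(
    "CREATE UNIQUE INDEX uq_ssh_tunnel_demo_uuid ON ssh_tunnel_demo (uuid)"
)


def insert_tunnel(connection, tunnel_uuid):
    """Insert a row; return False if the uuid is already present."""
    try:
        connection.execute(
            "INSERT INTO ssh_tunnel_demo (uuid) VALUES (?)", (tunnel_uuid.bytes,)
        )
        return True
    except sqlite3.IntegrityError:
        return False


shared = uuid.uuid4()
first = insert_tunnel(conn, shared)      # accepted
duplicate = insert_tunnel(conn, shared)  # rejected by the unique index
row_count = conn.execute("SELECT COUNT(*) FROM ssh_tunnel_demo").fetchone()[0]
print(first, duplicate, row_count)
```

Whether the constraint is worth adding depends on whether rows are ever matched by uuid, for example during an import that should update an existing tunnel instead of creating a second one.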
        # SSHTunnelCredentials
        sa.Column("database_id", sa.INTEGER(), sa.ForeignKey("dbs.id"), nullable=True),
        sa.Column("server_address", EncryptedType(sa.String, app_config["SECRET_KEY"])),
        sa.Column("server_port", EncryptedType(sa.String, app_config["SECRET_KEY"])),
        sa.Column(
            "username",
            EncryptedType(sa.String, app_config["SECRET_KEY"]),
        ),
        sa.Column(
            "password",
            EncryptedType(sa.String, app_config["SECRET_KEY"]),
            nullable=True,
        ),
        sa.Column(
            "private_key",
            EncryptedType(sa.String, app_config["SECRET_KEY"]),
            nullable=True,
        ),
        sa.Column(
            "private_key_password",
            EncryptedType(sa.String, app_config["SECRET_KEY"]),
            nullable=True,
        ),
        sa.PrimaryKeyConstraint("id"),
    )


def downgrade():
    op.drop_table("ssh_tunnel_credentials")