Skip to content

chore: switching out ConnectorRegistry references for DatasourceDAO#20149

Closed
hughhhh wants to merge 25 commits intoapache:masterfrom
hve-labs:connreg-to-datadao_2
Closed

chore: switching out ConnectorRegistry references for DatasourceDAO#20149
hughhhh wants to merge 25 commits intoapache:masterfrom
hve-labs:connreg-to-datadao_2

Conversation

@hughhhh
Copy link
Member

@hughhhh hughhhh commented May 20, 2022

SUMMARY

Switching out all references of ConnectorRegistry to DatasourceDAO. In the PR, we are focusing on just removing the references and moving towards allowing other Datasources to power charts.

One issue is the circular dependencies that are blocking us from fully removing the logic to allow the app to run. So i've moved that code out the ConnectorRegistry for now and will prioritize a bigger refactor to stop the import issues in superset moving forward.

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

TESTING INSTRUCTIONS

ADDITIONAL INFORMATION

  • Has associated issue:
  • Required feature flags:
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API

@hughhhh hughhhh changed the title Connreg to datadao 2 chore: switching out ConnectorRegistry references for DatasourceDAO May 20, 2022
datasource_type = request.args.get("datasource_type")
if datasource_id and datasource_type:
ds_class = ConnectorRegistry.sources.get(datasource_type)
ds_class = DatasourceDAO.sources.get(datasource_type)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use DatasourceType(datasource_type) here

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DatasourceDAO.sources should be able to take either a string or the DatasourceType instance, so I think either works here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

of course you'll get an error either way if the datasource_type is not valid.

datasource_id, datasource_type = request.args["datasourceKey"].split("__")
datasource = ConnectorRegistry.get_datasource(
datasource = DatasourceDAO.get_datasource(
datasource_type,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use DatasourceType(datasource.type)

database_id = datasource_dict["database"].get("id")
orm_datasource = ConnectorRegistry.get_datasource(
orm_datasource = DatasourceDAO.get_datasource(
datasource_type, datasource_id, db.session
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use DatasourceType(datasource.type)

# make temporary change and revert it to refresh the changed_on property
datasource = ConnectorRegistry.get_datasource(
datasource = DatasourceDAO.get_datasource(
datasource_type=payload["datasource"]["type"],
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use DatasourceType(datasource.type)


# make temporary change and revert it to refresh the changed_on property
datasource = ConnectorRegistry.get_datasource(
datasource = DatasourceDAO.get_datasource(
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use DatasourceType(datasource.type)

if id_ is None:
continue
datasource = ConnectorRegistry.get_datasource_by_id(session, id_)
# todo(phillip): get datasource type for this method
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove this comment

# pylint: disable=import-outside-toplevel
from superset.datasource.dao import DatasourceDAO

return DatasourceDAO.sources[self.datasource_type]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use DatasourceType(datasource_type)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

update DatasourceType(datasource_type)

@hughhhh hughhhh force-pushed the connreg-to-datadao_2 branch 3 times, most recently from c9cd671 to 6a85df3 Compare May 20, 2022 21:04
@hughhhh hughhhh force-pushed the connreg-to-datadao_2 branch from 66cdb1c to 95c5eb2 Compare June 5, 2022 16:50
@pull-request-size pull-request-size bot added size/XL and removed size/L labels Jun 5, 2022
@hughhhh hughhhh force-pushed the connreg-to-datadao_2 branch 5 times, most recently from 821da1c to ac2d8aa Compare June 5, 2022 19:04
@hughhhh hughhhh force-pushed the connreg-to-datadao_2 branch 2 times, most recently from efb667e to 051c26c Compare June 5, 2022 23:09
@hughhhh hughhhh force-pushed the connreg-to-datadao_2 branch from 051c26c to ec17f51 Compare June 5, 2022 23:24
@hughhhh hughhhh requested a review from eschutho June 5, 2022 23:36
@hughhhh
Copy link
Member Author

hughhhh commented Jun 6, 2022

I left a comment about moving sqlatable specific lookups to the model or a sqlatable dao

Decided to move all the functions into the model

"owner": owner,
"datasource_id": datasource_id,
"datasource_type": datasource_type,
"datasource_type": DatasourceType(datasource_type),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this one different from the above one?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No I am referring to this: https://github.com/apache/superset/pull/20149/files/b678a8ba981f7687c17bc68b4f529209bde2d961#diff-6029cd3077bd4123d18311935a0fef83a580ccf22f4a98b6085ed780fd7257e9R27

THis one has DatasourceType(datasource_type) the version above has DatasourceType

def get_datasource(
cls, session: Session, datasource_type: DatasourceType, datasource_id: int
) -> Datasource:
if datasource_type not in cls.sources:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we're going to want to accept the string value as well here.
I don't think "table" in cls.sources is going to work for example.

Copy link
Member Author

@hughhhh hughhhh Jun 7, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm currently casting all the values going into the get_datasource function like this:

DatasourceDAO.get_datasource(
  db.session, DatasourceType(datasource["type"]), int(datasource["id"])
)

So moving forward i think we should push for the enums since it's easy to cast via just wrapping it in a DatasourceType() constructor. If we allow string we'd have to add logic to handle/report whenever the user sending us invalid values.

I'm okay with putting the Union[str, DatasourceType] but don't think it takes much work to just wrap the string with the DatasourceType constructor before passing it into the function

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you could also just wrap source = cls.sources[datasource_type] in a try/catch block. I just think there's too much room for error here across the application with people who may not know that they can't pass in a string. This should work if datasource_type is either a string or the enum key.

set
)
session = self.get_session
all_datasources = SqlaTable.get_all_sqlatables_datasources(session)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, but any reason why you wouldn't name this method get_all_datasources?

Copy link
Member

@eschutho eschutho left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

most are small questions, but if datasource_type not in cls.sources: could cause problems if a string is passed in. I personally think we could allow both types, but we could prob go either way. It's just going to be difficult to enforce with typing.


def get_table_connector_registry() -> Any:
return ConnectorRegistry.sources["table"]
return DatasourceDAO.sources[DatasourceType.TABLE]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we use the SqlaTable model directly since this is the only option here?

cls, session: Session, datasource_type: DatasourceType, datasource_id: int
) -> Datasource:
if datasource_type not in cls.sources:
raise DatasourceTypeNotSupportedError()
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

notes: set this exception to return 422


@classmethod
def get_datasource(
cls, session: Session, datasource_type: DatasourceType, datasource_id: int
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

go with a Union[str, DatasourceType] and add test to verify works both and will throw 422

@hughhhh
Copy link
Member Author

hughhhh commented Jun 14, 2022

Moving here #20380

@hughhhh hughhhh closed this Jun 14, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants