ENH: Load SQL engine backends via entrypoints #41728

Open
yehoshuadimarsky opened this issue May 30, 2021 · 8 comments

Comments

@yehoshuadimarsky
Contributor

Follow up to #36893. Per @xhochy's suggestion we should use entrypoints as a mechanism for loading alternate SQL engine backends, similar to what we do with plotting backends. Creating a new issue for it per the comment here.

@yehoshuadimarsky yehoshuadimarsky added Enhancement Needs Triage Issue that has not been reviewed by a pandas team member labels May 30, 2021
@lithomas1 lithomas1 added Compat pandas objects compatability with Numpy or Python functions IO SQL to_sql, read_sql, read_sql_query labels May 30, 2021
@lithomas1 lithomas1 added this to the Contributions Welcome milestone May 30, 2021
@lithomas1 lithomas1 removed the Needs Triage Issue that has not been reviewed by a pandas team member label May 30, 2021
@yehoshuadimarsky
Contributor Author

@xhochy can you please clarify: if we load SQL engine backends via entrypoints, what exactly does the SQL backend library need to provide to be compatible? I see you wrote here that it needs to add this to its setup.{py|cfg}:

entry_points={"pandas_sql_engine": ["turbodbc = turbodbc.TurbodbcPandasEngine"]}

but what does turbodbc.TurbodbcPandasEngine actually need to implement?

@yehoshuadimarsky
Contributor Author

yehoshuadimarsky commented May 30, 2021

Actually some good reading here:

Appears that the group portion should be delimited by dots, not underscores, e.g.

entry_points={"pandas.sql.engine": ["turbodbc = turbodbc.TurbodbcPandasEngine"]}
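For reference, a minimal standard-library sketch of how an entry point value is resolved. It assumes Python 3.9+ (for the `module` and `attr` properties), uses the `pandas.sql.engine` group name proposed above purely as an illustration, and substitutes a stdlib class for the turbodbc engine so that it runs without turbodbc installed. In the entry points specification an object reference is written `module:attr`, so a value that contains only dots is read as a module path with no attribute.

```python
from importlib.metadata import EntryPoint

# Group name taken from the comment above; it is a proposal, not an
# established pandas entry point group.
GROUP = "pandas.sql.engine"

# Dotted value: the whole string is parsed as a module path.
dotted = EntryPoint(name="turbodbc", value="turbodbc.TurbodbcPandasEngine", group=GROUP)

# Colon value: module on the left, object inside that module on the right.
colon = EntryPoint(name="turbodbc", value="turbodbc:TurbodbcPandasEngine", group=GROUP)

print(dotted.module, dotted.attr)
print(colon.module, colon.attr)

# Stand-in target (assumption): a stdlib class, so load() works anywhere.
demo = EntryPoint(name="demo", value="collections:OrderedDict", group=GROUP)
engine_cls = demo.load()  # imports `collections`, then fetches `OrderedDict`
print(engine_cls)
```

Constructing `EntryPoint` objects by hand is only for illustration; in practice they would come from the metadata of installed packages.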

@yehoshuadimarsky
Contributor Author

Looking around, we can potentially reuse some of the plugin approaches that other popular libraries use. Some examples that come to mind:

Pytest

Flake8

@yehoshuadimarsky
Contributor Author

And maybe we should centralize all plugin loading into a single module with common functions that can be invoked from the specific places that need it, like plotting, the SQL engine, etc. That would make adding more plugins easier.
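A rough sketch of what such a shared loader could look like, using only `importlib.metadata`. The function names `installed_entry_points` and `load_backend`, the `candidates` parameter, and the `demo.group` group are all hypothetical and exist only for this illustration; nothing here reflects actual pandas code.

```python
from importlib.metadata import EntryPoint, entry_points


def installed_entry_points(group):
    """Return the installed entry points registered under `group`."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+: selectable result object
        return list(eps.select(group=group))
    # Python 3.8/3.9: plain dict keyed by group name
    return list(eps.get(group, []))


def load_backend(group, name, candidates=None):
    """Load the object registered as `name` in `group`.

    `candidates` (hypothetical) lets callers and tests pass entry points
    explicitly instead of scanning installed packages.
    """
    found = installed_entry_points(group) if candidates is None else candidates
    for ep in found:
        if ep.group == group and ep.name == name:
            return ep.load()
    raise ValueError(f"No backend named {name!r} found in group {group!r}")


# Usage with a hand-built entry point pointing at a stdlib function.
fake = [EntryPoint(name="json", value="json:dumps", group="demo.group")]
dumps = load_backend("demo.group", "json", candidates=fake)
print(dumps([1, 2]))
```

Plotting and SQL engine code could then call the same function with different group names, which is the main point of centralizing it.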

@jreback
Contributor

jreback commented May 30, 2021

happy to get something to work then move to a more formal plugin structure

@yehoshuadimarsky
Contributor Author

Sounds good, working on it. @xhochy how do we enforce that the SQL engine has the correct interface? Or do we not enforce it at all, and just let the user select an engine, leaving it up to the engine library to provide a function with the correct signature, so that if it fails, it fails? Should an engine library import the base abstract SqlEngine class to subclass from, or is that too strong of a dependency?

Trying to understand what the expectations are for these kinds of loosely coupled dependencies...
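One possible middle ground, sketched here as an assumption rather than a decided design: a structural check with `typing.Protocol`, so the engine library does not have to import or subclass anything from pandas. The names `SQLEngineLike`, `ThirdPartyEngine`, `validate_engine`, and the `insert_records` method with this signature are hypothetical and chosen only for the example.

```python
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class SQLEngineLike(Protocol):
    """Hypothetical minimal interface an engine is expected to offer."""

    def insert_records(self, table: Any, con: Any, frame: Any, **kwargs: Any) -> None:
        ...


class ThirdPartyEngine:
    """Stand-in for a plugin class; note it has no pandas base class."""

    def insert_records(self, table, con, frame, **kwargs):
        return None


class NotAnEngine:
    pass


def validate_engine(obj):
    # isinstance() on a runtime-checkable Protocol only verifies that the
    # method exists; it does not verify the method's signature.
    if not isinstance(obj, SQLEngineLike):
        raise TypeError(f"{type(obj).__name__} does not provide insert_records()")
    return obj


engine = validate_engine(ThirdPartyEngine())

try:
    validate_engine(NotAnEngine())
except TypeError as exc:
    print(exc)
```

The trade-off is that this catches a missing method at load time but not a wrong signature; subclassing an abstract base class would give a stronger guarantee at the cost of the import dependency raised above.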

@yehoshuadimarsky
Contributor Author

take

@yehoshuadimarsky
Contributor Author

Sounds good, working on it. @xhochy how do we enforce that the SQL engine has the correct interface? Or do we not enforce it at all, and just let the user select an engine, leaving it up to the engine library to provide a function with the correct signature, so that if it fails, it fails? Should an engine library import the base abstract SqlEngine class to subclass from, or is that too strong of a dependency?

Trying to understand what the expectations are for these kinds of loosely coupled dependencies...

@xhochy surfacing this again, are you able to provide some guidance around this?


4 participants