Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: Allow df.to_sql() to pass dialect kwargs to SQLAlchemy. #45259

Open
cansjt opened this issue Jan 8, 2022 · 0 comments
Open

ENH: Allow df.to_sql() to pass dialect kwargs to SQLAlchemy. #45259

cansjt opened this issue Jan 8, 2022 · 0 comments
Labels
Enhancement IO SQL to_sql, read_sql, read_sql_query

Comments

@cansjt
Copy link

cansjt commented Jan 8, 2022

Is your feature request related to a problem?

Some dialects need dialect specific arguments to properly define and create a Table, before being able to dump data into it.
But as of today there are no ways to pass such parameters through the DataFrame.to_sql() method.

As a consequence, some dialects have resorted to hacks to pass along the necessary information. But with the changes coming with SQLAlchemy 2.0, the removal of most SchemaItem.bind attribute, those hacks will no longer be viable.

Hence, there is a need for another way to pass the same information to the table.

Note: this is an enhancement proposal, up until the release of SQLAlchemy 2.0, where this will become an issue for some users.

Describe the solution you'd like

The DataFrame.to_sql() should accept a keyword argument that receives dialect kwargs (as a dict) when necessary and pass them through to the Table definition.

df.to_sql('<table name>',
          connection,
          schema='some schema',
          dialect_kwargs={'awsathena_location': 's3://bucket/prefix'})

API breaking implications

This should not break the API, as it just adds a new optional parameter to it.
Users code though, depending on the dialect they use, might need to be modified to pass that extra argument.

Describe alternatives you've considered

I do not know the code base well enough to know if there are alternatives, just pointing out the obvious solution. But given that pandas is supposed to be easy, you might also want to provide to your users with a mean to set default dialect kwargs somehow, so they don't need to pass that information to all DataFrame.to_sql() calls. I see there is a settings mechanism that might help in that respect. Note that you can specify SQLAlchemy's dialect kwargs for several dialects, as their names are prefixed by the dialect name. You can bluntly pass a bunch of them when instantiating the Table class, and SQLAlchemy will sort things out for you.

Also I have no use cases for Pandas, so I don't have much interest to work on this issue. I only stumbled on this working on the PyAthena dialect for SQLAlchemy. I don't know if any or how many other dialects might be impacted.

Additional context

You might want to read this discussion on the SQLAlchemy ML and the changes being done in PyAthena.

@cansjt cansjt added Enhancement Needs Triage Issue that has not been reviewed by a pandas team member labels Jan 8, 2022
@cansjt cansjt changed the title ENH: Allows df.to_sql() to pass dialect kwargs to SQLAlchemy. ENH: Allow df.to_sql() to pass dialect kwargs to SQLAlchemy. Jan 8, 2022
@jbrockmendel jbrockmendel added the IO SQL to_sql, read_sql, read_sql_query label Jan 10, 2022
@mroeschke mroeschke removed the Needs Triage Issue that has not been reviewed by a pandas team member label Jan 13, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Enhancement IO SQL to_sql, read_sql, read_sql_query
Projects
None yet
Development

No branches or pull requests

3 participants