DbApiHook: Support kwargs in get_pandas_df #9730

Merged: 6 commits into apache:master from the DbApiHook-pandas-kwargs branch on Aug 12, 2020

@22quinn 22quinn commented Jul 9, 2020

Support all parameters accepted by the pandas read_sql function by forwarding extra keyword arguments from get_pandas_df: https://github.com/pandas-dev/pandas/blob/1.0.x/pandas/io/sql.py#L336-L345

Closes #8468
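
For readers skimming the description, here is a minimal sketch of the forwarding pattern it describes. `SketchDbApiHook` and `demo_chunked_read` are hypothetical names used only for illustration, and the sqlite3 connection is an assumption standing in for whatever connection a real hook would provide; this is not Airflow's actual implementation.

```python
import sqlite3


class SketchDbApiHook:
    """Hypothetical stand-in for a DB-API hook; not Airflow's real class."""

    def __init__(self, conn):
        # Assumption: the caller supplies an open DB-API connection.
        self.conn = conn

    def get_pandas_df(self, sql, parameters=None, **kwargs):
        """Run ``sql`` and return the result through pandas.

        :param sql: the SQL statement to execute
        :param parameters: values to bind to the query
        :param kwargs: extra keyword arguments forwarded unchanged to
            ``pandas.read_sql`` (for example ``chunksize`` or ``index_col``)
        """
        # Imported lazily so the class can be defined without pandas installed.
        import pandas as pd

        return pd.read_sql(sql, con=self.conn, params=parameters, **kwargs)


def demo_chunked_read():
    """Read a three-row table in chunks of two rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER, name TEXT)")
    conn.executemany(
        "INSERT INTO t VALUES (?, ?)", [(1, "a"), (2, "b"), (3, "c")]
    )
    hook = SketchDbApiHook(conn)
    # chunksize is not a named parameter of get_pandas_df; it travels via **kwargs.
    chunks = hook.get_pandas_df(
        "SELECT id, name FROM t WHERE id >= ? ORDER BY id",
        parameters=(1,),
        chunksize=2,
    )
    # With chunksize set, pandas.read_sql returns an iterator of DataFrames.
    return list(chunks)
```

One thing to keep in mind with `chunksize`: the result is lazy, so the connection has to stay open until the caller has finished iterating.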


Make sure to mark the boxes below before creating PR: [x]

  • Description above provides context of the change
  • Unit tests coverage for changes (not needed for documentation changes)
  • Target Github ISSUE in description if exists
  • Commits follow "How to write a good git commit message"
  • Relevant documentation is updated including usage instructions.
  • I will engage committers as explained in Contribution Workflow Example.

In case of fundamental code change, Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in UPDATING.md.
Read the Pull Request Guidelines for more information.

22quinn commented Jul 9, 2020

Static check failed:
airflow/providers/google/cloud/hooks/bigquery.py:166: error: Signature of "get_pandas_df" incompatible with supertype "DbApiHook"

Added **kwargs to get_pandas_df in BigQueryHook as well.
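
To illustrate why the static check objects, here is a small sketch with hypothetical classes (`BaseHook`, `NarrowHook`, `CompatibleHook` are made-up names, not Airflow classes). Once the base method accepts `**kwargs`, an override that omits them can no longer be called everywhere the base type is expected, which is the incompatibility the type checker reports.

```python
class BaseHook:
    """Hypothetical base class whose method accepts extra keyword arguments."""

    def get_pandas_df(self, sql, parameters=None, **kwargs):
        return ("base", sql, parameters, kwargs)


class NarrowHook(BaseHook):
    """Override that drops **kwargs: the shape a type checker rejects."""

    def get_pandas_df(self, sql, parameters=None):  # type: ignore[override]
        return ("narrow", sql, parameters, {})


class CompatibleHook(BaseHook):
    """Override that keeps **kwargs, so it is substitutable for the base."""

    def get_pandas_df(self, sql, parameters=None, **kwargs):
        return ("compatible", sql, parameters, kwargs)


def fetch_in_chunks(hook: BaseHook):
    # Valid against the base signature, so it should work for any subclass.
    return hook.get_pandas_df("SELECT 1", chunksize=100)


def narrow_hook_fails() -> bool:
    """Return True if the narrowed override raises TypeError at call time."""
    try:
        fetch_in_chunks(NarrowHook())
    except TypeError:
        return True
    return False
```

The static check catches at review time what would otherwise surface as a `TypeError` at runtime, so adding `**kwargs` to each overriding method keeps the subclasses interchangeable with the base hook.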

potiuk commented Aug 2, 2020

Looks good @zikun, but there are a handful of other hooks that define this method (hive/exasol), and from a quick look they could also benefit from those kwargs. Would you mind adding them there as well?

22quinn commented Aug 6, 2020

@potiuk Added to all the overriding methods

@@ -106,7 +106,7 @@ def get_sqlalchemy_engine(self, engine_kwargs=None):
             engine_kwargs = {}
         return create_engine(self.get_uri(), **engine_kwargs)

-    def get_pandas_df(self, sql, parameters=None):
+    def get_pandas_df(self, sql, parameters=None, **kwargs):
Member (review comment): Would you mind adding information to the method docstring?

Contributor Author (reply): Added

22quinn commented Aug 8, 2020

@turbaszek let's merge it :)

@turbaszek turbaszek merged commit 8f8db89 into apache:master Aug 12, 2020
@22quinn 22quinn deleted the DbApiHook-pandas-kwargs branch August 15, 2020 11:47
@kaxil kaxil added this to the Airflow 1.10.13 milestone Aug 24, 2020
dimberman pushed a commit that referenced this pull request Sep 17, 2020
* DbApiHook: Support kwargs in get_pandas_df
* BigQueryHook: Support kwargs in get_pandas_df
* PrestoHook: Support kwargs in get_pandas_df
* HiveServer2Hook: Support kwargs in get_pandas_df

(cherry picked from commit 8f8db89)
RaviTezu pushed a commit to RaviTezu/airflow that referenced this pull request Oct 25, 2020
kaxil pushed a commit that referenced this pull request Nov 12, 2020
@potiuk potiuk added the type:improvement Changelog: Improvements label Nov 14, 2020
potiuk pushed a commit that referenced this pull request Nov 16, 2020
cfei18 pushed a commit to cfei18/incubator-airflow that referenced this pull request Mar 5, 2021
Labels
type:improvement Changelog: Improvements
Development

Successfully merging this pull request may close these issues.

DbApiHook: add chunksize to get_pandas_df parameters
4 participants