-
Notifications
You must be signed in to change notification settings - Fork 13.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Impovements for RedshiftDataOperator #29434
Conversation
Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about any anything please check our Contribution Guide (https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst)
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
:param return_sql_result: if True will return the result of an SQL statement, | ||
if False will return statement ID |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we add that the default is False?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Besides the comment left by @ephraimbuddy, LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would recommend to use pre-commit to fix most of the statics checks failures, see: Static code checks
from mypy_boto3_redshift_data.type_defs import GetStatementResultResponseTypeDef | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please move it under the if TYPE_CHECKING:
block
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
@mock.patch("airflow.providers.amazon.aws.hooks.redshift_data.RedshiftDataHook.conn") | ||
def test_return_sql_result(self, mock_conn): | ||
expected_result = {"Result": True} | ||
mock_conn.execute_statement.return_value = {"Id": STATEMENT_ID} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mock_conn.execute_statement.return_value = {"Id": STATEMENT_ID} | |
mock_conn.batch_execute_statement.return_value = {"Id": STATEMENT_ID} |
You need to mock batch_execute_statement
here because you call
airflow/airflow/providers/amazon/aws/operators/redshift_data.py
Lines 124 to 136 in 2a34dc9
def execute_batch_query(self): | |
kwargs: dict[str, Any] = { | |
"ClusterIdentifier": self.cluster_identifier, | |
"Database": self.database, | |
"Sqls": self.sql, | |
"DbUser": self.db_user, | |
"Parameters": self.parameters, | |
"WithEvent": self.with_event, | |
"SecretArn": self.secret_arn, | |
"StatementName": self.statement_name, | |
} | |
resp = self.hook.conn.batch_execute_statement(**trim_none_values(kwargs)) | |
return resp["Id"] |
You could check it locally in Breeze or venv, for more detail see: Airflow Unit Tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed. Actually, I needed execute_statement
, not batch_execute_statement
. Looks like it was a copy-paste problem.
…bility to return SQL results
f6c9eea
to
e588b28
Compare
I rebased and squashed my changes. |
Looks like tests are passing and changes have been made. @Taragolis your requested changes are blocking the merge. Does everything look acceptable from your end? |
Awesome work, congrats on your first merged pull request! |
I added two improvements to the
RedshiftDataOperator
:FAILURE
). It's not very helpful. I added the whole response in the exception, so it will be printed in the logs and it will give a possibility to troubleshoot problems easily.return_sql_result
optional parameter. By default it'sFalse
, so it's a backward-compatible change. If it'sTrue
operator will return a result of a SQL query from theexecute
method. It means that this result will be available through xcom variables. It may be handy if you need to get some small portions of data (or metadata) from Redshift. For example, I'm runningshow table
query to get a DDL of some table and create a temporary copy of it.