Conversation
| # the sasl library anyway (and there sasl library version is not relevant) | ||
| - sasl>=0.3.1; python_version>="3.9" | ||
| - thrift>=0.9.2 | ||
| - impyla |
impyla doesn't make any asyncio calls in execute_async.
We have written our own asyncio method as shown below, because impyla returns immediately after submitting the query.
Please check the link
https://github.com/cloudera/impyla/blob/v0.16a2/impala/hiveserver2.py#L334-L338
async def partition_exists(self, table: str, schema: str, partition: str, polling_interval: float) -> str:
    """
    Checks for the existence of a partition in the given hive table.

    :param table: table in hive where the partition exists.
    :param schema: database where the hive table exists
    :param partition: partition to check for in given hive database and hive table.
    :param polling_interval: polling interval in seconds to sleep between checks
    """
    client = self.get_hive_client()
    cursor = client.cursor()
    query = f"show partitions {schema}.{table} partition({partition})"
    cursor.execute_async(query)
    while cursor.is_executing():
        await asyncio.sleep(polling_interval)
    results = cursor.fetchall()
    if len(results) == 0:
        return "failure"
    return "success"
Unfortunately, this does not turn sync code into asyncio code. If that kind of transformation were so easy, we could convert any sync method into an asyncio implementation the same way.
Offhand, most impyla methods make blocking I/O requests. That applies to cursor.execute_async() as well as to cursor.is_executing() and cursor.fetchall().
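To illustrate the point about blocking calls, here is a minimal sketch using only the standard library. It does not use impyla: `time.sleep` stands in for a blocking network round trip, and the names `heartbeat`, `blocking_poll`, `cooperative_poll` and `count_ticks` are made up for this example. It counts how often a background task gets to run while another coroutine is "polling".

```python
import asyncio
import time


async def heartbeat(ticks: list, interval: float = 0.01) -> None:
    """Append a timestamp every `interval` seconds while the loop is free."""
    while True:
        ticks.append(time.monotonic())
        await asyncio.sleep(interval)


async def blocking_poll(duration: float) -> None:
    # Assumption: time.sleep stands in for a blocking socket call.
    # It never yields, so no other task can run until it returns.
    time.sleep(duration)


async def cooperative_poll(duration: float) -> None:
    # asyncio.sleep suspends this coroutine and lets other tasks run.
    await asyncio.sleep(duration)


async def count_ticks(poll, duration: float = 0.2) -> int:
    """Run `poll` next to a heartbeat task and count the heartbeat's ticks."""
    ticks: list = []
    task = asyncio.create_task(heartbeat(ticks))
    await asyncio.sleep(0)  # give the heartbeat one chance to start
    await poll(duration)
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return len(ticks)


async def main() -> tuple:
    blocked = await count_ticks(blocking_poll)
    free = await count_ticks(cooperative_poll)
    return blocked, free


if __name__ == "__main__":
    blocked, free = asyncio.run(main())
    print(f"ticks during blocking call: {blocked}, during awaited sleep: {free}")
```

The heartbeat barely runs during the blocking call, even though that call sits inside an `async def`. Declaring a function as a coroutine does not make the I/O inside it non-blocking.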
Most sync methods wait for the operation to complete, which is not what impyla does here.
@kaxil any thoughts?
Not every async implementation is asyncio-compatible.
asyncio stands for Asynchronous I/O, whereas impyla provides asynchronous execution over blocking I/O.
impyla's implementation would be a good fit for a regular Sensor rather than for Triggers and deferrable operators.
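For context, one common way to call blocking client code from a coroutine is to push each call onto a worker thread with `asyncio.to_thread` (Python 3.9+). The sketch below is not something proposed in this thread and does not use impyla. `FakeCursor` is a hypothetical stand-in whose methods block with `time.sleep`, and whether a real cursor can safely be used from worker threads is an assumption that would need checking.

```python
import asyncio
import time


class FakeCursor:
    """Hypothetical stand-in for a cursor whose every method blocks."""

    def __init__(self, rows, run_time: float = 0.05, rpc_latency: float = 0.01):
        self._rows = rows
        self._run_time = run_time
        self._rpc_latency = rpc_latency
        self._done_at = 0.0

    def execute_async(self, query: str) -> None:
        time.sleep(self._rpc_latency)  # simulated blocking round trip
        self._done_at = time.monotonic() + self._run_time

    def is_executing(self) -> bool:
        time.sleep(self._rpc_latency)  # simulated blocking status check
        return time.monotonic() < self._done_at

    def fetchall(self) -> list:
        time.sleep(self._rpc_latency)  # simulated blocking fetch
        return list(self._rows)


async def partition_exists(cursor, query: str, polling_interval: float = 0.01) -> str:
    # Each blocking call runs in a worker thread, so the event loop stays free.
    await asyncio.to_thread(cursor.execute_async, query)
    while await asyncio.to_thread(cursor.is_executing):
        await asyncio.sleep(polling_interval)
    results = await asyncio.to_thread(cursor.fetchall)
    return "success" if results else "failure"


if __name__ == "__main__":
    cursor = FakeCursor([("ds=2023-01-01",)])
    print(asyncio.run(partition_exists(cursor, "show partitions db.tbl")))
```

This keeps the event loop responsive but still occupies a thread for every in-flight call, so it does not turn the client into a native asyncio implementation.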
| self.conn = self.get_connection(conn_id=metastore_conn_id) | ||
| self.auth_mechanism = self.conn.extra_dejson.get("authMechanism", "PLAIN") |
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions.
This PR donates the HivePartitionAsyncSensor from the astronomer-providers repo to Airflow.
^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in newsfragments.