boto code for Athena internal partitioned tables #3812
Labels
feature-request
This issue requests a feature.
response-requested
Waiting on additional information or feedback.
service-api
This issue is caused by the service API, not the SDK implementation.
Describe the feature
I have an Athena table, "mss_athena_intake", with 20 partition columns, 30 non-partition columns, and around 2 TB of data. When I run a SELECT on the partition columns from the Athena console with a limit of 200 records, the query takes around 40-50 seconds. The same query takes around 1 minute when it is called from a boto script.
My observation is that Athena internally creates a partitions table that contains only the partition columns. The name of this table is "mss_athena_intake$partitions". A SELECT on only the partition columns against this internal table took just 5-10 seconds, compared to 40-50 seconds against the actual table.
But when I run a SELECT on this internal partitions table from boto, I get the following error:
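To separate engine time from client-side polling overhead when comparing the console and boto timings, a minimal sketch like the one below may help. It assumes `client` is a `boto3.client("athena")` instance and that the query was already started with `start_query_execution`. The helper name `wait_for_query`, the polling interval, and the timeout are illustrative choices, not part of the original script.

```python
import time

TERMINAL_STATES = {"SUCCEEDED", "FAILED", "CANCELLED"}


def wait_for_query(client, query_execution_id, poll_seconds=1.0,
                   timeout_seconds=300.0, sleep=time.sleep,
                   clock=time.monotonic):
    """Poll Athena until the query finishes, then report timings.

    `client` is assumed to be boto3.client("athena"); `sleep` and `clock`
    are injectable only so the helper can be exercised without waiting.
    """
    start = clock()
    while True:
        execution = client.get_query_execution(
            QueryExecutionId=query_execution_id
        )["QueryExecution"]
        state = execution["Status"]["State"]
        if state in TERMINAL_STATES:
            stats = execution.get("Statistics", {})
            return {
                "state": state,
                # Time observed by this script, including polling gaps.
                "wall_clock_seconds": clock() - start,
                # Times reported by Athena itself, in milliseconds.
                "engine_ms": stats.get("EngineExecutionTimeInMillis"),
                "total_ms": stats.get("TotalExecutionTimeInMillis"),
            }
        if clock() - start > timeout_seconds:
            raise TimeoutError(f"query {query_execution_id} still {state}")
        sleep(poll_seconds)
```

If `engine_ms` is close to the console timing while `wall_clock_seconds` is noticeably larger, the gap is likely coming from the polling interval or Lambda overhead rather than from Athena.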
{
"errorMessage": "An error occurred (InvalidRequestException) when calling the StartQueryExecution operation: Queries of this type are not supported",
"errorType": "InvalidRequestException",
"requestId": "121e08fd-256d-4600-88d6-ed33c47490a7",
"stackTrace": [
" File \"/var/task/lambda_function.py\", line 18, in lambda_handler\n response = client.start_query_execution(\n",
" File \"/var/lang/lib/python3.11/site-packages/botocore/client.py\", line 534, in _api_call\n return self._make_api_call(operation_name, kwargs)\n",
" File \"/var/lang/lib/python3.11/site-packages/botocore/client.py\", line 976, in _make_api_call\n raise error_class(parsed_response, operation_name)\n"
]
}
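For reference, the kind of call that hits this error can be sketched as follows. This is not the original Lambda code, which is not shown here. It is a minimal reconstruction that assumes `client` is a `boto3.client("athena")` instance. The database name, output location, and column names are hypothetical placeholders.

```python
def partitions_query(table: str, columns: list[str], limit: int = 200) -> str:
    """Build a SELECT against the table's "$partitions" metadata table."""
    cols = ", ".join(f'"{c}"' for c in columns)
    return f'SELECT {cols} FROM "{table}$partitions" LIMIT {limit}'


def start_partitions_query(client, database: str, output_location: str,
                           table: str, columns: list[str], limit: int = 200):
    """Submit the query through StartQueryExecution.

    `client` is assumed to be boto3.client("athena"). `database` and
    `output_location` are placeholders the caller must supply.
    """
    return client.start_query_execution(
        QueryString=partitions_query(table, columns, limit),
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )


# Hypothetical usage (column names, database, and bucket are made up):
# import boto3
# client = boto3.client("athena")
# start_partitions_query(client, "my_database", "s3://my-bucket/results/",
#                        "mss_athena_intake", ["year", "month"])
```

Passing the client in as an argument keeps the query-building logic separate from the AWS call, so the generated SQL can be inspected or logged before it is submitted.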
My suggestion: could these internal queries be allowed from boto as well, given that I can run the same query from the Athena query console? This would benefit complex processing and drastically reduce the processing time of queries that touch only partition columns.
I am adding below the boto code that I used in a test Lambda.
Use Case
Query processing time from boto would drop drastically if it could query the partitions table that Athena creates internally, rather than the actual Athena table.
Currently, when I try to execute the boto code, I get the same InvalidRequestException shown above: "Queries of this type are not supported".
Proposed Solution
Allow these internal queries from boto as well, since the same query can be run from the Athena query console. This would benefit complex processing and drastically reduce the processing time of queries that touch only partition columns.
Other Information
I am adding below the boto code that I used in a test Lambda.
Acknowledgements
SDK version used
boto3, Python 3.11 (runtime)
Environment details (OS name and version, etc.)
Windows 11