list_query_executions is not fetching queries older than specific date #3847

Closed
Apporve-Chandra opened this issue Sep 8, 2023 · 5 comments
Labels
athena service-api This issue is caused by the service API, not the SDK implementation.

Comments

@Apporve-Chandra

Describe the bug

I wanted to find an old query I ran that had a specific 'OutputLocation'.
Since the query was executed sometime in mid-July, I wrote a Python script that fetches query IDs via 'list_query_executions', then iterates over them to get each query's details via 'get_query_execution'.
After that, I fetch the next set of queries using the 'NextToken' returned by the previous 'list_query_executions' call.

Issue: 'list_query_executions' does not return a 'NextToken' for queries executed before 25th July 2023.
Is this a bug? I cannot find any hard limit on stored queries in the official AWS docs.
If there is a parameter I can add to fetch such old queries, please let me know.

The Python script I used is included under Reproduction Steps.

Expected Behavior

'list_query_executions' should always return a 'NextToken' field if older queries are present.

Current Behavior

'list_query_executions' stops returning the 'NextToken' field after a certain number of iterations, giving the impression that it cannot fetch queries executed before a certain date.

Reproduction Steps

Python script I ran, with proper AWS credentials in place:
(The hardcoded NextToken is from 26th July, while the failure happens when fetching queries older than 25th July. It is hardcoded so that the failure happens after the first few successful iterations.)

`import boto3

client = boto3.client('athena')

response = client.list_query_executions()

response = client.list_query_executions(NextToken = "AQn3/LrM6NheQc07p0GuR6kZz7lYAm+hB6vrfbobpG4PEkjcGp76CGVitUR7/dPllpMcMZfhL1wuJX2a9EOtWA9Yhk9pqwDL+pltyP28xcp2epG82VkzwIVZjCFxtw4WVF1wrzPLsFdRFS4f+B2T3t1PmkmSHfpoD/Bm/Ajwuu1ZFjosPB0V76Rye55TYtuaHBPcUfHGepLQLBCXXfp1v3XlxStXZb6OqWGG3Ntd2sNn5Is1rOP2ail7ozIqDK9CFXPXeDKbtvsFpTiM1h5baC8AVN6YuYiGY37m7Q1CYSo3TsWHH73X2yFkjlWJjFkAzApCurYWI4h0")

print(f"Response: {response}")

loop_count = 0
found_count = 0

for outer_loop in range(0, 1000):
    # The last available page carries no 'NextToken', so read it defensively
    # instead of indexing (response['NextToken'] raises KeyError there).
    next_token = response.get('NextToken')
    for query_id in response['QueryExecutionIds']:
        query_execution_response = client.get_query_execution(
            QueryExecutionId=query_id
        )

        query_execution = query_execution_response['QueryExecution']
        query_string = query_execution['Query']
        # Not every execution is guaranteed to report an output location.
        query_output = query_execution.get('ResultConfiguration', {}).get('OutputLocation', '')
        query_created_at = query_execution['Status']['SubmissionDateTime']

        # print(f"=> Query id: {query_id}")
        loop_count += 1
        # print(f"count: {loop_count}")
        # print(f"Output location: {query_output}")
        print(f"query_created_at: {query_created_at}")

        if "apchandr" in query_output:
            print(f"Executed SQL statement: {query_string}")
            found_count += 1

    print(f"NextToken: {next_token}")
    print(f"loop count: {loop_count}")
    print(f"found count: {found_count}")
    if next_token is None:
        # No token means the service has no further pages to return.
        break
    response = client.list_query_executions(NextToken=next_token)

print(f"Final loop count: {loop_count}")
print(f"Final found count: {found_count}")`
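For comparison, the same search can probably be written with Boto3's built-in paginator, which follows 'NextToken' internally and simply stops when the service returns no further token. The sketch below is not part of the original report: the function name `find_queries_by_output` and its parameters are hypothetical, and it assumes the Athena client's `list_query_executions` paginator and `batch_get_query_execution` call (up to 50 IDs per request). The client is passed in so the function can be exercised without AWS access.

```python
def find_queries_by_output(client, needle, page_size=50):
    """Yield (submitted_at, sql) for queries whose OutputLocation contains `needle`.

    `client` is expected to behave like boto3.client('athena').
    """
    paginator = client.get_paginator("list_query_executions")
    # The paginator handles NextToken itself and ends on the last page.
    pages = paginator.paginate(PaginationConfig={"PageSize": page_size})
    for page in pages:
        ids = page.get("QueryExecutionIds", [])
        if not ids:
            continue
        # One batched lookup per page instead of one get_query_execution per ID.
        batch = client.batch_get_query_execution(QueryExecutionIds=ids)
        for execution in batch.get("QueryExecutions", []):
            output = execution.get("ResultConfiguration", {}).get("OutputLocation", "")
            if needle in output:
                yield execution["Status"]["SubmissionDateTime"], execution["Query"]
```

With real credentials, usage would look like `for submitted_at, sql in find_queries_by_output(boto3.client('athena'), 'apchandr'): print(submitted_at, sql)`. It returns only what the service still lists, so it cannot reach queries the service no longer exposes.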

Possible Solution

Indication of

  • some error I am making in the call, or
  • an additional argument that needs to be provided to fetch old queries, or
  • documentation indicating hard limits on the queries that can be fetched programmatically

Additional Information/Context

No response

SDK version used

1.28.42

Environment details (OS name and version, etc.)

MacOS 12.5.1

@Apporve-Chandra Apporve-Chandra added bug This issue is a confirmed bug. needs-triage This issue or PR still needs to be triaged. labels Sep 8, 2023
@tim-finnigan tim-finnigan self-assigned this Sep 11, 2023
@tim-finnigan
Contributor

Hi @Apporve-Chandra thanks for reaching out. For reference here is the documentation for the list_query_executions command: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/athena/client/list_query_executions.html

And here is documentation on Athena Service Quotas: https://docs.aws.amazon.com/athena/latest/ug/service-limits.html#service-limits-api-calls

The Boto3 command corresponds to the Athena ListQueryExecutions API, so it's possible that you encountered a quota limit for that API call. In order for us to investigate any potential issue here, could you provide your debug logs (with any sensitive info redacted) by adding boto3.set_stream_logger('') to your script?
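As an alternative to `boto3.set_stream_logger('')`, which prints debug output to the console, the same records can be written straight to a file using Python's standard `logging` module. This is only a sketch: the helper name `enable_sdk_debug_log` and the file name are hypothetical, and it assumes the SDK emits its records under the `boto3` and `botocore` logger names.

```python
import logging

def enable_sdk_debug_log(path):
    """Send DEBUG records from the 'boto3' and 'botocore' loggers to `path`."""
    handler = logging.FileHandler(path)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(name)s [%(levelname)s] %(message)s")
    )
    for name in ("boto3", "botocore"):
        logger = logging.getLogger(name)
        logger.setLevel(logging.DEBUG)
        logger.addHandler(handler)
    # Returned so the caller can flush or remove the handler later.
    return handler
```

Calling `enable_sdk_debug_log('athena_debug.log')` before creating the client should capture the request and response details. Debug output can include request parameters such as pagination tokens, so review and redact the file before sharing it.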

@tim-finnigan tim-finnigan added response-requested Waiting on additional information or feedback. service-api This issue is caused by the service API, not the SDK implementation. athena and removed bug This issue is a confirmed bug. needs-triage This issue or PR still needs to be triaged. labels Sep 11, 2023
@Apporve-Chandra
Author

@tim-finnigan I reran the script with set_stream_logger(''), and this time the oldest query returned was from 2023-07-31.
Attached log file as asked.
std2.log

It seems there is a maximum age beyond which queries can't be fetched.
Can you please confirm this duration, and whether there is a way to retrieve queries older than that?

@tim-finnigan
Contributor

Hi @Apporve-Chandra thanks for following up. Athena keeps a query history for 45 days per this documentation: https://docs.aws.amazon.com/athena/latest/ug/querying.html#queries-viewing-history

Also noted on that page:

If you want to keep the query history longer than 45 days, you can retrieve the query history and save it to a data store such as Amazon S3.
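The 45-day window lines up with the cutoff reported in the issue. As a quick check, assuming the script was run on the day the issue was opened (Sep 8, 2023; the actual run date is not stated), the oldest retrievable date works out as follows. The names `RETENTION_DAYS` and `oldest_retrievable` are illustrative only.

```python
from datetime import date, timedelta

# Retention period quoted from the Athena documentation in this comment.
RETENTION_DAYS = 45

def oldest_retrievable(today):
    """Return the earliest date still inside the retention window."""
    return today - timedelta(days=RETENTION_DAYS)

# Assumed run date: the day the issue was opened.
print(oldest_retrievable(date(2023, 9, 8)))  # 2023-07-25
```

That matches the observation that nothing older than 25th July 2023 was returned.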

@tim-finnigan tim-finnigan added closing-soon This issue will automatically close in 4 days unless further comments are made. and removed response-requested Waiting on additional information or feedback. labels Sep 13, 2023
@Apporve-Chandra
Author

Apporve-Chandra commented Sep 14, 2023 via email

@tim-finnigan
Contributor

Hi Apporve — I don't think there is a way to get the query history beyond 45 days. At least, I couldn't find any info on that in the Athena documentation. You could try reaching out through AWS Support for further assistance, but this issue is beyond the scope of Boto3 as it involves the Athena service API and your query execution history.
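Since history past 45 days cannot be recovered after the fact, the documentation's suggestion of saving it to a data store such as Amazon S3 has to be applied going forward. One possible shape for that, run on a schedule shorter than the retention window, is sketched below. The function name `archive_query_history`, the bucket and key, and the JSON Lines layout are all assumptions for illustration; the clients are passed in and are expected to behave like `boto3.client('athena')` and `boto3.client('s3')`.

```python
import json

def archive_query_history(athena, s3, bucket, key, page_size=50):
    """Copy the currently listed query executions to S3 as JSON Lines.

    Returns the number of executions written.
    """
    records = []
    paginator = athena.get_paginator("list_query_executions")
    for page in paginator.paginate(PaginationConfig={"PageSize": page_size}):
        ids = page.get("QueryExecutionIds", [])
        if not ids:
            continue
        # batch_get_query_execution accepts up to 50 IDs per call.
        batch = athena.batch_get_query_execution(QueryExecutionIds=ids)
        records.extend(batch.get("QueryExecutions", []))
    # default=str converts datetime fields (e.g. SubmissionDateTime) to strings.
    body = "\n".join(json.dumps(record, default=str) for record in records)
    s3.put_object(Bucket=bucket, Key=key, Body=body.encode("utf-8"))
    return len(records)
```

Writing each run to a dated key (for example one object per day) would keep earlier archives from being overwritten; deduplicating by 'QueryExecutionId' is left out of this sketch.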

@tim-finnigan tim-finnigan removed the closing-soon This issue will automatically close in 4 days unless further comments are made. label Sep 14, 2023