Get job runs by timeframe #3833

Closed
1 of 2 tasks
JayaChinta123 opened this issue Aug 22, 2023 · 1 comment
Assignees
Labels
feature-request This issue requests a feature. glue investigating This issue is being investigated and/or work is in progress to resolve the issue. p3 This is a minor priority issue

Comments

@JayaChinta123

Describe the feature

For the Glue service, the get-job-runs method returns every job run since the job was created, and we have to filter the response ourselves based on the CompletedOn attribute. It would be nice if get-job-runs accepted a time range as an input parameter, so the response could be limited to fewer records.
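For illustration, the client-side filtering described above might look roughly like the sketch below. It assumes a boto3 Glue client is passed in and uses its `get_job_runs` paginator; the helper name `runs_in_window` and the overlap rule are illustrative choices, not part of the SDK. Every page is still fetched from the service, which is exactly the cost this request is about.

```python
def runs_in_window(glue_client, job_name, start, end):
    """Yield runs of `job_name` that overlap the window [start, end].

    `glue_client` is assumed to be a boto3 Glue client. `start` and `end`
    should be timezone-aware datetimes, since boto3 returns aware
    datetimes and Python cannot compare aware with naive values.
    """
    paginator = glue_client.get_paginator("get_job_runs")
    # No server-side time filter exists, so every page is downloaded.
    for page in paginator.paginate(JobName=job_name):
        for run in page["JobRuns"]:
            started = run.get("StartedOn")
            completed = run.get("CompletedOn")  # missing while still running
            if started is None or started > end:
                continue  # never started, or started after the window
            if completed is not None and completed < start:
                continue  # finished before the window opened
            yield run
```

A call such as `list(runs_in_window(boto3.client("glue"), "my-job", start, end))` would return only the overlapping runs, where `"my-job"` is a placeholder job name.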

Use Case

We need to calculate the maximum number of DPUs in concurrent use in an account at a given point in time, so that we can work out our DPU concurrency and request the right quota for the account.
To do that, we had to query the Glue service for job runs, and it returns every job run since each job was created. In an account with more than a thousand jobs, retrieving the runs for each job all the way back to its creation means the script takes a day to run. As the jobs get older, the latency increases.
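As a rough sketch of the calculation described above, peak concurrent DPU usage can be computed with a sweep over start and end events once the runs have been collected. This assumes each run dict carries `StartedOn`, `CompletedOn` and `MaxCapacity` as in the GetJobRuns response, and that `MaxCapacity` reflects the DPUs allocated to the run; the function name `max_concurrent_dpu` is hypothetical.

```python
from datetime import datetime, timezone


def max_concurrent_dpu(runs, now=None):
    """Return the peak total of MaxCapacity across overlapping runs."""
    now = now or datetime.now(timezone.utc)
    events = []
    for run in runs:
        started = run.get("StartedOn")
        if started is None:
            continue  # run never started, so it consumed no DPUs
        dpu = float(run.get("MaxCapacity") or 0.0)
        ended = run.get("CompletedOn") or now  # still running: open until now
        events.append((started, 1, dpu))   # 1 marks a start event
        events.append((ended, 0, -dpu))    # 0 marks an end event
    # Sort by time; at equal timestamps ends (0) are applied before
    # starts (1), so back-to-back runs are not counted as overlapping.
    events.sort(key=lambda e: (e[0], e[1]))
    current = peak = 0.0
    for _, _, delta in events:
        current += delta
        peak = max(peak, current)
    return peak
```

The sweep itself is cheap (a sort over two events per run); the day-long runtime reported here comes from downloading the full run history for every job.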

Proposed Solution

Expose time as an input parameter for Glue job runs. If possible, also remove older job runs based on some retention period, so that we don't get such large responses to scan.

Other Information

No response

Acknowledgements

  • I may be able to implement this feature request
  • This feature might incur a breaking change

SDK version used

Boto3 1.28.30

Environment details (OS name and version, etc.)

Mac

@JayaChinta123 JayaChinta123 added feature-request This issue requests a feature. needs-triage This issue or PR still needs to be triaged. labels Aug 22, 2023
@RyanFitzSimmonsAK RyanFitzSimmonsAK self-assigned this Aug 23, 2023
@RyanFitzSimmonsAK RyanFitzSimmonsAK added investigating This issue is being investigated and/or work is in progress to resolve the issue. glue p3 This is a minor priority issue and removed needs-triage This issue or PR still needs to be triaged. labels Aug 23, 2023
@RyanFitzSimmonsAK
Contributor

Hi @JayaChinta123, thanks for submitting this feature request. Server-side filtering is something that would need to be implemented by the service team. I've made an issue in our cross-SDK repository (aws/aws-sdk#591) to be used for tracking and updates. I've also reached out to the Glue service team to ask about the possibility of this being implemented.

One possible workaround in the meantime is to use an alternative data store such as DynamoDB and record each job run there. You could then query DynamoDB instead of having to make frequent GetJobRun calls.
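A minimal sketch of that workaround, under stated assumptions: `table` is a boto3 DynamoDB `Table` resource, and the table uses a hypothetical key schema with `JobName` as partition key and `StartedOn` (a UTC ISO 8601 string) as sort key, so that runs can later be queried by time range. The helper names are illustrative and not part of any SDK.

```python
from datetime import timezone
from decimal import Decimal


def run_to_item(job_name, run):
    """Convert a Glue job run dict into a DynamoDB-friendly item."""
    def iso(dt):
        # UTC ISO 8601 strings sort lexicographically in time order.
        return dt.astimezone(timezone.utc).isoformat() if dt else None

    capacity = run.get("MaxCapacity")
    item = {
        "JobName": job_name,                      # assumed partition key
        "StartedOn": iso(run.get("StartedOn")),   # assumed sort key
        "JobRunId": run["Id"],
        "JobRunState": run.get("JobRunState"),
        "CompletedOn": iso(run.get("CompletedOn")),
        # The DynamoDB resource API rejects floats, so use Decimal.
        "MaxCapacity": Decimal(str(capacity)) if capacity is not None else None,
    }
    # Drop attributes that are not set yet.
    return {k: v for k, v in item.items() if v is not None}


def record_run(table, job_name, run):
    """Write one run to the table; skip runs that have not started."""
    item = run_to_item(job_name, run)
    if "StartedOn" not in item:
        return False  # the sort key is required by the assumed schema
    table.put_item(Item=item)
    return True
```

Something would still have to call `record_run` whenever a run changes state, for example a scheduled script or an event-driven function; that part is outside this sketch.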
