Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Amazon Ads: changes for increasing timeout for long taking report generation #10513

Merged
merged 18 commits into from
Mar 30, 2022

Conversation

aakashkumarmaple
Copy link
Contributor

@aakashkumarmaple aakashkumarmaple commented Feb 21, 2022

What

Describe what the change is solving
Since Rate limits associated with the Sponsored Products reports API and the Sponsored Brands reports API depend on the size of the report generation queue, therefore for few report generation, it takes more than 30 minutes for reports to get generated.

Thus increasing the REPORT_WAIT_TIMEOUT to 120 minutes from the previous 30 minutes may solve a problem where the speed is affected by choice of region, utilization period, and reporting queue time.

Please Refer Document (Rate Limiting)[https://advertising.amazon.com/API/docs/en-us/concepts/rate-limiting#rate-limiting] for more information on Amazon Ads rate limiting conditions.

"""
Rate limits associated with the [Sponsored Products reports API](https://advertising.amazon.com/API/docs/en-us/sponsored-products/2-0/openapi#/Reports) and the [Sponsored Brands reports API](https://advertising.amazon.com/API/docs/en-us/sponsored-brands/2/reports) depends on the size of the report generation queue. These limits are determined on a per-region basis with multiple limit tiers that correspond to the current reporting queue size in a given region. Higher utilization periods with higher report queue sizes result in lower rate limits. Lower utilization periods result in higher rate limits with a higher number of accepted report generation requests.
"""

How

Upon changing the REPORT_WAIT_TIMEOUT to 120 minutes, the connector waits for more time for getting the data rather than giving up after 30 minutes. For this, we have made modifications in report_stream. Also, for maximum retries, we have also increased max_tried for _init_and_try_read_records for the same reason

Recommended reading order

  1. report_stream.python

🚨 User Impact 🚨

No Breaking Changes

@CLAassistant
Copy link

CLAassistant commented Feb 21, 2022

CLA assistant check
All committers have signed the CLA.

@github-actions github-actions bot added area/connectors Connector related issues area/documentation Improvements or additions to documentation labels Feb 21, 2022
@aakashkumarmaple aakashkumarmaple marked this pull request as draft February 21, 2022 18:13
@aakashkumarmaple
Copy link
Contributor Author

Hi, @harshithmullapudi,
There are the changes for increasing REPORT_WAIT_TIMEOUT. Can you please check this out?

@aakashkumarmaple aakashkumarmaple marked this pull request as ready for review February 21, 2022 19:48
@@ -153,7 +153,7 @@ def read_records(
@backoff.on_exception(
backoff.expo,
ReportGenerationFailure,
max_tries=5,
max_tries=1000,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These value increasing is not ideal (5 to 1000 and 30 seconds to 120 seconds). It's possible to find an optimal way to handle those situations? @misteryeo this is the case where we should add options to users have a better control in connector config?

Copy link
Contributor Author

@aakashkumarmaple aakashkumarmaple Feb 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have added the timeout to 120 minutes and 1000 iterations because sometimes for our use case it takes more than 30 minutes (To be on the save side, we kept it 120 minutes). The same thing goes for max_tries to 1000.

It will be better if we can give these values user-defined @marcosmarxm.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @marcosmarxm @harshithmullapudi ,
We have added an optional parameter for the user to enter report_wait_timeout. By default, it's 30 minutes, but if the user enters any time greater than 30 minutes, then that timing will be used.
Regarding max_tries, the majority of our time, we get reports at around 1000 retries. Thus we proposed to increase the value by 1000.
Since we are currently using this connector in our production, we request you to kindly merge this PR as soon as possible.

@harshithmullapudi
Copy link
Contributor

Hey can you run

  1. ./gradlew format
  2. Run integration tests local and share the screenshot here

Also that max retries can we move that into configuration rather than hardcoding it. For companies which are having less customers it would be consuming too many API requests

@marcosmarxm
Copy link
Member

Hey can you run

  1. ./gradlew format
  2. Run integration tests local and share the screenshot here

Also that max retries can we move that into configuration rather than hardcoding it. For companies which are having less customers it would be consuming too many API requests

For this type of UX change in spec I recommend to have a talk with @misteryeo and validate with connector team the best sollution.

@aakashkumarmaple
Copy link
Contributor Author

Hi @misteryeo,
Can we have a discussion on the spec file changes for this PR? Let's get into a discussion and resolve this issue wherever you are free.

@misteryeo
Copy link
Contributor

Hi folks, sorry for the delay here.

Agreed with @harshithmullapudi and @marcosmarxm that we don't want to be hard coding these values as not everyone requires such high values. Let's move forward with adding both as options for users to adjust but have the original values as default.

Copy link
Member

@marcosmarxm marcosmarxm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few comments! Almost there @aakashkumarmaple !!!

LABEL io.airbyte.version=0.1.3
LABEL io.airbyte.name=airbyte/source-amazon-ads
LABEL io.airbyte.version=0.1.4
LABEL io.airbyte.name=airbyte/source-amazon-ads
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

eof

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @marcosmarxm , can you elaborate.

@marcosmarxm
Copy link
Member

@aakashkumarmaple do you want to continue this work?

@aakashkumarmaple
Copy link
Contributor Author

Hi @marcosmarxm, sorry for the delay. I will make the necessary changes and update you asap

@harshithmullapudi
Copy link
Contributor

@marcosmarxm you want me to take over this?

@marcosmarxm
Copy link
Member

@marcosmarxm you want me to take over this?

Let me review it @harshithmullapudi

@marcosmarxm marcosmarxm changed the title Changes for increasing timeout for long taking report generation Amazon Ads: changes for increasing timeout for long taking report generation Mar 28, 2022
Copy link
Member

@marcosmarxm marcosmarxm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @aakashkumarmaple I'll publish the new version tomorrow.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/connectors Connector related issues area/documentation Improvements or additions to documentation blocked community
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants