Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

🎉 Source Facebook Marketing: improve sleeps time in rate limit handler #10698

Conversation

vladimir-remar
Copy link
Contributor

What

In our Facebook Marketing connection when we want to do a full refresh sync, this probably will take more than 1 day , the connector reach the rate limite often and it seems the default sleep time it is not enough.

I attached a image when the connector reaches the 100%
Screenshot from 2022-02-28 12-07-28

How

Set sleeps time to 1min between 90%-95% of usage and set to 5min when the usage is greater than or equal 95%

🚨 User Impact 🚨

Increase of time in connection runs, mostly in a full refresh sync mode, due to this new approach.

Pre-merge Checklist

Expand the relevant checklist and delete the others.

Updating a connector

Community member or Airbyter

  • Grant edit access to maintainers (instructions)
  • Secrets in the connector's spec are annotated with airbyte_secret
  • Unit & integration tests added and passing. Community members, please provide proof of success locally e.g: screenshot or copy-paste unit, integration, and acceptance test output. To run acceptance tests for a Python connector, follow instructions in the README. For java connectors run ./gradlew :airbyte-integrations:connectors:<name>:integrationTest.
  • Code reviews completed
  • Documentation updated
    • Connector's README.md
    • Connector's bootstrap.md. See description and examples
    • Changelog updated in docs/integrations/<source or destination>/<name>.md including changelog. See changelog example
  • PR name follows PR naming conventions

Airbyter

If this is a community PR, the Airbyte engineer reviewing this PR is responsible for the below items.

  • Create a non-forked branch based on this PR and test the below items on it
  • Build is successful
  • Credentials added to Github CI. Instructions.
  • /test connector=connectors/<name> command is passing
  • New Connector version released on Dockerhub by running the /publish command described here
  • After the new connector version is published, connector version bumped in the seed directory as described here
  • Seed specs have been re-generated by building the platform and committing the changes to the seed spec files, as described here

Tests

Unit

Put your unit tests output here.

Integration

Put your integration tests output here.

Acceptance

Put your acceptance tests output here.

@github-actions github-actions bot added the area/connectors Connector related issues label Feb 28, 2022
@marcosmarxm
Copy link
Member

@vladimir-remar it's possible to run a sync comparing the current method and the method proposed by you? Or, there is any docs in Facebook maybe explaining about the best approach to do rate limit waiting?

@vladimir-remar
Copy link
Contributor Author

vladimir-remar commented Mar 2, 2022

@vladimir-remar it's possible to run a sync comparing the current method and the method proposed by you? Or, there is any docs in Facebook maybe explaining about the best approach to do rate limit waiting?

Hello @marcosmarxm Thanks for the answer. We did a sync with (ads_insights, ads_insights_country and ads_insights_platform_and_device), I attached the log, as you will see, with my suggestion the sync will keeps around 90%-94% of usage pausing 1min between requests until usage is low. This increase the success chances otherwise it will fail.
About the Facebook documentation, there is not much to do, you guys already did most of the work with rate liminting.

logs-17365.txt

@alafanechere alafanechere self-assigned this Mar 4, 2022
Copy link
Contributor

@alafanechere alafanechere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @vladimir-remar, thank you for this contribution.
I made several suggestions 😄, let me know what you think!

Could you please also:

  • bump the connector version in the dockerfile and in airbyte-config/init/src/main/resources/seed/source_definitions.yaml
  • update the changelog in docs/integrations/sources/facebook-marketing.md

Comment on lines 33 to 36
call_rate_threshold = 95 # maximum percentage of call limit utilization
min_call_rate_threslold = 90 # minimum percentage of call limit utilization
pause_interval_minimum = pendulum.duration(minutes=1) # default pause interval if reached or close to call rate limit
pause_interval_minimum_at_max_usage = pendulum.duration(minutes=5)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
call_rate_threshold = 95 # maximum percentage of call limit utilization
min_call_rate_threslold = 90 # minimum percentage of call limit utilization
pause_interval_minimum = pendulum.duration(minutes=1) # default pause interval if reached or close to call rate limit
pause_interval_minimum_at_max_usage = pendulum.duration(minutes=5)
MAX_RATE, MAX_PAUSE_INTERVAL = (95, pendulum.duration(minutes=5))
MIN_RATE, MAX_PAUSE_INTERVAL = (90, pendulum.duration(minutes=1)

@@ -104,17 +112,22 @@ def handle_call_rate_limit(self, response, params):
max_usage = max(max_usage, usage)
max_pause_interval = max(max_pause_interval, pause_interval)

if max_usage > self.call_rate_threshold:
if max_usage >= self.call_rate_threshold:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if max_usage >= self.call_rate_threshold:
max_rate, max_sleep_duration = self. min_max_rate_sleep_time["max"]
min_rate, min_sleep_duration = self. min_max_rate_sleep_time["min"]
if max_usage >= max_rate:

@@ -87,6 +89,12 @@ def _parse_call_rate_header(headers):

return usage, pause_interval

@staticmethod
def sleep_time_handler(usage, pause_interval: pendulum.duration):

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the definition of the custom pause interval according to usage should be determined in this function and not in handle_call_rate_limit.
I'd rename it to compute_pause_interval and make it return the total seconds to sleep. Then I'd call sleep with this value from handle_call_rate_limit. I think it would greatly improve the readability and maintainability of this logic.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alafanechere Thanks, I did some changes, tell me what you think.

@github-actions github-actions bot added the area/documentation Improvements or additions to documentation label Mar 7, 2022
@vladimir-remar
Copy link
Contributor Author

Hello @alafanechere let me know if there is something I can do to improve this PR, this change would be a big step for us.

@alafanechere
Copy link
Contributor

/test connector=connectors/source-facebook-marketing

Copy link
Contributor

@alafanechere alafanechere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @vladimir-remar for the changes, I still have an important suggestion. Could you please also bump the version in the Dockerfile?

@alafanechere
Copy link
Contributor

@vladimir-remar could you please fix the conflicts and share the output of your acceptance tests:
./gradlew :airbyte-integrations:connectors:source-facebook-marketing:integrationTest

@vladimir-remar
Copy link
Contributor Author

@vladimir-remar could you please fix the conflicts and share the output of your acceptance tests: ./gradlew :airbyte-integrations:connectors:source-facebook-marketing:integrationTest

Hello @alafanechere the problem here is the quota from the credentials that I have is exhauted, but even with quota I can't inccour in a full refresh run. I'm open to suggestions.

Copy link
Member

@marcosmarxm marcosmarxm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

small fix and it's ready to merge!

Copy link
Contributor

@alafanechere alafanechere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the changes you made @vladimir-remar. I'm going to run the acceptance tests and merge if they pass.

Comment on lines 97 to 100
usage = 0
pause_interval = self.MIN_PAUSE_INTERVAL

if "batch" in params:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:

Suggested change
usage = 0
pause_interval = self.MIN_PAUSE_INTERVAL
if "batch" in params:
if "batch" in params:
usage = 0
pause_interval = self.MIN_PAUSE_INTERVAL

@alafanechere
Copy link
Contributor

alafanechere commented Mar 16, 2022

/test connector=connectors/source-facebook-marketing

🕑 connectors/source-facebook-marketing https://github.com/airbytehq/airbyte/actions/runs/1993130973
✅ connectors/source-facebook-marketing https://github.com/airbytehq/airbyte/actions/runs/1993130973
Python tests coverage:

Name                                                 Stmts   Miss  Cover
------------------------------------------------------------------------
source_acceptance_test/utils/__init__.py                 6      0   100%
source_acceptance_test/tests/__init__.py                 4      0   100%
source_acceptance_test/__init__.py                       2      0   100%
source_acceptance_test/tests/test_full_refresh.py       52      2    96%
source_acceptance_test/utils/asserts.py                 37      2    95%
source_acceptance_test/config.py                        74      6    92%
source_acceptance_test/utils/json_schema_helper.py     105     13    88%
source_acceptance_test/utils/common.py                  70     17    76%
source_acceptance_test/utils/compare.py                 62     23    63%
source_acceptance_test/tests/test_core.py              275    106    61%
source_acceptance_test/base.py                          10      4    60%
source_acceptance_test/utils/connector_runner.py       110     48    56%
source_acceptance_test/tests/test_incremental.py        69     38    45%
------------------------------------------------------------------------
TOTAL                                                  876    259    70%
Name                                                        Stmts   Miss  Cover
-------------------------------------------------------------------------------
source_facebook_marketing/streams/__init__.py                   2      0   100%
source_facebook_marketing/spec.py                              33      0   100%
source_facebook_marketing/__init__.py                           2      0   100%
source_facebook_marketing/api.py                               97     13    87%
source_facebook_marketing/streams/base_streams.py             127     27    79%
source_facebook_marketing/streams/common.py                    41     13    68%
source_facebook_marketing/streams/streams.py                   97     32    67%
source_facebook_marketing/source.py                            35     13    63%
source_facebook_marketing/streams/base_insight_streams.py     129     54    58%
source_facebook_marketing/streams/async_job.py                204    128    37%
source_facebook_marketing/streams/async_job_manager.py         74     56    24%
-------------------------------------------------------------------------------
TOTAL                                                         841    336    60%
Name                                                        Stmts   Miss  Cover
-------------------------------------------------------------------------------
source_facebook_marketing/streams/async_job.py                204      0   100%
source_facebook_marketing/streams/__init__.py                   2      0   100%
source_facebook_marketing/spec.py                              33      0   100%
source_facebook_marketing/source.py                            35      0   100%
source_facebook_marketing/__init__.py                           2      0   100%
source_facebook_marketing/streams/common.py                    41      1    98%
source_facebook_marketing/streams/async_job_manager.py         74      3    96%
source_facebook_marketing/streams/base_insight_streams.py     129     13    90%
source_facebook_marketing/api.py                               97     16    84%
source_facebook_marketing/streams/streams.py                   97     22    77%
source_facebook_marketing/streams/base_streams.py             127     30    76%
-------------------------------------------------------------------------------
TOTAL                                                         841     85    90%

@alafanechere
Copy link
Contributor

I'm requesting reviews from @sherifnada and @misteryeo because this connector is targeted for GA.

@marcosmarxm
Copy link
Member

just a comment, this change solves the problem. I saw some clients getting their sync finished because of 100% quota limit and not respecting the correct backoff time. The only drawback is: increase in time to finish the sync.

thanks for this contribution @vladimir-remar

Copy link
Contributor

@sherifnada sherifnada left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vladimir-remar could you add unit tests for this functionality please?

Copy link
Contributor

@misteryeo misteryeo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@alafanechere alafanechere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @vladimir-remar I wrote some unit tests for the functions you implemented.
Feel free to review these tests and add more cases if needed.
I have a remaining question about the _compute_pause_interval function that I added in the review comment.

"""The sleep time will be calculated based on usage consumed."""
if usage >= self.MAX_RATE:
return max(self.MAX_PAUSE_INTERVAL, pause_interval)
return self.MIN_PAUSE_INTERVAL
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return self.MIN_PAUSE_INTERVAL
return max(self.MIN_PAUSE_INTERVAL, pause_interval)

If the usage is lower than the max rate but the pause interval is higher than the default minimum one: don't we want to be conservative and pause for the time the API requests?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the usage is lower than the max rate but the pause interval is higher than the default minimum one: don't we want to be conservative and pause for the time the API requests?

for sure, the idea is to pause the requests taking into account the time interval that the response returns.

@alafanechere
Copy link
Contributor

alafanechere commented Mar 21, 2022

/test connector=connectors/source-facebook-marketing

🕑 connectors/source-facebook-marketing https://github.com/airbytehq/airbyte/actions/runs/2015949287
✅ connectors/source-facebook-marketing https://github.com/airbytehq/airbyte/actions/runs/2015949287
Python tests coverage:

Name                                                 Stmts   Miss  Cover
------------------------------------------------------------------------
source_acceptance_test/utils/__init__.py                 6      0   100%
source_acceptance_test/tests/__init__.py                 4      0   100%
source_acceptance_test/__init__.py                       2      0   100%
source_acceptance_test/tests/test_full_refresh.py       52      2    96%
source_acceptance_test/utils/asserts.py                 37      2    95%
source_acceptance_test/config.py                        74      6    92%
source_acceptance_test/utils/json_schema_helper.py     105     13    88%
source_acceptance_test/utils/common.py                  70     17    76%
source_acceptance_test/utils/compare.py                 62     23    63%
source_acceptance_test/tests/test_core.py              275    106    61%
source_acceptance_test/base.py                          10      4    60%
source_acceptance_test/utils/connector_runner.py       110     48    56%
source_acceptance_test/tests/test_incremental.py        69     38    45%
------------------------------------------------------------------------
TOTAL                                                  876    259    70%
Name                                                        Stmts   Miss  Cover
-------------------------------------------------------------------------------
source_facebook_marketing/streams/__init__.py                   2      0   100%
source_facebook_marketing/spec.py                              33      0   100%
source_facebook_marketing/__init__.py                           2      0   100%
source_facebook_marketing/api.py                              101     13    87%
source_facebook_marketing/streams/base_streams.py             127     27    79%
source_facebook_marketing/streams/common.py                    41     13    68%
source_facebook_marketing/streams/streams.py                   97     32    67%
source_facebook_marketing/source.py                            35     13    63%
source_facebook_marketing/streams/base_insight_streams.py     129     54    58%
source_facebook_marketing/streams/async_job.py                204    128    37%
source_facebook_marketing/streams/async_job_manager.py         74     56    24%
-------------------------------------------------------------------------------
TOTAL                                                         845    336    60%
Name                                                        Stmts   Miss  Cover
-------------------------------------------------------------------------------
source_facebook_marketing/streams/async_job.py                204      0   100%
source_facebook_marketing/streams/__init__.py                   2      0   100%
source_facebook_marketing/spec.py                              33      0   100%
source_facebook_marketing/source.py                            35      0   100%
source_facebook_marketing/__init__.py                           2      0   100%
source_facebook_marketing/streams/common.py                    41      1    98%
source_facebook_marketing/streams/async_job_manager.py         74      3    96%
source_facebook_marketing/api.py                              101     10    90%
source_facebook_marketing/streams/base_insight_streams.py     129     13    90%
source_facebook_marketing/streams/streams.py                   97     22    77%
source_facebook_marketing/streams/base_streams.py             127     30    76%
-------------------------------------------------------------------------------
TOTAL                                                         845     79    91%

@vladimir-remar
Copy link
Contributor Author

Hey @vladimir-remar I wrote some unit tests for the functions you implemented. Feel free to review these tests and add more cases if needed. I have a remaining question about the _compute_pause_interval function that I added in the review comment.

@alafanechere thanks for add the unitests

@vladimir-remar
Copy link
Contributor Author

vladimir-remar commented Mar 21, 2022

@vladimir-remar could you add unit tests for this functionality please?

@sherifnada thanks to @alafanechere who already did.

@alafanechere
Copy link
Contributor

alafanechere commented Mar 21, 2022

/test connector=connectors/source-facebook-marketing

🕑 connectors/source-facebook-marketing https://github.com/airbytehq/airbyte/actions/runs/2018132032
✅ connectors/source-facebook-marketing https://github.com/airbytehq/airbyte/actions/runs/2018132032
Python tests coverage:

Name                                                 Stmts   Miss  Cover
------------------------------------------------------------------------
source_acceptance_test/utils/__init__.py                 6      0   100%
source_acceptance_test/tests/__init__.py                 4      0   100%
source_acceptance_test/__init__.py                       2      0   100%
source_acceptance_test/tests/test_full_refresh.py       52      2    96%
source_acceptance_test/utils/asserts.py                 37      2    95%
source_acceptance_test/config.py                        74      6    92%
source_acceptance_test/utils/json_schema_helper.py     105     13    88%
source_acceptance_test/utils/common.py                  70     17    76%
source_acceptance_test/utils/compare.py                 62     23    63%
source_acceptance_test/tests/test_core.py              275    106    61%
source_acceptance_test/base.py                          10      4    60%
source_acceptance_test/utils/connector_runner.py       110     48    56%
source_acceptance_test/tests/test_incremental.py        69     38    45%
------------------------------------------------------------------------
TOTAL                                                  876    259    70%
Name                                                        Stmts   Miss  Cover
-------------------------------------------------------------------------------
source_facebook_marketing/streams/__init__.py                   2      0   100%
source_facebook_marketing/spec.py                              33      0   100%
source_facebook_marketing/__init__.py                           2      0   100%
source_facebook_marketing/api.py                              101     13    87%
source_facebook_marketing/streams/base_streams.py             127     27    79%
source_facebook_marketing/streams/common.py                    41     13    68%
source_facebook_marketing/streams/streams.py                   97     32    67%
source_facebook_marketing/source.py                            35     13    63%
source_facebook_marketing/streams/base_insight_streams.py     129     54    58%
source_facebook_marketing/streams/async_job.py                204    128    37%
source_facebook_marketing/streams/async_job_manager.py         74     56    24%
-------------------------------------------------------------------------------
TOTAL                                                         845    336    60%
Name                                                        Stmts   Miss  Cover
-------------------------------------------------------------------------------
source_facebook_marketing/streams/async_job.py                204      0   100%
source_facebook_marketing/streams/__init__.py                   2      0   100%
source_facebook_marketing/spec.py                              33      0   100%
source_facebook_marketing/source.py                            35      0   100%
source_facebook_marketing/__init__.py                           2      0   100%
source_facebook_marketing/streams/common.py                    41      1    98%
source_facebook_marketing/streams/async_job_manager.py         74      3    96%
source_facebook_marketing/api.py                              101     10    90%
source_facebook_marketing/streams/base_insight_streams.py     129     13    90%
source_facebook_marketing/streams/streams.py                   97     22    77%
source_facebook_marketing/streams/base_streams.py             127     30    76%
-------------------------------------------------------------------------------
TOTAL                                                         845     79    91%

Copy link
Contributor

@alafanechere alafanechere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your quick reply @vladimir-remar ! I'd love a final review from our connector team.

@alafanechere
Copy link
Contributor

alafanechere commented Mar 25, 2022

/publish connector=connectors/source-facebook-marketing

🕑 connectors/source-facebook-marketing https://github.com/airbytehq/airbyte/actions/runs/2038827353
✅ connectors/source-facebook-marketing https://github.com/airbytehq/airbyte/actions/runs/2038827353

@alafanechere alafanechere merged commit c51ea6e into airbytehq:master Mar 25, 2022
@alafanechere
Copy link
Contributor

Thank you for your contribution and patience @vladimir-remar !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/connectors Connector related issues area/documentation Improvements or additions to documentation community
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants