Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Source Facebook: rate limit not always handled #3525

Closed
sherifnada opened this issue May 21, 2021 · 18 comments · Fixed by #3600 or #3743
Closed

Source Facebook: rate limit not always handled #3525

sherifnada opened this issue May 21, 2021 · 18 comments · Fixed by #3600 or #3743

Comments

@sherifnada
Copy link
Contributor

sherifnada commented May 21, 2021

Description

A user reported the FB marketing source was not handling rate limiting gracefully and with no apparent sign of retry. Attached is the log:
logs-290-0.txt

Acceptance Criteria

  1. Handle the rate limit
  2. Add a unit test which mocks a rate limited response to verify it works correctly

┆Issue is synchronized with this Asana task by Unito

@sherifnada sherifnada added type/bug Something isn't working area/connectors Connector related issues labels May 21, 2021
@sherifnada sherifnada added this to the Connectors May 28th, 2021 milestone May 21, 2021
@sherifnada sherifnada added the priority/high High priority label May 21, 2021
@keu keu self-assigned this May 21, 2021
@keu keu closed this as completed in #3600 May 26, 2021
keu added a commit that referenced this issue May 26, 2021
* fix backoff of batch requests

* fix backoff of batch requests

* fix tests

* fix tests

* fix requirements

* fix tests

* fix tests

* sort codes

* format

* fix setup

* fix integration tests

* bump version

* fix integration tests

Co-authored-by: Eugene Kulak <kulak.eugene@gmail.com>
@zestyping
Copy link
Contributor

@keu @sherifnada This worked previously but today I'm getting this:

2021-05-27 16:12:27 ERROR (/tmp/workspace/449/0) LineGobbler(voidCall):69 - File "/airbyte/integration_code/source_facebook_marketing/client/common.py", line 84, in should_retry_api_error
2021-05-27 16:12:27 ERROR (/tmp/workspace/449/0) LineGobbler(voidCall):69 - return handle_call_rate_response(exc)
2021-05-27 16:12:27 ERROR (/tmp/workspace/449/0) LineGobbler(voidCall):69 - File "/airbyte/integration_code/source_facebook_marketing/client/common.py", line 56, in handle_call_rate_response
2021-05-27 16:12:27 ERROR (/tmp/workspace/449/0) LineGobbler(voidCall):69 - platform_header = exc.http_headers().get("x-app-usage") or exc.http_headers().get("x-ad-account-usage")
2021-05-27 16:12:27 ERROR (/tmp/workspace/449/0) LineGobbler(voidCall):69 - AttributeError: 'list' object has no attribute 'get'

Did something change? It looks like the http_headers method isn't part of the documented interface (https://developers.facebook.com/docs/reference/android/current/class/FacebookRequestError/) so perhaps Facebook made a breaking change to it.

@zestyping
Copy link
Contributor

Also, when fetching adaccounts, error #80004 is getting triggered but it isn't getting caught. Haven't quite figured out why, yet — I've confirmed the correct version of the code is running.

2021-05-27 16:17:07 ERROR (/tmp/workspace/449/1) LineGobbler(voidCall):69 - File "/airbyte/integration_code/source_facebook_marketing/client/api.py", line 210, in _get_creatives
2021-05-27 16:17:07 ERROR (/tmp/workspace/449/1) LineGobbler(voidCall):69 - return self._api.account.get_ad_creatives(params=params)
2021-05-27 16:17:07 ERROR (/tmp/workspace/449/1) LineGobbler(voidCall):69 - File "/usr/local/lib/python3.7/site-packages/cached_property.py", line 36, in __get__
2021-05-27 16:17:07 ERROR (/tmp/workspace/449/1) LineGobbler(voidCall):69 - value = obj.__dict__[self.func.__name__] = self.func(obj)
2021-05-27 16:17:07 ERROR (/tmp/workspace/449/1) LineGobbler(voidCall):69 - File "/airbyte/integration_code/source_facebook_marketing/client/client.py", line 96, in account
2021-05-27 16:17:07 ERROR (/tmp/workspace/449/1) LineGobbler(voidCall):69 - return self._find_account(self._account_id)
2021-05-27 16:17:07 ERROR (/tmp/workspace/449/1) LineGobbler(voidCall):69 - File "/airbyte/integration_code/source_facebook_marketing/client/client.py", line 106, in _find_account
2021-05-27 16:17:07 ERROR (/tmp/workspace/449/1) LineGobbler(voidCall):69 - raise FacebookAPIException(f"Error: {exc.api_error_code()}, {exc.api_error_message()}") from exc
2021-05-27 16:17:07 ERROR (/tmp/workspace/449/1) LineGobbler(voidCall):69 - source_facebook_marketing.client.common.FacebookAPIException: Error: 80004, (#80004) There have been too many calls to this ad-account. Wait a bit and try again. For more info, please refer to https://developers.facebook.com/docs/graph-api/overview/rate-limiting.

@keu
Copy link
Contributor

keu commented May 28, 2021

Hi @zestyping,

  1. This looks like a FB bug, do you have full logs?
  2. This was intended to not catch and sleep in find_account method because we do this during connection check (to verify credentials) and at the very beginning of the sync process. But now I'm not sure if that's the right decision.

@zestyping
Copy link
Contributor

Here's the full log with both errors.
434.log

Note at the beginning it says it's running airbyte/source-facebook-marketing:0.2.7-tfc. This is a locally built version of the source container that I created by running docker build . -t airbyte/source-facebook-marketing:0.2.7-tfc at the tip of my branch, 4d19cb2c.

@zestyping
Copy link
Contributor

Let me know if you'd like me to export the Docker image of 0.2.7-tfc for you. I'm pretty sure that my changes didn't cause the error because I only changed the schema JSON files, but I'm happy to send you the image to help you replicate the exact conditions.

@zestyping
Copy link
Contributor

zestyping commented May 28, 2021

Interesting, I'm getting a new kind of error now: "did not start after 512 seconds". It's persistent after many retries. Here's the log:

2021-05-28 17:00:05 INFO (/tmp/workspace/491/2) WorkerRun(call):62 - Executing worker wrapper. Airbyte version: 0.24.2-alpha
2021-05-28 17:00:05 INFO (/tmp/workspace/491/2) TemporalAttemptExecution(get):111 - Executing worker wrapper. Airbyte version: 0.24.2-alpha
2021-05-28 17:00:05 INFO (/tmp/workspace/491/2) DefaultReplicationWorker(run):97 - start sync worker. job id: 491 attempt id: 2
2021-05-28 17:00:05 INFO (/tmp/workspace/491/2) DefaultReplicationWorker(run):106 - configured sync modes: {null.ads_insights_platform_and_device=full_refresh - append, null.adsets=full_refresh - append, null.adcreatives=full_refresh - append, null.ads=full_refresh - append, null.campaigns=full_refresh - append, null.ads_insights_age_and_gender=full_refresh - append}
2021-05-28 17:00:05 INFO (/tmp/workspace/491/2) DefaultAirbyteDestination(start):81 - Running destination...
2021-05-28 17:00:05 INFO (/tmp/workspace/491/2) LineGobbler(voidCall):69 - Checking if airbyte/destination-bigquery:0.3.2 exists...
2021-05-28 17:00:05 INFO (/tmp/workspace/491/2) LineGobbler(voidCall):69 - airbyte/destination-bigquery:0.3.2 was found locally.
2021-05-28 17:00:05 INFO (/tmp/workspace/491/2) DockerProcessBuilderFactory(create):111 - Preparing command: docker run --rm --init -i -v airbyte_workspace:/data -v /tmp/airbyte_local:/local -w /data/491/2 --network host airbyte/destination-bigquery:0.3.2 write --config destination_config.json --catalog destination_catalog.json
2021-05-28 17:00:05 INFO (/tmp/workspace/491/2) LineGobbler(voidCall):69 - Checking if airbyte/source-facebook-marketing:0.2.7-tfc5 exists...
2021-05-28 17:00:05 INFO (/tmp/workspace/491/2) LineGobbler(voidCall):69 - airbyte/source-facebook-marketing:0.2.7-tfc5 was found locally.
2021-05-28 17:00:05 INFO (/tmp/workspace/491/2) DockerProcessBuilderFactory(create):111 - Preparing command: docker run --rm --init -i -v airbyte_workspace:/data -v /tmp/airbyte_local:/local -w /data/491/2 --network host airbyte/source-facebook-marketing:0.2.7-tfc5 read --config source_config.json --catalog source_catalog.json --state input_state.json
2021-05-28 17:00:05 INFO (/tmp/workspace/491/2) DefaultReplicationWorker(run):132 - Waiting for source thread to join.
2021-05-28 17:12:57 ERROR (/tmp/workspace/491/2) LineGobbler(voidCall):69 - Traceback (most recent call last):
2021-05-28 17:12:57 ERROR (/tmp/workspace/491/2) LineGobbler(voidCall):69 - File "/airbyte/integration_code/main.py", line 33, in <module>
2021-05-28 17:12:57 ERROR (/tmp/workspace/491/2) LineGobbler(voidCall):69 - launch(source, sys.argv[1:])
2021-05-28 17:12:57 ERROR (/tmp/workspace/491/2) LineGobbler(voidCall):69 - File "/usr/local/lib/python3.7/site-packages/airbyte_cdk/entrypoint.py", line 117, in launch
2021-05-28 17:12:57 ERROR (/tmp/workspace/491/2) LineGobbler(voidCall):69 - for message in source_entrypoint.run(parsed_args):
2021-05-28 17:12:57 ERROR (/tmp/workspace/491/2) LineGobbler(voidCall):69 - File "/usr/local/lib/python3.7/site-packages/airbyte_cdk/entrypoint.py", line 108, in run
2021-05-28 17:12:57 ERROR (/tmp/workspace/491/2) LineGobbler(voidCall):69 - for message in generator:
2021-05-28 17:12:57 ERROR (/tmp/workspace/491/2) LineGobbler(voidCall):69 - File "/usr/local/lib/python3.7/site-packages/airbyte_cdk/sources/deprecated/base_source.py", line 87, in read
2021-05-28 17:12:57 ERROR (/tmp/workspace/491/2) LineGobbler(voidCall):69 - yield from self._read_stream(logger=logger, client=client, configured_stream=configured_stream, state=total_state)
2021-05-28 17:12:57 ERROR (/tmp/workspace/491/2) LineGobbler(voidCall):69 - File "/usr/local/lib/python3.7/site-packages/airbyte_cdk/sources/deprecated/base_source.py", line 106, in _read_stream
2021-05-28 17:12:57 ERROR (/tmp/workspace/491/2) LineGobbler(voidCall):69 - for record in client.read_stream(configured_stream.stream):
2021-05-28 17:12:57 ERROR (/tmp/workspace/491/2) LineGobbler(voidCall):69 - File "/usr/local/lib/python3.7/site-packages/airbyte_cdk/sources/deprecated/client.py", line 90, in read_stream
2021-05-28 17:12:57 ERROR (/tmp/workspace/491/2) LineGobbler(voidCall):69 - for message in method(fields=fields):
2021-05-28 17:12:57 ERROR (/tmp/workspace/491/2) LineGobbler(voidCall):69 - File "/airbyte/integration_code/source_facebook_marketing/client/api.py", line 348, in list
2021-05-28 17:12:57 ERROR (/tmp/workspace/491/2) LineGobbler(voidCall):69 - job = self._run_job_until_completion(params)
2021-05-28 17:12:57 ERROR (/tmp/workspace/491/2) LineGobbler(voidCall):69 - File "/usr/local/lib/python3.7/site-packages/backoff/_sync.py", line 94, in retry
2021-05-28 17:12:57 ERROR (/tmp/workspace/491/2) LineGobbler(voidCall):69 - ret = target(*args, **kwargs)
2021-05-28 17:12:57 ERROR (/tmp/workspace/491/2) LineGobbler(voidCall):69 - File "/airbyte/integration_code/source_facebook_marketing/client/api.py", line 369, in _run_job_until_completion
2021-05-28 17:12:57 ERROR (/tmp/workspace/491/2) LineGobbler(voidCall):69 - f"AdReportRun {job} did not start after {runtime.in_seconds()} seconds. This is an intermittent error which may be fixed by retrying the job. Aborting."
2021-05-28 17:12:57 ERROR (/tmp/workspace/491/2) LineGobbler(voidCall):69 - source_facebook_marketing.client.common.JobTimeoutException: AdReportRun <AdReportRun> {
2021-05-28 17:12:57 ERROR (/tmp/workspace/491/2) LineGobbler(voidCall):69 - "account_id": "1211960785929066",
2021-05-28 17:12:57 ERROR (/tmp/workspace/491/2) LineGobbler(voidCall):69 - "async_percent_completion": 0,
2021-05-28 17:12:57 ERROR (/tmp/workspace/491/2) LineGobbler(voidCall):69 - "async_status": "Job Failed",
2021-05-28 17:12:57 ERROR (/tmp/workspace/491/2) LineGobbler(voidCall):69 - "date_start": "2021-05-12",
2021-05-28 17:12:57 ERROR (/tmp/workspace/491/2) LineGobbler(voidCall):69 - "date_stop": "2021-05-12",
2021-05-28 17:12:57 ERROR (/tmp/workspace/491/2) LineGobbler(voidCall):69 - "id": "988953621845691",
2021-05-28 17:12:57 ERROR (/tmp/workspace/491/2) LineGobbler(voidCall):69 - "is_running": true,
2021-05-28 17:12:57 ERROR (/tmp/workspace/491/2) LineGobbler(voidCall):69 - "report_run_id": "988953621845691",
2021-05-28 17:12:57 ERROR (/tmp/workspace/491/2) LineGobbler(voidCall):69 - "time_ref": 1622221464
2021-05-28 17:12:57 ERROR (/tmp/workspace/491/2) LineGobbler(voidCall):69 - } did not start after 512 seconds. This is an intermittent error which may be fixed by retrying the job. Aborting.
2021-05-28 17:12:57 INFO (/tmp/workspace/491/2) DefaultReplicationWorker(run):134 - Source thread complete.
2021-05-28 17:12:57 INFO (/tmp/workspace/491/2) DefaultReplicationWorker(run):135 - Waiting for destination thread to join.
2021-05-28 17:13:09 INFO (/tmp/workspace/491/2) DefaultReplicationWorker(run):137 - Destination thread complete.
2021-05-28 17:13:09 ERROR (/tmp/workspace/491/2) DefaultReplicationWorker(run):141 - Sync worker failed.
io.airbyte.workers.WorkerException: Source process wasn't successful
at io.airbyte.workers.protocols.airbyte.DefaultAirbyteSource.close(DefaultAirbyteSource.java:130) ~[io.airbyte-airbyte-workers-0.24.2-alpha.jar:?]
at io.airbyte.workers.DefaultReplicationWorker.run(DefaultReplicationWorker.java:139) [io.airbyte-airbyte-workers-0.24.2-alpha.jar:?]
at io.airbyte.workers.DefaultReplicationWorker.run(DefaultReplicationWorker.java:49) [io.airbyte-airbyte-workers-0.24.2-alpha.jar:?]
at io.airbyte.workers.temporal.TemporalAttemptExecution.lambda$getWorkerThread$2(TemporalAttemptExecution.java:147) [io.airbyte-airbyte-workers-0.24.2-alpha.jar:?]
at java.lang.Thread.run(Thread.java:832) [?:?]

@zestyping
Copy link
Contributor

Strangely, when I do the HTTP requests myself, everything runs fine and finishes in less than a minute.

Here's my script fetching the daily ad metrics without any breakdowns:

love[1856] ./query_insights -t ads 1211960785929066 | ./fetch_results
Account: 1211960785929066
2021-05-28T20:57:36: [utilization: 0%]
1211960785929066: querying 2021-05-01 => report ID 935229373921228
935229373921228: Job Not Started
935229373921228: Job Completed
Fetching records.
=> 1211960785929066.2021-05-01.jsonl (0 objects)

2021-05-28T20:57:46: [utilization: 7%]
1211960785929066: querying 2021-05-08 => report ID 959682144853586
959682144853586: Job Not Started
959682144853586: Job Completed
Fetching records..
=> 1211960785929066.2021-05-08.jsonl (42 objects)

2021-05-28T20:57:57: [utilization: 15%]
1211960785929066: querying 2021-05-15 => report ID 308583194224035
308583194224035: Job Not Started
308583194224035: Job Completed
Fetching records..
=> 1211960785929066.2021-05-15.jsonl (42 objects)

2021-05-28T20:58:07: [utilization: 23%]
1211960785929066: querying 2021-05-22 => report ID 1133641807118469
1133641807118469: Job Not Started
1133641807118469: Job Completed
Fetching records.
=> 1211960785929066.2021-05-22.jsonl (0 objects)

With age and gender breakdowns:

love[1860] ./query_insights -t ads -b age,gender 1211960785929066 | ./fetch_results
Account: 1211960785929066
2021-05-28T20:58:54: [utilization: 18%]
1211960785929066: querying 2021-05-01 => report ID 504604387649931
504604387649931: Job Not Started
504604387649931: Job Completed
Fetching records.
=> 1211960785929066.2021-05-01.jsonl (0 objects)

2021-05-28T20:59:04: [utilization: 24%]
1211960785929066: querying 2021-05-08 => report ID 318102336388270
318102336388270: Job Not Started
318102336388270: Job Completed
Fetching records.....
=> 1211960785929066.2021-05-08.jsonl (109 objects)

2021-05-28T20:59:15: [utilization: 37%]
1211960785929066: querying 2021-05-15 => report ID 1704964669891921
1704964669891921: Job Not Started
1704964669891921: Job Completed
Fetching records..........
2021-05-28T20:59:25: [utilization: 57%]
1211960785929066: querying 2021-05-22 => report ID 184091436949184
..
=> 1211960785929066.2021-05-15.jsonl (295 objects)
184091436949184: Job Completed
Fetching records.
=> 1211960785929066.2021-05-22.jsonl (0 objects)

With device and platform breakdowns:

love[1863] ./query_insights -t ads -b impression_device,publisher_platform 1211960785929066 | ./fetch_results
Account: 1211960785929066
2021-05-28T21:00:03: [utilization: 58%]
1211960785929066: querying 2021-05-01 => report ID 2942440656041222
2942440656041222: Job Not Started
2942440656041222: Job Completed
Fetching records.
=> 1211960785929066.2021-05-01.jsonl (0 objects)

2021-05-28T21:00:14: [utilization: 65%]
1211960785929066: querying 2021-05-08 => report ID 479933393283180
479933393283180: Job Not Started
479933393283180: Job Completed
Fetching records...
=> 1211960785929066.2021-05-08.jsonl (63 objects)

2021-05-28T21:00:24: [utilization: 74%]

[utilization is too high; pausing]

[pause completed; continuing]
1211960785929066: querying 2021-05-15 => report ID 148418757305001
148418757305001: Job Completed
Fetching records......
=> 1211960785929066.2021-05-15.jsonl (136 objects)

2021-05-28T21:01:35: [utilization: 68%]
1211960785929066: querying 2021-05-22 => report ID 801370453879231
801370453879231: Job Not Started
801370453879231: Job Completed
Fetching records.
=> 1211960785929066.2021-05-22.jsonl (0 objects)

@zestyping
Copy link
Contributor

zestyping commented May 28, 2021

@keu @sherifnada Here are the scripts I'm using above, so you can replicate what I'm seeing.

ping-ad-insights.zip

I tried running the Airbyte sync again immediately after using these scripts, and the sync is still failing in the same way (which is evidence that it's not a transient issue that existed when Airbyte was running but not when my scripts were running). Airbyte must be doing something different, though I don't know what.

Can we re-open this issue, or would you like to start a new one?

@sherifnada sherifnada reopened this May 28, 2021
@sherifnada
Copy link
Contributor Author

@zestyping re-opening here since all context is already present in this ticket. Will get to work on this first thing on monday.

@zestyping
Copy link
Contributor

Thanks @sherifnada! I hope the information here helps!

@keu
Copy link
Contributor

keu commented May 29, 2021

@zestyping according to your logs both errors happened only when we read AdCreative stream, but your script reads only Insights data. Could you select only Insights streams and try sync with Airbyte?

@keu
Copy link
Contributor

keu commented May 29, 2021

We definitely do things differently:

  • we don't call HTTP endpoints directly, we use official FB SDK (might be a problem with header error)
  • we use batch API for creatives, this should be the most efficient way to retrieve data with a minimal number of calls
  • we use async API for insights, this might increase the number of calls as we sync each day one by one and also do several calls to check Async Job status. This also affects overall sync time as we sleep 2sec after each AsyncJob, so in some way, this should affect call rate positively.

still looking into this issue...

@zestyping
Copy link
Contributor

zestyping commented May 29, 2021

I tried again with adcreatives deselected and this time I got a different error:

2021-05-29 17:24:05 ERROR (/tmp/workspace/535/0) LineGobbler(voidCall):69 -   File "/airbyte/integration_code/source_facebook_marketing/client/common.py", line 60, in handle_call_rate_response
2021-05-29 17:24:05 ERROR (/tmp/workspace/535/0) LineGobbler(voidCall):69 -     if call_count > 99:
2021-05-29 17:24:05 ERROR (/tmp/workspace/535/0) LineGobbler(voidCall):69 - TypeError: '>' not supported between instances of 'NoneType' and 'int'

Full log attached.
logs-535-0.txt

@zestyping
Copy link
Contributor

we use async API for insights

The scripts I included above also use the async API for insights. The query_insights script starts the async jobs, and the fetch_results script concurrently checks async job status.

@zestyping
Copy link
Contributor

I am trying again with only the Insights tables selected (I also removed adsets, ads, and campaigns this time).

@zestyping
Copy link
Contributor

zestyping commented May 29, 2021

Same failure as the first time, with only Insights.
logs-537-0.txt

@mk556
Copy link

mk556 commented Jul 28, 2021

Hello. I am getting source_facebook_marketing.common.JobTimeoutException: AdReportRun error while running the sync for Facebook table ads_insights. Details are as follows :

airbyte version : 0.27.4-alpha
Source connector version : Facebook - 0.2.14
destination : Bigquery - 0.3.7
underlying infrastructure : EC2 instance (r5.large - 2vCPU, 16G mem)
tables : ads_insights
start_date : 2020-07-01
Sync connection config :

  • Included deleted : False
  • Insights Lookback Window : 28
  • Insights Days Per Job : 1 (initially we kept 7, but we were receiving jobTimeout error so we set it to 1)
    FB access level : Advanced Access (but we dont have ads_management permission)

Description :
We initialy kept Insights Days Per Job to 7 but we were receiving JobTimeoutException: AdReportRun for some of our FB connections. Hence, we lowered this config to 1 day. But still we are receiving this error. The job fetches reports successfully for initial few days but results into the error source_facebook_marketing.common.JobTimeoutException: AdReportRun <AdReportRun> after some time. We upgraded the connector to version 0.2.14 as well and tried again but facing the same error. The main job fails after retrying 3 times.

Here is the full log -
facebook-job-timeour-error.txt.zip

@keu
Copy link
Contributor

keu commented Jul 28, 2021

Hi @mk556 , thanks for your report!

I don't think your case is related to call rate, so may I ask you to create a separate issue for it?

Also, AFAIK, we can't do much about solving timeout exceptions because this is a FB limitation, in general, it is not recommended to use Account level Insights. But we are actively working on one feature that should allow selecting only specific fields thus making request 'lighter' and less prone to timeouts: #2227

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment