Facebook Conversion Data Missing #5190

Closed
manavkohli opened this issue Aug 4, 2021 · 12 comments · Fixed by #6463

Comments

@manavkohli
Contributor

Environment

  • Airbyte version: 0.22.0-alpha
  • OS Version / Instance: macOS
  • Deployment: Local
  • Source Connector and version: Facebook Marketing (Latest)
  • Destination Connector and version: N/A
  • Severity: High
  • Step where error happened: Sync job

Current Behavior

After the rollout of iOS 14.5, Facebook added many restrictions to its API. In particular, it restricted fetching conversion data when breakdowns are present. Facebook's notice states: "No support for breakdowns: For both app and web conversions, delivery and action breakdowns, such as age, gender, region, and placement will not be supported." In our warehouse we are seeing conversion data missing after April, at the very least.

Expected Behavior

There should be a report that does not use breakdowns and can expose conversion events.

Steps to Reproduce

  1. Create a Facebook Marketing source
  2. Create any destination
  3. Create a connection between the two and sync data
  4. Try to query conversion data from the database (sample query below)
SELECT
    date_start as date,
    SUM(value)
FROM facebook_ads_insights_actions a
LEFT JOIN facebook_ads_insights b ON b._airbyte_facebook_ads_insights_hashid = a._airbyte_facebook_ads_insights_hashid
WHERE action_type LIKE 'omni_purchase'
GROUP BY date
ORDER BY date DESC

Are you willing to submit a PR?

Yes

@manavkohli manavkohli added the type/bug Something isn't working label Aug 4, 2021
@sherifnada
Contributor

@manavkohli it seems that when the FB connector queries for action_attribution_windows of 28 days, empty data can be returned. I think the answer is to stop querying for 28-day attribution windows altogether, but I'm not 100% sure yet; this needs more research. Adding this to our current sprint.
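
A minimal sketch of that idea, for illustration only: drop the 28-day windows from the list before the request parameters are built. The helper names below are hypothetical and are not taken from the connector's code, and the window names are assumed to follow the `1d_click` / `28d_view` style used by the Insights API.

```python
# Assumed attribution window names; verify against the Insights API docs.
ALL_WINDOWS = ["1d_click", "7d_click", "28d_click", "1d_view", "7d_view", "28d_view"]


def drop_28d_windows(windows):
    """Return only the attribution windows that do not span 28 days."""
    return [w for w in windows if not w.startswith("28d_")]


def build_insights_params(windows=ALL_WINDOWS):
    # Hypothetical helper: the real connector may assemble its request differently.
    return {"action_attribution_windows": drop_28d_windows(windows)}


params = build_insights_params()
print(params["action_attribution_windows"])
```

Whether dropping these windows is actually enough to restore the conversion data is exactly the open question in this comment, so the sketch only shows the filtering step.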

@sherifnada sherifnada added this to the Connectors August 6th milestone Aug 5, 2021
@sherifnada sherifnada added the area/connectors Connector related issues label Aug 5, 2021
@sherifnada
Contributor

Helpful context: https://developers.facebook.com/docs/marketing-api/insights#sample (see the notice about iOS 14 in these docs).

@manavkohli
Contributor Author

@sherifnada good to know - will try that as well

@manavkohli
Contributor Author

@sherifnada fyi running into a handful of errors with the Facebook Marketing connector as I dug deeper, what's the best way to proceed here? Namely:

  1. All jobs are failing here, logs:
 ERROR () LineGobbler(voidCall):85 -     record["thumbnail_url"] = remove_params_from_url(thumbnail_url, ["_nc_hash", "d"])
2021-08-03 20:02:54 ERROR () LineGobbler(voidCall):85 -   File "/airbyte/integration_code/source_facebook_marketing/streams.py", line 53, in remove_params_from_url
2021-08-03 20:02:54 ERROR () LineGobbler(voidCall):85 -     key, value = q.split("=")
2021-08-03 20:02:54 ERROR () LineGobbler(voidCall):85 - ValueError: not enough values to unpack (expected 2, got 1)
  2. Once fixed, seeing a separate error in the normalization process
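
For context on the first error: `q.split("=")` raises this `ValueError` whenever a query-string segment has no `=` (a bare flag such as `d`), and it would also fail on a segment with more than one `=`. Below is a minimal sketch of a more tolerant version built on the standard library's `urllib.parse`. It reuses the function name from the traceback, but it is an illustration under that assumption and not the fix that was merged into the connector.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def remove_params_from_url(url, params):
    """Return `url` with the named query parameters removed."""
    parts = urlsplit(url)
    # keep_blank_values=True keeps segments without "=" instead of dropping them,
    # so a bare "d" is parsed as ("d", "") rather than raising.
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in params
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


cleaned = remove_params_from_url(
    "https://example.com/img.jpg?_nc_hash=abc&d&w=10", ["_nc_hash", "d"]
)
print(cleaned)
```

One trade-off of this approach: a bare flag that is kept gets re-encoded with a trailing `=`, which most servers treat as equivalent to the original.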

@sherifnada
Contributor

@manavkohli will prioritize this failure ASAP. Can you share the full logs from the failing normalization job?

@manavkohli
Contributor Author

Awesome, thanks! I also dug deeper, and I think the normalization failure was an issue with the generated catalog versus the data actually received for the report I built. So it is not something affecting the source or the production build; sorry about raising the alarm there. That said, attaching the logs in case they are helpful.
logs-10-0.txt

@sherifnada
Contributor

thanks for the heads up, will look into the first issue also

@manavkohli
Contributor Author

Definitely! Also, to clarify my earlier comment: all the jobs are still failing in production, but the normalization failure I saw after fixing that seems to be due to my own changes.

@manavkohli
Contributor Author

@sherifnada do you have an idea of when a fix for the underlying data issue may be deployed? That's still blocking us from being able to use the connector.

@vitaliizazmic
Contributor

@manavkohli hi, I'm working on this task and trying to reproduce the issue. Could you please provide more details about the ads for which data is missing?

@manavkohli
Contributor Author

@vitaliizazmic definitely: basically all conversion data after mid-April. If you look at the *_actions or *_actions_values tables and filter on events where the action_type is omni_purchase (this is a conversion event), you won't see any results after that.

@manavkohli
Contributor Author

@sherifnada @vitaliizazmic I opened a PR that addresses both issues. I was able to verify that the data loaded into my DB included conversion data. Query below (note that the table names are prefixed with fb921_):

SELECT
    date_start as date,
    SUM(value)
FROM fb921_ads_insights_action_type_actions a
LEFT JOIN fb921_ads_insights_action_type b ON b._airbyte_fb921_ads_i__ts_action_type_hashid = a._airbyte_fb921_ads_i__ts_action_type_hashid
WHERE action_type LIKE 'omni_purchase'
GROUP BY date
ORDER BY date DESC

vitaliizazmic added a commit that referenced this issue Oct 27, 2021
* Drift #5190 - migrate to CDK, add acceptance tests

* Source Drift #7041 - fixing according to PR review

* Source Drift #7041 - bump version and update changelogs
vitaliizazmic added a commit that referenced this issue Nov 1, 2021
…st_sequential_reads

* Source Facebook Marketing #5190 - estimate cost_per_estimated_ad_recallers for AdsInsights streams if not presented in records

* Source Facebook Marketing #5190 - add ignored fields to full refresh test

* Source Facebook Marketing #5190 - annotations

* Source Facebook Marketing #5190 - reformat

* SAT #5190 - delete remove_ignored_fields

* Source Facebook Marketing #5190 - use dpath util for excluding fields

* Facebook marketing #5190 - follow EAFP principle

* Facebook Marketing #5190 - add unit tests.

* Source Facebook Marketing #5190 - fixing according to PR

* Source Facebook Marketing #5190 - support ignored fields by stream

* Source Facebook Marketing #5190 - update docs

* Source Facebook Marketing #5190 - merge conflicts

* Source Facebook Marketing #5190 - bump SAT version

* Source Facebook Marketing #5190 - fixing unit tests

* Source Facebook Marketing #5190 - bump
vitaliizazmic added a commit that referenced this issue Nov 2, 2021
* Source Google Directory #7415 - migrate to the CDK

* Source Google Directory #5190 - fix timeout error

* Source Google Directory #7415 - fix according to PR review

* Source Google Directory #7415 - added etag and lastLoginTime to ignored fields for full refresh acceptance test

* Source Google Directory #7415 - fix full refresh acceptance test config

* Source Google Directory #7415 - bump version
lmossman pushed a commit that referenced this issue Nov 3, 2021
* Source Google Directory #7415 - migrate to the CDK

* Source Google Directory #5190 - fix timeout error

* Source Google Directory #7415 - fix according to PR review

* Source Google Directory #7415 - added etag and lastLoginTime to ignored fields for full refresh acceptance test

* Source Google Directory #7415 - fix full refresh acceptance test config

* Source Google Directory #7415 - bump version
vitaliizazmic added a commit that referenced this issue Nov 5, 2021
* Source Google Directory #6265 - add oauth support

* Source Google Directory #6265 - update credentials

* Source Google Directory #6265 - fixing according to PR

* Source Google directory #6265 - update docs

* Source Google directory #5190 - update doc

* Source Google Directory #6265 - resolve merge conflict

* Source Google Directory #6265 - SAT for oauth

* Source Google Directory #6265 - bump version and update changelog

* Source Google Directory #6265 - bump version and update changelog (update publish)
schlattk pushed a commit to schlattk/airbyte that referenced this issue Jan 4, 2022
* Drift airbytehq#5190 - migrate to CDK, add acceptance tests

* Source Drift airbytehq#7041 - fixing according to PR review

* Source Drift airbytehq#7041 - bump version and update changelogs
schlattk pushed a commit to schlattk/airbyte that referenced this issue Jan 4, 2022
…st_sequential_reads

* Source Facebook Marketing airbytehq#5190 - estimate cost_per_estimated_ad_recallers for AdsInsights streams if not presented in records

* Source Facebook Marketing airbytehq#5190 - add ignored fields to full refresh test

* Source Facebook Marketing airbytehq#5190 - annotations

* Source Facebook Marketing airbytehq#5190 - reformat

* SAT airbytehq#5190 - delete remove_ignored_fields

* Source Facebook Marketing airbytehq#5190 - use dpath util for excluding fields

* Facebook marketing airbytehq#5190 - follow EAFP principle

* Facebook Marketing airbytehq#5190 - add unit tests.

* Source Facebook Marketing airbytehq#5190 - fixing according to PR

* Source Facebook Marketing airbytehq#5190 - support ignored fields by stream

* Source Facebook Marketing airbytehq#5190 - update docs

* Source Facebook Marketing airbytehq#5190 - merge conflicts

* Source Facebook Marketing airbytehq#5190 - bump SAT version

* Source Facebook Marketing airbytehq#5190 - fixing unit tests

* Source Facebook Marketing airbytehq#5190 - bump
schlattk pushed a commit to schlattk/airbyte that referenced this issue Jan 4, 2022
* Source Google Directory airbytehq#7415 - migrate to the CDK

* Source Google Directory airbytehq#5190 - fix timeout error

* Source Google Directory airbytehq#7415 - fix according to PR review

* Source Google Directory airbytehq#7415 - added etag and lastLoginTime to ignored fields for full refresh acceptance test

* Source Google Directory airbytehq#7415 - fix full refresh acceptance test config

* Source Google Directory airbytehq#7415 - bump version
schlattk pushed a commit to schlattk/airbyte that referenced this issue Jan 4, 2022
* Source Google Directory airbytehq#6265 - add oauth support

* Source Google Directory airbytehq#6265 - update credentials

* Source Google Directory airbytehq#6265 - fixing according to PR

* Source Google directory airbytehq#6265 - update docs

* Source Google directory airbytehq#5190 - update doc

* Source Google Directory airbytehq#6265 - resolve merge conflict

* Source Google Directory airbytehq#6265 - SAT for oauth

* Source Google Directory airbytehq#6265 - bump version and update changelog

* Source Google Directory airbytehq#6265 - bump version and update changelog (update publish)
4 participants