Source Facebook marketing: Fix duplicating records during insights lookback period #13047

Merged
grubberr merged 20 commits into master from grubberr/oncall-231-source-facebook-marketing
May 23, 2022

Conversation

@grubberr grubberr commented May 20, 2022

Signed-off-by: Sergey Chvalyuk <grubberr@gmail.com>

What

During incremental syncs, the connector re-reads every record inside the insights lookback period (INSIGHTS_LOOKBACK_PERIOD, default 28 days) on each run, which produces duplicate records.
This PR changes the logic so that only records that were updated are re-read.

🚨 User Impact 🚨

Are there any breaking changes? What is the end result perceived by the user? If yes, please merge this PR with the 🚨🚨 emoji so changelog authors can further highlight this if needed.

Pre-merge Checklist

Expand the relevant checklist and delete the others.

Updating a connector

Community member or Airbyter

  • Grant edit access to maintainers (instructions)
  • Secrets in the connector's spec are annotated with airbyte_secret
  • Unit & integration tests added and passing. Community members, please provide proof of success locally, e.g. a screenshot or copy-pasted unit, integration, and acceptance test output. To run acceptance tests for a Python connector, follow the instructions in the README. For Java connectors, run ./gradlew :airbyte-integrations:connectors:<name>:integrationTest.
  • Code reviews completed
  • Documentation updated
    • Connector's README.md
    • Connector's bootstrap.md. See description and examples
    • Changelog updated in docs/integrations/<source or destination>/<name>.md including changelog. See changelog example
  • PR name follows PR naming conventions

Airbyter

If this is a community PR, the Airbyte engineer reviewing this PR is responsible for the below items.

  • Create a non-forked branch based on this PR and test the below items on it
  • Build is successful
  • If new credentials are required for use in CI, add them to GSM. Instructions.
  • /test connector=connectors/<name> command is passing
  • New Connector version released on Dockerhub and connector version bumped by running the /publish command described here

Signed-off-by: Sergey Chvalyuk <grubberr@gmail.com>
@grubberr grubberr self-assigned this May 20, 2022
@github-actions github-actions bot added the area/connectors and area/documentation labels May 20, 2022
grubberr commented May 20, 2022

/test connector=connectors/source-facebook-marketing

🕑 connectors/source-facebook-marketing https://github.com/airbytehq/airbyte/actions/runs/2356504472
✅ connectors/source-facebook-marketing https://github.com/airbytehq/airbyte/actions/runs/2356504472
Python tests coverage:

Name                                                 Stmts   Miss  Cover
------------------------------------------------------------------------
source_acceptance_test/utils/__init__.py                 6      0   100%
source_acceptance_test/tests/__init__.py                 4      0   100%
source_acceptance_test/__init__.py                       2      0   100%
source_acceptance_test/tests/test_full_refresh.py       52      2    96%
source_acceptance_test/utils/asserts.py                 37      2    95%
source_acceptance_test/config.py                        77      6    92%
source_acceptance_test/utils/json_schema_helper.py     105     13    88%
source_acceptance_test/tests/test_incremental.py       121     25    79%
source_acceptance_test/utils/common.py                  80     17    79%
source_acceptance_test/tests/test_core.py              294    106    64%
source_acceptance_test/utils/compare.py                 62     23    63%
source_acceptance_test/base.py                          10      4    60%
source_acceptance_test/utils/connector_runner.py       110     48    56%
------------------------------------------------------------------------
TOTAL                                                  960    246    74%
Name                                                        Stmts   Miss  Cover
-------------------------------------------------------------------------------
source_facebook_marketing/streams/__init__.py                   2      0   100%
source_facebook_marketing/spec.py                              34      0   100%
source_facebook_marketing/__init__.py                           2      0   100%
source_facebook_marketing/api.py                               96     12    88%
source_facebook_marketing/streams/base_streams.py             127     27    79%
source_facebook_marketing/streams/common.py                    41     13    68%
source_facebook_marketing/streams/streams.py                   97     32    67%
source_facebook_marketing/source.py                            39     16    59%
source_facebook_marketing/streams/base_insight_streams.py     141     66    53%
source_facebook_marketing/streams/async_job.py                210    134    36%
source_facebook_marketing/streams/async_job_manager.py         78     60    23%
-------------------------------------------------------------------------------
TOTAL                                                         867    360    58%
Name                                                        Stmts   Miss  Cover
-------------------------------------------------------------------------------
source_facebook_marketing/streams/async_job.py                210      0   100%
source_facebook_marketing/streams/__init__.py                   2      0   100%
source_facebook_marketing/spec.py                              34      0   100%
source_facebook_marketing/__init__.py                           2      0   100%
source_facebook_marketing/streams/common.py                    41      1    98%
source_facebook_marketing/source.py                            39      1    97%
source_facebook_marketing/streams/async_job_manager.py         78      3    96%
source_facebook_marketing/api.py                               96      9    91%
source_facebook_marketing/streams/base_insight_streams.py     141     18    87%
source_facebook_marketing/streams/streams.py                   97     22    77%
source_facebook_marketing/streams/base_streams.py             127     30    76%
-------------------------------------------------------------------------------
TOTAL                                                         867     84    90%

codecov bot commented May 20, 2022

Codecov Report

❗ No coverage uploaded for pull request base (master@2b90559).
The diff coverage is n/a.

❗ Current head 23f38b1 differs from pull request most recent head 22ac460. Consider uploading reports for the commit 22ac460 to get more accurate results

@@            Coverage Diff            @@
##             master   #13047   +/-   ##
=========================================
  Coverage          ?   91.72%           
=========================================
  Files             ?       11           
  Lines             ?      870           
  Branches          ?        0           
=========================================
  Hits              ?      798           
  Misses            ?       72           
  Partials          ?        0           

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 2b90559...22ac460.

Signed-off-by: Sergey Chvalyuk <grubberr@gmail.com>
…cessed_in_lookback_period

Signed-off-by: Sergey Chvalyuk <grubberr@gmail.com>
grubberr commented May 21, 2022

/test connector=connectors/source-facebook-marketing

🕑 connectors/source-facebook-marketing https://github.com/airbytehq/airbyte/actions/runs/2363211848
✅ connectors/source-facebook-marketing https://github.com/airbytehq/airbyte/actions/runs/2363211848
Python tests coverage:

Name                                                        Stmts   Miss  Cover
-------------------------------------------------------------------------------
source_facebook_marketing/streams/__init__.py                   2      0   100%
source_facebook_marketing/spec.py                              34      0   100%
source_facebook_marketing/__init__.py                           2      0   100%
source_facebook_marketing/api.py                               96     12    88%
source_facebook_marketing/streams/base_streams.py             127     27    79%
source_facebook_marketing/streams/common.py                    41     13    68%
source_facebook_marketing/streams/streams.py                   97     32    67%
source_facebook_marketing/source.py                            39     16    59%
source_facebook_marketing/streams/base_insight_streams.py     141     66    53%
source_facebook_marketing/streams/async_job.py                210    134    36%
source_facebook_marketing/streams/async_job_manager.py         78     60    23%
-------------------------------------------------------------------------------
TOTAL                                                         867    360    58%
Name                                                        Stmts   Miss  Cover
-------------------------------------------------------------------------------
source_facebook_marketing/streams/async_job.py                210      0   100%
source_facebook_marketing/streams/__init__.py                   2      0   100%
source_facebook_marketing/spec.py                              34      0   100%
source_facebook_marketing/__init__.py                           2      0   100%
source_facebook_marketing/streams/common.py                    41      1    98%
source_facebook_marketing/source.py                            39      1    97%
source_facebook_marketing/streams/async_job_manager.py         78      3    96%
source_facebook_marketing/streams/base_insight_streams.py     141      6    96%
source_facebook_marketing/api.py                               96      9    91%
source_facebook_marketing/streams/streams.py                   97     22    77%
source_facebook_marketing/streams/base_streams.py             127     30    76%
-------------------------------------------------------------------------------
TOTAL                                                         867     72    92%

Signed-off-by: Sergey Chvalyuk <grubberr@gmail.com>
@grubberr grubberr changed the title from "Source Facebook marketing: Don't duplicate syncing inside INSIGHTS_LOOKBACK_PERIOD period" to "Source Facebook marketing: Fix duplicating records during insights lookback period" May 22, 2022
Signed-off-by: Sergey Chvalyuk <grubberr@gmail.com>
grubberr commented May 22, 2022

/test connector=connectors/source-facebook-marketing

🕑 connectors/source-facebook-marketing https://github.com/airbytehq/airbyte/actions/runs/2365550255
✅ connectors/source-facebook-marketing https://github.com/airbytehq/airbyte/actions/runs/2365550255
Python tests coverage:

Name                                                        Stmts   Miss  Cover
-------------------------------------------------------------------------------
source_facebook_marketing/streams/__init__.py                   2      0   100%
source_facebook_marketing/spec.py                              34      0   100%
source_facebook_marketing/__init__.py                           2      0   100%
source_facebook_marketing/api.py                               96     12    88%
source_facebook_marketing/streams/base_streams.py             127     27    79%
source_facebook_marketing/streams/common.py                    41     13    68%
source_facebook_marketing/streams/streams.py                   97     32    67%
source_facebook_marketing/source.py                            39     16    59%
source_facebook_marketing/streams/base_insight_streams.py     141     66    53%
source_facebook_marketing/streams/async_job.py                210    134    36%
source_facebook_marketing/streams/async_job_manager.py         78     60    23%
-------------------------------------------------------------------------------
TOTAL                                                         867    360    58%
Name                                                        Stmts   Miss  Cover
-------------------------------------------------------------------------------
source_facebook_marketing/streams/async_job.py                210      0   100%
source_facebook_marketing/streams/__init__.py                   2      0   100%
source_facebook_marketing/spec.py                              34      0   100%
source_facebook_marketing/__init__.py                           2      0   100%
source_facebook_marketing/streams/common.py                    41      1    98%
source_facebook_marketing/source.py                            39      1    97%
source_facebook_marketing/streams/async_job_manager.py         78      3    96%
source_facebook_marketing/streams/base_insight_streams.py     141      6    96%
source_facebook_marketing/api.py                               96      9    91%
source_facebook_marketing/streams/streams.py                   97     22    77%
source_facebook_marketing/streams/base_streams.py             127     30    76%
-------------------------------------------------------------------------------
TOTAL                                                         867     72    92%

Comment on lines 113 to 114
if date_start and pendulum.parse(record["updated_time"]).date() <= date_start:
    continue
Contributor

Nice - this seems to be the main logical change
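
As a rough illustration of the check quoted above, here is a minimal, self-contained sketch. It is not the connector's code: the helper name `skip_unchanged`, the sample records, and the `ad_id` field are hypothetical, and the standard library's `datetime` stands in for `pendulum` so the snippet runs without extra dependencies. It also assumes `updated_time` is a plain ISO 8601 string without a UTC offset.

```python
from datetime import date, datetime
from typing import Iterable, Iterator, Optional


def skip_unchanged(records: Iterable[dict], date_start: Optional[date]) -> Iterator[dict]:
    """Yield only records updated after date_start (hypothetical helper)."""
    for record in records:
        updated = datetime.fromisoformat(record["updated_time"]).date()
        # Same comparison as the quoted lines: a record last updated on or
        # before the saved date is treated as already synced and is dropped.
        if date_start and updated <= date_start:
            continue
        yield record


# Made-up sample data, for illustration only.
records = [
    {"ad_id": "1", "updated_time": "2022-05-18T10:00:00"},
    {"ad_id": "2", "updated_time": "2022-05-21T09:30:00"},
]
fresh = list(skip_unchanged(records, date(2022, 5, 20)))
```

With a saved date of 2022-05-20, only the second sample record passes the filter; when no date is saved (`date_start` is `None`), nothing is skipped.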

Comment on lines 163 to 164
today = pendulum.today(tz="UTC").date()
end_date = min(self._end_date, today)
Contributor

Since we will no longer look at "today" again once that date has been looked up, perhaps the latest day we look at should be "yesterday" (i.e. today - 1). That way, we won't look up data for a partial day without being able to update it later.

Contributor Author

I have changed the logic to sync only up to yesterday.
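
The end-date clamp discussed in this thread might look roughly like the sketch below. It only illustrates the idea of stopping at yesterday rather than today; the function name `sync_days` and its parameters are hypothetical, the standard library's `datetime` is used in place of `pendulum`, and the merged connector code may differ.

```python
from datetime import date, timedelta
from typing import Iterator


def sync_days(start: date, configured_end: date, today: date) -> Iterator[date]:
    """Yield each day to sync, never going past yesterday (hypothetical helper)."""
    # Clamp the end of the range to yesterday so that a partial "today"
    # is never synced and then skipped on later runs.
    end = min(configured_end, today - timedelta(days=1))
    day = start
    while day <= end:
        yield day
        day += timedelta(days=1)


# Illustrative dates only.
days = list(sync_days(date(2022, 5, 20), date(2022, 12, 31), today=date(2022, 5, 23)))
```

Here `today` is passed in explicitly rather than read from the clock, which keeps the sketch deterministic and easy to test. If the start date is already today, the range is empty and nothing is synced until the day is complete.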

grubberr commented May 23, 2022

/test connector=connectors/source-facebook-marketing

🕑 connectors/source-facebook-marketing https://github.com/airbytehq/airbyte/actions/runs/2370038782
✅ connectors/source-facebook-marketing https://github.com/airbytehq/airbyte/actions/runs/2370038782
Python tests coverage:

Name                                                        Stmts   Miss  Cover
-------------------------------------------------------------------------------
source_facebook_marketing/streams/__init__.py                   2      0   100%
source_facebook_marketing/spec.py                              34      0   100%
source_facebook_marketing/__init__.py                           2      0   100%
source_facebook_marketing/api.py                               96     12    88%
source_facebook_marketing/streams/base_streams.py             127     27    79%
source_facebook_marketing/streams/common.py                    41     13    68%
source_facebook_marketing/streams/streams.py                   97     32    67%
source_facebook_marketing/source.py                            39     16    59%
source_facebook_marketing/streams/base_insight_streams.py     144     69    52%
source_facebook_marketing/streams/async_job.py                210    134    36%
source_facebook_marketing/streams/async_job_manager.py         78     60    23%
-------------------------------------------------------------------------------
TOTAL                                                         870    363    58%
Name                                                        Stmts   Miss  Cover
-------------------------------------------------------------------------------
source_facebook_marketing/streams/async_job.py                210      0   100%
source_facebook_marketing/streams/__init__.py                   2      0   100%
source_facebook_marketing/spec.py                              34      0   100%
source_facebook_marketing/__init__.py                           2      0   100%
source_facebook_marketing/streams/common.py                    41      1    98%
source_facebook_marketing/source.py                            39      1    97%
source_facebook_marketing/streams/async_job_manager.py         78      3    96%
source_facebook_marketing/streams/base_insight_streams.py     144      6    96%
source_facebook_marketing/api.py                               96      9    91%
source_facebook_marketing/streams/streams.py                   97     22    77%
source_facebook_marketing/streams/base_streams.py             127     30    76%
-------------------------------------------------------------------------------
TOTAL                                                         870     72    92%

@grubberr grubberr requested a review from evantahler May 23, 2022 09:18
@grubberr grubberr removed the request for review from lazebnyi May 23, 2022 17:56
grubberr commented May 23, 2022

/publish connector=connectors/source-facebook-marketing

🕑 connectors/source-facebook-marketing https://github.com/airbytehq/airbyte/actions/runs/2373057915
❌ Failed to publish connectors/source-facebook-marketing
❌ Couldn't auto-bump version for connectors/source-facebook-marketing

grubberr commented May 23, 2022

/publish connector=connectors/source-facebook-marketing

🕑 connectors/source-facebook-marketing https://github.com/airbytehq/airbyte/actions/runs/2373198544
🚀 Successfully published connectors/source-facebook-marketing
🚀 Auto-bumped version for connectors/source-facebook-marketing
✅ connectors/source-facebook-marketing https://github.com/airbytehq/airbyte/actions/runs/2373198544

@octavia-squidington-iii octavia-squidington-iii temporarily deployed to more-secrets May 23, 2022 18:59 Inactive
@grubberr grubberr merged commit 4c283d7 into master May 23, 2022
@grubberr grubberr deleted the grubberr/oncall-231-source-facebook-marketing branch May 23, 2022 19:03
jscottpolevault pushed a commit to jscottpolevault/airbyte that referenced this pull request Jun 1, 2022
…okback period (airbytehq#13047)

Signed-off-by: Sergey Chvalyuk <grubberr@gmail.com>
@vladimir-remar

@grubberr @evantahler there is something I don't fully understand: with this change, the connector still downloads the data for the last 28 days but only commits the records with a date later than the state. Is that right? Wouldn't it make more sense to call the FB API with that filter to avoid unnecessary calls?

Labels
area/connectors, area/documentation
4 participants