Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

🐛 Source Bing Ads : fix rollback window for marketing performance reports #23663

Conversation

tlatrace
Copy link
Contributor

@tlatrace tlatrace commented Mar 2, 2023

What

Bing Ads connector doesn’t ingest all conversions for marketing performance reports. There are discrepancies between Bing UI data and data received in the Destination.

At the moment, the conversions' fields (Conversions, AllConversions, ConversionsQualified) in the Destination only count conversions that happened the day of the click. Conversions happening after 2+ days, during the attribution window, are not counted.

The problem comes from the connector cursor field (on the incremental sync) not being appropriately implemented. Indeed, it misses a rollback window to attribute conversions happening after 2+ days.

This PR intends to add an optional parameter lookback_windowin the connector configuration, for performance reports ingested with incremental streams.

Fixes #22930.

How

During a sync, instead of retrieving data from [cursor_value, now()], we’ll retrieve data from [cursor_value - rollback_window, now()]. So the API call logic is modified (not the cursor logic).

This means that if the cursor value is set to 15/12/2022, we’ll retrieve data up to 15/11/2022 (for a 30-day attribution window).

Cons: we’ll ingest lots of duplicates in the history table (30 duplicates per record in most cases!). The deduped + history feature will handle the deduplication in the final table.

Recommended reading order

Review per commit, in chronological order.

🚨 User Impact 🚨

No breaking changes as the lookback_window is an optional parameter.

Pre-merge Checklist

Updating a connector

Community member or Airbyter

  • Grant edit access to maintainers (instructions)
  • Secrets in the connector's spec are annotated with airbyte_secret
  • Unit & integration tests added and passing. Community members, please provide proof of success locally e.g: screenshot or copy-paste unit, integration, and acceptance test output. To run acceptance tests for a Python connector, follow instructions in the README. For java connectors run ./gradlew :airbyte-integrations:connectors:<name>:integrationTest.
  • Code reviews completed
  • Documentation updated
    • Connector's README.md
    • Connector's bootstrap.md. See description and examples
    • Changelog updated in docs/integrations/<source or destination>/<name>.md including changelog. See changelog example
  • PR name follows PR naming conventions

@CLAassistant
Copy link

CLAassistant commented Mar 2, 2023

CLA assistant check
All committers have signed the CLA.

@misteryeo
Copy link
Contributor

The team's going to take a first look at this shortly @tlatrace! Assigning to @sajarin

Copy link
Contributor

@sajarin sajarin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @tlatrace, thanks for the contribution.

On a first glance, this looks good. A small nit would be to rename _get_start_date to simply get_start_date (without a prefixing underscore) so that it matches the naming conventions of the other methods.

Additionally, could you explain what the test case you've added is testing?

@@ -38,6 +38,14 @@ def test_get_column_value():
assert test_report.get_column_value(record, "Spend") == 1.203


def test_get_updated_state_init_state():
Copy link
Contributor Author

@tlatrace tlatrace Mar 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This unit test makes sure that the state is correctly initiated for a given stream.

Nothing related to the fix, but I took the opportunity to add it as it was missing.

@octavia-squidington-iii octavia-squidington-iii added the area/documentation Improvements or additions to documentation label Mar 9, 2023
@tlatrace tlatrace force-pushed the moguero/source-bing-ads-fix-add-lookback-window branch from 8460c14 to 26abc58 Compare March 10, 2023 10:02
@tlatrace tlatrace requested a review from sajarin March 10, 2023 11:07
@tlatrace
Copy link
Contributor Author

Here are screenshots of unit tests and integration tests passing.

unit-tests
integration-tests

@sajarin
Copy link
Contributor

sajarin commented Mar 10, 2023

/test connector=connectors/source-bing-ads

🕑 connectors/source-bing-ads https://github.com/airbytehq/airbyte/actions/runs/4385404954
❌ connectors/source-bing-ads https://github.com/airbytehq/airbyte/actions/runs/4385404954
🐛 https://gradle.com/s/krzgo3cel3vbs

Build Failed

Test summary info:

	 =========================== short test summary info ============================
	 FAILED unit_tests/test_source.py::test_AccountPerformanceReportMonthly_request_params
	 �[31m=================== �[31m�[1m1 failed�[0m, �[32m23 passed�[0m, �[33m2 warnings�[0m�[31m in 3.22s�[0m�[31m ===================�[0m

@tlatrace
Copy link
Contributor Author

tlatrace commented Mar 13, 2023

I managed to reproduce the failing test locally. The test might fail because the secrets/config.json file doesn't have the lookback_window key. Browsing the doc, I suspect this CI secret has to be added in Google Secret Manager but I don't have access. Could you have a look @sajarin?

@sajarin
Copy link
Contributor

sajarin commented Mar 15, 2023

lookback_window has been added to the configs on GSM

@sajarin
Copy link
Contributor

sajarin commented Mar 15, 2023

/test connector=connectors/source-bing-ads

🕑 connectors/source-bing-ads https://github.com/airbytehq/airbyte/actions/runs/4429516638
❌ connectors/source-bing-ads https://github.com/airbytehq/airbyte/actions/runs/4429516638
🐛 https://gradle.com/s/xsxicn5yugmgg

Build Failed

Test summary info:

=========================== short test summary info ============================
FAILED test_core.py::TestBasicRead::test_read[inputs0] - docker.errors.Contai...
FAILED test_full_refresh.py::TestFullRefresh::test_sequential_reads[inputs0]
SKIPPED [1] ../usr/local/lib/python3.9/site-packages/connector_acceptance_test/plugin.py:63: Skipping TestIncremental.test_two_sequential_reads: SAT doesn't support complex nested states used in incremental report streams
SKIPPED [1] ../usr/local/lib/python3.9/site-packages/connector_acceptance_test/tests/test_core.py:509: The previous and actual discovered catalogs are identical.
============== 2 failed, 32 passed, 2 skipped in 76.24s (0:01:16) ==============

@tlatrace
Copy link
Contributor Author

tlatrace commented Mar 16, 2023

Thank you @sajarin! I suspect lookback_window parameter is not set to a default value in the CI config, as self.config[\"lookback_window\"] is evaluated to None in 2 acceptance tests (see traceback below).

Is a default value set for lookback_window on GSM? (0 is the default value set up in the spec.json)

Failing tests:

  • test_core.py::TestBasicRead::test_read
  • test_full_refresh.py::TestFullRefresh::test_sequential_reads:
Captured log call 023-03-15T18:18:36.7227793Z ERROR root:connector_runner.py:172 Docker container failed, code 1, error: 2023-03-15T18:18:36.7232397Z ***"type": "TRACE", "trace": ***"type": "ERROR", "emitted_at": 1678904314295.8801, "error": ***"message": "Something went wrong in the connector. See the logs for more details.", "internal_message": "'NoneType' object has no attribute 'convert'", "stack_trace": "Traceback (most recent call last):\n File \"/airbyte/integration_code/main.py\", line 13, in \n launch(source, sys.argv[1:])\n File \"/usr/local/lib/python3.9/site-packages/airbyte_cdk/entrypoint.py\", line 131, in launch\n for message in source_entrypoint.run(parsed_args):\n File \"/usr/local/lib/python3.9/site-packages/airbyte_cdk/entrypoint.py\", line 122, in run\n for message in generator:\n File \"/usr/local/lib/python3.9/site-packages/airbyte_cdk/sources/abstract_source.py\", line 134, in read\n raise e\n File \"/usr/local/lib/python3.9/site-packages/airbyte_cdk/sources/abstract_source.py\", line 120, in read\n yield from self._read_stream(\n File \"/usr/local/lib/python3.9/site-packages/airbyte_cdk/sources/abstract_source.py\", line 189, in _read_stream\n for record in record_iterator:\n File \"/usr/local/lib/python3.9/site-packages/airbyte_cdk/sources/abstract_source.py\", line 307, in _read_full_refresh\n for record_data_or_message in record_data_or_messages:\n File \"/airbyte/integration_code/source_bing_ads/source.py\", line 125, in read_records\n params = self.request_params(\n File \"/airbyte/integration_code/source_bing_ads/reports.py\", line 206, in request_params\n start_date = self.get_start_date(stream_state, account_id)\n File \"/airbyte/integration_code/source_bing_ads/reports.py\", line 349, in get_start_date\n start_date = self.client.reports_start_date.subtract(days=self.config[\"lookback_window\"])\n File \"/usr/local/lib/python3.9/site-packages/pendulum/datetime.py\", line 721, in subtract\n return self.add(\n File \"/usr/local/lib/python3.9/site-packages/pendulum/datetime.py\", line 667, in add\n dt = self.tz.convert(dt)\nAttributeError: 'NoneType' object has no attribute 'convert'\n", "failure_type": "system_error"***

@sajarin
Copy link
Contributor

sajarin commented Mar 17, 2023

/test connector=connectors/source-bing-ads

🕑 connectors/source-bing-ads https://github.com/airbytehq/airbyte/actions/runs/4448660304
❌ connectors/source-bing-ads https://github.com/airbytehq/airbyte/actions/runs/4448660304
🐛 https://gradle.com/s/b5clccbronmvk

Build Failed

Test summary info:

=========================== short test summary info ============================
FAILED test_core.py::TestBasicRead::test_read[inputs0] - docker.errors.Contai...
FAILED test_full_refresh.py::TestFullRefresh::test_sequential_reads[inputs0]
SKIPPED [1] ../usr/local/lib/python3.9/site-packages/connector_acceptance_test/plugin.py:63: Skipping TestIncremental.test_two_sequential_reads: SAT doesn't support complex nested states used in incremental report streams
SKIPPED [1] ../usr/local/lib/python3.9/site-packages/connector_acceptance_test/tests/test_core.py:509: The previous and actual discovered catalogs are identical.
============== 2 failed, 32 passed, 2 skipped in 74.03s (0:01:14) ==============

@sajarin
Copy link
Contributor

sajarin commented Mar 17, 2023

@tlatrace running the tests again, I'm not able to get integration tests passing locally given our credentials despite setting the lookback_window to a default value of 0.

Do you screenshots of passing integration tests? The above images includes passing unit but for integration tests, I'm not seeing any tests being run there. Just trying to verify that this is working so that I can debug my local and get it working in the CI/CD.

@tlatrace
Copy link
Contributor Author

tlatrace commented Mar 20, 2023

Following the documentation, 
I first tried to run integration tests with this Docker command:

docker build . --no-cache -t airbyte/source-bing-ads:dev && python -m pytest -p connector_acceptance_test.plugin

But I reached this error for 4 tests: FileNotFoundError: [Errno 2] No such file or directory: '/data/tap_config.json', which I don’t understand and was already reported here. These tests also fails with the same error on the master branch.

I then tried to run integration tests using Gradle:

./gradlew :airbyte-integrations:connectors:source-bing-ads:integrationTest

Which led to Timeout errors on the 2 tests failing on the CI. These tests also fail on master branch on my local FYI. Reducing the number of streams in the configured_catalog.json doesn't help. Traceback: logs-bing-ads-source-integration-tests.txt

Could you please provide help on this @sajarin? 🙏

@tlatrace tlatrace force-pushed the moguero/source-bing-ads-fix-add-lookback-window branch from c396329 to cb37392 Compare April 17, 2023 13:17
Copy link
Member

@marcosmarxm marcosmarxm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @tlatrace for the contribution. I'm going to finish the PR #25259 because I don't have commit access to your company fork.

@tlatrace
Copy link
Contributor Author

Thank you for your feedback @marcosmarxm. Just to be sure, is something else required from my side?

@marcosmarxm
Copy link
Member

Thank you for your feedback @marcosmarxm. Just to be sure, is something else required from my side?

Nope! Right now I'm going to wait the final approval from the connector team. Because Bing Ads it is a GA connector there are some discussion to merge new code into the connector.

@marcosmarxm
Copy link
Member

@tlatrace tests are been running in #25259

@marcosmarxm
Copy link
Member

Published

@thibault-latrace
Copy link

Awesome, thanks @marcosmarxm!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Source Bing Ads - No rollback window for conversions
7 participants