
TDL-18749 Implement interruptible full table streams. #60

Merged
merged 19 commits into crest-master from TDL-18749-Implement-Interruptible-Full-Table on Jun 15, 2022

Conversation

prijendev
Contributor

@prijendev prijendev commented May 24, 2022

Description of change

  • Added WHERE and ORDER BY clauses to the query so that a resuming sync can skip records that were already replicated.
  • Added logic to write state with last_pk_fetched (the id of the last synced record) after every 1000 records.
  • Added unit test cases to verify the query-building logic.
  • Added integration test cases to verify interruptible full-table streams.
  • The following streams are excluded from interruptible streams because they do not have an id key to use in the WHERE clause:
    • call_details
    • campaign_labels
  • ad_group_criterion and campaign_criterion have composite primary keys. For these streams, ad_group_id and campaign_id, respectively, are used as the filter parameters. (A sketch of the resulting queries follows this list.)
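
A rough sketch of the query shapes involved (illustrative only; the exact clause ordering and field names in the tap may differ). The first query matches the unit-test expectation quoted later in this thread; the second shows how last_pk_fetched from the bookmark becomes a WHERE clause on a resuming sync:

# Full-table sync with no saved bookmark and no filter param
# (e.g. a stream excluded from interruptible syncing).
full_sync_query = "SELECT id FROM ads PARAMETERS omit_unselected_resource_names=true"

# Resuming sync: last_pk_fetched (e.g. 586722860308) is read from the bookmark and
# turned into a WHERE clause; ORDER BY keeps records in ascending id order so the
# bookmark stays meaningful if the sync is interrupted again.
resumed_sync_query = (
    "SELECT id FROM ads "
    "WHERE id > 586722860308 "
    "ORDER BY id ASC "
    "PARAMETERS omit_unselected_resource_names=true"
)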

QA steps

  • automated tests passing
  • manual qa steps passing

Risks

Rollback steps

  • revert this branch

@prijendev prijendev marked this pull request as ready for review May 27, 2022 08:36
Comment on lines 369 to 371
last_pk_fetched = singer.get_bookmark(state,
stream["tap_stream_id"],
customer["customerId"]) or {}


Suggested change
last_pk_fetched = singer.get_bookmark(state,
stream["tap_stream_id"],
customer["customerId"]) or {}
last_pk_fetched = singer.get_bookmark(state,
stream["tap_stream_id"],
customer["customerId"]) or {}

Contributor Author

Updated.


# call_details, campaign_labels do not have an id field
# accessible_bidding_strategies, accounts, user_list, and bidding_strategies streams have only 1 record.
# campaign_criterion, ad_group_criterion have composite primary keys as ad_group_id, criterion_id and campaign_id, criterion_id respectively.


Remove this line, since the test is not skipping the campaign_criterion and ad_group_criterion streams.

Contributor Author

Removed unnecessary comments.

Comment on lines 22 to 26
streams_to_test = {'ad_groups': 132093547633, 'ads': 586722860308, 'campaign_budgets': 10006446850,
'campaigns': 15481829265, 'labels': 21628120997, 'carrier_constant': 70094, 'feed': 351805305,
'feed_item': 216977537909, 'language_constant': 1007, 'mobile_app_category_constant': 60009,
'mobile_device_constant': 604043, 'operating_system_version_constant': 630166, 'topic_constant': 41,
'user_interest': 959, 'campaign_criterion': 16990616126, 'ad_group_criterion': 131901833709}


Suggested change
streams_to_test = {'ad_groups': 132093547633, 'ads': 586722860308, 'campaign_budgets': 10006446850,
'campaigns': 15481829265, 'labels': 21628120997, 'carrier_constant': 70094, 'feed': 351805305,
'feed_item': 216977537909, 'language_constant': 1007, 'mobile_app_category_constant': 60009,
'mobile_device_constant': 604043, 'operating_system_version_constant': 630166, 'topic_constant': 41,
'user_interest': 959, 'campaign_criterion': 16990616126, 'ad_group_criterion': 131901833709}
streams_to_test = {'ad_groups': 132093547633, 'ads': 586722860308, 'campaign_budgets': 10006446850,
'campaigns': 15481829265, 'labels': 21628120997, 'carrier_constant': 70094, 'feed': 351805305,
'feed_item': 216977537909, 'language_constant': 1007, 'mobile_app_category_constant': 60009,
'mobile_device_constant': 604043, 'operating_system_version_constant': 630166, 'topic_constant': 41,
'user_interest': 959, 'campaign_criterion': 16990616126, 'ad_group_criterion': 131901833709}

Contributor Author

Updated.

Comment on lines 103 to 104
self.assertGreaterEqual(full_records[i][primary_key], full_records[i-1][primary_key],
msg='id of the current record is less than the id of the previous record.')


Suggested change
self.assertGreaterEqual(full_records[i][primary_key], full_records[i-1][primary_key],
msg='id of the current record is less than the id of the previous record.')
self.assertGreaterEqual(full_records[i][primary_key], full_records[i-1][primary_key],
msg='id of the current record is less than the id of the previous record.')

Contributor Author

Fixed formatting.

Comment on lines 109 to 110
self.assertGreaterEqual(interrupted_records[i][primary_key], interrupted_records[i-1][primary_key],
msg='id of the current record is less than the id of the previous record.')


Suggested change
self.assertGreaterEqual(interrupted_records[i][primary_key], interrupted_records[i-1][primary_key],
msg='id of the current record is less than the id of the previous record.')
self.assertGreaterEqual(interrupted_records[i][primary_key], interrupted_records[i-1][primary_key],
msg='id of the current record is less than the id of the previous record.')

Contributor Author

Fixed formatting.

msg='id of the current record is less than the id of the previous record.')

# Verify that all records in the interrupted sync come in Ascending order.
# That means id of current record is greater than id of last record.


Suggested change
# That means id of current record is greater than id of last record.
# That means id of current record is greater than id of previous record.

Contributor Author

Updated.

primary_key = next(iter(self.expected_primary_keys()[stream]))

# Verify that all records in the full sync come in Ascending order.
# That means id of current record is greater than id of last record.


Suggested change
# That means id of current record is greater than id of last record.
# That means id of current record is greater than id of previous record.

Contributor Author

Updated.

Comment on lines 20 to 22
expected = 'SELECT id FROM ads PARAMETERS omit_unselected_resource_names=true'

actual = create_core_stream_query(RESOURCE_NAME, SELECTED_FIELDS, last_pk_fetched, filter_params, composite_pks)


Replace these expected and actual with expected_query and actual_query in all the test cases.

Contributor Author

Replaced expected and actual with expected_query and actual_query.
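
For reference, a sketch of how one of these test cases might read after the rename (the import path and constant values are assumptions for illustration, not copied from the tap; the expected query string comes from the snippet above):

import unittest

from tap_google_ads.streams import create_core_stream_query  # assumed import path

RESOURCE_NAME = 'ads'
SELECTED_FIELDS = ['id']


class TestCreateCoreStreamQuery(unittest.TestCase):

    def test_full_sync_query(self):
        # No bookmark and no filter param: plain full-table query.
        expected_query = 'SELECT id FROM ads PARAMETERS omit_unselected_resource_names=true'

        actual_query = create_core_stream_query(RESOURCE_NAME, SELECTED_FIELDS, None, None, False)

        self.assertEqual(expected_query, actual_query)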

Comment on lines 58 to 60
Verify that query contains WHERE(exclusive) and ORDER BY clause if filter_params and
last_pk_fetched are available. (interrupted sync). WHERE clause must exclude equality if stream contain
a composite primary key.


Suggested change
Verify that query contains WHERE(exclusive) and ORDER BY clause if filter_params and
last_pk_fetched are available. (interrupted sync). WHERE clause must exclude equality if stream contain
a composite primary key.
Verify that query contains WHERE(exclusive) and ORDER BY clause if filter_params and
last_pk_fetched are available. (interrupted sync). WHERE clause must exclude equality if stream does not contain
a composite primary key.

Contributor Author

Updated comment.


def create_core_stream_query(resource_name, selected_fields, last_pk_fetched, filter_param, composite_pks):

# Generate query parameter WHERE and ORDER BY.


Suggested change
# Generate query parameter WHERE and ORDER BY.
# Generate a query using WHERE and ORDER BY parameters.

Contributor Author

Updated comment.

@@ -80,9 +80,43 @@ def build_parameters():
param_str = ",".join(f"{k}={v}" for k, v in API_PARAMETERS.items())
return f"PARAMETERS {param_str}"

def generate_where_and_orderby_clause(last_pk_fetched, filter_param, composite_pks):
"""
Generates a where clause based on filter parameter(`key_properties`), and


@prijendev please include the ORDER BY clause in the comment

Contributor Author

Added ORDER BY clause in the comment.
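
For context, a minimal sketch of what generate_where_and_orderby_clause ends up doing, assembled from the snippets quoted in this thread (spacing, docstring wording, and the exact condition guarding the ORDER BY clause may differ in the tap itself):

def generate_where_and_orderby_clause(last_pk_fetched, filter_param, composite_pks):
    """
    Generate WHERE and ORDER BY clauses based on the filter parameter
    (`key_properties`) and the last bookmarked primary-key value.
    """
    where_clause = ''
    order_by_clause = ''

    if filter_param:
        # Keep records in ascending primary-key order so the bookmark stays meaningful.
        order_by_clause = f'ORDER BY {filter_param} ASC'

    if not composite_pks:
        # A single primary key is unique, so the last synced record can be excluded.
        comparison_operator = '>'
    else:
        # Only one part of a composite key is bookmarked, so other records may share
        # that value; use >= and accept re-fetching the boundary records.
        comparison_operator = '>='

    if last_pk_fetched:
        # The trailing space keeps the two parts separated when both are present.
        where_clause = f'WHERE {filter_param} {comparison_operator} {last_pk_fetched} '

    return f'{where_clause}{order_by_clause}'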


if not composite_pks:
# Exclude equality for the stream which do not have composite primary key.
comparision_operator = ">"


@prijendev Spelling mistake: 'comparison'


@prijendev Why are we changing the operator for a single PK case? How do we handle composite PK case in the where clause?

Contributor Author

Because in the single-PK case we are sure that no other record will have the same pk, we do not want to fetch the last record again. That's why we keep the > operator.

In the composite-PK case, we store only a single pk value in the bookmark, so records with the same bookmarked value may exist in other key combinations. That's why for composite_pks we use a greater-than-or-equal operator.

For example, suppose the ad_group_criterion stream has the following records:

{'ad_group_id': 2, 'criterion_id': 1}
{'ad_group_id': 2, 'criterion_id': 4}
{'ad_group_id': 3, 'criterion_id': 4}
{'ad_group_id': 3, 'criterion_id': 2}

In this case, the bookmark will be stored as last_pk_fetched: 3 (assuming state is written after every 3 records and the tap was interrupted after the 3rd record). The next time the sync starts, the query's WHERE and ORDER BY clauses will look like this:

WHERE ad_group_id >= 3 ORDER BY ad_group_id ASC
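
A small worked example (plain Python, illustrative only) of why the operator matters here: with >= the already-emitted boundary record is fetched again but nothing is lost; with > the not-yet-synced {'ad_group_id': 3, 'criterion_id': 2} record would be skipped entirely.

records = [
    {'ad_group_id': 2, 'criterion_id': 1},
    {'ad_group_id': 2, 'criterion_id': 4},
    {'ad_group_id': 3, 'criterion_id': 4},   # last record emitted before the interruption
    {'ad_group_id': 3, 'criterion_id': 2},   # not yet synced
]
last_pk_fetched = 3

# Composite PK (>=): re-fetches the boundary records but never drops any.
resumed_gte = [r for r in records if r['ad_group_id'] >= last_pk_fetched]
# -> both ad_group_id == 3 records

# Single-PK style (>): would return nothing here and lose the 4th record.
resumed_gt = [r for r in records if r['ad_group_id'] > last_pk_fetched]
# -> []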


if last_pk_fetched:
# Create WHERE clause based on last_pk_fetched.
where_clause = 'WHERE {} {} {} '.format(filter_param, comparision_operator, last_pk_fetched)


@prijendev in the example you mentioned that filter_param is in list format. Is appending the list to the where_clause intentional?

Contributor Author

No, appending a list to the where_clause is not intentional. filter_param is in string format (link). Updated the comment to use the string format.

# Assign True if the primary key is composite.
composite_pks = len(self.primary_keys) > 1

query = create_core_stream_query(resource_name, selected_fields, last_pk_fetched.get('last_pk_fetched'), self.filter_param, composite_pks)


@prijendev Why can't we keep this function in the BaseStream class itself?
It would be beneficial because we would not need to pass so many parameters to the function.

Contributor Author

create_core_stream_query is an existing function. The query-building and other helper functions were kept out of the class because BaseStream is the parent class of UserInterestStream and ReportStream. The create-query function for report streams (link) is also placed outside the BaseStream class. So I think it is fine not to move the create-query function into the BaseStream class.

@@ -80,9 +80,49 @@ def build_parameters():
param_str = ",".join(f"{k}={v}" for k, v in API_PARAMETERS.items())
return f"PARAMETERS {param_str}"

def generate_where_and_orderby_clause(last_pk_fetched, filter_param, composite_pks):


@prijendev in the example, you have not provided the value of composite_pks. Please provide it.

Contributor Author

Provided an example of composite_pks.


# Write state(last_pk_fetched) using primary key(id) value for core streams after DEFAULT_PAGE_SIZE records
if counter.value % DEFAULT_PAGE_SIZE == 0 and self.filter_param:
singer.write_bookmark(state, stream["tap_stream_id"], customer["customerId"], {'last_pk_fetched': record[self.primary_keys[0]]})


@prijendev can you please add a logger to this line, for debugging purposes in the future?

Contributor Author

Added logger message.
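
A sketch of what the bookmark write with the added logger might look like (only the write_bookmark call and the surrounding condition come from the snippet above; LOGGER is assumed to be the module-level singer logger, and the log wording is an assumption):

# Write state (last_pk_fetched) using the primary key (id) value for core streams
# after every DEFAULT_PAGE_SIZE records.
if counter.value % DEFAULT_PAGE_SIZE == 0 and self.filter_param:
    bookmark_value = {'last_pk_fetched': record[self.primary_keys[0]]}
    # Hypothetical log message to help debug interrupted syncs.
    LOGGER.info('Writing bookmark for stream %s, customer %s: %s',
                stream["tap_stream_id"], customer["customerId"], bookmark_value)
    singer.write_bookmark(state, stream["tap_stream_id"],
                          customer["customerId"], bookmark_value)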

@prijendev prijendev changed the base branch from main to crest-master June 1, 2022 11:28
state['bookmarks'].pop(stream["tap_stream_id"])
singer.write_state(state)

def should_sync_record(record, filter_param, pks, last_pk_fetched):


Add unit tests for the above function.

Contributor Author

Added unit test cases for the should_sync_record method.

Comment on lines 62 to 67
interrupted_state = {
'currently_syncing': (stream, '5548074409'),
'bookmarks': {
stream: {'5548074409': {'last_pk_fetched': last_pk_fetched}},
}
}
Contributor

Instead of testing every stream individually, I think a more valuable approach is to validate that the sync can resume when it is interrupted part-way through with multiple streams selected. That way we can verify the ordering of streams is maintained on the resuming sync. So in a sync replicating streams A, B and C the interrupted state would be set to stream B. Then the resuming sync should start with stream B, and then move on to stream C. At the tap-tester level we can validate the expected records are synced by comparing to the uninterrupted sync results. At the unittest level the ordering should be explicitly tested. I'd expect the streams to be ordered alphabetically by name, and that the order is shuffled to account for interrupted states.

Contributor Author

@kspeer825 We have updated the integration test case as per your suggestion. Can you please review it?
Unit test cases to validate the order of streams are already available here (link). Are you expecting anything else?
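
A sketch of the multi-stream scenario being described (stream names, customer id, and the reorder helper are illustrative assumptions, not the tap's actual code; whether previously completed streams are re-synced afterwards depends on the tap):

# Streams selected alphabetically; the previous sync was interrupted while on 'ads'.
interrupted_state = {
    'currently_syncing': ('ads', '5548074409'),
    'bookmarks': {
        'ads': {'5548074409': {'last_pk_fetched': 586722860308}},
    },
}

selected_streams = ['ad_groups', 'ads', 'campaigns']

def reorder_for_resume(streams, currently_syncing):
    # Start from the interrupted stream, then continue with the remaining streams.
    if currently_syncing not in streams:
        return streams
    start = streams.index(currently_syncing)
    return streams[start:] + streams[:start]

# -> ['ads', 'campaigns', 'ad_groups']
print(reorder_for_resume(selected_streams, interrupted_state['currently_syncing'][0]))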

Composite PK Case:

composite_pks = True
filter_param = 'id'


@prijendev when we have composite_pks, wouldn't there be more than one filter_param?

@prijendev prijendev requested a review from kspeer825 June 10, 2022 04:41
Contributor

@atribed atribed left a comment

Had a minor comment and a quick question!


if last_pk_fetched:
# Create WHERE clause based on last_pk_fetched.
where_clause = 'WHERE {} {} {} '.format(filter_param, comparison_operator, last_pk_fetched)
Contributor

Super minor, but we use f-strings in a function below and .format up here; what would you think about using f-strings here too? If we did that, the returned string could be return f"{where_clause} {order_by_clause}" and we wouldn't need to worry about the trailing space at the end here, which could be less fragile, since the implementation wouldn't depend on the space being on the end there.

Contributor Author

@prijendev prijendev Jun 14, 2022

Replaced .format with an f-string. We originally used .format because an earlier tap broke when we used f-strings with older Python versions that don't support them. But since f-strings are already used in this tap, we can use them in other places as well to keep it consistent across the tap.

# Even If the stream has a composite primary key, we are storing only a single pk value in the bookmark.
# So, there might be possible that records with the same single pk value exist with different pk value combinations.
# That's why for composite_pks we are using a greater than or equal operator.
comparison_operator = ">="
Contributor

I think there was a conversation around this, but I just want to make sure we don't want to store both pieces of the composite key in the bookmark. I think there might be a couple of hard-to-reach edge cases with this approach, but they definitely don't seem likely.

Contributor Author

@prijendev prijendev Jun 14, 2022

I agree with you. If the stream has a composite primary key, we store only a single pk value in the bookmark. There might be duplication at the time of interruption for streams that have composite PKs, but it is very rare.

# Verify second sync(interrupted_sync) have the less records as compared to first sync(full sync)
self.assertLess(interrupted_record_count, full_record_count)

# Verify id value of first record in interrupted sync is less than last_pk_fetched
Contributor

I think the comment and error msg in the assertion are contradicting the actual assertion being made.

Contributor Author

I have updated the comment according to the actual assertion.

Comment on lines +124 to +126
for record in full_records:
with self.subTest(record_primary_key=record[primary_key]):
self.assertIn(record, interrupted_records, msg='Record missing from resuming sync.' )
Contributor

I don't think this assertion is valid. The sync should replicate from the last_pk_fetched onward for the interrupted table. Then move on to the other streams before coming back to this stream.

Contributor Author

@prijendev prijendev Jun 15, 2022

This assertion is specific to uninterrupted streams (streams that were yet to be synced). That's why we are verifying here that the records from both syncs are the same.

Contributor

got it 👍

Contributor

@atribed atribed left a comment

I had one really minor comment (don't need to re-review it after the change), but this all looks good to me!

# Create WHERE clause based on last_pk_fetched.
where_clause = f'WHERE {filter_param} {comparison_operator} {last_pk_fetched} '

return f'{where_clause}{order_by_clause}'
Contributor

Super minor, but since this is an f-string now, we could just put the space between the two parts here and drop the trailing space from the string used above. Not having the space in where_clause makes it less susceptible to bugs, since each part doesn't need to know where it falls in the string it's being combined into.

Contributor Author

We only generate a query with a WHERE clause when last_pk_fetched (an interrupted state) is available. If we put the space between the two parts in the return statement and the where_clause is not available, it would append an extra space before order_by_clause. That's why the extra space was added at the end of where_clause.
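
A quick illustration of the point (plain Python, illustrative only):

where_clause = ''                       # uninterrupted sync: no WHERE clause
order_by_clause = 'ORDER BY id ASC'

# Space in the return statement leaves a stray leading space when WHERE is absent:
f'{where_clause} {order_by_clause}'     # ' ORDER BY id ASC'

# Trailing space inside where_clause (the approach taken here) avoids that:
f'{where_clause}{order_by_clause}'      # 'ORDER BY id ASC'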


@prijendev prijendev merged commit 3073e3e into crest-master Jun 15, 2022
@prijendev prijendev mentioned this pull request Jun 15, 2022
KrisPersonal pushed a commit that referenced this pull request Jun 15, 2022
* Tdl 19235 handle uncaught exceptions (#61)

* Added backoff for 5xx, 429 and ReadTimeout errors.

* Resolved pylint error.

* Updated comments in the unittest cases.

* Updated error handling.

* TDL-18749 Implement interruptible full table streams. (#60)

* Implemented interruptible full table streams.

* Resolved pylint error

* Resolved error in full table sync test case.

* Updated config.yml to pass cci

* Updated query building logic.

* Updated integration test case.

* Resolved review comments.

* Resolved comments.

* Implemeted logic to skip the duplicate records.

* Resolved unittest case error.

* Resolved pylint error

* Resolved integration test case error

* Added empty filter param for call_details and campaign_label stream.

* Added unit test cases for should_sync method.

* Revert "Implemeted logic to skip the duplicate records."

This reverts commit cd06e11.

* Added logger message for debugging purpose

* Updated integration test case.

* Replaced .format with f string.

* Updated comment in integration test.

Co-authored-by: KrishnanG <kgurusamy@talend.com>
@KrisPersonal KrisPersonal mentioned this pull request Jun 15, 2022
@dsprayberry dsprayberry deleted the TDL-18749-Implement-Interruptible-Full-Table branch June 30, 2022 17:24
lideke added a commit to sendinblue/tap-google-ads that referenced this pull request Aug 2, 2022
* Bump to v1.0.0, update changelog (singer-io#45)

* Qa/future testing (singer-io#48)

* documented reasons for untested streams

* list high level scenarios for manual qa checks

* remove stitch specific pr template

Co-authored-by: kspeer <kspeer@stitchdata.com>

* Add call_details stream (singer-io#49)

* Add call_view core stream, filter non-attribute fields from core streams

* Change `call_view` stream name to `call_details`

* Make pylint happy

* Fix integration tests

* Remove campaign_id from foreign keys, split call_details foreign keys

* Update tests to exclude call_details as needed

* Add context around excluding call_details from tests

* Remove outdated TODO related to addition of call_details stream.

Co-authored-by: dsprayberry <28106103+dsprayberry@users.noreply.github.com>

* Add core LABELS streams and campaign.labels fields to relevant reports (singer-io#53)

* Add core LABELS streams and campaign.labels fields to relevant reports

* Update test exclusions to exclude new core streams.

* Update discover to include campaign_label in reports list

* Update foreign_key expected metadata to include attributed resource foreign_keys

* Update tests to re-include campaign_labels and labels as we now have test data.

* Handles case where state does not have currently_syncing (singer-io#54)

* Handles case where state does not have currently_syncing

* tap-tester test added

* takeout unused imports in test

Co-authored-by: kspeer <kspeer@stitchdata.com>

* Version bump and changelog entry (singer-io#52)

* Version bump and changelog entry

* Update to include PR 53.

* Add PR 54 to changelog

* Implement Automatic Keys (singer-io#55)

* Set geographic_view.location_type as automatic to account for reporting discrepancies

* Update Automatic Fields test happy / error paths

* Update base to include automatic_keys metadata for use in tests

* WIP Add Automatic Report Fields

* Pass automatic_keys to BaseStream; use automatic_keys for inclusion

* Fix bad field name in streams; Start updating tests

* Rename function to match base.py

* Add closing brace -_-

* Fix tests; rename ad_group_criterion_criterion_id in automatic_keys; reorder test metadata; add campaign.id to campaign_audience_performance_report

* Rename report_field_parts to split_report_field

* Update transform_keys to raise ad_group_ad.ad fields

* Update happy path streams_to_test to exclude streams with no data

* Remove field name change for ad_performance_report stream

* Accept Andy's Suggestion

Co-authored-by: Andy Lu <andy@stitchdata.com>

* Reverting Andy's change because it affects core streams

* Explicitly install grpcio-status to avoid from_call attribute errors

* Update setup.py with docs explaining required but unused dep

Co-authored-by: Andy Lu <andy@stitchdata.com>
Co-authored-by: Arthur Gorka <azgorka@gmail.com>

* Cleanup (singer-io#56)

* Remove useless function

* Rename "REPORTS" to "STREAMS" for accuracy / readability

* Version bump for PRs 56 and 55 (singer-io#57)

* Version bump for PRs 56 and 55

* Update to exclude ad_group_ad change for ad_performance_report

* [Feature] Add more core streams (singer-io#58)

* WIP add new core streams

* Set geographic_view.location_type as automatic to account for reporting discrepancies

* Update Automatic Fields test happy / error paths

* Update base to include automatic_keys metadata for use in tests

* WIP Add Automatic Report Fields

* Pass automatic_keys to BaseStream; use automatic_keys for inclusion

* Fix bad field name in streams; Start updating tests

* Rename function to match base.py

* Add closing brace -_-

* Fix tests; rename ad_group_criterion_criterion_id in automatic_keys; reorder test metadata; add campaign.id to campaign_audience_performance_report

* Rename report_field_parts to split_report_field

* Update transform_keys to raise ad_group_ad.ad fields

* Update happy path streams_to_test to exclude streams with no data

* WIP remove IPDB and try except

* Fix report_definition typo and add handling for user_interest_id field

* Remove field name change for ad_performance_report stream

* Accept Andy's Suggestion

Co-authored-by: Andy Lu <andy@stitchdata.com>

* Reverting Andy's change because it affects core streams

* Explicitly install grpcio-status to avoid from_call attribute errors

* Remove ipdb -_-

* Update setup.py with docs explaining required but unused dep

* WIP w/ failing tests and attributed_resource foreign_keys

* WIP with failing tests; remove extraneous attributed_resoruce automatic fields

* Create UserInterestStream class to handle its edge case

* Add transform_keys to UserInterestStream class; start test updates

* Rename obj to json_message; rename variables accordingly; fix UserInterestStream transform_keys

* update auto fields test to account for compound pks

* Fix campaigns typo

* Remove exclusion of feed and feed items from sync canary test

Co-authored-by: Andy Lu <andy@stitchdata.com>
Co-authored-by: Arthur Gorka <azgorka@gmail.com>
Co-authored-by: kspeer <kspeer@stitchdata.com>
Co-authored-by: atribed <agorka@talend.com>

* V1.3.0 (singer-io#59)

* Version bump for PRs 56 and 55

* Update to exclude ad_group_ad change for ad_performance_report

* Version bump for 1.3.0

Co-authored-by: Arthur Gorka <azgorka@gmail.com>

* Qa/fix build notification (singer-io#65)

* fix slack notif for build

* remove click_performance_report from tests

* remove click_performance_report from sync canary test

* run only streams that are untested in canary sync

* just skip canaray test

* put back the assert in the skipped test

Co-authored-by: kspeer <kspeer@stitchdata.com>

* Crest master (singer-io#66)

* Tdl 19235 handle uncaught exceptions (singer-io#61)

* Added backoff for 5xx, 429 and ReadTimeout errors.

* Resolved pylint error.

* Updated comments in the unittest cases.

* Updated error handling.

* TDL-18749 Implement interruptible full table streams. (singer-io#60)

* Implemented interruptible full table streams.

* Resolved pylint error

* Resolved error in full table sync test case.

* Updated config.yml to pass cci

* Updated query building logic.

* Updated integration test case.

* Resolved review comments.

* Resolved comments.

* Implemeted logic to skip the duplicate records.

* Resolved unittest case error.

* Resolved pylint error

* Resolved integration test case error

* Added empty filter param for call_details and campaign_label stream.

* Added unit test cases for should_sync method.

* Revert "Implemeted logic to skip the duplicate records."

This reverts commit cd06e11.

* Added logger message for debugging purpose

* Updated integration test case.

* Replaced .format with f string.

* Updated comment in integration test.

Co-authored-by: KrishnanG <kgurusamy@talend.com>

* Bump version (singer-io#67)

Co-authored-by: KrishnanG <kgurusamy@talend.com>

* TDL-18524 updated readme and added sample config (singer-io#51)

* updated readme and added sample config

* updated endpoints

* add new streams

* resolved PR comments

* fixed a typo

* added the

* Add timeout parameter to gas.search (singer-io#64)

* Add timeout parameter to search gas.search

* Increase timeout to 15 minutes for safety.

* Add get_request_timeout function, add config to make_request as needed

* Update on_giveup to raise specific exception text for timeoutexception class; add unit test

* Make Pylint happy take 1

* Update make_request signature in unittests.

* Another Unittest update

* More unittest fixes for signature

* Add default config param value for ease of implementation in future tests

* Fix pylint dangerous-defaul-value

* Fix stupid error

* Version bump and changelog (singer-io#70)

* TDL-19486 Add limit clause to core stream queries (singer-io#68)

* Initial commit for add page limit.

* Added limit parameter in sync method of ReportStream class.

* Fixed issue for call_details stream.

* Resolved unit test case error.

* Added test cases.

* Updated code comments.

* Fixed keyerror issue.

* Updated default query limit.

* Updated pagination test case.

* Modify config name to be more explicit.

* Update comment for accuracy.

* Update property name in base and propogate name change to tests

* Exclude feed from streams_to_test because of lack of data

* Committing stuff from main that should have already been in

Co-authored-by: dsprayberry <28106103+dsprayberry@users.noreply.github.com>

* Version Bump and Changelog Update (singer-io#72)

* Reintroduce unintentionally removed files. (singer-io#73)

* add schemaless

* fix schemaless config

* update login_customer_ids

* fix missing key from dict

* add google ads api version in config
add log to show the version

* use version from config

* use version global

Co-authored-by: bryantgray <bgray@talend.com>
Co-authored-by: Kyle Speer <54034650+kspeer825@users.noreply.github.com>
Co-authored-by: kspeer <kspeer@stitchdata.com>
Co-authored-by: dsprayberry <28106103+dsprayberry@users.noreply.github.com>
Co-authored-by: Andy Lu <andy@stitchdata.com>
Co-authored-by: Arthur Gorka <azgorka@gmail.com>
Co-authored-by: atribed <agorka@talend.com>
Co-authored-by: Prijen Khokhani <88327452+prijendev@users.noreply.github.com>
Co-authored-by: KrishnanG <kgurusamy@talend.com>
Co-authored-by: KrisPersonal <66801357+KrisPersonal@users.noreply.github.com>
Co-authored-by: namrata270998 <75604662+namrata270998@users.noreply.github.com>
Co-authored-by: bbaltagi-dtsl <bilal.baltagi@sendinblue.com>
lideke added a commit to sendinblue/tap-google-ads that referenced this pull request Mar 22, 2023
lideke added a commit to sendinblue/tap-google-ads that referenced this pull request Oct 2, 2023