Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

attempted fix for hierarchy stream primary key #28

Merged
merged 21 commits into from
Apr 7, 2022
Merged

Conversation

jlloyd-widen
Copy link
Contributor

@jlloyd-widen jlloyd-widen commented Mar 18, 2022

fixes #18 . Specifically, it fixes the Primary Key errors mentioned in that issue. This required flattening of the json structures returned by the API. There were also several instances where query terms were missing in the google query and the primary keys were just completely incorrect.

Josh Lloyd and others added 5 commits March 18, 2022 12:09
@visch
Copy link
Contributor

visch commented Mar 22, 2022

Thanks for this PR @jlloyd-widen

I'm trying to fix #30 first, generally the idea here seems right! Thanks for adding the test, I'll come back after I get 30 fixed

@visch
Copy link
Contributor

visch commented Mar 22, 2022

I'm a little worried about responses with GAQL's of something like https://github.com/AutoIDM/tap-googleads/blob/main/tap_googleads/streams.py#L316

Looks like you can have multiple nested objects returned by Google's api.

Maybe we just pull the primary key out of the record and put it at the root level and leave the rest? I haven't dove in enough to know.

@visch
Copy link
Contributor

visch commented Mar 22, 2022

I'm a little worried about responses with GAQL's of something like https://github.com/AutoIDM/tap-googleads/blob/main/tap_googleads/streams.py#L316

Looks like you can have multiple nested objects returned by Google's api.

Maybe we just pull the primary key out of the record and put it at the root level and leave the rest? I haven't dove in enough to know.

This is pretty well addressed by your PR.

I wonder if leaving the flattening up to the USER would be appropriate here as we could hop onto the new functionality here https://gitlab.com/meltano/sdk/-/merge_requests/236/diffs#5419fd77f507b41e5e0fe17b6fbf7375fbf2c3eb .

That doesn't answer the question about how to get a primary key for each of these reports though. The work you put in to make the primary key for each stream is good.

Things I'm thinking about:

  1. How do we handle custom reports in regards to primary keys and user flattening? Custom Reports #32 . Honestly with the auto flattening here it might be a lot easier to do custom mappings (although you'd have to have the user define which objects to denest or nest which could get confusing)
  2. Make flattening of all objects a user decided feature. Ideally we leave the data untouched when possible?
  3. What about a surrogate key, we'd call it primary_key for each stream, ie for campaign_performance we'd do primary_key = json_path(record, "id")+json_path(record, "date"]

I'm not in love with the surrogate key idea. We're at a trade off point of do we keep the data as close to the source data as possible by leaving the data in the same structure as before but making a surrogate key? Or do we denest the objects ourselves

Just questions not really answers to anything.

@visch
Copy link
Contributor

visch commented Mar 23, 2022

Gave this a shot with some realistic data

fails on the campaign_performance_by_location stream


2022-03-23T15:06:27.287160Z [info     ] Traceback (most recent call last): cmd_type=extractor job_id=test_5 name=tap-googleads run_id=56c388e8-b0f7-4b8b-a70c-8ff5019e6c5d stdio=stderr
2022-03-23T15:06:27.287291Z [info     ]   File "<string>", line 1, in <module> cmd_type=extractor job_id=test_5 name=tap-googleads run_id=56c388e8-b0f7-4b8b-a70c-8ff5019e6c5d stdio=stderr
2022-03-23T15:06:27.287443Z [info     ]   File "/home/visch/git/tap-googleads/.venv/lib/python3.9/site-packages/click/core.py", line 1128, in __call__ cmd_type=extractor job_id=test_5 name=tap-googleads run_id=56c388e8-b0f7-4b8b-a70c-8ff5019e6c5d stdio=stderr
2022-03-23T15:06:27.287571Z [info     ]     return self.main(*args, **kwargs) cmd_type=extractor job_id=test_5 name=tap-googleads run_id=56c388e8-b0f7-4b8b-a70c-8ff5019e6c5d stdio=stderr
2022-03-23T15:06:27.287689Z [info     ]   File "/home/visch/git/tap-googleads/.venv/lib/python3.9/site-packages/click/core.py", line 1053, in main cmd_type=extractor job_id=test_5 name=tap-googleads run_id=56c388e8-b0f7-4b8b-a70c-8ff5019e6c5d stdio=stderr
2022-03-23T15:06:27.287811Z [info     ]     rv = self.invoke(ctx)      cmd_type=extractor job_id=test_5 name=tap-googleads run_id=56c388e8-b0f7-4b8b-a70c-8ff5019e6c5d stdio=stderr
2022-03-23T15:06:27.287928Z [info     ]   File "/home/visch/git/tap-googleads/.venv/lib/python3.9/site-packages/click/core.py", line 1395, in invoke cmd_type=extractor job_id=test_5 name=tap-googleads run_id=56c388e8-b0f7-4b8b-a70c-8ff5019e6c5d stdio=stderr
2022-03-23T15:06:27.288085Z [info     ]     return ctx.invoke(self.callback, **ctx.params) cmd_type=extractor job_id=test_5 name=tap-googleads run_id=56c388e8-b0f7-4b8b-a70c-8ff5019e6c5d stdio=stderr
2022-03-23T15:06:27.288222Z [info     ]   File "/home/visch/git/tap-googleads/.venv/lib/python3.9/site-packages/click/core.py", line 754, in invoke cmd_type=extractor job_id=test_5 name=tap-googleads run_id=56c388e8-b0f7-4b8b-a70c-8ff5019e6c5d stdio=stderr
2022-03-23T15:06:27.288342Z [info     ]     return __callback(*args, **kwargs) cmd_type=extractor job_id=test_5 name=tap-googleads run_id=56c388e8-b0f7-4b8b-a70c-8ff5019e6c5d stdio=stderr
2022-03-23T15:06:27.288490Z [info     ]   File "/home/visch/git/tap-googleads/.venv/lib/python3.9/site-packages/singer_sdk/tap_base.py", line 476, in cli cmd_type=extractor job_id=test_5 name=tap-googleads run_id=56c388e8-b0f7-4b8b-a70c-8ff5019e6c5d stdio=stderr
2022-03-23T15:06:27.288611Z [info     ]     tap.sync_all()             cmd_type=extractor job_id=test_5 name=tap-googleads run_id=56c388e8-b0f7-4b8b-a70c-8ff5019e6c5d stdio=stderr
2022-03-23T15:06:27.288736Z [info     ]   File "/home/visch/git/tap-googleads/.venv/lib/python3.9/site-packages/singer_sdk/tap_base.py", line 345, in sync_all cmd_type=extractor job_id=test_5 name=tap-googleads run_id=56c388e8-b0f7-4b8b-a70c-8ff5019e6c5d stdio=stderr
2022-03-23T15:06:27.288875Z [info     ]     stream.sync()              cmd_type=extractor job_id=test_5 name=tap-googleads run_id=56c388e8-b0f7-4b8b-a70c-8ff5019e6c5d stdio=stderr
2022-03-23T15:06:27.289039Z [info     ]   File "/home/visch/git/tap-googleads/.venv/lib/python3.9/site-packages/singer_sdk/streams/core.py", line 1010, in sync cmd_type=extractor job_id=test_5 name=tap-googleads run_id=56c388e8-b0f7-4b8b-a70c-8ff5019e6c5d stdio=stderr
2022-03-23T15:06:27.289181Z [info     ]     self._sync_records(context) cmd_type=extractor job_id=test_5 name=tap-googleads run_id=56c388e8-b0f7-4b8b-a70c-8ff5019e6c5d stdio=stderr
2022-03-23T15:06:27.289300Z [info     ]   File "/home/visch/git/tap-googleads/.venv/lib/python3.9/site-packages/singer_sdk/streams/core.py", line 953, in _sync_records cmd_type=extractor job_id=test_5 name=tap-googleads run_id=56c388e8-b0f7-4b8b-a70c-8ff5019e6c5d stdio=stderr
2022-03-23T15:06:27.289431Z [info     ]     self._sync_children(child_context) cmd_type=extractor job_id=test_5 name=tap-googleads run_id=56c388e8-b0f7-4b8b-a70c-8ff5019e6c5d stdio=stderr
2022-03-23T15:06:27.289558Z [info     ]   File "/home/visch/git/tap-googleads/.venv/lib/python3.9/site-packages/singer_sdk/streams/core.py", line 1015, in _sync_children cmd_type=extractor job_id=test_5 name=tap-googleads run_id=56c388e8-b0f7-4b8b-a70c-8ff5019e6c5d stdio=stderr
2022-03-23T15:06:27.289705Z [info     ]     child_stream.sync(context=child_context) cmd_type=extractor job_id=test_5 name=tap-googleads run_id=56c388e8-b0f7-4b8b-a70c-8ff5019e6c5d stdio=stderr
2022-03-23T15:06:27.289861Z [info     ]   File "/home/visch/git/tap-googleads/.venv/lib/python3.9/site-packages/singer_sdk/streams/core.py", line 1010, in sync cmd_type=extractor job_id=test_5 name=tap-googleads run_id=56c388e8-b0f7-4b8b-a70c-8ff5019e6c5d stdio=stderr
2022-03-23T15:06:27.289985Z [info     ]     self._sync_records(context) cmd_type=extractor job_id=test_5 name=tap-googleads run_id=56c388e8-b0f7-4b8b-a70c-8ff5019e6c5d stdio=stderr
2022-03-23T15:06:27.290135Z [info     ]   File "/home/visch/git/tap-googleads/.venv/lib/python3.9/site-packages/singer_sdk/streams/core.py", line 953, in _sync_records cmd_type=extractor job_id=test_5 name=tap-googleads run_id=56c388e8-b0f7-4b8b-a70c-8ff5019e6c5d stdio=stderr
2022-03-23T15:06:27.290278Z [info     ]     self._sync_children(child_context) cmd_type=extractor job_id=test_5 name=tap-googleads run_id=56c388e8-b0f7-4b8b-a70c-8ff5019e6c5d stdio=stderr
2022-03-23T15:06:27.290428Z [info     ]   File "/home/visch/git/tap-googleads/.venv/lib/python3.9/site-packages/singer_sdk/streams/core.py", line 1015, in _sync_children cmd_type=extractor job_id=test_5 name=tap-googleads run_id=56c388e8-b0f7-4b8b-a70c-8ff5019e6c5d stdio=stderr
2022-03-23T15:06:27.290629Z [info     ]     child_stream.sync(context=child_context) cmd_type=extractor job_id=test_5 name=tap-googleads run_id=56c388e8-b0f7-4b8b-a70c-8ff5019e6c5d stdio=stderr
2022-03-23T15:06:27.290753Z [info     ]   File "/home/visch/git/tap-googleads/.venv/lib/python3.9/site-packages/singer_sdk/streams/core.py", line 1010, in sync cmd_type=extractor job_id=test_5 name=tap-googleads run_id=56c388e8-b0f7-4b8b-a70c-8ff5019e6c5d stdio=stderr
2022-03-23T15:06:27.290873Z [info     ]     self._sync_records(context) cmd_type=extractor job_id=test_5 name=tap-googleads run_id=56c388e8-b0f7-4b8b-a70c-8ff5019e6c5d stdio=stderr
2022-03-23T15:06:27.290988Z [info     ]   File "/home/visch/git/tap-googleads/.venv/lib/python3.9/site-packages/singer_sdk/streams/core.py", line 937, in _sync_records cmd_type=extractor job_id=test_5 name=tap-googleads run_id=56c388e8-b0f7-4b8b-a70c-8ff5019e6c5d stdio=stderr
2022-03-23T15:06:27.291116Z [info     ]     for record_result in self.get_records(current_context): cmd_type=extractor job_id=test_5 name=tap-googleads run_id=56c388e8-b0f7-4b8b-a70c-8ff5019e6c5d stdio=stderr
2022-03-23T15:06:27.291306Z [info     ]   File "/home/visch/git/tap-googleads/tap_googleads/client.py", line 124, in get_records cmd_type=extractor job_id=test_5 name=tap-googleads run_id=56c388e8-b0f7-4b8b-a70c-8ff5019e6c5d stdio=stderr
2022-03-23T15:06:27.291427Z [info     ]     transformed_record = self.post_process(record, context) cmd_type=extractor job_id=test_5 name=tap-googleads run_id=56c388e8-b0f7-4b8b-a70c-8ff5019e6c5d stdio=stderr
2022-03-23T15:06:27.291542Z [info     ]   File "/home/visch/git/tap-googleads/tap_googleads/streams.py", line 456, in post_process cmd_type=extractor job_id=test_5 name=tap-googleads run_id=56c388e8-b0f7-4b8b-a70c-8ff5019e6c5d stdio=stderr
2022-03-23T15:06:27.291660Z [info     ]     return flatten(            cmd_type=extractor job_id=test_5 name=tap-googleads run_id=56c388e8-b0f7-4b8b-a70c-8ff5019e6c5d stdio=stderr
2022-03-23T15:06:27.291774Z [info     ]   File "/home/visch/git/tap-googleads/tap_googleads/streams.py", line 24, in flatten cmd_type=extractor job_id=test_5 name=tap-googleads run_id=56c388e8-b0f7-4b8b-a70c-8ff5019e6c5d stdio=stderr
2022-03-23T15:06:27.291895Z [info     ]     subitem = row.pop(prefix)  cmd_type=extractor job_id=test_5 name=tap-googleads run_id=56c388e8-b0f7-4b8b-a70c-8ff5019e6c5d stdio=stderr
2022-03-23T15:06:27.292010Z [info     ] KeyError: 'campaignCriterion_location' cmd_type=extractor job_id=test_5 name=tap-googleads run_id=56c388e8-b0f7-4b8b-a70c-8ff5019e6c5d stdio=stderr
2022-03-23T15:06:27.333911Z [error    ] Extraction failed              code=1 job_id=test_5 message=KeyError: 'campaignCriterion_location' name=meltano run_id=56c388e8-b0f7-4b8b-a70c-8ff5019e6c5d
2022-03-23T15:06:27.334330Z [info     ] ELT could not be completed: Extractor failed. cmd_type=elt job_id=test_5 name=meltano run_id=56c388e8-b0f7-4b8b-a70c-8ff5019e6c5d stdio=stderr
2022-03-23T15:06:27.334461Z [info     ] For more detailed log messages re-run the command using 'meltano --log-level=debug ...' CLI flag. cmd_type=elt job_id=test_5 name=meltano run_id=56c388e8-b0f7-4b8b-a70c-8ff5019e6c5d stdio=stderr
2022-03-23T15:06:27.334584Z [info     ] Note that you can also check the generated log file at '/home/visch/git/tap-googleads/.meltano/logs/elt/test_5/56c388e8-b0f7-4b8b-a70c-8ff5019e6c5d/elt.log'. cmd_type=elt job_id=test_5 name=meltano run_id=56c388e8-b0f7-4b8b-a70c-8ff5019e6c5d stdio=stderr
2022-03-23T15:06:27.334719Z [info     ] For more information on debugging and logging: https://docs.meltano.com/reference/command-line-interface#debugging cmd_type=elt job_id=test_5 name=meltano run_id=56c388e8-b0f7-4b8b-a70c-8ff5019e6c5d stdio=stderr

Also fails on the conversion_by_location stream it seems. Haven't dove into why, but we need to clear this up first


2022-03-23T15:10:18.749933Z [info     ] time=2022-03-23 11:10:18 name=tap-googleads level=WARNING message=Property 'client_id' was present in the 'conversion_by_location' stream but not found in catalog schema. Ignoring. cmd_type=extractor job_id=test_5 name=tap-googleads run_id=e62918d4-c084-4c72-89cd-6bf19f2a13c3 stdio=stderr
2022-03-23T15:10:18.750125Z [info     ] Traceback (most recent call last): cmd_type=extractor job_id=test_5 name=tap-googleads run_id=e62918d4-c084-4c72-89cd-6bf19f2a13c3 stdio=stderr
2022-03-23T15:10:18.750259Z [info     ]   File "<string>", line 1, in <module> cmd_type=extractor job_id=test_5 name=tap-googleads run_id=e62918d4-c084-4c72-89cd-6bf19f2a13c3 stdio=stderr
2022-03-23T15:10:18.750466Z [info     ]   File "/home/visch/git/tap-googleads/.venv/lib/python3.9/site-packages/click/core.py", line 1128, in __call__ cmd_type=extractor job_id=test_5 name=tap-googleads run_id=e62918d4-c084-4c72-89cd-6bf19f2a13c3 stdio=stderr
2022-03-23T15:10:18.750622Z [info     ]     return self.main(*args, **kwargs) cmd_type=extractor job_id=test_5 name=tap-googleads run_id=e62918d4-c084-4c72-89cd-6bf19f2a13c3 stdio=stderr
2022-03-23T15:10:18.750768Z [info     ]   File "/home/visch/git/tap-googleads/.venv/lib/python3.9/site-packages/click/core.py", line 1053, in main cmd_type=extractor job_id=test_5 name=tap-googleads run_id=e62918d4-c084-4c72-89cd-6bf19f2a13c3 stdio=stderr
2022-03-23T15:10:18.750904Z [info     ]     rv = self.invoke(ctx)      cmd_type=extractor job_id=test_5 name=tap-googleads run_id=e62918d4-c084-4c72-89cd-6bf19f2a13c3 stdio=stderr
2022-03-23T15:10:18.751046Z [info     ]   File "/home/visch/git/tap-googleads/.venv/lib/python3.9/site-packages/click/core.py", line 1395, in invoke cmd_type=extractor job_id=test_5 name=tap-googleads run_id=e62918d4-c084-4c72-89cd-6bf19f2a13c3 stdio=stderr
2022-03-23T15:10:18.751252Z [info     ]     return ctx.invoke(self.callback, **ctx.params) cmd_type=extractor job_id=test_5 name=tap-googleads run_id=e62918d4-c084-4c72-89cd-6bf19f2a13c3 stdio=stderr
2022-03-23T15:10:18.751424Z [info     ]   File "/home/visch/git/tap-googleads/.venv/lib/python3.9/site-packages/click/core.py", line 754, in invoke cmd_type=extractor job_id=test_5 name=tap-googleads run_id=e62918d4-c084-4c72-89cd-6bf19f2a13c3 stdio=stderr
2022-03-23T15:10:18.751548Z [info     ]     return __callback(*args, **kwargs) cmd_type=extractor job_id=test_5 name=tap-googleads run_id=e62918d4-c084-4c72-89cd-6bf19f2a13c3 stdio=stderr
2022-03-23T15:10:18.751662Z [info     ]   File "/home/visch/git/tap-googleads/.venv/lib/python3.9/site-packages/singer_sdk/tap_base.py", line 476, in cli cmd_type=extractor job_id=test_5 name=tap-googleads run_id=e62918d4-c084-4c72-89cd-6bf19f2a13c3 stdio=stderr
2022-03-23T15:10:18.751776Z [info     ]     tap.sync_all()             cmd_type=extractor job_id=test_5 name=tap-googleads run_id=e62918d4-c084-4c72-89cd-6bf19f2a13c3 stdio=stderr
2022-03-23T15:10:18.751889Z [info     ]   File "/home/visch/git/tap-googleads/.venv/lib/python3.9/site-packages/singer_sdk/tap_base.py", line 345, in sync_all cmd_type=extractor job_id=test_5 name=tap-googleads run_id=e62918d4-c084-4c72-89cd-6bf19f2a13c3 stdio=stderr
2022-03-23T15:10:18.752004Z [info     ]     stream.sync()              cmd_type=extractor job_id=test_5 name=tap-googleads run_id=e62918d4-c084-4c72-89cd-6bf19f2a13c3 stdio=stderr
2022-03-23T15:10:18.752117Z [info     ]   File "/home/visch/git/tap-googleads/.venv/lib/python3.9/site-packages/singer_sdk/streams/core.py", line 1010, in sync cmd_type=extractor job_id=test_5 name=tap-googleads run_id=e62918d4-c084-4c72-89cd-6bf19f2a13c3 stdio=stderr
2022-03-23T15:10:18.752230Z [info     ]     self._sync_records(context) cmd_type=extractor job_id=test_5 name=tap-googleads run_id=e62918d4-c084-4c72-89cd-6bf19f2a13c3 stdio=stderr
2022-03-23T15:10:18.752342Z [info     ]   File "/home/visch/git/tap-googleads/.venv/lib/python3.9/site-packages/singer_sdk/streams/core.py", line 953, in _sync_records cmd_type=extractor job_id=test_5 name=tap-googleads run_id=e62918d4-c084-4c72-89cd-6bf19f2a13c3 stdio=stderr
2022-03-23T15:10:18.752478Z [info     ]     self._sync_children(child_context) cmd_type=extractor job_id=test_5 name=tap-googleads run_id=e62918d4-c084-4c72-89cd-6bf19f2a13c3 stdio=stderr
2022-03-23T15:10:18.752607Z [info     ]   File "/home/visch/git/tap-googleads/.venv/lib/python3.9/site-packages/singer_sdk/streams/core.py", line 1015, in _sync_children cmd_type=extractor job_id=test_5 name=tap-googleads run_id=e62918d4-c084-4c72-89cd-6bf19f2a13c3 stdio=stderr
2022-03-23T15:10:18.752770Z [info     ]     child_stream.sync(context=child_context) cmd_type=extractor job_id=test_5 name=tap-googleads run_id=e62918d4-c084-4c72-89cd-6bf19f2a13c3 stdio=stderr
2022-03-23T15:10:18.752889Z [info     ]   File "/home/visch/git/tap-googleads/.venv/lib/python3.9/site-packages/singer_sdk/streams/core.py", line 1010, in sync cmd_type=extractor job_id=test_5 name=tap-googleads run_id=e62918d4-c084-4c72-89cd-6bf19f2a13c3 stdio=stderr
2022-03-23T15:10:18.753004Z [info     ]     self._sync_records(context) cmd_type=extractor job_id=test_5 name=tap-googleads run_id=e62918d4-c084-4c72-89cd-6bf19f2a13c3 stdio=stderr
2022-03-23T15:10:18.753127Z [info     ]   File "/home/visch/git/tap-googleads/.venv/lib/python3.9/site-packages/singer_sdk/streams/core.py", line 953, in _sync_records cmd_type=extractor job_id=test_5 name=tap-googleads run_id=e62918d4-c084-4c72-89cd-6bf19f2a13c3 stdio=stderr
2022-03-23T15:10:18.753240Z [info     ]     self._sync_children(child_context) cmd_type=extractor job_id=test_5 name=tap-googleads run_id=e62918d4-c084-4c72-89cd-6bf19f2a13c3 stdio=stderr
2022-03-23T15:10:18.753352Z [info     ]   File "/home/visch/git/tap-googleads/.venv/lib/python3.9/site-packages/singer_sdk/streams/core.py", line 1015, in _sync_children cmd_type=extractor job_id=test_5 name=tap-googleads run_id=e62918d4-c084-4c72-89cd-6bf19f2a13c3 stdio=stderr
2022-03-23T15:10:18.753476Z [info     ]     child_stream.sync(context=child_context) cmd_type=extractor job_id=test_5 name=tap-googleads run_id=e62918d4-c084-4c72-89cd-6bf19f2a13c3 stdio=stderr
2022-03-23T15:10:18.753589Z [info     ]   File "/home/visch/git/tap-googleads/.venv/lib/python3.9/site-packages/singer_sdk/streams/core.py", line 1010, in sync cmd_type=extractor job_id=test_5 name=tap-googleads run_id=e62918d4-c084-4c72-89cd-6bf19f2a13c3 stdio=stderr
2022-03-23T15:10:18.753702Z [info     ]     self._sync_records(context) cmd_type=extractor job_id=test_5 name=tap-googleads run_id=e62918d4-c084-4c72-89cd-6bf19f2a13c3 stdio=stderr
2022-03-23T15:10:18.753814Z [info     ]   File "/home/visch/git/tap-googleads/.venv/lib/python3.9/site-packages/singer_sdk/streams/core.py", line 937, in _sync_records cmd_type=extractor job_id=test_5 name=tap-googleads run_id=e62918d4-c084-4c72-89cd-6bf19f2a13c3 stdio=stderr
2022-03-23T15:10:18.753926Z [info     ]     for record_result in self.get_records(current_context): cmd_type=extractor job_id=test_5 name=tap-googleads run_id=e62918d4-c084-4c72-89cd-6bf19f2a13c3 stdio=stderr
2022-03-23T15:10:18.754038Z [info     ]   File "/home/visch/git/tap-googleads/tap_googleads/client.py", line 124, in get_records cmd_type=extractor job_id=test_5 name=tap-googleads run_id=e62918d4-c084-4c72-89cd-6bf19f2a13c3 stdio=stderr
2022-03-23T15:10:18.754151Z [info     ]     transformed_record = self.post_process(record, context) cmd_type=extractor job_id=test_5 name=tap-googleads run_id=e62918d4-c084-4c72-89cd-6bf19f2a13c3 stdio=stderr
2022-03-23T15:10:18.754263Z [info     ]   File "/home/visch/git/tap-googleads/tap_googleads/streams.py", line 490, in post_process cmd_type=extractor job_id=test_5 name=tap-googleads run_id=e62918d4-c084-4c72-89cd-6bf19f2a13c3 stdio=stderr
2022-03-23T15:10:18.754375Z [info     ]     return flatten(            cmd_type=extractor job_id=test_5 name=tap-googleads run_id=e62918d4-c084-4c72-89cd-6bf19f2a13c3 stdio=stderr
2022-03-23T15:10:18.754505Z [info     ]   File "/home/visch/git/tap-googleads/tap_googleads/streams.py", line 24, in flatten cmd_type=extractor job_id=test_5 name=tap-googleads run_id=e62918d4-c084-4c72-89cd-6bf19f2a13c3 stdio=stderr
2022-03-23T15:10:18.754618Z [info     ]     subitem = row.pop(prefix)  cmd_type=extractor job_id=test_5 name=tap-googleads run_id=e62918d4-c084-4c72-89cd-6bf19f2a13c3 stdio=stderr
2022-03-23T15:10:18.754731Z [info     ] KeyError: 'campaignCriterion_location' cmd_type=extractor job_id=test_5 name=tap-googleads run_id=e62918d4-c084-4c72-89cd-6bf19f2a13c3 stdio=stderr
2022-03-23T15:10:18.799819Z [error    ] Extraction failed              code=1 job_id=test_5 message=KeyError: 'campaignCriterion_location' name=meltano run_id=e62918d4-c084-4c72-89cd-6bf19f2a13c3
2022-03-23T15:10:18.800206Z [info     ] ELT could not be completed: Extractor failed. cmd_type=elt job_id=test_5 name=meltano run_id=e62918d4-c084-4c72-89cd-6bf19f2a13c3 stdio=stderr
2022-03-23T15:10:18.800331Z [info     ] For more detailed log messages re-run the command using 'meltano --log-level=debug ...' CLI flag. cmd_type=elt job_id=test_5 name=meltano run_id=e62918d4-c084-4c72-89cd-6bf19f2a13c3 stdio=stderr
2022-03-23T15:10:18.800512Z [info     ] Note that you can also check the generated log file at '/home/visch/git/tap-googleads/.meltano/logs/elt/test_5/e62918d4-c084-4c72-89cd-6bf19f2a13c3/elt.log'. cmd_type=elt job_id=test_5 name=meltano run_id=e62918d4-c084-4c72-89cd-6bf19f2a13c3 stdio=stderr
2022-03-23T15:10:18.800665Z [info     ] For more information on debugging and logging: https://docs.meltano.com/reference/command-line-interface#debugging cmd_type=elt job_id=test_5 name=meltano run_id=e62918d4-c084-4c72-89cd-6bf19f2a13c3 stdio=stderr

@visch
Copy link
Contributor

visch commented Mar 23, 2022

Added Data for one of the failures

2022-03-23T15:21:45.786665Z [info     ] Data sent into flatten: {'campaign_resourceName': 'customers/!number!/campaigns/!number!', 'campaign_name': 'Job Title - Various Locations', 'campaign_id': '!number!', 'campaignCriterion_resourceName': 'customers/!number!/campaignCriteria/!number!~!number!', 'locationView_resourceName': 'customers/!number!/locationViews/!number!~!number!', 'conversions': 1, 'conversionActionCategory': 'SUBMIT_LEAD_FORM', 'date': '2022-03-19'} cmd_type=extractor job_id=test_5 name=tap-googleads run_id=7484dece-3eb3-4ac8-8d19-a0e942d7732d stdio=stderr
2022-03-23T15:21:45.786781Z [info     ] Traceback (most recent call last): cmd_type=extractor job_id=test_5 name=tap-googleads run_id=7484dece-3eb3-4ac8-8d19-a0e942d7732d stdio=stderr
2022-03-23T15:21:45.786898Z [info     ]   File "<string>", line 1, in <module> cmd_type=extractor job_id=test_5 name=tap-googleads run_id=7484dece-3eb3-4ac8-8d19-a0e942d7732d stdio=stderr
2022-03-23T15:21:45.787012Z [info     ]   File "/home/visch/git/tap-googleads/.venv/lib/python3.9/site-packages/click/core.py", line 1128, in __call__ cmd_type=extractor job_id=test_5 name=tap-googleads run_id=7484dece-3eb3-4ac8-8d19-a0e942d7732d stdio=stderr
2022-03-23T15:21:45.787208Z [info     ]     return self.main(*args, **kwargs) cmd_type=extractor job_id=test_5 name=tap-googleads run_id=7484dece-3eb3-4ac8-8d19-a0e942d7732d stdio=stderr
2022-03-23T15:21:45.787330Z [info     ]   File "/home/visch/git/tap-googleads/.venv/lib/python3.9/site-packages/click/core.py", line 1053, in main cmd_type=extractor job_id=test_5 name=tap-googleads run_id=7484dece-3eb3-4ac8-8d19-a0e942d7732d stdio=stderr
2022-03-23T15:21:45.787468Z [info     ]     rv = self.invoke(ctx)      cmd_type=extractor job_id=test_5 name=tap-googleads run_id=7484dece-3eb3-4ac8-8d19-a0e942d7732d stdio=stderr
2022-03-23T15:21:45.787606Z [info     ]   File "/home/visch/git/tap-googleads/.venv/lib/python3.9/site-packages/click/core.py", line 1395, in invoke cmd_type=extractor job_id=test_5 name=tap-googleads run_id=7484dece-3eb3-4ac8-8d19-a0e942d7732d stdio=stderr
2022-03-23T15:21:45.787725Z [info     ]     return ctx.invoke(self.callback, **ctx.params) cmd_type=extractor job_id=test_5 name=tap-googleads run_id=7484dece-3eb3-4ac8-8d19-a0e942d7732d stdio=stderr
2022-03-23T15:21:45.787862Z [info     ]   File "/home/visch/git/tap-googleads/.venv/lib/python3.9/site-packages/click/core.py", line 754, in invoke cmd_type=extractor job_id=test_5 name=tap-googleads run_id=7484dece-3eb3-4ac8-8d19-a0e942d7732d stdio=stderr
2022-03-23T15:21:45.787991Z [info     ]     return __callback(*args, **kwargs) cmd_type=extractor job_id=test_5 name=tap-googleads run_id=7484dece-3eb3-4ac8-8d19-a0e942d7732d stdio=stderr
2022-03-23T15:21:45.788106Z [info     ]   File "/home/visch/git/tap-googleads/.venv/lib/python3.9/site-packages/singer_sdk/tap_base.py", line 476, in cli cmd_type=extractor job_id=test_5 name=tap-googleads run_id=7484dece-3eb3-4ac8-8d19-a0e942d7732d stdio=stderr
2022-03-23T15:21:45.788221Z [info     ]     tap.sync_all()             cmd_type=extractor job_id=test_5 name=tap-googleads run_id=7484dece-3eb3-4ac8-8d19-a0e942d7732d stdio=stderr
2022-03-23T15:21:45.788308Z [info     ]   File "/home/visch/git/tap-googleads/.venv/lib/python3.9/site-packages/singer_sdk/tap_base.py", line 345, in sync_all cmd_type=extractor job_id=test_5 name=tap-googleads run_id=7484dece-3eb3-4ac8-8d19-a0e942d7732d stdio=stderr
2022-03-23T15:21:45.788466Z [info     ]     stream.sync()              cmd_type=extractor job_id=test_5 name=tap-googleads run_id=7484dece-3eb3-4ac8-8d19-a0e942d7732d stdio=stderr
2022-03-23T15:21:45.788581Z [info     ]   File "/home/visch/git/tap-googleads/.venv/lib/python3.9/site-packages/singer_sdk/streams/core.py", line 1010, in sync cmd_type=extractor job_id=test_5 name=tap-googleads run_id=7484dece-3eb3-4ac8-8d19-a0e942d7732d stdio=stderr
2022-03-23T15:21:45.788713Z [info     ]     self._sync_records(context) cmd_type=extractor job_id=test_5 name=tap-googleads run_id=7484dece-3eb3-4ac8-8d19-a0e942d7732d stdio=stderr
2022-03-23T15:21:45.788907Z [info     ]   File "/home/visch/git/tap-googleads/.venv/lib/python3.9/site-packages/singer_sdk/streams/core.py", line 953, in _sync_records cmd_type=extractor job_id=test_5 name=tap-googleads run_id=7484dece-3eb3-4ac8-8d19-a0e942d7732d stdio=stderr
2022-03-23T15:21:45.789067Z [info     ]     self._sync_children(child_context) cmd_type=extractor job_id=test_5 name=tap-googleads run_id=7484dece-3eb3-4ac8-8d19-a0e942d7732d stdio=stderr
2022-03-23T15:21:45.789218Z [info     ]   File "/home/visch/git/tap-googleads/.venv/lib/python3.9/site-packages/singer_sdk/streams/core.py", line 1015, in _sync_children cmd_type=extractor job_id=test_5 name=tap-googleads run_id=7484dece-3eb3-4ac8-8d19-a0e942d7732d stdio=stderr
2022-03-23T15:21:45.789434Z [info     ]     child_stream.sync(context=child_context) cmd_type=extractor job_id=test_5 name=tap-googleads run_id=7484dece-3eb3-4ac8-8d19-a0e942d7732d stdio=stderr
2022-03-23T15:21:45.789559Z [info     ]   File "/home/visch/git/tap-googleads/.venv/lib/python3.9/site-packages/singer_sdk/streams/core.py", line 1010, in sync cmd_type=extractor job_id=test_5 name=tap-googleads run_id=7484dece-3eb3-4ac8-8d19-a0e942d7732d stdio=stderr
2022-03-23T15:21:45.789691Z [info     ]     self._sync_records(context) cmd_type=extractor job_id=test_5 name=tap-googleads run_id=7484dece-3eb3-4ac8-8d19-a0e942d7732d stdio=stderr
2022-03-23T15:21:45.789819Z [info     ]   File "/home/visch/git/tap-googleads/.venv/lib/python3.9/site-packages/singer_sdk/streams/core.py", line 953, in _sync_records cmd_type=extractor job_id=test_5 name=tap-googleads run_id=7484dece-3eb3-4ac8-8d19-a0e942d7732d stdio=stderr
2022-03-23T15:21:45.789952Z [info     ]     self._sync_children(child_context) cmd_type=extractor job_id=test_5 name=tap-googleads run_id=7484dece-3eb3-4ac8-8d19-a0e942d7732d stdio=stderr
2022-03-23T15:21:45.790083Z [info     ]   File "/home/visch/git/tap-googleads/.venv/lib/python3.9/site-packages/singer_sdk/streams/core.py", line 1015, in _sync_children cmd_type=extractor job_id=test_5 name=tap-googleads run_id=7484dece-3eb3-4ac8-8d19-a0e942d7732d stdio=stderr
2022-03-23T15:21:45.790212Z [info     ]     child_stream.sync(context=child_context) cmd_type=extractor job_id=test_5 name=tap-googleads run_id=7484dece-3eb3-4ac8-8d19-a0e942d7732d stdio=stderr
2022-03-23T15:21:45.790343Z [info     ]   File "/home/visch/git/tap-googleads/.venv/lib/python3.9/site-packages/singer_sdk/streams/core.py", line 1010, in sync cmd_type=extractor job_id=test_5 name=tap-googleads run_id=7484dece-3eb3-4ac8-8d19-a0e942d7732d stdio=stderr
2022-03-23T15:21:45.790474Z [info     ]     self._sync_records(context) cmd_type=extractor job_id=test_5 name=tap-googleads run_id=7484dece-3eb3-4ac8-8d19-a0e942d7732d stdio=stderr
2022-03-23T15:21:45.790604Z [info     ]   File "/home/visch/git/tap-googleads/.venv/lib/python3.9/site-packages/singer_sdk/streams/core.py", line 937, in _sync_records cmd_type=extractor job_id=test_5 name=tap-googleads run_id=7484dece-3eb3-4ac8-8d19-a0e942d7732d stdio=stderr
2022-03-23T15:21:45.790731Z [info     ]     for record_result in self.get_records(current_context): cmd_type=extractor job_id=test_5 name=tap-googleads run_id=7484dece-3eb3-4ac8-8d19-a0e942d7732d stdio=stderr
2022-03-23T15:21:45.790872Z [info     ]   File "/home/visch/git/tap-googleads/tap_googleads/client.py", line 124, in get_records cmd_type=extractor job_id=test_5 name=tap-googleads run_id=7484dece-3eb3-4ac8-8d19-a0e942d7732d stdio=stderr
2022-03-23T15:21:45.791006Z [info     ]     transformed_record = self.post_process(record, context) cmd_type=extractor job_id=test_5 name=tap-googleads run_id=7484dece-3eb3-4ac8-8d19-a0e942d7732d stdio=stderr
2022-03-23T15:21:45.791133Z [info     ]   File "/home/visch/git/tap-googleads/tap_googleads/streams.py", line 492, in post_process cmd_type=extractor job_id=test_5 name=tap-googleads run_id=7484dece-3eb3-4ac8-8d19-a0e942d7732d stdio=stderr
2022-03-23T15:21:45.791248Z [info     ]     return flatten(            cmd_type=extractor job_id=test_5 name=tap-googleads run_id=7484dece-3eb3-4ac8-8d19-a0e942d7732d stdio=stderr
2022-03-23T15:21:45.791379Z [info     ]   File "/home/visch/git/tap-googleads/tap_googleads/streams.py", line 26, in flatten cmd_type=extractor job_id=test_5 name=tap-googleads run_id=7484dece-3eb3-4ac8-8d19-a0e942d7732d stdio=stderr
2022-03-23T15:21:45.791506Z [info     ]     subitem = row.pop(prefix)  cmd_type=extractor job_id=test_5 name=tap-googleads run_id=7484dece-3eb3-4ac8-8d19-a0e942d7732d stdio=stderr
2022-03-23T15:21:45.791620Z [info     ] KeyError: 'campaignCriterion_location' cmd_type=extractor job_id=test_5 name=tap-googleads run_id=7484dece-3eb3-4ac8-8d19-a0e942d7732d stdio=stderr
2022-03-23T15:21:45.834672Z [error    ] Extraction failed              code=1 job_id=test_5 message=KeyError: 'campaignCriterion_location' name=meltano run_id=7484dece-3eb3-4ac8-8d19-a0e942d7732d
2022-03-23T15:21:45.835079Z [info     ] ELT could not be completed: Extractor failed. cmd_type=elt job_id=test_5 name=meltano run_id=7484dece-3eb3-4ac8-8d19-a0e942d7732d stdio=stderr
2022-03-23T15:21:45.835209Z [info     ] For more detailed log messages re-run the command using 'meltano --log-level=debug ...' CLI flag. cmd_type=elt job_id=test_5 name=meltano run_id=7484dece-3eb3-4ac8-8d19-a0e942d7732d stdio=stderr
2022-03-23T15:21:45.835332Z [info     ] Note that you can also check the generated log file at '/home/visch/git/tap-googleads/.meltano/logs/elt/test_5/7484dece-3eb3-4ac8-8d19-a0e942d7732d/elt.log'. cmd_type=elt job_id=test_5 name=meltano run_id=7484dece-3eb3-4ac8-8d19-a0e942d7732d stdio=stderr
2022-03-23T15:21:45.835452Z [info     ] For more information on debugging and logging: https://docs.meltano.com/reference/command-line-interface#debugging cmd_type=elt job_id=test_5 name=meltano run_id=7484dece-3eb3-4ac8-8d19-a0e942d7732d stdio=stderr

@jlloyd-widen
Copy link
Contributor Author

I wonder if leaving the flattening up to the USER would be appropriate here as we could hop onto the new functionality here https://gitlab.com/meltano/sdk/-/merge_requests/236/diffs#5419fd77f507b41e5e0fe17b6fbf7375fbf2c3eb .

That doesn't answer the question about how to get a primary key for each of these reports though. The work you put in to make the primary key for each stream is good.

Things I'm thinking about:

  1. How do we handle custom reports in regards to primary keys and user flattening? Custom Reports #32 . Honestly with the auto flattening here it might be a lot easier to do custom mappings (although you'd have to have the user define which objects to denest or nest which could get confusing)
  2. Make flattening of all objects a user decided feature. Ideally we leave the data untouched when possible?

@visch This got me thinking that the best way to handle this probably wouldn't be to introduce flattening like I have here, but instead to introduce the ability to copy (not cut or flatten) the primary keys from where ever the are naturally in the json structure to the root level of the record. You could use json paths to designate each primary key.

For example, if the primary key json path $.object_1.example_primary_key was provided, this:

{
    "root_data_1": "foo",
    "object_1": {
        "example_primary_key": 1,
        "other_data_1": "bar"
    }
}

would become:

{
    "root_data_1": "foo",
    "example_primary_key": 1,
    "object_1": {
        "example_primary_key": 1,
        "other_data_1": "bar"
    }
}

The downside is that this introduces duplicate data into almost all of the streams, the upsides are

  1. the json paths to the primary keys can be provided/overwritten by the config to address the custom reports you mentioned in a different issue.
  2. The data is technically untouched with the minor addition of the now (sometimes) duplicate primary key data.

I think I might do this in a separate PR to demonstrate and then we can decide on which approach is preferred. I'll link all these issues and PRs once I have it.

@visch
Copy link
Contributor

visch commented Mar 24, 2022

I wonder if leaving the flattening up to the USER would be appropriate here as we could hop onto the new functionality here https://gitlab.com/meltano/sdk/-/merge_requests/236/diffs#5419fd77f507b41e5e0fe17b6fbf7375fbf2c3eb .
That doesn't answer the question about how to get a primary key for each of these reports though. The work you put in to make the primary key for each stream is good.
Things I'm thinking about:

  1. How do we handle custom reports in regards to primary keys and user flattening? Custom Reports #32 . Honestly with the auto flattening here it might be a lot easier to do custom mappings (although you'd have to have the user define which objects to denest or nest which could get confusing)
  2. Make flattening of all objects a user decided feature. Ideally we leave the data untouched when possible?

@visch This got me thinking that the best way to handle this probably wouldn't be to introduce flattening like I have here, but instead to introduce the ability to copy (not cut or flatten) the primary keys from where ever the are naturally in the json structure to the root level of the record. You could use json paths to designate each primary key.

For example, if the primary key json path $.object_1.example_primary_key was provided, this:

{
    "root_data_1": "foo",
    "object_1": {
        "example_primary_key": 1,
        "other_data_1": "bar"
    }
}

would become:

{
    "root_data_1": "foo",
    "example_primary_key": 1,
    "object_1": {
        "example_primary_key": 1,
        "other_data_1": "bar"
    }
}

The downside is that this introduces duplicate data into almost all of the streams, the upsides are

  1. the json paths to the primary keys can be provided/overwritten by the config to address the custom reports you mentioned in a different issue.
  2. The data is technically untouched with the minor addition of the now (sometimes) duplicate primary key data.

I think I might do this in a separate PR to demonstrate and then we can decide on which approach is preferred. I'll link all these issues and PRs once I have it.

I think I like that idea best! Maybe the key name could be something like _sdc_primary_key to follow singer conventions thus far?

@jlloyd-widen
Copy link
Contributor Author

jlloyd-widen commented Mar 24, 2022

I think I like that idea best! Maybe the key name could be something like _sdc_primary_key to follow singer conventions thus far?

That makes sense for a single primary key. What's the naming convention for a composite primary key? _sdc_primary_key_1, etc.?

UPDATE: Actually there isn't a convention, but the closest pattern I could come up with was _sdc_primary_key_<json_path>. Any objections?

@visch
Copy link
Contributor

visch commented Mar 24, 2022

I think I like that idea best! Maybe the key name could be something like _sdc_primary_key to follow singer conventions thus far?

That makes sense for a single primary key. What's the naming convention for a composite primary key? _sdc_primary_key_1, etc.?

UPDATE: Actually there isn't a convention, but the closest pattern I could come up with was _sdc_primary_key_<json_path>. Any objections?

Hmm, I was thinking we'd say SCHEMA primary_keys = ['_sdc_primary_key`]

and if _sdc_primary_key is a composite we'd do that as a part of calculating _sdc_primary_key ie if it's
["campaign_id", "adGroup_id"] then _sdc_primary_key=campaign_id+adGroup_id

We may want to add a check to be sure the generated primary key is actually unique? But that might be overkill honestly

Maybe I'm missing something? Let me know!

I guess we could always just not use _sdc prefix at all here, I was just thinking it'd probably make the most sense to folks used to Singer taps.

@jlloyd-widen
Copy link
Contributor Author

Hmm, I was thinking we'd say SCHEMA primary_keys = ['_sdc_primary_key`]

and if _sdc_primary_key is a composite we'd do that as a part of calculating _sdc_primary_key ie if it's ["campaign_id", "adGroup_id"] then _sdc_primary_key=campaign_id+adGroup_id

That's a better idea. I'm in favor of using the _sdc for the reasons you mentioned. Building a composite key like that actually helps alleviate the undesirable duplicate data of my original approach. I'll get a PR for this up shortly.

@jlloyd-widen
Copy link
Contributor Author

@visch updated the PR to match the discussion. Give it a run and let me know what you think. Hopefully this also resolves the errors you were getting.

@visch
Copy link
Contributor

visch commented Mar 24, 2022

@visch updated the PR to match the discussion. Give it a run and let me know what you think. Hopefully this also resolves the errors you were getting.

Skimmed the PR, generally looks good I'll deep dive once we get these bugs ironed out

New data in replicate_pk_at_root: {'campaign': {'resourceName': 'customers/!numberhere!/campaigns/!numberhere!', 'name': '!WordsWereHere! - Various Locations', 'id': '!numberhere!'}, 'metrics': {'clicks': '0', 'costMicros': '0', 'ctr': 0, 'impressions': '47'}, 'campaignCriterion': {'resourceName': 'customers/!numberhere!/campaignCriteria/!numberhere!~!numberhere!'}, 'segments': {'date': '2022-03-17'}, 'locationView': {'resourceName': 'customers/!numberhere!/locationViews/'!numberhere!~'!numberhere!'}}

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/visch/git/tap-googleads/.venv/lib/python3.9/site-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/home/visch/git/tap-googleads/.venv/lib/python3.9/site-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/home/visch/git/tap-googleads/.venv/lib/python3.9/site-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/visch/git/tap-googleads/.venv/lib/python3.9/site-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/home/visch/git/tap-googleads/.venv/lib/python3.9/site-packages/singer_sdk/tap_base.py", line 476, in cli
    tap.sync_all()
  File "/home/visch/git/tap-googleads/.venv/lib/python3.9/site-packages/singer_sdk/tap_base.py", line 345, in sync_all
    stream.sync()
  File "/home/visch/git/tap-googleads/.venv/lib/python3.9/site-packages/singer_sdk/streams/core.py", line 1010, in sync
    self._sync_records(context)
  File "/home/visch/git/tap-googleads/.venv/lib/python3.9/site-packages/singer_sdk/streams/core.py", line 953, in _sync_records
    self._sync_children(child_context)
  File "/home/visch/git/tap-googleads/.venv/lib/python3.9/site-packages/singer_sdk/streams/core.py", line 1015, in _sync_children
    child_stream.sync(context=child_context)
  File "/home/visch/git/tap-googleads/.venv/lib/python3.9/site-packages/singer_sdk/streams/core.py", line 1010, in sync
    self._sync_records(context)
  File "/home/visch/git/tap-googleads/.venv/lib/python3.9/site-packages/singer_sdk/streams/core.py", line 953, in _sync_records
    self._sync_children(child_context)
  File "/home/visch/git/tap-googleads/.venv/lib/python3.9/site-packages/singer_sdk/streams/core.py", line 1015, in _sync_children
    child_stream.sync(context=child_context)
  File "/home/visch/git/tap-googleads/.venv/lib/python3.9/site-packages/singer_sdk/streams/core.py", line 1010, in sync
    self._sync_records(context)
  File "/home/visch/git/tap-googleads/.venv/lib/python3.9/site-packages/singer_sdk/streams/core.py", line 937, in _sync_records
    for record_result in self.get_records(current_context):
  File "/home/visch/git/tap-googleads/tap_googleads/client.py", line 126, in get_records
    transformed_record = self.post_process(record, context)
  File "/home/visch/git/tap-googleads/tap_googleads/client.py", line 151, in post_process
    return replicate_pk_at_root(row, self.primary_keys_jsonpaths)
  File "/home/visch/git/tap-googleads/tap_googleads/utils.py", line 31, in replicate_pk_at_root
    val = val[level]
KeyError: 'location'

@jlloyd-widen
Copy link
Contributor Author

jlloyd-widen commented Mar 24, 2022

KeyError: 'location'

@visch Your account apparently does have the field available that I was using to set a primary key so I used an alternative field in the same query. Give this new commit a shot in your account and see if it works.

@visch
Copy link
Contributor

visch commented Apr 4, 2022

Ran this and it did work! Doing a code review now

for i, level in enumerate(levels):
val = val[level]
if i == len(levels) - 1:
pk_str += ":" + str(val) if pk_str else str(val)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know for mssql there's a limit to the size a PK index can be. https://docs.microsoft.com/en-us/sql/relational-databases/tables/primary-and-foreign-key-constraints?view=sql-server-ver15#:~:text=A%20table%20can%20contain%20only,key%20length%20of%20900%20bytes.

Postgres has some similar length I believe. I think we're fine for now, but we may have to evaluate this at some point and maybe just md5 the whole thing

@visch
Copy link
Contributor

visch commented Apr 4, 2022

Made some changes to queries while this was being written. We'll have to merge this PR, but I think we'll be pretty close to set after that. I'll go ahead and merge them and make a PR for this branch on your github. We'll see how this goes :D

@jlloyd-widen
Copy link
Contributor Author

jlloyd-widen commented Apr 4, 2022

Made some changes to queries while this was being written. We'll have to merge this PR, but I think we'll be pretty close to set after that. I'll go ahead and merge them and make a PR for this branch on your github. We'll see how this goes :D

Merged your PR on my repo's branch so this should not have merge conflicts anymore and should be up to date.

@visch
Copy link
Contributor

visch commented Apr 4, 2022

Made some changes to queries while this was being written. We'll have to merge this PR, but I think we'll be pretty close to set after that. I'll go ahead and merge them and make a PR for this branch on your github. We'll see how this goes :D

Merged your PR on my repo's branch so this should not have merge conflicts anymore and should be up to date.

Added a few more things (You were too fast for me, and I forgot a few things ha)

Could you run the latest stuff with your Google Ads data to be sure it works on your side?

After that I think we're good to go, part of me wants to run an audit on some data here to be sure we didn't miss any data on the migration. I'll think about that, if this works on your side I may just merge it!

@@ -0,0 +1,32 @@
from tap_googleads.utils import replicate_pk_at_root
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for writing tests!

@jlloyd-widen
Copy link
Contributor Author

jlloyd-widen commented Apr 4, 2022

Could you run the latest stuff with your Google Ads data to be sure it works on your side?

Sorry for the happy trigger finger ;) I ran all streams locally with your latest changes. Everything ran smoothly so I merged it. Let me know if you have any other edits.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

primary_key incorrect - causes failures in target-postgres, and target-snowflake (probably more)
2 participants