
primary_key incorrect - causes failures in target-postgres, and target-snowflake (probably more) #18

Closed
DanielPDWalker opened this issue Dec 6, 2021 · 7 comments · Fixed by #28
Labels
bug Something isn't working

Comments


DanielPDWalker commented Dec 6, 2021

As the tap stands, there are a couple of issues when using target-postgres, with both the Meltano and Transferwise variants.


1. Using target-postgres causes errors.

The Meltano variant of target-postgres results in a psycopg2 UndefinedColumn error, because it unpacks the JSON objects into columns whose names differ from those expected.
The Transferwise variant errors out with "key_properties field is required".

The workaround in my use case was to remove primary_keys altogether (to have it work with multiple target-postgres variants), but I realise that isn't a great solution, and it isn't suitable in general.

target-postgres Transferwise error

target-postgres |   File "/home/dw/Documents/hahaha/tap-googleads/.meltano/loaders/target-postgres/venv/lib/python3.8/site-packages/target_postgres/__init__.py", line 209, in persist_lines
target-postgres |     raise Exception("key_properties field is required")
target-postgres | Exception: key_properties field is required
meltano         | Loading failed (1): Exception: key_properties field is required

target-postgres Meltano error

target-postgres |     return super(DictCursor, self).execute(query, vars)
target-postgres | psycopg2.errors.UndefinedColumn: column "customer_client.id" named in key does not exist
target-postgres | LINE 1: ..., "customer_client__time_zone" character varying, PRIMARY KE...
target-postgres |                                                              ^
target-postgres | 
meltano         | Loading failed (1): (see above)
meltano         | ELT could not be completed: Loader failed


2. If you delete the primary keys and then use the Meltano variant of target-postgres, you only get the last synced customer's data.

With the Meltano variant, each time a customer's data is looped over and added to the postgres table, the table seems to be completely re-created, so you only ever get the last customer's data. Also, as the tap loops through the customers, all the postgres tables are created over and over for each customer.

I didn't end up looking too much into this issue, and it could be more of a loader issue than a tap issue, but I've never seen the loader behave like this before, so I thought I'd mention it.

For us, we use the Meltano variant of target-postgres by default for some other reasons, so my workaround was just to sync a single customer: no looping, no overwriting. Again, more a fix for our use case than a general one.

Would be awesome to hear some thoughts/ideas on these issues. Has anyone else seen target-postgres creating tables over and over? I'm guessing it's some quirk with the child streams?


visch commented Dec 7, 2021

Could you post the error you're getting from target-postgres?

@DanielPDWalker
Author

Added the error messages to the issue


visch commented Dec 7, 2021

One of the errors is due to: https://github.com/AutoIDM/tap-googleads/blob/main/tap_googleads/streams.py#L81

  • This one we just need to add the correct primary key if this isn't it (not sure how it was missed)

The other is due to: https://github.com/AutoIDM/tap-googleads/blob/main/tap_googleads/streams.py#L32

  • Need to add the correct primary key here

  • Finally, we should add a test that runs the whole connector against target-postgres, as that's a pretty good standard. Today, the production deployment of this tap uses target-s3, so a lot of the requirements are looser.
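To illustrate the invariant that targets like target-postgres enforce, here is a minimal sketch (the helper below is hypothetical, not part of the tap or target code): every entry in a stream's key_properties must name a top-level property of the stream schema, so a dotted key like customer_client.id, which points inside a nested object, has no matching column.

```python
# Hypothetical helper illustrating why a dotted primary key fails:
# targets build a PRIMARY KEY clause from key_properties, and each key
# must be a top-level column in the stream schema.

def invalid_key_properties(schema: dict, key_properties: list) -> list:
    """Return the key properties that are not top-level schema properties."""
    columns = schema.get("properties", {})
    return [key for key in key_properties if key not in columns]

schema = {
    "properties": {
        # The whole object lands in one JSON/variant column.
        "customer_client": {"type": "object"},
    }
}

# "customer_client.id" lives inside the nested object, not as its own column.
print(invalid_key_properties(schema, ["customer_client.id"]))
# → ['customer_client.id']
```

This is the same shape of failure seen in both tracebacks above: postgres reports the column named in the key does not exist, and the Transferwise variant refuses the stream outright.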


jlloyd-widen commented Mar 18, 2022

I'm getting a very similar error with the Transferwise variant of target-snowflake.

2022-03-18T16:36:51.571689Z [info     ] invalid identifier '"CUSTOMER_CLIENT.ID"' cmd_type=loader job_id=gads_1 name=target-snowflake run_id=dc70b619-bb7d-4a3e-97c7-8bc53a27fca2 stdio=stderr

which pretty much immediately follows this log:

2022-03-18T16:36:51.123275Z [info     ] time=2022-03-18 10:36:51 name=target_snowflake level=INFO message=Running query: 'CREATE TABLE IF NOT EXISTS gads."CUSTOMER_HIERARCHYSTREAM" ("_SDC_BATCHED_AT" timestamp_ntz, "_SDC_DELETED_AT" text, "_SDC_EXTRACTED_AT" timestamp_ntz, "_SDC_SOURCE_SCHEMA" text, "CUSTOMERCLIENT" variant, PRIMARY KEY ("CUSTOMER_CLIENT.ID")) data_retention_time_in_days = 1 ' with Params {'LAST_QID': None} cmd_type=loader job_id=gads_1 name=target-snowflake run_id=dc70b619-bb7d-4a3e-97c7-8bc53a27fca2 stdio=stderr

This, to me, indicates that the issue is not that the primary key is wrong, but that the primary key is located inside a variant or JSON field and therefore can't be accessed for the purposes of table creation.
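One way to sidestep that (a sketch only, under the assumption above; this is not necessarily what the eventual fix in #28 did, and the customer_client__id field name is illustrative) is to hoist the nested id up to a top-level field in record post-processing, so the primary key refers to a real column:

```python
# Sketch: copy a nested id to a top-level field so it can serve as a
# primary key column. Assumes records shaped like the customer hierarchy
# stream, with the id nested under "customer_client". The hoisted field
# name "customer_client__id" is an assumption for illustration.

def hoist_primary_key(record: dict) -> dict:
    """Copy customer_client.id to a top-level "customer_client__id" key."""
    out = dict(record)
    client = record.get("customer_client") or {}
    if "id" in client:
        out["customer_client__id"] = client["id"]
    return out

record = {"customer_client": {"id": "1234567890", "time_zone": "UTC"}}
print(hoist_primary_key(record)["customer_client__id"])  # 1234567890
```

The stream would then declare customer_client__id (and add it to the schema) instead of the dotted customer_client.id, which no SQL target can resolve against a nested variant column.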

@visch visch changed the title target-postgres issues primary_key incorrect - causes failures in target-postgres, and target-snowflake (probably more) Mar 23, 2022
@visch visch added the bug Something isn't working label Mar 23, 2022
@visch visch closed this as completed in #28 Apr 7, 2022

visch commented Apr 7, 2022

Let me know if this doesn't work now and I'll reopen this!


visch commented Apr 7, 2022

Thanks a bunch to @jlloyd-widen


mdgart commented May 14, 2023

Was this fixed? Because I still get the same error:

name=target_postgres level=CRITICAL message=Primary key is set to mandatory but not defined in the [accessible_customers] stream

and then

raise Exception("key_properties field is required")
