Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Register Postgres UUID and CIDR datatypes when creating a connection #8357

Closed

Conversation

gmontanola
Copy link

@gmontanola gmontanola commented Aug 10, 2023

resolves dbt-labs/dbt-postgres#54
docs dbt-labs/docs.getdbt.com/#

Problem

Contracts checks for data types and when it finds stuff like UUID it fails because it's not registered by default.

Solution

Registering some extra data types - UUID and IP ADDR

Checklist

  • I have read the contributing guide and understand what's expected of me
  • I have run this code in development and it appears to resolve the stated issue
  • This PR includes tests, or tests are not required/relevant for this PR
  • This PR has no interface changes (e.g. macros, cli, logs, json artifacts, config files, adapter interface, etc) or this PR has already received feedback and approval from Product or DX

@cla-bot
Copy link

cla-bot bot commented Aug 10, 2023

Thanks for your pull request, and welcome to our community! We require contributors to sign our Contributor License Agreement and we don't seem to have your signature on file. Check out this article for more information on why we have a CLA.

In order for us to review and merge your code, please submit the Individual Contributor License Agreement form attached above above. If you have questions about the CLA, or if you believe you've received this message in error, please reach out through a comment on this PR.

CLA has not been signed by users: @gmontanola

@gmontanola
Copy link
Author

gmontanola commented Aug 10, 2023

Some points:

  1. Should this be a adapter configuration option or can we just call it at module top level/inside the connection handling?
    a. If it's a adapter configuration I can modify the docs and maybe add basic tests.
  2. Are tests needed here? Feels like I'll be testing psycopg2 if I try to check for the registered types.
  3. AFAIC there are some additional types like hstore but I checked the code and it was not straightforward as UUID/IP. Should we register all extras?

I'm inclined to just make this simpler and call register_... by default. Not sure if this could do any harm and probably it will be more helpful since users might face this kind of error constantly while dealing with dbt contracts.

👉 What do you think? Can we make this always-on or should we keep this as adapter configuration?

@gmontanola gmontanola marked this pull request as ready for review August 10, 2023 18:57
@gmontanola gmontanola requested review from a team as code owners August 10, 2023 18:57
@gmontanola gmontanola marked this pull request as draft August 10, 2023 19:05
@gmontanola
Copy link
Author

gmontanola commented Aug 10, 2023

Well, I'll make this simpler since using this inside credentials won't work 😄

(it was obvious, duh)

@cla-bot cla-bot bot added the cla:yes label Aug 10, 2023
@gmontanola gmontanola marked this pull request as ready for review August 10, 2023 19:24
@gmontanola gmontanola changed the title Add Postgres adapter option to register extra data types Register Postgres UUID and CIDR datatypes when creating a connection Aug 10, 2023
@dbeatty10 dbeatty10 added the ready_for_review Externally contributed PR has functional approval, ready for code review from Core engineering label Aug 10, 2023
@codecov
Copy link

codecov bot commented Aug 10, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 83.12%. Comparing base (e69d4e7) to head (523c901).
Report is 361 commits behind head on main.

❗ Current head 523c901 differs from pull request most recent head eef4fa0. Consider uploading reports for the commit eef4fa0 to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##             main    dbt-labs/dbt-core#8357      +/-   ##
==========================================
- Coverage   86.23%   83.12%   -3.12%     
==========================================
  Files         174      174              
  Lines       25518    25529      +11     
==========================================
- Hits        22006    21220     -786     
- Misses       3512     4309     +797     
Flag Coverage Δ
integration 83.12% <ø> (+0.12%) ⬆️
unit ?

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@colin-rogers-dbt
Copy link
Contributor

Are tests needed here? Feels like I'll be testing psycopg2 if I try to check for the registered types.

If we're going to say that dbt-postgres data contracts support something I think we want to test the integration of psycopg2 and data contracts

What do you think? Can we make this always-on or should we keep this as adapter configuration?

I wonder if we can turn this on if needed (i.e. a callback function if a data contract specifies these types). Not saying this needs to be a blocker since we can probably keep it on by default

@dataders
Copy link
Contributor

Assumptions

lmk if i'm wrong about any of the following

  • psycopg2.extras's register_uuid and register_ipaddress were added to psycopg2 for version 2.7 (released March 1, 2017).
  • dbt-postgres today is compat-pinned to ~=2.8
  • both postgres's UUID and CIDR data types were introduced in Postgres 11 (October 18, 2018)

questions

  • what happens if a user defines a contract on a column and gives one of these types and but is not using postgres 11 or newer?
  • do we have documented somewhere the data types for which contracts are supported in postgres?
  • do we hope to support the long tail of data types across data platforms? (e.g. hstore for postgres)
  • if a new data platform introduces a new data type, is there an explicit list of steps to ensure that they work within contracts?

@gmontanola
Copy link
Author

If we're going to say that dbt-postgres data contracts support something I think we want to test the integration of psycopg2 and data contracts

I'll look into adding some tests then.

What happens if a user defines a contract on a column and gives one of these types and but is not using postgres 11 or newer?

Nice question. I think a weird exception will appear but I'll test it using an older PG version. Then I can try/except graciously and emit a log message in case of failure.


I'm a bit swamped right now but I'll try to improve this ASAP.

@jenna-jordan
Copy link

Hi! I just wanted to note here that I ran into the same issue with postgres's money datatype (as well as the uuid datatype).

@mowsen
Copy link

mowsen commented Oct 22, 2023

Just wanted to come here and say thank you! Your PR fixed my problem with snowplow + dbt + uuid. Thanks !

@ondraz
Copy link

ondraz commented Nov 27, 2023

Hi, would appreciate merging this PR. I have exactly the same issue to @mowsen with snowplow + dbt + uuid running on postgres. Installing dbt from my fork just to have this fixed is quite tedious. Thank you.

@hkorhola
Copy link

seconding @ondraz and @mowsen , I also had to tweak the Python code back then to make it work properly with Postgres and would be good if this would be fixed in the main branch - whether with some configs or then that register -style

@ondraz
Copy link

ondraz commented Mar 21, 2024

Just for the sake of completeness, this has been fixed by this PR. I tested it in dbt-core version 1.7.10 and it works. Thank you.

@dbeatty10 dbeatty10 added community This PR is from a community member Team:Adapters Issues designated for the adapter area of the code dbt-postgres Needs migration to dbt-postgres repo labels Apr 3, 2024
@dbeatty10 dbeatty10 added bug Something isn't working and removed bug Something isn't working labels Apr 10, 2024
@dbeatty10
Copy link
Contributor

Thanks for taking the time to open this PR @gmontanola! Since opening, we've decoupled dbt Adapters from dbt Core, and this code now lives in a separate repo: dbt-postgres.

A consequence of the decoupling is that PR can't be merged anymore as is, so we're closing it. For more context see #9171.

The linked issue has already been transferred. Please don't hesitate to re-open your proposed code changes within this PR in the dbt-postgres repo.

@dbeatty10 dbeatty10 closed this Apr 10, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working cla:yes community This PR is from a community member dbt-postgres Needs migration to dbt-postgres repo ok_to_test ready_for_review Externally contributed PR has functional approval, ready for code review from Core engineering Team:Adapters Issues designated for the adapter area of the code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[CT-2957] [Bug] Cannot specify column contract of type UUID in PostgreSQL with DBT 1.5.3
9 participants