[Bug] get_relation reporting dbt found an approximate match #5

sfc-gh-dflippo · 2022-05-06T18:59:26Z

Is there an existing issue for this?

I have searched the existing issues

Current Behavior

Hello, I am the author of the dbt_constraints package and I am attempting to add support for dbt-oracle.

I am receiving the following error when my code runs adapter.get_relation():

Encountered an error:
Compilation Error in operation dbt_constraints-on-run-end-0 (./dbt_project.yml)
  When searching for a relation, dbt found an approximate match. Instead of guessing 
  which relation to use, dbt will move on. Please delete DBT_USER.dim_customers, or rename it to be less ambiguous.
  Searched for: dbt_user.dim_customers
  Found: DBT_USER.dim_customers

I have a schema named DBT_USER and I have specified the the schema in upper case in my profile.

I believe the root cause is how dbt-oracle changes the case of objects in your catalog.sql:

          select
              lower(tables.table_catalog) as "table_database",
              lower(tables.table_schema) as "table_schema",
              lower(tables.table_name) as "table_name",
              lower(tables.table_type) as "table_type",
              all_tab_comments.comments as "table_comment",
              lower(columns.column_name) as "column_name",
              ordinal_position as "column_index",
              lower(case
                when data_type like '%CHAR%'
                then data_type || '(' || cast(char_length as varchar(10)) || ')'
                else data_type
              end) as "column_type",
              all_col_comments.comments as "column_comment",
              tables.table_schema as "table_owner"
          from tables

I work for Snowflake and our database uses exactly the same conventions as Oracle for how identifiers are treated with and without quotes. Identifiers become upper case if not quoted and case sensitive if quoted. Therefore, I think it can be helpful to compare some of your code against the dbt-Labs code for dbt-Snowflake. In the dbt-Labs implementation of dbt-Snowflake they do not adjust the case of our databases, schemas, tables, or columns in the catalog.sql

There could also be an issue with the oracle__get_columns_in_relation macro:

      select
          lower(column_name) as "name",
          lower(data_type) as "type",
          char_length as "character_maximum_length",
          numeric_precision as "numeric_precision",
          numeric_scale as "numeric_scale"
      from columns

In the snowflake__get_columns_in_relation macro dbt-labs uses DESCRIBE TABLE to retrieve columns and they do not change the case of column names or data types (upper case just like Oracle).

I reviewed all the SQL lookups dbt-Labs uses for Snowflake and I consistently see they use UPPER() = UPPER() in their WHERE clauses but in all other cases their metadata lookups preserve the case for databases, schema, tables, and columns. I recommend dbt-Oracle should do the same.

Expected Behavior

No response

Steps To Reproduce

No response

Relevant log output using `--debug` flag enabled

No response

Environment

- OS: MacOS
- Python: 3.8.13
- dbt: 1.0.6

What Oracle database version are you using dbt with?

Oracle XE 21c

Additional Context

No response

The text was updated successfully, but these errors were encountered:

aosingh · 2022-05-11T05:24:15Z

Hi @danflippo
Thank you so much for such detailed explanation. This is very helpful.
I will check this issue and keep you posted.

aosingh · 2022-05-25T19:05:40Z

@danflippo

I wanted to test dbt_constraints package to understand the issue better. I installed it using dbt deps command

packages.yml is as shown below

packages:
  - git: https://github.com/fishtown-analytics/dbt-utils.git
  - package: Snowflake-Labs/dbt_constraints
    version: 0.3.0

After running dbt test command, I get the below error. Do I need to modify my tests ?

 File "/Users/abhisoms/dbt-oracle/.dbt-oracle-env/lib/python3.8/site-packages/dbt/clients/jinja_static.py", line 140, in statically_parse_adapter_dispatch
    macro = db_wrapper.dispatch(
  File "/Users/abhisoms/dbt-oracle/.dbt-oracle-env/lib/python3.8/site-packages/dbt/context/providers.py", line 187, in dispatch
    raise CompilationException(msg)
dbt.exceptions.CompilationException: Compilation Error
  In dispatch: No macro named 'create_primary_key' found
      Searched for: 'dbt_adbs_test_project.oracle__create_primary_key', 'dbt_adbs_test_project.default__create_primary_key', 'dbt_constraints.oracle__create_primary_key', 'dbt_constraints.default__create_primary_key'

sfc-gh-dflippo · 2022-05-26T13:16:58Z

For each database I support I have to add a new set of adapter-specific macros. I was testing with an offline branch but I have now published this branch to Github so you can access it. Since it isn't part of a release, you will need to pull in the module directly using the following git syntax:

  - git: "https://github.com/Snowflake-Labs/dbt_constraints.git"
    revision: dbt-oracle-support

sfc-gh-dflippo · 2022-05-29T21:01:36Z

I did some more tracing and I found an additional change. In the Snowflake adapter's impl.py we have the following function:

    def _make_match_kwargs(self, database, schema, identifier):
        quoting = self.config.quoting
        if identifier is not None and quoting["identifier"] is False:
            identifier = identifier.upper()

        if schema is not None and quoting["schema"] is False:
            schema = schema.upper()

        if database is not None and quoting["database"] is False:
            database = database.upper()

        return filter_null_values(
            {"identifier": identifier, "schema": schema, "database": database}
        )

This overrides the default implementation in dbt-core:

    def _make_match_kwargs(self, database: str, schema: str, identifier: str) -> Dict[str, str]:
        quoting = self.config.quoting
        if identifier is not None and quoting["identifier"] is False:
            identifier = identifier.lower()

        if schema is not None and quoting["schema"] is False:
            schema = schema.lower()

        if database is not None and quoting["database"] is False:
            database = database.lower()

        return filter_null_values(
            {
                "database": database,
                "identifier": identifier,
                "schema": schema,
            }
        )

As I mentioned earlier, the databases share the same convention so a non-quoted identifier becomes uppercase and I think this is one of the necessary changes to accommodate this. The dbt-core get_relation() function calls _make_match() and that function calls _make_match_kwargs(). I still believe it is also good to remove your lower() functions in your catalog.sql. I wouldn't be surprised if you had needed those lower() functions in your SQL because you didn't have the logic to override the match function. I think once your override the function you will be able to change your catalog.sql to:

          select
              tables.table_catalog as "table_database",
              tables.table_schema as "table_schema",
              tables.table_name as "table_name",
              tables.table_type as "table_type",
              all_tab_comments.comments as "table_comment",
              columns.column_name as "column_name",
              ordinal_position as "column_index",
              case
                when data_type like '%CHAR%'
                then data_type || '(' || cast(char_length as varchar(10)) || ')'
                else data_type
              end as "column_type",
              all_col_comments.comments as "column_comment",
              tables.table_schema as "table_owner"
          from tables

I would issue a pull-request but I can only imagine the legal red tape I would have to go through to get the Oracle Contribution Agreement signed.

Also, these are the steps to recreate my test using dbt-constraint's built-in integration tests:

Create a profile called dbt_constraints that points to your Oracle DB. It should be able to use any empty user/schema.
navigate to the folder your_package/dbt_packages/dbt_constraints/integration_tests
dbt deps
dbt build

The way dbt_constraints works, there are normal tests called primary_key, unique_key, and foreign_key that work pretty must as you would expect. These tests do not have any issues in my integration tests. Then after all the models and tests are complete, dbt_constraints has a on-run-end hook that runs the create_constraints macro in create_constraints.sql. This macro loops through all the executed tests and looks for tests that it can turn into an equivalent physical constraint. It is in the beginning of this macro that the macro fails with the approximate match error. I believe the error occurs on line 200:

            {%- set table_relation = adapter.get_relation(
                database=table_models[0].database,
                schema=table_models[0].schema,
                identifier=table_models[0].name) -%}

At this point, the logic has found a test and through its depends_on.nodes property it has a reference to the table model. This is my manifest.json:
manifest.json.zip

If you search for "model.dbt_constraints_integration_tests.dim_customers", you will find it on line 465. You can find the three attributes I am using to look up the relation on line 490 ("database": "XEPDB1"), 491 ("schema": "DBT_USER") and 501 ("name": "dim_customers").

aosingh · 2022-05-31T03:52:22Z

Thank you @danflippo for this detailed analysis. I am going through your recommendations and analysis and will get back to you.

aosingh · 2022-06-02T01:22:49Z

Hi @danflippo

I am testing the dbt_constraints package from the following git branch

  - git: https://github.com/Snowflake-Labs/dbt_constraints.git
    revision: dbt-oracle-support

For connection, both user and schema are set to upper case

DBT_ORACLE_USER=DBT_TEST
DBT_ORACLE_SCHEMA=DBT_TEST

The test cases pass after which dbt_constraints logic is invoked at which point I encounter the following error

dbt.exceptions.CompilationException: Compilation Error in operation dbt_constraints-on-run-end-0 (./dbt_project.yml)
  'dbt.contracts.graph.compiled.CompiledSingularTestNode object' has no attribute 'test_metadata'
  
  > in macro create_constraints (macros/create_constraints.sql)
  > called by macro create_constraints_by_type (macros/create_constraints.sql)
  > called by operation dbt_constraints-on-run-end-0 (./dbt_project.yml)

My schema.yml looks is as shown below

models:
  - name: people
    columns:
      - name: id
        tests:
          - dbt_constraints.primary_key
          - not_null
          - unique
      - name: gender
        tests:
          - accepted_values:
              values: ['Male', 'Female']
  - name: eu_direct_sales_channels_promo_costs
    columns:
      - name: country_id
        tests:
          - relationships:
              to: ref('countries')
              field: country_id
          - dbt_constraints.foreign_key:
              pk_table_name: ref('countries')
              pk_column_name: country_id

Please let me know if I am missing something here.

aosingh · 2022-06-02T02:08:22Z

On a side note, I tested dbt functionalities with both lowercase dbt_test and uppercase DBT_TEST schema in profile.yml. It works the same in both cases. Even if you change the casing for subsequent runs dbt does not fail.

aosingh · 2022-06-02T18:33:56Z

This is the output log of dbt build command

18:31:55  14 of 19 START test relationships_eu_direct_sales_channels_promo_costs_country_id__country_id__ref_countries_ [RUN]
18:31:57  13 of 19 OK created view model dbt_test.people.................................. [OK in 7.25s]
18:31:57  15 of 19 START test accepted_values_people_gender__Male__Female................. [RUN]
18:31:57  16 of 19 START test dbt_constraints_primary_key_people_id....................... [RUN]
18:32:01  10 of 19 OK snapshotted dbt_test.promotion_costs_snapshot....................... [OK in 15.17s]
18:32:01  17 of 19 START test not_null_people_id.......................................... [RUN]
18:32:03  14 of 19 PASS relationships_eu_direct_sales_channels_promo_costs_country_id__country_id__ref_countries_ [PASS in 8.43s]
18:32:03  18 of 19 START test test_count_employees........................................ [RUN]
18:32:05  15 of 19 PASS accepted_values_people_gender__Male__Female....................... [PASS in 7.96s]
18:32:05  19 of 19 START test unique_people_id............................................ [RUN]
18:32:05  16 of 19 PASS dbt_constraints_primary_key_people_id............................. [PASS in 7.97s]
18:32:07  17 of 19 PASS not_null_people_id................................................ [PASS in 6.13s]
18:32:09  18 of 19 PASS test_count_employees.............................................. [PASS in 5.57s]
18:32:10  19 of 19 PASS unique_people_id.................................................. [PASS in 4.81s]
18:32:12  
18:32:12  Running 2 on-run-end hooks
18:32:12  1 of 2 START hook: dbt_adbs_test_project.on-run-end.0........................... [RUN]
18:32:12  1 of 2 OK hook: dbt_adbs_test_project.on-run-end.0.............................. [OK in 0.23s]
18:32:12  Running dbt Constraints
18:32:12  Database error while running on-run-end
18:32:14  Encountered an error:
Compilation Error in operation dbt_constraints-on-run-end-0 (./dbt_project.yml)
  'dbt.contracts.graph.compiled.CompiledSingularTestNode object' has no attribute 'test_metadata'
  
  > in macro create_constraints (macros/create_constraints.sql)
  > called by macro create_constraints_by_type (macros/create_constraints.sql)
  > called by operation dbt_constraints-on-run-end-0 (./dbt_project.yml)

sfc-gh-dflippo · 2022-06-06T16:19:31Z

This was reported by another user as well. The issue wasn't related at all to dbt-Oracle. I needed to add one line to verify that a test is not singular test. I have released a new version of dbt Constraints with this fix and I have back-ported the latest changes (including a new test I have for singular tests) into my dbt-oracle-support branch.

You should be able to pull in my changes now with dbt deps.

aosingh · 2022-06-10T00:20:04Z

@danflippo

Thank you for updating dbt_constraints. I tested the updates.

With dbt test command, it creates all constraints as defined in schema.yml

23:53:07  Running 2 on-run-end hooks
23:53:07  1 of 2 START hook: dbt_adbs_test_project.on-run-end.0........................... [RUN]
23:53:07  1 of 2 OK hook: dbt_adbs_test_project.on-run-end.0.............................. [OK in 0.17s]
23:53:07  Running dbt Constraints
23:53:08  Creating primary key: PEOPLE_ID_PK
23:53:08  Creating primary key: COUNTRIES_COUNTRY_ID_PK
23:53:13  Creating foreign key: EU_DIRECT_SALES_CHANNELS_PROMO_COSTS_COUNTRY_ID_FK referencing COUNTRIES ['country_id']
23:53:14  Finished dbt Constraints
23:53:14  2 of 2 START hook: dbt_constraints.on-run-end.0................................. [RUN]
23:53:14  2 of 2 OK hook: dbt_constraints.on-run-end.0.................................... [OK in 0.00s]
23:53:14  
23:53:14  
23:53:14  Finished running 10 tests, 3 hooks in 21.18s.
23:53:14  
23:53:14  Completed successfully
23:53:14  
23:53:14  Done. PASS=10 WARN=0 ERROR=0 SKIP=0 TOTAL=10

The following table shows all constraints created in the database

OWNER	CONSTRAINT_NAME	TABLE_NAME	COLUMN_NAME	POSITION
DBT_TEST	COUNTRIES_COUNTRY_ID_PK	COUNTRIES	COUNTRY_ID	1
DBT_TEST	EU_DIRECT_SALES_CHANNELS_PROMO_COSTS_COUNTRY_ID_FK	EU_DIRECT_SALES_CHANNELS_PROMO_COSTS	COUNTRY_ID	1
DBT_TEST	PEOPLE_ID_PK	PEOPLE	ID	1

However, with dbt build command it does not create constraints and prints the following. Could you help me understand why this difference ?

23:52:29  Running 2 on-run-end hooks
23:52:29  1 of 2 START hook: dbt_adbs_test_project.on-run-end.0........................... [RUN]
23:52:30  1 of 2 OK hook: dbt_adbs_test_project.on-run-end.0.............................. [OK in 0.19s]
23:52:30  Running dbt Constraints
23:52:30  Skipping primary/unique key because a physical column name was not found on the table: countries ['country_id']
23:52:30  Skipping primary/unique key because a physical column name was not found on the table: people ['id']
23:52:30  Skipping primary/unique key because a physical column name was not found on the table: countries ['country_id']
23:52:30  Skipping primary/unique key because a physical column name was not found on the table: people ['id']
23:52:30  Skipping foreign key because a physical column was not found on the pk table: countries ['country_id']
23:52:31  Skipping foreign key because a physical column was not found on the pk table: countries ['country_id']
23:52:31  Finished dbt Constraints
23:52:31  2 of 2 START hook: dbt_constraints.on-run-end.0................................. [RUN]
23:52:31  2 of 2 OK hook: dbt_constraints.on-run-end.0.................................... [OK in 0.00s]
23:52:31

- Adapter method _make_match_kwargs() is updated to perform UPPER case comparison - Adapter method quote_seed_column() is updated to not quote seed cols by default - adapter.sql is updated to not lower case metadata - catalog.sql is updated to not lower case metadata - Incremental materialization needed a fix to perform a UPPER case comparison on the unique_key in ON clause - Added a test scenarios in dbt_adbs_test_project to test integration with dbt_constraints - Fixes Bug - #5

aosingh · 2022-06-10T01:39:16Z

You can also install dbt-oracle with latest changes from the git branch https://github.com/oracle/dbt-oracle/tree/fix/integration_with_dbt_constraints

We plan to release this in the upcoming dbt-oracle v1.0.3

You can also review the PR #22.

aosingh · 2022-06-14T13:06:34Z

Hi @danflippo

As we are planning to release dbt-oracle's next version late today or early tomorrow, I wanted to check if you had a chance to test the updated dbt-oracle and found any issue integrating with dbt_constraints ?

Thanks

aosingh · 2022-06-15T15:26:14Z

Although this issue was closed automatically after merging the PR, please let me know if you have any questions.

sfc-gh-dflippo · 2022-06-21T15:49:19Z

I'm sorry I didn't get back to you last week. I was on vacation out of the country. The difference above seems to be that it is not finding an exact match for the column in your constraint. I use the following macro to lookup the columns on a relation and compare it to an array of fields for the constraint:

{# This macro tests that all the column names passed to the macro can be found on the table, ignoring case #}
{%- macro table_columns_all_exist(table_relation, column_list) -%}
    {%- set tab_Columns = adapter.get_columns_in_relation(table_relation) -%}

    {%- set tab_column_list = [] -%}
    {%- for column in tab_Columns -%}
        {{ tab_column_list.append(column.name|upper) }}
    {%- endfor -%}

    {%- for column in column_list|map('upper') if column not in tab_column_list -%}
        {{ return(false) }}
    {%- endfor -%}
    {{ return(true) }}

{%- endmacro -%}

I just published a new version of dbt Constraints and I've back-ported the changes into the oracle support branch. I saw your changes and they all look good to me. I'm going to do some debugging to see if I can identify what was causing your error.

sfc-gh-dflippo · 2022-06-26T23:05:13Z

In the end I found that using the adapter.get_relation() macro was problematic and I could solve any remaining issues by switching to dbt's api.Relation.create()macro. All of my tests are passing now and I have published a new release of dbt Constraints to the dbt Hub that includes support for Oracle. Thanks for the help! You can mark this issue as complete.

aosingh · 2022-06-27T15:25:28Z

Thank you @danflippo for confirming!

sfc-gh-dflippo added the bug Something isn't working label May 6, 2022

aosingh self-assigned this May 7, 2022

ThoSap mentioned this issue May 18, 2022

[Bug] Incremental upsert strategy fails with ORA-00911 & ORA-01747 #8

Closed

1 task

aosingh added the work-in-progress Planned in the upcoming release label Jun 8, 2022

aosingh mentioned this issue Jun 10, 2022

Fix/integration with dbt constraints #22

Merged

aosingh mentioned this issue Jun 10, 2022

Release/v1.0.3 #23

Merged

aosingh closed this as completed in #23 Jun 15, 2022

aosingh removed the work-in-progress Planned in the upcoming release label Aug 3, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Bug] get_relation reporting dbt found an approximate match #5

[Bug] get_relation reporting dbt found an approximate match #5

sfc-gh-dflippo commented May 6, 2022 •

edited

aosingh commented May 11, 2022

aosingh commented May 25, 2022

sfc-gh-dflippo commented May 26, 2022

sfc-gh-dflippo commented May 29, 2022

aosingh commented May 31, 2022

aosingh commented Jun 2, 2022

aosingh commented Jun 2, 2022

aosingh commented Jun 2, 2022

sfc-gh-dflippo commented Jun 6, 2022

aosingh commented Jun 10, 2022

aosingh commented Jun 10, 2022

aosingh commented Jun 14, 2022

aosingh commented Jun 15, 2022 •

edited

sfc-gh-dflippo commented Jun 21, 2022

sfc-gh-dflippo commented Jun 26, 2022

aosingh commented Jun 27, 2022

[Bug] get_relation reporting dbt found an approximate match #5

[Bug] get_relation reporting dbt found an approximate match #5

Comments

sfc-gh-dflippo commented May 6, 2022 • edited

Is there an existing issue for this?

Current Behavior

Expected Behavior

Steps To Reproduce

Relevant log output using --debug flag enabled

Environment

What Oracle database version are you using dbt with?

Additional Context

aosingh commented May 11, 2022

aosingh commented May 25, 2022

sfc-gh-dflippo commented May 26, 2022

sfc-gh-dflippo commented May 29, 2022

aosingh commented May 31, 2022

aosingh commented Jun 2, 2022

aosingh commented Jun 2, 2022

aosingh commented Jun 2, 2022

sfc-gh-dflippo commented Jun 6, 2022

aosingh commented Jun 10, 2022

aosingh commented Jun 10, 2022

aosingh commented Jun 14, 2022

aosingh commented Jun 15, 2022 • edited

sfc-gh-dflippo commented Jun 21, 2022

sfc-gh-dflippo commented Jun 26, 2022

aosingh commented Jun 27, 2022

sfc-gh-dflippo commented May 6, 2022 •

edited

Relevant log output using `--debug` flag enabled

aosingh commented Jun 15, 2022 •

edited