Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix renaming the old_relation in Oracle 12c+ when using DBT table materialization with missing database name in profiles.yml #3

Closed
wants to merge 37 commits into from

Conversation

ThoSap
Copy link
Contributor

@ThoSap ThoSap commented May 4, 2022

Hi,

first of all, thank you very much for releasing an official DBT Oracle DB adapter/profile with an up-to-date DBT version!!! 😍
I tried this adapter today locally together with some Airbyte modifications and found that version 1.0.0 does not rename the already existing destination table to a backup table when using DBT table materialization and therefore I get the following error:

1 of 1 ERROR creating table model staging.test................................................................. [ERROR in 0.96s]
2022-05-04 08:41:55 normalization > 08:41:55  oracle adapter: Oracle error: ORA-00955: name is already used by an existing object

With dbt run --debug

Database Error in model test (models/generated/airbyte_tables/staging/test.sql)
2022-05-04 08:42:00 normalization > 08:41:55.566833 [error] [MainThread]:   ORA-00955: name is already used by an existing object
2022-05-04 08:42:00 normalization > 08:41:55.567157 [error] [MainThread]:   compiled SQL at ../build/run/airbyte_utils/models/generated/airbyte_tables/staging/test.sql

This is caused by the fact that the database name is passed to function adapter.get_relation() which should not be the case for Oracle DB.
https://github.com/oracle/dbt-oracle/blob/v1.0.0/dbt/include/oracle/macros/materializations/table/table.sql#L22
In Oracle DB the database name is never a prefix when making a select as it does not even work.
schema.relation
database.schema.relation

Also at the DBT/Jinija macro oracle__list_relations_without_caching the filter for the table_catalog (database) should be removed as usually no one ever will pass the database name in the profiles.yml.

Fixes #4
Fixes techindicium/dbt-oracle#8
Fixes techindicium/dbt-oracle#32

@ThoSap ThoSap changed the title Fix dropping the old_relation in Oracle 12c+ when using DBT table materialization #4 Fix dropping the old_relation in Oracle 12c+ when using DBT table materialization May 4, 2022
@ThoSap ThoSap changed the title #4 Fix dropping the old_relation in Oracle 12c+ when using DBT table materialization Fix dropping the old_relation in Oracle 12c+ when using DBT table materialization May 4, 2022
@aosingh
Copy link
Member

aosingh commented May 4, 2022

Thank you @ThoSap for testing and contributing.

While we review to get this merged, could you sign the Oracle Contributor Agreement (OCA) ?

The bottom of your commit message must have the following line using your name
and e-mail address as it appears in the OCA Signatories list.

Signed-off-by: Your Name <you@example.org>

This can be automatically added to pull requests by committing with:

git commit --signoff

@ThoSap
Copy link
Contributor Author

ThoSap commented May 4, 2022

I just wanted to create an Individual OCA but the oracle/dbt-oracle repo is not available for selection and it is a required field.
https://oca.opensource.oracle.com/api/v1/projects/enabled

Can you help @aosingh ?

@aosingh
Copy link
Member

aosingh commented May 4, 2022

Sure, let me check and I will get back to you

@ThoSap ThoSap changed the title Fix dropping the old_relation in Oracle 12c+ when using DBT table materialization Fix renaming the old_relation in Oracle 12c+ when using DBT table materialization May 4, 2022
@aosingh aosingh self-assigned this May 5, 2022
@aosingh
Copy link
Member

aosingh commented May 5, 2022

Hi @ThoSap

Thanks once again for your help!

I would like to better understand the problem and the fixes. The reason is, I am trying to reproduce the issue but it works as expected with the current version of dbt-oracle.

Test Model - https://github.com/oracle/dbt-oracle/blob/main/dbt_adbs_test_project/models/promotion_costs.sql
Oracle Database version - 19c
dbt_project.yml - https://github.com/oracle/dbt-oracle/blob/main/dbt_adbs_test_project/dbt_project.yml

I ran the model 3 times and each time it completed successfully. With --debug flag enabled i could also see the backup table being dropped using the below PL/SQL.

DECLARE
     dne_942    EXCEPTION;
     PRAGMA EXCEPTION_INIT(dne_942, -942);
     attempted_ddl_on_in_use_GTT EXCEPTION;
     pragma EXCEPTION_INIT(attempted_ddl_on_in_use_GTT, -14452);
  BEGIN
     SAVEPOINT start_transaction;
     EXECUTE IMMEDIATE 'DROP table dbt_test.promotion_costs__dbt_backup cascade constraint';
     COMMIT;
  EXCEPTION
     WHEN attempted_ddl_on_in_use_GTT THEN
        NULL; -- if it its a global temporary table, leave it alone.
     WHEN dne_942 THEN
        NULL;
  END;

@ThoSap ThoSap changed the title Fix renaming the old_relation in Oracle 12c+ when using DBT table materialization Fix renaming the old_relation in Oracle 12c+ when using DBT table materialization with missing database name in profiles.yml May 5, 2022
@ThoSap
Copy link
Contributor Author

ThoSap commented May 5, 2022

The thing with this change is that it would no longer be necessary to specify the database name (in profiles.yml) which normally is never needed in any other application using OJDBC or cx_Oracle when connecting to an Oracle DB.
The database name (select SYS_CONTEXT('userenv', 'DB_NAME') from dual;) is not required for OJDBC, cx_Oracle and not even in SQL Developer, but only the service name or SID instead.

I don't think the database name property is needed at all as the service name should suffice with these PR changes, maybe we can eliminate it altogether.
But it could be that I don't get the technical reason why the database name is really needed in DBT Oracle 🤔

This probably has historic reasons and also there is the confusion, where if one specifies the profiles.yml property dbname but not database and also omits service, the property dbname with alias database gets to be the service name and then the connection works anyways.
https://github.com/oracle/dbt-oracle/blob/v1.0.0/dbt/adapters/oracle/connections.py#L129-L134
Example profiles.yml:

config:
  partial_parse: true
  printer_width: 120
  send_anonymous_usage_stats: false
  use_colors: true
dbt_test:
  target: prod
  outputs:
    prod:
      type: oracle
      protocol: tcp
      host: yourhost.example.com
      port: 1521
      schema: test
      user: test
      pass: SECRET_PW
      # Connection works anyways as in connections.py#L83 the dbname property gets alias database
      # and if the service property is not specified the database internally is the service in the end
      # https://github.com/oracle/dbt-oracle/blob/v1.0.0/dbt/adapters/oracle/connections.py#L134
      dbname: test.service.name.example.com
      threads: 4

For my use case using Airbyte (I plan to do there a PR also with the new oracle/dbt-oracle adapter), the database property in profiles.yml would pose a breaking change in the Oracle Destination Connector, as suddenly users need to specify the database name (select SYS_CONTEXT('userenv', 'DB_NAME') from dual;) as a mandatory field if they want to use the Oracle Basic Normalization which is already fully implemented in Airbyte but currently not enabled due to the fact that the current Airbyte Oracle normalization implementation is still using DBT 0.19.1 from techindicium/dbt-oracle:0.4.3.

@ThoSap
Copy link
Contributor Author

ThoSap commented May 5, 2022

Also the OracleIncludePolicy for database is False anyways 😁
https://github.com/oracle/dbt-oracle/blob/v1.0.0/dbt/adapters/oracle/relation.py#L32

Copy link
Member

@aosingh aosingh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we revert the change in dbt/adapters/oracle/impl.py because in OracleIncludePolicy the property database is False. Additionally IncludePolicy is configurable in dbt_project.yml ?

@aosingh
Copy link
Member

aosingh commented May 5, 2022

Thanks for explaining this to me. I understand the following.

  • You would like to keep dbname or database as optional parameter in connection config. This way it will not be a breaking change for Airbyte. Also, you can create a PR there to use the new dbt adapter

  • Secondly, dbname or database is redundant in the WHERE clause in oracle__list_relations_without_caching macro and we can safely remove it. Thus we can leave it empty or None

Let me know if this understanding is correct.

@ThoSap
Copy link
Contributor Author

ThoSap commented May 6, 2022

Could we revert the change in dbt/adapters/oracle/impl.py because in OracleIncludePolicy the property database is False. Additionally IncludePolicy is configurable in dbt_project.yml ?

Without this change it is not working, even though for OracleIncludePolicy the property database is False.

@aosingh
Copy link
Member

aosingh commented May 6, 2022

Could we revert the change in dbt/adapters/oracle/impl.py because in OracleIncludePolicy the property database is False. Additionally IncludePolicy is configurable in dbt_project.yml ?

Without this change it is not working, even though OracleIncludePolicy the property database is False.

Ok, I will check this and let you know.

Copy link
Member

@aosingh aosingh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to start by keeping dbname or database as optional and then eventually remove it. I fear if we immediately get rid of this, it will break things for users who are currently using dbname or database.

@ThoSap
Copy link
Contributor Author

ThoSap commented May 6, 2022

Thanks for explaining this to me. I understand the following.

  • You would like to keep dbname or database as optional parameter in connection config. This way it will not be a breaking change for Airbyte. Also, you can create a PR there to use the new dbt adapter
  • Secondly, dbname or database is redundant in the WHERE clause in oracle__list_relations_without_caching macro and we can safely remove it. Thus we can leave it empty or None

Let me know if this understanding is correct.

  • I would like to remove dbname, database and env var DBT_ORACLE_DATABASE alltogether as it makes no sense at all. It does not matter if one connects using OracleConnectionMethod -> HOST, TNS or CONNECTION_STRING, you will always be on the same database (select SYS_CONTEXT('userenv', 'DB_NAME') from dual;) and therefore also the function impl.py#verify_database can never throw an exception, not even with a DB link.

  • The WHERE clause in macros
    oracle__list_relations_without_caching
    and
    oracle__get_columns_in_relation
    also do not make sense at all due to the first point.
    We do not have to filter a column just for the sake that it exists in the first place.

Maybe I'm missing the technical reason but why was this dbname, database and env var DBT_ORACLE_DATABASE introduced in the first place?
Could it be that with the Oracle Autonomous Database Service (ADBS) it is possible to have multiple database (select SYS_CONTEXT('userenv', 'DB_NAME') from dual;) names with the same service/connection as I never used ADBS before.

@geoHeil
Copy link

geoHeil commented May 6, 2022

Thanks for explaining this to me. I understand the following.

  • You would like to keep dbname or database as optional parameter in connection config. This way it will not be a breaking change for Airbyte. Also, you can create a PR there to use the new dbt adapter
  • Secondly, dbname or database is redundant in the WHERE clause in oracle__list_relations_without_caching macro and we can safely remove it. Thus we can leave it empty or None

Let me know if this understanding is correct.

* I would like to **remove** `dbname`, `database` and env var `DBT_ORACLE_DATABASE` alltogether as it makes no sense at all. It does not matter if one connects using `OracleConnectionMethod` -> `HOST`, `TNS `or `CONNECTION_STRING`, you will always be on the same database (`select SYS_CONTEXT('userenv', 'DB_NAME') from dual;`) and therefore also the function [impl.py#verify_database](https://github.com/oracle/dbt-oracle/blob/v1.0.0/dbt/adapters/oracle/impl.py#L101-L112) can _never_ throw an exception, not even with a DB link.

* The WHERE clause in macros
  [`oracle__list_relations_without_caching`](https://github.com/oracle/dbt-oracle/blob/v1.0.0/dbt/include/oracle/macros/adapters.sql#L344)
  and
  [`oracle__get_columns_in_relation`](https://github.com/oracle/dbt-oracle/blob/v1.0.0/dbt/include/oracle/macros/adapters.sql#L185-L187)
  also do not make sense at all due to the first point.
  We do not have to filter a column just for the sake that it exists in the first place.

Maybe I'm missing the technical reason but why was this dbname, database and env var DBT_ORACLE_DATABASE introduced in the first place? Could it be that with the Oracle Autonomous Database Service (ADBS) it is possible to have multiple database (select SYS_CONTEXT('userenv', 'DB_NAME') from dual;) names with the same service/connection as I never used ADBS before.

agree and would love to see this simplification/fix getting merged.

@aosingh
Copy link
Member

aosingh commented May 6, 2022

Maybe I'm missing the technical reason but why was this dbname, database and env var DBT_ORACLE_DATABASE introduced in the first place?

This was present in Techindicium's version and we have kept it for compatibility reasons. I do see your point. Even in ADBS we will not have multiple values for db_name.

@aosingh
Copy link
Member

aosingh commented May 6, 2022

Also, even in dbt-core v0.19 (one used by Airbyte), database is a mandatory parameter in the Credentials class. These contracts are defined by dbt-core. Maybe this is also why database or dbname existed in dbt-oracle

https://github.com/dbt-labs/dbt-core/blob/0.19.latest/core/dbt/contracts/connection.py#L120

@ThoSap
Copy link
Contributor Author

ThoSap commented May 6, 2022

I will come up with a better solution for these issues.

@ThoSap ThoSap requested a review from aosingh May 6, 2022 12:55
@ThoSap
Copy link
Contributor Author

ThoSap commented May 6, 2022

I just wanted to create an Individual OCA but the oracle/dbt-oracle repo is not available for selection and it is a required field. https://oca.opensource.oracle.com/api/v1/projects/enabled

Can you help @aosingh ?

Do you have an update on this @aosingh ?
I would sign the OCR in the meantime.

@ThoSap
Copy link
Contributor Author

ThoSap commented May 6, 2022

I'm still trying to fix the following error, help would be appreciated 😁

14:46:57.198298 [error] [MainThread]: Encountered an error:
2022-05-06 14:46:57 normalization > non-default argument 'schema' follows default argument

Thomas Sapelza and others added 24 commits May 11, 2022 11:03
…and do not filter on the Oracle DB name for adapter macro oracle__list_relations_without_caching

Signed-off-by: Thomas Sapelza <sapelza.thomas@gmail.com>

Signed-off-by: Thomas Sapelza <thomas.sapelza@gknpm.com>
…get_columns_in_relation

Signed-off-by: Thomas Sapelza <sapelza.thomas@gmail.com>

Signed-off-by: Thomas Sapelza <thomas.sapelza@gknpm.com>
…LE_DATABASE as it makes no sense, not even for function adapter.verifiy_database(database)

Signed-off-by: Thomas Sapelza <sapelza.thomas@gmail.com>

Signed-off-by: Thomas Sapelza <thomas.sapelza@gknpm.com>
Signed-off-by: Thomas Sapelza <sapelza.thomas@gmail.com>

Signed-off-by: Thomas Sapelza <thomas.sapelza@gknpm.com>
Signed-off-by: Thomas Sapelza <sapelza.thomas@gmail.com>

Signed-off-by: Thomas Sapelza <thomas.sapelza@gknpm.com>
Signed-off-by: Thomas Sapelza <sapelza.thomas@gmail.com>

Signed-off-by: Thomas Sapelza <thomas.sapelza@gknpm.com>
…ument

Signed-off-by: Thomas Sapelza <sapelza.thomas@gmail.com>

Signed-off-by: Thomas Sapelza <thomas.sapelza@gknpm.com>
Signed-off-by: Thomas Sapelza <sapelza.thomas@gmail.com>

Signed-off-by: Thomas Sapelza <thomas.sapelza@gknpm.com>
…redentials

Signed-off-by: Thomas Sapelza <sapelza.thomas@gmail.com>

Signed-off-by: Thomas Sapelza <thomas.sapelza@gknpm.com>
Signed-off-by: Thomas Sapelza <sapelza.thomas@gmail.com>

Signed-off-by: Thomas Sapelza <thomas.sapelza@gknpm.com>
Signed-off-by: Thomas Sapelza <sapelza.thomas@gmail.com>

Signed-off-by: Thomas Sapelza <thomas.sapelza@gknpm.com>
Signed-off-by: Thomas Sapelza <sapelza.thomas@gmail.com>

Signed-off-by: Thomas Sapelza <thomas.sapelza@gknpm.com>
Signed-off-by: Thomas Sapelza <sapelza.thomas@gmail.com>

Signed-off-by: Thomas Sapelza <thomas.sapelza@gknpm.com>
…s set to True

Signed-off-by: Thomas Sapelza <sapelza.thomas@gmail.com>

Signed-off-by: Thomas Sapelza <thomas.sapelza@gknpm.com>
Signed-off-by: Thomas Sapelza <sapelza.thomas@gmail.com>

Signed-off-by: Thomas Sapelza <thomas.sapelza@gknpm.com>
Signed-off-by: Thomas Sapelza <sapelza.thomas@gmail.com>

Signed-off-by: Thomas Sapelza <thomas.sapelza@gknpm.com>
Signed-off-by: Thomas Sapelza <sapelza.thomas@gmail.com>

Signed-off-by: Thomas Sapelza <thomas.sapelza@gknpm.com>
Signed-off-by: Thomas Sapelza <sapelza.thomas@gmail.com>

Signed-off-by: Thomas Sapelza <thomas.sapelza@gknpm.com>
Signed-off-by: Thomas Sapelza <sapelza.thomas@gmail.com>

Signed-off-by: Thomas Sapelza <thomas.sapelza@gknpm.com>
Signed-off-by: Thomas Sapelza <sapelza.thomas@gmail.com>

Signed-off-by: Thomas Sapelza <thomas.sapelza@gknpm.com>
Signed-off-by: Thomas Sapelza <sapelza.thomas@gmail.com>

Signed-off-by: Thomas Sapelza <thomas.sapelza@gknpm.com>
@ThoSap ThoSap force-pushed the fix-drop-relation-oracle-19c branch from de6af15 to f983c4b Compare May 11, 2022 09:05
@ThoSap
Copy link
Contributor Author

ThoSap commented May 11, 2022

Thank you @ThoSap for testing and contributing.

While we review to get this merged, could you sign the Oracle Contributor Agreement (OCA) ?

The bottom of your commit message must have the following line using your name and e-mail address as it appears in the OCA Signatories list.

Signed-off-by: Your Name <you@example.org>

This can be automatically added to pull requests by committing with:

git commit --signoff

I signed off all commits using my OCA agreement using rebase, sorry for the commit spam here.
https://docs.github.com/en/pull-requests/committing-changes-to-your-project/creating-and-editing-commits/changing-a-commit-message
https://hyperledger-indy.readthedocs.io/projects/sdk/en/latest/docs/contributors/signing-commits.html#signing-your-current-commit

@ThoSap
Copy link
Contributor Author

ThoSap commented May 16, 2022

Superseded by #6

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants