Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Using ephemeral materialization causes all tables to have all null records #33

Open
1 of 4 tasks
jaredx435k2d0 opened this issue Apr 13, 2023 · 1 comment
Open
1 of 4 tasks
Labels
type:wontfix This will not be worked on

Comments

@jaredx435k2d0
Copy link

jaredx435k2d0 commented Apr 13, 2023

Is there an existing issue for this?

  • I have searched the existing issues

Describe the issue

Due to #32, I'm considering ephemeral as the materialization type for netsuite_source, so that I can then create my own staging schema that has all of the tables my connector is bringing in.

The only alternatives I can think of is to try to put them in the same schema, which would cause name conflicts, or to have two staging schemas for the same source, which will be confusing for everyone. Since nobody will directly use the staging models from netsuite_source anyway, I figure it's ok to abstract / hide them away in ephemeral models.

When I try to do this, all fields for all records for all models are null. When I look in my target folder, I can see the generated SQL is explicitly using null as the value for everything.

Is there a way I can use ephemeral as the materialization for netsuite_source?

Relevant error log or model output

target/compiled/netsuite_source/models/netsuite2/stg_netsuite2__transaction_lines.sql:

with  __dbt__cte__stg_netsuite2__transaction_lines_tmp as (


select * 
from [my_db_name_redacted].netsuite_suiteanalytics.transactionline
),base as (

    select * 
    from __dbt__cte__stg_netsuite2__transaction_lines_tmp
),

fields as (

    select
        
    cast(null as boolean) as 
    
    _fivetran_deleted
    
 , 
    cast(null as TIMESTAMP) as 
    
    _fivetran_synced
    
 , 
    cast(null as INT) as 
    
    id
    
 , 
    cast(null as INT) as 
    
    transaction
    
 , 
    cast(null as INT) as 
    
    linesequencenumber
    
 , 
    cast(null as TEXT) as 
    
    memo
    
 , 
    cast(null as INT) as 
    
    entity
    
 , 
    cast(null as INT) as 
    
    item
    
 , 
    cast(null as INT) as 
    
    class
    
 , 
    cast(null as INT) as 
    
    location
    
 , 
    cast(null as INT) as 
    
    subsidiary
    
 , 
    cast(null as INT) as 
    
    department
    
 , 
    cast(null as TEXT) as 
    
    isclosed
    
 , 
    cast(null as TEXT) as 
    
    isbillable
    
 , 
    cast(null as TEXT) as 
    
    iscogs
    
 , 
    cast(null as TEXT) as 
    
    cleared
    
 , 
    cast(null as TEXT) as 
    
    commitmentfirm
    
 , 
    cast(null as TEXT) as 
    
    mainline
    
 , 
    cast(null as TEXT) as 
    
    taxline
    
 


    from base
),

final as (
    
    select
        _fivetran_synced,
        id as transaction_line_id,
        transaction as transaction_id,
        linesequencenumber as transaction_line_number,
        memo,
        entity as entity_id,
        item as item_id,
        class as class_id,
        location as location_id,
        subsidiary as subsidiary_id,
        department as department_id,
        isclosed = 'T' as is_closed,
        isbillable = 'T' as is_billable,
        iscogs = 'T' as is_cogs,
        cleared = 'T' as is_cleared,
        commitmentfirm = 'T' as is_commitment_firm,
        mainline = 'T' as is_main_line,
        taxline = 'T' as is_tax_line

        --The below macro adds the fields defined within your transaction_lines_pass_through_columns variable into the staging model
        





    from fields
)

select * 
from final

Expected behavior

I would expect the CTEs to use the actual fields rather than nulls.

dbt Project configurations

models:

  +persist_docs:
    relation: true
    columns: true

  +copy_grants: true

  netsuite_source:
    +schema: stg_netsuite_suiteanalytics
    +materialized: ephemeral


vars:
  netsuite_data_model: netsuite2
  netsuite2__multibook_accounting_enabled: false
  netsuite2__using_vendor_categories: true

Package versions

packages:
  # - package: calogica/dbt_date
  #   version: 0.7.2 
  #   included as dependency in fivetran/zendesk

  # - package: dbt-labs/dbt_utils
  #   version: 1.0.0 
  #   included as dependency in fivetran/zendesk

  # - package: dbt-labs/dbt_project_evaluator
  #   version: 0.3.0

  # - package: calogica/dbt_expectations
  #   version: 0.8.2

  - package: dbt-labs/codegen
    version: 0.9.0

  - package: fivetran/zendesk
    version: [">=0.10.0", "<0.11.0"]
    # includes dependencies to calogica/dbt_date and dbt-labs/dbt_utils

  - package: fivetran/netsuite
    version: [">=0.7.0", "<0.8.0"]
    # includes dependencies to fivetran/netsuite_source, fivetran/fivetran_utils, dbt-labs/dbt_utils, and dbt-labs/spark_utils

What database are you using dbt with?

snowflake 7.12.3

dbt Version

Core:
  - installed: 1.4.5
  - latest:    1.4.5 - Up to date!

Plugins:
  - snowflake: 1.4.2 - Up to date!

Additional Context

If I switch materialized to view then it all works great.

I'm using Python 3.10.10.

Are you willing to open a PR to help address this issue?

  • Yes.
  • Yes, but I will need assistance and will schedule time during our office hours for guidance
  • No.
@jaredx435k2d0 jaredx435k2d0 added the bug Something isn't working label Apr 13, 2023
@fivetran-joemarkiewicz
Copy link
Contributor

HI @jaredx435k2d0 thanks for opening this issue.

Unfortunately with how all of our source packages are created, you will not be able to materialize the staging models as ephemeral. They will either need to be views or tables.

This is particularly necessary due to this cte that exists in all of our staging models. This is required to ensure users of the package see a successful run regardless if they have all the fields in the source that we are querying in the staging model. In order for this to work we leverage the adapter.get_columns function in dbt. Unfortunately, this function requires the model it is reference and the model itself to be a view or table.

As this is an integral part to our package infrastructure we likely will not make this update to the package. Although I understand you desire and wish we could have the adapter function work for ephemeral models. If you wish to get around this, I would recommend forking the package and removing that cte. This way you can take advantage materializing as ephemeral while avoiding the macro if you in fact have all of the columns. Otherwise, you can override the models and add your own custom models that do not leverage this cte.

@fivetran-joemarkiewicz fivetran-joemarkiewicz added type:wontfix This will not be worked on and removed bug Something isn't working labels Apr 17, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type:wontfix This will not be worked on
Projects
None yet
Development

No branches or pull requests

2 participants