
[Bug] TABLE_NOT_FOUND {{tmp_relation}} when there are zero batches to process in incremental model #656

Closed
2 tasks done
antonysouthworth-halter opened this issue May 21, 2024 · 1 comment · May be fixed by #658
Labels: bug (Something isn't working)

Comments

@antonysouthworth-halter (Contributor)

Is this a new bug in dbt-athena?

  • I believe this is a new bug in dbt-athena
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

The temporary relation is only created by this CTAS call:

{%- do run_query(create_table_as(temporary, relation, create_target_relation_sql, language)) -%}

If there are zero batches to process, this results in a query error due to TABLE_NOT_FOUND later on, here:

select distinct {{ partitioned_keys }} from ({{ sql }}) order by {{ partitioned_keys }};

because the relation does not exist.
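To illustrate, here is a simplified sketch of the flow (the names batches and batch_insert_sql are approximations, not the adapter's actual identifiers). The CTAS that creates the temp relation only runs on the first iteration of the per-batch loop, so with zero batches the relation is never created and the later partition query fails:

{# simplified sketch with approximate names, not the adapter's exact code #}
{% for batch in batches %}
    {% if loop.first %}
        {# first batch: CTAS creates the temp relation #}
        {%- do run_query(create_table_as(temporary, relation, create_target_relation_sql, language)) -%}
    {% else %}
        {# later batches: insert into the already-created temp relation #}
        {%- do run_query(batch_insert_sql) -%}
    {% endif %}
{% endfor %}

{# with zero batches the loop body never ran, so this hits TABLE_NOT_FOUND #}
select distinct {{ partitioned_keys }} from ({{ sql }}) order by {{ partitioned_keys }};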

Expected Behavior

I would expect the model to complete and just load no data.

Steps To Reproduce

You will need:

  • a partitioned incremental model with more than 100 partitions, force_batch=True, and incremental_strategy=append
  • conditions that result in zero data being loaded

e.g. in our case, our model has a WHERE clause like:

select ...
from ...
where ...
{% if is_incremental() %}
-- don't load data that's already been loaded
and timestamp > (select max(timestamp) from {{this}})

-- don't load data from hours that have not completed, allowing for 30 minutes of lateness
and timestamp < (date_trunc('hour', now() - interval '30' minute))
{% endif %}

So basically, once the model has loaded the data for the last completed hour, it will not load data again until at least 30 minutes past the next hour.
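For reference, a minimal model config matching those conditions might look like this (event_hour is a placeholder partition column; it must resolve to more than 100 partitions for the batching path to kick in):

{# placeholder repro config; event_hour is a made-up column name #}
{{ config(
    materialized='incremental',
    incremental_strategy='append',
    force_batch=true,
    partitioned_by=['event_hour']
) }}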

Pretty edge-casey, and we could probably change to insert_overwrite with some effort, but at the same time I don't think the adapter should error out here, because there are genuine cases where there might be zero data to load. For example, if you are running dbt as part of some other workload and you need to retry the whole thing. Basically, it should be idempotent is what I'm saying haha.

Environment

- OS: Darwin 22.6 but we observe the same under Debian running on Fargate in AWS.
- Python: 3.9.19
- dbt: 1.8.0rc2
- dbt-athena-community: 1.8.0rc1

Additional Context

TBH I can probably implement the fix myself; I just wanted to ask if there's a specific reason we CTAS on the first batch here, rather than CREATE TABLE AS SELECT ... WITH NO DATA before the for loop and then just INSERT all batches?
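Roughly, the alternative I have in mind, as a sketch only (standard Athena CTAS syntax; batch_filter is a made-up placeholder for however the adapter splits batches):

-- create the temp relation empty, up front, so it exists even with zero batches
create table {{ tmp_relation }}
with (format = 'parquet') -- table properties as needed
as select * from ({{ sql }})
with no data;

-- each batch then becomes a plain insert; zero batches just means zero inserts
insert into {{ tmp_relation }}
select * from ({{ sql }})
where {{ batch_filter }};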

@antonysouthworth-halter antonysouthworth-halter added the bug Something isn't working label May 21, 2024
@antonysouthworth-halter antonysouthworth-halter changed the title [Bug] TABLE_NOT_FOUND {{tmp_relation}} when there are zero batches to process [Bug] TABLE_NOT_FOUND {{tmp_relation}} when there are zero batches to process in incremental model May 21, 2024
@antonysouthworth-halter (Contributor, Author)

Closing in favour of #519

@antonysouthworth-halter closed this as not planned on May 23, 2024