[CT-3515] [Feature] Source freshness: allow loaded_at_field set to null at table level to overwrite default value set at the source level
#9320
Comments
Here is a dbt project (using SQLite) that shows the error.
Steps to reproduce:
Sources yaml (present in the project):

version: 2
sources:
  - name: sources_db
    database: test
    schema: main
    freshness:
      warn_after:
        count: 1
        period: day
      error_after:
        count: 4
        period: day
    loaded_at_field: absent
    tables:
      - name: freshness_fails
        identifier: example
        loaded_at_field: null
        description: |
          loaded_at_field is set to null and thus DBT should use warehouse metadata (at least try)
          Notice error message:
          16:07:56 Runtime Error in source freshness_fails (models\sources.yml)
          Database Error
          no such column: absent
      - name: freshness_skipped
        identifier: example
        freshness: null
Thanks for reaching out @kokorin! The crux is that it's tricky, within the dbt implementation details, to determine the difference between an explicit null / none and an implicit one. Our current implementation treats them both as the same thing, and I don't think we're inclined to change it at this point. Either way, it doesn't look like dbt is behaving erroneously here, so I'm going to re-categorize this as a feature request.

Have you tried either of the following instead?

Option 1

Two sources. One for freshness from metadata and the other based off column names.

# source tables with source freshness from metadata
- name: sources_with_metadata_based_freshness
  database: db_name
  schema: schema_name
  freshness:
    warn_after:
      count: 1
      period: day
    error_after:
      count: 4
      period: day
  tables:
    - name: table_1
    - name: table_2
# source tables with source freshness based off column names
- name: sources_with_column_based_freshness
  database: db_name
  schema: schema_name
  freshness:
    warn_after:
      count: 1
      period: day
    error_after:
      count: 4
      period: day
  loaded_at_field: loaded_at # default at the source level
  tables:
    - name: table_3
    - name: table_4
    - name: table_5
      loaded_at_field: xxx # override of source-level default

Option 2

One source.

- name: sources_with_metadata_freshness
  database: db_name
  schema: schema_name
  freshness:
    warn_after:
      count: 1
      period: day
    error_after:
      count: 4
      period: day
  # no source-level default for loaded_at_field
  tables:
    # source tables with source freshness from metadata
    - name: table_1
    - name: table_2
    # source tables with standardized loaded_at_field
    - name: table_3
      loaded_at_field: loaded_at
    - name: table_4
      loaded_at_field: loaded_at
    # other source tables with custom loaded_at_field
    - name: table_5
      loaded_at_field: xxx
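The difficulty with explicit versus implicit nulls mentioned in this comment can be illustrated with a small sketch. This is not dbt-core's code: the `UNSET` sentinel and the `parse_naive`, `parse_with_sentinel`, and `resolve` helpers are hypothetical names, and plain dicts stand in for parsed YAML. It only shows why a missing key and `loaded_at_field: null` look identical once both become `None`, and one common way a parser could keep them apart.

```python
from typing import Any, Optional


class _Unset:
    """Hypothetical sentinel meaning 'the key was not written in the YAML'."""

    def __repr__(self) -> str:
        return "UNSET"


UNSET = _Unset()


def parse_naive(raw: dict) -> Optional[str]:
    """A missing key and an explicit null both come back as None."""
    return raw.get("loaded_at_field")


def parse_with_sentinel(raw: dict) -> Any:
    """A missing key comes back as UNSET; an explicit null stays None."""
    return raw.get("loaded_at_field", UNSET)


def resolve(table_value: Any, source_value: Optional[str]) -> Optional[str]:
    """Inherit from the source only when the table did not set the key."""
    return source_value if table_value is UNSET else table_value


# Plain dicts standing in for two parsed table entries.
table_implicit = {"name": "table_1"}                           # key absent
table_explicit = {"name": "table_2", "loaded_at_field": None}  # loaded_at_field: null

# Naive parsing cannot tell the two tables apart.
assert parse_naive(table_implicit) is None
assert parse_naive(table_explicit) is None

# With a sentinel, the two cases resolve differently.
print(resolve(parse_with_sentinel(table_implicit), "loaded_at"))  # loaded_at
print(resolve(parse_with_sentinel(table_explicit), "loaded_at"))  # None
```

A sentinel keeps "not set" separate from "set to null" without changing how real values are handled, which is why it is a common choice for this kind of inheritance problem; whether it fits dbt's parsing layer is a separate question.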
@dbeatty10 thank you for the reply and for suggestions. I thought about the same options.
Sorry, but I can't agree. I still consider this a bug because the behavior is different for freshness and loaded_at_field. I can work on a pull request if you consider it possible.
Thanks for pointing this out @kokorin 👍 Agreed that it's less clear for end users if freshness and loaded_at_field behave differently. I'm going to label this as "refinement" for us to 1) reproduce that they are behaving differently and 2) determine how they should behave going forward.
@dbeatty10 please note that I have attached a zip archive to this ticket with a minimal project reproducing the issue.
That would be awesome if you work on a pull request for this @kokorin 🏆
@graciegoheen and I discussed briefly offline, and we're planning to keep this labeled as a feature request and not backport it to dbt-core 1.7.
Acceptance criteria
Is this a new bug in dbt-core?
Current Behavior
While defining source freshness we use 2 properties: freshness and loaded_at_field. Value resolution is different (from a hierarchy perspective) for these 2 properties:
When freshness is set to None at table level (using "freshness:" or "freshness: null") it overwrites the freshness set at source level, but loaded_at_field set to None at table level doesn't overwrite the value set at source level.
Expected Behavior
loaded_at_field set to None at table level overwrites the value set at source level.
Steps To Reproduce
Relevant log output
No response
Environment
Which database adapter are you using with dbt?
snowflake
Additional Context
No response
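The asymmetry described under Current Behavior can be written down as a toy model. It is a sketch of the reported behavior only, not dbt-core's implementation. The function names are hypothetical, and the dicts mirror the reproduction project's source-level defaults and its two tables.

```python
from typing import Optional

# Source-level defaults and table entries mirroring the reproduction project.
SOURCE = {
    "freshness": {
        "warn_after": {"count": 1, "period": "day"},
        "error_after": {"count": 4, "period": "day"},
    },
    "loaded_at_field": "absent",
}
FRESHNESS_FAILS = {"name": "freshness_fails", "loaded_at_field": None}
FRESHNESS_SKIPPED = {"name": "freshness_skipped", "freshness": None}


def observed_freshness(source: dict, table: dict) -> Optional[dict]:
    """Reported behavior: a table-level null replaces the source-level value."""
    return table["freshness"] if "freshness" in table else source["freshness"]


def observed_loaded_at_field(source: dict, table: dict) -> Optional[str]:
    """Reported behavior: a table-level null falls back to the source-level value."""
    return table.get("loaded_at_field") or source["loaded_at_field"]


def expected_loaded_at_field(source: dict, table: dict) -> Optional[str]:
    """Requested behavior: same rule as freshness, so a table-level null wins."""
    if "loaded_at_field" in table:
        return table["loaded_at_field"]
    return source["loaded_at_field"]


print(observed_freshness(SOURCE, FRESHNESS_SKIPPED))      # None
print(observed_loaded_at_field(SOURCE, FRESHNESS_FAILS))  # absent
print(expected_loaded_at_field(SOURCE, FRESHNESS_FAILS))  # None
```

In this model the second call returns the source-level column name, which matches the "no such column: absent" error in the reproduction, while the requested rule would return None and leave room for metadata-based freshness.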