@eakmanrq we have a slightly similar use-case, but where the "updated at" timestamp could still be inferred from the source. The source table is a daily snapshot of the raw data that changes over time. It has a "snapshot_date" timestamp, but this column doesn't tell you if the record changed (so it doesn't fulfill the We can build an SCD type 2 from this and use Would it make sense to add support for this case out of the box? Perhaps this is already supported by the new kind?
Thanks @plaflamme for sharing this use case. Basically, you want to create an SCD Type 2 table out of a snapshot table, and that use case makes sense. There is already a bool which is I like the idea of adding support for this, but I'm leaning towards adding it in another PR. The reason is that this one is already complex enough.
@eakmanrq great! Glad to hear this might be a useful addition. Another PR sounds like a good idea. I'll open a separate issue so this can be tracked separately; feel free to close it if it's not useful.
Prior to this PR you could only create an SCD Type 2 table if you had an "Updated At" timestamp in the source table. This PR makes it so that you can create an SCD Type 2 table from any source by checking whether specific columns have changed. Since you no longer have "Updated At" to tell you when a change was made, it uses `execution_time` instead. As a result, you can think of it as a less precise approach to SCD Type 2.
This PR includes support for both native SQLMesh and the dbt adapter.
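To illustrate the column-based approach described above, here is a minimal sketch in plain Python. It is not the SQL that this PR generates: the function name `apply_scd_type_2_by_column`, the `valid_from`/`valid_to` column names, and the choice to close out records that disappear from the source are all assumptions made for the example.

```python
from datetime import datetime
from typing import Any, Dict, List

Row = Dict[str, Any]


def apply_scd_type_2_by_column(
    existing: List[Row],
    source: List[Row],
    key: str,
    check_columns: List[str],
    execution_time: datetime,
) -> List[Row]:
    """Return the new table state after one run (hypothetical helper)."""
    source_by_key = {row[key]: row for row in source}
    result: List[Row] = []
    still_open = set()  # keys whose open record is unchanged in the source

    for row in existing:
        row = dict(row)  # copy so the caller's rows are not mutated
        if row["valid_to"] is None:
            current = source_by_key.get(row[key])
            changed = current is None or any(
                row[col] != current[col] for col in check_columns
            )
            if changed:
                # No "Updated At" is available, so the run's execution
                # time stands in for the moment of the change.
                row["valid_to"] = execution_time
            else:
                still_open.add(row[key])
        result.append(row)

    # Open a new record for every key that is new or whose values changed.
    for k, src in source_by_key.items():
        if k not in still_open:
            new_row: Row = {key: k}
            new_row.update({col: src[col] for col in check_columns})
            new_row["valid_from"] = execution_time
            new_row["valid_to"] = None
            result.append(new_row)
    return result
```

Because the change is stamped with the execution time rather than the time the source actually changed, a change made between two runs is only dated to the later run. That is the loss of precision mentioned above.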
One challenge when testing the dbt runtime is that dbt doesn't allow freezing "now()" (its execution time). Previously I had a simple way of patching dbt's "now()" with the frozen time, but it had a bug, which I fixed in this PR. That bug is what created the perceived behavior difference between dbt and SQLMesh, so now they appear to behave the same.
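For readers unfamiliar with the technique, the general idea of patching a "now()" source in tests looks roughly like the sketch below. It uses the standard library's `unittest.mock.patch.object`; the `Clock` class, `stamp_record`, and `frozen_now` are hypothetical stand-ins and do not reflect dbt's internals or the actual patch in this PR.

```python
from contextlib import contextmanager
from datetime import datetime, timezone
from typing import Any, Dict, Iterator
from unittest.mock import patch


class Clock:
    """Hypothetical stand-in for a runtime's source of the current time."""

    @staticmethod
    def now() -> datetime:
        return datetime.now(timezone.utc)


def stamp_record(record: Dict[str, Any]) -> Dict[str, Any]:
    # Code under test: reads the clock at call time, not at import time.
    return {**record, "valid_from": Clock.now()}


@contextmanager
def frozen_now(frozen: datetime) -> Iterator[None]:
    # Replace Clock.now for the duration of the block; the original
    # attribute is restored when the block exits.
    with patch.object(Clock, "now", return_value=frozen):
        yield


frozen = datetime(2020, 1, 1, tzinfo=timezone.utc)
with frozen_now(frozen):
    inside = stamp_record({"id": 1})
```

A patch like this only works if every code path reads the time through the patched attribute. Any path that captures its own reference to the clock will silently use the real time, which can make two runtimes look like they behave differently.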
I decided to leave the `SCD_TYPE_2` model kind in place and alias it to `SCD_TYPE_2_BY_TIME`, which is why this PR is not a breaking change. To me this seems fair: by time is the recommended and default approach, so having the unqualified version of the name point to what we recommend seems fine. If others disagree, I can remove the alias and make this a breaking change.
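As a rough illustration of the aliasing idea, the unqualified name can be resolved to the by-time kind before any further processing. This is a sketch, not the code in this PR: `resolve_kind`, `KIND_ALIASES`, and `KNOWN_KINDS` are hypothetical names, and `SCD_TYPE_2_BY_COLUMN` is assumed here as the name of the new column-based kind.

```python
# Hypothetical alias table: the legacy, unqualified name maps to by-time.
KIND_ALIASES = {"SCD_TYPE_2": "SCD_TYPE_2_BY_TIME"}

# Assumed set of canonical kind names for this example only.
KNOWN_KINDS = {"SCD_TYPE_2_BY_TIME", "SCD_TYPE_2_BY_COLUMN"}


def resolve_kind(name: str) -> str:
    """Map a user-supplied kind name to its canonical form."""
    normalized = name.strip().upper()
    canonical = KIND_ALIASES.get(normalized, normalized)
    if canonical not in KNOWN_KINDS:
        raise ValueError(f"Unknown model kind: {name!r}")
    return canonical
```

With this shape, existing models that declare `SCD_TYPE_2` keep working unchanged, while new models can opt into either qualified name explicitly.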