We had a model like the one below, which updated every six hours. sqlmesh correctly generates a feature_timestamp interval of six hours for each run. We then added lookback 4, assuming that would give a 24h lookback. In fact it generates 10h aggregation windows, presumably because 10h = 6h + 4 * 1h.
The documentation for lookback says N is the " ... number of time unit intervals prior to the current interval ...", which I read as four six-hour intervals. But I eventually found interval_unit, which defines the data interval granularity and only supports a specific set of interval lengths (hour, half_hour, etc.). It also says it's "inferred from cron".
So I guess sqlmesh reads the cron spec in two different ways: once to decide that each run covers a 6h window by default, and once to infer interval_unit hour, so lookback 4 actually means four hours rather than four 6h windows.
That seems quite confusing. It would be great to at least update the documentation to clarify, and even better to make lookback a multiple of the actual aggregation window length.
I'm also now confused as to whether batch_size is counting interval_units or aggregation windows.
MODEL (
  name features_shared.weather_airport_forecasts,
  kind INCREMENTAL_BY_TIME_RANGE (
    time_column feature_timestamp
  ),
  partitioned_by TIMESTAMP_TRUNC(feature_timestamp, DAY),
  cron '0 */6 * * *',
  ...
);
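To make the arithmetic concrete, here is a small Python sketch of the two readings of lookback 4 for the model above. This is not sqlmesh code: window_length is a hypothetical helper, and the 1h unit is only my guess at what interval_unit gets inferred as from the cron.

```python
from datetime import timedelta


def window_length(cron_interval, lookback, lookback_unit):
    """Hypothetical helper: current interval plus `lookback` units before it."""
    return cron_interval + lookback * lookback_unit


# One run covers six hours, per cron '0 */6 * * *'.
cron_interval = timedelta(hours=6)

# Reading A (what I expected): lookback counts cron-sized 6h intervals.
expected = window_length(cron_interval, 4, timedelta(hours=6))

# Reading B (what I seem to observe): lookback counts interval_unit
# intervals, assumed here to be one hour.
observed = window_length(cron_interval, 4, timedelta(hours=1))

print(expected)  # 1 day, 6:00:00  (6h + 24h lookback)
print(observed)  # 10:00:00        (6h + 4h lookback)
```

Reading B reproduces the 10h windows we see, which is why I suspect lookback is counted in interval_unit steps rather than in cron intervals.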