lookback behavior / documentation is confusing #3602

@patricksurry

We had a model like the one below, which updates every six hours. SQLMesh correctly generates a six-hour feature_timestamp interval for each run. We then added lookback 4, assuming that would give a 24h lookback. In fact it generates 10h aggregation windows, presumably because 10h = 6h + 4 * 1h.

The documentation for lookback says N is the "... number of time unit intervals prior to the current interval ...", which I read as four six-hour intervals. But I eventually found interval_unit, which defines the data interval granularity and only supports a specific set of interval lengths (hour, half_hour, etc.). The documentation also says it is "inferred from cron".

So I guess what's happening is that SQLMesh parses the cron spec one way to decide that it needs a 6h window by default, and another way to infer an interval_unit of hour, so lookback 4 actually means 4 hours rather than four 6h windows.
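To make the two readings concrete, here is a small Python sketch of the arithmetic. It is only an illustration of my guess above, not SQLMesh's actual implementation: the function name window_length and the constants are hypothetical, and the 1h unit is the assumed value of the inferred interval_unit.

```python
from datetime import timedelta

# Values taken from the model in this issue.
CRON_INTERVAL = timedelta(hours=6)  # one run every six hours (cron '0 */6 * * *')
LOOKBACK = 4                        # lookback 4

# Assumption: interval_unit is inferred as "hour" for this cron expression.
INTERVAL_UNIT = timedelta(hours=1)


def window_length(run_interval: timedelta, lookback: int, lookback_unit: timedelta) -> timedelta:
    """Total aggregation window: the current interval plus `lookback` units before it."""
    return run_interval + lookback * lookback_unit


# Reading 1: lookback counts interval_unit steps (matches the 10h windows I observed).
observed = window_length(CRON_INTERVAL, LOOKBACK, INTERVAL_UNIT)

# Reading 2: lookback counts whole cron intervals (what I expected from the docs).
expected = window_length(CRON_INTERVAL, LOOKBACK, CRON_INTERVAL)

print(observed)  # 10:00:00
print(expected)  # 1 day, 6:00:00
```

Under the first reading the window is 6h + 4 * 1h = 10h; under the second it would be 6h + 4 * 6h = 30h, i.e. the current interval plus the 24h of lookback I was expecting.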

That seems quite confusing. It would be great to at least update the documentation to clarify this, and even better to make lookback a multiple of the actual aggregation window length.

I'm also now unsure whether batch_size counts interval_units or aggregation windows.

MODEL (
  name features_shared.weather_airport_forecasts,
  kind INCREMENTAL_BY_TIME_RANGE (
    time_column feature_timestamp
  ),
  partitioned_by TIMESTAMP_TRUNC(feature_timestamp, DAY),
  cron '0 */6 * * *',
  ...
);

Labels: Documentation (Improvements or additions to documentation)
