Currently, there's a limitation with anomaly detection: if an incident appears in your training data, it is reflected in how your thresholds are calculated. This can lead to a situation where an incident occurs, then recurs within x days, and is not flagged because it "looks" similar to the previous incident, which is now influencing the thresholds. See below for an example of this in an hour_of_day seasonal freshness graph.
On Jun 30th you can see an incident occur: the delta between the current time and the last load time increases linearly because no new data comes in. The same issue occurs on Jul 5th, but this time it's considered within the threshold.
Note: the periodic daily bumps are expected (hence why hour_of_day is used here, to take them into account).
Describe the solution you'd like
I think there are two possible approaches.
1. Have some way to retrospectively tag something as an incident. In theory you could do this with an incident seed file containing the start and finish time of each incident, but this is kinda clunky, as it requires a PR to be merged into dbt every time you want to remove an incident from the training data. Maybe this is still a good approach?
2. Apply prior knowledge at test-config time by adding out-of-bounds thresholds for training data, e.g. a min and max tolerance. Anything outside that tolerance would be ignored when calculating thresholds, but still included when looking at the detection periods. I would envisage this working with the following yaml added to the anomaly test config:
```yaml
training_thresholds:
  min_value: <when training, any value < this will be ignored>
  max_value: <when training, any value > this will be ignored>
```
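To make the intent concrete, here's a minimal Python sketch of the proposed behavior (the function names, the 3-sigma bounds, and the example numbers are all hypothetical illustrations, not Elementary's actual implementation): out-of-tolerance values are dropped when fitting thresholds, but every point is still checked during detection.

```python
def fit_thresholds(training_values, min_value=None, max_value=None, sigma=3.0):
    """Compute anomaly bounds from training data, ignoring values outside
    the configured training tolerance (i.e. known incidents)."""
    filtered = [
        v for v in training_values
        if (min_value is None or v >= min_value)
        and (max_value is None or v <= max_value)
    ]
    mean = sum(filtered) / len(filtered)
    std = (sum((v - mean) ** 2 for v in filtered) / len(filtered)) ** 0.5
    return mean - sigma * std, mean + sigma * std

def detect(values, lo, hi):
    """Detection still sees every point; incidents are NOT filtered here."""
    return [v for v in values if v < lo or v > hi]

# Training set contains a previous incident (a delta of 500 minutes).
# With max_value=120 it no longer inflates the thresholds, so a repeat
# incident (480) is still flagged.
training = [55, 60, 58, 62, 500, 57, 61]
lo, hi = fit_thresholds(training, max_value=120)
print(detect([59, 480], lo, hi))  # -> [480]
```

Without the `max_value` filter, the 500-minute incident would stretch the fitted bounds wide enough that the 480-minute repeat would fall inside them, which is exactly the failure mode in the graph above.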
One might ask: if you can apply that knowledge of the thresholds, should you be doing an anomaly test in the first place, or simply a static threshold test? I think an anomaly test is still fair, as it allows tighter tolerances during normally consistent periods.
Describe alternatives you've considered
See option 1 above.
Would you be willing to contribute this feature?
More than happy to work on this, but keen for feedback on the options and a discussion first, as I imagine this is an area that Elementary may have already discussed internally.
Hi @mossyyy,
Definitely something we have discussed, @oravi could elaborate on it.
Until he does: for this specific use case, I would recommend adding a where_expression that filters out these few anomalous hours from your training set.
When you add a where_expression to a test, Elementary "resets" the metrics (as this is a change to the underlying dataset) and recalculates them without the excluded rows.
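For illustration, a sketch of what that might look like in the test config (the model name, timestamp column, and cutoff timestamps are hypothetical; adjust the expression to cover the actual incident window):

```yaml
models:
  - name: my_model
    tests:
      - elementary.freshness_anomalies:
          timestamp_column: loaded_at
          # exclude the Jun 30th incident window from the training set
          where_expression: "loaded_at < '2023-06-30 00:00' or loaded_at >= '2023-07-01 12:00'"
```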
Maayan-s changed the title from "Allow a test to specify a training data filter to prevent over fitting thresholds to previous incidents" to "[ELE-1286] Allow a test to specify a training data filter to prevent over fitting thresholds to previous incidents" on Jul 9, 2023.