Hello @khundman,

To my understanding, `window_size` and `batch_size` together determine the number of historical error values (h = window_size * batch_size):

telemanom/config.yaml, lines 7 to 11 in 26831a0:

```yaml
# number of trailing batches to use in error calculation
window_size: 30
```
I also know that values are aggregated in windows of one minute and processed in batches of 70 minutes, as stated in your paper:

> Telemetry values are aggregated into one minute windows and evaluated in batches of 70 minutes, mimicking the downlink schedule for SMAP and our current system implementation.

I assume that means that one minute contains 30 values (so 1 value per 2 seconds). Is that correct?

The parameter h is then used to calculate the dynamic threshold and evaluate each batch.

Could you explain the reason for h being divided into 2 separate parameters? Why can't there be a single h parameter of 2100 instead of 30 * 70 (window_size * batch_size) to define each batch? Is there a way to efficiently configure these two parameters for a use case not dealing with SMAP?

Thank you in advance!
> I assume that means that one minute contains 30 values (so 1 value per 2 seconds). Is that correct?

No, the processed SMAP telemetry used in this experiment contains 1 value per minute per channel. Once 70 minutes pass, this data comes down from the spacecraft in a batch (approx. 70 values per channel).

> Could you explain the reason for h being divided into 2 separate parameters?

These two numbers represent different aspects of the problem. As described above, the batch_size (70) represents the chunks of data we receive in each batch. This is not a streaming scenario, so we are restricted to processing these batches as they arrive. For different problems, this value would be set to whatever the batch size/cadence is. The window_size is a tunable parameter that balances the tradeoff between 1) providing a lot of historical context to the algorithm for evaluating the severity of an anomaly (a larger window) and 2) computational and memory efficiency (larger windows are higher cost).
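As a rough illustration of that cadence (the helper below is an assumption for this sketch, not telemanom code), the per-minute stream for one channel can be chunked into downlink-sized batches:

```python
def batches(values, batch_size=70):
    """Split a per-minute telemetry stream into downlink-sized chunks.

    One value per minute per channel; a batch arrives every 70 minutes,
    so each chunk holds approx. 70 values.
    """
    for i in range(0, len(values), batch_size):
        yield values[i:i + batch_size]

# 210 minutes of one channel -> 3 batches of 70 values each
stream = list(range(210))
chunks = list(batches(stream))
```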
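A minimal sketch of how the two parameters might interact: batches are processed as they arrive, while a trailing buffer of h = window_size * batch_size error values provides the history for thresholding. The threshold rule here (mean plus three standard deviations) is a placeholder, not the pruning-based method from the paper, and all names and values are illustrative:

```python
import random
from collections import deque

# Hypothetical configuration mirroring the discussion:
# batch_size is fixed by the downlink cadence (70 one-minute values);
# window_size is a tunable number of trailing batches to keep.
BATCH_SIZE = 70
WINDOW_SIZE = 30

# At most h = window_size * batch_size trailing error values are retained.
history = deque(maxlen=WINDOW_SIZE * BATCH_SIZE)

def evaluate_batch(errors, history):
    """Flag errors in a new batch against a simple dynamic threshold.

    A placeholder rule (mean + 3 * std of the trailing history) stands in
    for the threshold calculation described in the paper.
    """
    history.extend(errors)
    n = len(history)
    mean = sum(history) / n
    std = (sum((e - mean) ** 2 for e in history) / n) ** 0.5
    threshold = mean + 3 * std
    return [e > threshold for e in errors]

# Process batches as they "arrive": each batch is evaluated against up to
# h = 30 * 70 = 2100 trailing error values.
random.seed(0)
for _ in range(5):
    batch = [random.gauss(0, 1) for _ in range(BATCH_SIZE)]
    flags = evaluate_batch(batch, history)
```

Because the batch boundary is imposed by the downlink schedule, a single h = 2100 would conflate two independent choices: how often data can be evaluated (batch_size) and how much history to keep (window_size).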