window_size and batch_size in config.yaml #45

kennethweitzel · 2020-09-29T16:01:16Z

To my understanding, window_size and batch_size together determine the number of historical error values h (window_size * batch_size = h).

telemanom/config.yaml

Lines 7 to 11 in 26831a0

    
    # number of values to evaluate in each batch
    batch_size: 70

    # number of trailing batches to use in error calculation
    window_size: 30

I also know that values will be aggregated in windows of one minute and processed in batches of 70 minutes, as stated in your paper:

Telemetry values are aggregated into one minute windows and evaluated in batches of 70 minutes mimicking the downlink schedule for SMAP and our current system implementation.

I assume that means that one minute contains 30 values (so 1 value per 2 seconds). Is that correct?
The parameter h is then used to calculate the dynamic threshold and evaluate each batch.

Could you explain the reason for h to be divided into 2 separate parameters? Why can't there be a single h parameter of 2100 instead of 30 * 70 (window_size * batch_size) to define each batch? Is there a way to efficiently configure these two parameters for a use case that does not deal with SMAP?

Thank you in advance!

khundman · 2020-10-15T12:02:02Z

I assume that means that one minute contains 30 values (so 1 value per 2 seconds). Is that correct?

No, the processed SMAP telemetry used in this experiment contains 1 value per minute per channel. Once 70 minutes pass, this data comes down from the spacecraft in a batch (approx. 70 values per channel).

Could you explain the reason for h to be divided into 2 separate parameters?

These two numbers represent different aspects of the problem. As described above, batch_size (70) is the size of the chunks of data we receive in each batch. This is not a streaming scenario, so we are restricted to processing these batches as they arrive. For a different problem, this value would be set to whatever the batch size or cadence is. window_size is a tunable parameter that balances 1) providing a lot of historical context to the algorithm for evaluating the severity of an anomaly (larger window) against 2) computational and memory efficiency (larger windows cost more).
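To make the interaction concrete, here is a minimal sketch of how the two parameters could combine. It is not the telemanom implementation: the helper name `trailing_windows` and the handling of a short or partial final batch are assumptions made for illustration. Each evaluation looks back over h = window_size * batch_size values, and the window advances by batch_size values whenever a new batch arrives.

```python
from typing import List, Tuple


def trailing_windows(
    n_values: int, batch_size: int, window_size: int
) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs, end exclusive, one per evaluation step.

    Hypothetical helper: every window spans h = window_size * batch_size
    values and slides forward by batch_size values per step.
    """
    h = window_size * batch_size
    if n_values <= h:
        # Assumption: with less than h values of history, use everything.
        return [(0, n_values)]

    windows = [
        (start, start + h)
        for start in range(0, n_values - h + 1, batch_size)
    ]
    if windows[-1][1] < n_values:
        # Assumption: a partial final batch gets one extra window
        # aligned to the end of the data.
        windows.append((n_values - h, n_values))
    return windows


# 2240 one-minute values with the config.yaml defaults from this issue.
for start, end in trailing_windows(2240, batch_size=70, window_size=30):
    print(start, end)
# prints:
# 0 2100
# 70 2170
# 140 2240
```

A single h parameter of 2100 would fix the window length but say nothing about how far to slide it between evaluations, which is the role batch_size plays in this sketch.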

khundman closed this as completed Nov 21, 2020
