
Should the number of smoothed_errors be equal to the total number of observations? (i.e. an anomaly score for each observation) #49

Closed
gorold opened this issue Nov 13, 2020 · 1 comment

gorold commented Nov 13, 2020

```python
# for values at beginning < sequence length, just use avg
if not channel.id == 'C-2':  # anomaly occurs early in window
    self.e_s[:self.config.l_s] = \
        [np.mean(self.e_s[:self.config.l_s * 2])] * self.config.l_s
```

I'm wondering whether the length of self.e_s should be equal to the total number of observations in the sequence. Each self.e_s is missing 260 observations.

The code snippet above does not add the 250 values to the front of the array; it only overwrites the existing first 250 elements. Something like this would add them: `self.e_s = np.insert(self.e_s, 0, [np.mean(self.e_s[:self.config.l_s * 2])] * self.config.l_s)`

However, there are still 10 missing observations in each self.e_s, perhaps due to the n_predictions option?
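
For concreteness, here is a standalone sketch of the padding I have in mind (l_s = 250 assumed here; presumably the trailing gap from n_predictions would need similar treatment):

```python
import numpy as np

l_s = 250                              # sequence length assumed in this sketch
e_s = np.random.rand(1740)             # stand-in for a smoothed error sequence

# np.insert adds l_s new elements at the front (rather than overwriting the
# existing first l_s values), so the padded array lines up with the observations.
pad_value = np.mean(e_s[:l_s * 2])
e_s_padded = np.insert(e_s, 0, [pad_value] * l_s)
print(len(e_s_padded) - len(e_s))      # 250
```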

khundman (Owner) commented

Thanks for the question, @gorold. l_s is the length of the input sequence passed to the LSTM, which uses that sequence to start generating predictions for timesteps > l_s. An input sequence of length l_s is required before any future predictions can be generated, so the resulting (smoothed) error sequences (self.e, self.e_s) have length len(observations) - l_s.
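
As a rough illustration (a schematic sketch, not the repo's actual data shaping code; the names below are made up):

```python
import numpy as np

# Schematic only: each model input is a window of l_s past values used to
# predict the timesteps that follow it, so timesteps 0 .. l_s-1 never receive
# a prediction or an error value.
l_s = 250
obs = np.random.randn(2000)            # hypothetical test channel

windows = np.stack([obs[t - l_s:t] for t in range(l_s, len(obs))])
targets = obs[l_s:]                    # one target per predicted timestep

print(windows.shape, targets.shape)    # (1750, 250) (1750,)
print(len(obs) - len(targets))         # 250 == l_s
```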

The portion of errors.py that you referenced isn't intended to restore the sequence to the original observation length. The conversion of early errors to a mean value accounts for large error spikes from the initial predictions that are not smoothed out by the exponentially-weighted moving average. If you remove the lines you reference above and re-run, you can see this effect.
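
To illustrate the effect (a rough sketch using pandas' ewm, not necessarily the exact smoothing code in errors.py):

```python
import numpy as np
import pandas as pd

l_s = 250
e = np.abs(np.random.normal(0, 0.05, 1500))   # synthetic prediction errors
e[:5] = 5.0                                   # large spikes from the first predictions

# Exponentially-weighted smoothing; the span here is arbitrary for the sketch.
e_s = pd.Series(e).ewm(span=120).mean().values
print(e_s[:5])                                # the early spikes still dominate

# Overwriting the first l_s values with a local mean removes those spikes.
e_s[:l_s] = np.mean(e_s[:l_s * 2])
```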

The additional delta of 10 from the original length of the observations is due to n_predictions. We need 10 future test values to generate the loss that is backpropagated during training (see #24 (comment)) and to evaluate the normalized loss during testing. This isn't a requirement for inference, however, and I'd welcome a PR to address this.
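
Putting both offsets together (l_s = 250 and n_predictions = 10 assumed here, matching the numbers in the question):

```python
import numpy as np

l_s, n_predictions = 250, 10
observations = np.random.randn(2000)    # hypothetical test channel

# l_s timesteps of history are needed before the first prediction, and the
# final n_predictions timesteps are reserved as multi-step targets, so the
# (smoothed) error sequence is shorter by l_s + n_predictions.
n_errors = len(observations) - l_s - n_predictions
print(len(observations) - n_errors)     # 260, the gap noted in the question
```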
