Anomalous indices loop #60

Closed
Anastasis-Iliopoulos opened this issue Aug 30, 2023 · 2 comments

Comments

@Anastasis-Iliopoulos

Hello, I have a question about this snippet of code (the loop helps to find the anomalies):

# data i is an anomaly if samples [(i - timesteps + 1) to (i)] are anomalies

anomalous_data = cnn_residuals > (3/2 * UCL)
anomalous_data_indices = []
for data_idx in range(N_STEPS - 1, len(X) - N_STEPS + 1):
    if np.all(anomalous_data[data_idx - N_STEPS + 1 : data_idx]):
        anomalous_data_indices.append(data_idx)

The comment says that a point i is an anomaly if all samples from i-timesteps+1 to i are anomalies.
The loop starts at N_STEPS - 1, which is 9 (let's say N_STEPS=10), and stops at len(X) - N_STEPS + 1, which is 91 (let's say len(X)=100). The end of range() is exclusive, so the last data_idx is 90.

Then, on each iteration, the if statement checks the samples from data_idx - N_STEPS + 1 to data_idx.
So:
iteration 1: all samples from 9-10+1 which is 0 to 9 (not inclusive)
iteration 2: all samples from 10-10+1 which is 1 to 10 (not inclusive)
....[going at the end].....
iteration 82: all samples from 90-10+1 which is 81 to 90 (not inclusive)

In other words:
We check i=9 with samples 0 to 8 (inclusive)
We check i=10 with samples 1 to 9 (inclusive)
....[going at the end].....
We check i=90 with samples 81 to 89 (inclusive)
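
The arithmetic above can be reproduced with a short sketch. It only enumerates the loop bounds and slice bounds from the quoted snippet; N_STEPS=10 and len(X)=100 are the example values from this post, and the helper name checked_slices is hypothetical.

```python
N_STEPS = 10   # example window length from this post
LEN_X = 100    # example value standing in for len(X)

def checked_slices(n_steps, len_x):
    """Return (data_idx, first_checked, last_checked) for every loop iteration."""
    rows = []
    for data_idx in range(n_steps - 1, len_x - n_steps + 1):
        start = data_idx - n_steps + 1
        stop = data_idx  # Python slices exclude the stop index
        rows.append((data_idx, start, stop - 1))
    return rows

rows = checked_slices(N_STEPS, LEN_X)
print(len(rows))   # 82 iterations
print(rows[0])     # (9, 0, 8)
print(rows[-1])    # (90, 81, 89)
```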

Question
I think we are missing i=91, 92, 93,....,99. Am I wrong? And why?
If I am not wrong, shouldn't for data_idx in range(N_STEPS - 1, len(X) - N_STEPS + 1) be replaced with for data_idx in range(N_STEPS - 1, len(X)): in order to iterate until the end?

Thank you in advance

@Anastasis-Iliopoulos
Author

Anastasis-Iliopoulos commented Oct 5, 2023

Ok. I figured it out.
There is no error.

Basically what we have is:
Time_window 0: [0 to 9]
Time_window 1: [1 to 10]
Time_window 2: [2 to 11]
....
Time_window 90: [90 to 99]

Each window has a score.
Given an i, if the scores of all 10 windows up to and including [i to i+10-1] are above the threshold, then i is an anomaly.

The 10 windows up to and including [i to i+10-1] are exactly the windows that include i, so this makes sense: if every window that includes i exceeds the threshold, then i is a "problem".
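
To make the "windows that include i" idea concrete, here is a minimal sketch. It assumes the reading used in this comment: len(X)=100 samples, N_STEPS=10, and one score per window, with window k covering samples [k to k+9]. The helper windows_containing is hypothetical and not taken from the repository.

```python
N_STEPS = 10
LEN_X = 100
N_WINDOWS = LEN_X - N_STEPS + 1  # 91 windows with start indices 0..90 (assumed layout)

def windows_containing(i, n_steps=N_STEPS, n_windows=N_WINDOWS):
    """Start indices k of the windows [k, k + n_steps - 1] that contain sample i."""
    first = max(0, i - n_steps + 1)
    last = min(i, n_windows - 1)
    return list(range(first, last + 1))

i = 50
containing = windows_containing(i)
# Indices selected by the slice [i - N_STEPS + 1 : i] in the quoted loop
sliced = list(range(i - N_STEPS + 1, i))

print(containing)  # window starts 41..50 (10 windows)
print(sliced)      # window starts 41..49 (9 windows)
```

Under this reading, the slice as quoted stops one short of window i itself, because the stop index of a Python slice is exclusive. Ending the slice at data_idx + 1 would cover all 10 windows; whether that is intended is a question for the maintainers.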

Furthermore:

  1. We need 10 windows to decide. The number 2 belongs only to window [0 to 9], window [1 to 10] and window [2 to 11], so we do not have enough windows to decide if 2 is an anomaly. The same applies to all numbers from 0 to window_size-2.
  2. On the other end, the number 98 belongs only to windows [89 to 98] and [90 to 99], so we do not have enough windows to decide if 98 is an anomaly. The same applies to all numbers from len(X)-window_size+1 to len(X)-1.

That's why we only care about the middle numbers/windows.
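
The two edge cases can be checked by counting, for each point, how many windows contain it. This is a sketch under the same assumptions as above (len(X)=100 samples, window size 10, one window per start index 0..90); window_count is a hypothetical helper.

```python
def window_count(i, n_steps, len_x):
    """Number of length-n_steps windows over len_x samples that contain sample i."""
    n_windows = len_x - n_steps + 1
    first = max(0, i - n_steps + 1)   # earliest window start that still reaches i
    last = min(i, n_windows - 1)      # latest window start that exists
    return last - first + 1

N_STEPS = 10
LEN_X = 100

# Points covered by a full set of N_STEPS windows
full = [i for i in range(LEN_X) if window_count(i, N_STEPS, LEN_X) == N_STEPS]

print(window_count(2, N_STEPS, LEN_X))   # 3 windows contain sample 2
print(window_count(98, N_STEPS, LEN_X))  # 2 windows contain sample 98
print(full[0], full[-1])                 # 9 90
```

The fully covered points are exactly range(N_STEPS - 1, len(X) - N_STEPS + 1), which matches the bounds of the loop in the original snippet.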

@Anastasis-Iliopoulos closed this as not planned Oct 5, 2023
@YKatser
Collaborator

YKatser commented Oct 7, 2023

@Anastasis-Iliopoulos Thank you for your interest! We will try to respond sooner if any questions arise in the future.
