Hello @khundman,
First of all, congrats on the very nice work you have done with telemanom. After trying it out, I noticed an issue when I changed the l_s value: the tests that check whether the anomalies were detected in the expected range started giving unexpected results (although I could see the anomaly was actually being detected), and the colored plot boxes appeared shifted in the Jupyter notebook. I think you had already mentioned this lack of adaptability to l_s in issue #5.
After going through the code, my understanding is that the indices in the .csv file don't represent the position of the anomaly in the original array saved in the .npy file, but its position in the y_test array that is created when the data is pre-processed for training (whose indices are shifted l_s to the left, since the first l_s points are used as input for the first prediction). Correct me if I am wrong on this one, but that is what my experiments suggest (I hit my head against the wall a few times before understanding the problem :) ).
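To illustrate the shift described above, here is a minimal sketch of sliding-window shaping. It is not the actual telemanom code: `make_windows` and `to_original_index` are hypothetical names, and the toy series and the value `l_s = 5` are chosen only for readability.

```python
import numpy as np


def make_windows(arr, l_s, n_predictions=1):
    """Split a 1-D series into (input window, target) pairs.

    Hypothetical helper for illustration only: each target starts
    l_s points after the start of its input window.
    """
    X, y = [], []
    for i in range(len(arr) - l_s - n_predictions + 1):
        X.append(arr[i:i + l_s])                        # l_s input points
        y.append(arr[i + l_s:i + l_s + n_predictions])  # value(s) to predict
    return np.array(X), np.array(y)


def to_original_index(y_index, l_s):
    """Map a position in the target array back to the original array."""
    return y_index + l_s


# Toy series whose values equal their own indices, so the shift is visible.
series = np.arange(20)
l_s = 5
X, y = make_windows(series, l_s)

# The first target is the value at original index l_s, not index 0.
first_target = int(y[0][0])
```

With this shaping, position `i` in the target array corresponds to position `i + l_s` in the original array, which is why labels expressed in target coordinates stop lining up as soon as l_s changes.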
To solve this issue, what I propose and have implemented is:
1. Shift the indices in the .csv file by +250 so that they represent the index in the original array. This way the value 250 does not have to be hard-coded anywhere in the code, and anyone who wants to use their own data can create a labeled_anomalies.csv file like yours without having to account for the +250 offset. One problem with this shift is that some indices went past the maximum length of the array. This happens for A-8/9, D-1/4/7/9/12, E-3, F-2/8 and T-2; in these cases I set the index to the array length - 1. I think leaving them larger than the maximum length wouldn't break the code anywhere, but it might raise questions from other people, and you had already raised this point in issue #6 (Labeled anomalous sequences extend past end of sequences).
2. Change a few lines in the code to take the chosen (variable) l_s value into account. These changes only happen right before E_seq is returned to the main file and in the plotting functions.
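As a rough sketch of the two changes above (not the code in this commit), the index handling could look like the following. The names `shift_and_clamp` and `to_original_indices` and the example values are hypothetical. I am assuming anomaly sequences are stored as `[start, end]` pairs, and that the l_s adjustment means adding l_s to the detected sequences.

```python
def shift_and_clamp(sequences, offset, array_length):
    """Shift [start, end] pairs by offset, capping them at the last valid index.

    Hypothetical helper: mirrors the idea of moving the labeled indices
    by +250 and setting any overflow to array length - 1.
    """
    last = array_length - 1
    return [[min(start + offset, last), min(end + offset, last)]
            for start, end in sequences]


def to_original_indices(detected, l_s):
    """Convert detected [start, end] pairs from y_test coordinates
    to coordinates in the original array by adding l_s."""
    return [[start + l_s, end + l_s] for start, end in detected]


# Made-up labels for an array of length 8500 (not taken from the dataset).
labels = [[100, 200], [8000, 8400]]
shifted = shift_and_clamp(labels, offset=250, array_length=8500)

# Made-up detection expressed relative to y_test, with l_s = 250.
detected = to_original_indices([[120, 180]], l_s=250)
```

Once both the labels and the detected sequences are expressed in original-array coordinates, they can be compared directly whatever value of l_s is chosen.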
I re-ran the code and everything seems to be running smoothly. In this commit I also included the results with the new indexing, as the old results had to be removed because they contained the 'wrong' indexing according to the new rules.
I thought this was the best alternative to change as little code as possible, avoid hard-coding values, and make it as general and user-friendly as possible. Hope it helps!