Dealing with different l_s values #27

diogodcarvalho · 2019-11-15T15:54:09Z

First of all, congrats on the very nice work you have done with telemanom. After trying it out, I noticed an issue when I changed the l_s value. The tests that check whether anomalies were detected in the expected range started giving unexpected results (although I could see the anomaly was actually being detected), and the colored plot boxes also seemed to be shifted in the Jupyter notebook. I think you had already mentioned this lack of adaptability to l_s in issue #5.

After going through the code, what I understood is that the indices made available in the .csv file don't represent the position of the anomaly in the initial array saved in the .npy file, but rather its position in the array y_test that is created when the data is pre-processed for training (whose indices are shifted l_s to the left, since the first points are used for the prediction). Correct me if I am wrong on this one, but from what I experimented this seems to be the case (I hit my head against the wall a few times trying to understand what the problem was :) ).
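The offset described above can be illustrated with a minimal sketch. It assumes a simplified one-step-ahead windowing, not telemanom's actual pre-processing; `make_windows` is a hypothetical helper, and the series is made up so that each value equals its original index.

```python
import numpy as np

def make_windows(series, l_s):
    """Hypothetical helper: build (inputs, targets) for one-step-ahead prediction.

    Each input window holds l_s consecutive points and the target is the
    point that immediately follows the window.
    """
    X = np.array([series[i:i + l_s] for i in range(len(series) - l_s)])
    y = series[l_s:]
    return X, y

# Made-up series whose values equal their original indices.
series = np.arange(1000)
l_s = 250
X, y_test = make_windows(series, l_s)

# Position i in y_test corresponds to position i + l_s in the original array,
# because the first l_s points are consumed as input for the first prediction.
i = 100
original_index = i + l_s
assert y_test[i] == series[original_index]
```

Under this assumption, a label stored as a y_test position only lines up with the original array when the same l_s is used, which matches the behaviour reported here.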

To solve this issue what I propose and have implemented is:

1. Shift the indices in the .csv file by +250 so that they represent the index in the original array. This way we don't have to hard-code the value 250 anywhere in the code, and anyone who wants to use their own data can create a labeled_anomalies.csv file like yours without having to take this +250 offset into consideration. One problem the +250 shift created is that some indices went past the maximum length of the array. This happens for A-8/9, D-1/4/7/9/12, E-3, F-2/8 and T-2; in these cases I just set the index to the array length - 1. I think leaving them larger than the maximum length wouldn't hurt the code anywhere, but it might raise questions from other people, and you had already raised this point in issue #6 (Labeled anomalous sequences extend past end of sequences).
2. Change just a few lines in the code to take the chosen (variable) l_s value into consideration. These changes only happen right before returning E_seq to the main file and in the plotting functions.
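The label shift with clamping described in the first point could look roughly like the sketch below. The function name `shift_to_original` and the example sequences are hypothetical; only the +l_s shift and the clamp to array length - 1 come from the description above.

```python
def shift_to_original(sequences, l_s, array_length):
    """Hypothetical helper: move [start, end] labels from y_test positions
    to positions in the original array, clamping to the last valid index."""
    last = array_length - 1
    return [[min(start + l_s, last), min(end + l_s, last)]
            for start, end in sequences]

# Made-up labels for an array of 1000 points, originally relative to y_test.
labels = [[100, 300], [700, 780]]
shifted = shift_to_original(labels, l_s=250, array_length=1000)
print(shifted)  # [[350, 550], [950, 999]]
```

The second sequence shows the clamping case: 780 + 250 would exceed the array, so the end index is set to 999.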

I re-ran the code and everything seems to run smoothly. In this commit I also included the results with the new indexing, as the old results had to be removed because they contained the 'wrong' indexing under the new rules.

I thought this was the best alternative to change as little code as possible, avoid hard-coding values, and make it as general and user-friendly as possible. Hope it helps!

khundman · 2019-11-23T16:55:21Z

Thanks @diogodcarvalho, this makes sense to me and I really appreciate the work! I went ahead and merged and I'm also going to push a new version shortly that incorporates your changes as well as some other significant updates.

khundman · 2019-11-23T17:19:45Z

In my latest commit you can see your changes reflected in labeled_anomalies.csv as well as here

telemanom/telemanom/errors.py

Lines 162 to 166 in 51fbb67

    
    # additional shift is applied to indices so that they represent the
    # position in the original data array, obtained from the .npy files,
    # and not the position on y_test (See PR #27).
    self.E_seq = [(e_seq[0] + self.config.l_s,
                   e_seq[1] + self.config.l_s) for e_seq in self.E_seq]

and here

telemanom/telemanom/plotting.py

Lines 64 to 66 in 51fbb67

    
    'x0': r[0] - self.config.l_s,
    'y0': _min,
    'x1': r[1] - self.config.l_s,
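The two snippets apply opposite shifts: detected sequences gain l_s so they can be compared with labels given in original-array positions, while ranges lose l_s before being drawn over y_test. A minimal standalone sketch of that round trip follows. The helper names and numbers are hypothetical, and it assumes that `r` in the plotting code holds ranges in original-array coordinates.

```python
def to_original_index(e_seq, l_s):
    """Shift sequences detected on y_test into original-array positions."""
    return [(start + l_s, end + l_s) for start, end in e_seq]

def to_plot_coords(ranges, l_s):
    """Shift original-array ranges back onto the y_test axis for plotting."""
    return [(start - l_s, end - l_s) for start, end in ranges]

l_s = 120                        # any window length, no longer tied to 250
detected_on_y_test = [(40, 90)]  # made-up detection, in y_test positions
labeled = [(150, 220)]           # made-up label, in original-array positions

reported = to_original_index(detected_on_y_test, l_s)
print(reported)  # [(160, 210)]

# Overlap check between reported detections and labels, both in original positions.
overlaps = any(d0 <= l1 and l0 <= d1
               for d0, d1 in reported
               for l0, l1 in labeled)
print(overlaps)  # True

boxes = to_plot_coords(labeled, l_s)
print(boxes)  # [(30, 100)]
```

Because both conversions use the configured l_s, the comparison and the plot boxes stay aligned for any window length.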

diogodcarvalho · 2019-11-25T09:09:29Z

Glad to help. I will then try out the new version.

diogodcarvalho and others added 2 commits November 14, 2019 15:45

fixed problems with different l_s values

bcf0bdc

removed and substituted results

671837a

khundman merged commit 4057b4d into khundman:master Nov 23, 2019
