Anomaly modular #956

breznak · 2014-05-23T12:25:35Z

1/ turn anomaly.py into a class for better reuse, move related logic from clamodel.py here
2/ nice speedup for python (use numpy.sum())
3/ new features/"implementation" to anomaly computation - parametrized, default off, so no change from current
3.1/ new cumulative anomaly (sliding window)
3.2/ @subutai Please have a look if you like this approach, it's prepared for integration with your likelihood anomaly code
4/ added tests

and convert tests / usecases accordingly; Just create an instance of nupic.algorithms.anomaly.Anomaly() and call the computeAnomalyScore() method

yields cca 300x speedup! (only of array is numpy, not python array) which is this case

instead of python's sum() on numpy array (much slower) found by: grep -R 'numpy.*.sum()' $NUPIC

this class was extracted from clamodel previously, so it's just moving around

avail anomaly implementations

fixed bug: self._buf must be float array!

Conflicts: py/nupic/frameworks/opf/clamodel.py

This reverts commit be71384.

breznak · 2014-05-23T12:29:03Z

TODO1: integration tests will fail on serialization/ pickle - probably as anomaly turned into a class, not just a function. Can someone help me fix that please?
TODO2: more like new-feature: add slidingWindowSize and useLikelihood to anomaly parameters so it can be used from OPF, any pointers where these can be modified? (would be a good article for wiki). Thanks a lot!

rhyolight · 2014-05-23T14:49:39Z

Subutai is out until next week.

Sent from my MegaPhone

On May 23, 2014, at 5:29 AM, breznak notifications@github.com wrote:

TODO1: integration tests will fail on serialization/ pickle - probably as anomaly turned into a class, not just a function. Can someone help me fix that please?
TODO2: more like new-feature: add slidingWindowSize and useLikelihood to anomaly parameters so it can be used from OPF, any pointers where these can be modified? (would be a good article for wiki). Thanks a lot!

—
Reply to this email directly or view it on GitHub.

rhyolight · 2014-05-23T21:19:47Z

TODO1: integration tests will fail on serialization/ pickle - probably as anomaly turned into a class, not just a function. Can someone help me fix that please?

This could be a hard thing to fix.

scottpurdy · 2014-05-23T22:46:44Z

Thanks Marek, very good idea! In fact, we have a version of this that we use internally that Matt put up in #946

As far as the specifics, I don't think we want to require clients to create an instance of the class to have access to the static functionality of the base computeAnomalyScore. We do want a more sophisticated method for de-noising the score though and in the case where you have a running state it may make sense to encapsulate that in an object. A lot of thought has gone into how to do this likelihood calculation here at Grok and Matt has a PR to merge our internal likelihood algorithm (#946 - originally created by @subutai). Can you try that out and see if it solves your needs?

Feel free to put feedback on that PR if you think the API could be improved. If that change doesn't solve your use case then please follow up with an explanation of what the data looks like and how our likelihood doesn't solve the issue / what you would like to see done.

breznak · 2014-05-24T12:52:03Z

Hi Scott, I did check #946 quickly, I'm not sure if the moving average referenced there is meant for anomaly, or for keeping track of records (to update the distribution model)? Anyways, there are still things in this PR that could be merged (numpy speedup) and usability from one anomaly class. I will comment to likelihood for the rest..

breznak · 2014-05-24T13:26:35Z

(@rhyolight btw, Matt, a closed PR gets reopened when one comments on it)
I just reviewed #946 , looks great, I just had some questions about unclear things.. As I understand it, the likelihood code is not intended to replace anomaly.py it just computes a probability that what has been detected as anomaly (input value, its anomaly score) is really an anomaly.

That's why I've created this PR so that people can turn features on/off in anomaly detection - the likelihood core is great example, final anomaly score would be anomaly_score * probability_its_really_anomaly, moving average would be other example.

This reverts commit d7e635f.

scottpurdy · 2014-05-27T18:17:03Z

@breznak - I think it would be better to separate the numpy speed ups from the anomaly score changes.

And it looks like you use numpy.sum(numpyArray) in a lot of places rather than numpyArray.sum() but I believe that is the same thing (and the original is the more canonical way to use methods in Python). Note that using the sum method of a numpy array is different than using the builtin Python sum function.

breznak · 2014-05-30T12:53:47Z

@scottpurdy ok, I'll separate the speedups to a new PR.
about sum() I was referring to https://stackoverflow.com/questions/10922231/pythons-sum-vs-numpys-numpy-sum
The thing is to use numpy.sum on numpy array; I didn't check if np_array.sum() and py_array.sum() use numpy, resp. python's implementation (I'd think so). Should I redo those cases?

breznak · 2014-05-30T12:57:47Z

TODO1: integration tests will fail on serialization/ pickle - probably as anomaly turned into a class, >>not just a function. Can someone help me fix that please?
This could be a hard thing to fix.

@rhyolight can you elaborate please? To implement serialization for a new class should not be a big problem (if I knew how); troubles can be with checkpoints from past models ... And I think we should think this through, as backwards compatibility for loading modes might be important for Grok or corporate customers, but I believe at this stage it's a minor issue for community usecases, so we could be more flexible here

breznak · 2014-05-30T13:08:31Z

@subutai I plan to use code from #946 here, so that people can easily combine all sorts of modifications (sliding window, likelihood,..) to anomaly.

I wanted to ask, is there a reason why in grok you are using just likelihood of anomalies and not something like "probability weighted anomaly"? (anomaly_score * anomaly_probability) This would make more sense to me: eg huge anomaly but very unlikely is unimportant; but somewhat likely but catastrophical anomaly is important; I plan to implement this here. Thanks

breznak · 2014-05-30T13:09:35Z

Here I also intend to fix #967

coveralls · 2014-09-05T15:08:22Z

Coverage increased (+0.01%) when pulling 42ea7a6 on breznak:anomaly_modular into 2664c19 on numenta:master.

rhyolight · 2014-09-05T21:00:13Z

@breznak What is the status of this PR at this point?

breznak · 2014-09-05T21:04:37Z

it's ready ;)

rhyolight · 2014-09-05T21:17:20Z

@scottpurdy / @chetan51 / @subutai Please review. It will be nice to finally get this merged!

breznak · 2014-09-06T16:24:02Z

it's ready ;)

rhyolight · 2014-09-09T15:11:09Z

@scottpurdy / @chetan51 / @subutai Any of you have time to review this today? This PR has been around since May, and it would be great to put it to bed.

rhyolight · 2014-09-09T22:40:28Z

@breznak I've gotten word that this PR is good to merge after some style issues are fixed. You should run pylint on the files you changed. There are a lot of little things to fix, for example:

› pylint nupic/algorithms/anomaly.py
************* Module nupic.algorithms.anomaly
W0311,  30: 0 - Bad indentation. Found 4 spaces, expected 2 (bad-indentation)
C0301,  34: 0 - Line too long (88/80) (line-too-long)
C0301,  36: 0 - Line too long (84/80) (line-too-long)
W0311,  39: 0 - Bad indentation. Found 4 spaces, expected 2 (bad-indentation)
W0311,  40: 0 - Bad indentation. Found 4 spaces, expected 2 (bad-indentation)
W0311,  44: 0 - Bad indentation. Found 6 spaces, expected 4 (bad-indentation)
W0311,  47: 0 - Bad indentation. Found 6 spaces, expected 4 (bad-indentation)
W0311,  48: 0 - Bad indentation. Found 4 spaces, expected 2 (bad-indentation)
W0311,  50: 0 - Bad indentation. Found 6 spaces, expected 4 (bad-indentation)
W0311,  51: 0 - Bad indentation. Found 4 spaces, expected 2 (bad-indentation)
W0311,  53: 0 - Bad indentation. Found 6 spaces, expected 4 (bad-indentation)
W0311,  54: 0 - Bad indentation. Found 4 spaces, expected 2 (bad-indentation)
C0303,  61: 0 - Trailing whitespace (trailing-whitespace)
C0301,  61: 0 - Line too long (93/80) (line-too-long)
C0303,  62: 0 - Trailing whitespace (trailing-whitespace)
C0301,  73: 0 - Line too long (92/80) (line-too-long)
C0303,  75: 0 - Trailing whitespace (trailing-whitespace)
C0301,  76: 0 - Line too long (89/80) (line-too-long)
C0303,  77: 0 - Trailing whitespace (trailing-whitespace)
C0301,  77: 0 - Line too long (82/80) (line-too-long)
C0303,  78: 0 - Trailing whitespace (trailing-whitespace)
C0303,  79: 0 - Trailing whitespace (trailing-whitespace)
C0301,  79: 0 - Line too long (82/80) (line-too-long)
C0303,  81: 0 - Trailing whitespace (trailing-whitespace)
C0303,  82: 0 - Trailing whitespace (trailing-whitespace)
C0301,  82: 0 - Line too long (100/80) (line-too-long)
C0303,  84: 0 - Trailing whitespace (trailing-whitespace)
C0301,  84: 0 - Line too long (106/80) (line-too-long)
C0303,  85: 0 - Trailing whitespace (trailing-whitespace)
C0301,  85: 0 - Line too long (107/80) (line-too-long)
C0301,  86: 0 - Line too long (113/80) (line-too-long)
C0301,  92: 0 - Line too long (95/80) (line-too-long)
C0301,  95: 0 - Line too long (137/80) (line-too-long)
C0325, 101: 0 - Unnecessary parens after 'not' keyword (superfluous-parens)
C0301, 102: 0 - Line too long (93/80) (line-too-long)
C0301, 103: 0 - Line too long (98/80) (line-too-long)
C0301, 109: 0 - Line too long (93/80) (line-too-long)
C0303, 111: 0 - Trailing whitespace (trailing-whitespace)
C0301, 113: 0 - Line too long (107/80) (line-too-long)
C0301, 114: 0 - Line too long (106/80) (line-too-long)
C0301, 115: 0 - Line too long (99/80) (line-too-long)
C0301, 132: 0 - Line too long (87/80) (line-too-long)
C0301, 135: 0 - Line too long (87/80) (line-too-long)
W0611,  27: 0 - Unused import nupic (unused-import)

breznak · 2014-09-09T23:11:33Z

@rhyolight that is a darn long list :) i'll try to do something about it. but we should get a coding-style template, so I could just ctrl+shift+f in netbeans and have it formated. 🐫

rhyolight · 2014-09-09T23:13:21Z

@breznak There is a pylintrc in the file tree, which provides that info. Not sure about IDE support, however. It all adheres to https://github.com/numenta/nupic/wiki/Python-Style-Guide.

rhyolight · 2014-09-10T00:02:05Z

@oxtopus Aside from Python code style issues -- which should be addressed by @breznak after adhering to pylint advice -- can you do one last functional review of this PR? @subutai and @scottpurdy have already reviewed most of it, but more commits were added.

subutai · 2014-09-10T20:26:05Z

@rhyolight I don't have any more substantive changes beyond what's already been discussed. Would be great to get this merged in soon. @breznak I've had to make those pylint changes too - even though the list looks long, they can actually be done very quickly.

breznak · 2014-09-11T11:01:40Z

@rhyolight fixed most of the formatting, although I'd like to raise a discussion on usefulness of some of those -- esp. 80char/line limit and trailing whitespace (in comments) make the code look ugly and less readible, imho.

coveralls · 2014-09-11T11:33:54Z

Coverage increased (+0.0%) when pulling 914b4cb on breznak:anomaly_modular into 3d45da2 on numenta:master.

rhyolight · 2014-09-11T15:19:56Z

@breznak said:

esp. 80char/line limit and trailing whitespace (in comments) make the code look ugly and less readible, imho.

I'm pretty positive we're not going to deviate from PEP-8 on that issue.

Anomaly modular

oxtopus · 2014-09-11T15:45:10Z

I think it's worth discussion. I find the strict 80-char limit anachronistic and even PEP-8, which is limited in scope to setting the standards for contributions to the Python stdlib, provides provisions for 100-char width in http://legacy.python.org/dev/peps/pep-0008/#maximum-line-length for code maintained exclusively or primarily by a team that can reach agreement on this issue. We deviate somewhat from PEP-8 already and as stated in http://legacy.python.org/dev/peps/pep-0020/, readability counts.

rhyolight · 2014-09-11T16:03:58Z

Well, perhaps a discussion on nupic-hackers would be worthwhile.

Matt Taylor
OS Community Flag-Bearer
Numenta

On Thu, Sep 11, 2014 at 8:45 AM, Austin Marshall notifications@github.com
wrote:

I think it's worth discussion. I find the strict 80-char limit
anachronistic and even PEP-8, which is limited in scope to setting the
standards for contributions to the Python stdlib, provides provisions for
100-char width in
http://legacy.python.org/dev/peps/pep-0008/#maximum-line-length for code
maintained exclusively or primarily by a team that can reach agreement on
this issue. We deviate somewhat from PEP-8 already and as stated in
http://legacy.python.org/dev/peps/pep-0020/, readability counts.

—
Reply to this email directly or view it on GitHub
#956 (comment).

scottpurdy · 2014-09-11T17:56:32Z

Hey all, I think this needs a bit more work. I am going to send a follow up PR but want to get feedback asap so let me know if you disagree with any of these changes:

The versioning should bump a version number rather than just check if it is missing the anomaly instance
The current method for specifying the type of anomaly you want doesn't let a 3rd party create their own. Can we change this to a base class defining the interface and then a set of subclasses? Separating into separate classes also makes each method easier to understand and test since you don't have the sliding window logic, etc all in the same class.
The anomaly class supports taking either current predicted columns or previously predicted columns. I don't see any reason to have the class support both. Rather than duplicating the previously predicted state I think the new class should take the previously predicted only.
I don't think we use the KNNAnomalyClassifierRegion but the change there added state to the class with no benefit. It should just use computeRawAnomalyScore.

breznak added 12 commits May 23, 2014 04:39

make anomaly.py a class for further modularity

716189f

and convert tests / usecases accordingly; Just create an instance of nupic.algorithms.anomaly.Anomaly() and call the computeAnomalyScore() method

add test for anomaly speedup in numpy.sum()

e47b6a6

yields cca 300x speedup! (only of array is numpy, not python array) which is this case

speedup: use numpy.sum() where applicable

5f7932a

instead of python's sum() on numpy array (much slower) found by: grep -R 'numpy.*.sum()' $NUPIC

implement "new" anomaly using temporal pooler

419de82

use TemporalPoolerAnomaly class in CLAModel

848ad46

this class was extracted from clamodel previously, so it's just moving around

CumulativeAnomaly (sliding window) implementation

b639fe0

add AnomalyChooser class that let's you select from

036e25f

avail anomaly implementations

Anomaly: remove separate classes and use parameters to method instead

e07e11e

Anomaly: prepare for Subutai's likelihood anomaly

9dfdd36

cumulative Anomaly: fix + unittest

36dd7f4

fixed bug: self._buf must be float array!

TMP use slide-window

be71384

Conflicts: py/nupic/frameworks/opf/clamodel.py

Revert "TMP use slide-window"

d7e635f

This reverts commit be71384.

breznak added the anomaly label May 23, 2014

breznak added this to the Sprint 23 milestone May 23, 2014

breznak self-assigned this May 23, 2014

scottpurdy closed this May 23, 2014

breznak reopened this May 24, 2014

breznak added 2 commits May 24, 2014 16:26

Revert "Revert "TMP use slide-window""

90468a3

This reverts commit d7e635f.

TMP: sliding window 100

bfd906c

rhyolight assigned breznak Aug 22, 2014

rhyolight removed the help wanted label Aug 22, 2014

rhyolight modified the milestones: Sprint 28, Sprint 29 Aug 22, 2014

Merge branch 'master' into anomaly_modular

42ea7a6

rhyolight modified the milestones: Sprint 30, Sprint 29 Sep 5, 2014

breznak added 2 commits September 11, 2014 14:39

Merge branch 'master' into anomaly_modular

fc1e25d

apply pylint formating changes

914b4cb

rhyolight added a commit that referenced this pull request Sep 11, 2014

Merge pull request #956 from breznak/anomaly_modular

98d1452

Anomaly modular

rhyolight merged commit 98d1452 into numenta:master Sep 11, 2014

scottpurdy mentioned this pull request Sep 11, 2014

Anomaly changes #1277

Merged

2 tasks

breznak deleted the anomaly_modular branch January 8, 2016 12:01

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Anomaly modular #956

Anomaly modular #956

breznak commented May 23, 2014

breznak commented May 23, 2014

rhyolight commented May 23, 2014

rhyolight commented May 23, 2014

scottpurdy commented May 23, 2014

breznak commented May 24, 2014

breznak commented May 24, 2014

scottpurdy commented May 27, 2014

breznak commented May 30, 2014

breznak commented May 30, 2014

breznak commented May 30, 2014

breznak commented May 30, 2014

coveralls commented Sep 5, 2014

rhyolight commented Sep 5, 2014

breznak commented Sep 5, 2014

rhyolight commented Sep 5, 2014

breznak commented Sep 6, 2014

rhyolight commented Sep 9, 2014

rhyolight commented Sep 9, 2014

breznak commented Sep 9, 2014

rhyolight commented Sep 9, 2014

rhyolight commented Sep 10, 2014

subutai commented Sep 10, 2014

breznak commented Sep 11, 2014

coveralls commented Sep 11, 2014

rhyolight commented Sep 11, 2014

oxtopus commented Sep 11, 2014

rhyolight commented Sep 11, 2014

scottpurdy commented Sep 11, 2014

Anomaly modular #956

Anomaly modular #956

Conversation

breznak commented May 23, 2014

breznak commented May 23, 2014

rhyolight commented May 23, 2014

rhyolight commented May 23, 2014

scottpurdy commented May 23, 2014

breznak commented May 24, 2014

breznak commented May 24, 2014

scottpurdy commented May 27, 2014

breznak commented May 30, 2014

breznak commented May 30, 2014

breznak commented May 30, 2014

breznak commented May 30, 2014

coveralls commented Sep 5, 2014

rhyolight commented Sep 5, 2014

breznak commented Sep 5, 2014

rhyolight commented Sep 5, 2014

breznak commented Sep 6, 2014

rhyolight commented Sep 9, 2014

rhyolight commented Sep 9, 2014

breznak commented Sep 9, 2014

rhyolight commented Sep 9, 2014

rhyolight commented Sep 10, 2014

subutai commented Sep 10, 2014

breznak commented Sep 11, 2014

coveralls commented Sep 11, 2014

rhyolight commented Sep 11, 2014

oxtopus commented Sep 11, 2014

rhyolight commented Sep 11, 2014

scottpurdy commented Sep 11, 2014