Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

earthgecko_skyline_detector #330

Merged
merged 2 commits into from Jan 18, 2019
Merged

Conversation

earthgecko
Copy link
Contributor

Adding earthgecko_skyline detector code, results, etc.

IssueID #2802: earthgecko_skyline_detector
IssueID #2800: earthgecko_skyline

- Added earthgecko_skyline

Added:
nab/detectors/earthgecko_skyline/
results/earthgeckoSkyline/
Modified:
README.md
config/thresholds.json
results/final_results.json
run.py
@rhyolight
Copy link
Member

@subutai If this is simply a better implementation of the skyline detector, is there any reason not to just use the version that performs better? I would prefer if we only had one version on the leaderboard. What do you think @earthgecko?

@subutai
Copy link
Member

subutai commented Jan 15, 2019

@subutai If this is simply a better implementation of the skyline detector, is there any reason not to just use the version that performs better? I would prefer if we only had one version on the leaderboard. What do you think @earthgecko?

I think the main reason to keep the old implementation is that there have been published results with that version, so from a paper reproducibility standpoint it may be worth keeping.

@earthgecko makes some really good points as to why that earlier implementation was insufficient, so I'm all for merging this version in (once any other review comments have been addressed).

@earthgecko
Copy link
Contributor Author

Hi @rhyolight and @subutai there may be another option. Seeing as the paper
references the v1.0 release, there should not be any harm in leaving the
original implementation out moving forward. But it is just a matter of time and
energy to calculate the scores, so there is little harm in keeping it in future
versions either.

However I would suggest that if it is kept in, it is disabled by default and has
to be enabled to run. That way it is not shipping/wasting time of any
unsuspecting/uninformed people that are using/testing with NAB.
I am happy enough to create a PR to enable that if you want to keep it in.

This may be especially true if more time series are added to the corpus as the
original implementation is always going to take relatively long to run,
especially on longer time series, even with the addition of the much improved
optimizer and scorer.

And on that note, I think that NAB may benefit from more real world time series
added to the corpus, which I shall endeavour to start collecting and submitting.
And I am pretty certain the numenta detector will score highest on additions
too ;)

@subutai
Copy link
Member

subutai commented Jan 16, 2019

However I would suggest that if it is kept in, it is disabled by default and has
to be enabled to run. That way it is not shipping/wasting time of any
unsuspecting/uninformed people that are using/testing with NAB.

That's an excellent idea! 👍

I am happy enough to create a PR to enable that if you want to keep it in.

That would be great.

And on that note, I think that NAB may benefit from more real world time series
added to the corpus, which I shall endeavour to start collecting and submitting.

Wow, that would be a major improvement to NAB and warrant a version change. I have some nice data from the NAB competition as well (blood pressure data) but it was not enough to create a new version. However, if you can add some more diverse datasets, we might have enough to do a 1.1, 1.5 or even 2.0.

@smirmik
Copy link
Contributor

smirmik commented Jan 16, 2019

And I am pretty certain the numenta detector will score highest on additions
too ;)

There are different opinions about this ;)
Sorry for off topic.
Just wanted to remind about CAD OSE and that this is not the only version of CAD. ;)
@subutai , @rhyolight my regards. I miss the competition. You are great guys, but someone has to be on the dark (not opened) side ;)

@earthgecko
Copy link
Contributor Author

@smirmik I think that at some point, we will all come to realise, that we are all actually on the same side

@earthgecko
Copy link
Contributor Author

earthgecko commented Jan 16, 2019

Wait until I add numenta and CAD OSE algorithms to Skyline and then we will see ;)
I could weight CONSENSUS values by NAB scores LOL

We are all anomaly hunters in an information dimension. And to be honest, I am not sure that you lot and Jeff should be allowed to play with digital brains however you see fit :)

However it must be said that it is amazing what you have achieved to date with such a port, such an abstract endeavour! I was glad to see that some of nupic scores have cyclonic complexity score of F, I would honesty expect nothing less. Sincere kudos to the team, as always, a remarkable achievement.

@smirmik not to down play CAD OSE in anyway :) It just that numenta must be acknowledged for having done something really, really unique and innovative on a much grander canvas then simple anomaly detection.

This took me much longer to post than expected, due to a github anomaly :)

@rhyolight
Copy link
Member

I am happy enough to create a PR to enable that if you want to keep it in.

Would this be a change to this PR or are you talking about creating a new PR for it?

@earthgecko
Copy link
Contributor Author

Hi @rhyolight it will be a separate PR.

@rhyolight rhyolight self-requested a review January 17, 2019 22:55
@rhyolight rhyolight self-assigned this Jan 17, 2019
Copy link
Member

@rhyolight rhyolight left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@earthgecko Looks nice and clean, but please run pylint with our pylintrc file and fix errors. Mostly spacing / indentation.

For example:

pylint --rcfile=$NUPIC/pylintrc nab/detectors/earthgecko_skyline/algorithms.py

nab/detectors/earthgecko_skyline/README.md Outdated Show resolved Hide resolved
Copy link
Member

@rhyolight rhyolight left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just realized that most all the pylint warnings are coming from the copied code from the skyline detector. I can't really expect you do update all that code, nor do I want you to. I'll take a manual look over it and ignore the minor stuff.

@earthgecko
Copy link
Contributor Author

@rhyolight I do not mind fixing all that actually. If I have to look at it and work with it, it always bothers me when things are not linted and it makes inline liniting in an IDE inefficient too. I will take a look.

Typo

Co-Authored-By: earthgecko <gary.wilson@of-networks.co.uk>
@rhyolight
Copy link
Member

Ok, but don't change the skyline function names that were copied over. I want them to be exactly the same for search/find later. IF you want to make a PR after this that lints the entire NAB codebase I would be cool with that.

@rhyolight
Copy link
Member

I ran locally and got the same results as this PR, aside from expected rounding errors.

@rhyolight
Copy link
Member

@subutai @earthgecko

Aside from any outstanding code review issues....

Before I do anything at this point, I want to make it clear what we have agreed to. Here is the path forward I prefer, please review and comment before I take action:

  1. We merge this PR, which adds "earthgecko skyline" to the leaderboard on the README
  2. There will be no version bump
  3. On version bump, we'll decide what to do about the two skyline implementations on the leaderboard.

If you guys agree to that, I approve this PR. 👍

@earthgecko
Copy link
Contributor Author

@rhyolight sounds good to me. If you would like any further changes with regards to aligning with the numenta/nupic/pylintrc please let me know which specifically and what you do not what changed, like the Skyline snake_case function names.

@subutai
Copy link
Member

subutai commented Jan 18, 2019

@rhyolight Sounds good to me!

I assume we will later disable the old Skyline as the default case.

@rhyolight rhyolight merged commit 33bcc77 into numenta:master Jan 18, 2019
@earthgecko
Copy link
Contributor Author

@subutai Yes I shall open an issue to set skyline as disabled by default ... #333 opened, please feel free to assign it to me :) I shall endeavour to put something together to that effect.

@earthgecko
Copy link
Contributor Author

@rhyolight the funny thing is ...

(NAB-py2715-cp27mu) $ pylint --rcfile=$NUPIC/pylintrc nab/detectors/earthgecko_skyline/algorithms.py
Using config file ~/sandbox/of/github/numenta/nupic/pylintrc

------------------------------------
Your code has been rated at 10.00/10

(NAB-py2715-cp27mu) $ pylint --rcfile=$NUPIC/pylintrc nab/detectors/earthgecko_skyline/earthgecko_skyline_detector.py
Using config file ~/sandbox/of/github/numenta/nupic/pylintrc
************* Module nab.detectors.earthgecko_skyline.earthgecko_skyline_detector
C0412,  84: 2 - Imports from package nab are not grouped (ungrouped-imports)

-----------------------------------
Your code has been rated at 9.94/10

(NAB-py2715-cp27mu) $ pylint --rcfile=$NUPIC/pylintrc nab/detectors/earthgecko_skyline/skyline_algorithms.py
Using config file ~/sandbox/of/github/numenta/nupic/pylintrc

------------------------------------
Your code has been rated at 10.00/10

(NAB-py2715-cp27mu) $

I got it pylintrc'd for you :) However I had to change skyline function names that were in snake_case to get those ratings, so it was not what you wanted any way ;)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants