optimization results #32

Merged
breznak merged 1 commit into htm-community:fixing_spatial_anomaly from psteinroe:fixing_spatial_anomaly
Apr 22, 2020
Conversation

@psteinroe psteinroe commented Apr 22, 2020

I optimized overnight (on my MacBook, so not much processing power) with two processes in parallel: one using localAreaDensity and one using numActiveCols, with a fixed seed of 5.

A few things that I find interesting:

  1. numActiveCols seems to be superior. The max standard score was 71, while localAreaDensity never reached 70.
  2. When removing the fixed seed, the standard score drops to 67.88. It seems like randomness plays a big role. I think adding the seed as one of the params to optimize (instead of fixing it to 5 during the optimization) might further increase the score, but in my opinion this would be bad practice...
  3. Maybe, instead of returning the standard score during the optimization, we could try returning the mean of all three scores?

Edit: Results are

"reward_low_FN_rate": 76.5626293570994,
"reward_low_FP_rate": 61.359926511549155,
"standard": 71.3094612770284
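Point 3 above could be sketched as a small change to the optimization objective. This is a hypothetical sketch (the function and dict names are illustrative, not the real optimization framework's API):

```python
# Hypothetical objective: return the mean of the three NAB profile scores
# instead of only the "standard" score.
def objective(scores):
    keys = ("standard", "reward_low_FN_rate", "reward_low_FP_rate")
    return sum(scores[k] for k in keys) / len(keys)

scores = {
    "reward_low_FN_rate": 76.5626293570994,
    "reward_low_FP_rate": 61.359926511549155,
    "standard": 71.3094612770284,
}
print(objective(scores))  # ≈ 69.744
```

For the results above, the mean objective would be roughly 69.74, i.e. lower than the standard score alone, since the low-FP profile drags it down.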


breznak commented Apr 22, 2020

These are very nice scores for HTMcore!!

Results are
"reward_low_FN_rate": 76.5626293570994,
"reward_low_FP_rate": 61.359926511549155,
"standard": 71.3094612770284

Compared to:

| Detector | Standard | Reward low FP | Reward low FN |
|---|---|---|---|
| Numenta HTM* | 70.5-69.7 | 62.6-61.7 | 75.2-74.2 |
| Numenta HTM using NuPIC v0.5.6* | 70.1 | 63.1 | 74.3 |
| NumentaTM HTM* (aka our type of TM) | 64.6 | 56.7 | 69.2 |
| Numenta HTM*, no likelihood | 53.62 | 34.15 | 61.89 |

So we could say we're the winners now! 💯 Best HTM model score on the NAB dataset. (*And we have some new features up our sleeve that were held back because of "how does it affect performance?" We can reliably answer that now!)

But...

numActiveCols seems to be superior. Max standard score was 71, while localAreaDensity never reached 70.

OK, but it's a close call; 1% shouldn't be that important. Also, as I understand it, this is just a 1-param optimization, right? I'll get your framework running, and then try running it on a cluster as well.

When removing the fixed seed, the standard score drops to 67.88. It seems like randomness plays a big role. I think adding the seed as one of the params to optimize (instead of fixing it to 5 during the optimization) might further increase the score, but in my opinion this would be bad practice...

This is a bad thing. It should never be so sensitive to the RNG seed! I'm wondering if the dataset is not good, being so sensitive to overfitting.
Or if there could be a bug in our algos that handles a fixed seed somehow differently. Curiosity: do you get the good results only if the RNG seed is 5? Or the same score for, say, 42? Or the same for 42 after re-tuning?
But to conclude, we want results with a random seed (i.e. not specified, or set to the special value that means "completely random"). The scores might be worse, but the results would correspond to general reality/performance on any dataset.

we could try to return the mean of all three scores?

TBH, I don't know exactly how the scores are computed, but the "standard" should be just that: some balance between low FP and low FN.

'synPermActiveInc': 0.003892649892638879,
'synPermConnected': 0.22110323252238637,
'synPermInactiveDec': 0.0006151856346474387,
'seed': 5,
breznak (Member) commented on the diff:
let's not use the fixed seed, keep it completely random. That way, results won't be overfitted and should generalize.

"standard": 57.22915150504096
"reward_low_FN_rate": 76.5626293570994,
"reward_low_FP_rate": 61.359926511549155,
"standard": 71.3094612770284
breznak (Member) commented on the diff:

Please rerun for the new (worse :( ) scores. And you can update the README with the new results! 👏


psteinroe commented Apr 22, 2020

I will rerun the optimization tonight without setting the seed parameter to see how it influences the results. Numenta did set the seed to a fixed value for their models (check here), but I fully agree that this is bad practice.

Or if there could be a bug in our algos that handles fixed seed somehow differently.

We should check that setting no seed means using a random one for the TM and SP, as well as the RDSE encoder, to be sure.
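The check described here could be sketched as a small determinism probe. This is a minimal stand-in (using Python's `random` in place of the real htm.core SP/TM/RDSE objects, which are not constructed here): build two components without an explicit seed and verify their output streams differ, then verify a fixed seed reproduces the same stream.

```python
import random

def make_component(seed=None):
    # Stand-in for SP/TM/RDSE construction; the real check would build
    # the actual htm.core objects and compare activeColumns / encodings.
    rng = random.Random(seed)  # seed=None -> seeded from OS entropy
    return [rng.random() for _ in range(10)]

a = make_component()
b = make_component()
assert a != b, "unseeded components should not be deterministic"

c = make_component(5)
d = make_component(5)
assert c == d, "same fixed seed must reproduce the same stream"
```

If the unseeded case ever produced identical streams, that would point at the kind of hidden fixed-seed default being discussed.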


breznak commented Apr 22, 2020

I will rerun the optimization tonight without setting the seed parameter to see how it influences the results.

I want to play around, so I suggest we just merge the results and correct/update them later.

We should check if setting no seed means using a random one for TM, SP as well as the RSDE Encoder to be sure.

I can do that, but I'm quite sure it defaults to a random seed.
What I'm wondering is why a fixed seed would have such a great effect. The sequence should still be pseudo-random.
My intuition would be:
Say we're walking all cells/columns in a layer in a for-loop:

  • Unless ordered (i.e. by overlap), we walk them "randomly".
  • The current "random (unseeded) random" generates a different sequence on each call. That is like "asynchronous processing"; imho the optimal case.
  • We could use "random-seeded random", a compromise between the current behavior and a fixed seed: seed=rng(). That way, each instance has a different seed, but within one instance of the object there's a fixed seed, which leads to the columns being walked in a fixed manner. I think this could theoretically lead to better/easier emergence of patterns among columns.
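The "random-seeded random" idea in the last bullet could look like this. A minimal sketch with hypothetical names (`Layer`, `column_order` are illustrative, not the real htm.core classes):

```python
import random

# Each instance draws its seed from the global RNG, so instances differ
# from each other, but each instance's own column order is reproducible
# from its stored seed.
class Layer:
    def __init__(self):
        self.seed = random.randrange(2**32)  # per-instance random seed

    def column_order(self, n):
        cols = list(range(n))
        # Re-seed from the stored seed: deterministic for this instance.
        random.Random(self.seed).shuffle(cols)
        return cols

a, b = Layer(), Layer()
assert a.column_order(16) == a.column_order(16)  # stable within an instance
# Different instances will (almost surely) walk the columns differently.
```

This keeps the fixed walking order within one object that the bullet argues could help patterns emerge, without overfitting the whole run to one global seed.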

@breznak breznak merged commit 7f73723 into htm-community:fixing_spatial_anomaly Apr 22, 2020

breznak commented Apr 22, 2020

We should check that setting no seed means using a random one for the TM and SP, as well as the RDSE encoder, to be sure.

Turns out the default params were set to a fixed seed. I have a PR that changes that; I just need to iron out all the determinism tests.

A quick workaround would be to force seed=0 everywhere, which means "random random".
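The seed=0 workaround could be applied mechanically over a model-parameter dict. A hedged sketch: the dict layout and parameter names below are illustrative only (not verified against this repo), relying on htm.core's convention that seed=0 means "pick a random seed":

```python
# Force seed=0 ("choose a random seed" in htm.core) on every component
# sub-dict of a hypothetical model-parameter structure.
def force_random_seed(params):
    for sub in params.values():
        if isinstance(sub, dict) and "seed" in sub:
            sub["seed"] = 0
    return params

model_params = {
    "sp":  {"columnCount": 2048, "seed": 5},
    "tm":  {"cellsPerColumn": 13, "seed": 5},
    "enc": {"size": 400, "seed": 5},
}
force_random_seed(model_params)
assert all(p["seed"] == 0 for p in model_params.values())
```

Running the overnight optimization with this applied would give the "random random" scores being asked for.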

psteinroe (Author) commented:

we could use "random-seeded random"

This sounds like the right way to do it. Can we implement that? Or does it behave like that already?

I have a PR that changes that, just need to iron out all the determinism tests.

Very nice, thanks!!

A quick workaround would be to force seed=0 everywhere, that means random random.

Alright, I will do that for tonight's run.
