Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added port to Python3 of htmresearch location framework to htm.core #609

Merged
merged 27 commits into from Sep 17, 2019

Conversation

@fcr
Copy link

commented Aug 6, 2019

I'm in the middle of porting selected Python parts of htmresearch to htm.core and Python 3.
I needed to check that the Connections were working so added some tests.
I may expand these in the future if the need arises.

@breznak

This comment has been minimized.

Copy link
Member

commented Aug 7, 2019

Thanks, there's a previous work in this matter, #349 , how does that compare?

A word of caution, Connections are considered our internal, backend structure, so are freely subject to change, even API breakage.
On the other hand I assume this work is for htm.reseach, so you would be using Connections directly (just keep in mind that same as research is not stable, nor is Connections).

But if we already have a py bindings for connections, we shoould also have tests.

@breznak breznak requested a review from ctrl-z-9000-times Aug 7, 2019

@fcr fcr changed the title Added unit tests for the Connections Python bindings Added port of htmresearch location framework to htm.core and Python3 Sep 8, 2019

@fcr

This comment has been minimized.

Copy link
Author

commented Sep 8, 2019

I have ported the location framework from htmresearch so that it is based on htm.core rather than nupic.core and runs under Python3.
The location framework is the code that underlies Jeff Hawkins Thousand Brains model of the Neocortex and hence is probably the most important framework in htmresearch.
The port includes unit tests and a couple of example applications using the location framework.
This pull request also includes a few additions the the htm.core Python bindings that are needed to support the location framework port.
Additional goodies are the RawSensor and RawValues regions, the GridCellLocationRegion and the ColumnPoolerRegion.
Please note that there is a subclass of the Connections class which provides some methods that are missing from the htm.core Connections.cpp. These methods would be good candidates for inclusion in the Connections.cpp.

@fcr fcr changed the title Added port of htmresearch location framework to htm.core and Python3 Added port to Python3 of htmresearch location framework to htm.core Sep 8, 2019

@breznak

This comment has been minimized.

Copy link
Member

commented Sep 9, 2019

I have ported the location framework from htmresearch so that it is based on htm.core rather than nupic.core and runs under Python3.
The location framework is the code that underlies Jeff Hawkins Thousand Brains model of the Neocortex and hence is probably the most important framework in htmresearch.

@fcr This is an awesome feat! 💯

Before I start looking into this, and I'd like both @dkeeney , @ctrl-z-9000-times to make their reviews too, as this is an important update! ... this is 17k LOC, do you think you could split this into a series of more digestable PRs? (if not, we'll go through, but modularization would be welcomed)

Maybe?:

The port includes unit tests and a couple of example applications using the location framework.
This pull request also includes a few additions the the htm.core Python bindings that are needed to support the location framework port.

PR-1

Additional goodies are the RawSensor and RawValues regions, the GridCellLocationRegion and the ColumnPoolerRegion.

Raw, GridCells, CP as separate PRs 2,3,4?

@breznak
Copy link
Member

left a comment

Part 1/ N? of my reviews.

Can you please move the files under a namespace. so we clearly see those are used only for the "htmresearch" code? If something becomes used by other, we'll move it up.
maybe {experimenta,htmresearch,research,...}*
py/htm/???*/algorithms

@dkeeney
Copy link

left a comment

This is a significant addition to our library. Good work.

As always, I have to defer to others regarding the usefulness of the algorithms. I was hoping to find the NetworkAPI Region modules for the algorithms. I would like my NetworkAPI users to have access to these new algorithms. Do they exist? Are they coming in a later PR or am I looking in the wrong place. Since you are using a personal fork I cannot see your entire branch.

Otherwise, I have no problem with the changes that I can see.

@fcr

This comment has been minimized.

Copy link
Author

commented Sep 9, 2019

@dkeeney> As always, I have to defer to others regarding the usefulness of the algorithms. I was hoping to find the NetworkAPI Region modules for the algorithms. I would like my NetworkAPI users to have access to these new algorithms. Do they exist? Are they coming in a later PR or am I looking in the wrong place. Since you are using a personal fork I cannot see your entire branch.

If I understand your question, look in the htm.regions package. There are quite a few new regions.

@ctrl-z-9000-times

This comment has been minimized.

Copy link

commented Sep 9, 2019

Do you need to include the *.training.log files? They appear to be just lists of numbers.

@breznak

This comment has been minimized.

Copy link
Member

commented Sep 9, 2019

Do you need to include the *.training.log files? They appear to be just lists of numbers.

nice, that would significantly bring down the num of changed files

@dkeeney

This comment has been minimized.

Copy link

commented Sep 9, 2019

If I understand your question, look in the htm.regions package. There are quite a few new regions.

Ah, I figured it out. I have to click on your branch name at the top in order to see everything.
And yes. There are the Regions. Wonderful.

@fcr

This comment has been minimized.

Copy link
Author

commented Sep 9, 2019

Do you need to include the *.training.log files? They appear to be just lists of numbers.

Yes. They are the object definition data for the example.

@fcr

This comment has been minimized.

Copy link
Author

commented Sep 9, 2019

BTW are you still sure that you want me to separate out the additions. Now that I have done it (but will not have time to release it until some time tomorrow - I am in the Israel time zone), it just doesn't look so nice:
algorithms in htm, algorithms in htm_advanced, examples under htm, examples under htm_advanced etc. and two levels of tests.

@breznak

This comment has been minimized.

Copy link
Member

commented Sep 9, 2019

to separate out the additions. [...] it just doesn't look so nice:
algorithms in htm, algorithms in htm_advanced, examples under htm, examples under htm_advanced etc. and two levels of tests.

depends if it's more readable in logical steps..?

@fcr

This comment has been minimized.

Copy link
Author

commented Sep 15, 2019

I have implemented and released the change from setParameterArray to executeCommand.
The resultant code is much nicer IMHO.
Note that region.init.py has some helper methods for the executeCommand to make things neater.
I also took the opportunity to fix some minor problems and add some new unit tests.

@dkeeney
Copy link

left a comment

The changes look good to me.
I will let @breznak do the approve because I think he is still reviewing.

@ctrl-z-9000-times

This comment has been minimized.

Copy link

commented Sep 15, 2019

Hi fcr,

I'm trying to run the examples but I can't seem to get them to work. What I'd really like to see are the figures from the columns papers. Can you demonstrate that this can reproduce any of the images?

Thanks.

@fcr

This comment has been minimized.

Copy link
Author

commented Sep 15, 2019

@ctrl-z-9000-times Did you run them with the committed in configurations? Anything less will not accumulate the statistics needed to generate the figure correctly if at all.
But when you do that, they take ages to run.
I did run them earlier and got the figures but I deleted the results folders before committing my push request to keep the release clean.
After getting your query, I started running the union path integration example directly in my copy of htm.core. It is still merrily chugging away and will probably take most of the night.
In any case my main motivation for including the examples was to show example usage of the framework and as a sort of final sanity check - not so much for people to reproduce the paper results since this takes so long.
I'll let you know what I get when I have run both examples again.

@ctrl-z-9000-times

This comment has been minimized.

Copy link

commented Sep 16, 2019

@ctrl-z-9000-times Did you run them with the committed in configurations?

I did, but I did not wait very long. If the results are the same as the Numenta papers results, then I guess there is nothing to see :) If you know the results are the same then I'm on board with this PR. Thank you for re-running this but you do not need to re-run it if it works.

+1

@fcr

This comment has been minimized.

Copy link
Author

commented Sep 16, 2019

union path integration was finished when I woke up this morning. About 1.3Gb of plots and graphs. I can send them if you want - they are only 195Mb compressed. Just don't accuse me of sending zip bombs. :-).
Since you are convinced, I will not rerun the thing calculation. It takes even longer.

Changed access to SPRegion "synPermConnected" parameter from
ReadOnlyAccess to ReadWriteAccess.
Added some unit tests in Python to test this.
@breznak

This comment has been minimized.

Copy link
Member

commented Sep 16, 2019

After getting your query, I started running the union path integration example directly in my copy of htm.core. It is still merrily chugging away and will probably take most of the night.

thank you both for throuroughly testing the code, and (re)running the examples. Good to know everything works (tm). Out of curiosity, are the results with htm.core somewhat significanltly different from those based on nupic.core? (I wouldn't think so, but we do some computations slightly differently)

@breznak
Copy link
Member

left a comment

Thanks for the added tests and the cleanups, @fcr !

from htm.bindings.engine_internal import Array
import numpy as np

from .import extractList

This comment has been minimized.

Copy link
@breznak

breznak Sep 16, 2019

Member

nit, probably ok, but next time please use the absolute paths for imports. The code is then easier to read and move around.

This comment has been minimized.

Copy link
@fcr

fcr Sep 16, 2019

Author

The reference emphasizes the close relationship of the method with the region implementations. Its there to support them. Besides this is good Pythonic 3 usage. :-)

py/htm/advanced/regions/__init__.py Outdated Show resolved Hide resolved
@fcr

This comment has been minimized.

Copy link
Author

commented Sep 16, 2019

Pushed another another small bug that I found and blocked my progress.

@fcr

This comment has been minimized.

Copy link
Author

commented Sep 16, 2019

Oops. Forgot to run the C++ unit test before pushing. :-(
Too used to just running the Python ones.

@breznak
Copy link
Member

left a comment

There are some issues with the latest commit, please review below

"seed": 1956,
"spVerbosity": 0}

class SPRegionTests(unittest.TestCase):

This comment has been minimized.

Copy link
@breznak

breznak Sep 16, 2019

Member

is this test useful? It's fine to have py-only tests, but we don't want to duplicate each. Please see bindings/py/tests/network_test.py for SPRegion tested parts.
Maybe you can add the relevant parts there(?)

@@ -349,7 +349,7 @@ Spec *SPRegion::createSpec() {
1, // elementCount
"", // constraints
"0.1", // defaultValue
ParameterSpec::ReadOnlyAccess)); // access
ParameterSpec::ReadWriteAccess)); // access

This comment has been minimized.

Copy link
@breznak

breznak Sep 16, 2019

Member

Please don't. Sorry it isn't really clear from the current code, it is a WIP in #561 , synPermConnected is essentially const, since it's used in Connections during initialization (and cannot, by its logic, be changed after that). A code that would change that variable would break all previous learning.

You mentioned this was "another obstacle" for you, how is that?

This comment has been minimized.

Copy link
@fcr

fcr Sep 16, 2019

Author

I need to be able to set this in creating my SPRegion or is there another way of doing this?

This comment has been minimized.

Copy link
@fcr

fcr Sep 16, 2019

Author

I guess I'll meanwhile rollback this change and get your help to deal with this issue.

This comment has been minimized.

Copy link
@breznak

breznak Sep 16, 2019

Member

I need to be able to set this in creating my SPRegion or is there another way of doing this?

You can create a SPRegion, just don't set the RO variables.
If you wanted to change the synPermConnected after initialization of the region, don't. The previous learning will be lost, and the code shouldn't allow it.

This comment has been minimized.

Copy link
@fcr

fcr Sep 16, 2019

Author

No. I want to set the synPermConnected at SPRegion creation. This is normally done by means of the parameters dict. However, synPermConnected has ReadOnlyAccess and hence that entry in the parameter dict is rejected.

This comment has been minimized.

Copy link
@fcr

fcr Sep 16, 2019

Author

@fcr you could workaround the inability to change synPermConnected by appropriately changing synPermActiveInc/Dec, it has the same effect.

Same effect but wrong semantics.

This comment has been minimized.

Copy link
@fcr

fcr Sep 16, 2019

Author

By the way. A parameter passed in during creation should have an access of CreateAccess rather than ReadOnlyAccess. CreateAccess means can be used for initialization but read-only after that. ReadOnlyAccess implies that it is something created by the algorithm and is readable.

Great!
Can you please change synPermConnected to CreateAccess?
I'm not touching the C++ stuff any more. Don't want to break anything. :-)

This comment has been minimized.

Copy link
@fcr

fcr Sep 16, 2019

Author

Once this PR is finally accepted, I'll go through and start applying the CreateAccess to the relevant Python region specs and make a new (smaller) PR.

This comment has been minimized.

Copy link
@fcr

fcr Sep 16, 2019

Author

BTW never came across CreateAccess before, at least not in any of the Python code.

This comment has been minimized.

Copy link
@breznak

breznak Sep 16, 2019

Member

tracked in #669

Restored synPermConnected to ReadOnlyAccess.
Skip SPRegionTest that tries to set this parameter
@fcr

This comment has been minimized.

Copy link
Author

commented Sep 16, 2019

I have restored synPermConnected to ReadOnlyAccess, but I have left the test for this py/tests/regions/sp_region_test.SPRegionTest.testSPRegionParametersAreWritable() but it is skipped for now.
If you unskip it you will understand what I mean.

@fcr

This comment has been minimized.

Copy link
Author

commented Sep 16, 2019

See my last comment in #669

dkeeney added 2 commits Sep 17, 2019
@breznak

This comment has been minimized.

Copy link
Member

commented Sep 17, 2019

@fcr Can you please merge master with #670 , which includes David's fix for the issue with executeCommand? And it should allow you to get rid of the custom htm.advanced.regions.__init__ executeCommand

@breznak
Copy link
Member

left a comment

  • please merge the new fixes for executeCommand
  • I have some questions about what should we do with the new regions
  • like the number of new tests 👍
import random
from collections import defaultdict, OrderedDict

import matplotlib

This comment has been minimized.

Copy link
@breznak

breznak Sep 17, 2019

Member

slightly OT: this example script is not run by default, and would fail to do so w/o manually installing matplotlib. Some files use try ... except for loading matplotlib.
Do we want to add it (mpl) as a pip dependency?

  • it shouldn't limit us with any platforms, I think
  • it still won't be 100% fool-proof, as I recall mpl has sometimes issues with Qt on Linux, and some manual steps may be required.
py/htm/advanced/regions/__init__.py Outdated Show resolved Hide resolved
# ----------------------------------------------------------------------
"""
Method to register regions in the nupic.research repository. Regions
specified here must be registered before NuPIC's Network API can locate them.

This comment has been minimized.

Copy link
@breznak

breznak Sep 17, 2019

Member

can this be moved to htm.advanced.regions.__init__ ?

in a FIFO and each call to compute pops the top element.
Each data record consists of the non-zero indices of the sparse vector,
a 0/1 reset flag, and an integer sequence ID.

This comment has been minimized.

Copy link
@breznak

breznak Sep 17, 2019

Member

@dkeeney do we need this region? "sparse vector" is exactly our SDR, and I think the main InputRegion was adapted to accept SDRs?
Q is if it supports the additional bells&whistles (0/1 reset flag, and sequence ID)?

This comment has been minimized.

Copy link
@dkeeney

dkeeney Sep 17, 2019

@dkeeney do we need this region? "sparse vector" is exactly our SDR,

I think that this is not so much a facility to create a sparse vector but a facility to inject data into a link that can flow to other regions. Similar to what VectorFileSensor does with data from a file. I am thinking this might be implemented as a kind of encoder.

" and I think the main InputRegion "

Not following you on this statement.


class RawValues(PyRegion):
"""
RawDate is a simple region used to send raw scalar values into networks.

This comment has been minimized.

Copy link
@breznak

breznak Sep 17, 2019

Member

@dkeeney this sounds generally useful, do we want to promote the region from .advanced to htm.regions ?

But that would be only for python. We already have SensorRegion which applies an encoder to the imputs. A solution would be to have an identity/pass-through encoder (=that does nothing with its inputs) and use that instead of this class.

  • BaseEncoder could behave as an identity encoder.

This comment has been minimized.

Copy link
@dkeeney

dkeeney Sep 17, 2019

A solution would be to have an identity/pass-through encoder (=that does nothing with its inputs) and use that instead of this class.

Its the lack of a NetworkAPI region that is the problem. This Raw region is one solution to it. The EncoderRegion would have been able to do that if I could figure out how to make it work.

This comment has been minimized.

Copy link
@breznak

breznak Sep 17, 2019

Member

Yes, I was thinking about a Generic encoder region #637 would come in handy here.

So, the remaining point, do we want to move "RawValues" (maybe rename to ??) from advanced to the common py regions?

This comment has been minimized.

Copy link
@fcr

fcr Sep 17, 2019

Author

Maybe RawSensor also. It queues sparse input and pops it as dense.
In either case, the region.__init__.py will need to move as well since it provides extractList.

This comment has been minimized.

Copy link
@fcr

fcr Sep 17, 2019

Author

I am about to push the changes due to #670

fcr added 2 commits Sep 17, 2019
@breznak
Copy link
Member

left a comment

Thanks @fcr for this big addition! 💯
It's great to have HTM.research as a part of our repo, working.

I'm happy with the current state, and I think points from other developers have been adressed as well. So let's ship this.

TODO:

  • followup issue #669
  • Fred, would you make some comment on the remaining, relevant parts of htm.research that might be useful porting here?
  • now, let's celebrate 🍾 🎆

Thank you! 👍

@breznak breznak merged commit 66f9186 into htm-community:master Sep 17, 2019

4 checks passed

ci/circleci: osx-build-debug Your tests passed on CircleCI!
Details
ci/circleci: osx-build-release Your tests passed on CircleCI!
Details
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@breznak breznak referenced this pull request Sep 17, 2019
@fcr

This comment has been minimized.

Copy link
Author

commented Sep 17, 2019

Thank you @breznak and @dkeeney.
I am sure that is not easy to review such a large body of code.
If you want another part to port, htmresearch.core->ApicalTiebreakTemporalMemory.cpp would be a good candidate.
This would be a great performance boost to the framework. After htm.core this should be child's play for you. ;-).

@ctrl-z-9000-times

This comment has been minimized.

Copy link

commented Sep 17, 2019

Thank you @fcr, i ve tried several times to port this code and each time gotten lost in the woods. This is a great addition to this repo!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants
You can’t perform that action at this time.