Encoders in c++ #258
Comments
I'd like to make some SDR-based tools for working with encoders. The fact that we have no existing C++ encoders means that we can make them in whatever way we'd like to.
MultiEncoder via SDR-Concatenator
The MultiEncoder creates a group of encoders and concatenates the results together. I'd like to make an SDR-Concatenator class to do this. The users would create the encoders and use this class to join the results into a single SDR to give to the algorithms. Example:
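A minimal sketch of the concatenation idea, written in Python for brevity. This is not the proposed C++ class or its API. The function name `concatenate_sdrs` and the dense 0/1 list representation are assumptions made for illustration.

```python
from typing import List, Sequence


def concatenate_sdrs(sdrs: Sequence[Sequence[int]]) -> List[int]:
    """Join dense binary SDRs (sequences of 0/1) end to end, in order."""
    joined: List[int] = []
    for sdr in sdrs:
        joined.extend(sdr)
    return joined


# Hypothetical outputs of two separately created encoders.
temperature = [0, 1, 1, 0]
humidity = [1, 0, 0, 1, 0, 0]

# The combined SDR is what would be handed to the algorithms.
combined = concatenate_sdrs([temperature, humidity])
```

The output size is the sum of the input sizes, and each encoder keeps its own contiguous range of bits.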
SDR-Intersection
This would be useful for working with multidimensional data. The user encodes each dimension separately and then takes the intersection of the resulting SDRs. The result is an SDR where each bit responds to an area of the input space.
Category Encoder
Would be nice to have. |
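The SDR-Intersection idea above can be sketched as a bitwise AND of two equally sized SDRs. This is an illustrative sketch only. The name `intersect_sdrs` and the dense 0/1 list representation are assumptions, not an existing API.

```python
from typing import List, Sequence


def intersect_sdrs(a: Sequence[int], b: Sequence[int]) -> List[int]:
    """Return the bitwise AND of two dense binary SDRs of equal size."""
    if len(a) != len(b):
        raise ValueError("SDRs must have the same size to be intersected")
    return [x & y for x, y in zip(a, b)]


# Hypothetical encodings of the x and y coordinates of one 2D input.
x_bits = [1, 1, 0, 1, 0, 0]
y_bits = [0, 1, 1, 1, 0, 1]

# A bit stays active only if it is active for both dimensions, so it
# responds to a region of the 2D input space rather than a band of it.
area_bits = intersect_sdrs(x_bits, y_bits)
```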
There was some interest in encoders at the forums, hope it'll make our repo more exciting and accessible.
|
We do have a rudimentary encoder: ScalarEncoder.cpp |
This won't work for encoders whose output has dimensions. Imagine a large image with 3 color channels (RGB), where you want to encode each color separately and then combine them into a large SDR with topology. In this situation you need to splice together each pixel's encoded colors. |
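To illustrate the splicing described here, the sketch below concatenates per-channel encodings along the innermost axis so that each pixel keeps its position in the image. The function name and the nested-list layout (height x width x bits) are assumptions for illustration, not part of the library.

```python
from typing import List

Grid = List[List[List[int]]]  # height x width x bits per pixel


def concatenate_along_last_axis(channels: List[Grid]) -> Grid:
    """Splice each pixel's per-channel bits together, preserving topology."""
    height = len(channels[0])
    width = len(channels[0][0])
    result: Grid = []
    for row in range(height):
        out_row = []
        for col in range(width):
            pixel: List[int] = []
            for channel in channels:
                pixel.extend(channel[row][col])
            out_row.append(pixel)
        result.append(out_row)
    return result


# A hypothetical 1 x 2 image with 2 encoded bits per color channel.
red = [[[1, 0], [0, 1]]]
green = [[[0, 0], [1, 1]]]
blue = [[[1, 1], [0, 0]]]

image_sdr = concatenate_along_last_axis([red, green, blue])
```

Joining the flattened channels end to end would instead place all red bits first, then green, then blue, and the bits belonging to one pixel would no longer be adjacent.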
I started a wiki page listing all of the encoders in both C++ & Python repositories, annotated. This wiki page also contains a tentative plan of action for providing a cohesive set of features. https://github.com/htm-community/nupic.cpp/wiki/Encoder-Roundup |
Can Python Encoders use SDRs?
I'd like for the Python encoders to use SDRs, and this brings up an interesting topic: we agreed to merge the pure Python code into this repo, see issue #216. We also agreed that the Python should remain separate from the C++ code. Does it need to be absolutely 100% separate? Or can the Python code make use of the C++ SDR & Connections classes? To answer this, I ask why users might prefer Python:
The downside of integrating SDR into the Encoders is that it adds a new API to the encoder algorithms, which then needs to be supported. This issue won't affect the NetworkAPI. |
My vote would be to encode all encoders in 100% C++.
|
For the most part I agree, but here are a few counter arguments:
|
I don't need ALL of the encoders in C++ but one of my personal objectives is to eventually provide a set of bindings for C#. A C# app using our library is not going to have access to any Python modules. It would be nice to be able to just call into C++ for encoders. Otherwise I would have to duplicate the logic in C#. |
RDSE Algorithm Memo
I hope to change the implementation of the Random Distributed Scalar Encoder (RDSE). Internally, the RDSE transforms a real-valued input into an integer-valued index, and then associates that index with a set of active bits.
Pros:
Cons:
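The two steps described in this memo (real value to integer index, then index to a set of active bits) can be sketched as follows. The hashing scheme is an assumption chosen for illustration, since the memo as preserved here does not say how the index is associated with bits. The name `rdse_encode` and its parameters are hypothetical.

```python
import hashlib
import math
from typing import Set


def rdse_encode(value: float, resolution: float, size: int,
                active_bits: int, seed: int = 0) -> Set[int]:
    """Return the set of active bit positions for a real-valued input."""
    # Step 1: real-valued input -> integer-valued index (bucket).
    index = math.floor(value / resolution)
    # Step 2: index -> active bits. Assumption: each of the consecutive
    # keys index, index + 1, ... is hashed to one bit position, so that
    # neighbouring indexes share most of their keys and overlap.
    active = set()
    for offset in range(active_bits):
        key = f"{seed}:{index + offset}".encode()
        digest = hashlib.sha256(key).digest()
        active.add(int.from_bytes(digest[:8], "big") % size)
    return active


a = rdse_encode(1.0, resolution=0.5, size=400, active_bits=21)
b = rdse_encode(1.4, resolution=0.5, size=400, active_bits=21)  # same bucket
c = rdse_encode(1.5, resolution=0.5, size=400, active_bits=21)  # next bucket
```

Nothing is stored per index in this sketch: the same seed and input always reproduce the same bits. Hash collisions can leave slightly fewer than `active_bits` distinct positions.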
|
I would think that you could still perform a Decode. You are not storing the previously used patterns, but you can recalculate the patterns used, provided you have the starting seed. Just cycle through the used real values until you find one that results in a pattern that matches the one you are trying to decode. Slow, but it would work. Or am I misunderstanding what you are proposing? But then again, decoding is not biological. The only way we know a color is RED is that we match it with another pattern that someone in our experience has told us is RED. The sound of the spoken word "RED" and the written word RED both match the pattern of RED in our experience. Is that decoding? One could argue that encoders are sort of biological, depending on the data being encoded. |
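The brute-force decode suggested here could look roughly like the sketch below. It assumes a deterministic, seeded encoder and a bounded list of candidate inputs. `toy_encode` is a hypothetical stand-in for illustration and is not the RDSE.

```python
import random
from typing import FrozenSet, Iterable, Optional


def toy_encode(index: int, size: int = 64, active_bits: int = 8,
               seed: int = 42) -> FrozenSet[int]:
    """Hypothetical seeded encoder: the same index always gives the same bits."""
    rng = random.Random(seed * 1_000_003 + index)
    return frozenset(rng.sample(range(size), active_bits))


def brute_force_decode(pattern: FrozenSet[int],
                       candidates: Iterable[int]) -> Optional[int]:
    """Re-encode each candidate until one reproduces the given pattern."""
    for index in candidates:
        if toy_encode(index) == pattern:
            return index
    return None  # no candidate in the searched range matches


target = toy_encode(17)
decoded = brute_force_decode(target, range(100))
```

The cost grows linearly with the number of candidates, so the search is only practical when the set of previously used inputs is known or bounded.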
That could be very time-consuming. The range of values for an RDSE is infinite. An alternative to a decode method is to use an SDR classifier. We could even integrate the classifier into the encoder to provide a decode method. It would have to be optional, since the SDR classifier has significant overhead. |
Category Encoders
Category encoders should be implemented as Scalar encoders, which encode an enumeration of the categories using a radius of less than 1. I think we should not implement category encoders, but rather describe to the user how to make them. We would document this in the following places:
Both places already contain a general description of what an encoder is. I think we should add our notes about encoders to these locations. Also, we should add a few unit tests to prove this works. |
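As a rough illustration of the enumeration approach described in this comment, the sketch below maps each category to its integer position and gives every position its own block of bits, which is the effect a radius of less than 1 has on integer inputs: distinct categories share no active bits. The function `encode_category` is hypothetical and simplified. It does not call the library's ScalarEncoder.

```python
from typing import List, Sequence


def encode_category(category: str, categories: Sequence[str],
                    active_bits: int) -> List[int]:
    """Encode a category as a dense SDR with a block of bits of its own."""
    index = list(categories).index(category)  # enumerate the categories
    sdr = [0] * (len(categories) * active_bits)
    start = index * active_bits
    for position in range(start, start + active_bits):
        sdr[position] = 1
    return sdr


colors = ["red", "green", "blue"]
red = encode_category("red", colors, active_bits=2)
blue = encode_category("blue", colors, active_bits=2)

# Number of bits active in both encodings.
overlap = sum(r & b for r, b in zip(red, blue))
```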
Yes, that is sufficient. The Category encoder used to work as a demonstration example, and I guess decoding was easier to implement, but we don't support that anymore.
This, or I can imagine an encoders/README.md with most of the text collected from this PR, issue, etc. |
We will now have the best of both worlds: category encoding implemented "via" a flag to RDSE/Scalar. See #448
I really like the "blog" posts in this issue. Just a note about the wiki, I think it would be even better to make an encoders/README.md with its content.
|
I'd like to close this issue, as well as PR #291. All of the tasks here have either been completed or have been moved to another open issue, except for:
The encoders are documented, tested, and have a few examples, so I'd say this is done. Giving an in-depth explanation is beyond the scope of this project. There is an HTM-School video about how encoders work, as well as a whitepaper. We could put a link to the HTM-School YouTube channel in the README. Great work all around on this issue! |
Closing this issue, please reopen if there is more to discuss. |
We need to port more Encoders to c++ for practical usability of this repo.
Mainly
There's also an issue for Extra encoders for special stuff #259
EDIT: topic https://discourse.numenta.org/t/repo-for-merging-various-encoders/5397
Outstanding Tasks for ScalarEncoder:
I think this task is covered by the python viewing script - @ctrl-z-9000-times
Outstanding Tasks for RDSE:
See PR #278
Outstanding Tasks for CategoryEncoder:
See PR #435