Encoders in c++ #258

Open
breznak opened this Issue Feb 9, 2019 · 9 comments

breznak commented Feb 9, 2019

We need to port more encoders to C++ to make this repo practical to use.

Mainly

  • RDSE
  • MultiEncoder

There's also a separate issue for extra, special-purpose encoders: #259

EDIT: related forum topic:
https://discourse.numenta.org/t/repo-for-merging-various-encoders/5397

breznak added the encoder label Feb 9, 2019

ctrl-z-9000-times commented Feb 12, 2019

I'd like to make some SDR-based tools for working with encoders. The fact that we have no existing C++ encoders means that we can make them in whatever way we'd like to.

MultiEncoder via SDR-Concatenator

The MultiEncoder creates a group of encoders and concatenates the results together. I'd like to make an SDR-Concatenator class to do this. The users would create the encoders and use this class to join the results into a single SDR to give to the algorithms. Example:

SDR A  <-  from constituent encoder
SDR B  <-  from constituent encoder
SDR_Concatenator C( A, B, axis=0 )
A.setDense( data )
B.setDense( data )
C.getDense() -> A & B concatenated
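
A minimal, self-contained sketch of the concatenator idea above. It does not use the library's SDR class: `DenseSDR` and `Concatenator` are hypothetical names introduced here for illustration, and only flat (1-D) inputs are handled. The point it shows is that the concatenator keeps references to its inputs, so it can be built first and read after the inputs are filled in.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Stand-in for a dense SDR: one byte per bit, 0 = inactive, 1 = active.
// (Hypothetical alias, not the library's SDR class.)
using DenseSDR = std::vector<unsigned char>;

// Hypothetical concatenator: remembers which inputs to join and builds
// the joined result on demand, so it always reflects their current data.
class Concatenator {
public:
  explicit Concatenator(std::vector<const DenseSDR *> inputs)
      : inputs_(std::move(inputs)) {}

  // Join all inputs end to end, in the order they were given.
  DenseSDR getDense() const {
    std::size_t total = 0;
    for (const DenseSDR *sdr : inputs_) total += sdr->size();

    DenseSDR out;
    out.reserve(total);
    for (const DenseSDR *sdr : inputs_) {
      out.insert(out.end(), sdr->begin(), sdr->end());
    }
    return out;
  }

private:
  std::vector<const DenseSDR *> inputs_;  // non-owning; inputs must outlive this object
};
```

Holding non-owning pointers keeps the sketch short, but it means the input SDRs must outlive the concatenator.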

SDR-Intersection

This would be useful for working with multidimensional data. The user encodes each dimension separately and then takes the intersection of the resulting SDRs. The result is an SDR where each bit responds to an area of the input space.
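
A minimal sketch of the intersection step, assuming both inputs are dense SDRs of the same size. `DenseSDR` and `intersection` are hypothetical names used only for this illustration. Under one reading of the idea: if bit i of the first SDR is active for some range of x, and bit i of the second is active for some range of y, then bit i of the result is active only when both hold, which corresponds to a rectangular area of the (x, y) input space.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Stand-in for a dense SDR: one byte per bit, 0 = inactive, 1 = active.
// (Hypothetical alias, not the library's SDR class.)
using DenseSDR = std::vector<unsigned char>;

// Element-wise AND of two dense SDRs of equal size.
DenseSDR intersection(const DenseSDR &a, const DenseSDR &b) {
  if (a.size() != b.size()) {
    throw std::invalid_argument("SDRs must have the same size");
  }
  DenseSDR out(a.size(), 0);
  for (std::size_t i = 0; i < a.size(); ++i) {
    // A bit stays active only if it is active in both inputs.
    out[i] = static_cast<unsigned char>(a[i] != 0 && b[i] != 0);
  }
  return out;
}
```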

Category Encoder

Would be nice to have.

breznak commented Feb 12, 2019

There was some interest in encoders on the forums; I hope this will make our repo more exciting and accessible.

> The MultiEncoder creates a group of encoders ... make an SDR-Concatenator class to do this.

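  • make it a method of SDR, e.g. SDR::append(vector<SDR> concatenate)?
  • call it a MultiEncoder, and let it do what you've described
  • no need as it's rather easy to do with SDRs now, just show "best practices" as
// Assumes sdr1 and sdr2 are existing SDRs, and that SDR can be constructed from a list of dimensions.
auto concat = sdr1.getDense();  // copy the first SDR's dense data
concat.insert(concat.end(), sdr2.getDense().begin(), sdr2.getDense().end());  // append the second SDR's data
SDR concatenated({ static_cast<UInt>(concat.size()) });  // the output must be sized to hold both inputs
concatenated.setDense(concat);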

dkeeney commented Feb 12, 2019

We do have a rudimentary encoder: ScalarEncoder.cpp
But there is a lot more we could do there. Be sure that we also include a Region implementation that can handle the new encoders. ScalarSensor.cpp is the one for ScalarEncoder. Perhaps a general-purpose region that can handle any type of encoder would be cool.

ctrl-z-9000-times commented Feb 12, 2019

> no need as it's rather easy to do with SDRs now, just show "best practices" as

This won't work for encoders which have dimensions. Imagine a large image with 3 color channels (RGB), and you want to encode each color separately and then combine them into a large SDR with topology. In this situation you need to splice together each pixel's encoded color.
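
To illustrate why plain end-to-end concatenation is not enough here, a minimal sketch of splicing per pixel. It assumes each channel is a flat, row-major dense SDR holding a fixed number of bits per pixel, which amounts to concatenating along the last axis. `DenseSDR` and `spliceChannels` are hypothetical names, not part of the library.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Stand-in for a dense SDR: one byte per bit, 0 = inactive, 1 = active.
// (Hypothetical alias, not the library's SDR class.)
using DenseSDR = std::vector<unsigned char>;

// Splice several per-channel encodings together pixel by pixel.
// Each channel holds `pixels` groups of bits, stored one pixel after another.
// The output keeps every pixel's bits from all channels next to each other.
DenseSDR spliceChannels(const std::vector<DenseSDR> &channels, std::size_t pixels) {
  if (pixels == 0) throw std::invalid_argument("pixels must be positive");

  std::size_t total = 0;
  for (const DenseSDR &ch : channels) {
    if (ch.size() % pixels != 0) {
      throw std::invalid_argument("channel size must be a multiple of the pixel count");
    }
    total += ch.size();
  }

  DenseSDR out;
  out.reserve(total);
  for (std::size_t p = 0; p < pixels; ++p) {
    for (const DenseSDR &ch : channels) {
      const std::size_t bits = ch.size() / pixels;  // bits per pixel in this channel
      const auto first = ch.begin() + static_cast<std::ptrdiff_t>(p * bits);
      out.insert(out.end(), first, first + static_cast<std::ptrdiff_t>(bits));
    }
  }
  return out;
}
```

With end-to-end concatenation the output would hold all red bits, then all green bits, then all blue bits, so the bits describing one pixel would end up far apart and the topology would be lost.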

ctrl-z-9000-times commented Feb 16, 2019

I started an annotated wiki page listing all of the encoders in both the C++ & Python repositories. This wiki page also contains a tentative plan of action for providing a cohesive set of features.

https://github.com/htm-community/nupic.cpp/wiki/Encoder-Roundup

ctrl-z-9000-times commented Feb 16, 2019

Can Python Encoders use SDRs?

I'd like the Python encoders to use SDRs, and this brings up an interesting topic: we agreed to merge the pure Python code into this repo, see issue #216. We also agreed that the Python should remain separate from the C++ code. Does it need to be absolutely 100% separate? Or can Python make use of the C++ SDR & Connections classes? To answer this, I ask why users might prefer Python:

  • Python is easy to set up & install. The C++ is getting a lot better at this, many thanks to David Keeney for his work on CMake and on reducing external dependencies.
  • Python is easy to inspect & interrogate. The SDR & Connections classes have bindings which make this easy to do.
  • Python is easy to experiment with. Python cannot subclass & override C++ bindings, but this limitation can be mitigated by allowing Python to register callbacks for events; the SDR & Connections C++ classes already have such callbacks.
  • Python is easy to use. The C++ SDR is easy to use as well, so I think that integrating the SDR into the Python code will further this goal.

The downside of integrating SDR into the Encoders is that it adds a new API to the encoder algorithms, which then needs to be supported. This issue won't affect the NetworkAPI.

dkeeney commented Feb 16, 2019

My vote would be to encode all encoders in 100% C++.

  • The SDR class is available.
  • The incoming raw data to the encoders can be passed to C++ easily enough.
  • Experimenters that are building apps in 100% C++ can take advantage of these encoders.
  • The encoders become language-independent, because any language can call the C++ routines via bindings: Python, C#, or whatever.

ctrl-z-9000-times commented Feb 16, 2019

> My vote would be to encode all encoders in 100% C++.

For the most part I agree, but here are a few counterarguments:

  • All of the Python encoders are already written & have unit tests.
  • ScalarEncoder - I think we should provide this in every language because it's the simplest example. It's like a "hello world" level of difficulty.
  • SDR-Category - The implementation must use Python's hash() & dict(); can it be written in C++ w/ bindings?
  • delta.py & logarithm.py - Conveniences, not necessary for C++, but since Python already has them, why not use them?
  • date.py - Python's datetime library is too good to give up. datetime.datetime.today() -> (year, month, day, hour, minute, second, day-of-week, day-of-year, daylight-savings, time-zone, GMT-offset)
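
For a sense of the "hello world" level of the ScalarEncoder point above, a minimal sketch of a basic scalar encoding scheme in plain C++. This is not the library's ScalarEncoder: `DenseSDR`, `encodeScalar`, and all of the parameters are hypothetical, and the scheme (a contiguous run of active bits whose position tracks the value, with out-of-range values clamped) is an assumption chosen for simplicity.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Stand-in for a dense SDR: one byte per bit, 0 = inactive, 1 = active.
// (Hypothetical alias, not the library's SDR class.)
using DenseSDR = std::vector<unsigned char>;

// Encode `value` as a run of `activeBits` consecutive active bits inside an
// SDR of `size` bits. The run slides from the first bit (value == minimum)
// to the last possible position (value == maximum).
DenseSDR encodeScalar(double value, double minimum, double maximum,
                      std::size_t size, std::size_t activeBits) {
  if (!(maximum > minimum) || activeBits == 0 || activeBits > size) {
    throw std::invalid_argument("invalid encoder parameters");
  }
  // Clamp out-of-range values to the nearest end of the range.
  const double clamped = std::min(std::max(value, minimum), maximum);
  const double fraction = (clamped - minimum) / (maximum - minimum);
  const auto start = static_cast<std::ptrdiff_t>(
      std::llround(fraction * static_cast<double>(size - activeBits)));

  DenseSDR out(size, 0);
  std::fill(out.begin() + start,
            out.begin() + start + static_cast<std::ptrdiff_t>(activeBits),
            static_cast<unsigned char>(1));
  return out;
}
```

Because the run of active bits moves gradually, nearby values share active bits and distant values share none, which is the property that makes the encoding useful as input to the algorithms.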

dkeeney commented Feb 16, 2019

I don't need ALL of the encoders in C++, but one of my personal objectives is to eventually provide a set of bindings for C#. A C# app using our library is not going to have access to any Python modules. It would be nice to be able to just call into C++ for encoders. Otherwise I would have to duplicate the logic in C#.
