What HTM implementation will we use? #2

Closed
rhyolight opened this issue Dec 7, 2015 · 15 comments
Comments

@rhyolight
Contributor

I started this discussion already, let's finish it here.


Ideas

  1. You could use HTM Engine and the HTTP interface provided by the
    skeleton app to get started [2]. This might be really easy, and
    provide some scalability right off the bat. The work would mostly just
    be figuring out the Docker configuration for deployment. However, it
    would not allow users to provide model params, and adding this
    functionality would need to be done in the HTM Engine itself (others
    want this too [3]).
  2. You could use the simple HTTP wrapper around NuPIC provided by
    Jared Weiss [4]. It is just a SimpleHttpServer but it would be a fast
    prototype. Again, the work would be deployment configuration.
  3. You could use HTM-Moclu [5] and HTM.Java. I haven't been able to
    get this running without multiple servers (because Akka), but someone
    with the know-how could take it on.
  4. You could wrap a Java HTTP server over HTM.Java.
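
To make options 2 and 4 more concrete, here is a minimal sketch of what "an HTTP wrapper around an HTM implementation" could look like. It is written in Python 3 with only the standard library. `StubModel`, the `/run` route, and the JSON payload shape are assumptions made for illustration; none of them come from NuPIC, HTM Engine, or HTM.Java, and the stub does no HTM computation.

```python
import json
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer


class StubModel:
    """Hypothetical stand-in for an HTM model; it does no real HTM work."""

    def __init__(self):
        self.count = 0

    def run(self, record):
        # A real wrapper would hand the record to the chosen HTM implementation here.
        self.count += 1
        return {"step": self.count, "input": record}


MODEL = StubModel()


class ModelHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Only the (assumed) /run route is handled in this sketch.
        if self.path != "/run":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        record = json.loads(self.rfile.read(length) or b"{}")
        body = json.dumps(MODEL.run(record)).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # silence per-request logging in the sketch


def start_server(port=0):
    """Start the wrapper on a background thread; port 0 picks a free port."""
    server = HTTPServer(("127.0.0.1", port), ModelHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def post_record(port, record):
    """Send one JSON record to /run and return the decoded JSON reply."""
    conn = HTTPConnection("127.0.0.1", port, timeout=5)
    conn.request("POST", "/run", body=json.dumps(record),
                 headers={"Content-Type": "application/json"})
    reply = json.loads(conn.getresponse().read())
    conn.close()
    return reply
```

Whichever implementation is picked, most of the work named in the list (deployment configuration, accepting model params) sits outside a wrapper like this one.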
@rhyolight
Contributor Author

Keep in mind that this choice will affect our options for HTTP server technology. If we pick NuPIC, we are stuck with Python.

@rhyolight changed the title from "Decide what HTM implementation to use" to "What HTM implementation will we use?" on Dec 7, 2015
@rhyolight
Contributor Author

NuPIC / HTM Engine

Pros

  • It scales horizontally already, so running a couple hundred models should be ok.
  • The skeleton app @oxtopus made already provides a simple REST interface over top of HTM Engine.
  • It is maintained by Numenta, so changes would be quality controlled by Numenta engineers.
  • I could start using it very quickly for anomaly models monitoring IoT devices and public data streams.
  • Minimal work to get an MVP (just need to make a deployment plan).

Cons

  • It was not programmed to do predictions, so a new feature must be added to get there (see https://github.com/numenta/numenta-apps/issues/504 and https://github.com/numenta/numenta-apps/issues/503).
  • HTM Engine itself has a very nebulous interface via RabbitMQ and SQL. It is not well documented, but we will need to understand it if we're going to create a decent REST API over top of it.
  • It is maintained by Numenta, so changes would need to be reviewed and approved by Numenta engineers. This could slow down development.

@cogmission
Member

#4 - Of course I think the Java ecosystem will be the most manageable and flexible and easiest to extend, going into the future. This (networked communication solutions) is what Java was made for.

However, I acknowledge that we would have to start from scratch as opposed to HTM Engine reuse. My opinion is that it is worth it.

This is your baby @rhyolight, so I don't want to get in the way; whatever your "leanings" are is fine with me. I just think that, going forward, we must admit that a Java solution is a more "permanent" solution, but it depends on how highly you prioritize completing this as quickly as possible. Going with NuPIC / HTM Engine will probably get you there faster, but is that the most important criterion? Also, once the Java version is built, it will be the easiest and quickest to extend and maintain.

Regardless, this is the direction HTM.java is going in the near future, so it's not like doing this in Python will prevent the eventual development of this in Java. However, HTM.java could really benefit from the impetus created by this project to push its state forward to be more compatible with the Python version (which is best for everyone long term, because the Java version will most likely be the canonical version in the future).

Also, if we choose a Java ecosystem I will be able to commit my time to this project, which is not a big consideration, I'm just putting that out there so there is no confusion as to whether I am "in" or not if Java is chosen.

@codedhard

This is a step toward HTM with hierarchy.

I'm for option 2. SimpleHttpServer is "low level".
The transactions must be quick, so they need to use UDP, not TCP.
Optionally, both sides can be servers (the problem would be port forwarding); this way the client doesn't have to wait for the answer from the server.

Protocol:
/reset/
/model/params={ model params ... }
/subscribe/code=...
/run/field_name1=1&field_name2=2&field_name3=3
/run/val[]=1&val[]=3&val[]=3

/reset/
Resets the HTM.
/model/
Loads the model params.
/subscribe/code
The code is executed on each call to /run/, and its output is returned to the client.
/run/
Runs one step, plus the code registered via /subscribe/code.
field_name1 is an input field name.
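
As a rough illustration of how a server might decode the paths proposed above, here is a small Python 3 sketch. The function names `parse_request` and `run_inputs` are hypothetical, and the choice to treat everything after the command as `&`-separated `name=value` pairs (with `val[]` collecting repeated values) is an assumption read off the examples, not an agreed spec.

```python
from urllib.parse import parse_qs


def parse_request(path):
    """Split a path such as '/run/a=1&b=2' into (command, params)."""
    # The first path segment is the command; the rest is treated as a query string.
    command, _, query = path.strip("/").partition("/")
    params = parse_qs(query, keep_blank_values=True)
    return command, params


def run_inputs(params):
    """Convert /run/ parameters to floats; names ending in '[]' keep every value."""
    inputs = {}
    for name, values in params.items():
        numbers = [float(v) for v in values]
        inputs[name] = numbers if name.endswith("[]") else numbers[-1]
    return inputs


if __name__ == "__main__":
    print(parse_request("/reset/"))
    command, params = parse_request("/run/field_name1=1&field_name2=2&field_name3=3")
    print(command, run_inputs(params))
    print(run_inputs(parse_request("/run/val[]=1&val[]=3&val[]=3")[1]))
```

One limitation of this reading: model params sent through `/model/params=...` would need URL encoding, since a raw `&` inside the value would be split into separate parameters.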

@cogmission
Member

Also, in keeping with the most current methodologies inspired by "functional" programming, the client should be built as an "event-driven" interface. That is, the client "reacts" to incoming data rather than blocking and waiting. This involves separating the task that submits the incoming stream of data from the client task that processes the output stream. (Of course they could be combined if a client explicitly does this, but my thought is that it probably should not be built as a synchronous call.)

This will keep it flexible and able to parallelize and maximize throughput.
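
A minimal sketch of that separation, using Python 3's `asyncio`: one task submits records, another reacts to results, and neither waits on the other per record. The `fake_model` stage, the `None` end-of-stream sentinel, and the `"echo"` field are placeholders invented for this example; a real model would sit where `fake_model` does.

```python
import asyncio


async def fake_model(inputs, outputs):
    """Hypothetical model stage: emits one result per input record."""
    while True:
        record = await inputs.get()
        if record is None:              # sentinel: the input stream has ended
            await outputs.put(None)
            return
        await outputs.put({"input": record, "echo": record * 2})


async def submit(inputs, stream):
    """Submitting task: pushes records without waiting for any result."""
    for record in stream:
        await inputs.put(record)
    await inputs.put(None)


async def react(outputs, handler):
    """Client task: calls the handler for each result as it arrives."""
    while True:
        result = await outputs.get()
        if result is None:
            return
        handler(result)


async def main(stream):
    inputs, outputs = asyncio.Queue(), asyncio.Queue()
    seen = []
    # All three tasks run concurrently; the queues decouple them.
    await asyncio.gather(
        fake_model(inputs, outputs),
        submit(inputs, stream),
        react(outputs, seen.append),
    )
    return seen


if __name__ == "__main__":
    print(asyncio.run(main([1, 2, 3])))
```

Because the submitter and the reactor only share queues, either side could later be moved to a different process or transport without changing the other.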

Re: UDP - I have heard of a UDP-based alternative to TCP that is "reliable", but I can't remember its name right now. But wouldn't the use of UDP introduce indeterminacy? Is this something we can abide?

@jefffohl
Member

jefffohl commented Dec 7, 2015

@cogmission RUDP? Is this what you are thinking of? https://en.wikipedia.org/wiki/Reliable_User_Datagram_Protocol

@cogmission
Member

@jefffohl I think so... yes. My ex-company used to use this to deliver financial market data, I believe. (and I think the Chicago Mercantile Exchange uses this in some capacity too, but to be honest I came by this information indirectly). Anyway, it's just a thought I wanted to put on the table...

@rhyolight
Contributor Author

Let's not get ahead of ourselves. Please let's focus on HTTP and keep discussion on this issue about which HTM to use.

Sent from my MegaPhone

@cogmission
Member

@rhyolight Aye, will do...

@rhyolight
Contributor Author

If we go HTM.Java, who is willing to work on the project?

If we go NuPIC on Python, who is willing to work on the project?

Ideally, I'd like to assign a team leader to the project either way who can discuss design decisions, triage bugs and feature requests, and manage other people working on the project. Please volunteer below if you are interested in either working on or leading the project (and specify JVM or Python).

@jefffohl
Member

jefffohl commented Dec 8, 2015

I am willing to work on either platform. Note that I am kind of noobish at both Python and Java (in my day job, I use mostly PHP and JavaScript), so I would definitely not be a team lead. Mainly, I just want to find a way to be of help because the project is very interesting to me.

@rhyolight
Contributor Author

You're a good man, Jeff Fohl.

Sent from my MegaPhone

@rhyolight
Contributor Author

Been chatting with @JonnoFTW on Gitter; he seems willing to take the lead with a NuPIC approach... Either way, I mapped out an intentionally minimal spec. If you build it, I will deploy it and use it right away, so I will find your bugs. Better to find them fast. 💥

@rhyolight
Contributor Author

In case you missed it, we are using NuPIC.

@cogmission
Member

Hi Guys,

I fell asleep and just woke up now. My sleep patterns have been very erratic of late. I would have been happy to lead the project in the Java case, just to let you know.

Cheers,
David
