This repository has been archived by the owner on Sep 1, 2023. It is now read-only.

Use optimized linear algebra math libraries #28

Open
1 of 13 tasks
subutai opened this issue Feb 20, 2014 · 15 comments

Comments

@subutai
Member

subutai commented Feb 20, 2014

This super issue plans the workflow for speed optimizations using a specialized linear algebra library.

Benefits:

  • SPEED!
  • less hand-optimized (i.e., hacked) code of our own; cleanup
  • bugs/improvements delegated to lib's upstream
  • better portability
  • use of parallel cores (OpenMP), special CPU instructions (SSE, ...), GPGPU backends

Requirements:

  • usability
    • suitable licence
    • platform support (Linux/Mac/Win; x86_64)
    • convenient installation/bundling with nupic
  • functionality
    • SSE instructions
    • GPGPU backend support (CUDA, OpenCL)
    • parallelism support (OpenMP)
    • sparse matrices
  • programming
    • clean & lean API
    • active development
    • (opt) bindings to other languages we use (Python)

Workflow:

  1. decide on library implementation to use
  2. create profiling/benchmark tools
  3. hello-world use case using the chosen library
  4. focus on the Temporal Pooler - the current bottleneck
  5. Optimize Connections for Temporal memory
  6. Optimize SparseMatrix classes (cleanup, mem reduction)
  7. Optimize other (less significant parts)
    • Optimize Spatial pooler
  8. Misc
@breznak
Member

breznak commented Aug 19, 2014

Is this still an issue, given the optimizations were not that big? Earlier I suggested the multiplatform Eigen library, but I'm not sure we should bother at this time.

@breznak breznak modified the milestones: Bug Reports, Optimization Sep 18, 2014
@breznak breznak changed the title from "Figure out how to add vecLib back in" to "Use optimized linear algebra math libraries" Sep 18, 2014
@rhyolight rhyolight modified the milestone: Optimization Oct 15, 2014
@breznak
Member

breznak commented Feb 26, 2015

relevant: #193 #151

@breznak
Member

breznak commented Feb 26, 2015

@subutai would you mind if I reword the issue a bit?
former description:

subutai commented on Feb 20, 2014
See issue #27. We'd like to possibly add it back in later so tracking it here. Some related web pages:

https://developer.apple.com/library/mac/documentation/Darwin/Reference/ManPages/man7/vecLib.7.html

Before adding it back in we should verify this really gives a performance improvement in real cases. This is doubtful.

@breznak breznak added this to the 0.6.0 milestone Feb 26, 2015
@breznak breznak added the super label Feb 26, 2015
@rhyolight rhyolight modified the milestones: 0.6.0: features, 1.1.0: Future Development Mar 30, 2015
rcrowder added a commit that referenced this issue Dec 16, 2015
@rhyolight
Member

Please review this issue

This issue needs to be reviewed by the original author or another contributor for applicability to the current codebase. The issue might be obsolete or need updating to match current standards and practices. If the issue is out of date, please close. Otherwise please leave a comment to justify its continuing existence. It may be closed in the future if no further activity is noted.

@breznak
Member

breznak commented May 18, 2016

This is still valid, although no one is currently working on porting to linear algebra libraries. I think it should stay open to monitor optimization progress and results.
E.g. the PRs from @mrcslws speeding up the TM could be referenced here for the record.

@rhyolight
Member

Ok, so the issue is still valid, but it is also defined very broadly. It's labeled type:optimization so I'll track it that way, but I think the ticket description needs to be simplified: it's too long and complicated, with too many subjects and TODO items. We need to try to keep our issues simpler and smaller. This could turn into a super issue, but honestly I would rather break it up even further. Something to think about, @subutai.

@subutai
Member Author

subutai commented May 23, 2016

@rhyolight Agreed. The issue is indeed pretty big right now. I think a good first step is to replace the use of sparse matrices in the python spatial pooler, python KNN classifier, and/or optimize the existing C++ SpatialPooler (which is currently not well optimized).

@breznak
Member

breznak commented May 23, 2016

I think a good first step is to replace the use of sparse matrices in the python spatial pooler, python KNN classifier, and/or optimize the existing C++ SpatialPooler (which is currently not well optimized).

@subutai shouldn't the effort focus on the big-impact first? Aka the biggest bottlenecks, which is still TM/TP?

You all will have to please forgive me for my novice understanding of the code (I'm still learning it... slowly), but I wanted to understand what kinds of calculations are being made within nupic that could require a library like Eigen or Armadillo or MKL or OpenBLAS or whatever. Is there massive matrix multiplication going on? Vector multiplication? Even if someone could just point me to proper class/function/file so I could get a better handle on it, I think I could offer up some help with this.

@jshahbazi Sorry, I missed your comment; if you are still interested, we would certainly welcome the help! The logic and operations are in algorithms/Connections.hpp (for TemporalMemory) and in math/{Sparse,Dense}Matrix (for SpatialPooler).

The operations (someone please correct me): vector AND, searching for the N highest entries, indexing and updating weights, ... @scottpurdy @mrcslws @subutai ?

The code can be benchmarked (globally, for a typical use) using #890. Also please weigh in on #948

@subutai
Member Author

subutai commented May 23, 2016

shouldn't the effort focus on the big-impact first? Aka the biggest bottlenecks, which is still TM/TP?

The TM is actually not the biggest bottleneck right now. After changes by @mrcslws it is a pretty small part of the overall profile.

@breznak
Member

breznak commented May 24, 2016

The TM is actually not the biggest bottleneck right now. ...

@subutai not really, it still is (and its code complexity is higher than the SP's)

Please see numenta/nupic-legacy#3131 for my benchmarks:

  • fastest SP (c++ "2D" SP): 0.040 s/call
  • fastest TM/TP (cpp TP): 0.040 s/call
    • fastest TM: 0.158 s/call

The old SP problem I discovered with 1D vs. 2D inputs: #380
Problem with TM speed: #890 (comment)

@breznak
Member

breznak commented May 24, 2016

We need to try to keep our issues simpler and smaller. This could turn into a super issue, but honestly I would rather break it up even further

@rhyolight this IS a super issue with links to sub-issues where possible/active

@breznak
Member

breznak commented May 24, 2016

Added #967 as a proposal that would halve the computation time easily.

@subutai
Member Author

subutai commented May 24, 2016

not really, it still is (even the code complexity compared to SP is higher)

@breznak I will let @mrcslws comment on this. According to Marcus, when you run hotgym, the new TM is a small percentage of the overall profile. Marcus - am I misremembering?

I took a quick look at #3131 and sp_profile. I don't remember seeing this script before, but it looks like the SP parameters in sp_profile are quite off. Why is potentialRadius only 3? It should be much larger to form good SDRs. Same with numActiveColumnsPerInhArea, etc. I think the parameters should be set to realistic values and the profile re-run with those.

@mrcslws
Contributor

mrcslws commented May 24, 2016

I commented on numenta/nupic-legacy#3131 (comment).
