primitive float type #45
float is used in some parts of ojAlgo, but it is not available with the optimisation solvers, and it won't be easy to add. Further, I'm not convinced it would be much faster. For larger models you could potentially save a lot of memory, and that would improve L1/L2/L3 cache usage. There may be other things to do. Do you have inequality constraints? Are all problems the same size (same number of variables AND constraints)? Do you use ExpressionsBasedModel? |
We do have inequality constraints (all x >= 0). If it helps, we can even make a reasonable guess at upper bounds for x too. I've been doing some thinking, though, and I suspect that floats would probably achieve somewhere between a 50% and 100% improvement, though I understand that it may require a fair bit of dev (or, more importantly, testing).
The problems are not exactly all the same size, but your question gave me a pretty big idea. Although there are ~100 million matrices A to optimize (the equation is the standard least squares --> min ||Y - A.X||^2, with X >= 0), all (or maybe 90%) of the 100 million matrices A could probably be represented by a handful (e.g. ~40 to ~100) of distinct matrices. (The observations Y would all vary, but at least AtA and, more importantly, its inverse could be cached and so would not need to be recomputed.) This idea occurred to me once you mentioned "Are all problems the same size". (Also, the x >= 0 constraint is the same, though the size of x varies, of course, from problem to problem.) Thoughts on the idea of caching AtA and its inverse? For caching the inverse, though, I presume I would need to modify the solver code to "handle caching"? Is there a "recommended" approach that's clean, or does it ultimately require copying the code for that class and performing the needed surgery?
On a different note, I like the "premultiply" options. However, I couldn't figure out how to do a similar "preTranspose" operation. Specifically, after I create A, I wish to do something like "preTranspose" A, but it doesn't seem possible to do so. (Matrix A is generated through a PrimitiveDenseStore.Factory.rows(...) statement.) Is it possible? (One workaround is to premultiply matrix A with the identity matrix and then do the transpose, but that's an unnecessary multiplication.) Also, on a related note, if I understood the code correctly, the matrices are stored in column-major order, right?
So would A.x (where A is a matrix and x is a vector) be slightly less efficient (due to caching) than if matrix A were stored in row-major order like, say, EJML? (I'm just trying to understand at this stage, so no big deal if this question is esoteric.) Finally, how would ExpressionsBasedModel improve performance in this case compared to simply using ConvexSolver.Builder()? As near as I can tell, the ExpressionsBasedModel will call the ConvexSolver, but in my case it seems both simpler and more direct (possibly even faster) to just supply matrices Q and C directly to the convex solver rather than (artificially) introducing an intermediary layer via ExpressionsBasedModel (which would require some more coding to translate matrix A etc. into its format). Thoughts? Thanks in advance, apete. |
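The caching idea above can be made concrete. The following is a minimal, self-contained sketch in plain Java (hypothetical class and key names, ordinary arrays instead of ojAlgo types): each distinct A is identified by a key, and AtA = A^T * A is computed at most once per key. The inverse (or a matrix decomposition) could be cached the same way.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of caching A^T * A per distinct matrix A. The class name and the
// string-key scheme are hypothetical; plain double[][] arrays stand in for
// ojAlgo matrix types to keep the sketch self-contained.
public final class NormalEquationCache {

    private final Map<String, double[][]> cache = new HashMap<>();

    // Returns AtA = A^T * A for the matrix identified by key, computing it
    // only the first time that key is seen.
    public double[][] getAtA(String key, double[][] a) {
        return cache.computeIfAbsent(key, k -> multiplyAtA(a));
    }

    private static double[][] multiplyAtA(double[][] a) {
        int m = a.length;     // rows of A
        int n = a[0].length;  // columns of A
        double[][] ata = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                double sum = 0.0;
                for (int k = 0; k < m; k++) {
                    sum += a[k][i] * a[k][j];
                }
                ata[i][j] = sum;
            }
        }
        return ata;
    }

    public static void main(String[] args) {
        double[][] a = { { 1.0, 2.0 }, { 3.0, 4.0 } };
        NormalEquationCache cache = new NormalEquationCache();
        double[][] ata = cache.getAtA("A-variant-1", a);
        // A^T * A = [[10, 14], [14, 20]]
        System.out.println(ata[0][0] + " " + ata[0][1] + " " + ata[1][1]);
    }
}
```

With ~40 to ~100 distinct matrices serving ~100 million problems, the per-problem cost reduces to the parts that actually depend on Y.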
I’d think hard about similarities between those 100 million cases, and how those can be exploited to eliminate work.
Can’t you let the solver instances be the cache?
I asked about ExpressionsBasedModel because I think in your case you should probably not use it.
… On 18 Nov 2017, at 19:49, gsaxena888 ***@***.***> wrote: [quoted text of the previous comment omitted] |
To answer your original question... The requirements to make the ConvexSolver implementations optionally work with float rather than double have little to do with the solvers themselves. What's needed is a PhysicalStore.Factory implementation that produces suitable DecompositionStore instances (based on float[]). That factory would then be used when instantiating the solvers, and in turn when instantiating the various internal matrix decomposers and solvers. Copy-and-paste and search-and-replace on PrimitiveDenseStore will get you underway:
public final class Primitive32DenseStore extends Primitive32Array implements PhysicalStore<Float>, DecompositionStore<Float> {
The Primitive32Array you need to extend already exists. Then there is a collection of supporting classes like FunctionSet, AggregatorSet and Scalar.Factory. To make everything work you also need to "copy-and-paste and search-and-replace" one method in each of the classes in the org.ojalgo.matrix.store.operation package. Then I assume there will be cases where one would have to reconsider various interface declarations... |
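To illustrate the copy-and-paste pattern being described, here is a much-simplified, self-contained sketch. The `Store` interface below is hypothetical and only stands in for ojAlgo's PhysicalStore/DecompositionStore hierarchy; the point is that the float-backed class is a mechanical element-type swap of the double-backed one.

```java
// Hypothetical, simplified stand-in for ojAlgo's store interfaces;
// only the double -> float swap pattern is what the real work looks like.
interface Store<N extends Comparable<N>> {
    N get(long index);
    void set(long index, N value);
    long count();
}

// double-backed store, analogous to PrimitiveDenseStore
final class Primitive64Store implements Store<Double> {
    private final double[] data;
    Primitive64Store(int size) { data = new double[size]; }
    public Double get(long i) { return data[(int) i]; }
    public void set(long i, Double v) { data[(int) i] = v; }
    public long count() { return data.length; }
}

// float-backed counterpart, obtained by search-and-replace on the above
final class Primitive32Store implements Store<Float> {
    private final float[] data;
    Primitive32Store(int size) { data = new float[size]; }
    public Float get(long i) { return data[(int) i]; }
    public void set(long i, Float v) { data[(int) i] = v; }
    public long count() { return data.length; }
}

public class StoreSketch {
    public static void main(String[] args) {
        Store<Float> store = new Primitive32Store(4);
        store.set(0, 1.5f);
        System.out.println(store.get(0) + " " + store.count());
    }
}
```

In the real library the bulk of the effort is in the supporting classes (FunctionSet, AggregatorSet, Scalar.Factory) and the operation package, as described above, not in the store itself.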
Awesome! Thank you. What about the last 2 remaining questions:
|
Not sure I understand the problem. Can't you just call matrix.transpose()? ojAlgo's matrix multiplication implementations do not differentiate between vector and matrix arguments, and they've proven to be quite efficient. |
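On the column-major question raised earlier: column-major storage need not make A.x slower, because the product can be accumulated column by column (axpy style) so that memory is read contiguously, instead of row by row with strided reads. A minimal plain-Java illustration of that loop ordering (not ojAlgo's actual kernel):

```java
// y = A * x with A stored column-major in a flat array. Traversing one
// column at a time keeps all reads of A contiguous in memory.
public class ColumnMajorMatVec {

    // m rows, n columns; aColMajor has length m * n
    static double[] multiply(double[] aColMajor, int m, int n, double[] x) {
        double[] y = new double[m];
        for (int col = 0; col < n; col++) {
            double xj = x[col];
            int offset = col * m; // start of this column in the flat array
            for (int row = 0; row < m; row++) {
                y[row] += aColMajor[offset + row] * xj; // contiguous access
            }
        }
        return y;
    }

    public static void main(String[] args) {
        // A = [[1, 2], [3, 4]] stored column-major: [1, 3, 2, 4]
        double[] a = { 1.0, 3.0, 2.0, 4.0 };
        double[] y = multiply(a, 2, 2, new double[] { 1.0, 1.0 });
        System.out.println(y[0] + " " + y[1]);
    }
}
```

The same trick applies symmetrically: row-major layouts favour the dot-product ordering, so neither layout is inherently worse for matrix-vector products.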
As far as I know your questions have been answered. I have created a project here at GitHub to review matrix element type support: https://github.com/optimatika/ojAlgo/projects/2 |
I recently learnt about ojAlgo, and I did a few quick tests / poked around the code (for convex optimizations). It looks really, really (really) impressive. One question, though: since we're doing lots of high-performance computing (tens of millions of tiny convex optimizations, e.g. matrix size ~20), my initial thought is that if ojAlgo provided a "primitive float" matrix representation (instead of double), we could relatively easily achieve up to a 2x performance gain (or am I missing something?). I calculated the "up to 2x" because I believe floats use half the CPU cycles (and memory) of doubles; and even if some portion of the CPU time is spent doing something other than actual computations (e.g. dealing with loop counters, memory pointers, etc.), the fact that the memory space required is halved may have significant benefits for L1/L2/L3 caching etc. (I'm also assuming that there may be some semi-automated way to generate all the needed classes by substituting "float" for "double" everywhere.) Thoughts?
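The memory half of that 2x estimate is easy to make concrete: a float is 4 bytes and a double 8, so a float-backed matrix of the same shape occupies exactly half the memory, which is what matters for keeping many small (~20x20) problems resident in L1/L2/L3 cache. A trivial check:

```java
// Element sizes for one ~20x20 matrix: 8 bytes per double vs 4 per float,
// so the float version occupies half the memory.
public class ElementSizes {
    public static void main(String[] args) {
        int n = 20;
        long doubleBytes = (long) n * n * Double.BYTES; // 400 * 8
        long floatBytes = (long) n * n * Float.BYTES;   // 400 * 4
        System.out.println(doubleBytes + " " + floatBytes);
    }
}
```

Whether the arithmetic itself also runs 2x faster is hardware-dependent (scalar float and double ops often cost the same per instruction; vectorized code can process twice as many floats per SIMD register), which is why the cache argument is the more reliable part of the estimate.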