New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-16499][ML][MLLib] improve ApplyInPlace function in ANN code #14156
[SPARK-16499][ML][MLLib] improve ApplyInPlace function in ANN code #14156
Conversation
Test build #62172 has finished for PR 14156 at commit
|
cc @srowen thanks! |
Is there reasonable evidence this speeds things up? just want to make sure this does not make it slower. Help me understand the := operator? I don't recognize how it's helping compute y as a function of x here. I assume the method below can't use the same mechanism? |
@srowen The mechanism of := operator for DenseMatrix is that the DenseMatrix contains an implicit member which implements |
I see, this copies x to y then modifies y in place. OK. Is that more efficient? it seems like extra work, but does the transform method make up for it? just seeing if this has actually been observed to speed it up or not. |
yeah, currently it seems to make a little overhead (do a copy), but I think it will take advantage of breeze optimization, in the future, e.g, SIMD instructions or something ? |
That's the question indeed. I'm not sure because the function that's supplied could be anything. I don't see how it could automatically be converted to a vectorized operation automatically. |
@srowen yeah, the function supplied here called cannot be turned into SIMD instructions but I think it can do some parallelization optimization on large matrix, for example we can split the matrix into several blocks and executed the "in place transform" in parallel way, although it haven't added in breeze currently. for example, currently in scala, |
I get though sounds like there is not necessarily any such optimization now and actually not sure there can be. It could even be slower; it introduces an extra copy. It is somewhat harder to understand and different from its sibling method. I'm not sure we should do this until it is a demonstrable benefit. |
@srowen OK I close the pr for now if I found better way to optimize it I will reopen it, thanks! |
What changes were proposed in this pull request?
I re-code the following fuction using breeze's matrix operating function.
def apply(x: BDM[Double], y: BDM[Double], func: Double => Double): Unit
How was this patch tested?
Existing test.