[SPARK-16566][MLLib] sort sparseVector's indices before doing multiplication#14219
wilson-lauw wants to merge 10 commits into apache:master
Can one of the admins verify this patch?
IMO, the order of the indices in SparseVector potentially affects many other BLAS operations. If we do want to impose the validation, perhaps it should follow a more general, reusable pattern. Still, the most efficient way, as I see it, is to perform the check during construction, maybe something like an optional
val xValues = x.values
val xIndices = x.indices

def isSorted(l:Array[Int]): Boolean = {
Many small style issues here. I think this is best as
private def isSorted(array: Array[Int]): Boolean = {
  var index = 1
  while (index < array.length) {
    if (array(index - 1) > array(index)) {
      return false
    }
    index += 1
  }
  true
}
Thanks for the suggestion, will fix it.
@hhbyyh I just checked, the implementation of
Modifications are also needed in the mllib-local module. Once the changes are okay, I will apply them there as well.
SparseVector indices are assumed to be sorted already. We should not re-sort them during operations. The deeper issue that needs to be fixed is SPARK-14707.
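To make the "check once, at construction" idea from this thread concrete, here is a minimal sketch. It assumes a hypothetical `CheckedSparseVector` class invented for illustration; it is not Spark's `SparseVector` and does not reflect how SPARK-14707 was resolved. The check is a single linear pass, so operations can rely on ordered indices without re-sorting.

```scala
// Hypothetical stand-in for a sparse vector; NOT Spark's SparseVector API.
final class CheckedSparseVector(
    val size: Int,
    val indices: Array[Int],
    val values: Array[Double]) {

  // Validate once, when the vector is built, instead of in every operation.
  require(indices.length == values.length,
    s"indices (${indices.length}) and values (${values.length}) must have the same length")
  require(CheckedSparseVector.isStrictlyIncreasing(indices),
    "indices must be strictly increasing")
  // Safe to look only at head and last because the order was checked above.
  require(indices.isEmpty || (indices.head >= 0 && indices.last < size),
    s"indices must lie in [0, $size)")
}

object CheckedSparseVector {
  // One linear pass; using >= also rejects duplicate indices.
  def isStrictlyIncreasing(array: Array[Int]): Boolean = {
    var index = 1
    while (index < array.length) {
      if (array(index - 1) >= array(index)) return false
      index += 1
    }
    true
  }
}
```

With this sketch, `new CheckedSparseVector(5, Array(2, 0), Array(1.0, 2.0))` fails with an `IllegalArgumentException`, while the same data passed with indices `Array(0, 2)` is accepted. Whether such a check should be mandatory or optional is a performance trade-off that the sketch does not settle.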
Closing this.
What changes were proposed in this pull request?
https://issues.apache.org/jira/browse/SPARK-16566
Sort the SparseVector's indices before doing multiplication to make sure the result is returned correctly.
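As a rough illustration of the proposal, the sketch below sorts the indices together with their values and then runs an operation that relies on ascending indices. The names `SortSparseIndices`, `sortByIndex`, and `dot` are hypothetical and are not taken from the patch or from Spark's BLAS code. The two-pointer dot product is only a stand-in for any routine that assumes sorted indices.

```scala
object SortSparseIndices {
  // Reorder indices and values with the same permutation so indices ascend.
  def sortByIndex(idx: Array[Int], vals: Array[Double]): (Array[Int], Array[Double]) = {
    require(idx.length == vals.length, "indices and values must have the same length")
    val order = idx.indices.sortBy(i => idx(i)).toArray // positions, ordered by index value
    (order.map(i => idx(i)), order.map(i => vals(i)))
  }

  // Two-pointer sparse dot product; correct only if both index arrays ascend.
  def dot(aIdx: Array[Int], aVal: Array[Double],
          bIdx: Array[Int], bVal: Array[Double]): Double = {
    var i = 0
    var j = 0
    var sum = 0.0
    while (i < aIdx.length && j < bIdx.length) {
      if (aIdx(i) == bIdx(j)) {
        sum += aVal(i) * bVal(j)
        i += 1
        j += 1
      } else if (aIdx(i) < bIdx(j)) {
        i += 1
      } else {
        j += 1
      }
    }
    sum
  }
}
```

For example, with indices `Array(3, 0)` and values `Array(2.0, 5.0)` against a second vector with indices `Array(0, 3)` and values `Array(1.0, 10.0)`, calling `dot` on the unsorted input gives 20.0 because the entry at index 0 is skipped. After `sortByIndex` it gives the expected 25.0. Sorting costs O(n log n) on every call, which is the overhead the reviewers objected to.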
How was this patch tested?
Manual testing and existing tests.