[SPARK-2909] [MLlib] [PySpark] SparseVector in pyspark now supports indexing #4025
ping @jkbradley @mengxr Would be great if you could have a look :)
Test build #25480 has started for PR 4025 at commit
Test build #25480 has finished for PR 4025 at commit
Test FAILed.
Test build #25481 has started for PR 4025 at commit
@@ -510,6 +510,22 @@ def __eq__(self, other):
                 and np.array_equal(other.indices, self.indices)
                 and np.array_equal(other.values, self.values))

+    def __getitem__(self, item):
Shall we rename `item` to `index`?
Test build #25481 has finished for PR 4025 at commit
Test PASSed.
@mengxr I've addressed your comments. Btw, should we use similar logic for the Scala code? Right now it seems to convert the vector into a dense vector, which I'm not sure is advisable.
Test build #25503 has started for PR 4025 at commit
Test build #25503 has finished for PR 4025 at commit
Test PASSed.
Also, I'm thinking out loud about whether it's worthwhile to implement
This looks good to me, thanks!
Thanks, has this been pushed to master, so that I can close it?
It's not merged yet; GitHub will close it once it gets merged.
LGTM. @MechCoder The Scala code uses Breeze's index lookup, which uses bisection as well. You can try implementing bisection in MLlib and then running a micro-benchmark. If there is a big difference, we will keep the implementation in MLlib.
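For reference, a micro-benchmark of the kind suggested above could look roughly like the following. This is only a Python sketch of the idea, not the Scala/Breeze comparison itself: `lookup_bisect`, `lookup_dense`, and `benchmark` are hypothetical helper names, and the sizes and repeat counts are arbitrary assumptions.

```python
import bisect
import timeit

import numpy as np


def lookup_bisect(size, indices, values, i):
    """Look up element i by bisecting the sorted index array."""
    pos = bisect.bisect_left(indices, i)
    if pos < len(indices) and indices[pos] == i:
        return values[pos]
    return 0.0  # index not stored, so the element is an implicit zero


def lookup_dense(size, indices, values, i):
    """Look up element i by first materialising a dense array."""
    dense = np.zeros(size)
    dense[indices] = values
    return dense[i]


def benchmark(size=100000, nnz=1000, repeats=200, seed=0):
    """Time both lookups on one random sparse vector (hypothetical setup)."""
    rng = np.random.RandomState(seed)
    indices = np.sort(rng.choice(size, nnz, replace=False))
    values = rng.rand(nnz)
    i = int(indices[nnz // 2])  # query an index that is actually stored
    t_bisect = timeit.timeit(
        lambda: lookup_bisect(size, indices, values, i), number=repeats)
    t_dense = timeit.timeit(
        lambda: lookup_dense(size, indices, values, i), number=repeats)
    return t_bisect, t_dense


if __name__ == "__main__":
    t_bisect, t_dense = benchmark()
    print("bisect: %.6fs  dense: %.6fs" % (t_bisect, t_dense))
```

The timings depend heavily on vector size and sparsity, so it would be worth varying `size` and `nnz` before drawing any conclusion.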
Merged into master. Thanks!
This is slightly different from the Scala code, which converts the SparseVector into a DenseVector and then checks the index.
I also hope I've added tests in the right place.
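To illustrate the idea, here is a minimal sketch of indexing a sparse vector without densifying it. It is not the code from this PR: `SparseVectorSketch` is a hypothetical stand-in for the real class, it assumes `indices` is sorted in ascending order, and the negative-index handling is an assumption made for the example.

```python
import numpy as np


class SparseVectorSketch(object):
    """Hypothetical stand-in for a sparse vector with sorted indices."""

    def __init__(self, size, indices, values):
        self.size = int(size)
        self.indices = np.asarray(indices, dtype=np.int32)
        self.values = np.asarray(values, dtype=np.float64)

    def __getitem__(self, index):
        if not isinstance(index, (int, np.integer)):
            raise TypeError("index must be an integer, got %r" % type(index))
        # Assumption: negative indices count from the end, as with lists.
        if index < 0:
            index += self.size
        if index < 0 or index >= self.size:
            raise IndexError("index out of bounds")
        # Binary search over the stored indices instead of building a
        # dense array of length `size`.
        pos = np.searchsorted(self.indices, index)
        if pos < len(self.indices) and self.indices[pos] == index:
            return float(self.values[pos])
        return 0.0  # not stored, so the element is an implicit zero


sv = SparseVectorSketch(4, [1, 3], [3.0, 4.0])
print(sv[1], sv[0], sv[-1])
```

The lookup costs O(log nnz) per access and allocates nothing proportional to `size`, which is the main reason to avoid the dense conversion.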