@bhargav
Contributor

@bhargav bhargav commented Oct 12, 2015

…rror message

For negative indices in the SparseVector, we update the index value. If the index is still out of
range at that point, the error message reports the updated index instead of the original one. This
change fixes that.
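
The behavior described above can be illustrated with a minimal sketch. The two functions below are hypothetical stand-ins operating on a plain list, not the actual PySpark `SparseVector` code, and the exception type and message text are assumptions made for illustration. The first rewrites a negative index before the range check, so the message shows the rewritten value; the second keeps a copy of the caller's index for the message.

```python
def get_item_before_fix(size, values, index):
    """Hypothetical stand-in for the lookup; not the PySpark implementation."""
    if index < 0:
        index += size  # the negative index is rewritten in place
    if index >= size or index < 0:
        # The message uses the rewritten index, not the one the caller passed.
        raise ValueError("Index %d out of bounds." % index)
    return values[index]


def get_item_after_fix(size, values, index):
    """Same lookup, but the error message reports the caller's original index."""
    original = index  # keep a copy before the index is rewritten
    if index < 0:
        index += size
    if index >= size or index < 0:
        raise ValueError("Index %d out of bounds." % original)
    return values[index]
```

With a vector of size 3 and index -5, the first function would complain about index -2, while the second reports -5, which is what the caller actually wrote.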

@JoshRosen
Contributor

I've been seeing a lot of bugfix patches for the PySpark SparseVector recently. This suggests to me that we need to write significantly more tests for this component.

@bhargav
Contributor Author

bhargav commented Oct 12, 2015

@jkbradley for this.

If there is a task for adding more tests, I can take that up as well.

@JoshRosen
Contributor

It'd be really fun to try using Hypothesis to write some property-based tests for this: https://hypothesis.readthedocs.org/en/master/

@jkbradley
Member

@JoshRosen I think it's that PySpark has less coverage than Scala, and our linear algebra code could use some more coverage too. And PySpark SparseVector is in the intersection of these problems. (Though 4 of those bug fix patches may have been patch + 3 backports.)

Hypothesis looks cool...future work? : )

@bhargav There is not a specific task for adding more tests, but since you're interested, it'd be awesome if you could check through some of the PySpark Vector and Matrix APIs and unit tests and see if you can find missing coverage. Note also that the tests are split between doc tests (in each .py file) and unit tests (in tests.py files). If you find missing items, can you please make one or more JIRAs? If you're unsure about any, I'd recommend making a single JIRA and listing them there; we can then create subtasks for each major issue. Thanks!

Member

It might be better to move this check to line 768, before index gets modified. That way, you don't have to even create a copy of index (though you'll need to adjust the check's index range).
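
A minimal sketch of this suggestion, under the same caveats (a hypothetical helper over a plain list, with an assumed exception type and message, not the merged PySpark code): the range check runs on the caller's index before it is modified, so no copy is needed, and the lower bound of the check becomes `-size` instead of `0`.

```python
def get_item_check_first(size, values, index):
    """Hypothetical helper: validate first, then normalize the index."""
    # Valid caller-facing indices are in [-size, size).
    if index >= size or index < -size:
        raise ValueError("Index %d out of bounds." % index)
    if index < 0:
        index += size  # safe: the result is now guaranteed to be in [0, size)
    return values[index]
```

Because the check happens before the rewrite, the message naturally reports the index the caller passed.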

@SparkQA

SparkQA commented Oct 12, 2015

Test build #1882 has finished for PR 9069 at commit 316acac.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class ChildProcAppHandle implements SparkAppHandle
    • abstract class LauncherConnection implements Closeable, Runnable
    • final class LauncherProtocol
    • static class Message implements Serializable
    • static class Hello extends Message
    • static class SetAppId extends Message
    • static class SetState extends Message
    • static class Stop extends Message
    • class LauncherServer implements Closeable
    • class NamedThreadFactory implements ThreadFactory
    • class OutputRedirector
    • public final class UnsafeRow extends MutableRow implements Externalizable, KryoSerializable
    • /** Run a function within Hive state (SessionState, HiveConf, Hive client and class loader) */

@bhargav
Contributor Author

bhargav commented Oct 12, 2015

Updated the PR. Though I now see that the branch name is different from the JIRA issue number.

@bhargav
Contributor Author

bhargav commented Oct 13, 2015

@jkbradley Gentle ping. :)

@jkbradley
Member

@bhargav Thanks for the update! LGTM pending tests

@jkbradley
Member

Btw, it doesn't matter what you call your branch name; that's not a problem.

@SparkQA

SparkQA commented Oct 16, 2015

Test build #1916 has finished for PR 9069 at commit 19bc764.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@jkbradley
Member

Merging with master

@asfgit asfgit closed this in 1ec0a0d Oct 16, 2015