[SPARK-9408] [PySpark] [MLlib] Refactor linalg.py to /linalg #7731

Closed
wants to merge 3 commits into from


MechCoder
Contributor

I refactored linalg.py into a /linalg package so that future additions such as distributed.py can be made easily.

MechCoder changed the title from "[SPARK-9408] [PySpark] Refactor linalg.py to /linalg" to "[SPARK-9408] [PySpark] [MLlib] Refactor linalg.py to /linalg" on Jul 28, 2015
@MechCoder
Contributor Author

@mengxr This breaks code on my machine, but I can't figure out why. :(

@SparkQA

SparkQA commented Jul 28, 2015

Test build #38738 has finished for PR 7731 at commit 200f08f.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@mengxr
Contributor

mengxr commented Jul 28, 2015

@MechCoder I think you need to add __all__ to __init__.py to import the names, e.g., https://github.com/apache/spark/blob/master/python/pyspark/mllib/stat/__init__.py#L27.
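
A minimal sketch of that suggestion, using a throwaway package built in a temporary directory. The names demo_linalg and local.py are hypothetical stand-ins, not the PR's actual files. Note that the import line in __init__.py is what makes the name reachable from the package; __all__ only controls what a star import exports.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical layout (not the PR's files):
#   demo_linalg/__init__.py   re-exports the public names
#   demo_linalg/local.py      holds the actual class
pkg_root = tempfile.mkdtemp()
pkg_dir = os.path.join(pkg_root, "demo_linalg")
os.makedirs(pkg_dir)

with open(os.path.join(pkg_dir, "local.py"), "w") as f:
    f.write(
        "class Vectors(object):\n"
        "    @staticmethod\n"
        "    def dense(values):\n"
        "        return list(values)\n"
    )

with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
    f.write(
        "from demo_linalg.local import Vectors\n"
        "__all__ = ['Vectors']\n"
    )

# Make the temporary package importable and load it.
sys.path.insert(0, pkg_root)
importlib.invalidate_caches()
demo_linalg = importlib.import_module("demo_linalg")

# The old-style import path keeps working through the re-export.
Vectors = demo_linalg.Vectors
dense = Vectors.dense([0.0, 1.0])
```

In this sketch Vectors.__module__ is demo_linalg.local rather than demo_linalg, so any code that hard-codes the old module path may go stale after such a move.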

@MechCoder
Contributor Author

That does not work either. I tried it locally (I pushed it anyway).

from pyspark.mllib.linalg import Vectors
Vectors.dense([0.0, 1.0])

Do you think it's some issue with serialization and deserialization?

@mengxr
Contributor

mengxr commented Jul 28, 2015

Where are VectorUDT and MatrixUDT?

@MechCoder
Contributor Author

I added those, but it still does not work.

I also changed pyUDT to pyspark.mllib.linalg.local.MatrixUDT and @classmethod def module(cls): to pyspark.mllib.local.linalg (not pushed). It should not matter, but it still gives the same error.
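
For context on why such dotted strings can matter, here is a sketch under an assumption: that a string like the pyUDT value is turned into a class by splitting off the last component and importing the rest as a module. This is not Spark's code; resolve_dotted_class is a hypothetical helper, and standard-library names stand in for the real classes.

```python
import importlib


def resolve_dotted_class(dotted_path):
    """Split 'pkg.module.ClassName' at the last dot, import the module, return the class."""
    module_name, _, class_name = dotted_path.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


# A path that matches where the class lives resolves normally.
ordered_dict_cls = resolve_dotted_class("collections.OrderedDict")

# A path whose module part does not exist fails at import time.
try:
    resolve_dotted_class("collections.no_such_module.OrderedDict")
    stale_path_error = None
except ImportError as exc:
    stale_path_error = exc
```

If resolution works this way, every recorded module path has to match the class's new location exactly, so checking each one by hand may be worth the effort.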

@SparkQA

SparkQA commented Jul 28, 2015

Test build #38760 has finished for PR 7731 at commit d870a1a.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dusenberrymw
Contributor

Yeah, I've been working on this as well, and I've run into the same serialization issues.

@mengxr
Contributor

mengxr commented Jul 29, 2015

@MechCoder @dusenberrymw Thanks for testing! I will take a look.
