[SPARK-5814][MLLIB][GRAPHX] Remove JBLAS from runtime #4699

Closed
wants to merge 8 commits into from

@mengxr mengxr commented Feb 20, 2015

The issue is discussed in https://issues.apache.org/jira/browse/SPARK-5669. Replacing all JBLAS usage with netlib-java gives us a simpler dependency tree and fewer license issues to worry about. I didn't touch the test scope in this PR. The user guide is left unmodified to avoid merge conflicts with branch-1.3. @srowen @ankurdave @pwendell

@SparkQA

SparkQA commented Feb 20, 2015

Test build #27757 has started for PR 4699 at commit e53e9f4.

  • This patch merges cleanly.

<dependency>
<groupId>net.sourceforge.f2j</groupId>
<artifactId>arpack_combined_all</artifactId>
<version>0.1</version>
Member

Should this be runtime-only scope? Is this basically a pure Java backend to netlib? I checked the .jar and it seems to be all Java. I double-checked, and it is BSD licensed. Since license issues are front of mind in this change, could you add a line for this new lib in the LICENSE file under the other BSD dependencies?

Contributor Author

Some netlib-java BLAS routines use org.netlib.intW, which is not part of com.github.fommil.netlib but of arpack_combined_all. In the netlib:core pom file (https://repo1.maven.org/maven2/com/github/fommil/netlib/core/1.1.2/core-1.1.2.pom), arpack_combined_all is specified as a dependency, but I cannot run the tests without specifying it explicitly in the pom. Maybe this is because it also includes arpack_combined_all:javadoc as a dependency, so sbt/maven doesn't resolve the dependency correctly.

Breeze already depends on netlib-java and arpack_combined_all. Should we already have something under LICENSE?

@srowen
Member

srowen commented Feb 20, 2015

The changes look like what I'd expect. I haven't reviewed every replacement. The only real risk in merging for 1.3 is that there is a typo in the translation, but let's see what tests say.

@SparkQA

SparkQA commented Feb 20, 2015

Test build #27757 has finished for PR 4699 at commit e53e9f4.

  • This patch fails MiMa tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/27757/

@mengxr
Contributor Author

mengxr commented Feb 20, 2015

Added mima excludes. (I don't know why it complains in this PR.)

@SparkQA

SparkQA commented Feb 20, 2015

Test build #27763 has started for PR 4699 at commit 0f20cad.

  • This patch does not merge cleanly.

@andrewor14
Contributor

Are these legitimate MiMa problems with another patch, or just false positives? Should we send a hot fix to exclude all of these? (There are quite a few of them.)

@mengxr
Contributor Author

mengxr commented Feb 20, 2015

@andrewor14 I didn't track changes in MimaExcludes. But apparently these synthetic classes should be excluded by default.

@andrewor14
Contributor

OK, I stopped seeing them fail in more recent builds, so I'm just going to leave it as is.

@SparkQA

SparkQA commented Feb 20, 2015

Test build #27763 has finished for PR 4699 at commit 0f20cad.

  • This patch passes all tests.
  • This patch does not merge cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/27763/

@SparkQA

SparkQA commented Feb 20, 2015

Test build #27774 has started for PR 4699 at commit c5c4183.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Feb 20, 2015

Test build #27774 has finished for PR 4699 at commit c5c4183.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class Assignment(val id: Long, val cluster: Int) extends Serializable
    • class FPGrowthModel[Item: ClassTag](val freqItemsets: RDD[FreqItemset[Item]]) extends Serializable
    • class FreqItemset[Item](val items: Array[Item], val freq: Long) extends Serializable

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/27774/

@mengxr
Contributor Author

mengxr commented Feb 20, 2015

I went through the changes twice. The math should be the same as before. SVDPlusPlus should get better performance by allocating fewer temporary objects. I suggest merging this into branch-1.3. @pwendell

@mengxr
Contributor Author

mengxr commented Feb 21, 2015

@srowen Just realized that we cannot merge this into branch-1.3 because we still need to return DoubleMatrix in 1.3. So for 1.3, could you send a PR and put the binaries back? We can merge this PR after 1.3 is released.

@srowen
Member

srowen commented Feb 21, 2015

Oh right, of course. Yes, PR coming right up...

@SparkQA

SparkQA commented Mar 11, 2015

Test build #28464 has started for PR 4699 at commit ca21c74.

  • This patch merges cleanly.

@mengxr mengxr changed the title from "[SPAR-5814][MLLIB][GRAPHX] Remove JBLAS from runtime" to "[SPARK-5814][MLLIB][GRAPHX] Remove JBLAS from runtime" on Mar 11, 2015

@SparkQA

SparkQA commented Mar 11, 2015

Test build #28464 has finished for PR 4699 at commit ca21c74.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class KMeansModel (val clusterCenters: Array[Vector]) extends Saveable with Serializable

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28464/

<version>${jblas.version}</version>
<groupId>com.github.fommil.netlib</groupId>
<artifactId>core</artifactId>
<version>1.1.2</version>
Member

Can we manage the version centrally in the parent POM? Otherwise it is repeated in the mllib POM.
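
A minimal sketch of what central version management could look like in Maven, assuming the usual property plus `dependencyManagement` pattern. The property name `netlib.java.version` is a hypothetical name chosen for illustration; the thread does not say which name or mechanism was finally used.

```xml
<!-- Parent pom.xml (sketch; the property name is an assumption) -->
<properties>
  <netlib.java.version>1.1.2</netlib.java.version>
</properties>

<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>com.github.fommil.netlib</groupId>
      <artifactId>core</artifactId>
      <version>${netlib.java.version}</version>
    </dependency>
  </dependencies>
</dependencyManagement>

<!-- Child module pom.xml, e.g. mllib: no <version> element,
     the version is inherited from the parent's dependencyManagement -->
<dependencies>
  <dependency>
    <groupId>com.github.fommil.netlib</groupId>
    <artifactId>core</artifactId>
  </dependency>
</dependencies>
```

With this layout, a version bump touches only the parent POM, and every module that declares the artifact picks it up.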

@srowen
Member

srowen commented Mar 11, 2015

Looking good. I didn't verify each translation to netlib, but eyeballing them, they looked like what I'd expect. That's what the tests will confirm.

@SparkQA

SparkQA commented Mar 11, 2015

Test build #28484 has started for PR 4699 at commit 48635c6.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Mar 11, 2015

Test build #28484 has finished for PR 4699 at commit 48635c6.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class KMeansModel (val clusterCenters: Array[Vector]) extends Saveable with Serializable

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28484/

@mengxr
Contributor Author

mengxr commented Mar 12, 2015

Merged into master. Thanks for reviewing!

@asfgit asfgit closed this in 0cba802 Mar 12, 2015
5 participants