
Upgrading to Spark 1.0 #256

Merged: 5 commits, Jun 6, 2014

Conversation

3 participants
tdanford (Contributor) commented Jun 4, 2014

Upgrading the dependency on Spark to version 1.0.0.

The major changes here are:

  1. all the RDD.groupBy calls return Iterables rather than Seqs, and
  2. we now need an explicit dependency on fastutil.
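
The first change can be sketched as follows (a hypothetical snippet, not code from this PR; the Read case class and its fields are made up for illustration):

```scala
import org.apache.spark.rdd.RDD

// Hypothetical record type, for illustration only.
case class Read(referenceName: String, start: Long)

// Under Spark 0.9 this returned RDD[(String, Seq[Read])];
// under Spark 1.0, RDD.groupBy returns RDD[(String, Iterable[Read])],
// so downstream signatures have to change accordingly.
def groupReadsByReference(reads: RDD[Read]): RDD[(String, Iterable[Read])] =
  reads.groupBy(_.referenceName)
```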
tdanford (Contributor) commented Jun 4, 2014

Fixes #253

AmplabJenkins commented Jun 4, 2014

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/ADAM-prb/345/

(Outdated review comments on adam-core/pom.xml and pom.xml have been hidden.)

tdanford added some commits Jun 3, 2014

Updated Spark version to 1.0.0
Also adding the fastutil dependency back in. Spark 0.9 had included the dependency on
fastutil, which we also depended on! But Spark 1.0 apparently removes that dependency.
So this commit adds that back in, for our own use.
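
Re-adding the dependency amounts to a pom.xml entry along these lines (the version shown is illustrative, not necessarily the one this PR pins):

```xml
<dependency>
  <groupId>it.unimi.dsi</groupId>
  <artifactId>fastutil</artifactId>
  <version>6.4.4</version>
</dependency>
```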
Updating all instances of Seq[_] to Iterable[_] in RDD.groupBy()
In Spark 1.0, the signature of the RDD.groupBy method has changed to return an Iterable
rather than a Seq. This commit includes all the changes that are needed to account for this
downstream in our code: mostly updating types to Iterable, plus inserting a few calls to toSeq
where that's not sufficient.
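
Where downstream code still needs Seq semantics (indexed access, sorting, and so on), the fix is an explicit conversion; a minimal sketch of the pattern, using a plain collection as a stand-in:

```scala
// grouped stands in for the values coming out of RDD.groupBy
// under Spark 1.0, which arrive as Iterable rather than Seq.
val grouped: Iterable[Int] = List(3, 1, 2)

// Iterable has no indexed apply, so insert toSeq where a Seq is required:
val sorted: Seq[Int] = grouped.toSeq.sorted
val first: Int = sorted(0) // indexed access needs the Seq
```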
Updated kryo.referenceTracking parameter, fixed test values.
As suggested by Matt and Frank, updated two things:
1. the spark.kryo.referenceTracking value, set to 'true', which fixes a StackOverflowError, and
2. updated the target (test) values for the IndelRealignmentTargetSuite tests, which Frank says are apparently
   going to change soon anyway.
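
The Kryo fix corresponds to a setting along these lines (a sketch; ADAM wires this up through its own configuration code, but spark.kryo.referenceTracking is a standard Spark property):

```scala
import org.apache.spark.SparkConf

// With referenceTracking disabled, Kryo cannot handle cyclic or heavily
// shared object graphs and can recurse into a StackOverflowError.
// Enabling it makes Kryo track objects it has already written.
val conf = new SparkConf()
  .set("spark.kryo.referenceTracking", "true")
```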
tdanford (Contributor) commented Jun 5, 2014

Matt, I think this rebase should address your comments. Let me know if you see any other details to be fixed!

AmplabJenkins commented Jun 5, 2014

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/ADAM-prb/347/

massie added a commit that referenced this pull request Jun 6, 2014

massie merged commit 8a93aed into bigdatagenomics:master Jun 6, 2014

1 check passed: Merged build finished.
massie (Member) commented Jun 6, 2014

Thanks, Timothy!

@tdanford tdanford deleted the broadinstitute:spark-1.0 branch Jun 6, 2014
