[ADAM-1497] Add union to GenomicRDD. #1526

Merged
heuermh merged 1 commit into bigdatagenomics:master from fnothaft:issues/1497-union May 17, 2017

@fnothaft fnothaft commented May 13, 2017

Resolves #1497. Also adds size methods to SequenceDictionary and RecordGroupDictionary. Requires the addition of a ClassTag to the GenericGenomicRDD case class.

@fnothaft fnothaft added this to the 0.23.0 milestone May 13, 2017

@AmplabJenkins AmplabJenkins commented May 13, 2017

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/2003/

Build result: FAILURE

[...truncated 16 lines...]
 > /home/jenkins/git2/bin/git rev-parse origin/pr/1526/merge^{commit} # timeout=10
 > /home/jenkins/git2/bin/git branch -a --contains ce29373ec3eff23b4c14bc2a99f1a7a7a3efc59c # timeout=10
 > /home/jenkins/git2/bin/git rev-parse remotes/origin/pr/1526/merge^{commit} # timeout=10
Checking out Revision ce29373ec3eff23b4c14bc2a99f1a7a7a3efc59c (origin/pr/1526/merge)
 > /home/jenkins/git2/bin/git config core.sparsecheckout # timeout=10
 > /home/jenkins/git2/bin/git checkout -f ce29373ec3eff23b4c14bc2a99f1a7a7a3efc59c
First time build. Skipping changelog.
Triggering ADAM-prb ? 2.3.0,2.11,1.6.1,centos
Triggering ADAM-prb ? 2.6.0,2.10,1.6.1,centos
Triggering ADAM-prb ? 2.3.0,2.11,2.0.0,centos
Triggering ADAM-prb ? 2.3.0,2.10,2.0.0,centos
Triggering ADAM-prb ? 2.6.0,2.11,2.0.0,centos
Triggering ADAM-prb ? 2.6.0,2.10,2.0.0,centos
Triggering ADAM-prb ? 2.3.0,2.10,1.6.1,centos
Triggering ADAM-prb ? 2.6.0,2.11,1.6.1,centos
ADAM-prb ? 2.3.0,2.11,1.6.1,centos completed with result FAILURE
ADAM-prb ? 2.6.0,2.10,1.6.1,centos completed with result FAILURE
ADAM-prb ? 2.3.0,2.11,2.0.0,centos completed with result FAILURE
ADAM-prb ? 2.3.0,2.10,2.0.0,centos completed with result FAILURE
ADAM-prb ? 2.6.0,2.11,2.0.0,centos completed with result FAILURE
ADAM-prb ? 2.6.0,2.10,2.0.0,centos completed with result FAILURE
ADAM-prb ? 2.3.0,2.10,1.6.1,centos completed with result FAILURE
ADAM-prb ? 2.6.0,2.11,1.6.1,centos completed with result FAILURE
Notifying endpoint 'HTTP:https://webhooks.gitter.im/e/ac8bb6e9f53357bc8aa8'
Test FAILed.

Resolves #1497. Also adds `size` methods to `SequenceDictionary` and
`RecordGroupDictionary`. Requires the addition of a `ClassTag` to the
`GenericGenomicRDD` case class.
@fnothaft fnothaft force-pushed the fnothaft:issues/1497-union branch from 55cb7f9 to 7013184 May 14, 2017

@coveralls coveralls commented May 14, 2017

Coverage increased (+0.1%) to 82.139% when pulling 7013184 on fnothaft:issues/1497-union into 18191f9 on bigdatagenomics:master.


@AmplabJenkins AmplabJenkins commented May 14, 2017

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/2005/
Test PASSed.

val union = features1.union(features2)
assert(union.rdd.count === (features1.rdd.count + features2.rdd.count))
// only a single contig between the two
assert(union.sequences.size === 1)

heuermh May 15, 2017
Member

This implies there is a `distinct` here...

fnothaft May 15, 2017
Author Member

There is a filter by contig name when merging two `SequenceDictionary` instances.

heuermh May 15, 2017
Member

Thanks for the clarification

val genotype2 = sc.loadGenotypes(testFile("small.vcf"))
val union = genotype1.union(genotype2)
assert(union.rdd.count === (genotype1.rdd.count + genotype2.rdd.count))
assert(union.sequences.size === (genotype1.sequences.size + genotype2.sequences.size))

heuermh May 15, 2017
Member

... but not here. Does `(_ ++ _)` collapse elements that are equal?

fnothaft May 15, 2017
Author Member

This case is everyone's jolly favorite bioinformatics boondoggle: chr prefixes vs no prefixes. Thus, the contigs are distinct.

This comment has been minimized.

fnothaft May 15, 2017
Author Member

I know, I know. However, it works for our test purposes.
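
For readers following the two review threads, here is a minimal sketch of why one test expects a single contig after the union while the other expects the dictionary sizes to add up. It assumes a merge that keeps a record from the right-hand dictionary only when its contig name is not already present on the left. `Contig`, `SeqDict`, the contig lengths, and `SequenceUnionSketch` are hypothetical stand-ins invented for illustration; they are not ADAM's `SequenceDictionary` implementation.

```scala
// Hypothetical, simplified stand-ins; these are NOT ADAM's actual classes.
final case class Contig(name: String, length: Long)

final case class SeqDict(records: Vector[Contig]) {
  // Number of contigs in the dictionary.
  def size: Int = records.size

  // Merge two dictionaries, keeping only those records from `that`
  // whose contig name is not already present in `this`.
  def ++(that: SeqDict): SeqDict = {
    val known = records.map(_.name).toSet
    SeqDict(records ++ that.records.filterNot(r => known.contains(r.name)))
  }
}

object SequenceUnionSketch {
  // Same contig name on both sides: the merge keeps a single record.
  val sameName: SeqDict =
    SeqDict(Vector(Contig("chr1", 1000L))) ++ SeqDict(Vector(Contig("chr1", 1000L)))

  // "1" and "chr1" are different strings, so both records survive.
  val prefixMismatch: SeqDict =
    SeqDict(Vector(Contig("1", 1000L))) ++ SeqDict(Vector(Contig("chr1", 1000L)))

  def main(args: Array[String]): Unit = {
    println(sameName.size)       // 1
    println(prefixMismatch.size) // 2
  }
}
```

Under this assumption, a name-based filter explains both assertions: identical names collapse, while a `chr` prefix on only one side makes the names differ, so nothing is filtered out.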

@heuermh heuermh merged commit 37b971a into bigdatagenomics:master May 17, 2017
3 checks passed
codacy/pr Good work! A positive pull request.
coverage/coveralls Coverage increased (+0.1%) to 82.139%
default Merged build finished.

@heuermh heuermh commented May 17, 2017

Thank you, @fnothaft

@heuermh heuermh added this to Completed in Release 0.23.0 May 30, 2017