refactored ReferenceFile to require SequenceDictionary #1086

Closed
wants to merge 1 commit into
from

Conversation

Projects
None yet
5 participants
@akmorrow13
Contributor

akmorrow13 commented Jul 19, 2016

  • NucleotideContigFragmentRDD now extends ReferenceFile
  • ReferenceFile requires Sequence Dictionary
@AmplabJenkins

This comment has been minimized.

Show comment
Hide comment
@AmplabJenkins

AmplabJenkins Jul 19, 2016

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/1354/

Build result: FAILURE

GitHub pull request #1086 of commit 2f065ff automatically merged.Notifying endpoint 'HTTP:https://webhooks.gitter.im/e/ac8bb6e9f53357bc8aa8'[EnvInject] - Loading node environment variables.Building remotely on amp-jenkins-worker-05 (centos spark-test) in workspace /home/jenkins/workspace/ADAM-prb > /home/jenkins/git2/bin/git rev-parse --is-inside-work-tree # timeout=10Fetching changes from the remote Git repository > /home/jenkins/git2/bin/git config remote.origin.url https://github.com/bigdatagenomics/adam.git # timeout=10Fetching upstream changes from https://github.com/bigdatagenomics/adam.git > /home/jenkins/git2/bin/git --version # timeout=10 > /home/jenkins/git2/bin/git -c core.askpass=true fetch --tags --progress https://github.com/bigdatagenomics/adam.git +refs/pull/:refs/remotes/origin/pr/ # timeout=15 > /home/jenkins/git2/bin/git rev-parse origin/pr/1086/merge^{commit} # timeout=10 > /home/jenkins/git2/bin/git branch -a --contains 53fefde # timeout=10 > /home/jenkins/git2/bin/git rev-parse remotes/origin/pr/1086/merge^{commit} # timeout=10Checking out Revision 53fefde (origin/pr/1086/merge) > /home/jenkins/git2/bin/git config core.sparsecheckout # timeout=10 > /home/jenkins/git2/bin/git checkout -f 53fefde4c3a8171af3150fe52313af2f29a38f52First time build. Skipping changelog.Triggering ADAM-prb ? 2.6.0,2.11,1.5.2,centosTriggering ADAM-prb ? 2.6.0,2.10,1.5.2,centosTouchstone configurations resulted in FAILURE, so aborting...Notifying endpoint 'HTTP:https://webhooks.gitter.im/e/ac8bb6e9f53357bc8aa8'
Test FAILed.

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/1354/

Build result: FAILURE

GitHub pull request #1086 of commit 2f065ff automatically merged.Notifying endpoint 'HTTP:https://webhooks.gitter.im/e/ac8bb6e9f53357bc8aa8'[EnvInject] - Loading node environment variables.Building remotely on amp-jenkins-worker-05 (centos spark-test) in workspace /home/jenkins/workspace/ADAM-prb > /home/jenkins/git2/bin/git rev-parse --is-inside-work-tree # timeout=10Fetching changes from the remote Git repository > /home/jenkins/git2/bin/git config remote.origin.url https://github.com/bigdatagenomics/adam.git # timeout=10Fetching upstream changes from https://github.com/bigdatagenomics/adam.git > /home/jenkins/git2/bin/git --version # timeout=10 > /home/jenkins/git2/bin/git -c core.askpass=true fetch --tags --progress https://github.com/bigdatagenomics/adam.git +refs/pull/:refs/remotes/origin/pr/ # timeout=15 > /home/jenkins/git2/bin/git rev-parse origin/pr/1086/merge^{commit} # timeout=10 > /home/jenkins/git2/bin/git branch -a --contains 53fefde # timeout=10 > /home/jenkins/git2/bin/git rev-parse remotes/origin/pr/1086/merge^{commit} # timeout=10Checking out Revision 53fefde (origin/pr/1086/merge) > /home/jenkins/git2/bin/git config core.sparsecheckout # timeout=10 > /home/jenkins/git2/bin/git checkout -f 53fefde4c3a8171af3150fe52313af2f29a38f52First time build. Skipping changelog.Triggering ADAM-prb ? 2.6.0,2.11,1.5.2,centosTriggering ADAM-prb ? 2.6.0,2.10,1.5.2,centosTouchstone configurations resulted in FAILURE, so aborting...Notifying endpoint 'HTTP:https://webhooks.gitter.im/e/ac8bb6e9f53357bc8aa8'
Test FAILed.

@akmorrow13

This comment has been minimized.

Show comment
Hide comment
@akmorrow13

akmorrow13 Jul 19, 2016

Contributor

This is erroring out in 1.5.2 for files I did not modify.
For example: [ERROR] Error formatting /home/jenkins/workspace/ADAM-prb/HADOOP_VERSION/2.6.0/SCALAVER/2.10/SPARK_VERSION/1.5.2/label/centos/adam-cli/src/main/scala/org/bdgenomics/adam/cli/PrintADAM.scala: java.nio.charset.MalformedInputException: Input length = 1

Contributor

akmorrow13 commented Jul 19, 2016

This is erroring out in 1.5.2 for files I did not modify.
For example: [ERROR] Error formatting /home/jenkins/workspace/ADAM-prb/HADOOP_VERSION/2.6.0/SCALAVER/2.10/SPARK_VERSION/1.5.2/label/centos/adam-cli/src/main/scala/org/bdgenomics/adam/cli/PrintADAM.scala: java.nio.charset.MalformedInputException: Input length = 1

@fnothaft

This comment has been minimized.

Show comment
Hide comment
@fnothaft

fnothaft Jul 19, 2016

Member

That error is fine. Keep reading down the log ;):

Please run './scripts/format-source'
Member

fnothaft commented Jul 19, 2016

That error is fine. Keep reading down the log ;):

Please run './scripts/format-source'
@heuermh

This comment has been minimized.

Show comment
Hide comment
@heuermh

heuermh Jul 19, 2016

Member

Maven builds from the command line do that for you. Give up your IDE! 😄

Member

heuermh commented Jul 19, 2016

Maven builds from the command line do that for you. Give up your IDE! 😄

@@ -169,7 +170,7 @@ case class NucleotideContigFragmentRDD(
* @param region Reference region over which to get sequence.
* @return String of bases corresponding to reference sequence.
*/
- def getReferenceString(region: ReferenceRegion): String = {
+ def extract(region: ReferenceRegion): String = {

This comment has been minimized.

@heuermh

heuermh Jul 19, 2016

Member

Wouldn't this read so much better if this class was a SequenceRDD (or SliceRDD, depending on how the data were loaded) and this method was

def slice(region: ReferenceRegion): Slice = { ... }

See bigdatagenomics/bdg-formats#83

@heuermh

heuermh Jul 19, 2016

Member

Wouldn't this read so much better if this class was a SequenceRDD (or SliceRDD, depending on how the data were loaded) and this method was

def slice(region: ReferenceRegion): Slice = { ... }

See bigdatagenomics/bdg-formats#83

This comment has been minimized.

@akmorrow13

akmorrow13 Jul 27, 2016

Contributor

So you would like ReferenceFile to change its interface to be slice instead of extract?

@akmorrow13

akmorrow13 Jul 27, 2016

Contributor

So you would like ReferenceFile to change its interface to be slice instead of extract?

This comment has been minimized.

@heuermh

heuermh Jul 27, 2016

Member

I think it's fine for now, with an eye to refactor if/when the Slice stuff is added, between versions 0.20 and 0.21 presumably

@heuermh

heuermh Jul 27, 2016

Member

I think it's fine for now, with an eye to refactor if/when the Slice stuff is added, between versions 0.20 and 0.21 presumably

This comment has been minimized.

@akmorrow13

akmorrow13 Jul 27, 2016

Contributor

I can definitely change it. I just am wondering if slice is generic enough for the 2bitFile, because this extends ReferenceFile as well.

@akmorrow13

akmorrow13 Jul 27, 2016

Contributor

I can definitely change it. I just am wondering if slice is generic enough for the 2bitFile, because this extends ReferenceFile as well.

This comment has been minimized.

@heuermh

heuermh Jul 27, 2016

Member

extract is fine with me

@heuermh

heuermh Jul 27, 2016

Member

extract is fine with me

@AmplabJenkins

This comment has been minimized.

Show comment
Hide comment
@AmplabJenkins

AmplabJenkins Jul 22, 2016

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/1360/
Test PASSed.

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/1360/
Test PASSed.

@akmorrow13

This comment has been minimized.

Show comment
Hide comment
@akmorrow13

akmorrow13 Jul 29, 2016

Contributor

+1?

Contributor

akmorrow13 commented Jul 29, 2016

+1?

import org.bdgenomics.formats.avro.NucleotideContigFragment
case class ReferenceContigMap(contigMap: Map[String, Seq[NucleotideContigFragment]]) extends ReferenceFile {
+
+ /**
+ * Sequence dictionary extracted from contigMap

This comment has been minimized.

@heuermh

heuermh Jul 29, 2016

Member

doc strings should be complete sentences

@heuermh

heuermh Jul 29, 2016

Member

doc strings should be complete sentences

+
+ /**
+ * Sequence dictionary extracted from contigMap
+ * @see ReferenceFile.scala

This comment has been minimized.

@heuermh

heuermh Jul 29, 2016

Member

this @see won't work in javadoc/scaladoc, it should be @see ReferenceFile, although generally I don't find doc links all that useful.

@heuermh

heuermh Jul 29, 2016

Member

this @see won't work in javadoc/scaladoc, it should be @see ReferenceFile, although generally I don't find doc links all that useful.

@@ -34,7 +33,8 @@ class MDTaggingSuite extends ADAMFunSuite {
sc.parallelize(
for {
(contig, start, seq) <- frags
- } yield NucleotideContigFragment.newBuilder().setContig(contig).setFragmentStartPosition(start.toLong).setFragmentSequence(seq).build()
+ } yield NucleotideContigFragment.newBuilder().setContig(contig).setFragmentStartPosition(start.toLong)

This comment has been minimized.

@heuermh

heuermh Jul 29, 2016

Member

format this as

NucleotideContigFragment.newBuilder()
  .setContig(contig)
  .setFragmentStartPosition(start.toLong)
  ...
@heuermh

heuermh Jul 29, 2016

Member

format this as

NucleotideContigFragment.newBuilder()
  .setContig(contig)
  .setFragmentStartPosition(start.toLong)
  ...
@heuermh

This comment has been minimized.

Show comment
Hide comment
@heuermh

heuermh Jul 29, 2016

Member

Added some minor doc-related review comments.

Member

heuermh commented Jul 29, 2016

Added some minor doc-related review comments.

@AmplabJenkins

This comment has been minimized.

Show comment
Hide comment
@AmplabJenkins

AmplabJenkins Jul 29, 2016

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/1363/

Build result: FAILURE

[...truncated 24 lines...]Triggering ADAM-prb ? 2.3.0,2.10,1.4.1,centosTriggering ADAM-prb ? 2.3.0,2.10,1.5.2,centosTriggering ADAM-prb ? 2.6.0,2.11,1.4.1,centosTriggering ADAM-prb ? 2.6.0,2.10,1.3.1,centosTriggering ADAM-prb ? 2.6.0,2.11,1.3.1,centosTriggering ADAM-prb ? 2.3.0,2.11,1.5.2,centosTriggering ADAM-prb ? 2.6.0,2.11,1.6.1,centosTriggering ADAM-prb ? 2.3.0,2.11,1.4.1,centosTriggering ADAM-prb ? 2.3.0,2.11,1.6.1,centosADAM-prb ? 2.3.0,2.10,1.6.1,centos completed with result FAILUREADAM-prb ? 2.6.0,2.10,1.6.1,centos completed with result SUCCESSADAM-prb ? 2.6.0,2.10,1.4.1,centos completed with result SUCCESSADAM-prb ? 2.3.0,2.11,1.3.1,centos completed with result SUCCESSADAM-prb ? 2.3.0,2.10,1.3.1,centos completed with result SUCCESSADAM-prb ? 2.3.0,2.10,1.4.1,centos completed with result SUCCESSADAM-prb ? 2.3.0,2.10,1.5.2,centos completed with result SUCCESSADAM-prb ? 2.6.0,2.11,1.4.1,centos completed with result SUCCESSADAM-prb ? 2.6.0,2.10,1.3.1,centos completed with result SUCCESSADAM-prb ? 2.6.0,2.11,1.3.1,centos completed with result SUCCESSADAM-prb ? 2.3.0,2.11,1.5.2,centos completed with result SUCCESSADAM-prb ? 2.6.0,2.11,1.6.1,centos completed with result SUCCESSADAM-prb ? 2.3.0,2.11,1.4.1,centos completed with result SUCCESSADAM-prb ? 2.3.0,2.11,1.6.1,centos completed with result SUCCESSNotifying endpoint 'HTTP:https://webhooks.gitter.im/e/ac8bb6e9f53357bc8aa8'
Test FAILed.

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/1363/

Build result: FAILURE

[...truncated 24 lines...]Triggering ADAM-prb ? 2.3.0,2.10,1.4.1,centosTriggering ADAM-prb ? 2.3.0,2.10,1.5.2,centosTriggering ADAM-prb ? 2.6.0,2.11,1.4.1,centosTriggering ADAM-prb ? 2.6.0,2.10,1.3.1,centosTriggering ADAM-prb ? 2.6.0,2.11,1.3.1,centosTriggering ADAM-prb ? 2.3.0,2.11,1.5.2,centosTriggering ADAM-prb ? 2.6.0,2.11,1.6.1,centosTriggering ADAM-prb ? 2.3.0,2.11,1.4.1,centosTriggering ADAM-prb ? 2.3.0,2.11,1.6.1,centosADAM-prb ? 2.3.0,2.10,1.6.1,centos completed with result FAILUREADAM-prb ? 2.6.0,2.10,1.6.1,centos completed with result SUCCESSADAM-prb ? 2.6.0,2.10,1.4.1,centos completed with result SUCCESSADAM-prb ? 2.3.0,2.11,1.3.1,centos completed with result SUCCESSADAM-prb ? 2.3.0,2.10,1.3.1,centos completed with result SUCCESSADAM-prb ? 2.3.0,2.10,1.4.1,centos completed with result SUCCESSADAM-prb ? 2.3.0,2.10,1.5.2,centos completed with result SUCCESSADAM-prb ? 2.6.0,2.11,1.4.1,centos completed with result SUCCESSADAM-prb ? 2.6.0,2.10,1.3.1,centos completed with result SUCCESSADAM-prb ? 2.6.0,2.11,1.3.1,centos completed with result SUCCESSADAM-prb ? 2.3.0,2.11,1.5.2,centos completed with result SUCCESSADAM-prb ? 2.6.0,2.11,1.6.1,centos completed with result SUCCESSADAM-prb ? 2.3.0,2.11,1.4.1,centos completed with result SUCCESSADAM-prb ? 2.3.0,2.11,1.6.1,centos completed with result SUCCESSNotifying endpoint 'HTTP:https://webhooks.gitter.im/e/ac8bb6e9f53357bc8aa8'
Test FAILed.

@AmplabJenkins

This comment has been minimized.

Show comment
Hide comment
@AmplabJenkins

AmplabJenkins Jul 30, 2016

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/1364/
Test PASSed.

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/1364/
Test PASSed.

@AmplabJenkins

This comment has been minimized.

Show comment
Hide comment
@AmplabJenkins

AmplabJenkins Jul 30, 2016

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/1365/
Test PASSed.

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/1365/
Test PASSed.

@akmorrow13

This comment has been minimized.

Show comment
Hide comment
@akmorrow13

akmorrow13 Jul 30, 2016

Contributor

I fixed all the comments @heuermh

Contributor

akmorrow13 commented Jul 30, 2016

I fixed all the comments @heuermh

@akmorrow13

This comment has been minimized.

Show comment
Hide comment
Contributor

akmorrow13 commented Aug 2, 2016

@heuermh

This comment has been minimized.

Show comment
Hide comment
@heuermh

heuermh Aug 2, 2016

Member

LGTM

Member

heuermh commented Aug 2, 2016

LGTM

@jpdna

This comment has been minimized.

Show comment
Hide comment
@jpdna

jpdna Aug 4, 2016

Member

+1

Member

jpdna commented Aug 4, 2016

+1

@fnothaft

This comment has been minimized.

Show comment
Hide comment
@fnothaft

fnothaft Aug 4, 2016

Member

So NucleotideContigFragmentRDD is not going to be usable as a ReferenceFile from inside an RDD. Is this OK? Because of such, I'm -0.9 on this PR, unless I misunderstand the use case for this PR.

Member

fnothaft commented Aug 4, 2016

So NucleotideContigFragmentRDD is not going to be usable as a ReferenceFile from inside an RDD. Is this OK? Because of such, I'm -0.9 on this PR, unless I misunderstand the use case for this PR.

@AmplabJenkins

This comment has been minimized.

Show comment
Hide comment
@AmplabJenkins

AmplabJenkins Aug 4, 2016

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/1369/
Test PASSed.

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/1369/
Test PASSed.

@fnothaft

This comment has been minimized.

Show comment
Hide comment
@fnothaft

fnothaft Aug 5, 2016

Member

Thanks for updating this PR with the documentation fixes we discussed, @akmorrow13. I'm retracting my -0.9 and have just merged this as e7e1adf. Thank you!

Member

fnothaft commented Aug 5, 2016

Thanks for updating this PR with the documentation fixes we discussed, @akmorrow13. I'm retracting my -0.9 and have just merged this as e7e1adf. Thank you!

@fnothaft fnothaft closed this Aug 5, 2016

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment