New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ADAMContext/GenomicRDD/pipe docs #1422

Merged
merged 3 commits into from Mar 14, 2017

Conversation

Projects
None yet
5 participants
@fnothaft
Member

fnothaft commented Mar 6, 2017

Resolves #1368. Also added latest citation information to the README.

@fnothaft fnothaft added this to the 0.22.0 milestone Mar 6, 2017

@coveralls

This comment has been minimized.

Show comment
Hide comment
@coveralls

coveralls Mar 6, 2017

Coverage Status

Coverage remained the same at 76.399% when pulling 8eddeb0 on fnothaft:issues/1368-pipe-markdown into 07c1982 on bigdatagenomics:master.

coveralls commented Mar 6, 2017

Coverage Status

Coverage remained the same at 76.399% when pulling 8eddeb0 on fnothaft:issues/1368-pipe-markdown into 07c1982 on bigdatagenomics:master.

@AmplabJenkins

This comment has been minimized.

Show comment
Hide comment
@AmplabJenkins

AmplabJenkins Mar 6, 2017

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/1836/
Test PASSed.

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/1836/
Test PASSed.

The `JavaADAMContext` class provides Java-friendly methods that are equivalent
to the `ADAMContext` methods. Specifically, these methods use Java types, and do
not make use of default parameters. In addition to the load/save methods
described above, the `ADAMContext` adds the implicit methods needed for using

This comment has been minimized.

@devin-petersohn

devin-petersohn Mar 6, 2017

Member

Should we explicitly say that the Scala and Java labels in the above function calls indicate whether they can be used with JavaADAMContext/ADAMContext?

@devin-petersohn

devin-petersohn Mar 6, 2017

Member

Should we explicitly say that the Scala and Java labels in the above function calls indicate whether they can be used with JavaADAMContext/ADAMContext?

Show outdated Hide outdated docs/source/60_building_apps.md
<dependency>
<groupId>org.bdgenomics.adam</groupId>
<artifactId>adam-core${binary.version}</artifactId>
<version>0.21.0</version>

This comment has been minimized.

@heuermh

heuermh Mar 6, 2017

Member

would prefer a ${version} or ${adam.version} variable here, so that the docs don't need to be updated on every new release

@heuermh

heuermh Mar 6, 2017

Member

would prefer a ${version} or ${adam.version} variable here, so that the docs don't need to be updated on every new release

Show outdated Hide outdated docs/source/60_building_apps.md
conversion from a `SparkContext` to an `ADAMContext`. To use this, import the
implicit, and call an `ADAMContext` method:
```

This comment has been minimized.

@heuermh

heuermh Mar 6, 2017

Member

does the syntax highlighting language used in Github-flavored markdown cause problems with pandoc generation? If not, it would be useful to have ```scala here (and elsewhere below)

@heuermh

heuermh Mar 6, 2017

Member

does the syntax highlighting language used in Github-flavored markdown cause problems with pandoc generation? If not, it would be useful to have ```scala here (and elsewhere below)

Show outdated Hide outdated docs/source/60_building_apps.md
In Java, instantiate a JavaADAMContext, which wraps an ADAMContext:
```

This comment has been minimized.

@heuermh

heuermh Mar 6, 2017

Member

...and ```java here

@heuermh

heuermh Mar 6, 2017

Member

...and ```java here

With an `ADAMContext`, you can load:
* Single reads as an `AlignmentRecordRDD`:

This comment has been minimized.

@heuermh

heuermh Mar 6, 2017

Member

would this be easier to read as a table?

@heuermh

heuermh Mar 6, 2017

Member

would this be easier to read as a table?

With an `ADAMContext`, you can load:
* Single reads as an `AlignmentRecordRDD`:
* From SAM/BAM/CRAM using `loadBam` (Scala only)

This comment has been minimized.

@heuermh

heuermh Mar 6, 2017

Member

If something says (Scala only) here, does that mean the method is not usable from Java or is simply inconvenient to do so? I've been working with some of these from Java without issue.

@heuermh

heuermh Mar 6, 2017

Member

If something says (Scala only) here, does that mean the method is not usable from Java or is simply inconvenient to do so? I've been working with some of these from Java without issue.

Show outdated Hide outdated docs/source/60_building_apps.md
functional [transformations]{#transforming} across the whole dataset.
Around an `RDD`, ADAM adds metadata which describes the genome, samples, or
pipeline that a dataset came from. Specifically, ADAM supports the following

This comment has been minimized.

@heuermh

heuermh Mar 6, 2017

Member

isn't obvious which definition pipeline has here; this is about read groups?

@heuermh

heuermh Mar 6, 2017

Member

isn't obvious which definition pipeline has here; this is about read groups?

Show outdated Hide outdated docs/source/60_building_apps.md
We provide four implementations:
* `ADAMContext.sameTypeConversionFn`: For piped commands that do not change the
type of the `GenomicRDD` (e.g., `AlignmentRecordRDD` -> `AlignmentRecordRDD`).

This comment has been minimized.

@heuermh

heuermh Mar 6, 2017

Member

-> → &rarr;

(this must be my favorite review comment ever)

@heuermh

heuermh Mar 6, 2017

Member

-> → &rarr;

(this must be my favorite review comment ever)

This comment has been minimized.

// save to vcf
variantContexts.saveAsVcf("hdfs://mynamenode/my/variants.vcf")
```

This comment has been minimized.

@heuermh

heuermh Mar 6, 2017

Member

might be useful to add a section/paragraph describing how data streamed from GenomicRDDs to external apps is post-partitioning; performance implications thereof. I see now this is mentioned above on line 483.

@heuermh

heuermh Mar 6, 2017

Member

might be useful to add a section/paragraph describing how data streamed from GenomicRDDs to external apps is post-partitioning; performance implications thereof. I see now this is mentioned above on line 483.

@devin-petersohn

Looks great, just a couple of nits for examples/further clarification.

Although `GenomicRDD`s do not extend Apache Spark's `RDD` class, `RDD`
operations can be performed on them using the `transform` method. Currently,
we only support `RDD` to `RDD` transformations that keep the same type as
the base type of the `GenomicRDD`. To apply an `RDD` transform, use the

This comment has been minimized.

@devin-petersohn

devin-petersohn Mar 6, 2017

Member

Nit: Add an example: i.e. We can transform AlignmentRecordRDDs to increment the position of the AlignmentRecord by 1 for every AlignmentRecord, but we cannot transform an AlignmentRecordRDD to a FeatureRDD.

I think it will make it much more clear.

@devin-petersohn

devin-petersohn Mar 6, 2017

Member

Nit: Add an example: i.e. We can transform AlignmentRecordRDDs to increment the position of the AlignmentRecord by 1 for every AlignmentRecord, but we cannot transform an AlignmentRecordRDD to a FeatureRDD.

I think it will make it much more clear.

This comment has been minimized.

@fnothaft

fnothaft Mar 7, 2017

Member

+1, sounds good, will add.

@fnothaft

fnothaft Mar 7, 2017

Member

+1, sounds good, will add.

@fnothaft

This comment has been minimized.

Show comment
Hide comment
@fnothaft

fnothaft Mar 10, 2017

Member

Addressed reviewer comments and pulled changes in from #1384.

Member

fnothaft commented Mar 10, 2017

Addressed reviewer comments and pulled changes in from #1384.

@coveralls

This comment has been minimized.

Show comment
Hide comment
@coveralls

coveralls Mar 10, 2017

Coverage Status

Coverage remained the same at 76.399% when pulling 3f1d348 on fnothaft:issues/1368-pipe-markdown into 07c1982 on bigdatagenomics:master.

coveralls commented Mar 10, 2017

Coverage Status

Coverage remained the same at 76.399% when pulling 3f1d348 on fnothaft:issues/1368-pipe-markdown into 07c1982 on bigdatagenomics:master.

@AmplabJenkins

This comment has been minimized.

Show comment
Hide comment
@AmplabJenkins

AmplabJenkins Mar 10, 2017

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/1849/
Test PASSed.

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/1849/
Test PASSed.

@heuermh

This comment has been minimized.

Show comment
Hide comment
@heuermh

heuermh Mar 10, 2017

Member

Should I rebase and merge as is or did you want to squash some of the commits first?

Member

heuermh commented Mar 10, 2017

Should I rebase and merge as is or did you want to squash some of the commits first?

@heuermh

This comment has been minimized.

Show comment
Hide comment
@heuermh

heuermh Mar 10, 2017

Member

Feel free to merge in commits from #1435 here and update 60_building_apps.md with something similar to the changes in commit 446ae81.

Member

heuermh commented Mar 10, 2017

Feel free to merge in commits from #1435 here and update 60_building_apps.md with something similar to the changes in commit 446ae81.

@fnothaft

This comment has been minimized.

Show comment
Hide comment
@fnothaft

fnothaft Mar 14, 2017

Member

Hi @heuermh! This should be good to go now. I figured that #1435 and this PR were sufficiently distinct that it made more sense to just merge #1435 into trunk separately. I wanted to pull #1384 into this since we were touching the same section of the document and would conflict otherwise. Anyways, I've got this squashed down as desired.

Member

fnothaft commented Mar 14, 2017

Hi @heuermh! This should be good to go now. I figured that #1435 and this PR were sufficiently distinct that it made more sense to just merge #1435 into trunk separately. I wanted to pull #1384 into this since we were touching the same section of the document and would conflict otherwise. Anyways, I've got this squashed down as desired.

@coveralls

This comment has been minimized.

Show comment
Hide comment
@coveralls

coveralls Mar 14, 2017

Coverage Status

Coverage remained the same at 76.396% when pulling 6ec6749 on fnothaft:issues/1368-pipe-markdown into 57264bb on bigdatagenomics:master.

coveralls commented Mar 14, 2017

Coverage Status

Coverage remained the same at 76.396% when pulling 6ec6749 on fnothaft:issues/1368-pipe-markdown into 57264bb on bigdatagenomics:master.

@AmplabJenkins

This comment has been minimized.

Show comment
Hide comment
@AmplabJenkins

AmplabJenkins Mar 14, 2017

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/1858/
Test PASSed.

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/1858/
Test PASSed.

@AmplabJenkins

This comment has been minimized.

Show comment
Hide comment
@AmplabJenkins

AmplabJenkins Mar 14, 2017

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/1859/

Build result: FAILURE

[...truncated 16 lines...] > /home/jenkins/git2/bin/git rev-parse origin/pr/1422/merge^{commit} # timeout=10 > /home/jenkins/git2/bin/git branch -a --contains 8933ec928ae71b6fd796e053ba91083c0b56f46a # timeout=10 > /home/jenkins/git2/bin/git rev-parse remotes/origin/pr/1422/merge^{commit} # timeout=10Checking out Revision 8933ec928ae71b6fd796e053ba91083c0b56f46a (origin/pr/1422/merge) > /home/jenkins/git2/bin/git config core.sparsecheckout # timeout=10 > /home/jenkins/git2/bin/git checkout -f 8933ec928ae71b6fd796e053ba91083c0b56f46aFirst time build. Skipping changelog.Triggering ADAM-prb ? 2.3.0,2.11,1.6.1,centosTriggering ADAM-prb ? 2.6.0,2.10,1.6.1,centosTriggering ADAM-prb ? 2.3.0,2.11,2.0.0,centosTriggering ADAM-prb ? 2.3.0,2.10,2.0.0,centosTriggering ADAM-prb ? 2.6.0,2.11,2.0.0,centosTriggering ADAM-prb ? 2.6.0,2.10,2.0.0,centosTriggering ADAM-prb ? 2.3.0,2.10,1.6.1,centosTriggering ADAM-prb ? 2.6.0,2.11,1.6.1,centosADAM-prb ? 2.3.0,2.11,1.6.1,centos completed with result FAILUREADAM-prb ? 2.6.0,2.10,1.6.1,centos completed with result FAILUREADAM-prb ? 2.3.0,2.11,2.0.0,centos completed with result FAILUREADAM-prb ? 2.3.0,2.10,2.0.0,centos completed with result FAILUREADAM-prb ? 2.6.0,2.11,2.0.0,centos completed with result FAILUREADAM-prb ? 2.6.0,2.10,2.0.0,centos completed with result FAILUREADAM-prb ? 2.3.0,2.10,1.6.1,centos completed with result FAILUREADAM-prb ? 2.6.0,2.11,1.6.1,centos completed with result FAILURENotifying endpoint 'HTTP:https://webhooks.gitter.im/e/ac8bb6e9f53357bc8aa8'
Test FAILed.

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/1859/

Build result: FAILURE

[...truncated 16 lines...] > /home/jenkins/git2/bin/git rev-parse origin/pr/1422/merge^{commit} # timeout=10 > /home/jenkins/git2/bin/git branch -a --contains 8933ec928ae71b6fd796e053ba91083c0b56f46a # timeout=10 > /home/jenkins/git2/bin/git rev-parse remotes/origin/pr/1422/merge^{commit} # timeout=10Checking out Revision 8933ec928ae71b6fd796e053ba91083c0b56f46a (origin/pr/1422/merge) > /home/jenkins/git2/bin/git config core.sparsecheckout # timeout=10 > /home/jenkins/git2/bin/git checkout -f 8933ec928ae71b6fd796e053ba91083c0b56f46aFirst time build. Skipping changelog.Triggering ADAM-prb ? 2.3.0,2.11,1.6.1,centosTriggering ADAM-prb ? 2.6.0,2.10,1.6.1,centosTriggering ADAM-prb ? 2.3.0,2.11,2.0.0,centosTriggering ADAM-prb ? 2.3.0,2.10,2.0.0,centosTriggering ADAM-prb ? 2.6.0,2.11,2.0.0,centosTriggering ADAM-prb ? 2.6.0,2.10,2.0.0,centosTriggering ADAM-prb ? 2.3.0,2.10,1.6.1,centosTriggering ADAM-prb ? 2.6.0,2.11,1.6.1,centosADAM-prb ? 2.3.0,2.11,1.6.1,centos completed with result FAILUREADAM-prb ? 2.6.0,2.10,1.6.1,centos completed with result FAILUREADAM-prb ? 2.3.0,2.11,2.0.0,centos completed with result FAILUREADAM-prb ? 2.3.0,2.10,2.0.0,centos completed with result FAILUREADAM-prb ? 2.6.0,2.11,2.0.0,centos completed with result FAILUREADAM-prb ? 2.6.0,2.10,2.0.0,centos completed with result FAILUREADAM-prb ? 2.3.0,2.10,1.6.1,centos completed with result FAILUREADAM-prb ? 2.6.0,2.11,1.6.1,centos completed with result FAILURENotifying endpoint 'HTTP:https://webhooks.gitter.im/e/ac8bb6e9f53357bc8aa8'
Test FAILed.

@fnothaft

This comment has been minimized.

Show comment
Hide comment
@fnothaft

fnothaft Mar 14, 2017

Member

Jenkins, retest this please.

Member

fnothaft commented Mar 14, 2017

Jenkins, retest this please.

@coveralls

This comment has been minimized.

Show comment
Hide comment
@coveralls

coveralls Mar 14, 2017

Coverage Status

Coverage remained the same at 76.396% when pulling 6ec6749 on fnothaft:issues/1368-pipe-markdown into 57264bb on bigdatagenomics:master.

coveralls commented Mar 14, 2017

Coverage Status

Coverage remained the same at 76.396% when pulling 6ec6749 on fnothaft:issues/1368-pipe-markdown into 57264bb on bigdatagenomics:master.

@AmplabJenkins

This comment has been minimized.

Show comment
Hide comment
@AmplabJenkins

AmplabJenkins Mar 14, 2017

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/1861/
Test PASSed.

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/1861/
Test PASSed.

@heuermh heuermh merged commit 1cae769 into bigdatagenomics:master Mar 14, 2017

2 checks passed

coverage/coveralls Coverage remained the same at 76.396%
Details
default Merged build finished.
Details
@heuermh

This comment has been minimized.

Show comment
Hide comment
Member

heuermh commented Mar 14, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment