
Commit

[ADAM-1328] Rename Transform to TransformAlignments.
Resolves #1328.
fnothaft committed May 12, 2017
1 parent ea9ce6c commit 12d624c
Showing 11 changed files with 55 additions and 55 deletions.
10 changes: 5 additions & 5 deletions README.md
@@ -70,7 +70,7 @@ Choose one of the following commands:
ADAM ACTIONS
countKmers : Counts the k-mers/q-mers from a read dataset.
countContigKmers : Counts the k-mers/q-mers from a read dataset.
-transform : Convert SAM/BAM to ADAM format and optionally perform read pre-processing transformations
+transformAlignments : Convert SAM/BAM to ADAM format and optionally perform read pre-processing transformations
transformFeatures : Convert a file with sequence features into corresponding ADAM format and vice versa
mergeShards : Merges the shards of a file
reads2coverage : Calculate the coverage from a given ADAM file
@@ -93,7 +93,7 @@ PRINT
You can learn more about a command by calling it without arguments or with `--help`, e.g.

```
-$ adam-submit transform --help
+$ adam-submit transformAlignments --help
INPUT : The ADAM, BAM or SAM file to apply the transforms to
OUTPUT : Location to write the transformed data in ADAM/Parquet format
-add_md_tags VAL : Add MD Tags to reads based on the FASTA (or equivalent) file passed to this option.
@@ -145,7 +145,7 @@ $ adam-submit transform --help
to LENIENT
```

-The ADAM `transform` command allows you to mark duplicates, run base quality score recalibration (BQSR) and other pre-processing steps on your data.
+The ADAM `transformAlignments` command allows you to mark duplicates, run base quality score recalibration (BQSR) and other pre-processing steps on your data.

# Getting Started

@@ -209,11 +209,11 @@ These aliases call scripts that wrap the `spark-submit` and `spark-shell` comman

Now you can try running some simple ADAM commands:

-### `transform`
+### `transformAlignments`
Make your first `.adam` file like this:

````
-adam-submit transform $ADAM_HOME/adam-core/src/test/resources/small.sam /tmp/small.adam
+adam-submit transformAlignments $ADAM_HOME/adam-core/src/test/resources/small.sam /tmp/small.adam
````

If you didn't obtain your copy of ADAM from GitHub, you can [grab `small.sam` here](https://raw.githubusercontent.com/bigdatagenomics/adam/master/adam-core/src/test/resources/small.sam).
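The preprocessing that `transformAlignments` exposes on the command line is also reachable from ADAM's Scala API. A minimal sketch of marking duplicates programmatically, assuming this release's `ADAMContext` implicits and its `loadAlignments`, `markDuplicates`, and `saveAsParquet` methods (paths are placeholders):

```scala
import org.apache.spark.SparkContext
import org.bdgenomics.adam.rdd.ADAMContext._

object MarkDuplicatesExample {
  // Assumes the ADAMContext implicits and this release's loadAlignments,
  // markDuplicates, and saveAsParquet methods; paths are placeholders.
  def run(sc: SparkContext): Unit = {
    val reads = sc.loadAlignments("in.sam") // any supported read format
    val deduped = reads.markDuplicates()    // flag duplicate reads
    deduped.saveAsParquet("out.adam")       // write ADAM/Parquet output
  }
}
```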
adam-cli/src/main/scala/org/bdgenomics/adam/cli/ADAMMain.scala
@@ -34,7 +34,7 @@ object ADAMMain {
List(
CountReadKmers,
CountContigKmers,
-Transform,
+TransformAlignments,
TransformFeatures,
MergeShards,
Reads2Coverage
adam-cli/src/main/scala/org/bdgenomics/adam/cli/{Transform.scala → TransformAlignments.scala}
@@ -33,16 +33,16 @@ import org.bdgenomics.utils.cli._
import org.bdgenomics.utils.misc.Logging
import org.kohsuke.args4j.{ Argument, Option => Args4jOption }

-object Transform extends BDGCommandCompanion {
-  val commandName = "transform"
+object TransformAlignments extends BDGCommandCompanion {
+  val commandName = "transformAlignments"
val commandDescription = "Convert SAM/BAM to ADAM format and optionally perform read pre-processing transformations"

def apply(cmdLine: Array[String]) = {
-    new Transform(Args4j[TransformArgs](cmdLine))
+    new TransformAlignments(Args4j[TransformAlignmentsArgs](cmdLine))
}
}

-class TransformArgs extends Args4jBase with ADAMSaveAnyArgs with ParquetArgs {
+class TransformAlignmentsArgs extends Args4jBase with ADAMSaveAnyArgs with ParquetArgs {
@Argument(required = true, metaVar = "INPUT", usage = "The ADAM, BAM or SAM file to apply the transforms to", index = 0)
var inputPath: String = null
@Argument(required = true, metaVar = "OUTPUT", usage = "Location to write the transformed data in ADAM/Parquet format", index = 1)
@@ -91,13 +91,13 @@ class TransformArgs extends Args4jBase with ADAMSaveAnyArgs with ParquetArgs {
var forceShuffle: Boolean = false
@Args4jOption(required = false, name = "-sort_fastq_output", usage = "Sets whether to sort the FASTQ output, if saving as FASTQ. False by default. Ignored if not saving as FASTQ.")
var sortFastqOutput: Boolean = false
@Args4jOption(required = false, name = "-force_load_bam", usage = "Forces Transform to load from BAM/SAM.")
@Args4jOption(required = false, name = "-force_load_bam", usage = "Forces TransformAlignments to load from BAM/SAM.")
var forceLoadBam: Boolean = false
@Args4jOption(required = false, name = "-force_load_fastq", usage = "Forces Transform to load from unpaired FASTQ.")
@Args4jOption(required = false, name = "-force_load_fastq", usage = "Forces TransformAlignments to load from unpaired FASTQ.")
var forceLoadFastq: Boolean = false
@Args4jOption(required = false, name = "-force_load_ifastq", usage = "Forces Transform to load from interleaved FASTQ.")
@Args4jOption(required = false, name = "-force_load_ifastq", usage = "Forces TransformAlignments to load from interleaved FASTQ.")
var forceLoadIFastq: Boolean = false
@Args4jOption(required = false, name = "-force_load_parquet", usage = "Forces Transform to load from Parquet.")
@Args4jOption(required = false, name = "-force_load_parquet", usage = "Forces TransformAlignments to load from Parquet.")
var forceLoadParquet: Boolean = false
@Args4jOption(required = false, name = "-single", usage = "Saves OUTPUT as single file")
var asSingleFile: Boolean = false
@@ -123,8 +123,8 @@ class TransformArgs extends Args4jBase with ADAMSaveAnyArgs with ParquetArgs {
var storageLevel: String = "MEMORY_ONLY"
}

-class Transform(protected val args: TransformArgs) extends BDGSparkCommand[TransformArgs] with Logging {
-  val companion = Transform
+class TransformAlignments(protected val args: TransformAlignmentsArgs) extends BDGSparkCommand[TransformAlignmentsArgs] with Logging {
+  val companion = TransformAlignments

val stringency = ValidationStringency.valueOf(args.stringency)

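Because the companion's `apply(cmdLine: Array[String])` signature is unchanged, only the name changes for programmatic callers. A minimal sketch mirroring the `TransformAlignments(Array(...)).run(sc)` pattern used in the updated test suites below (paths are placeholders; `sc` is an existing `SparkContext`):

```scala
import org.apache.spark.SparkContext
import org.bdgenomics.adam.cli.TransformAlignments

object TransformAlignmentsExample {
  // Build the command from CLI-style arguments and run it on an existing
  // SparkContext, exactly as the updated test suites below do.
  def run(sc: SparkContext): Unit = {
    // "-single" writes the output as a single file; paths are placeholders.
    TransformAlignments(Array("-single", "/tmp/in.sam", "/tmp/out.adam")).run(sc)
  }
}
```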
adam-cli/src/test/scala/org/bdgenomics/adam/cli/ADAMMainSuite.scala
@@ -61,7 +61,7 @@ class ADAMMainSuite extends FunSuite {
test("single command group") {
val stream = new ByteArrayOutputStream()
Console.withOut(stream) {
-new ADAMMain(List(CommandGroup("SINGLE COMMAND GROUP", List(Transform)))).apply(Array())
+new ADAMMain(List(CommandGroup("SINGLE COMMAND GROUP", List(TransformAlignments)))).apply(Array())
}
val out = stream.toString()
assert(out.contains("Usage"))
@@ -72,7 +72,7 @@ class ADAMMainSuite extends FunSuite {
test("add new command group to default command groups") {
val stream = new ByteArrayOutputStream()
Console.withOut(stream) {
val commandGroups = defaultCommandGroups.union(List(CommandGroup("NEW COMMAND GROUP", List(Transform))))
val commandGroups = defaultCommandGroups.union(List(CommandGroup("NEW COMMAND GROUP", List(TransformAlignments))))
new ADAMMain(commandGroups)(Array())
}
val out = stream.toString()
@@ -97,7 +97,7 @@ class ADAMMainSuite extends FunSuite {
Console.withOut(stream) {
val module = new AbstractModule with ScalaModule {
def configure() = {
bind[List[CommandGroup]].toInstance(List(CommandGroup("SINGLE COMMAND GROUP", List(Transform))))
bind[List[CommandGroup]].toInstance(List(CommandGroup("SINGLE COMMAND GROUP", List(TransformAlignments))))
}
}
val injector = Guice.createInjector(module)
@@ -115,7 +115,7 @@ class ADAMMainSuite extends FunSuite {
Console.withOut(stream) {
val module = new AbstractModule with ScalaModule {
def configure() = {
-bind[List[CommandGroup]].toInstance(defaultCommandGroups.union(List(CommandGroup("NEW COMMAND GROUP", List(Transform)))))
+bind[List[CommandGroup]].toInstance(defaultCommandGroups.union(List(CommandGroup("NEW COMMAND GROUP", List(TransformAlignments)))))
}
}
val injector = Guice.createInjector(module)
adam-cli/src/test/scala/org/bdgenomics/adam/cli/MergeShardsSuite.scala
@@ -26,7 +26,7 @@ class MergeShardsSuite extends ADAMFunSuite {
val inputPath = copyResource("unordered.sam")
val actualPath = tmpFile("unordered.sam")
val expectedPath = inputPath
Transform(Array("-single", "-defer_merging", inputPath, actualPath)).run(sc)
TransformAlignments(Array("-single", "-defer_merging", inputPath, actualPath)).run(sc)
MergeShards(Array(actualPath + "_tail", actualPath,
"-header_path", actualPath + "_head")).run(sc)
checkFiles(expectedPath, actualPath)
@@ -36,7 +36,7 @@ class MergeShardsSuite extends ADAMFunSuite {
val inputPath = copyResource("unordered.sam")
val actualPath = tmpFile("ordered.sam")
val expectedPath = copyResource("ordered.sam")
Transform(Array("-single",
TransformAlignments(Array("-single",
"-sort_reads",
"-sort_lexicographically",
"-defer_merging",
@@ -49,7 +49,7 @@ class MergeShardsSuite extends ADAMFunSuite {
sparkTest("merge sharded bam") {
val inputPath = copyResource("unordered.sam")
val actualPath = tmpFile("unordered.bam")
Transform(Array("-single",
TransformAlignments(Array("-single",
"-defer_merging",
inputPath, actualPath)).run(sc)
MergeShards(Array(actualPath + "_tail", actualPath,
@@ -66,7 +66,7 @@ class MergeShardsSuite extends ADAMFunSuite {
println(referencePath)

val actualPath = tmpFile("artificial.cram")
Transform(Array("-single",
TransformAlignments(Array("-single",
"-sort_reads",
"-sort_lexicographically",
"-defer_merging",
adam-cli/src/test/scala/org/bdgenomics/adam/cli/Reads2FragmentsSuite.scala
@@ -28,7 +28,7 @@ class Reads2FragmentsSuite extends ADAMFunSuite {
val expectedPath = copyResource("ordered.sam")
Reads2Fragments(Array(inputPath, fragmentsPath)).run(sc)
Fragments2Reads(Array(fragmentsPath, readsPath)).run(sc)
Transform(Array("-single", "-sort_reads", "-sort_lexicographically",
TransformAlignments(Array("-single", "-sort_reads", "-sort_lexicographically",
readsPath, actualPath)).run(sc)
checkFiles(expectedPath, actualPath)
}
adam-cli/src/test/scala/org/bdgenomics/adam/cli/{TransformSuite.scala → TransformAlignmentsSuite.scala}
@@ -19,20 +19,20 @@ package org.bdgenomics.adam.cli

import org.bdgenomics.adam.util.ADAMFunSuite

-class TransformSuite extends ADAMFunSuite {
+class TransformAlignmentsSuite extends ADAMFunSuite {
sparkTest("unordered sam to unordered sam") {
val inputPath = copyResource("unordered.sam")
val actualPath = tmpFile("unordered.sam")
val expectedPath = inputPath
Transform(Array("-single", inputPath, actualPath)).run(sc)
TransformAlignments(Array("-single", inputPath, actualPath)).run(sc)
checkFiles(expectedPath, actualPath)
}

sparkTest("unordered sam to ordered sam") {
val inputPath = copyResource("unordered.sam")
val actualPath = tmpFile("ordered.sam")
val expectedPath = copyResource("ordered.sam")
Transform(Array("-single", "-sort_reads", "-sort_lexicographically", inputPath, actualPath)).run(sc)
TransformAlignments(Array("-single", "-sort_reads", "-sort_lexicographically", inputPath, actualPath)).run(sc)
checkFiles(expectedPath, actualPath)
}

@@ -41,8 +41,8 @@ class TransformSuite extends ADAMFunSuite {
val intermediateAdamPath = tmpFile("unordered.adam")
val actualPath = tmpFile("unordered.sam")
val expectedPath = inputPath
-Transform(Array(inputPath, intermediateAdamPath)).run(sc)
-Transform(Array("-single", intermediateAdamPath, actualPath)).run(sc)
+TransformAlignments(Array(inputPath, intermediateAdamPath)).run(sc)
+TransformAlignments(Array("-single", intermediateAdamPath, actualPath)).run(sc)
checkFiles(expectedPath, actualPath)
}

@@ -51,8 +51,8 @@ class TransformSuite extends ADAMFunSuite {
val intermediateAdamPath = tmpFile("unordered.adam")
val actualPath = tmpFile("ordered.sam")
val expectedPath = copyResource("ordered.sam")
-Transform(Array(inputPath, intermediateAdamPath)).run(sc)
-Transform(Array("-single", "-sort_reads", "-sort_lexicographically", intermediateAdamPath, actualPath)).run(sc)
+TransformAlignments(Array(inputPath, intermediateAdamPath)).run(sc)
+TransformAlignments(Array("-single", "-sort_reads", "-sort_lexicographically", intermediateAdamPath, actualPath)).run(sc)
checkFiles(expectedPath, actualPath)
}
}
adam-cli/src/test/scala/org/bdgenomics/adam/cli/ViewSuite.scala
@@ -32,8 +32,8 @@ class ViewSuite extends ADAMFunSuite {
sparkBefore("initialize 'reads' Array from flag-values.sam") {

val transform =
-new Transform(
-  Args4j[TransformArgs](
+new TransformAlignments(
+  Args4j[TransformAlignmentsArgs](
Array(
inputSamPath,
"unused_output_path"
4 changes: 2 additions & 2 deletions docs/source/01_intro.md
@@ -140,7 +140,7 @@ PRINT
You can learn more about a command by calling it without arguments or with `--help`, e.g.

```
-$ adam-submit transform
+$ adam-submit transformAlignments
Argument "INPUT" is required
INPUT : The ADAM, BAM or SAM file to apply the transforms to
OUTPUT : Location to write the transformed data in ADAM/Parquet format
@@ -191,7 +191,7 @@ Argument "INPUT" is required
to LENIENT
```

-The ADAM transform command allows you to mark duplicates, run base quality score recalibration (BQSR) and other pre-processing steps on your data.
+The ADAM transformAlignments command allows you to mark duplicates, run base quality score recalibration (BQSR) and other pre-processing steps on your data.

There are also a number of projects built on ADAM, e.g.

14 changes: 7 additions & 7 deletions docs/source/40_deploying_ADAM.md
@@ -96,7 +96,7 @@ spark-master using `scp` and then copy to HDFS using
From the ADAM shell, or as a parameter to ADAM submit, you would refer to HDFS
URLs such as:
```
-adam-submit transform hdfs://spark-master/work_dir/sample1.bam \
+adam-submit transformAlignments hdfs://spark-master/work_dir/sample1.bam \
hdfs://spark-master/work_dir/sample1.adam
```

@@ -200,9 +200,9 @@ can cause jobs to fail. To eliminate this issue, you can set the
resource request to YARN over the JVM Heap size indicated by `--driver-memory`
or `--executor-memory`.

-As a final example, to run the ADAM [transform](#transform) CLI using YARN
-cluster mode on a 64 node cluster with one executor per node and a 2GB per
-executor overhead, we would run:
+As a final example, to run the ADAM [transformAlignments](#transformAlignments)
+CLI using YARN cluster mode on a 64 node cluster with one executor per node and
+a 2GB per executor overhead, we would run:

```
./bin/adam-submit \
@@ -215,7 +215,7 @@ executor overhead, we would run:
--conf spark.yarn.executor.memoryOverhead=2048 \
--conf spark.executor.instances=64 \
-- \
-transform in.sam out.adam
+transformAlignments in.sam out.adam
```

In this example, we are allocating 200GB of JVM heap space per executor and for
@@ -258,7 +258,7 @@ include:
this workflow was demonstrated in [@vivian16] and sets up a Spark cluster
which then runs ADAM's [`countKmers` CLI](#countKmers).
* [adam-pipeline](https://github.com/BD2KGenomics/toil-scripts/tree/master/src/toil_scripts/adam_pipeline):
-this workflow runs several stages in the ADAM [`transform` CLI](#transform).
+this workflow runs several stages in the ADAM [`transformAlignments` CLI](#transformAlignments).
This pipeline is the ADAM equivalent to the GATK's "Best Practice" read
preprocessing pipeline. We then stitch together this pipeline with
[BWA-MEM](https://github.com/lh3/bwa) and the GATK in the [adam-gatk-pipeline](
@@ -446,7 +446,7 @@ does the following work:
# convert the file
_log.info('Converting %s into ADAM format at %s.', hdfs_tmp_file, hdfs_input_file)
call_adam(master_ip,
-['transform',
+['transformAlignments',
hdfs_tmp_file, hdfs_input_file],
memory=memory, override_parameters=spark_conf)
```
24 changes: 12 additions & 12 deletions docs/source/50_cli.md
@@ -101,22 +101,22 @@ Beyond the [default options](#default-args), both `countKmers` and
* `-print_histogram`: If provided, prints a histogram of the $k$-mer count
distribution to standard out.

-### transform {#transform}
+### transformAlignments {#transformAlignments}

-The `transform` CLI is the entrypoint to ADAM's read preprocessing tools. This
-command provides drop-in replacement commands for several commands in the
-[Genome Analysis Toolkit](https://software.broadinstitute.org/gatk/) "Best
-Practices" read preprocessing pipeline and more [@depristo11]. This CLI tool
-takes two required arguments:
+The `transformAlignments` CLI is the entrypoint to ADAM's read preprocessing
+tools. This command provides drop-in replacement commands for several commands
+in the [Genome Analysis Toolkit](https://software.broadinstitute.org/gatk/)
+"Best Practices" read preprocessing pipeline and more [@depristo11]. This CLI
+tool takes two required arguments:

1. `INPUT`: The input path. A file containing reads in any of the supported
ADAM read input formats.
2. `OUTPUT`: The path to save the transformed reads to. Supports any of ADAM's
read output formats.

Beyond the [default options](#default-args) and the [legacy output
-options](#legacy-output), `transform` supports a vast range of options. These
-options fall into several general categories:
+options](#legacy-output), `transformAlignments` supports a vast range of options.
+These options fall into several general categories:

* General options:
* `-cache`: If provided, the results of intermediate stages will be cached.
@@ -197,7 +197,7 @@ options fall into several general categories:
fragment to load. Defaults to 10,000bp.
* `-md_tag_overwrite`: If provided, recomputes and overwrites the
`mismatchingPositions` field for records where this field was provided.
-* Output options: `transform` supports the [legacy output](#legacy-output)
+* Output options: `transformAlignments` supports the [legacy output](#legacy-output)
options. Additionally, there are the following options:
* `-coalesce`: Sets the number of partitions to coalesce the output to.
If `-force_shuffle_coalesce` is not provided, the Spark engine may ignore
@@ -392,9 +392,9 @@ options](#default-args). Additionally, `adam2fasta` takes the following options:

### adam2fastq

-While the [`transform`](#transform) command can export to FASTQ, the
-`adam2fastq` provides a simpler CLI with more output options. `adam2fastq`
-takes two required arguments and an optional third argument:
+While the [`transformAlignments`](#transformAlignments) command can export to
+FASTQ, the `adam2fastq` provides a simpler CLI with more output options.
+`adam2fastq` takes two required arguments and an optional third argument:

1. `INPUT`: The input read file, in any ADAM-supported read format.
2. `OUTPUT`: The path to save an unpaired or interleaved FASTQ file to, or the
