Load/store sequence dictionaries alongside Genotype RDDs #909

Closed
fnothaft opened this Issue Jan 7, 2016 · 0 comments

fnothaft commented Jan 7, 2016

From discussion with @akmorrow13 and @erictu: for certain queries against an RDD[Genotype], we wind up recreating the SequenceDictionary via the ADAMSequenceDictionaryRDDAggregator. This works, but it is slow. We can take an approach similar to #906 to eliminate this problem.
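
To make the trade-off concrete, here is a minimal sketch in plain Scala collections, with no Spark or ADAM dependency. All names (`DictionarySketch`, `GenotypeRecord`, `SequenceRecord`, `GenotypesWithDictionary`) are hypothetical stand-ins, not ADAM's classes, and a `Seq` stands in for the RDD. It only illustrates the difference between rebuilding the dictionary from the records and carrying it alongside them.

```scala
object DictionarySketch {
  // Hypothetical, simplified stand-ins; these are not ADAM's actual classes.
  final case class SequenceRecord(name: String, length: Long)
  final case class GenotypeRecord(
    sample: String,
    contigName: String,
    contigLength: Long,
    position: Long)

  // Slow path: rebuild the dictionary by visiting every record,
  // analogous to running an aggregation over the whole RDD.
  def rebuildDictionary(genotypes: Seq[GenotypeRecord]): Set[SequenceRecord] =
    genotypes.foldLeft(Set.empty[SequenceRecord]) { (acc, g) =>
      acc + SequenceRecord(g.contigName, g.contigLength)
    }

  // Alternative: keep the dictionary next to the records, so a query
  // that needs it reads a field instead of scanning the data.
  final case class GenotypesWithDictionary(
    genotypes: Seq[GenotypeRecord],
    sequences: Set[SequenceRecord])
}
```

In this sketch the cost of `rebuildDictionary` grows with the number of genotype records, while reading `sequences` from `GenotypesWithDictionary` does not depend on it.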

@fnothaft fnothaft added this to the 0.19.0 milestone Jan 7, 2016

@heuermh heuermh modified the milestones: 0.20.0, 0.19.0 Feb 24, 2016

fnothaft added a commit to fnothaft/adam that referenced this issue Apr 24, 2016

[ADAM-909] Refactoring variation RDDs.
Resolves #909:

* Refactors `org.bdgenomics.adam.rdd.variation` to add `GenomicRDD`s for
  `Genotype`, `Variant`, and `VariantContext`. These classes write
  sequence and sample metadata to disk.
* Refactors `ADAMRDDFunctions` to an abstract class in preparation for
  further refactoring in #1011.
* Adds `AvroGenomicRDD` trait, which consolidates Parquet + Avro metadata
  writing code across all Avro data models.
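
As a rough illustration of writing sequence metadata to disk next to the data, the sketch below saves a dictionary as a small sidecar file and reads it back. The object name `SidecarSketch`, the file name `_sequences.tsv`, and the tab-separated layout are assumptions made for this example; the commit message does not describe the actual on-disk format. It assumes Scala 2.13 or later for `scala.jdk.CollectionConverters`.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path}
import scala.jdk.CollectionConverters._

object SidecarSketch {
  // Hypothetical, simplified stand-in for a sequence dictionary entry.
  final case class SequenceRecord(name: String, length: Long)

  // Assumed sidecar file name, chosen only for this example.
  val DictionaryFileName = "_sequences.tsv"

  // Write one "name<TAB>length" line per sequence into the dataset directory.
  def saveDictionary(dir: Path, sequences: Seq[SequenceRecord]): Unit = {
    Files.createDirectories(dir)
    val lines = sequences.map(s => s"${s.name}\t${s.length}")
    Files.write(dir.resolve(DictionaryFileName), lines.asJava, StandardCharsets.UTF_8)
  }

  // Read the sidecar file back without touching any record data.
  def loadDictionary(dir: Path): Seq[SequenceRecord] =
    Files.readAllLines(dir.resolve(DictionaryFileName), StandardCharsets.UTF_8)
      .asScala
      .toSeq
      .filter(_.nonEmpty)
      .map { line =>
        val parts = line.split("\t", 2)
        SequenceRecord(parts(0), parts(1).toLong)
      }
}
```

With a layout like this, loading the dictionary costs one small file read, regardless of how many genotype records are stored in the same directory.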

fnothaft added a commit to fnothaft/adam that referenced this issue May 20, 2016

fnothaft added a commit to fnothaft/adam that referenced this issue May 25, 2016

fnothaft added a commit to fnothaft/adam that referenced this issue May 26, 2016

fnothaft added a commit to fnothaft/adam that referenced this issue May 26, 2016

fnothaft added a commit to fnothaft/adam that referenced this issue May 26, 2016

fnothaft added a commit to fnothaft/adam that referenced this issue Jun 1, 2016

@heuermh heuermh closed this in #1015 Jun 3, 2016

@heuermh heuermh referenced this issue Jun 7, 2016

Closed

Release ADAM version 0.20.0 #1048

47 of 61 tasks complete