Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

removing all google genomics API dependencies #4266

Merged
merged 3 commits into from Feb 12, 2018

Conversation

lbergelson
Copy link
Member

the google genomics API has deprecated all the features we were using,
this includes the reference lookup api, and the google Read data types

removing all google genomics related dependencies

  • replacing com.google.cloud.genomics:gatk-tools-java:1.1 with gov.nist.math.jama:gov.nist.math.jama:1.1.1
    we rely on this transitive dependency, making it a direct dependency instead

  • remove com.google.apis:google-api-services-genomics:v1-rev527-1.22.0

  • remove com.google.cloud.genomics:google-genomics-utils:v1-0.10

  • delete ReferenceAPISource and tests

  • delete GoogleGenomicsReadToGATKReadAdapter and tests

  • delete CigarConversionUtils and tests

  • update other classes to remove references to these types

  • improve an error message

@@ -49,8 +49,9 @@ public ReferenceMultiSource(final String referenceURL,
} else {
referenceSource = new ReferenceFileSource(referenceURL);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Completely unrelated question, but something that I realized reviewing this: is the separation of ReferenceFileSource and ReferenceHadoopSource really necessary? It looks like the java.nio.Path implementation would take care of Hadoop stuff (just for a different PR, but maybe worthy to open an issue about it)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's probably redundant like you say. We should take a look at it. #4279

Copy link
Contributor

@magicDGS magicDGS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some questions about the tests.

@@ -28,7 +27,7 @@
@DataProvider(name = "bases")
public Object[][] bases(){
Object[][] data = new Object[2][];
List<Class<?>> classes = Arrays.asList(Read.class, SAMRecord.class);
List<Class<?>> classes = Arrays.asList(SAMRecord.class, SAMRecord.class);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be a singleton list now, at least until there is no other implementation.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, good call, I just replaced read -> samrecord, I wasn't paying close enough attention here

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

opening a new ticket to simplify these if we're only going to have 1 read type #4318

@@ -23,7 +22,7 @@
List<Object[]> testCases = new ArrayList<>();

for ( JoinStrategy joinStrategy : JoinStrategy.values() ) {
for ( Class<?> readImplementation : Arrays.asList(Read.class, SAMRecord.class) ) {
for ( Class<?> readImplementation : Arrays.asList(SAMRecord.class, SAMRecord.class) ) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -149,7 +148,7 @@ public void testBQSRSpark(BQSRTest params) throws IOException {
public Object[][] createBQSRCloudTestData() {
final String localResources = getResourceDir();

final String GRCh37RefCloud = ReferenceAPISource.URL_PREFIX + ReferenceAPISource.GRCH37_REF_ID;
final String GRCh37RefCloud = GCS_b37_CHR20_21_REFERENCE;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couldn't you remove the GRCh37RefCloud variable and use the GCS_b37_CHR20_21_REFERENCE directly?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to keep this because it matches the style of the rest of the test and on the off chance that we want to change this reference path again here.

@@ -211,7 +209,7 @@ public void testBlowUpOnBroadcastIncompatibleReference() throws IOException {
//This data provider is for tests that use BAM files stored in buckets
@DataProvider(name = "BQSRTestBucket")
public Object[][] createBQSRTestDataBucket() {
final String GRCh37RefCloud = ReferenceAPISource.URL_PREFIX + ReferenceAPISource.GRCH37_REF_ID;
final String GRCh37RefCloud = GCS_b37_CHR20_21_REFERENCE;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as before

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above

@@ -61,8 +60,7 @@ private String getCloudInputs() {
@DataProvider(name = "BQSRTest")
public Object[][] createBQSRTestData() {
final String localResources = getResourceDir();
final String hg19Ref = ReferenceAPISource.HG19_REF_ID;
final String GRCh37Ref = ReferenceAPISource.URL_PREFIX + ReferenceAPISource.GRCH37_REF_ID;
final String GRCh37Ref = GCS_b37_CHR20_21_REFERENCE;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as before

@@ -90,7 +88,7 @@ private String getCloudInputs() {

@DataProvider(name = "BQSRTestBucket")
public Object[][] createBQSRTestDataBucket() {
final String GRCh37Ref = ReferenceAPISource.URL_PREFIX + ReferenceAPISource.GRCH37_REF_ID;
final String GRCh37Ref = GCS_b37_CHR20_21_REFERENCE;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as before

@droazen droazen self-assigned this Jan 26, 2018
@droazen droazen self-requested a review January 26, 2018 13:33
@codecov-io
Copy link

Codecov Report

Merging #4266 into master will decrease coverage by 0.062%.
The diff coverage is 33.333%.

@@               Coverage Diff               @@
##              master     #4266       +/-   ##
===============================================
- Coverage     79.088%   79.026%   -0.062%     
+ Complexity     16588     16372      -216     
===============================================
  Files           1048      1045        -3     
  Lines          59506     58982      -524     
  Branches        9718      9634       -84     
===============================================
- Hits           47062     46611      -451     
+ Misses          8666      8628       -38     
+ Partials        3778      3743       -35
Impacted Files Coverage Δ Complexity Δ
...institute/hellbender/exceptions/UserException.java 71.318% <ø> (+0.207%) 3 <0> (ø) ⬇️
...ute/hellbender/utils/read/ArtificialReadUtils.java 93.478% <ø> (-0.616%) 67 <0> (-3)
...lbender/utils/read/SAMRecordToGATKReadAdapter.java 92.43% <ø> (-0.03%) 135 <0> (-1)
...broadinstitute/hellbender/utils/read/GATKRead.java 31.25% <ø> (-37.5%) 7 <0> (-6)
...itute/hellbender/engine/spark/GATKRegistrator.java 100% <ø> (ø) 2 <0> (ø) ⬇️
...dinstitute/hellbender/utils/pileup/ReadPileup.java 91.946% <ø> (ø) 64 <0> (ø) ⬇️
...utils/test/ReadsPreprocessingPipelineTestData.java 0.862% <0%> (+0.015%) 1 <1> (ø) ⬇️
...ender/engine/datasources/ReferenceMultiSource.java 61.538% <0%> (-15.385%) 9 <0> (-2)
...bender/engine/datasources/ReferenceFileSource.java 88.462% <100%> (ø) 9 <0> (ø) ⬇️
...titute/hellbender/engine/spark/JsonSerializer.java 0% <0%> (-63.636%) 0% <0%> (-4%)
... and 6 more

@@ -33,7 +32,7 @@
public class AddContextDataToReadSparkUnitTest extends GATKBaseTest {
@DataProvider(name = "bases")
public Object[][] bases() {
List<Class<?>> classes = Arrays.asList(Read.class, SAMRecord.class);
List<Class<?>> classes = Arrays.asList(SAMRecord.class, SAMRecord.class);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Specify SAMRecord only once.

emptyCigarRead.getAlignment().setCigar(null);

return new Object[][]{
return new Object[][]{
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indentation is off here

noRGGoogleRead.setReadGroupId(null);

return new Object[][] {
return new Object[][] {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here as well

Copy link
Collaborator

@droazen droazen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few trivial remaining comments, then good to merge 👍

@droazen droazen assigned lbergelson and unassigned droazen Feb 12, 2018
the google genomics API has deprecated all the features we were using,
this includes the reference lookup api, and the google Read data types

removing all google genomics related dependencies
* replacing com.google.cloud.genomics:gatk-tools-java:1.1 with gov.nist.math.jama:gov.nist.math.jama:1.1.1
	we rely on this transitive dependency, making it a direct
	dependency
* remove com.google.apis:google-api-services-genomics:v1-rev527-1.22.0
* remove com.google.cloud.genomics:google-genomics-utils:v1-0.10

* delete ReferenceAPISource and tests
* delete GoogleGenomicsReadToGATKReadAdapter and tests
* delete CigarConversionUtils and tests

* update other classes to remove references to these types
* improve an error message
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants