removing all google genomics API dependencies #4266
@@ -49,8 +49,9 @@ public ReferenceMultiSource(final String referenceURL,
 } else {
 referenceSource = new ReferenceFileSource(referenceURL);
Completely unrelated question, but something I realized while reviewing this: is the separation of ReferenceFileSource and ReferenceHadoopSource really necessary? It looks like the java.nio.file.Path implementation would take care of the Hadoop stuff (that is for a different PR, but it may be worth opening an issue about it).
I think it's probably redundant like you say. We should take a look at it. #4279
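To make the point about `java.nio.file.Path` concrete, here is a minimal sketch, not the GATK implementation. The class `NioReferenceSketch` and its methods are hypothetical names. It assumes that a `FileSystemProvider` for the relevant scheme (for example an HDFS one) is installed on the classpath; only the built-in `file` scheme is exercised here.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class NioReferenceSketch {

    // Hypothetical helper (not GATK API): map a reference string to a Path.
    // Strings carrying a scheme go through Paths.get(URI), which dispatches to
    // whichever FileSystemProvider is installed for that scheme; anything else
    // is treated as a plain local path.
    public static Path toPath(final String reference) {
        return reference.contains("://")
                ? Paths.get(URI.create(reference))
                : Paths.get(reference);
    }

    // Reads the whole reference through the same code path, whatever the scheme.
    public static String readReference(final String reference) {
        try {
            return new String(Files.readAllBytes(toPath(reference)), StandardCharsets.UTF_8);
        } catch (final IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Test fixture: writes contents to a temp file and returns its file URI.
    public static String writeTempReference(final String contents) {
        try {
            final Path tmp = Files.createTempFile("reference", ".fasta");
            tmp.toFile().deleteOnExit();
            Files.write(tmp, contents.getBytes(StandardCharsets.UTF_8));
            return tmp.toUri().toString();
        } catch (final IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(final String[] args) {
        final String uri = writeTempReference(">chr20\nACGT\n");
        System.out.print(readReference(uri));
    }
}
```

If no provider is registered for a scheme, `Paths.get(URI)` throws `FileSystemNotFoundException`, so a single source class would only be viable where the Hadoop provider is guaranteed to be present.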
Some questions about the tests.
@@ -28,7 +27,7 @@
 @DataProvider(name = "bases")
 public Object[][] bases(){
 Object[][] data = new Object[2][];
-List<Class<?>> classes = Arrays.asList(Read.class, SAMRecord.class);
+List<Class<?>> classes = Arrays.asList(SAMRecord.class, SAMRecord.class);
This should be a singleton list now, at least as long as there is no other implementation.
Oh, good call. I just replaced Read with SAMRecord and wasn't paying close enough attention here.
Opening a new ticket to simplify these if we're only going to have one read type: #4318
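A small sketch of why the duplicated entry matters, assuming a data provider that emits one case per (class, bases) pair. `SamRecordStandIn` is a hypothetical placeholder for htsjdk's `SAMRecord` so the example compiles on its own, and `buildCases` is not the actual test code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class SingletonProviderSketch {

    // Hypothetical stand-in for htsjdk's SAMRecord, used only for illustration.
    public static final class SamRecordStandIn { }

    // Mimics a data provider body: one test case per (class, bases) pair.
    public static Object[][] buildCases(final List<Class<?>> classes, final List<String> bases) {
        final List<Object[]> cases = new ArrayList<>();
        for (final Class<?> readClass : classes) {
            for (final String b : bases) {
                cases.add(new Object[]{readClass, b});
            }
        }
        return cases.toArray(new Object[0][]);
    }

    // What a mechanical Read -> SAMRecord replacement produces: every case runs twice.
    public static Object[][] duplicated() {
        return buildCases(
                Arrays.<Class<?>>asList(SamRecordStandIn.class, SamRecordStandIn.class),
                Arrays.asList("ACGT", "TTTT"));
    }

    // The reviewer's suggestion: a singleton list, so each case runs once.
    public static Object[][] singleton() {
        return buildCases(
                Collections.<Class<?>>singletonList(SamRecordStandIn.class),
                Arrays.asList("ACGT", "TTTT"));
    }

    public static void main(final String[] args) {
        System.out.println("duplicated cases: " + duplicated().length);
        System.out.println("singleton cases: " + singleton().length);
    }
}
```

With two base strings, the duplicated list yields four cases and the singleton list yields two, so the duplicate only adds run time without adding coverage.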
@@ -23,7 +22,7 @@
 List<Object[]> testCases = new ArrayList<>();

 for ( JoinStrategy joinStrategy : JoinStrategy.values() ) {
-for ( Class<?> readImplementation : Arrays.asList(Read.class, SAMRecord.class) ) {
+for ( Class<?> readImplementation : Arrays.asList(SAMRecord.class, SAMRecord.class) ) {
Same here.
done
@@ -149,7 +148,7 @@ public void testBQSRSpark(BQSRTest params) throws IOException {
 public Object[][] createBQSRCloudTestData() {
 final String localResources = getResourceDir();

-final String GRCh37RefCloud = ReferenceAPISource.URL_PREFIX + ReferenceAPISource.GRCH37_REF_ID;
+final String GRCh37RefCloud = GCS_b37_CHR20_21_REFERENCE;
Couldn't you remove the GRCh37RefCloud variable and use GCS_b37_CHR20_21_REFERENCE directly?
I'm going to keep this because it matches the style of the rest of the test, and it helps on the off chance that we want to change this reference path here again.
@@ -211,7 +209,7 @@ public void testBlowUpOnBroadcastIncompatibleReference() throws IOException {
 //This data provider is for tests that use BAM files stored in buckets
 @DataProvider(name = "BQSRTestBucket")
 public Object[][] createBQSRTestDataBucket() {
-final String GRCh37RefCloud = ReferenceAPISource.URL_PREFIX + ReferenceAPISource.GRCH37_REF_ID;
+final String GRCh37RefCloud = GCS_b37_CHR20_21_REFERENCE;
Same as before
same as above
@@ -61,8 +60,7 @@ private String getCloudInputs() {
 @DataProvider(name = "BQSRTest")
 public Object[][] createBQSRTestData() {
 final String localResources = getResourceDir();
-final String hg19Ref = ReferenceAPISource.HG19_REF_ID;
-final String GRCh37Ref = ReferenceAPISource.URL_PREFIX + ReferenceAPISource.GRCH37_REF_ID;
+final String GRCh37Ref = GCS_b37_CHR20_21_REFERENCE;
Same as before
@@ -90,7 +88,7 @@ private String getCloudInputs() {

 @DataProvider(name = "BQSRTestBucket")
 public Object[][] createBQSRTestDataBucket() {
-final String GRCh37Ref = ReferenceAPISource.URL_PREFIX + ReferenceAPISource.GRCH37_REF_ID;
+final String GRCh37Ref = GCS_b37_CHR20_21_REFERENCE;
Same as before
ca175e4 to 606ed63
Codecov Report
@@ Coverage Diff @@
## master #4266 +/- ##
===============================================
- Coverage 79.088% 79.026% -0.062%
+ Complexity 16588 16372 -216
===============================================
Files 1048 1045 -3
Lines 59506 58982 -524
Branches 9718 9634 -84
===============================================
- Hits 47062 46611 -451
+ Misses 8666 8628 -38
+ Partials 3778 3743 -35
@@ -33,7 +32,7 @@
 public class AddContextDataToReadSparkUnitTest extends GATKBaseTest {
 @DataProvider(name = "bases")
 public Object[][] bases() {
-List<Class<?>> classes = Arrays.asList(Read.class, SAMRecord.class);
+List<Class<?>> classes = Arrays.asList(SAMRecord.class, SAMRecord.class);
Specify SAMRecord only once.
 emptyCigarRead.getAlignment().setCigar(null);

-return new Object[][]{
+return new Object[][]{
Indentation is off here
 noRGGoogleRead.setReadGroupId(null);

-return new Object[][] {
+return new Object[][] {
Here as well
A few trivial remaining comments, then good to merge 👍
the google genomics API has deprecated all the features we were using, this includes the reference lookup api, and the google Read data types
removing all google genomics related dependencies
* replacing com.google.cloud.genomics:gatk-tools-java:1.1 with gov.nist.math.jama:gov.nist.math.jama:1.1.1
  we rely on this transitive dependency, making it a direct dependency
* remove com.google.apis:google-api-services-genomics:v1-rev527-1.22.0
* remove com.google.cloud.genomics:google-genomics-utils:v1-0.10
* delete ReferenceAPISource and tests
* delete GoogleGenomicsReadToGATKReadAdapter and tests
* delete CigarConversionUtils and tests
* update other classes to remove references to these types
* improve an error message
606ed63 to 1c5b36b
The Google Genomics API has deprecated all the features we were using, including the reference lookup API and the Google Read data types. This PR removes all Google Genomics related dependencies:
* replace com.google.cloud.genomics:gatk-tools-java:1.1 with gov.nist.math.jama:gov.nist.math.jama:1.1.1 (we rely on this transitive dependency, so it becomes a direct dependency instead)
* remove com.google.apis:google-api-services-genomics:v1-rev527-1.22.0
* remove com.google.cloud.genomics:google-genomics-utils:v1-0.10
* delete ReferenceAPISource and tests
* delete GoogleGenomicsReadToGATKReadAdapter and tests
* delete CigarConversionUtils and tests
* update other classes to remove references to these types
* improve an error message