Fix IndexFactory.writeIndex for tabix indexes #683

Merged
merged 5 commits into from Sep 16, 2016

Conversation

Projects
None yet
4 participants
Contributor

magicDGS commented Aug 12, 2016

No description provided.

Coverage Status

Coverage increased (+0.001%) to 68.808% when pulling 275fa9d on magicDGS:dgs_issue_430 into c8202f2 on samtools:master.

@yfarjoun yfarjoun and 1 other commented on an outdated diff Aug 23, 2016

src/main/java/htsjdk/tribble/index/IndexFactory.java
@@ -292,7 +294,9 @@ public static LinearIndex createLinearIndex(final File inputFile, final FeatureC
public static void writeIndex(final Index idx, final File idxFile) throws IOException {
LittleEndianOutputStream stream = null;
try {
- stream = new LittleEndianOutputStream(new BufferedOutputStream(new FileOutputStream(idxFile)));
+ final OutputStream underlyingStream = (idx instanceof TabixIndex) ?
+ new BlockCompressedOutputStream(idxFile) : new FileOutputStream(idxFile);
@yfarjoun

yfarjoun Aug 23, 2016

Contributor

would it be possible for the index to know what type of outputstream it needs and either open it for the factory, or write itself to that stream (given a file). it seems that it's more appropriate to give that decision to the index rather than to the factory.

also: tests please.

@magicDGS

magicDGS Aug 24, 2016

Contributor

I was thinking about this before, because actually it makes more sense to have a Index.write(File idxFile) method instead of the factory. But I decided to go for this to maintain compatibility. In addition, the javadoc said that "It is the responsibility of this class to determine and create the correct index type from the input file or stream.", so that's why I did it like this.

Contributor

magicDGS commented Aug 24, 2016

@yfarjoun, I don't know how to do tests for checking this...

Coverage Status

Coverage increased (+0.07%) to 68.879% when pulling db82f9b on magicDGS:dgs_issue_430 into c8202f2 on samtools:master.

@yfarjoun yfarjoun commented on an outdated diff Sep 1, 2016

src/main/java/htsjdk/tribble/index/Index.java
@@ -26,6 +26,7 @@
import htsjdk.tribble.util.LittleEndianOutputStream;
import java.io.File;
+import java.io.FileNotFoundException;
@yfarjoun

yfarjoun Sep 1, 2016

Contributor

where is this used?

@yfarjoun yfarjoun commented on an outdated diff Sep 1, 2016

src/main/java/htsjdk/tribble/index/AbstractIndex.java
@@ -25,6 +25,7 @@
import java.io.BufferedOutputStream;
import java.io.File;
+import java.io.FileNotFoundException;
@yfarjoun

yfarjoun Sep 1, 2016

Contributor

doesn't seem to be used

Contributor

yfarjoun commented Sep 1, 2016

couldn't you test that you can write to a file (and an index is created) and write to a stream (and no exception is thrown)?

Coverage Status

Coverage increased (+0.1%) to 68.919% when pulling f90b54c on magicDGS:dgs_issue_430 into c8202f2 on samtools:master.

Contributor

magicDGS commented Sep 1, 2016

@yfarjoun, I added a test to assert that write(File idxFile) is writing a file, and that write( LittleEndianOutputStream stream) to a null stream is not throwing an error. This was done for both LinearIndex and TabixIndex. Are they enough?

Contributor

magicDGS commented Sep 1, 2016

I also squashed the commits. I think that now that the writing to the file is inside the index, the method IndexFatory.writeIndex(final Index idx, final File idxFile) could be deprecated. If someone already have the index file, idx.write(idxFile) could be used...

Coverage Status

Coverage increased (+0.004%) to 68.925% when pulling e785e1e on magicDGS:dgs_issue_430 into c3d5a88 on samtools:master.

@magicDGS magicDGS fix #430
e599f9b

Coverage Status

Coverage decreased (-0.002%) to 68.919% when pulling e599f9b on magicDGS:dgs_issue_430 into c3d5a88 on samtools:master.

Contributor

magicDGS commented Sep 9, 2016

Can you have a look to this, @yfarjoun? It is important for other PR that I'm preparing...

yfarjoun was assigned by droazen Sep 13, 2016

magicDGS changed the title from Fix #430 to Fix IndexFactory.writeIndex for tabix indexes Sep 13, 2016

Contributor

magicDGS commented Sep 13, 2016

Could you have a look, @yfarjoun? Thanks in advance!

- new LittleEndianOutputStream(new BufferedOutputStream(new FileOutputStream(Tribble.indexFile(featureFile))));
- write(idxStream);
- idxStream.close();
+ new LittleEndianOutputStream(new BufferedOutputStream(new FileOutputStream(idxFile)));
@yfarjoun

yfarjoun Sep 14, 2016

Contributor

extra indent?

@magicDGS

magicDGS Sep 15, 2016

Contributor

It comes from the previous version, but I've just put everything in a single line.

Contributor

yfarjoun commented Sep 14, 2016

@lbergelson could you take a look? we originally reviewed this together.

@magicDGS magicDGS re-format line
39a45ce

Coverage Status

Coverage increased (+0.06%) to 68.983% when pulling 39a45ce on magicDGS:dgs_issue_430 into c3d5a88 on samtools:master.

@lbergelson

@magicDGS Thanks for adding tests. I have a few comments about structure, and I want to get @yfarjoun's comments about what to do with tribble vs ioexceptions.

@@ -343,13 +343,19 @@ public void write(final LittleEndianOutputStream stream) throws IOException {
}
@Override
+ public void write(final File idxFile) throws IOException {
@lbergelson

lbergelson Sep 15, 2016

Contributor

@magicDGS Sorry, I didn't notice this before, but this would probably be cleaner as a try-with-resources instead of a try/finally, could you change it?

@magicDGS

magicDGS Sep 16, 2016

Contributor

Done

@@ -290,16 +287,7 @@ public static LinearIndex createLinearIndex(final File inputFile, final FeatureC
* @throws IOException
*/
public static void writeIndex(final Index idx, final File idxFile) throws IOException {
@lbergelson

lbergelson Sep 15, 2016

Contributor

I agree that this should be deprecated since it's now unnecessary.

@magicDGS

magicDGS Sep 16, 2016

Contributor

Done

@@ -201,6 +201,7 @@ public TabixFormat getFormatSpec() {
*
* @param tabixFile Where to write the index.
*/
+ @Override
public void write(final File tabixFile) {
@lbergelson

lbergelson Sep 15, 2016 edited

Contributor

This seems like it should also be a a try-with-resources. It's weird that that this actually throws a TribbleException and the other writes throw IOException. It seems strangely inconsistent. We can either widen them all to throw IOException or narrow to TribbleException. I think I would lean toward everything throws IOException since that would be the least work. It would be a breaking change for people who are already calling write on a tabix index though. @yfarjoun what's your opinion?

@magicDGS

magicDGS Sep 16, 2016

Contributor

Done. Now throwing IOException

@@ -112,4 +117,45 @@ public void testCreateTabixIndexOnBlockCompressed() {
"Tabix indexed (bgzipped) VCF does not contain sequence: " + samSequenceRecord.getSequenceName());
}
}
+
+ @Test
+ public void testWriteIndex() throws Exception {
@lbergelson

lbergelson Sep 15, 2016

Contributor

Since this test is in indexFactory.writeIndex() it should probably be testing indexFactory.writeIndex()...

Test for write on the index files themselves should live in TabixIndexTest and LinearIndexTest respectively.

Either way (or both might be the best at the risk of redundantly testing things), TabixIndexTest.readWriteTest provides a good model for how the test might be made. Instead of repeating the code for the different cases you can write a test function that takes a dataprovider with several inputs. It might be a pain though for linear indexes since I don't see a read or equals method on them.

@magicDGS

magicDGS Sep 16, 2016

Contributor

I changed exactly the same method (with a data provider instead) to IndexTest, because I think that it could live in that package (it is testing the interface method for different implementations).

I added a check for the sequence names and properties loaded and written, and a test bed file too.

@lbergelson

lbergelson Sep 16, 2016

Contributor

Good decision 👍

+ @Test
+ public void testWriteIndex() throws Exception {
+ // temp directory for this test
+ final File tempDir = IOUtil.createTempDir("writeIndex", null);
@lbergelson

lbergelson Sep 15, 2016

Contributor

I think the standard in htsjdk is to use File output = File.createTempFile() to get your output file, and then call output.deleteOnExit() to make sure it's cleaned up afterwards.

@magicDGS

magicDGS Sep 16, 2016

Contributor

I change it in the new test.

+ final TabixIndex tabixIndexVcfGz =
+ IndexFactory
+ .createTabixIndex(inputFileVcfGz, new VCFCodec(), TabixFormat.VCF, null);
+ // wirte to a file
@lbergelson

lbergelson Sep 15, 2016

Contributor

typo wirte -> write

@magicDGS

magicDGS Sep 16, 2016

Contributor

Changed in the new test.

magicDGS added some commits Sep 16, 2016

@magicDGS magicDGS try-with-resources and IOException ae7fcef
@magicDGS magicDGS deprecated IndexFactory.writeIndex() c03e36d
@magicDGS magicDGS updated test
6dda2e5
Contributor

magicDGS commented Sep 16, 2016

Addressed comments, @lbergelson. I changed the method to throw IOException, although I'm waiting for @yfarjoun comments about it.

Now that I'm looking at the tribble package, I that an issue should be open to review exceptions thrown by the classes. It will be useful for distinguish them. For instance, when I implemented getTabixFormat() it throws a generic TribbleException, but I can create another extension of the class that will be more informative.

Contributor

magicDGS commented Sep 16, 2016

Tests are failing for FTP issues, @yfarjoun and @lbergelson...

Contributor

lbergelson commented Sep 16, 2016

@magicDGS I restarted the tests. I hate those ftp tests.

This gets my approval. This is a breaking change with the changes to the exceptions, but I think it's a positive one to make them more consistent. Lets hear what @yfarjoun thinks before merging though.

I agree with you that the current state of exceptions is pretty random. It would be good to do an overhaul of them.

Contributor

yfarjoun commented Sep 16, 2016

a strange test is failing again. otherwise 👍

Contributor

lbergelson commented Sep 16, 2016

Oh yeah, that one. #578 strikes again. Restarted.

Coverage Status

Coverage increased (+0.07%) to 68.995% when pulling 6dda2e5 on magicDGS:dgs_issue_430 into c3d5a88 on samtools:master.

@lbergelson lbergelson merged commit fbba536 into samtools:master Sep 16, 2016

2 checks passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
coverage/coveralls Coverage increased (+0.07%) to 68.995%
Details

magicDGS deleted the magicDGS:dgs_issue_430 branch Sep 17, 2016

Contributor

magicDGS commented Sep 17, 2016

Thanks for review and accept this change, @yfarjoun and @lbergelson. I also opened an issue to have a look to the exceptions thrown in the package.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment