Path index #762

Merged
merged 4 commits into from Dec 5, 2016
@@ -307,11 +307,27 @@ public SamReader open(final SamInputResource resource) {
Math.max(Defaults.BUFFER_SIZE, BlockCompressedStreamConstants.MAX_COMPRESSED_BLOCK_SIZE)
);
File sourceFile = data.asFile();
+ // calling asFile is safe even if indexMaybe is a Google Cloud Storage bucket
+ // (in that case we just get null)
final File indexFile = indexMaybe == null ? null : indexMaybe.asFile();
if (SamStreams.isBAMFile(bufferedStream)) {
@droazen

droazen Nov 29, 2016 edited

Contributor

Is it ok to leave the indexMaybe.asFile() call intact here (even if the index might be in a bucket)? If yes, add a comment explaining why.

@jean-philippe-martin

jean-philippe-martin Nov 30, 2016

Contributor

There is no such call on this line so I'm guessing you mean the call just before? Yes, calling toFile on a bucket has no harmful effect, it just returns null. We use other ways to get to the index inside this block, except for the case where there's no index at all (this matches the previous behavior: we pass null and the code deals with the fact there's no index). I've added a comment to this effect.

if (sourceFile == null || !sourceFile.isFile()) {
- // Handle case in which file is a named pipe, e.g. /dev/stdin or created by mkfifo
- primitiveSamReader = new BAMFileReader(bufferedStream, indexFile, false, asynchronousIO, validationStringency, this.samRecordFactory);
+ // check whether we can seek
+ final SeekableStream indexSeekable = indexMaybe == null ? null : indexMaybe.asUnbufferedSeekableStream();
+ // do not close bufferedStream, it's the same stream we're getting here.
+ SeekableStream sourceSeekable = data.asUnbufferedSeekableStream();
+ if (indexFile!=null || null == sourceSeekable || null == indexSeekable) {
+ // not seekable.
+ // it's OK that we consumed a bit of the stream already, this ctor expects it.
+ primitiveSamReader = new BAMFileReader(bufferedStream, indexFile, false, asynchronousIO, validationStringency, this.samRecordFactory);
+ } else {
+ // seekable.
+ // need to return to the beginning because it's the same stream we used earlier
+ // and read a bit from, and that form of the ctor expects the stream to start at 0.
+ sourceSeekable.seek(0);
+ primitiveSamReader = new BAMFileReader(
+ sourceSeekable, indexSeekable, false, asynchronousIO, validationStringency, this.samRecordFactory);
+ }
} else {
bufferedStream.close();
primitiveSamReader = new BAMFileReader(sourceFile, indexFile, false, asynchronousIO, validationStringency, this.samRecordFactory);