Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add warning to CheckIlluminaDir if we detect cycles without data. #874

Merged
merged 2 commits into from Jul 18, 2017

Conversation

jacarey
Copy link
Collaborator

@jacarey jacarey commented Jul 18, 2017

CBCLReader will ignore and log a warning for tiles without data.

This fixes an issue where you have a tile definition in the cbcl but the tile has no data to decompress.
In addition we added some warning messages when we encounter a tile with no data.

Checklist (never delete this)

Never delete this, it is our record that procedure was followed. If you find that for whatever reason one of the checklist points doesn't apply to your PR, you can leave it unchecked but please add an explanation below.

Content

  • Added or modified tests to cover changes and any new functionality
  • Edited the README / documentation (if applicable)
  • All tests passing on Travis

Review

  • Final thumbs-up from reviewer
  • Rebase, squash and reword as applicable

For more detailed guidelines, see https://github.com/broadinstitute/picard/wiki/Guidelines-for-pull-requests

CBCLReader will ignore and log a warning for tiles without data.
@coveralls
Copy link

coveralls commented Jul 18, 2017

Coverage Status

Coverage decreased (-0.02%) to 76.0% when pulling ad6ab6b on jc_DSDEGP-1416 into 46d8617 on master.

Copy link
Member

@pshapiro4broad pshapiro4broad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Usual nitpick comments but 👍 overall.

Would be nice to have a unit test for this.

reader.getAllTiles().forEach((key, value) -> {
final List<File> fileForCycle = reader.getFilesForCycle(key);
final long totalFilesSize = fileForCycle.stream().mapToLong(file -> file.length() - reader.getHeaderSize()).sum();
final long expectedFileSize = value.stream().mapToLong(BaseBclReader.TileData::getCompressedBlockSize).sum();
final Stream<BaseBclReader.TileData> emptyCycles = value.stream().filter((cycle) -> cycle.getCompressedBlockSize() <= 2);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You shouldn't need ( ) around cycle. Also, the magic number 2 should have a constant or a comment or both.

reader.getAllTiles().forEach((key, value) -> {
final List<File> fileForCycle = reader.getFilesForCycle(key);
final long totalFilesSize = fileForCycle.stream().mapToLong(file -> file.length() - reader.getHeaderSize()).sum();
final long expectedFileSize = value.stream().mapToLong(BaseBclReader.TileData::getCompressedBlockSize).sum();
final Stream<BaseBclReader.TileData> emptyCycles = value.stream().filter((cycle) -> cycle.getCompressedBlockSize() <= 2);

String emptyCycleString = emptyCycles.map(tile -> String.valueOf(tile.getTileNum())).collect(Collectors.joining(", "));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could also write

                    String emptyCycleString = value.stream()
                            .filter(cycle -> cycle.getCompressedBlockSize() <= 2)
                            .map(BaseBclReader.TileData::getTileNum)
                            .map(Object::toString)
                            .collect(Collectors.joining(", "));


String emptyCycleString = emptyCycles.map(tile -> String.valueOf(tile.getTileNum())).collect(Collectors.joining(", "));

if (emptyCycleString.length() > 0) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be clearer IMO to add the new variables and check before we compute filesForCycle, totalFilesSize, expectedFileSize since they are only used by the if on line 185 below. That way it's obvious that the new code doesn't rely on these variables.

if (read == 0) break;
totalRead += read;
//only decompress the data if we are expecting data.
if (uncompressedByteArray.length > 0) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a really long method. It would be easier to understand if it was split into smaller methods.

int totalRead = 0;
try (GZIPInputStream gzipInputStream = new GZIPInputStream(byteInputStream, uncompressedByteArray.length)) {
while ((read = gzipInputStream.read(uncompressedByteArray, totalRead, uncompressedByteArray.length - totalRead)) != -1) {
if (read == 0) break;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

need { } here

(totalCycleCount + 1), this.streamFiles[totalCycleCount].getAbsolutePath()));
}
} else {
log.warn("Ignoring tile " + tileData.tileNum + " there are no PF reads.");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this was in a method, you could do the negative case first, with a return, which IMO would be clearer.

@coveralls
Copy link

coveralls commented Jul 18, 2017

Coverage Status

Coverage decreased (-0.02%) to 75.999% when pulling b06bea7 on jc_DSDEGP-1416 into 46d8617 on master.

@jacarey jacarey merged commit 34e388d into master Jul 18, 2017
@jacarey jacarey deleted the jc_DSDEGP-1416 branch July 18, 2017 15:33
@sooheelee
Copy link
Contributor

Just to be clear @jacarey, before this fix the tool would error on no data tiles?

@jacarey
Copy link
Collaborator Author

jacarey commented Jul 24, 2017

@sooheelee No there was just no warning. We didn't realize you could have tiles defined without data. This is a valid state, but one that seemed relevant to anyone running this program so we output a warning.

@sooheelee
Copy link
Contributor

Ok, thanks @jacarey. I will correct the updated release notes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants