Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added a check to VariantAnnotator that all read samples are present in the VCF #6944

Merged
merged 1 commit into from Nov 24, 2020

Conversation

jamesemery
Copy link
Collaborator

Fixes #6915

…t in the provided vcf before running variant annotator
@droazen droazen self-assigned this Nov 19, 2020
Copy link
Collaborator

@droazen droazen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jamesemery I think the check you added here might be the reverse of the one required

final Set<String> readsSamples = getHeaderForReads().getReadGroups().stream().map(rg -> rg.getSample()).collect(Collectors.toSet());
readsSamples.forEach(readSample -> {
if (!samples.contains(readSample))
throw new UserException(String.format("Reads sample '%s' from readgroups tags does not match any sample in the variant genotypes", readSample));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This checks whether the samples in the bam are present in the vcf. But the original bug report in #6915 was (according to @gbrandt6 ) triggered by the reverse case where samples in the vcf don't exist in the bam.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Upon further investigation, @jamesemery and I confirmed that this patch is correct, and the original bug report was backwards. The NPE is triggered when there are samples in the bam not present in the vcf.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for looking into it @droazen @jamesemery I checked again and it looks like the user said there were no samples in common.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you look at the version that user was running the NPE is actually occurring on this line:
returnMap.get(ReadUtils.getSampleName(read, header)).add(read);

Which means that the problem is caused when the returnMap (which gets generated right above from the VCF samples list) doesn't contain a sample that is present in one of the reads at that site. It sounded like the user had accentually matched up the wrong bam with the wrong VCF and thats why they got the crash. Technically what they said is accurate but not exactly what caused the bug, it was the reverse of what they reported as the bug. Given that multisample bams exist perhaps there is something more robust that can be done to fix the bug if this ever comes up again.

@droazen droazen merged commit 9e381c9 into master Nov 24, 2020
@droazen droazen deleted the je_betterVariantAnnotatorErrorMessage branch November 24, 2020 21:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

VariantAnnotator returns NullPointerException
3 participants