Fix for issue #513: NPE in ValidateSamFile with CRAM #735

Merged
merged 3 commits into from Mar 7, 2017

Conversation

Projects
None yet
7 participants
Contributor

ekazachkova commented Oct 28, 2016

Fix for issue #513:

Current Htsjdk implementation throws NullPointerException during NM tag validation in SamFileValidator because getSequenceDictionary() method in ReferenceFileWalker returns null without .dict file. We propose check result of getSequenceDictionary() for null. If this method returns null don't throw any exceptions and continue processing as if reference isn't indexed. Also, we added new unit test to check code correctness.

Checklist

  • Code compiles correctly
  • New tests covering changes and new functionality
  • All tests passing
@SilinPavel @ekazachkova SilinPavel resolve issue #513 NPE when validating a CRAM file with a .fasta refe…
…rence with no .dict file
6d3f109

Coverage Status

Coverage increased (+0.01%) to 69.821% when pulling 6d3f109 on EpamLifeSciencesTeam:epam-ls_npe_in_ValidateSamFile_with_cram into 45ee380 on samtools:master.

Contributor

ZLyanov commented Nov 22, 2016

@yfarjoun could you take a look, please?

yfarjoun added the cram label Nov 22, 2016

Contributor

droazen commented Jan 10, 2017

Can you review?

@vadimzalunin

minor change to ensure there is no index file used

+ validationStringency(ValidationStringency.SILENT).
+ referenceSequence(new File(TEST_DATA_DIR, "nm_tag_validation.fa")).
+ open(new File(TEST_DATA_DIR, "nm_tag_validation.cram"));
+ final ReferenceSequenceFile reference = new FastaSequenceFile(new File(TEST_DATA_DIR, "nm_tag_validation.fa"), true);
@vadimzalunin

vadimzalunin Jan 24, 2017

Contributor

just to be sure there is no index/dictionary involved:

final ReferenceSequenceFile reference = new FastaSequenceFile(new File(TEST_DATA_DIR, "nm_tag_validation.fa"), true);

final SamReader samReader = SamReaderFactory.
makeDefault().
validationStringency(ValidationStringency.SILENT).
referenceSource(new ReferenceSource(reference)).
...

@ekazachkova

ekazachkova Jan 30, 2017

Contributor

@vadimzalunin I can not understand why do we need this check: this test will always fail since cram reader can't work with reference without index. Could you please clarify your comment?

@ekazachkova ekazachkova fix for test data: cram file paired with reference now
6798580
Contributor

ekazachkova commented Jan 30, 2017

Also, we have discovered that our test data aren't valid: cram file is not paired with reference. So we have regenerated them.

Contributor

vadimzalunin commented Feb 1, 2017

Could you please check the CRAM file? Reference MD5 checksums don't seem to match.

cat nm_tag_validation.fa | tail -n +2 | tr -d '\n' | md5sum
f8c08a4411f07717451464d546b3706d -

samtools view -H nm_tag_validation.cram | grep '^@sq' | grep -P 'M5:\w+' -o
M5:351f64d4f4f9ddd45b35336ad97aa6de

@ekazachkova ekazachkova fix for cram file
8d790be
Contributor

ekazachkova commented Feb 3, 2017

ooops, @vadimzalunin thanks for remark

Contributor

yfarjoun commented Mar 7, 2017

verbal thumbs from vadim. 👍

@yfarjoun yfarjoun merged commit 6d22658 into samtools:master Mar 7, 2017

3 checks passed

codecov/patch 66.66% of diff hit (target 64.53%)
Details
codecov/project 64.56% (+0.03%) compared to 4231660
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details

@magicDGS magicDGS added a commit to magicDGS/htsjdk that referenced this pull request Mar 8, 2017

@ekazachkova @magicDGS ekazachkova + magicDGS Fix for issue #513: NPE in ValidateSamFile with CRAM (#735)
* resolve issue #513 NPE when validating a CRAM file with a .fasta reference with no .dict file
78ca562
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment