Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fixed an error where the read mate key in SplitNCigarReads was not sufficiently strict about readnames causing cigar errors #6909

Merged
merged 5 commits into from
Nov 16, 2020
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -422,9 +422,11 @@ public void setMateChanged() {
}
}

// Generates the string key to be used for
private static String makeKey(String name, boolean firstOfPair, int mateStart) {
return name + (firstOfPair ? 1 : 0) + mateStart;
// Generates the string key to be used for storing mates that had their cigar strings changed between iterations over the input reads
@VisibleForTesting
static String makeKey(String name, boolean firstOfPair, int mateStart) {
// NOTE: We use the '@' character here because it is explicitly forbidden for use in read names by the SAM Spec and we don't want to risk accidental matches across read pairs
return name + "@" + (firstOfPair ? 1 : 0) + "@" + mateStart;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

}
/**
* Will edit the mate MC tag and mate start position field for given read if that read has been recorded being edited
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -201,6 +201,20 @@ public void testMappingReadMateRepair() {
Assert.assertEquals(read2.getMateStart(), 10003);
}

@Test
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jamesemery Have you tested what happens if you add a read with a @s in the name which could recreate the problem?

// Test asserting that the failure found in https://github.com/broadinstitute/gatk/issues/6776 has been fixed
public void testIncidentalMatchFromReportedFailure() {
GATKRead read1 = ArtificialReadUtils.createArtificialRead(hg19Header, "RR.6123", 1, 3500, 100);
read1.setIsFirstOfPair();
GATKRead read2 = ArtificialReadUtils.createArtificialUnmappedRead(hg19Header, new byte[]{(byte)'A'}, new byte[]{(byte)'A'});
read2.setName("RR.6123135");
read2.setIsSecondOfPair();

// Asserting that in the case of reads that end with widely ranged numbers it is no longer possible to incidentally overlap in key generation
Assert.assertNotEquals( OverhangFixingManager.makeKey(read1.getName(), read1.isFirstOfPair(), read1.getStart()),
OverhangFixingManager.makeKey(read2.getName(), read2.isFirstOfPair(), read2.getStart()));
}

@Test
public void testMappingReadMateRepairNoMCTag() {
final OverhangFixingManager manager = new OverhangFixingManagerAlwaysSplit10000Reads(getHG19Header(), null, hg19GenomeLocParser, hg19ReferenceReader, 10000, 1, 40, false, false);
Expand Down