Rolling results #2

Closed
mbhall88 opened this issue Sep 27, 2022 · 60 comments

@mbhall88
Owner

mbhall88 commented Sep 27, 2022

This issue will document the rolling results.


The first sneak peek is all 437 Nanopore isolates and 400 Illumina isolates (selected at random).

Nanopore

nanopore

Drug Tool FN(R) FP(S) Sensitivity (95% CI) Specificity (95% CI) MCC
Amikacin Drprg 1(15) 3(157) 93.3% (70.2-98.8%) 98.1% (94.5-99.3%) 0.8642875644329534
Amikacin Mykrobe 1(15) 3(157) 93.3% (70.2-98.8%) 98.1% (94.5-99.3%) 0.8642875644329534
Amikacin Tbprofiler 1(15) 3(157) 93.3% (70.2-98.8%) 98.1% (94.5-99.3%) 0.8642875644329534
Capreomycin Drprg 1(3) 1(64) 66.7% (20.8-93.9%) 98.4% (91.7-99.7%) 0.6510416666666666
Capreomycin Mykrobe 1(3) 1(64) 66.7% (20.8-93.9%) 98.4% (91.7-99.7%) 0.6510416666666666
Capreomycin Tbprofiler 1(3) 1(64) 66.7% (20.8-93.9%) 98.4% (91.7-99.7%) 0.6510416666666666
Ethambutol Drprg 7(29) 21(328) 75.9% (57.9-87.8%) 93.6% (90.4-95.8%) 0.5830010462966467
Ethambutol Mykrobe 6(29) 22(328) 79.3% (61.6-90.2%) 93.3% (90.1-95.5%) 0.5975951983859127
Ethambutol Tbprofiler 7(29) 22(328) 75.9% (57.9-87.8%) 93.3% (90.1-95.5%) 0.5747241429661474
Ethionamide Drprg 26(30) 1(86) 13.3% (5.3-29.7%) 98.8% (93.7-99.8%) 0.2624057235284411
Ethionamide Mykrobe 5(30) 57(86) 83.3% (66.4-92.7%) 33.7% (24.6-44.2%) 0.16405763204424978
Ethionamide Tbprofiler 14(30) 4(86) 53.3% (36.1-69.8%) 95.3% (88.6-98.2%) 0.5643248464313276
Isoniazid Drprg 35(117) 3(265) 70.1% (61.3-77.6%) 98.9% (96.7-99.6%) 0.7641591750285314
Isoniazid Mykrobe 18(117) 45(265) 84.6% (77.0-90.0%) 83.0% (78.0-87.1%) 0.6432989433455073
Isoniazid Tbprofiler 21(117) 4(265) 82.1% (74.1-88.0%) 98.5% (96.2-99.4%) 0.8445257661394283
Kanamycin Drprg 0(3) 2(118) 100.0% (43.9-100.0%) 98.3% (94.0-99.5%) 0.7680042372764464
Kanamycin Mykrobe 0(3) 2(118) 100.0% (43.9-100.0%) 98.3% (94.0-99.5%) 0.7680042372764464
Kanamycin Tbprofiler 0(3) 2(118) 100.0% (43.9-100.0%) 98.3% (94.0-99.5%) 0.7680042372764464
Moxifloxacin Drprg 0(0) 1(1) - 0.0% (0.0-79.3%) -
Moxifloxacin Mykrobe 0(0) 1(1) - 0.0% (0.0-79.3%) -
Moxifloxacin Tbprofiler 0(0) 1(1) - 0.0% (0.0-79.3%) -
Ofloxacin Drprg 13(15) 1(158) 13.3% (3.7-37.9%) 99.4% (96.5-99.9%) 0.27378347692948213
Ofloxacin Mykrobe 3(15) 4(158) 80.0% (54.8-93.0%) 97.5% (93.7-99.0%) 0.7524691275756947
Ofloxacin Tbprofiler 3(15) 3(158) 80.0% (54.8-93.0%) 98.1% (94.6-99.4%) 0.7810126582278482
Pyrazinamide Drprg 16(28) 3(243) 42.9% (26.5-60.9%) 98.8% (96.4-99.6%) 0.5540455669886942
Pyrazinamide Mykrobe 10(28) 5(243) 64.3% (45.8-79.3%) 97.9% (95.3-99.1%) 0.6796400181154288
Pyrazinamide Tbprofiler 12(28) 6(243) 57.1% (39.1-73.5%) 97.5% (94.7-98.9%) 0.6093260891549527
Rifampicin Drprg 8(77) 6(287) 89.6% (80.8-94.6%) 97.9% (95.5-99.0%) 0.8837166974977075
Rifampicin Mykrobe 6(77) 6(287) 92.2% (84.0-96.4%) 97.9% (95.5-99.0%) 0.9011719987329744
Rifampicin Tbprofiler 75(77) 1(287) 2.6% (0.7-9.0%) 99.7% (98.1-99.9%) 0.10159114367294636
Streptomycin Drprg 9(55) 16(126) 83.6% (71.7-91.1%) 87.3% (80.4-92.0%) 0.6875051115519748
Streptomycin Mykrobe 7(55) 38(126) 87.3% (76.0-93.7%) 69.8% (61.3-77.2%) 0.526015018770466
Streptomycin Tbprofiler 15(55) 19(126) 72.7% (59.8-82.7%) 84.9% (77.6-90.1%) 0.5656453811464112
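
For anyone wanting to reproduce the columns in these tables, here is a minimal Python sketch. It assumes that FN(R) means "false negatives out of R phenotypically resistant isolates" and FP(S) means "false positives out of S susceptible isolates", that the 95% intervals are Wilson score intervals with z = 1.96, and that MCC is the standard Matthews correlation coefficient. Those assumptions reproduce the Amikacin row above, but the function names are illustrative and this is not the code from the analysis pipeline.

```python
from math import sqrt

Z_95 = 1.96  # assumed normal quantile for a 95% interval


def wilson_interval(successes, n, z=Z_95):
    """Wilson score interval (low, high) for a proportion; None if n == 0."""
    if n == 0:
        return None
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; None when it is undefined ('-' in the tables)."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return None
    return (tp * tn - fp * fn) / denom


def summarise(fn, n_resistant, fp, n_susceptible):
    """Turn one table row, FN(R) and FP(S), into sensitivity, specificity and MCC."""
    tp = n_resistant - fn
    tn = n_susceptible - fp
    return {
        "sensitivity": tp / n_resistant if n_resistant else None,
        "sensitivity_ci": wilson_interval(tp, n_resistant),
        "specificity": tn / n_susceptible if n_susceptible else None,
        "specificity_ci": wilson_interval(tn, n_susceptible),
        "mcc": mcc(tp, tn, fp, fn),
    }


# Amikacin / Drprg row of the Nanopore table: FN(R) = 1(15), FP(S) = 3(157)
amikacin = summarise(fn=1, n_resistant=15, fp=3, n_susceptible=157)
```

With no resistant isolates (for example the Moxifloxacin rows) the sensitivity and MCC come back as `None`, which corresponds to the "-" entries in the table.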

Next avenues of investigation:

  • Figure out what has happened to cause the crazy dropout in tbprofiler RIF sensitivity - there must be a problem with the panel I created for it
  • There is something weird going on with the PZA, FLQ and ETO sensitivities for drprg. Will also look into INH sensitivity

Positives:

  • STM

Illumina

illumina

Drug Tool FN(R) FP(S) Sensitivity (95% CI) Specificity (95% CI) MCC
Amikacin Drprg 0(15) 1(166) 100.0% (79.6-100.0%) 99.4% (96.7-99.9%) 0.9653250279768749
Amikacin Mykrobe 1(15) 1(166) 93.3% (70.2-98.8%) 99.4% (96.7-99.9%) 0.9273092369477912
Amikacin Tbprofiler 0(15) 1(166) 100.0% (79.6-100.0%) 99.4% (96.7-99.9%) 0.9653250279768749
Capreomycin Drprg 2(13) 4(93) 84.6% (57.8-95.7%) 95.7% (89.5-98.3%) 0.7558571990231637
Capreomycin Mykrobe 3(13) 4(93) 76.9% (49.7-91.8%) 95.7% (89.5-98.3%) 0.70359611692274
Capreomycin Tbprofiler 2(13) 4(93) 84.6% (57.8-95.7%) 95.7% (89.5-98.3%) 0.7558571990231637
Delamanid Drprg 1(1) 0(93) 0.0% (0.0-79.3%) 100.0% (96.0-100.0%) -
Delamanid Mykrobe 1(1) 0(93) 0.0% (0.0-79.3%) 100.0% (96.0-100.0%) -
Delamanid Tbprofiler 1(1) 0(93) 0.0% (0.0-79.3%) 100.0% (96.0-100.0%) -
Ethambutol Drprg 5(53) 16(229) 90.6% (79.7-95.9%) 93.0% (89.0-95.7%) 0.7795344823612268
Ethambutol Mykrobe 5(53) 17(229) 90.6% (79.7-95.9%) 92.6% (88.4-95.3%) 0.7712443312992736
Ethambutol Tbprofiler 5(53) 18(229) 90.6% (79.7-95.9%) 92.1% (87.9-95.0%) 0.7631197122668696
Ethionamide Drprg 20(37) 11(127) 45.9% (31.0-61.6%) 91.3% (85.2-95.1%) 0.4141740735523773
Ethionamide Mykrobe 11(37) 15(127) 70.3% (54.2-82.5%) 88.2% (81.4-92.7%) 0.5643018224168724
Ethionamide Tbprofiler 10(37) 16(127) 73.0% (57.0-84.6%) 87.4% (80.5-92.1%) 0.5737592501981046
Isoniazid Drprg 17(135) 5(218) 87.4% (80.8-92.0%) 97.7% (94.7-99.0%) 0.868118053524601
Isoniazid Mykrobe 14(135) 7(218) 89.6% (83.3-93.7%) 96.8% (93.5-98.4%) 0.8735871080954598
Isoniazid Tbprofiler 10(135) 7(218) 92.6% (86.9-95.9%) 96.8% (93.5-98.4%) 0.8977596305525896
Kanamycin Drprg 4(25) 2(168) 84.0% (65.3-93.6%) 98.8% (95.8-99.7%) 0.8582554180919595
Kanamycin Mykrobe 6(25) 2(168) 76.0% (56.6-88.5%) 98.8% (95.8-99.7%) 0.8066918414409607
Kanamycin Tbprofiler 5(25) 2(168) 80.0% (60.9-91.1%) 98.8% (95.8-99.7%) 0.8327103314106743
Levofloxacin Drprg 18(28) 5(145) 35.7% (20.7-54.2%) 96.6% (92.2-98.5%) 0.42231266491410374
Levofloxacin Mykrobe 2(28) 10(145) 92.9% (77.4-98.0%) 93.1% (87.8-96.2%) 0.7799214704762708
Levofloxacin Tbprofiler 1(28) 11(145) 96.4% (82.3-99.4%) 92.4% (86.9-95.7%) 0.7903590726235468
Linezolid Drprg 0(0) 0(128) - 100.0% (97.1-100.0%) -
Linezolid Mykrobe 0(0) 0(128) - 100.0% (97.1-100.0%) -
Linezolid Tbprofiler 0(0) 0(128) - 100.0% (97.1-100.0%) -
Moxifloxacin Drprg 11(15) 8(126) 26.7% (10.9-52.0%) 93.7% (88.0-96.7%) 0.2244992239690201
Moxifloxacin Mykrobe 2(15) 18(126) 86.7% (62.1-96.3%) 85.7% (78.5-90.8%) 0.5388625547887866
Moxifloxacin Tbprofiler 2(15) 20(126) 86.7% (62.1-96.3%) 84.1% (76.8-89.5%) 0.5155328733959855
Ofloxacin Drprg 3(3) 0(5) 0.0% (0.0-56.1%) 100.0% (56.6-100.0%) -
Ofloxacin Mykrobe 0(3) 0(5) 100.0% (43.9-100.0%) 100.0% (56.6-100.0%) 1.0
Ofloxacin Tbprofiler 0(3) 0(5) 100.0% (43.9-100.0%) 100.0% (56.6-100.0%) 1.0
Pyrazinamide Drprg 9(23) 0(132) 60.9% (40.8-77.8%) 100.0% (97.2-100.0%) 0.7548792871746883
Pyrazinamide Mykrobe 2(23) 2(132) 91.3% (73.2-97.6%) 98.5% (94.6-99.6%) 0.8978919631093544
Pyrazinamide Tbprofiler 2(23) 2(132) 91.3% (73.2-97.6%) 98.5% (94.6-99.6%) 0.8978919631093544
Rifampicin Drprg 8(118) 4(240) 93.2% (87.2-96.5%) 98.3% (95.8-99.4%) 0.9237938244724586
Rifampicin Mykrobe 6(118) 4(240) 94.9% (89.3-97.6%) 98.3% (95.8-99.4%) 0.936595807066951
Rifampicin Tbprofiler 117(118) 1(240) 0.8% (0.1-4.6%) 99.6% (97.7-99.9%) 0.0271689721964718
Streptomycin Drprg 12(42) 1(64) 71.4% (56.4-82.8%) 98.4% (91.7-99.7%) 0.7512240395539047
Streptomycin Mykrobe 8(42) 5(64) 81.0% (66.7-90.0%) 92.2% (83.0-96.6%) 0.7418210904541338
Streptomycin Tbprofiler 8(42) 2(64) 81.0% (66.7-90.0%) 96.9% (89.3-99.1%) 0.8037977341533646

Next avenues of investigation:

I'll probably wait until the Nanopore stuff is debugged and then run on a larger sample of data before debugging Illumina

@mbhall88
Owner Author

mbhall88 commented Oct 4, 2022

I took a look at the INH FNs today. For those where mykrobe made a TP call (so I can see which mutation we missed), 13/16 were us missing the fabG1 C-15T promoter mutation. drprg called this variant for all 13, but the calls were removed by the fraction of read support (FRS) filter, which is set to 0.70. Nearly all of those mutations had an FRS of 0.58-0.64.

The reason for this is the alleles are quite similar, and I suspect maybe some shared minimizers are wreaking havoc here.

An example VCF record showing the alleles and coverage

fabG1   81      555cbd3d        CGAGACGATAGGT   CGAGACGATAGGC,CGAGATGATAGGT,TGAGACGATAGGT       .       frs     VC=PH_SNPs;GRAPHTYPE=SIMPLE;VARID=fabG1_G-17T,fabG1_A-16X,fabG1_C-15X,fabG1_T-8X;PREDICT=S,S,R,S      GT:MEAN_FWD_COVG:MEAN_REV_COVG:MED_FWD_COVG:MED_REV_COVG:SUM_FWD_COVG:SUM_REV_COVG:GAPS:LIKELIHOOD:GT_CONF      2:13,16,22,3:11,13,27,2:7,14,19,0:4,8,30,1:82,82,134,14:71,68,164,10:0.5,0.4,0,0.75:-441.21,-406.243,-273.794,-577.451:132.448

These don't get collapsed by make_prg because the minimum match lengths between the three variants described by this allele are 4 and 6 (we use a min match len of 7).
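
As a rough illustration of where the 4 and 6 come from, the sketch below counts the conserved columns between variable columns across the four alleles in the record above. It only handles equal-length alleles and is not how make_prg itself works internally; `invariant_runs` is a name made up for this example.

```python
def invariant_runs(alleles):
    """Lengths of the fully conserved stretches lying between variable columns."""
    length = len(alleles[0])
    if any(len(a) != length for a in alleles):
        raise ValueError("this sketch only handles equal-length alleles")
    # Columns where at least two alleles disagree
    variable = [i for i in range(length) if len({a[i] for a in alleles}) > 1]
    # Number of conserved columns between each pair of neighbouring variable columns
    return [right - left - 1 for left, right in zip(variable, variable[1:])]


# REF followed by the three ALT alleles of the fabG1 record at position 81
alleles = [
    "CGAGACGATAGGT",
    "CGAGACGATAGGC",
    "CGAGATGATAGGT",
    "TGAGACGATAGGT",
]

runs = invariant_runs(alleles)

# Which conserved stretches are long enough to be kept as shared sequence
long_enough_at_7 = [run >= 7 for run in runs]
long_enough_at_5 = [run >= 5 for run in runs]
```

With a minimum match length of 7 neither stretch qualifies, so all three variants end up in one site. Dropping it to 5 lets the 6-base stretch count as a match, which would separate the last variant from the first two.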

The other interesting thing is that each time, the allele with the next best coverage is allele 1 which differs in two positions from allele 2 (middle and end), so I reckon there's a minimizer that covers the start of this allele before the two alleles differ.

Not sure whether the "hacky" way of decreasing the FRS threshold is the best way to go? Or changing some parameters in make prg or pandora...
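
To make the FRS numbers concrete, here is a small sketch that recomputes a fraction of read support from the sample column of the record above. It assumes FRS is the called allele's share of the summed forward and reverse median coverage (MED_FWD_COVG and MED_REV_COVG). That choice of fields is an assumption on my part rather than something stated in this thread, although it does give about 0.59 for this record, which is in the range quoted above.

```python
FORMAT = (
    "GT:MEAN_FWD_COVG:MEAN_REV_COVG:MED_FWD_COVG:MED_REV_COVG:"
    "SUM_FWD_COVG:SUM_REV_COVG:GAPS:LIKELIHOOD:GT_CONF"
)
SAMPLE = (
    "2:13,16,22,3:11,13,27,2:7,14,19,0:4,8,30,1:82,82,134,14:71,68,164,10:"
    "0.5,0.4,0,0.75:-441.21,-406.243,-273.794,-577.451:132.448"
)


def parse_sample(format_field, sample_field):
    """Pair up the colon-separated FORMAT keys with the sample values."""
    return dict(zip(format_field.split(":"), sample_field.split(":")))


def fraction_of_read_support(sample, fwd_key="MED_FWD_COVG", rev_key="MED_REV_COVG"):
    """Coverage on the called allele divided by coverage on all alleles.

    Returns None for a null genotype or when there is no coverage at all.
    """
    if sample["GT"] == ".":
        return None
    called = int(sample["GT"])
    fwd = [int(x) for x in sample[fwd_key].split(",")]
    rev = [int(x) for x in sample[rev_key].split(",")]
    total = sum(fwd) + sum(rev)
    if total == 0:
        return None
    return (fwd[called] + rev[called]) / total


sample = parse_sample(FORMAT, SAMPLE)
frs = fraction_of_read_support(sample)  # (19 + 30) / 83
passes_filter = frs >= 0.70
```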

@iqbal-lab
Collaborator

iqbal-lab commented Oct 4, 2022

Could drop min match length also?
In our covid work using pileups, we find frs of 0.7 is too high just because of noise in the reads

@mbhall88
Owner Author

mbhall88 commented Oct 4, 2022

Interesting. What FRS have you been using in your covid work?

@iqbal-lab
Collaborator

Well, we've just shifted to 0.6 for nanopore, but now we're distracted fixing bugs before going back to carefully choose FRS thresholds

@mbhall88
Owner Author

mbhall88 commented Oct 6, 2022

Okay, so after changing the minimum match length to 5 and the minimum FRS to 0.60, there are only two FNs that mykrobe calls that we don't. One of those is an indel which fails the FRS filter at 0.59 and the other is a dodgy-looking indel call from mykrobe that isn't called by tbprofiler or drprg, so I'm not fazed about that. Interestingly, that sample has a synonymous SNP in the first codon.

The allele at that fabG1 variant now looks slightly better and has been split in two

fabG1   81      8ca378d9        CGAGAC  CGAGAT,TGAGAC   .       PASS    VC=PH_SNPs;GRAPHTYPE=SIMPLE;VARID=fabG1_G-17T,fabG1_A-16X,fabG1_C-15X;PREDICT=S,S,R     GT:MEAN_FWD_COVG:MEAN_REV_COVG:MED_FWD_COVG:MED_REV_COVG:SUM_FWD_COVG:SUM_REV_COVG:GAPS:LIKELIHOOD:GT_CONF      1:11,16,6:12,25,2:9,15,0:5,28,1:71,101,25:75,155,11:0.5,0,0.75:-281.052,-151.402,-388.948:129.65
fabG1   93      fa7956b3        T       C       .       ld;sb   VC=SNP;GRAPHTYPE=SIMPLE;VARID=fabG1_T-8X;PREDICT=S      GT:MEAN_FWD_COVG:MEAN_REV_COVG:MED_FWD_COVG:MED_REV_COVG:SUM_FWD_COVG:SUM_REV_COVG:GAPS:LIKELIHOOD:GT_CONF      0:0,0:1,0:0,0:1,0:0,0:2,0:1,1:-129.795,-138.605:8.80986

Next task is to dig into the poor ofloxacin sensitivity.

@mbhall88
Owner Author

mbhall88 commented Oct 6, 2022

The OFX FNs are a fairly straightforward fix. It turns out that all of the FNs are gyrA D94X. What is happening here is that there is a silent mutation (does not confer resistance) at codon 95 (S95T) that occurs in the same allele as that variant in the VCF/PRG. Long story short, I end up combining these two variants and calling an unknown prediction for OFX with novel variant gyrA_DS94GT. So I just need to break these up and check whether any of them are in the panel and associated with resistance. Should hopefully finish the implementation tomorrow.
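
A minimal sketch of the "break these up" step, assuming the combined variant is written as reference residues, start codon, then alternate residues (as in DS94GT). The helper names and the one-entry panel are hypothetical and only there to illustrate the idea. This is not drprg's implementation.

```python
import re


def split_protein_variant(variant):
    """Split a multi-residue substitution such as 'DS94GT' into per-codon changes."""
    match = re.fullmatch(r"([A-Z*]+)(\d+)([A-Z*]+)", variant)
    if match is None:
        raise ValueError(f"unrecognised variant: {variant}")
    ref, start, alt = match.group(1), int(match.group(2)), match.group(3)
    if len(ref) != len(alt):
        # Length-changing variants are left untouched in this sketch
        return [variant]
    return [
        f"{r}{start + offset}{a}"
        for offset, (r, a) in enumerate(zip(ref, alt))
        if r != a
    ]


def panel_prediction(gene, variant, panel):
    """Look up an exact match first, then the 'any change' form ending in X."""
    exact = f"{gene}_{variant}"
    wildcard = f"{gene}_{variant[:-1]}X"
    return panel.get(exact, panel.get(wildcard))


# Toy panel for illustration only; the real panel has many more entries
TOY_PANEL = {"gyrA_D94X": "R"}

parts = split_protein_variant("DS94GT")
predictions = {v: panel_prediction("gyrA", v, TOY_PANEL) for v in parts}
```

Here the combined call splits into D94G and S95T, and only the codon 94 change matches the toy panel entry, so the sample would be predicted resistant rather than unknown.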

@iqbal-lab
Collaborator

Great!

@mbhall88
Owner Author

Here's the updated plots after the INH and OFX fixes listed above

Nanopore

image

Illumina

image


Some good improvements for nanopore. I'm going to have a look at the drprg STM, ETO, PZA and RIF sensitivity as mykrobe seems to be better than drprg. But pretty happy with the specificity of drprg at the moment.

@lachlancoin
Collaborator

lachlancoin commented Oct 10, 2022 via email

@mbhall88
Owner Author

Looks good. Not sure what's going on with TB profiler with RIF

It's most likely an issue with the custom panel I built. I'll fix that up once I've finished debugging drprg. I'm assuming tb profiler is on par with mykrobe for RIF though

@iqbal-lab
Collaborator

Looks great. I am actually surprised how good the specificity is for PZA nanopore, given it is dominated by indels. I had thought we had issues with indels

@mbhall88
Owner Author

I've now gone through the remainder of the FNs and nearly all of the FPs for nanopore.

RIF

2 FNs where drprg didn't discover the variant

4 FPs where all three tools call rpoB L430X
1 FP where mykrobe and drprg call rpoB L452X
1 FP that just scraped through the low depth cutoff of 3 in drprg - it had a depth of 3.

STM

1 FN where drprg calls 2 non-synonymous mutations (not in the panel) - these are also called by tb-profiler
1 FN where drprg calls the correct variant rpsL K43R but fails FRS (0.52)

14 FPs are called by all three callers. 9 were mutations in gid, 3 were in rrs, and 2 in rpsL. I'm not sure what we want to do here, because it is fairly reasonable this is a phenotyping problem. After a quick search, I found two references to support this [1, 2]. From 1

low-level streptomycin resistance mediated by gidB were frequently misclassified with respect to streptomycin resistance when using the WHO-recommended critical concentration of 2 μg/ml.

2 FPs were rpsL K88R which is very strongly associated with STM resistance. These were also called by mykrobe but not tb-profiler.

2 FPs were confident deletions in gid only called by drprg. mykrobe called one of them, but it was filtered due to low expected proportion of expected depth.

ETO

In total, there were 9 FNs which were all called by mykrobe, but not tb-profiler or drprg. They're all indels in ethA and de novo variant discovery was not triggered in drprg for any of them. In one of those FNs, there was a promoter mutation as well, which drprg did call, but it was filtered out for low FRS (0.55).

The other thing to note here is that while mykrobe's sensitivity is much better than drprg and tbprofiler, its specificity is terrible.

PZA

1 FN is pncA R154G which is called by mykrobe and tb-profiler. drprg calls the correct allele, but it is filtered out for low FRS (0.54)
2 FNs are indels called by mykrobe only. Both indels were null genotype (drprg calls this F for failed) as the depth was split evenly across two alleles.

INH

4 FPs were called confidently by all three tools (fabG1 C-15T)
3 FPs are deletions called by drprg but not by either of the other tools. They had below 10x depth in drprg, so they are probably not super confident


I don't think there is much more I can do to improve drprg's nanopore performance here. FRS could possibly be lowered, but it would only save a small number of FNs/FPs

I'll fix up the tb-profiler RIF sensitivity and then get stuck into the Illumina results

(This text file is my notepad while I was investigating these FNs and FPs)

drprg_nanopore_fn_fp_investigation.txt

@mbhall88
Owner Author

mbhall88 commented Oct 13, 2022

As our specificity is better or the same compared to mykrobe and tbprofiler for all drugs on Illumina, I only investigated the FNs.

tl;dr there are three things we may want to try to improve sensitivity as I suspect once we scale this analysis up to thousands of samples some of these problems will get bigger

  1. There are a noticeable number of variants where de novo discovery in pandora cannot find a path between the start and end kmer of a candidate region. All of the FNs that we miss and mykrobe and/or tbprofiler get are due to this, along with some RIF and ethionamide (ETO) FNs. The solution to this seems to be switching to Leandro's racon version of variant discovery in pandora. However, this is not very easy, as that fork and the current tip of pandora have diverged quite a bit. @iqbal-lab do you think it's realistic that Leandro could try and get this on master in the next few weeks? Or will I have to try and do it myself?
  2. There are 3 FNs in total that fail FRS with values 0.54, 0.53, and 0.58. Maybe we look to lower it even further? Does 0.50 or 0.51 sound fair?
  3. tbprofiler has an unfair advantage over mykrobe (and drprg). There were quite a few minor allele resistance calls. I have put mykrobe in haploid mode so it won't call minor resistance. drprg can't call minor resistance either way. Should I let mykrobe call minor variants on Illumina or make the minor allele frequency 0.5 for tbprofiler?

Aside from those overarching points, there were also some other FNs which were due to two variants right next to each other. For example, ERR2510154 has rpoB_S450F, which is actually caused by a 2bp MNP. One of these positions exists in the reference PRG drprg uses. The other variant gets discovered, but gets added in as a separate allele (bubble) in the PRG - I guess this is how make_prg update works though? Here it is

rpoB    1449    fa44b92a        C       T       .       ld;lgc  VC=SNP;GRAPHTYPE=SIMPLE;VARID=rpoB_ACTGTCGGCG1344A,rpoB_GTC1347G,rpoB_TC1348T,rpoB_S450X,rpoB_TCG1348T,rpoB_S450*,rpoB_C1349CA,rpoB_C1349CAA,rpoB_C1349CAC,rpoB_C1349CAG,rpoB_C1349CAT,rpoB_C1349CC,rpoB_C1349CCA,rpoB_C1349CCC,rpoB_C1349CCG,rpoB_C1349CCT,rpoB_C1349CG,rpoB_C1349CGA,rpoB_C1349CGC,rpoB_C1349CGG,rpoB_C1349CGT,rpoB_C1349CT,rpoB_C1349CTA,rpoB_C1349CTC,rpoB_C1349CTG,rpoB_C1349CTT,rpoB_CG1349C,rpoB_CGG1349C;PREDICT=F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F        GT:MEAN_FWD_COVG:MEAN_REV_COVG:MED_FWD_COVG:MED_REV_COVG:SUM_FWD_COVG:SUM_REV_COVG:GAPS:LIKELIHOOD:GT_CONF      .:0,0:0,0:0,0:0,0:0,0:0,1:1,1:-278,-278:0
rpoB    1450    9fbca785        G       T       .       ld;lgc  VC=SNP;GRAPHTYPE=SIMPLE;VARID=rpoB_ACTGTCGGCG1344A,rpoB_S450X,rpoB_TCG1348T,rpoB_S450*,rpoB_CG1349C,rpoB_CGG1349C,rpoB_G1350GA,rpoB_G1350GAA,rpoB_G1350GAC,rpoB_G1350GAG,rpoB_G1350GAT,rpoB_G1350GC,rpoB_G1350GCA,rpoB_G1350GCC,rpoB_G1350GCG,rpoB_G1350GCT,rpoB_G1350GG,rpoB_G1350GGA,rpoB_G1350GGC,rpoB_G1350GGG,rpoB_G1350GGT,rpoB_G1350GT,rpoB_G1350GTA,rpoB_G1350GTC,rpoB_G1350GTG,rpoB_G1350GTT,rpoB_GG1350G,rpoB_GGC1350G;PREDICT=F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F        GT:MEAN_FWD_COVG:MEAN_REV_COVG:MED_FWD_COVG:MED_REV_COVG:SUM_FWD_COVG:SUM_REV_COVG:GAPS:LIKELIHOOD:GT_CONF      .:0,0:0,0:0,0:0,0:0,0:0,0:1,1:-278,-278:0

The allele of this variant is TCG>TTT so pandora should be able to thread reads through both of these variants, but doesn't seem to be able to....??
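
To show why the two SNPs need to be read together, here is a short sketch that translates the codon 450 alleles with the standard genetic code. Taking the C>T change alone gives S450L and the G>T change alone is synonymous. Only the combined TCG>TTT change gives S450F. The `describe` helper is just for illustration.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, in TCAG order for each codon position
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def describe(ref_codon, alt_codon, codon_number):
    """Name the amino acid change, e.g. 'S450F', produced by swapping one codon."""
    return f"{CODON_TABLE[ref_codon]}{codon_number}{CODON_TABLE[alt_codon]}"


ref = "TCG"  # rpoB codon 450 in the reference
first_snp_only = describe(ref, "TTG", 450)   # C>T at the second codon position
second_snp_only = describe(ref, "TCT", 450)  # G>T at the third codon position
both_snps = describe(ref, "TTT", 450)        # the 2bp MNP seen in ERR2510154
```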

A similar thing happened for a few other FNs. I'm wondering if I should run on the full dataset and then manually add to the reference PRG some of the common variants that cause this problem? I was thinking the "correct" way to do this @iqbal-lab would be to look through the cryptic metadata sheets and add some samples that contain those variants that cause some of these problems?

drprg_illumina_fn_investigation.txt

@iqbal-lab
Collaborator

iqbal-lab commented Oct 13, 2022

Hi there, I'm a bit wiped out so will be brief

  1. Leandro is buried in 2 projects (plasmid stuff using Pandora, and Karel's mof project) and trying to extricate himself from the latter, so I think it would be hard for him to merge the racon branch soon. I must admit, I had forgotten it wasn't merged, very sorry Michael. I think it would be great if you could do it, and I think this could help a lot, and might make the issue about adjacent variants redundant, as racon just does the whole gene.
  2. I agree use 0.51, we've moved to that with covid.
  3. How about returning to minor alleles after addressing the above?

In terms of adding more variants to the graph; we can do this, but racon might mean you don't need to

@mbhall88
Owner Author

mbhall88 commented Oct 19, 2022

I've just realised, we should probably put some kind of minimum depth filter on these results too. i.e. samples with less than d depth are excluded from the sensitivity/specificity plots.

Does everyone agree? If so, does anyone have a preference for what d should be? I arbitrarily thought of 15x? (This is separate from the depth analysis in #3)

Here is the depth distribution for the 400 Illumina test set

image

and the full nanopore

image

Additionally, it might be wise to have a contamination proportion filter? For instance, when I align the reads to the decontamination database, I calculate the fraction of reads that we keep (i.e. MTB), fraction of reads that align to a contaminant, and a fraction of reads unmapped. Again, arbitrarily was thinking exclude samples with more than 5% contamination? Too harsh?

This is the fraction of contamination for Illumina

image

and nanopore

image
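
For clarity, this is roughly what the two proposed filters would look like. It is only a sketch: the `SampleQC` record and its field names are made up for illustration, and the 15x and 5% thresholds are the arbitrary starting points suggested above.

```python
from dataclasses import dataclass

MIN_DEPTH = 15.0          # samples with less than this depth are excluded
MAX_CONTAMINATION = 0.05  # samples with more than this contaminant fraction are excluded


@dataclass
class SampleQC:
    name: str
    depth: float          # read depth for the sample
    contamination: float  # fraction of reads aligning to a contaminant


def passes_qc(sample, min_depth=MIN_DEPTH, max_contamination=MAX_CONTAMINATION):
    """Keep a sample only if it has enough depth and little enough contamination."""
    return sample.depth >= min_depth and sample.contamination <= max_contamination


# Hypothetical samples, not taken from the dataset
samples = [
    SampleQC("sample_a", depth=60.0, contamination=0.01),
    SampleQC("sample_b", depth=9.5, contamination=0.00),
    SampleQC("sample_c", depth=40.0, contamination=0.12),
]
kept = [s.name for s in samples if passes_qc(s)]
```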

@iqbal-lab
Collaborator

Both seem perfectly reasonable to me

@mbhall88
Owner Author

So I have a working drprg branch adapted to use pandora with the racon denovo method (iqbal-lab-org/pandora#299).

I've tested it out on two Illumina runs listed in #2.

The first, ERR2510154, was the example VCF above. So, with the old pandora denovo process, at the allele for rpoB_S450F we had

rpoB    1449    fa44b92a        C       T       .       ld;lgc  VC=SNP;GRAPHTYPE=SIMPLE;VARID=rpoB_ACTGTCGGCG1344A,rpoB_GTC1347G,rpoB_TC1348T,rpoB_S450X,rpoB_TCG1348T,rpoB_S450*,rpoB_C1349CA,rpoB_C1349CAA,rpoB_C1349CAC,rpoB_C1349CAG,rpoB_C1349CAT,rpoB_C1349CC,rpoB_C1349CCA,rpoB_C1349CCC,rpoB_C1349CCG,rpoB_C1349CCT,rpoB_C1349CG,rpoB_C1349CGA,rpoB_C1349CGC,rpoB_C1349CGG,rpoB_C1349CGT,rpoB_C1349CT,rpoB_C1349CTA,rpoB_C1349CTC,rpoB_C1349CTG,rpoB_C1349CTT,rpoB_CG1349C,rpoB_CGG1349C;PREDICT=F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F        GT:MEAN_FWD_COVG:MEAN_REV_COVG:MED_FWD_COVG:MED_REV_COVG:SUM_FWD_COVG:SUM_REV_COVG:GAPS:LIKELIHOOD:GT_CONF      .:0,0:0,0:0,0:0,0:0,0:0,1:1,1:-278,-278:0
rpoB    1450    9fbca785        G       T       .       ld;lgc  VC=SNP;GRAPHTYPE=SIMPLE;VARID=rpoB_ACTGTCGGCG1344A,rpoB_S450X,rpoB_TCG1348T,rpoB_S450*,rpoB_CG1349C,rpoB_CGG1349C,rpoB_G1350GA,rpoB_G1350GAA,rpoB_G1350GAC,rpoB_G1350GAG,rpoB_G1350GAT,rpoB_G1350GC,rpoB_G1350GCA,rpoB_G1350GCC,rpoB_G1350GCG,rpoB_G1350GCT,rpoB_G1350GG,rpoB_G1350GGA,rpoB_G1350GGC,rpoB_G1350GGG,rpoB_G1350GGT,rpoB_G1350GT,rpoB_G1350GTA,rpoB_G1350GTC,rpoB_G1350GTG,rpoB_G1350GTT,rpoB_GG1350G,rpoB_GGC1350G;PREDICT=F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F        GT:MEAN_FWD_COVG:MEAN_REV_COVG:MED_FWD_COVG:MED_REV_COVG:SUM_FWD_COVG:SUM_REV_COVG:GAPS:LIKELIHOOD:GT_CONF      .:0,0:0,0:0,0:0,0:0,0:0,0:1,1:-278,-278:0

with the new denovo process, we get

rpoB    1449    ba1603d4        CG      TG,TT   .       PASS    VC=PH_SNPs;GRAPHTYPE=SIMPLE;VARID=rpoB_ACTGTCGGCG1344A,rpoB_GTC1347G,rpoB_TC1348T,rpoB_S450X,rpoB_TCG1348T,rpoB_S450*,rpoB_C1349CA,rpoB_C1349CAA,rpoB_C1349CAC,rpoB_C1349CAG,rpoB_C1349CAT,rpoB_C1349CC,rpoB_C1349CCA,rpoB_C1349CCC,rpoB_C1349CCG,rpoB_C1349CCT,rpoB_C1349CG,rpoB_C1349CGA,rpoB_C1349CGC,rpoB_C1349CGG,rpoB_C1349CGT,rpoB_C1349CT,rpoB_C1349CTA,rpoB_C1349CTC,rpoB_C1349CTG,rpoB_C1349CTT,rpoB_CG1349C,rpoB_CGG1349C,rpoB_G1350GA,rpoB_G1350GAA,rpoB_G1350GAC,rpoB_G1350GAG,rpoB_G1350GAT,rpoB_G1350GC,rpoB_G1350GCA,rpoB_G1350GCC,rpoB_G1350GCG,rpoB_G1350GCT,rpoB_G1350GG,rpoB_G1350GGA,rpoB_G1350GGC,rpoB_G1350GGG,rpoB_G1350GGT,rpoB_G1350GT,rpoB_G1350GTA,rpoB_G1350GTC,rpoB_G1350GTG,rpoB_G1350GTT,rpoB_GG1350G,rpoB_GGC1350G;PREDICT=S,S,S,R,S,S,S,S,S,S,S,S,S,S,S,S,S,S,S,S,S,S,S,S,S,S,S,S,S,S,S,S,S,S,S,S,S,S,S,S,S,S,S,S,S,S,S,S,S,S     GT:MEAN_FWD_COVG:MEAN_REV_COVG:MED_FWD_COVG:MED_REV_COVG:SUM_FWD_COVG:SUM_REV_COVG:GAPS:LIKELIHOOD:GT_CONF   2:0,0,51:0,0,41:0,0,61:0,0,48:0,0,308:0,1,251:1,1,0.166667:-703.676,-703.676,-35.8875:667.788

I guess another reason this might have been fixed is that instead of using make_prg update to add the denovo sequences into the PRG, we recreate the MSAs and rebuild the PRGs for those genes with novel variants. So this particular case could be a weakness of make_prg update as it just updated with the novel variant - rpoB 1450 G>T - without combining it with the previous position into a single allele.

The second run, ERR4828599, had both a RIF FN and an INH FN. The RIF FN was an interesting case where the isolate has both L449M and S450F. drprg/pandora previously failed to find a novel variant. With the new pandora denovo process, we found (and called) both of these variants. The INH FN was katG S315N, which is a rarer mutation at that locus - normally S315T. Previously both mykrobe and drprg had no depth in this area and drprg did not find a novel variant. With the new pandora, we do find and call this mutation.

I'm going to run a few more of the Illumina FNs, but this is very promising!

@mbhall88
Owner Author

Okay, since the last update of results we have switched to using racon for denovo discovery and dropped the old nanopore data. I have also increased the number of Illumina samples to 8,587.

Illumina

Note I am going to change the markers so you can see the error bars now that they are so small

image

Drug Tool FN(R) FP(S) Sensitivity (95% CI) Specificity (95% CI) MCC
Amikacin drprg 77(485) 50(6958) 84.1% (80.6-87.1%) 99.3% (99.1-99.5%) 0.857
Amikacin mykrobe 101(485) 46(6958) 79.2% (75.3-82.6%) 99.3% (99.1-99.5%) 0.831
Amikacin tbprofiler 62(485) 59(6958) 87.2% (83.9-89.9%) 99.2% (98.9-99.3%) 0.866
Capreomycin drprg 62(235) 92(2449) 73.6% (67.6-78.8%) 96.2% (95.4-96.9%) 0.662
Capreomycin mykrobe 78(235) 85(2449) 66.8% (60.6-72.5%) 96.5% (95.7-97.2%) 0.625
Capreomycin tbprofiler 54(235) 96(2449) 77.0% (71.2-81.9%) 96.1% (95.2-96.8%) 0.679
Delamanid drprg 111(116) 1(8152) 4.3% (1.9-9.7%) 100.0% (99.9-100.0%) 0.188
Delamanid mykrobe 111(116) 1(8152) 4.3% (1.9-9.7%) 100.0% (99.9-100.0%) 0.188
Delamanid tbprofiler 111(116) 2(8152) 4.3% (1.9-9.7%) 100.0% (99.9-100.0%) 0.173
Ethambutol drprg 146(1538) 736(4936) 90.5% (88.9-91.9%) 85.1% (84.1-86.1%) 0.685
Ethambutol mykrobe 149(1538) 728(4936) 90.3% (88.7-91.7%) 85.3% (84.2-86.2%) 0.686
Ethambutol tbprofiler 118(1538) 765(4936) 92.3% (90.9-93.6%) 84.5% (83.5-85.5%) 0.691
Ethionamide drprg 341(1104) 372(6105) 69.1% (66.3-71.8%) 93.9% (93.3-94.5%) 0.623
Ethionamide mykrobe 276(1104) 395(6105) 75.0% (72.4-77.5%) 93.5% (92.9-94.1%) 0.658
Ethionamide tbprofiler 272(1104) 414(6105) 75.4% (72.7-77.8%) 93.2% (92.6-93.8%) 0.653
Isoniazid drprg 362(3900) 164(4194) 90.7% (89.8-91.6%) 96.1% (95.5-96.6%) 0.871
Isoniazid mykrobe 366(3900) 163(4194) 90.6% (89.7-91.5%) 96.1% (95.5-96.7%) 0.87
Isoniazid tbprofiler 297(3900) 181(4194) 92.4% (91.5-93.2%) 95.7% (95.0-96.3%) 0.882
Kanamycin drprg 142(670) 101(6975) 78.8% (75.6-81.7%) 98.6% (98.2-98.8%) 0.796
Kanamycin mykrobe 166(670) 96(6975) 75.2% (71.8-78.3%) 98.6% (98.3-98.9%) 0.776
Kanamycin tbprofiler 122(670) 107(6975) 81.8% (78.7-84.5%) 98.5% (98.1-98.7%) 0.811
Levofloxacin drprg 105(1040) 97(5454) 89.9% (87.9-91.6%) 98.2% (97.8-98.5%) 0.884
Levofloxacin mykrobe 108(1040) 97(5454) 89.6% (87.6-91.3%) 98.2% (97.8-98.5%) 0.882
Levofloxacin tbprofiler 85(1040) 109(5454) 91.8% (90.0-93.3%) 98.0% (97.6-98.3%) 0.89
Linezolid drprg 49(65) 4(6110) 24.6% (15.8-36.3%) 99.9% (99.8-100.0%) 0.441
Linezolid mykrobe 49(65) 4(6110) 24.6% (15.8-36.3%) 99.9% (99.8-100.0%) 0.441
Linezolid tbprofiler 48(65) 5(6110) 26.2% (17.0-38.0%) 99.9% (99.8-100.0%) 0.447
Moxifloxacin drprg 60(603) 464(5431) 90.0% (87.4-92.2%) 91.5% (90.7-92.2%) 0.656
Moxifloxacin mykrobe 59(603) 460(5431) 90.2% (87.6-92.3%) 91.5% (90.8-92.2%) 0.658
Moxifloxacin tbprofiler 42(603) 482(5431) 93.0% (90.7-94.8%) 91.1% (90.3-91.9%) 0.668
Ofloxacin drprg 31(105) 4(424) 70.5% (61.2-78.4%) 99.1% (97.6-99.6%) 0.782
Ofloxacin mykrobe 32(105) 4(424) 69.5% (60.2-77.5%) 99.1% (97.6-99.6%) 0.776
Ofloxacin tbprofiler 26(105) 6(424) 75.2% (66.2-82.5%) 98.6% (96.9-99.3%) 0.802
Pyrazinamide drprg 75(341) 47(822) 78.0% (73.3-82.1%) 94.3% (92.5-95.7%) 0.742
Pyrazinamide mykrobe 73(341) 45(822) 78.6% (73.9-82.6%) 94.5% (92.8-95.9%) 0.751
Pyrazinamide tbprofiler 45(341) 62(822) 86.8% (82.8-90.0%) 92.5% (90.4-94.1%) 0.782
Rifampicin drprg 142(3222) 166(4586) 95.6% (94.8-96.2%) 96.4% (95.8-96.9%) 0.919
Rifampicin mykrobe 187(3222) 165(4586) 94.2% (93.3-95.0%) 96.4% (95.8-96.9%) 0.907
Rifampicin tbprofiler 102(3222) 177(4586) 96.8% (96.2-97.4%) 96.1% (95.5-96.7%) 0.927
Streptomycin drprg 278(1042) 130(1205) 73.3% (70.6-75.9%) 89.2% (87.3-90.8%) 0.637
Streptomycin mykrobe 295(1042) 132(1205) 71.7% (68.9-74.3%) 89.0% (87.2-90.7%) 0.621
Streptomycin tbprofiler 257(1042) 136(1205) 75.3% (72.6-77.9%) 88.7% (86.8-90.4%) 0.649

Nanopore

image

Drug Tool FN(R) FP(S) Sensitivity (95% CI) Specificity (95% CI) MCC
Amikacin drprg 0(11) 3(78) 100.0% (74.1-100.0%) 96.2% (89.3-98.7%) 0.869
Amikacin mykrobe 0(11) 3(78) 100.0% (74.1-100.0%) 96.2% (89.3-98.7%) 0.869
Amikacin tbprofiler 0(11) 3(78) 100.0% (74.1-100.0%) 96.2% (89.3-98.7%) 0.869
Capreomycin drprg 1(1) 1(51) 0.0% (0.0-79.3%) 98.0% (89.7-99.7%) -0.02
Capreomycin mykrobe 1(1) 1(51) 0.0% (0.0-79.3%) 98.0% (89.7-99.7%) -0.02
Capreomycin tbprofiler 1(1) 1(51) 0.0% (0.0-79.3%) 98.0% (89.7-99.7%) -0.02
Ethambutol drprg 4(14) 15(77) 71.4% (45.4-88.3%) 80.5% (70.3-87.8%) 0.42
Ethambutol mykrobe 4(14) 15(77) 71.4% (45.4-88.3%) 80.5% (70.3-87.8%) 0.42
Ethambutol tbprofiler 5(14) 15(77) 64.3% (38.8-83.7%) 80.5% (70.3-87.8%) 0.367
Ethionamide drprg 0(4) 1(9) 100.0% (51.0-100.0%) 88.9% (56.5-98.0%) 0.843
Ethionamide mykrobe 0(4) 1(9) 100.0% (51.0-100.0%) 88.9% (56.5-98.0%) 0.843
Ethionamide tbprofiler 0(4) 1(9) 100.0% (51.0-100.0%) 88.9% (56.5-98.0%) 0.843
Isoniazid drprg 9(51) 4(48) 82.4% (69.7-90.4%) 91.7% (80.4-96.7%) 0.742
Isoniazid mykrobe 9(51) 4(48) 82.4% (69.7-90.4%) 91.7% (80.4-96.7%) 0.742
Isoniazid tbprofiler 9(51) 3(48) 82.4% (69.7-90.4%) 93.8% (83.2-97.9%) 0.764
Kanamycin drprg 0(0) 1(52) - 98.1% (89.9-99.7%) -
Kanamycin mykrobe 0(0) 1(52) - 98.1% (89.9-99.7%) -
Kanamycin tbprofiler 0(0) 1(52) - 98.1% (89.9-99.7%) -
Moxifloxacin drprg 0(0) 1(1) - 0.0% (0.0-79.3%) -
Moxifloxacin mykrobe 0(0) 1(1) - 0.0% (0.0-79.3%) -
Moxifloxacin tbprofiler 0(0) 1(1) - 0.0% (0.0-79.3%) -
Ofloxacin drprg 0(10) 4(77) 100.0% (72.2-100.0%) 94.8% (87.4-98.0%) 0.823
Ofloxacin mykrobe 0(10) 4(77) 100.0% (72.2-100.0%) 94.8% (87.4-98.0%) 0.823
Ofloxacin tbprofiler 0(10) 3(77) 100.0% (72.2-100.0%) 96.1% (89.2-98.7%) 0.86
Pyrazinamide drprg 0(0) 0(1) - 100.0% (20.7-100.0%) -
Pyrazinamide mykrobe 0(0) 0(1) - 100.0% (20.7-100.0%) -
Pyrazinamide tbprofiler 0(0) 0(1) - 100.0% (20.7-100.0%) -
Rifampicin drprg 5(48) 1(44) 89.6% (77.8-95.5%) 97.7% (88.2-99.6%) 0.873
Rifampicin mykrobe 5(48) 1(44) 89.6% (77.8-95.5%) 97.7% (88.2-99.6%) 0.873
Rifampicin tbprofiler 5(48) 1(44) 89.6% (77.8-95.5%) 97.7% (88.2-99.6%) 0.873
Streptomycin drprg 2(8) 14(83) 75.0% (40.9-92.9%) 83.1% (73.7-89.7%) 0.398
Streptomycin mykrobe 2(8) 27(83) 75.0% (40.9-92.9%) 67.5% (56.8-76.6%) 0.25
Streptomycin tbprofiler 2(8) 12(83) 75.0% (40.9-92.9%) 85.5% (76.4-91.5%) 0.43

@lachlancoin
Collaborator

lachlancoin commented Nov 24, 2022 via email

@iqbal-lab
Collaborator

Yeah!

@mbhall88
Owner Author

I disagree sadly haha. TBProfiler is beating us on a lot of drugs. Going to dig into why that is now

@iqbal-lab
Collaborator

Looking again (now on laptop), if I were to summarise those results:

On Illumina, tb-profiler often has the highest sensitivity. It does pay a very small price in specificity, but that is much less noticeable than the sensitivity increase. So I agree, good to look into that.

On nanopore: sensitivity of all tools is essentially identical (except tb-profiler has a problem on EMB). Specificity is also essentially identical, although for two drugs (streptomycin and ofloxacin) tb-profiler has slightly higher specificity.
I'm quite impressed/surprised by how well all 3 do on the 4 drugs where any frameshift in a gene causes a resistant call. It matches what you found for Mykrobe in your Lancet Microbe paper @mbhall88, but I am delighted it's also true for DrPrg; I'm also a bit surprised that tb-profiler does that well too, given it uses bcftools. We didn't find we could call indels with this level of specificity. (I guess just refusing to make indel calls with nanopore would give very high specificity?)

@mbhall88
Owner Author

mbhall88 commented Nov 25, 2022

I've been looking through the variants where drprg is FN but either of the other tools is TP (on Illumina) to see what variants we have missed. I'm not finished yet, but a lot of the tbprofiler TPs where we are FN are to do with minor alleles. By default, TBProfiler will call anything with a fraction of 0.1 or more. This brings up point 3 from #2 again. We tell mykrobe to run in haploid mode and drprg only runs in haploid mode. The options forward I see are:

  1. Run mykrobe in diploid mode (call minor alleles) and take the sensitivity hit in drprg, as we can't call minor alleles
  2. force fraction 0.5 in tbprofiler so that all tools are effectively in haploid mode
  3. implement a diploid model in pandora (not sure how much work this would be? will alert Leandro to get his input too)

Option 2 is obviously the easiest and most likely to make us look better, but it sits somewhat uncomfortably with me as we are kind of skewing the results in our favour right?

I will keep working through these results next week for other drugs, as there are also a few cases of weird indels which I will document when I have a better understanding of what's going on.

@iqbal-lab
Collaborator

I think detecting minors could easily be done directly in drprg, no need to implement in Pandora. You get coverage info on the S and R alleles right? Just ask if the coverage on any R allele is >0.1 of the total
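As a rough illustration of that check, here is a minimal Python sketch (not drprg code). The function name, the dict layout and the example depths are all hypothetical; only the 0.1 cut-off comes from the thread.

```python
def minor_resistance_alleles(allele_coverage, resistant_alleles, min_fraction=0.1):
    """Return the resistant alleles whose share of total coverage exceeds min_fraction.

    allele_coverage: dict mapping allele label -> read depth (hypothetical layout).
    resistant_alleles: labels of alleles known to confer resistance.
    """
    total = sum(allele_coverage.values())
    if total == 0:
        # No coverage at this site, so nothing can be called.
        return []
    return [
        allele
        for allele in resistant_alleles
        if allele_coverage.get(allele, 0) / total > min_fraction
    ]


# 85 reads support the susceptible allele and 15 the resistant one (made-up numbers).
print(minor_resistance_alleles({"S": 85, "R": 15}, ["R"]))  # ['R']
```

A haploid caller would report only the majority allele (S) here, which is how a site like this ends up as an FN.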

@leoisl
Collaborator

leoisl commented Nov 25, 2022

3. implement a diploid model in pandora (not sure how much work this would be? will alert Leandro to get his input too)

I don't know how much work this would be either, because the only experience I have with genotyping models is in pandora, which has a haploid model. If implementing a diploid model is simply calling the two most likely alleles, then a simple implementation that gets the most likely allele (what is currently implemented) and the second most likely allele (remove/ignore the most likely and rerun the genotyping algorithm) may not be hard. This could easily be generalised to n-ploid... but I don't think it is as simple as this...

@iqbal-lab
Collaborator

iqbal-lab commented Nov 25, 2022

  1. The problem is it's not really diploid. Minor alleles in bacteria sometimes occur at a few (or many) places across the genome, but at different frequencies. Diploid assumes 50/50. Mykrobe uses a kind of diploid model, but it's a hack, and the genotype confidence is not well calibrated.
  2. In Pandora, there are a bunch of things you could do, but you're describing whole genome variation. Maybe you'd say "looks like there is a mixture of 2 genomes at 70:30 ratio". Or "lots of mixed positions in this data, looks dodgy".
  3. In Mykrobe and drprg, you have positions you care about, and knowledge that low frequency resistance alleles cause drug resistance. So you just need to spot them, independently, at each SNP. Pandora makes a VCF which I believe drprg parses here (https://github.com/mbhall88/drprg/blob/265c25c9e027a26f8c671931a736e19da399142e/src/predict.rs#L402). The VCF has coverage on both alleles, or all alleles. So you can just parse that to spot minors.
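To make point 3 concrete, here is a hedged Python sketch of pulling per-allele coverage out of a single VCF record. It assumes the sample column carries comma-separated per-allele forward and reverse depths under FORMAT keys named `MEAN_FWD_COVG` and `MEAN_REV_COVG`; check those names against the actual pandora output before relying on them. The record itself is made up for illustration.

```python
def allele_fractions(vcf_line, fwd_key="MEAN_FWD_COVG", rev_key="MEAN_REV_COVG"):
    """Return {allele: fraction of total depth} for one single-sample VCF record.

    The FORMAT key names are assumptions about the VCF layout, not confirmed here.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    ref, alts = fields[3], fields[4].split(",")
    # Pair each FORMAT key with the matching value in the sample column.
    sample = dict(zip(fields[8].split(":"), fields[9].split(":")))
    fwd = [float(x) for x in sample[fwd_key].split(",")]
    rev = [float(x) for x in sample[rev_key].split(",")]
    depth = [f + r for f, r in zip(fwd, rev)]
    total = sum(depth)
    alleles = [ref] + alts
    if total == 0:
        return {allele: 0.0 for allele in alleles}
    return {allele: d / total for allele, d in zip(alleles, depth)}


# Made-up record: the genotype is called as ref (0), yet the alt allele has 15% of the depth.
record = "rpoB\t1349\t.\tC\tT\t.\tPASS\t.\tGT:MEAN_FWD_COVG:MEAN_REV_COVG\t0:40,6:45,9"
print(allele_fractions(record))
```

The point of the example is that the called genotype and the per-allele depths are separate pieces of information in the same record, so minors can be spotted without changing the genotyping model.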

@mbhall88
Owner Author

I think detecting minors could easily be done directly in drprg, no need to implement in Pandora. You get coverage info on the S and R alleles right? Just ask if the coverage on any R allele is >0.1 of the total

True. I'll have to do some reimplementing though as I currently only pay attention to the called alleles. But it shouldn't take too long to get this working 🤞

@iqbal-lab
Collaborator

Hurrah! I would reread the section on minor alleles here
https://wellcomeopenresearch.org/articles/4-191
I just reread it and it was informative, reminded me of differences between drugs

@iqbal-lab
Collaborator

iqbal-lab commented Dec 15, 2022

Two solutions

  1. These minor alleles must by definition be catalogue SNPs. So they are in our graph, but the problem is that racon has found something new in our consensus, and that is close to the minor allele and the combination of both is not in the graph. Right? So all we need to do is rebuild the PRG including catalogue and also the racon call?
  2. There is a clean solution that requires a fair bit of new code. At the end of drprg take the consensus and align to the reference so we know what positions in the consensus correspond to catalogue SNP coords. Then minimap the reads to the consensus (I see a rust port of minimap exists, but looks maybe immature? https://crates.io/crates/minimap2 ). Then just count minor allele bases in the pileup at the catalogue SNP coords.
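The last step of solution 2 (counting bases in the pileup at a known coordinate) can be sketched in a few lines of Python. This is a toy under a strong assumption: every read is aligned to the consensus without gaps and is given as a hypothetical `(start, sequence)` pair. A real version would need to walk the CIGAR strings from the aligner.

```python
from collections import Counter


def base_counts_at(reads, position):
    """Count the bases that gapless reads place at a 0-based consensus position.

    reads: iterable of (start, sequence) pairs, where start is the 0-based
    consensus coordinate of the first base (an illustrative format, not SAM).
    """
    counts = Counter()
    for start, seq in reads:
        offset = position - start
        if 0 <= offset < len(seq):
            counts[seq[offset]] += 1
    return counts


# Three toy reads; two carry T and one carries A at consensus position 3.
reads = [(0, "ACGTAC"), (2, "GTAC"), (3, "AAC")]
print(base_counts_at(reads, 3))
```

The resulting counts could then go through the same fraction threshold discussed earlier in the thread.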

@iqbal-lab
Collaborator

This is very exciting and good news really; there is a lot of sensitivity gain to be had from the minor alleles and gene deletions

@mbhall88
Owner Author

mbhall88 commented Dec 15, 2022

For point 1, no, that is not right. We don't have these variants in the graph, which is the problem. (Remember our graph is not the panel, but the sparse population PRG from randomly sampled cryptic samples.) And racon can't find them in these samples because they're only minor alleles. Racon will find the major allele, which is the reference.

Point 2 seems like it effectively does away with the need for pandora though; isn't it almost what tbprofiler does? It will also dramatically increase our runtime and memory usage, which at the moment is our biggest selling point really.

@iqbal-lab
Collaborator

  1. Ah, I did forget
  2. No, it doesn't at all do away with the need for Pandora. Mapping to the consensus will be dominated by exact matches (for Illumina), so it is much easier to spot minors

I need to think!

@iqbal-lab
Collaborator

Follow up to 2. I'm not pushing for this solution, but just to say, we do this for covid, 30kb long, and use <500 MB RAM and 45 seconds for the whole process. I think performance is not a barrier. But there are other arguments not to do it

@mbhall88
Owner Author

mbhall88 commented Jan 3, 2023

After closing mbhall88/drprg#23 the current (Illumina) results are

Drug Tool FN(R) FP(S) Sensitivity (95% CI) Specificity (95% CI) MCC
Amikacin drprg 68(484) 57(6958) 86.0% (82.6-88.8%) 99.2% (98.9-99.4%) 0.86
Amikacin mykrobe 93(484) 51(6958) 80.8% (77.0-84.0%) 99.3% (99.0-99.4%) 0.835
Amikacin tbprofiler 62(484) 59(6958) 87.2% (83.9-89.9%) 99.2% (98.9-99.3%) 0.866
Capreomycin drprg 57(235) 94(2448) 75.7% (69.9-80.8%) 96.2% (95.3-96.9%) 0.673
Capreomycin mykrobe 72(235) 87(2448) 69.4% (63.2-74.9%) 96.4% (95.6-97.1%) 0.64
Capreomycin tbprofiler 54(235) 95(2448) 77.0% (71.2-81.9%) 96.1% (95.3-96.8%) 0.681
Delamanid drprg 111(116) 5(8151) 4.3% (1.9-9.7%) 99.9% (99.9-100.0%) 0.144
Delamanid mykrobe 111(116) 2(8151) 4.3% (1.9-9.7%) 100.0% (99.9-100.0%) 0.173
Delamanid tbprofiler 111(116) 2(8151) 4.3% (1.9-9.7%) 100.0% (99.9-100.0%) 0.173
Ethambutol drprg 121(1537) 752(4935) 92.1% (90.7-93.4%) 84.8% (83.7-85.7%) 0.693
Ethambutol mykrobe 133(1537) 747(4935) 91.3% (89.8-92.7%) 84.9% (83.8-85.8%) 0.688
Ethambutol tbprofiler 118(1537) 765(4935) 92.3% (90.9-93.6%) 84.5% (83.5-85.5%) 0.691
Ethionamide drprg 273(1103) 417(6105) 75.2% (72.6-77.7%) 93.2% (92.5-93.8%) 0.651
Ethionamide mykrobe 265(1103) 413(6105) 76.0% (73.4-78.4%) 93.2% (92.6-93.8%) 0.658
Ethionamide tbprofiler 272(1103) 414(6105) 75.3% (72.7-77.8%) 93.2% (92.6-93.8%) 0.653
Isoniazid drprg 307(3899) 173(4193) 92.1% (91.2-92.9%) 95.9% (95.2-96.4%) 0.882
Isoniazid mykrobe 333(3899) 170(4193) 91.5% (90.5-92.3%) 95.9% (95.3-96.5%) 0.876
Isoniazid tbprofiler 297(3899) 181(4193) 92.4% (91.5-93.2%) 95.7% (95.0-96.3%) 0.882
Kanamycin drprg 128(669) 107(6975) 80.9% (77.7-83.7%) 98.5% (98.1-98.7%) 0.805
Kanamycin mykrobe 152(669) 98(6975) 77.3% (74.0-80.3%) 98.6% (98.3-98.8%) 0.788
Kanamycin tbprofiler 122(669) 107(6975) 81.8% (78.7-84.5%) 98.5% (98.1-98.7%) 0.811
Levofloxacin drprg 81(1040) 102(5454) 92.2% (90.4-93.7%) 98.1% (97.7-98.5%) 0.896
Levofloxacin mykrobe 88(1040) 102(5454) 91.5% (89.7-93.1%) 98.1% (97.7-98.5%) 0.892
Levofloxacin tbprofiler 85(1040) 109(5454) 91.8% (90.0-93.3%) 98.0% (97.6-98.3%) 0.89
Linezolid drprg 48(65) 4(6109) 26.2% (17.0-38.0%) 99.9% (99.8-100.0%) 0.457
Linezolid mykrobe 48(65) 4(6109) 26.2% (17.0-38.0%) 99.9% (99.8-100.0%) 0.457
Linezolid tbprofiler 48(65) 5(6109) 26.2% (17.0-38.0%) 99.9% (99.8-100.0%) 0.447
Moxifloxacin drprg 41(603) 478(5430) 93.2% (90.9-94.9%) 91.2% (90.4-91.9%) 0.67
Moxifloxacin mykrobe 44(603) 472(5430) 92.7% (90.3-94.5%) 91.3% (90.5-92.0%) 0.669
Moxifloxacin tbprofiler 42(603) 481(5430) 93.0% (90.7-94.8%) 91.1% (90.4-91.9%) 0.668
Ofloxacin drprg 24(104) 5(424) 76.9% (68.0-84.0%) 98.8% (97.3-99.5%) 0.82
Ofloxacin mykrobe 26(104) 5(424) 75.0% (65.9-82.3%) 98.8% (97.3-99.5%) 0.807
Ofloxacin tbprofiler 26(104) 6(424) 75.0% (65.9-82.3%) 98.6% (96.9-99.3%) 0.8
Pyrazinamide drprg 68(341) 53(820) 80.1% (75.5-84.0%) 93.5% (91.6-95.0%) 0.746
Pyrazinamide mykrobe 55(341) 56(820) 83.9% (79.6-87.4%) 93.2% (91.2-94.7%) 0.77
Pyrazinamide tbprofiler 45(341) 62(820) 86.8% (82.8-90.0%) 92.4% (90.4-94.1%) 0.782
Rifampicin drprg 133(3221) 166(4585) 95.9% (95.1-96.5%) 96.4% (95.8-96.9%) 0.921
Rifampicin mykrobe 164(3221) 169(4585) 94.9% (94.1-95.6%) 96.3% (95.7-96.8%) 0.912
Rifampicin tbprofiler 102(3221) 177(4585) 96.8% (96.2-97.4%) 96.1% (95.5-96.7%) 0.927
Streptomycin drprg 266(1041) 133(1205) 74.4% (71.7-77.0%) 89.0% (87.1-90.6%) 0.644
Streptomycin mykrobe 282(1041) 135(1205) 72.9% (70.1-75.5%) 88.8% (86.9-90.5%) 0.629
Streptomycin tbprofiler 257(1041) 136(1205) 75.3% (72.6-77.8%) 88.7% (86.8-90.4%) 0.649

illumina

The nanopore results remain unchanged

@mbhall88
Owner Author

mbhall88 commented Jan 4, 2023

After the updates in minor allele calling in mbhall88/drprg#19 (comment)

Drug Tool FN(R) FP(S) Sensitivity (95% CI) Specificity (95% CI) MCC
Amikacin drprg 68(484) 57(6958) 86.0% (82.6-88.8%) 99.2% (98.9-99.4%) 0.86
Amikacin mykrobe 93(484) 51(6958) 80.8% (77.0-84.0%) 99.3% (99.0-99.4%) 0.835
Amikacin tbprofiler 62(484) 59(6958) 87.2% (83.9-89.9%) 99.2% (98.9-99.3%) 0.866
Capreomycin drprg 57(235) 94(2448) 75.7% (69.9-80.8%) 96.2% (95.3-96.9%) 0.673
Capreomycin mykrobe 72(235) 87(2448) 69.4% (63.2-74.9%) 96.4% (95.6-97.1%) 0.64
Capreomycin tbprofiler 54(235) 95(2448) 77.0% (71.2-81.9%) 96.1% (95.3-96.8%) 0.681
Delamanid drprg 111(116) 5(8151) 4.3% (1.9-9.7%) 99.9% (99.9-100.0%) 0.144
Delamanid mykrobe 111(116) 2(8151) 4.3% (1.9-9.7%) 100.0% (99.9-100.0%) 0.173
Delamanid tbprofiler 111(116) 2(8151) 4.3% (1.9-9.7%) 100.0% (99.9-100.0%) 0.173
Ethambutol drprg 121(1537) 752(4935) 92.1% (90.7-93.4%) 84.8% (83.7-85.7%) 0.693
Ethambutol mykrobe 133(1537) 747(4935) 91.3% (89.8-92.7%) 84.9% (83.8-85.8%) 0.688
Ethambutol tbprofiler 118(1537) 765(4935) 92.3% (90.9-93.6%) 84.5% (83.5-85.5%) 0.691
Ethionamide drprg 272(1103) 420(6105) 75.3% (72.7-77.8%) 93.1% (92.5-93.7%) 0.651
Ethionamide mykrobe 265(1103) 413(6105) 76.0% (73.4-78.4%) 93.2% (92.6-93.8%) 0.658
Ethionamide tbprofiler 272(1103) 414(6105) 75.3% (72.7-77.8%) 93.2% (92.6-93.8%) 0.653
Isoniazid drprg 307(3899) 173(4193) 92.1% (91.2-92.9%) 95.9% (95.2-96.4%) 0.882
Isoniazid mykrobe 333(3899) 170(4193) 91.5% (90.5-92.3%) 95.9% (95.3-96.5%) 0.876
Isoniazid tbprofiler 297(3899) 181(4193) 92.4% (91.5-93.2%) 95.7% (95.0-96.3%) 0.882
Kanamycin drprg 128(669) 107(6975) 80.9% (77.7-83.7%) 98.5% (98.1-98.7%) 0.805
Kanamycin mykrobe 152(669) 98(6975) 77.3% (74.0-80.3%) 98.6% (98.3-98.8%) 0.788
Kanamycin tbprofiler 122(669) 107(6975) 81.8% (78.7-84.5%) 98.5% (98.1-98.7%) 0.811
Levofloxacin drprg 79(1040) 104(5454) 92.4% (90.6-93.9%) 98.1% (97.7-98.4%) 0.896
Levofloxacin mykrobe 88(1040) 102(5454) 91.5% (89.7-93.1%) 98.1% (97.7-98.5%) 0.892
Levofloxacin tbprofiler 85(1040) 109(5454) 91.8% (90.0-93.3%) 98.0% (97.6-98.3%) 0.89
Linezolid drprg 48(65) 4(6109) 26.2% (17.0-38.0%) 99.9% (99.8-100.0%) 0.457
Linezolid mykrobe 48(65) 4(6109) 26.2% (17.0-38.0%) 99.9% (99.8-100.0%) 0.457
Linezolid tbprofiler 48(65) 5(6109) 26.2% (17.0-38.0%) 99.9% (99.8-100.0%) 0.447
Moxifloxacin drprg 40(603) 478(5430) 93.4% (91.1-95.1%) 91.2% (90.4-91.9%) 0.671
Moxifloxacin mykrobe 44(603) 472(5430) 92.7% (90.3-94.5%) 91.3% (90.5-92.0%) 0.669
Moxifloxacin tbprofiler 42(603) 481(5430) 93.0% (90.7-94.8%) 91.1% (90.4-91.9%) 0.668
Ofloxacin drprg 24(104) 5(424) 76.9% (68.0-84.0%) 98.8% (97.3-99.5%) 0.82
Ofloxacin mykrobe 26(104) 5(424) 75.0% (65.9-82.3%) 98.8% (97.3-99.5%) 0.807
Ofloxacin tbprofiler 26(104) 6(424) 75.0% (65.9-82.3%) 98.6% (96.9-99.3%) 0.8
Pyrazinamide drprg 67(341) 54(820) 80.4% (75.8-84.2%) 93.4% (91.5-94.9%) 0.746
Pyrazinamide mykrobe 55(341) 56(820) 83.9% (79.6-87.4%) 93.2% (91.2-94.7%) 0.77
Pyrazinamide tbprofiler 45(341) 62(820) 86.8% (82.8-90.0%) 92.4% (90.4-94.1%) 0.782
Rifampicin drprg 114(3221) 168(4585) 96.5% (95.8-97.0%) 96.3% (95.8-96.8%) 0.926
Rifampicin mykrobe 164(3221) 169(4585) 94.9% (94.1-95.6%) 96.3% (95.7-96.8%) 0.912
Rifampicin tbprofiler 102(3221) 177(4585) 96.8% (96.2-97.4%) 96.1% (95.5-96.7%) 0.927
Streptomycin drprg 267(1041) 134(1205) 74.4% (71.6-76.9%) 88.9% (87.0-90.5%) 0.643
Streptomycin mykrobe 282(1041) 135(1205) 72.9% (70.1-75.5%) 88.8% (86.9-90.5%) 0.629
Streptomycin tbprofiler 257(1041) 136(1205) 75.3% (72.6-77.8%) 88.7% (86.8-90.4%) 0.649

illumina

@iqbal-lab
Collaborator

OK, so looking at those results now, we can definitely see a sensitivity improvement over Mykrobe with no precision loss. Compared with tbprofiler we are broadly the same: tbprofiler mostly has slightly better recall and slightly worse precision (except for fluoroquinolones). The biggest difference is 7% higher recall for tbprofiler for pyrazinamide. Fair summary?

@mbhall88
Owner Author

mbhall88 commented Jan 8, 2023

Yep, fair summary. The work in mbhall88/drprg#24 should improve the PZA recall slightly too.

@mbhall88
Owner Author

mbhall88 commented Jan 12, 2023

After the work in mbhall88/drprg#26, we get the following Illumina results (nanopore is unchanged). Note: the main changes from the last results are for ETO and PZA; the other drugs change only slightly

Drug Tool FN(R) FP(S) Sensitivity (95% CI) Specificity (95% CI) MCC
Amikacin drprg 66(484) 57(6958) 86.4% (83.0-89.1%) 99.2% (98.9-99.4%) 0.863
Amikacin mykrobe 93(484) 51(6958) 80.8% (77.0-84.0%) 99.3% (99.0-99.4%) 0.835
Amikacin tbprofiler 62(484) 59(6958) 87.2% (83.9-89.9%) 99.2% (98.9-99.3%) 0.866
Capreomycin drprg 56(235) 94(2448) 76.2% (70.3-81.2%) 96.2% (95.3-96.9%) 0.676
Capreomycin mykrobe 72(235) 87(2448) 69.4% (63.2-74.9%) 96.4% (95.6-97.1%) 0.64
Capreomycin tbprofiler 54(235) 95(2448) 77.0% (71.2-81.9%) 96.1% (95.3-96.8%) 0.681
Delamanid drprg 111(116) 4(8151) 4.3% (1.9-9.7%) 100.0% (99.9-100.0%) 0.152
Delamanid mykrobe 111(116) 2(8151) 4.3% (1.9-9.7%) 100.0% (99.9-100.0%) 0.173
Delamanid tbprofiler 111(116) 2(8151) 4.3% (1.9-9.7%) 100.0% (99.9-100.0%) 0.173
Ethambutol drprg 120(1537) 754(4935) 92.2% (90.7-93.4%) 84.7% (83.7-85.7%) 0.693
Ethambutol mykrobe 133(1537) 747(4935) 91.3% (89.8-92.7%) 84.9% (83.8-85.8%) 0.688
Ethambutol tbprofiler 118(1537) 765(4935) 92.3% (90.9-93.6%) 84.5% (83.5-85.5%) 0.691
Ethionamide drprg 245(1103) 418(6105) 77.8% (75.2-80.1%) 93.2% (92.5-93.8%) 0.669
Ethionamide mykrobe 265(1103) 413(6105) 76.0% (73.4-78.4%) 93.2% (92.6-93.8%) 0.658
Ethionamide tbprofiler 272(1103) 414(6105) 75.3% (72.7-77.8%) 93.2% (92.6-93.8%) 0.653
Isoniazid drprg 305(3899) 173(4193) 92.2% (91.3-93.0%) 95.9% (95.2-96.4%) 0.882
Isoniazid mykrobe 333(3899) 170(4193) 91.5% (90.5-92.3%) 95.9% (95.3-96.5%) 0.876
Isoniazid tbprofiler 297(3899) 181(4193) 92.4% (91.5-93.2%) 95.7% (95.0-96.3%) 0.882
Kanamycin drprg 126(669) 107(6975) 81.2% (78.0-83.9%) 98.5% (98.1-98.7%) 0.807
Kanamycin mykrobe 152(669) 98(6975) 77.3% (74.0-80.3%) 98.6% (98.3-98.8%) 0.788
Kanamycin tbprofiler 122(669) 107(6975) 81.8% (78.7-84.5%) 98.5% (98.1-98.7%) 0.811
Levofloxacin drprg 80(1040) 106(5454) 92.3% (90.5-93.8%) 98.1% (97.7-98.4%) 0.895
Levofloxacin mykrobe 88(1040) 102(5454) 91.5% (89.7-93.1%) 98.1% (97.7-98.5%) 0.892
Levofloxacin tbprofiler 85(1040) 109(5454) 91.8% (90.0-93.3%) 98.0% (97.6-98.3%) 0.89
Linezolid drprg 48(65) 4(6109) 26.2% (17.0-38.0%) 99.9% (99.8-100.0%) 0.457
Linezolid mykrobe 48(65) 4(6109) 26.2% (17.0-38.0%) 99.9% (99.8-100.0%) 0.457
Linezolid tbprofiler 48(65) 5(6109) 26.2% (17.0-38.0%) 99.9% (99.8-100.0%) 0.447
Moxifloxacin drprg 39(603) 477(5430) 93.5% (91.3-95.2%) 91.2% (90.4-91.9%) 0.673
Moxifloxacin mykrobe 44(603) 472(5430) 92.7% (90.3-94.5%) 91.3% (90.5-92.0%) 0.669
Moxifloxacin tbprofiler 42(603) 481(5430) 93.0% (90.7-94.8%) 91.1% (90.4-91.9%) 0.668
Ofloxacin drprg 25(104) 5(424) 76.0% (66.9-83.2%) 98.8% (97.3-99.5%) 0.813
Ofloxacin mykrobe 26(104) 5(424) 75.0% (65.9-82.3%) 98.8% (97.3-99.5%) 0.807
Ofloxacin tbprofiler 26(104) 6(424) 75.0% (65.9-82.3%) 98.6% (96.9-99.3%) 0.8
Pyrazinamide drprg 57(341) 55(820) 83.3% (79.0-86.9%) 93.3% (91.4-94.8%) 0.767
Pyrazinamide mykrobe 55(341) 56(820) 83.9% (79.6-87.4%) 93.2% (91.2-94.7%) 0.77
Pyrazinamide tbprofiler 45(341) 62(820) 86.8% (82.8-90.0%) 92.4% (90.4-94.1%) 0.782
Rifampicin drprg 112(3221) 168(4585) 96.5% (95.8-97.1%) 96.3% (95.8-96.8%) 0.926
Rifampicin mykrobe 164(3221) 169(4585) 94.9% (94.1-95.6%) 96.3% (95.7-96.8%) 0.912
Rifampicin tbprofiler 102(3221) 177(4585) 96.8% (96.2-97.4%) 96.1% (95.5-96.7%) 0.927
Streptomycin drprg 259(1041) 135(1205) 75.1% (72.4-77.7%) 88.8% (86.9-90.5%) 0.648
Streptomycin mykrobe 282(1041) 135(1205) 72.9% (70.1-75.5%) 88.8% (86.9-90.5%) 0.629
Streptomycin tbprofiler 257(1041) 136(1205) 75.3% (72.6-77.8%) 88.7% (86.8-90.4%) 0.649

illumina

PZA still isn't great, but there are just so many different mutations with minor alleles that we don't have in the graph and hand-picking them all could lead to a complicated graph. Although I can try adding them if we really want to try boosting PZA sensitivity...
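For anyone wanting to recompute the Sensitivity, Specificity and MCC columns, they follow from the FN(R) and FP(S) columns, reading FN(R) as "FN false negatives out of R resistant samples" and FP(S) likewise for susceptible samples. Below is a small Python sketch; the function name is made up, and the example numbers are the drprg pyrazinamide row from the table above.

```python
from math import sqrt


def metrics(fn, n_resistant, fp, n_susceptible):
    """Return (sensitivity, specificity, MCC) from FN(R) and FP(S) style counts.

    A value is None when it is undefined, e.g. sensitivity with no resistant samples.
    """
    tp = n_resistant - fn
    tn = n_susceptible - fp
    sensitivity = tp / n_resistant if n_resistant else None
    specificity = tn / n_susceptible if n_susceptible else None
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else None
    return sensitivity, specificity, mcc


# Pyrazinamide, drprg: FN(R) = 57(341), FP(S) = 55(820)
sens, spec, mcc = metrics(57, 341, 55, 820)
print(f"{sens:.1%} {spec:.1%} {mcc:.3f}")  # 83.3% 93.3% 0.767
```

The undefined cases correspond to the "-" entries in the nanopore tables, for example kanamycin with 0 resistant samples.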

@iqbal-lab
Collaborator

I think those results are much improved, but I am wondering what the pitch is for drprg though. Illumina is better than Mykrobe and ~same as tbprofiler. Are the nanopore results really unchanged from before? Leandro's mapping fixes will help too

@mbhall88
Owner Author

mbhall88 commented Jan 12, 2023

am wondering what the pitch is for drprg though

Yeah, this has been troubling me too...I mean we can notice gene deletions...We use a lot less resources....

Are the nanopore results really unchanged from before ?

Here are the current nanopore results

Drug Tool FN(R) FP(S) Sensitivity (95% CI) Specificity (95% CI) MCC
Amikacin drprg 0(11) 3(78) 100.0% (74.1-100.0%) 96.2% (89.3-98.7%) 0.869
Amikacin mykrobe 0(11) 3(78) 100.0% (74.1-100.0%) 96.2% (89.3-98.7%) 0.869
Amikacin tbprofiler 0(11) 3(78) 100.0% (74.1-100.0%) 96.2% (89.3-98.7%) 0.869
Capreomycin drprg 1(1) 1(51) 0.0% (0.0-79.3%) 98.0% (89.7-99.7%) -0.02
Capreomycin mykrobe 1(1) 1(51) 0.0% (0.0-79.3%) 98.0% (89.7-99.7%) -0.02
Capreomycin tbprofiler 1(1) 1(51) 0.0% (0.0-79.3%) 98.0% (89.7-99.7%) -0.02
Ethambutol drprg 4(14) 15(77) 71.4% (45.4-88.3%) 80.5% (70.3-87.8%) 0.42
Ethambutol mykrobe 4(14) 15(77) 71.4% (45.4-88.3%) 80.5% (70.3-87.8%) 0.42
Ethambutol tbprofiler 5(14) 15(77) 64.3% (38.8-83.7%) 80.5% (70.3-87.8%) 0.367
Ethionamide drprg 0(4) 1(9) 100.0% (51.0-100.0%) 88.9% (56.5-98.0%) 0.843
Ethionamide mykrobe 0(4) 1(9) 100.0% (51.0-100.0%) 88.9% (56.5-98.0%) 0.843
Ethionamide tbprofiler 0(4) 1(9) 100.0% (51.0-100.0%) 88.9% (56.5-98.0%) 0.843
Isoniazid drprg 9(51) 5(48) 82.4% (69.7-90.4%) 89.6% (77.8-95.5%) 0.72
Isoniazid mykrobe 9(51) 4(48) 82.4% (69.7-90.4%) 91.7% (80.4-96.7%) 0.742
Isoniazid tbprofiler 9(51) 3(48) 82.4% (69.7-90.4%) 93.8% (83.2-97.9%) 0.764
Kanamycin drprg 0(0) 1(52) - 98.1% (89.9-99.7%) -
Kanamycin mykrobe 0(0) 1(52) - 98.1% (89.9-99.7%) -
Kanamycin tbprofiler 0(0) 1(52) - 98.1% (89.9-99.7%) -
Moxifloxacin drprg 0(0) 1(1) - 0.0% (0.0-79.3%) -
Moxifloxacin mykrobe 0(0) 1(1) - 0.0% (0.0-79.3%) -
Moxifloxacin tbprofiler 0(0) 1(1) - 0.0% (0.0-79.3%) -
Ofloxacin drprg 0(10) 4(77) 100.0% (72.2-100.0%) 94.8% (87.4-98.0%) 0.823
Ofloxacin mykrobe 0(10) 4(77) 100.0% (72.2-100.0%) 94.8% (87.4-98.0%) 0.823
Ofloxacin tbprofiler 0(10) 3(77) 100.0% (72.2-100.0%) 96.1% (89.2-98.7%) 0.86
Pyrazinamide drprg 0(0) 0(1) - 100.0% (20.7-100.0%) -
Pyrazinamide mykrobe 0(0) 0(1) - 100.0% (20.7-100.0%) -
Pyrazinamide tbprofiler 0(0) 0(1) - 100.0% (20.7-100.0%) -
Rifampicin drprg 5(48) 1(44) 89.6% (77.8-95.5%) 97.7% (88.2-99.6%) 0.873
Rifampicin mykrobe 5(48) 1(44) 89.6% (77.8-95.5%) 97.7% (88.2-99.6%) 0.873
Rifampicin tbprofiler 5(48) 1(44) 89.6% (77.8-95.5%) 97.7% (88.2-99.6%) 0.873
Streptomycin drprg 2(8) 14(83) 75.0% (40.9-92.9%) 83.1% (73.7-89.7%) 0.398
Streptomycin mykrobe 2(8) 27(83) 75.0% (40.9-92.9%) 67.5% (56.8-76.6%) 0.25
Streptomycin tbprofiler 2(8) 12(83) 75.0% (40.9-92.9%) 85.5% (76.4-91.5%) 0.43

nanopore

Sample sizes are so small it makes it hard to get a clear picture for a lot of drugs.
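To put a number on how little a handful of samples can say: the 95% CIs in these tables look consistent with a Wilson score interval. That is an inference from the numbers, not something stated in this thread. A minimal Python sketch under that assumption:

```python
from math import sqrt


def wilson_interval(successes, n, z=1.959964):
    """Wilson score interval for a binomial proportion (z defaults to a 95% CI)."""
    if n == 0:
        return None
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


# Capreomycin sensitivity on nanopore: 0 of 1 resistant samples detected.
low, high = wilson_interval(0, 1)
print(f"{low:.1%}-{high:.1%}")  # 0.0%-79.3%

# Rifampicin sensitivity on nanopore: 43 of 48 resistant samples detected.
low, high = wilson_interval(43, 48)
print(f"{low:.1%}-{high:.1%}")  # 77.8%-95.5%
```

With a single resistant sample the interval spans almost 80 percentage points, so those rows cannot separate the tools.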

@mbhall88
Owner Author

Here are the Illumina results on the full dataset (45,193 samples)

illumina

Drug Tool FN(R) FP(S) Sensitivity (95% CI) Specificity (95% CI) MCC
Amikacin drprg 270(1864) 225(18732) 85.5% (83.8-87.0%) 98.8% (98.6-98.9%) 0.852
Amikacin mykrobe 358(1864) 195(18732) 80.8% (78.9-82.5%) 99.0% (98.8-99.1%) 0.831
Amikacin tbprofiler 269(1864) 227(18732) 85.6% (83.9-87.1%) 98.8% (98.6-98.9%) 0.852
Capreomycin drprg 293(1298) 300(13034) 77.4% (75.1-79.6%) 97.7% (97.4-97.9%) 0.749
Capreomycin mykrobe 367(1298) 265(13034) 71.7% (69.2-74.1%) 98.0% (97.7-98.2%) 0.723
Capreomycin tbprofiler 292(1298) 305(13034) 77.5% (75.2-79.7%) 97.7% (97.4-97.9%) 0.748
Delamanid drprg 111(116) 4(8151) 4.3% (1.9-9.7%) 100.0% (99.9-100.0%) 0.152
Delamanid mykrobe 111(116) 2(8151) 4.3% (1.9-9.7%) 100.0% (99.9-100.0%) 0.173
Delamanid tbprofiler 111(116) 2(8151) 4.3% (1.9-9.7%) 100.0% (99.9-100.0%) 0.173
Ethambutol drprg 484(5706) 2287(26863) 91.5% (90.8-92.2%) 91.5% (91.1-91.8%) 0.749
Ethambutol mykrobe 499(5706) 2265(26863) 91.3% (90.5-92.0%) 91.6% (91.2-91.9%) 0.749
Ethambutol tbprofiler 471(5706) 2290(26863) 91.7% (91.0-92.4%) 91.5% (91.1-91.8%) 0.751
Ethionamide drprg 672(2853) 992(11016) 76.4% (74.9-78.0%) 91.0% (90.4-91.5%) 0.649
Ethionamide mykrobe 772(2853) 960(11016) 72.9% (71.3-74.5%) 91.3% (90.7-91.8%) 0.627
Ethionamide tbprofiler 787(2853) 964(11016) 72.4% (70.7-74.0%) 91.2% (90.7-91.8%) 0.623
Isoniazid drprg 1016(14531) 593(25764) 93.0% (92.6-93.4%) 97.7% (97.5-97.9%) 0.913
Isoniazid mykrobe 1054(14531) 560(25764) 92.7% (92.3-93.2%) 97.8% (97.6-98.0%) 0.913
Isoniazid tbprofiler 987(14531) 648(25764) 93.2% (92.8-93.6%) 97.5% (97.3-97.7%) 0.912
Kanamycin drprg 359(2205) 316(17934) 83.7% (82.1-85.2%) 98.2% (98.0-98.4%) 0.827
Kanamycin mykrobe 437(2205) 300(17934) 80.2% (78.5-81.8%) 98.3% (98.1-98.5%) 0.808
Kanamycin tbprofiler 349(2205) 322(17934) 84.2% (82.6-85.6%) 98.2% (98.0-98.4%) 0.828
Levofloxacin drprg 272(3102) 355(14867) 91.2% (90.2-92.2%) 97.6% (97.4-97.8%) 0.879
Levofloxacin mykrobe 299(3102) 330(14867) 90.4% (89.3-91.4%) 97.8% (97.5-98.0%) 0.878
Levofloxacin tbprofiler 276(3102) 356(14867) 91.1% (90.0-92.1%) 97.6% (97.3-97.8%) 0.878
Linezolid drprg 104(152) 30(10911) 31.6% (24.7-39.3%) 99.7% (99.6-99.8%) 0.436
Linezolid mykrobe 105(152) 29(10911) 30.9% (24.1-38.7%) 99.7% (99.6-99.8%) 0.432
Linezolid tbprofiler 104(152) 31(10911) 31.6% (24.7-39.3%) 99.7% (99.6-99.8%) 0.433
Moxifloxacin drprg 178(2255) 1133(14696) 92.1% (90.9-93.1%) 92.3% (91.8-92.7%) 0.732
Moxifloxacin mykrobe 207(2255) 1113(14696) 90.8% (89.6-91.9%) 92.4% (92.0-92.8%) 0.726
Moxifloxacin tbprofiler 182(2255) 1141(14696) 91.9% (90.7-93.0%) 92.2% (91.8-92.7%) 0.729
Ofloxacin drprg 166(778) 68(6007) 78.7% (75.6-81.4%) 98.9% (98.6-99.1%) 0.823
Ofloxacin mykrobe 147(778) 62(6007) 81.1% (78.2-83.7%) 99.0% (98.7-99.2%) 0.842
Ofloxacin tbprofiler 138(778) 65(6007) 82.3% (79.4-84.8%) 98.9% (98.6-99.2%) 0.848
Pyrazinamide drprg 786(3682) 500(17748) 78.7% (77.3-79.9%) 97.2% (96.9-97.4%) 0.783
Pyrazinamide mykrobe 776(3682) 444(17748) 78.9% (77.6-80.2%) 97.5% (97.3-97.7%) 0.794
Pyrazinamide tbprofiler 715(3682) 502(17748) 80.6% (79.3-81.8%) 97.2% (96.9-97.4%) 0.796
Rifampicin drprg 576(11766) 593(28292) 95.1% (94.7-95.5%) 97.9% (97.7-98.1%) 0.93
Rifampicin mykrobe 523(11766) 604(28292) 95.6% (95.2-95.9%) 97.9% (97.7-98.0%) 0.932
Rifampicin tbprofiler 370(11766) 788(28292) 96.9% (96.5-97.2%) 97.2% (97.0-97.4%) 0.931
Streptomycin drprg 784(5362) 760(10179) 85.4% (84.4-86.3%) 92.5% (92.0-93.0%) 0.78
Streptomycin mykrobe 903(5362) 677(10179) 83.2% (82.1-84.1%) 93.3% (92.8-93.8%) 0.773
Streptomycin tbprofiler 778(5362) 662(10179) 85.5% (84.5-86.4%) 93.5% (93.0-94.0%) 0.794

I am currently working through the INH FNs and have learned a lot and fixed some bugs. The most important result to understand here though will be the RIF sensitivity, which is significantly lower than tb-profiler's

@mbhall88
Owner Author

I think I might have gotten to the bottom of the RIF sensitivity issue (also impacts a decent amount of INH FNs).

tl;dr we need a smaller minimum cluster size for (some) Illumina reads in pandora.

Cluster size dictates whether we recognise a read as "hitting" a locus. The default is 10. But I was finding a lot of FNs where we just have these big random stretches of zero depth, generally in and around the RRDR. When I mapped these reads to H37Rv with minimap2, it showed that we should definitely have depth over the RRDR and its surrounding regions. It turns out most of them are unmapped in the pandora SAM file. In the end, most of these reads were getting ~4-6 hits, so they were being marked as unmapped because they're below the default of 10. I have also noticed that a lot of the samples with this issue are Illumina HiSeq 2000 75bp reads. This relates back to mbhall88/drprg#12 (comment).

I've run on a few samples with the minimum cluster size set to 4 and it seems to have resolved the issue for those samples. So I'm going to rerun all samples and reassess the results after that 🤞
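A toy Python model of why the threshold matters (this is not pandora's clustering code). The read names and hit counts are invented, and treating the cut-off as inclusive is an assumption.

```python
def mapped_reads(hits_per_read, min_cluster_size=10):
    """Return the reads with enough hits to a locus to count as mapped.

    hits_per_read: dict of read name -> number of hits in its best cluster
    (a simplified stand-in for what the real mapper tracks).
    """
    return {read for read, hits in hits_per_read.items() if hits >= min_cluster_size}


# Short reads over the region of interest often only reach 4-6 hits.
hits = {"read1": 12, "read2": 5, "read3": 4, "read4": 3}
print(sorted(mapped_reads(hits)))                      # ['read1']
print(sorted(mapped_reads(hits, min_cluster_size=4)))  # ['read1', 'read2', 'read3']
```

With the default of 10, the reads carrying 4-6 hits are dropped, which matches the stretches of zero depth described above.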

@iqbal-lab
Collaborator

Also relates to long reads that overlap a PRG only at the end.

@mbhall88
Owner Author

Changing the minimum cluster size to 4 we get the following diff for Illumina

Tool Drug ΔFN ΔFP
drprg Amikacin 0 0
drprg Capreomycin -1 1
drprg Delamanid 0 -1
drprg Ethambutol -5 -8
drprg Ethionamide -1 6
drprg Isoniazid -31 -6
drprg Kanamycin -5 4
drprg Levofloxacin -4 0
drprg Linezolid 0 0
drprg Moxifloxacin 0 2
drprg Ofloxacin -28 1
drprg Pyrazinamide -9 -29
drprg Rifampicin -137 54
drprg Streptomycin -1 -80

This is great, and the only real concern is the 54 extra RIF FPs. I'll take a look at those and see if I can figure out if they're fixable or not.
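The ΔFN and ΔFP columns are the drprg counts in the new run minus those in the previous full-dataset run. A short Python sketch of that subtraction follows, using the rifampicin and isoniazid counts from the previous table and from the table that follows; the function name and dict layout are illustrative only.

```python
def diff_counts(before, after):
    """Return {drug: (delta_FN, delta_FP)}, computed as after minus before."""
    return {
        drug: (after[drug][0] - before[drug][0], after[drug][1] - before[drug][1])
        for drug in before
    }


# (FN, FP) for drprg with the default minimum cluster size and with it set to 4.
before = {"Rifampicin": (576, 593), "Isoniazid": (1016, 593)}
after = {"Rifampicin": (439, 647), "Isoniazid": (985, 587)}
print(diff_counts(before, after))
# {'Rifampicin': (-137, 54), 'Isoniazid': (-31, -6)}
```

Negative values are improvements in both columns, so rifampicin trades 137 fewer FNs for 54 more FPs.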

The overall results now are

Drug Tool FN(R) FP(S) Sensitivity (95% CI) Specificity (95% CI) MCC
Amikacin drprg 270(1864) 225(18732) 85.5% (83.8-87.0%) 98.8% (98.6-98.9%) 0.852
Amikacin mykrobe 358(1864) 195(18732) 80.8% (78.9-82.5%) 99.0% (98.8-99.1%) 0.831
Amikacin tbprofiler 269(1864) 227(18732) 85.6% (83.9-87.1%) 98.8% (98.6-98.9%) 0.852
Capreomycin drprg 292(1298) 301(13034) 77.5% (75.2-79.7%) 97.7% (97.4-97.9%) 0.75
Capreomycin mykrobe 367(1298) 265(13034) 71.7% (69.2-74.1%) 98.0% (97.7-98.2%) 0.723
Capreomycin tbprofiler 292(1298) 305(13034) 77.5% (75.2-79.7%) 97.7% (97.4-97.9%) 0.748
Delamanid drprg 111(116) 3(8151) 4.3% (1.9-9.7%) 100.0% (99.9-100.0%) 0.162
Delamanid mykrobe 111(116) 2(8151) 4.3% (1.9-9.7%) 100.0% (99.9-100.0%) 0.173
Delamanid tbprofiler 111(116) 2(8151) 4.3% (1.9-9.7%) 100.0% (99.9-100.0%) 0.173
Ethambutol drprg 479(5706) 2279(26863) 91.6% (90.9-92.3%) 91.5% (91.2-91.8%) 0.75
Ethambutol mykrobe 499(5706) 2265(26863) 91.3% (90.5-92.0%) 91.6% (91.2-91.9%) 0.749
Ethambutol tbprofiler 471(5706) 2290(26863) 91.7% (91.0-92.4%) 91.5% (91.1-91.8%) 0.751
Ethionamide drprg 671(2853) 998(11016) 76.5% (74.9-78.0%) 90.9% (90.4-91.5%) 0.648
Ethionamide mykrobe 772(2853) 960(11016) 72.9% (71.3-74.5%) 91.3% (90.7-91.8%) 0.627
Ethionamide tbprofiler 787(2853) 964(11016) 72.4% (70.7-74.0%) 91.2% (90.7-91.8%) 0.623
Isoniazid drprg 985(14531) 587(25764) 93.2% (92.8-93.6%) 97.7% (97.5-97.9%) 0.915
Isoniazid mykrobe 1054(14531) 560(25764) 92.7% (92.3-93.2%) 97.8% (97.6-98.0%) 0.913
Isoniazid tbprofiler 987(14531) 648(25764) 93.2% (92.8-93.6%) 97.5% (97.3-97.7%) 0.912
Kanamycin drprg 354(2205) 320(17934) 83.9% (82.4-85.4%) 98.2% (98.0-98.4%) 0.827
Kanamycin mykrobe 437(2205) 300(17934) 80.2% (78.5-81.8%) 98.3% (98.1-98.5%) 0.808
Kanamycin tbprofiler 349(2205) 322(17934) 84.2% (82.6-85.6%) 98.2% (98.0-98.4%) 0.828
Levofloxacin drprg 268(3102) 355(14867) 91.4% (90.3-92.3%) 97.6% (97.4-97.8%) 0.88
Levofloxacin mykrobe 299(3102) 330(14867) 90.4% (89.3-91.4%) 97.8% (97.5-98.0%) 0.878
Levofloxacin tbprofiler 276(3102) 356(14867) 91.1% (90.0-92.1%) 97.6% (97.3-97.8%) 0.878
Linezolid drprg 104(152) 30(10911) 31.6% (24.7-39.3%) 99.7% (99.6-99.8%) 0.436
Linezolid mykrobe 105(152) 29(10911) 30.9% (24.1-38.7%) 99.7% (99.6-99.8%) 0.432
Linezolid tbprofiler 104(152) 31(10911) 31.6% (24.7-39.3%) 99.7% (99.6-99.8%) 0.433
Moxifloxacin drprg 178(2255) 1135(14696) 92.1% (90.9-93.1%) 92.3% (91.8-92.7%) 0.731
Moxifloxacin mykrobe 207(2255) 1113(14696) 90.8% (89.6-91.9%) 92.4% (92.0-92.8%) 0.726
Moxifloxacin tbprofiler 182(2255) 1141(14696) 91.9% (90.7-93.0%) 92.2% (91.8-92.7%) 0.729
Ofloxacin drprg 138(778) 69(6007) 82.3% (79.4-84.8%) 98.9% (98.5-99.1%) 0.845
Ofloxacin mykrobe 147(778) 62(6007) 81.1% (78.2-83.7%) 99.0% (98.7-99.2%) 0.842
Ofloxacin tbprofiler 138(778) 65(6007) 82.3% (79.4-84.8%) 98.9% (98.6-99.2%) 0.848
Pyrazinamide drprg 777(3682) 471(17748) 78.9% (77.5-80.2%) 97.3% (97.1-97.6%) 0.789
Pyrazinamide mykrobe 776(3682) 444(17748) 78.9% (77.6-80.2%) 97.5% (97.3-97.7%) 0.794
Pyrazinamide tbprofiler 715(3682) 502(17748) 80.6% (79.3-81.8%) 97.2% (96.9-97.4%) 0.796
Rifampicin drprg 439(11766) 647(28292) 96.3% (95.9-96.6%) 97.7% (97.5-97.9%) 0.935
Rifampicin mykrobe 523(11766) 604(28292) 95.6% (95.2-95.9%) 97.9% (97.7-98.0%) 0.932
Rifampicin tbprofiler 370(11766) 788(28292) 96.9% (96.5-97.2%) 97.2% (97.0-97.4%) 0.931
Streptomycin drprg 783(5362) 680(10179) 85.4% (84.4-86.3%) 93.3% (92.8-93.8%) 0.791
Streptomycin mykrobe 903(5362) 677(10179) 83.2% (82.1-84.1%) 93.3% (92.8-93.8%) 0.773
Streptomycin tbprofiler 778(5362) 662(10179) 85.5% (84.5-86.4%) 93.5% (93.0-94.0%) 0.794

illumina

Here is a table of the drug, tool combinations where the CIs don't overlap

Metric Drug Tool1 Tool1 CI Tool2 Tool2 CI
sensitivity Capreomycin drprg (75.2, 79.7) mykrobe (69.2, 74.1)
sensitivity Capreomycin mykrobe (69.2, 74.1) tbprofiler (75.2, 79.7)
sensitivity Amikacin drprg (83.8, 87.0) mykrobe (78.9, 82.5)
sensitivity Amikacin mykrobe (78.9, 82.5) tbprofiler (83.9, 87.1)
sensitivity Ethionamide drprg (74.9, 78.0) mykrobe (71.3, 74.5)
sensitivity Ethionamide drprg (74.9, 78.0) tbprofiler (70.7, 74.0)
sensitivity Streptomycin drprg (84.4, 86.3) mykrobe (82.1, 84.1)
sensitivity Streptomycin mykrobe (82.1, 84.1) tbprofiler (84.5, 86.4)
sensitivity Kanamycin drprg (82.4, 85.4) mykrobe (78.5, 81.8)
sensitivity Kanamycin mykrobe (78.5, 81.8) tbprofiler (82.6, 85.6)
specificity Rifampicin drprg (97.5, 97.9) tbprofiler (97.0, 97.4)
sensitivity Rifampicin mykrobe (95.2, 95.9) tbprofiler (96.5, 97.2)
specificity Rifampicin mykrobe (97.7, 98.0) tbprofiler (97.0, 97.4)

I think we're very close to done. I just want to do a last check of discrepancies and see if I can salvage some more FNs and FPs.
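The non-overlap table above comes down to a simple interval comparison. A minimal Python sketch, with the function name and tuple layout as assumptions:

```python
def intervals_overlap(a, b):
    """Return True if two closed intervals (low, high) share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Rifampicin specificity, drprg vs tbprofiler: listed in the table above.
print(intervals_overlap((97.5, 97.9), (97.0, 97.4)))  # False

# Rifampicin sensitivity, drprg vs tbprofiler: not listed, because the CIs overlap.
print(intervals_overlap((95.9, 96.6), (96.5, 97.2)))  # True
```

Note that non-overlapping CIs are a conservative signal: intervals that overlap slightly can still correspond to a real difference, so pairs missing from the table are not necessarily equivalent.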

@iqbal-lab
Collaborator

Looking good! What does that change of cluster size do to nanopore though?

@mbhall88
Owner Author

mbhall88 commented Feb 23, 2023

What does that change of cluster size do to nanopore, though?

No change in results for nanopore

@iqbal-lab
Collaborator

Woah

@mbhall88
Owner Author

mbhall88 commented Feb 28, 2023

I reran the pipeline after fixing a couple of bugs and adding some mutations to, and removing others from, the graph.

Here is the diff between this latest run and the run above

Tool Drug ΔFN ΔFP
drprg Amikacin 0 0
drprg Capreomycin 0 0
drprg Delamanid 0 0
drprg Ethambutol -2 6
drprg Ethionamide -1 0
drprg Isoniazid 0 -1
drprg Kanamycin 0 0
drprg Levofloxacin 0 0
drprg Linezolid 0 0
drprg Moxifloxacin 0 0
drprg Ofloxacin 0 0
drprg Pyrazinamide -16 13
drprg Rifampicin -13 -27
drprg Streptomycin 0 0
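
For anyone wanting to reproduce a diff like this, here is a minimal sketch. It assumes the FN and FP counts for each run are held in plain dictionaries; the counts are copied from the two drprg result tables in this thread, and `run_diff` is a hypothetical helper name, not part of the pipeline.

```python
# FN and FP counts for drprg, copied from the previous and latest result tables.
previous_run = {
    "Pyrazinamide": (777, 471),
    "Rifampicin": (439, 647),
    "Streptomycin": (783, 680),
}
latest_run = {
    "Pyrazinamide": (761, 484),
    "Rifampicin": (426, 620),
    "Streptomycin": (783, 680),
}


def run_diff(before, after):
    """Return {drug: (delta_FN, delta_FP)}; negative values mean fewer errors."""
    return {
        drug: (after[drug][0] - before[drug][0], after[drug][1] - before[drug][1])
        for drug in before
    }


deltas = run_diff(previous_run, latest_run)
for drug, (delta_fn, delta_fp) in deltas.items():
    print(f"drprg {drug} {delta_fn} {delta_fp}")
```

A negative ΔFN means resistant samples were recovered, while a positive ΔFP means extra susceptible samples were called resistant.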

And the overall results

Drug Tool FN(R) FP(S) Sensitivity (95% CI) Specificity (95% CI) MCC
Amikacin drprg 270(1864) 225(18732) 85.5% (83.8-87.0%) 98.8% (98.6-98.9%) 0.852
Amikacin mykrobe 358(1864) 195(18732) 80.8% (78.9-82.5%) 99.0% (98.8-99.1%) 0.831
Amikacin tbprofiler 269(1864) 227(18732) 85.6% (83.9-87.1%) 98.8% (98.6-98.9%) 0.852
Capreomycin drprg 292(1298) 301(13034) 77.5% (75.2-79.7%) 97.7% (97.4-97.9%) 0.750
Capreomycin mykrobe 367(1298) 265(13034) 71.7% (69.2-74.1%) 98.0% (97.7-98.2%) 0.723
Capreomycin tbprofiler 292(1298) 305(13034) 77.5% (75.2-79.7%) 97.7% (97.4-97.9%) 0.748
Delamanid drprg 111(116) 3(8151) 4.3% (1.9-9.7%) 100.0% (99.9-100.0%) 0.162
Delamanid mykrobe 111(116) 2(8151) 4.3% (1.9-9.7%) 100.0% (99.9-100.0%) 0.173
Delamanid tbprofiler 111(116) 2(8151) 4.3% (1.9-9.7%) 100.0% (99.9-100.0%) 0.173
Ethambutol drprg 477(5706) 2285(26863) 91.6% (90.9-92.3%) 91.5% (91.2-91.8%) 0.750
Ethambutol mykrobe 499(5706) 2265(26863) 91.3% (90.5-92.0%) 91.6% (91.2-91.9%) 0.749
Ethambutol tbprofiler 471(5706) 2290(26863) 91.7% (91.0-92.4%) 91.5% (91.1-91.8%) 0.751
Ethionamide drprg 670(2853) 998(11016) 76.5% (74.9-78.0%) 90.9% (90.4-91.5%) 0.649
Ethionamide mykrobe 772(2853) 960(11016) 72.9% (71.3-74.5%) 91.3% (90.7-91.8%) 0.627
Ethionamide tbprofiler 787(2853) 964(11016) 72.4% (70.7-74.0%) 91.2% (90.7-91.8%) 0.623
Isoniazid drprg 985(14531) 586(25764) 93.2% (92.8-93.6%) 97.7% (97.5-97.9%) 0.915
Isoniazid mykrobe 1054(14531) 560(25764) 92.7% (92.3-93.2%) 97.8% (97.6-98.0%) 0.913
Isoniazid tbprofiler 987(14531) 648(25764) 93.2% (92.8-93.6%) 97.5% (97.3-97.7%) 0.912
Kanamycin drprg 354(2205) 320(17934) 83.9% (82.4-85.4%) 98.2% (98.0-98.4%) 0.827
Kanamycin mykrobe 437(2205) 300(17934) 80.2% (78.5-81.8%) 98.3% (98.1-98.5%) 0.808
Kanamycin tbprofiler 349(2205) 322(17934) 84.2% (82.6-85.6%) 98.2% (98.0-98.4%) 0.828
Levofloxacin drprg 268(3102) 355(14867) 91.4% (90.3-92.3%) 97.6% (97.4-97.8%) 0.880
Levofloxacin mykrobe 299(3102) 330(14867) 90.4% (89.3-91.4%) 97.8% (97.5-98.0%) 0.878
Levofloxacin tbprofiler 276(3102) 356(14867) 91.1% (90.0-92.1%) 97.6% (97.3-97.8%) 0.878
Linezolid drprg 104(152) 30(10911) 31.6% (24.7-39.3%) 99.7% (99.6-99.8%) 0.436
Linezolid mykrobe 105(152) 29(10911) 30.9% (24.1-38.7%) 99.7% (99.6-99.8%) 0.432
Linezolid tbprofiler 104(152) 31(10911) 31.6% (24.7-39.3%) 99.7% (99.6-99.8%) 0.433
Moxifloxacin drprg 178(2255) 1135(14696) 92.1% (90.9-93.1%) 92.3% (91.8-92.7%) 0.731
Moxifloxacin mykrobe 207(2255) 1113(14696) 90.8% (89.6-91.9%) 92.4% (92.0-92.8%) 0.726
Moxifloxacin tbprofiler 182(2255) 1141(14696) 91.9% (90.7-93.0%) 92.2% (91.8-92.7%) 0.729
Ofloxacin drprg 138(778) 69(6007) 82.3% (79.4-84.8%) 98.9% (98.5-99.1%) 0.845
Ofloxacin mykrobe 147(778) 62(6007) 81.1% (78.2-83.7%) 99.0% (98.7-99.2%) 0.842
Ofloxacin tbprofiler 138(778) 65(6007) 82.3% (79.4-84.8%) 98.9% (98.6-99.2%) 0.848
Pyrazinamide drprg 761(3682) 484(17748) 79.3% (78.0-80.6%) 97.3% (97.0-97.5%) 0.790
Pyrazinamide mykrobe 776(3682) 444(17748) 78.9% (77.6-80.2%) 97.5% (97.3-97.7%) 0.794
Pyrazinamide tbprofiler 715(3682) 502(17748) 80.6% (79.3-81.8%) 97.2% (96.9-97.4%) 0.796
Rifampicin drprg 426(11766) 620(28292) 96.4% (96.0-96.7%) 97.8% (97.6-98.0%) 0.937
Rifampicin mykrobe 523(11766) 604(28292) 95.6% (95.2-95.9%) 97.9% (97.7-98.0%) 0.932
Rifampicin tbprofiler 370(11766) 788(28292) 96.9% (96.5-97.2%) 97.2% (97.0-97.4%) 0.931
Streptomycin drprg 783(5362) 680(10179) 85.4% (84.4-86.3%) 93.3% (92.8-93.8%) 0.791
Streptomycin mykrobe 903(5362) 677(10179) 83.2% (82.1-84.1%) 93.3% (92.8-93.8%) 0.773
Streptomycin tbprofiler 778(5362) 662(10179) 85.5% (84.5-86.4%) 93.5% (93.0-94.0%) 0.794
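
As a reading aid for the columns above, here is a sketch of how one row can be recomputed from its counts. It assumes FN(R) means "false negatives out of R resistant samples" and FP(S) means "false positives out of S susceptible samples". The thread does not say which CI method was used; the Wilson score interval is an assumption that appears consistent with the reported numbers. The function names are illustrative only.

```python
from math import sqrt


def wilson_interval(successes, total, z=1.959964):
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half_width = z * sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half_width, centre + half_width


def summarise(fn, n_resistant, fp, n_susceptible):
    """Sensitivity, specificity, their CIs and MCC from FN(R) and FP(S)."""
    tp = n_resistant - fn
    tn = n_susceptible - fp
    mcc = (tp * tn - fp * fn) / sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "sensitivity": tp / n_resistant,
        "sensitivity_ci": wilson_interval(tp, n_resistant),
        "specificity": tn / n_susceptible,
        "specificity_ci": wilson_interval(tn, n_susceptible),
        "mcc": mcc,
    }


# Amikacin drprg row from the table above: FN(R) = 270(1864), FP(S) = 225(18732).
row = summarise(fn=270, n_resistant=1864, fp=225, n_susceptible=18732)
for name, value in row.items():
    print(name, value)
```

Under these assumptions the output should land close to the Amikacin drprg row (sensitivity about 85.5%, specificity about 98.8%, MCC about 0.852).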

And the table of the drug, tool combinations where the CIs don't overlap

Metric Drug Tool1 Tool1 CI Tool2 Tool2 CI
sensitivity Amikacin tbprofiler (83.9, 87.1) mykrobe (78.9, 82.5)
sensitivity Amikacin mykrobe (78.9, 82.5) drprg (83.8, 87.0)
sensitivity Streptomycin tbprofiler (84.5, 86.4) mykrobe (82.1, 84.1)
sensitivity Streptomycin mykrobe (82.1, 84.1) drprg (84.4, 86.3)
sensitivity Kanamycin tbprofiler (82.6, 85.6) mykrobe (78.5, 81.8)
sensitivity Kanamycin mykrobe (78.5, 81.8) drprg (82.4, 85.4)
sensitivity Ethionamide tbprofiler (70.7, 74.0) drprg (74.9, 78.0)
sensitivity Ethionamide mykrobe (71.3, 74.5) drprg (74.9, 78.0)
sensitivity Capreomycin tbprofiler (75.2, 79.7) mykrobe (69.2, 74.1)
sensitivity Capreomycin mykrobe (69.2, 74.1) drprg (75.2, 79.7)
sensitivity Rifampicin tbprofiler (96.5, 97.2) mykrobe (95.2, 95.9)
specificity Rifampicin tbprofiler (97.0, 97.4) mykrobe (97.7, 98.0)
specificity Rifampicin tbprofiler (97.0, 97.4) drprg (97.6, 98.0)
sensitivity Rifampicin mykrobe (95.2, 95.9) drprg (96.0, 96.7)
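
The non-overlap check behind a table like this can be sketched as follows. This is an illustration rather than the pipeline's actual code: the helper names are hypothetical, and the intervals are the Amikacin sensitivity CIs copied from the table above.

```python
from itertools import combinations


def intervals_overlap(a, b):
    """True if closed intervals a=(lo, hi) and b=(lo, hi) share any point."""
    return a[0] <= b[1] and b[0] <= a[1]


def non_overlapping_pairs(cis):
    """Return every pair of tools whose confidence intervals do not overlap."""
    return [
        (tool1, tool2)
        for tool1, tool2 in combinations(sorted(cis), 2)
        if not intervals_overlap(cis[tool1], cis[tool2])
    ]


# Amikacin sensitivity CIs (percent), copied from the table above.
amikacin_sensitivity = {
    "drprg": (83.8, 87.0),
    "mykrobe": (78.9, 82.5),
    "tbprofiler": (83.9, 87.1),
}

pairs = non_overlapping_pairs(amikacin_sensitivity)
print(pairs)
```

Non-overlapping 95% CIs are a conservative signal of a difference: two tools can differ significantly even when their intervals overlap slightly.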

I've been through and made a couple more changes to the pipeline and the mutations added, and have rerun the pipeline. Fingers crossed this might be the last run. There is also iqbal-lab-org/make_prg#55, though, which would improve results further if we can get a fix in place.

@mbhall88
Owner Author

mbhall88 commented Mar 9, 2023

I've been reworking the sensitivity/specificity plots a little as I don't love them in their current form. Now that the CIs are so narrow, some of them are hard to see. This is mainly because sensitivity (Sn) and specificity (Sp) share one plot and their scales probably aren't well matched.

As such, I have split them into separate plots and used a white background so the colours are easier to tell apart. I've also added a red dashed line marking the WHO target product profile for both sensitivity and specificity (see here).

Sensitivity

image

Specificity

image

Feedback is very welcome

@iqbal-lab
Collaborator

These are great 👍

@iqbal-lab
Collaborator

That said, the y axis on the sensitivity plot is weird, right?

@iqbal-lab
Collaborator

So, mykrobe is ~always the most specific, but at a loss in sensitivity compared to the other two tools. Drprg retains almost as high specificity, but improves recall to the extent that it has the best recall of all tools (except for Rif).

@mbhall88
Owner Author

That said, the y axis on the sensitivity plot is weird, right?

How so? It's a logit scale (as used in logistic regression). This scale is similar to a log scale close to zero and to one, and almost linear around 0.5. It seemed the best fit given we have values near 100% that we want to zoom in on, but also values well below that. Without it, the CIs are basically invisible for a lot of the drugs.
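
To make the scale concrete, here is a small sketch of the logit transform itself, using only the standard library. It is an illustration of the idea, not the plotting code used for the figures. Matplotlib exposes the same transform as an axis scale via `set_yscale("logit")`.

```python
from math import log


def logit(p):
    """Map a proportion in (0, 1) to the real line: log(p / (1 - p))."""
    if not 0 < p < 1:
        raise ValueError("logit is only defined for 0 < p < 1")
    return log(p / (1 - p))


# Equal-looking steps near 1 are stretched far more than steps around 0.5.
for p in (0.04, 0.5, 0.6, 0.9, 0.99, 0.999):
    print(f"{p:.3f} -> {logit(p):+.2f}")

near_one_gap = logit(0.999) - logit(0.99)
mid_gap = logit(0.6) - logit(0.5)
print(near_one_gap > mid_gap)
```

One consequence of this choice is that a value of exactly 0% or 100% has no finite position on the axis, so such points need to be clipped or handled separately.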

So, mykrobe is ~always the most specific, but at a loss in sensitivity compared to the other two tools. Drprg retains almost as high specificity, but improves recall to the extent that it has the best recall of all tools (except for Rif).

Fair summary for the most part, although mykrobe's specificity is never significantly better than that of drprg or tbprofiler.

@mbhall88 mbhall88 transferred this issue from mbhall88/drprg Mar 13, 2023