#Sensitivity and specificity

Sensitivity and specificity are statistical measures of the performance of a binary classification test that are widely used in medicine:

Sensitivity measures the proportion of positives that are correctly identified (e.g., the percentage of sick people who are correctly identified as having some illness).

Specificity measures the proportion of negatives that are correctly identified (e.g., the percentage of healthy people who are correctly identified as not having some illness).

The terms "positive" and "negative" do not refer to benefit, but to the presence or absence of a condition; for example if the condition is a disease, "positive" means "diseased" and "negative" means "healthy".
https://en.wikipedia.org/wiki/Sensitivity_and_specificity
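
The two definitions above reduce to simple ratios over the four cells of a confusion matrix. The sketch below is illustrative only: the function names and the counts are hypothetical and do not come from any study cited in this notebook.

```python
def sensitivity(tp: int, fn: int) -> float:
    """True positive rate: share of people WITH the condition who test positive."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """True negative rate: share of people WITHOUT the condition who test negative."""
    return tn / (tn + fp)


# Hypothetical counts: 100 sick people (90 detected), 200 healthy people (180 cleared)
tp, fn, tn, fp = 90, 10, 180, 20
print(f"sensitivity = {sensitivity(tp, fn):.1%}")
print(f"specificity = {specificity(tn, fp):.1%}")
```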

![](https://static.healthcare.siemens.com/siemens_hwem-hwem_ssxa_websites-context-root/wcm/idc/groups/public/@global/@lab/documents/image/mda5/ody3/~edisp/highly_accurate_tests_covid-19_siemens_healthineers_16x9-07370965/~renditions/highly_accurate_tests_covid-19_siemens_healthineers_16x9-07370965~8.jpg)Siemens Healthineers

In [None]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

# Input data files are available in the read-only "../input/" directory
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# You can write up to 20GB to the current directory (/kaggle/working/) that gets preserved as output when you create a version using "Save & Run All" 
# You can also write temporary files to /kaggle/temp/, but they won't be saved outside of the current session

#Sensitivity, specificity and predictive values of molecular and serological tests for COVID-19. A longitudinal study in emergency room.

Authors: Zeno Bisoffi, Elena Pomari, Michela Deiana, Chiara Piubelli, Niccolo Ronzoni, Anna Beltrame, Giulia Bertoli, Niccolo Riccardi, Francesca Perandin, Fabio Formenti, Federico Gobbi, Dora Buonfrate, Ronaldo Silva
doi: https://doi.org/10.1101/2020.08.09.20171355
Now published in Diagnostics doi: 10.3390/diagnostics10090669

Accuracy of diagnostic tests is essential for suspected cases of Coronavirus Disease 2019 (COVID-19). This study aimed to assess the sensitivity, specificity and positive and negative predictive value (PPV and NPV) of molecular and serological tests for the diagnosis of SARS-CoV-2 infection.

The authors evaluated three RT-PCR methods including six different gene targets; five serologic rapid diagnostic tests (RDT); one ELISA test. The final classification of infected/not infected patients was performed using Latent Class Analysis in combination with clinical re-assessment of incongruous cases and was the basis for the main analysis of accuracy. Of 346 patients consecutively enrolled, 85 (24.6%) were classified as infected.

The molecular test with the highest sensitivity, specificity, PPV and NPV was RQ-SARS-nCoV-2 with 91.8% (C.I. 83.8-96.6), 100% (C.I. 98.6-100.0), 100.0% (C.I. 95.4-100.0) and 97.4% (C.I. 94.7-98.9) respectively, followed by CDC 2019-nCoV with 76.2% (C.I. 65.7-84.8), 99.6% (C.I. 97.9-100.0), 98.5% (C.I. 91.7-100.0) and 92.9% (C.I. 89.2-95.6) and by in-house test targeting E-RdRp with 61.2% (C.I. 50.0-71.6), 99.6% (C.I. 97.9-100.0), 98.1% (C.I. 89.9-100.0) and 88.7% (C.I. 84.6-92.1).
 
The analyses on single gene targets found the highest sensitivity for S and RdRp of the RQ-SARS-nCoV-2 (both with sensitivity 94.1%, C.I. 86.8-98.1). The in-house RdRp had the lowest sensitivity (62.4%, C.I. 51.2-72.6). The specificity ranged from 99.2% (C.I. 97.3-99.9) for in-house RdRp and N2 to 95.0% (C.I. 91.6-97.3) for E. The PPV ranged from 97.1% (C.I. 89.8-99.6) of N2 to 85.4% (C.I. 76.3-92.0) of E, and the NPV from 98.1% (C.I. 95.5-99.4) of gene S to 89.0% (C.I. 84.8-92.4) of in-house RdRp.

All serological tests had <50% sensitivity and low PPV and NPV. One RDT (VivaDiag IgM) had high specificity (98.5%, with PPV 84.0%), but poor sensitivity (24.7%). Molecular tests for SARS-CoV-2 infection showed excellent specificity, but significant differences in sensitivity. As expected, serological tests have limited utility in a clinical context.
https://www.medrxiv.org/content/10.1101/2020.08.09.20171355v1
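
As a rough check on how the four reported measures relate, the sketch below rebuilds a confusion matrix for the RQ-SARS-nCoV-2 assay. The cell counts are not given in the text above. They are inferred here from the 85 infected and 261 non-infected patients and the reported percentages, under the assumption that all 346 patients were tested with this assay. The class and attribute names are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    tp: int  # infected, test positive
    fn: int  # infected, test negative
    tn: int  # not infected, test negative
    fp: int  # not infected, test positive

    @property
    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    @property
    def ppv(self) -> float:
        # Positive predictive value: share of positive results that are correct
        return self.tp / (self.tp + self.fp)

    @property
    def npv(self) -> float:
        # Negative predictive value: share of negative results that are correct
        return self.tn / (self.tn + self.fn)


# ASSUMED counts, inferred from 85 infected / 261 not infected and the reported rates
rq = ConfusionCounts(tp=78, fn=7, tn=261, fp=0)
for name in ("sensitivity", "specificity", "ppv", "npv"):
    print(f"{name:>11}: {getattr(rq, name):.1%}")
```

With these inferred counts the four values round to the 91.8%, 100%, 100% and 97.4% quoted above, which suggests the reconstruction is at least consistent with the abstract.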

In [None]:
from fastai.vision.all import *
from fastai.imports import *
from fastai.vision.data import *
from fastai import *
import numpy as np
import fastai
import matplotlib.pyplot as plt

In [None]:
%reload_ext autoreload
%autoreload 2
%matplotlib inline

#Why Specificity Matters in COVID-19 antibody testing

Test quality and accuracy are paramount to minimize risks of inaccurate results.

Sensitivity vs Specificity

Test sensitivity indicates the ability of the test to correctly identify patients that have the disease. A test’s sensitivity is also known as the true positive rate. If a diagnostic test correctly identified 100% of patients who have the disease, it would be as sensitive as possible.
Test specificity indicates the ability of the test to correctly identify patients that do not have the disease. If a test correctly identifies all people without the disease as negative, it would be as specific as possible.

Specificity in COVID-19 testing
For SARS-CoV-2 antibody testing, the CDC suggests use of tests with a specificity ≥99.5% to minimize the potential for false-positive results.

There are numerous tests that claim to detect antibodies to the SARS-CoV-2 virus; only a few are highly accurate.
https://www.siemens-healthineers.com/br/laboratory-diagnostics/assays-by-diseases-conditions/infectious-disease-assays/specificity-matters

In [None]:
path = Path("/kaggle/input/ai4all-project/figures/classifier")
path.ls()

#Dataloader has no Batches, therefore I couldn't run some cells. Go to Plan B.

In [None]:
def _add1(x): return x+1
dumb_tfm = RandTransform(enc=_add1, p=0.5)
start,d1,d2 = 2,False,False
for _ in range(40):
    t = dumb_tfm(start, split_idx=0)
    if dumb_tfm.do: test_eq(t, start+1); d1=True
    else:           test_eq(t, start)  ; d2=True
assert d1 and d2
dumb_tfm

#How do highly specific antibody tests support good performance, even with low disease prevalence?

Let’s look at what happens in two cities, one with a 5% disease prevalence, another with 10% prevalence.

What happens when the population is tested with an assay that has a high specificity of 99.8%?

A highly specific test minimizes false-positive results. With a higher disease prevalence, a smaller share of the positive results are false positives.

![](https://static.healthcare.siemens.com/siemens_hwem-hwem_ssxa_websites-context-root/wcm/idc/groups/public/@global/@lab/documents/image/mda5/odky/~edisp/citya_cityb-07401378/~renditions/citya_cityb-07401378~8.jpg)https://www.siemens-healthineers.com/br/laboratory-diagnostics/assays-by-diseases-conditions/infectious-disease-assays/specificity-matters
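
The two-city comparison can be sketched with Bayes' rule. The 99.8% specificity and the 5% and 10% prevalences come from the text above. The sensitivity is not stated there, so the value of 100% used below is purely an assumption for illustration, as is the function name.

```python
def positive_predictive_value(prevalence: float, sensitivity: float, specificity: float) -> float:
    """Probability that a person with a positive result really has the condition."""
    true_pos = prevalence * sensitivity
    false_pos = (1 - prevalence) * (1 - specificity)
    return true_pos / (true_pos + false_pos)


ASSUMED_SENSITIVITY = 1.0  # hypothetical: not given in the source text
SPECIFICITY = 0.998

for city, prevalence in (("5% prevalence", 0.05), ("10% prevalence", 0.10)):
    ppv = positive_predictive_value(prevalence, ASSUMED_SENSITIVITY, SPECIFICITY)
    print(f"{city}: PPV = {ppv:.1%}")
```

Under these assumptions the PPV is higher in the city with the higher prevalence, which is the effect the figure illustrates.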

In [None]:
from PIL import Image

img = Image.open("../input/ai4all-project/figures/classifier/lassoRandomForest_5gene_roc.png")
img

In [None]:
TensorTypes = (TensorImage,TensorMask,TensorPoint,TensorBBox)

In [None]:
_,axs = subplots(1,2)
show_image(img, ctx=axs[0], title='original')
show_image(img.flip_lr(), ctx=axs[1], title='flipped');

#Randomly flip with probability p

In [None]:
img.resize((64,64))

![](https://static.healthcare.siemens.com/siemens_hwem-hwem_ssxa_websites-context-root/wcm/idc/groups/public/@global/@lab/documents/image/mda5/odky/~edisp/5_vs_10_percent_prevalence-07401380/~renditions/5_vs_10_percent_prevalence-07401380~8.jpg)

What happens when the specificity is reduced to 96%?

A 96% specificity may seem high, but a difference as small as 3 to 4 percentage points can create dramatic changes in test results.

https://www.siemens-healthineers.com/br/laboratory-diagnostics/assays-by-diseases-conditions/infectious-disease-assays/specificity-matters
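
To see why a few percentage points of specificity matter, one can count the expected false positives directly. The sketch below uses the 99.8% and 96% specificities and the 5% and 10% prevalences from the text. The cohort size of 10,000 people and the function name are arbitrary choices made for illustration.

```python
def expected_false_positives(n_tested: int, prevalence: float, specificity: float) -> float:
    """Expected number of healthy people who nevertheless test positive."""
    n_without_condition = n_tested * (1 - prevalence)
    return n_without_condition * (1 - specificity)


N_TESTED = 10_000  # hypothetical cohort size

for prevalence in (0.05, 0.10):
    for spec in (0.998, 0.96):
        fp = expected_false_positives(N_TESTED, prevalence, spec)
        print(f"prevalence {prevalence:.0%}, specificity {spec:.1%}: "
              f"about {round(fp)} false positives per {N_TESTED:,} tested")
```

In this illustration the drop from 99.8% to 96% specificity multiplies the number of false positives twentyfold at either prevalence.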

In [None]:
timg = TensorImage(image2tensor(img))
tpil = PILImage.create(timg)

#Image resize with tpil

In [None]:
tpil.resize((64,64))

#Evaluation of sensitivity and specificity of four commercially available SARS-CoV-2 antibody immunoassays

Public Health England, Porton Down - Nuffield Department of Medicine, University of Oxford - Oxford University Hospitals NHS Foundation Trust 

To cope with the demand for serological diagnosis, several manufacturers have developed immunoassays that are compatible with current global laboratory infrastructures, including high-throughput analyzers. However, assembling appropriate and large sets of samples to thoroughly test the performance of these assays has been difficult within the very short time frames of assay development and release, and direct comparisons of platforms have been limited.

To directly evaluate and compare the sensitivity and specificity of four commercial immunoassays for SARS-CoV-2 antibody, the authors formed a collaboration between Public Health England - Porton Down, Oxford University Hospitals NHS Foundation Trust, and the University of Oxford. Using a large collection of serum/plasma samples from individuals with SARS-CoV-2 infection confirmed by RT-PCR, and a bank of known negative samples collected pre-pandemic, they ran the same samples across all four platforms in a ‘head-to-head’ evaluation.
https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/898437/Evaluation__of_sensitivity_and_specificity_of_4_commercially_available_SARS-CoV-2_antibody_immunoassays.pdf 

In [None]:
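TensorTypes = (TensorImage,TensorMask,TensorPoint,TensorBBox)

def _neg_axis(x, axis): x[...,axis] = -x[...,axis]; return x  # negate one coordinate axis in place
# @patch attaches each flip_lr to its annotated type; without it every def overwrites the previous one
@patch
def flip_lr(x:Image.Image): return x.transpose(Image.FLIP_LEFT_RIGHT)
@patch
def flip_lr(x:TensorImageBase): return x.flip(-1)
@patch
def flip_lr(x:TensorPoint): return TensorPoint(_neg_axis(x.clone(), 0))
@patch
def flip_lr(x:TensorBBox):  return TensorBBox(TensorPoint(x.view(-1,2)).flip_lr().view(-1,4))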

#Sensitivity and specificity plotted for each assay (Results: Primary analysis)

Sensitivity and specificity (95% confidence intervals) plotted for each assay using the current MHRA TPP criteria: ≥20 days post-symptom onset in confirmed laboratory cases of SARS-CoV-2 for positive cases, and >6 months prior to the first known COVID-19 cases for negatives.

The MHRA TPP target performance is shown (dashed line) including the required lower bound of the 95% confidence interval (dotted line) for both sensitivity and specificity. Data are presented for 994 known negative samples and 536 known positive samples run on each assay; equivocal results were excluded from the calculation of sensitivity and specificity for the DiaSorin assay (n=9).

![](https://els-jbs-prod-cdn.jbs.elsevierhealth.com/cms/attachment/b133bd71-362b-48ab-bbc5-5ddf9ecd437d/gr1_lrg.jpg)
https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/898437/Evaluation__of_sensitivity_and_specificity_of_4_commercially_available_SARS-CoV-2_antibody_immunoassays.pdf
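
The 95% confidence intervals in the figure depend on how many samples each estimate rests on. The report excerpt above does not say which interval method was used, so the sketch below uses the Wilson score interval as one common choice. The sample sizes (536 positives, 994 negatives) come from the text, while the numbers of correct results are hypothetical.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 for about 95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


# HYPOTHETICAL hit counts; only the denominators come from the evaluation
examples = {
    "sensitivity": (500, 536),  # positives correctly detected / known positives
    "specificity": (990, 994),  # negatives correctly cleared / known negatives
}
for name, (hits, total) in examples.items():
    low, high = wilson_interval(hits, total)
    print(f"{name}: {hits / total:.1%} (95% CI {low:.1%} to {high:.1%})")
```

A target that constrains the lower bound of the interval, as described above, is stricter than one on the point estimate alone, because the lower bound also reflects the sample size.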

In [None]:
img = PILImage(PILImage.create(timg).resize((600,400)))
img

In [None]:
_,axs = plt.subplots(1,3,figsize=(12,4))
for ax,sz in zip(axs.flatten(), [300, 500, 700]):
    show_image(img.crop_pad(sz), ctx=ax, title=f'Size {sz}');

#Results: Secondary analysis

They used ROC curves to investigate the performance of each assay. ROC curves evaluate the trade-off between true positive rates (ie assay sensitivity) versus false positive rates (ie 1-specificity) at a given assay threshold.
 
Distribution of numerical results obtained for each assay, using samples defined according to the current MHRA TPP criteria. Assay thresholds (set by the manufacturers) are shown as dashed lines. For the purposes of plotting values on a log scale, values of zero were set to the lowest non-zero value and results of greater or less than the largest or smallest values were truncated to the largest and smallest values. Data are presented for 994 known negative samples and 536 known positive samples run on each assay.
 
![](https://encrypted-tbn0.gstatic.com/images?q=tbn%3AANd9GcTnJoJj-mDvtFtHQ1Tg1gow-93rMzOBb5MPog&usqp=CAU) 
https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/898437/Evaluation__of_sensitivity_and_specificity_of_4_commercially_available_SARS-CoV-2_antibody_immunoassays.pdf
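
A ROC curve of the kind described above can be traced by sweeping the decision threshold over the assay readings and recording the true positive and false positive rates at each step. The following is a minimal sketch with made-up readings and labels. It is not the evaluation's own code or data.

```python
def roc_points(scores, labels):
    """Return (false_positive_rate, true_positive_rate) pairs, one per threshold.

    A sample is called positive when its score is >= the threshold.
    labels: 1 = known positive sample, 0 = known negative sample.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]  # threshold above every score: nothing called positive
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


def area_under_curve(points):
    """Trapezoidal area under the ROC curve."""
    return sum((x1 - x0) * (y0 + y1) / 2 for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical assay readings and true status
scores = [0.1, 0.4, 0.35, 0.8]
labels = [0, 0, 1, 1]
curve = roc_points(scores, labels)
print(curve)
print("AUC =", area_under_curve(curve))
```

Each point on the curve corresponds to one candidate threshold, so a manufacturer's chosen cut-off is a single point on it.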

In [None]:
_,axs = plt.subplots(1,3,figsize=(12,4))
for ax,mode in zip(axs.flatten(), [PadMode.Zeros, PadMode.Border, PadMode.Reflection]):
    show_image(img.crop_pad((600,700), pad_mode=mode), ctx=ax, title=mode);

#False-positive COVID-19 results: hidden problems and costs

Authors: Elena Surkova; Vladyslav Nikolayevskyy; Francis Drobniewski
Published: September 29, 2020
DOI: https://doi.org/10.1016/S2213-2600(20)30453-7

RT-PCR tests to detect severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) RNA are the operational gold standard for detecting COVID-19 disease in clinical practice. RT-PCR assays in the UK have analytical sensitivity and specificity of greater than 95%, but no single gold standard assay exists.

New assays are verified across panels of material, confirmed as COVID-19 by multiple testing with other assays, together with a consistent clinical and radiological picture. These new assays are often tested under idealised conditions with hospital samples containing higher viral loads than those from asymptomatic individuals living in the community. As such, diagnostic or operational performance of swab tests in the real world might differ substantially from the analytical sensitivity and specificity.
https://www.thelancet.com/journals/lanres/article/PIIS2213-2600(20)30453-7/fulltext

In [None]:
#Randomly crop an image to size

_,axs = plt.subplots(1,3,figsize=(12,4))
f = RandomCrop(200)
for ax in axs: show_image(f(img), ctx=ax);

#Inclusion of asymptomatic people affects the other key parameter of testing: pretest probability

As testing capacity, and therefore the rate of testing, in the UK and worldwide has continued to increase, more and more asymptomatic individuals have undergone testing.

This growing inclusion of asymptomatic people affects the other key parameter of testing, the pretest probability, which underpins the veracity of the testing strategy. In March and early April, 2020, most people tested in the UK were severely ill patients admitted to hospitals with a high probability of infection.

Since then, the number of COVID-19-related hospital admissions has decreased markedly from more than 3000 per day at the peak of the first wave, to just more than 100 in August, while the number of daily tests jumped from 11 896 on April 1, 2020, to 190 220 on Aug 1, 2020. 

In other words, the pretest probability will have steadily decreased as the proportion of asymptomatic cases screened increased against a background of physical distancing, lockdown, cleaning, and masks, which have reduced viral transmission to the general population. 

At present, only about a third of swab tests are done in those with clinical needs or in health-care workers (defined as the pillar 1 community in the UK), while the majority are done in wider community settings (pillar 2). At the end of July, 2020, the positivity rate of swab tests within both pillar 1 (1·7%) and pillar 2 (0·5%) remained significantly lower than those in early April, when positivity rates reached 50%.
https://www.thelancet.com/journals/lanres/article/PIIS2213-2600(20)30453-7/fulltext
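
The effect of a falling pretest probability can be sketched with the positive likelihood ratio, which converts pretest odds into post-test odds. Two loud assumptions apply. First, the sensitivity and specificity of 95% are hypothetical operational values chosen for illustration. Second, the positivity rates quoted above (50%, 1·7%, 0·5%) are used only as a crude stand-in for pretest probability, which they are not exactly.

```python
def post_test_probability(pretest: float, sensitivity: float, specificity: float) -> float:
    """Probability of infection after a POSITIVE result, via the likelihood ratio."""
    lr_positive = sensitivity / (1 - specificity)  # how much a positive result shifts the odds
    pretest_odds = pretest / (1 - pretest)
    post_odds = pretest_odds * lr_positive
    return post_odds / (1 + post_odds)


SENS, SPEC = 0.95, 0.95  # ASSUMED operational values, not taken from the article

for label, pretest in (("early April", 0.50), ("pillar 1, July", 0.017), ("pillar 2, July", 0.005)):
    post = post_test_probability(pretest, SENS, SPEC)
    print(f"{label}: pretest {pretest:.1%} -> post-test {post:.1%}")
```

Under these assumptions the same positive result is far less convincing when the pretest probability is low, which is the concern the authors raise.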

#Center crop

In [None]:
_,axs = plt.subplots(1,3,figsize=(12,4))
for ax in axs: show_image(f(img, split_idx=1), ctx=ax);

#False-negative tests - Asymptomatic or mildly symptomatic patients

Globally, most effort so far has been invested in turnaround times and low test sensitivity (ie, false negatives); one systematic review reported false-negative rates of between 2% and 33% in repeat sample testing.

Although false-negative tests have until now had priority due to the devastating consequences of undetected cases in health-care and social care settings, and the propagation of the epidemic especially by asymptomatic or mildly symptomatic patients, the consequences of a false-positive result are not benign from various perspectives, in particular among health-care workers.
https://www.thelancet.com/journals/lanres/article/PIIS2213-2600(20)30453-7/fulltext

In [None]:
test_eq(ResizeMethod.Squish, 'squish')

In [None]:
Resize(224)

#Potential consequences of false-positive COVID-19 swab test results

INDIVIDUAL PERSPECTIVE

Health-related

For swab tests taken for screening purposes before elective procedures or surgeries: unnecessary treatment cancellation or postponement

For swab tests taken for screening purposes during urgent hospital admissions: potential exposure to infection following a wrong pathway in hospital settings as an in-patient.



Financial

Financial losses related to self-isolation, income losses, and cancelled travel, among other factors


Psychological

Psychological damage due to misdiagnosis or fear of infecting others, isolation, or stigmatisation.
https://www.thelancet.com/journals/lanres/article/PIIS2213-2600(20)30453-7/fulltext

In [None]:
_,axs = plt.subplots(1,3,figsize=(12,4))
for ax,method in zip(axs.flatten(), [ResizeMethod.Squish, ResizeMethod.Pad, ResizeMethod.Crop]):
    rsz = Resize(256, method=method)
    show_image(rsz(img, split_idx=0), ctx=ax, title=method);

##Potential consequences of false-positive COVID-19 swab test results

GLOBAL PERSPECTIVE

Financial

Misspent funding (often originating from taxpayers) and human resources for test and trace.

Unnecessary testing.

Funding replacements in the workplace.

Various business losses.

Epidemiological and diagnostic performance

Overestimating COVID-19 incidence and the extent of asymptomatic infection.

Misleading diagnostic performance, potentially leading to mistaken purchasing or investment decisions if a new test shows high performance by identification of negative reference samples as positive (ie, is it a false positive or does the test show higher sensitivity than the other comparator tests used to establish the negativity of the test sample?)

Societal

Misdirection of policies regarding lockdowns and school closures.

Increased depression and domestic violence (eg, due to lockdown, isolation, and loss of earnings after a positive test).

https://www.thelancet.com/journals/lanres/article/PIIS2213-2600(20)30453-7/fulltext

In [None]:
crop = RandomResizedCrop(256)
_,axs = plt.subplots(3,3,figsize=(9,9))
for ax in axs.flatten():
    cropped = crop(img)
    show_image(cropped, ctx=ax);

Summarising, false-positive COVID-19 swab test results might be increasingly likely in the current epidemiological climate in the UK, with substantial consequences at the personal, health system, and societal levels.

Several measures might help to minimise false-positive results and mitigate possible consequences. Firstly, stricter standards should be imposed in laboratory testing. This includes the development and implementation of external quality assessment schemes and internal quality systems, such as automatic blinded replication of a small number of tests for performance monitoring to ensure false-positive and false-negative rates remain low, and to permit withdrawal of a malfunctioning test at the earliest possibility.

Secondly, pretest probability assessments should be considered, and clear evidence-based guidelines on interpretation of test results developed. 

Thirdly, policies regarding the testing and prevention of virus transmission in health-care workers might need adjustments, with an immediate second test implemented for any health-care worker testing positive. Finally, research is urgently required into the clinical and epidemiological significance of prolonged virus shedding and the role of people recovering from COVID-19 in disease transmission.
https://www.thelancet.com/journals/lanres/article/PIIS2213-2600(20)30453-7/fulltext
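
The suggestion of an immediate second test can be illustrated by applying Bayes' rule twice, using the probability after the first positive result as the starting point for the second. This sketch assumes that the errors of the two tests are independent, which may not hold in practice (for example with contamination or a shared sample). The 95% sensitivity and specificity and the 0.5% starting probability are hypothetical values chosen for illustration.

```python
def update_on_positive(prior: float, sensitivity: float, specificity: float) -> float:
    """Bayes' rule: probability of infection after one positive result."""
    true_pos = prior * sensitivity
    false_pos = (1 - prior) * (1 - specificity)
    return true_pos / (true_pos + false_pos)


SENS, SPEC = 0.95, 0.95  # ASSUMED values, not from the article
prior = 0.005            # hypothetical low pretest probability

after_first = update_on_positive(prior, SENS, SPEC)
# ASSUMPTION: the second test errs independently of the first
after_second = update_on_positive(after_first, SENS, SPEC)

print(f"before testing:        {prior:.1%}")
print(f"after one positive:    {after_first:.1%}")
print(f"after two positives:   {after_second:.1%}")
```

In this illustration a single positive result leaves infection unlikely, while two concordant positives make it more likely than not.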

In [None]:
#Code by Olga Belitskaya https://www.kaggle.com/olgabelitskaya/sequential-data/comments
from IPython.display import display,HTML
c1,c2,f1,f2,fs1,fs2=\
'#eb3434','#eb3446','Akronim','Smokum',30,15
def dhtml(string,fontcolor=c1,font=f1,fontsize=fs1):
    display(HTML("""<style>
    @import 'https://fonts.googleapis.com/css?family="""\
    +font+"""&effect=3d-float';</style>
    <h1 class='font-effect-3d-float' style='font-family:"""+\
    font+"""; color:"""+fontcolor+"""; font-size:"""+\
    str(fontsize)+"""px;'>%s</h1>"""%string))
    
    
dhtml('Marília Prata, @mpwolke Was here' )