# COVID-19 Testing

How useful are tests for determining who has a disease? It depends on both the test and how common the disease is in the population being tested.

[This article](https://theconversation.com/coronavirus-surprisingly-big-problems-caused-by-small-errors-in-testing-136700) in The Conversation presents a great overview of the counterintuitive results that can occur when testing populations for diseases.


When testing the general population for a rare disease, the problem is correctly identifying the few people who have it: the true positives can easily be swamped by the false positives. Even a fairly accurate test can return more false positives than true positives.

If we instead think about testing people who are admitted to hospitals, the problem is reversed. Most of the people tested have the disease, so the false negatives can swamp the true negatives. When nearly everyone tested has the disease, you can find more false negatives than true negatives.
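To make the two scenarios concrete, here is a small arithmetic sketch. The sensitivity and specificity are the default values used later in this notebook; the prevalence figures (3% for the general population, 95% for hospital admissions) and the helper name `counts` are assumptions chosen only for illustration.

```python
# Illustrative figures only: the accuracy values match this notebook's defaults,
# while the 3% and 95% prevalence values are assumptions for this sketch.
n_tests = 10_000
sensitivity = 0.938
specificity = 0.956

def counts(prevalence):
    """Return the expected (TP, FP, TN, FN) counts for n_tests tests."""
    sick = n_tests * prevalence
    healthy = n_tests - sick
    tp = sick * sensitivity       # have the disease, test positive
    fn = sick - tp                # have the disease, test negative
    tn = healthy * specificity    # no disease, test negative
    fp = healthy - tn             # no disease, test positive
    return tp, fp, tn, fn

general = counts(0.03)    # rare disease in the general population
hospital = counts(0.95)   # nearly everyone tested has the disease

print("General population:  TP = {:.0f}, FP = {:.0f}".format(general[0], general[1]))
print("Hospital admissions: TN = {:.0f}, FN = {:.0f}".format(hospital[2], hospital[3]))
```

Under these assumptions the general population yields roughly 281 true positives against 427 false positives, while the hospital group yields roughly 478 true negatives against 589 false negatives.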


## Sensitivity and Specificity of a test

These two terms, sensitivity and specificity, have very particular meanings. 

- Sensitivity measures the proportion of people with the disease who are correctly identified (you have the disease, and the test correctly gives you a positive result)
- Specificity measures the proportion of people without the disease who are correctly identified (you don't have the disease, and the test correctly gives you a negative result)

The Wikipedia article on this [topic](https://en.wikipedia.org/wiki/Sensitivity_and_specificity) is good, but dense.
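
As a minimal sketch of these two definitions, both quantities can be computed directly from the four outcome counts. The function name `sensitivity_specificity` and the counts below are made up for illustration and do not come from any real test.

```python
def sensitivity_specificity(tp, fp, tn, fn):
    """Compute (sensitivity, specificity) from the four outcome counts."""
    sensitivity = tp / (tp + fn)  # share of people with the disease who test positive
    specificity = tn / (tn + fp)  # share of people without the disease who test negative
    return sensitivity, specificity

# Made-up counts: 100 people with the disease, 900 without.
sens, spec = sensitivity_specificity(tp=90, fp=45, tn=855, fn=10)
print("sensitivity = {:.2f}, specificity = {:.2f}".format(sens, spec))
```

Note that neither quantity depends on how common the disease is; that dependence only appears when asking how trustworthy a given positive or negative result is.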

## To use this notebook
Click on "Cell" in the menu across the top, then click "Run All". You can then adjust the sliders near the bottom of the notebook by clicking on them and dragging them from side to side.

In [1]:
import matplotlib.pyplot as plt
import ipywidgets
import locale
locale.setlocale(locale.LC_ALL, '')  # Use '' for auto, or force e.g. to 'en_US.UTF-8'



%matplotlib inline

Let's define a function that returns the number of true positives, false positives, true negatives, and false negatives for a given set of test parameters.

In [2]:
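def test_results(n_tests, prop_pos, sensitivity, specificity):
    """Return the expected counts (TP, FP, TN, FN) for n_tests tests.

    prop_pos is the proportion of people tested who have the disease.
    """
    # People who have the disease: detected (TP) or missed (FN)
    TP = n_tests*prop_pos*sensitivity
    FN = n_tests*prop_pos*(1-sensitivity)

    # People who do not have the disease: cleared (TN) or wrongly flagged (FP)
    TN = n_tests*(1-prop_pos)*specificity
    FP = n_tests*(1-prop_pos)*(1-specificity)

    return TP, FP, TN, FN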
    

Now define a function to plot these results.

In [3]:
def plot_results(n_tests=1e4, prop_pos=0.03, sensitivity=0.938, specificity=0.956):
    
    TP, FP, TN, FN = test_results(n_tests, prop_pos, sensitivity, specificity)

    plt.figure(figsize=(10,10))
    # population
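    # round() keeps the label free of a trailing '.0' when n_tests is a float (the default is 1e4)
    plt.barh(0, n_tests, label='Total tests administered = {:,}'.format(round(n_tests)))
    plt.text(0.1*n_tests,0, 'Total tests administered {:,}'.format(round(n_tests)), fontsize=20)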

    # population distribution
    plt.barh(-1, n_tests*prop_pos, left=-n_tests*0.025,
             label='Have disease = {:n}'.format(round(n_tests*prop_pos)))
    if prop_pos >= 0.4:
        plt.text(0, -1, 'Have disease \n{:,}'.format(round(n_tests*prop_pos)), fontsize=20)


    plt.barh(-1, n_tests*(1-prop_pos), left=n_tests*prop_pos + n_tests*0.025,
             label='Do not have disease = {:n}'.format(round(n_tests*(1-prop_pos))))
    if prop_pos < 0.5:
        plt.text(0.05*n_tests + n_tests*prop_pos, -1,
                 'Do not have disease \n{:,}'.format(round(n_tests*(1-prop_pos))), fontsize=20)



    # test results
    plt.barh(-2, TP, left=-0.075*n_tests, label='True positives = {:n}'.format(round(TP)))
    plt.text(-0.1*n_tests + TP/2, -1.55, 'TP', fontsize=18)
    
    plt.barh(-2, FN, left=TP-0.025*n_tests, label='False Negatives = {:n}'.format(round(FN)))
    plt.text(TP-0.05*n_tests + FN/2, -1.55, 'FN', fontsize=18)

    plt.barh(-2, TN, left=TP+FN + 0.025*n_tests, label='True Negatives = {:n}'.format(round(TN)))
    plt.text(TP+FN + TN/2, -1.55, 'TN', fontsize=18)

    plt.barh(-2, FP, left=TP+FN+TN + 0.075*n_tests, label='False Positives = {:n}'.format(round(FP)))
    plt.text(TP+FN+TN+0.05*n_tests + FP/2, -1.55, 'FP', fontsize=18)

    plt.legend(fontsize=20, bbox_to_anchor=(1.,0.5))

    
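    # Guard against division by zero when a slider setting yields no positive
    # (or no negative) results at all, e.g. prop_pos=0 with specificity=1.
    pos_text = '{:2.2f}%'.format(100*TP/(TP+FP)) if TP+FP > 0 else 'n/a'
    neg_text = '{:2.2f}%'.format(100*TN/(TN+FN)) if TN+FN > 0 else 'n/a'

    plt.text(0, -2.6,
             'Percentage of positive results that are true positives = ' + pos_text,
             fontsize=18)

    plt.text(0, -2.8,
             'Percentage of negative results that are true negatives = ' + neg_text,
             fontsize=18)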

    
    plt.xlim(-0.1*n_tests, 1.15*n_tests)
    plt.axis('off')
    plt.show()


Remember these definitions:

**n_tests** is the number of tests administered

**prop_pos** is the true proportion of the tested population that has the disease

**sensitivity** is the proportion of people with the disease who are correctly identified

**specificity** is the proportion of people without the disease who are correctly identified
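
The two percentages printed at the bottom of the plot depend on **prop_pos** as well as on the accuracy of the test. As a rough sketch, the snippet below tabulates them for a few prevalence values without drawing anything. The helper name `predictive_values` and the chosen prevalence values are assumptions made for this illustration; the defaults match those of `plot_results`. These two quantities are commonly called the positive and negative predictive value (PPV and NPV).

```python
def predictive_values(prop_pos, sensitivity=0.938, specificity=0.956):
    """Return (PPV, NPV) as fractions; None where no such results exist."""
    # Work with proportions of all tests, so n_tests cancels out.
    tp = prop_pos * sensitivity
    fn = prop_pos * (1 - sensitivity)
    tn = (1 - prop_pos) * specificity
    fp = (1 - prop_pos) * (1 - specificity)
    ppv = tp / (tp + fp) if (tp + fp) > 0 else None  # positives that are true positives
    npv = tn / (tn + fn) if (tn + fn) > 0 else None  # negatives that are true negatives
    return ppv, npv

# Prevalence values picked only for illustration.
for p in (0.01, 0.03, 0.1, 0.5, 0.9):
    ppv, npv = predictive_values(p)
    print("prop_pos = {:.2f}: PPV = {:.1%}, NPV = {:.1%}".format(p, ppv, npv))
```

With the default accuracy values, fewer than half of the positive results are true positives at a prevalence of 3%, which is the effect the sliders let you explore.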

In [4]:
ipywidgets.interact(plot_results,
                    n_tests=(1000,100000),
                    prop_pos=(0.0,1,0.01),
                    sensitivity=(0.7,1,0.01),
                    specificity=(0.7,1,0.01)
                   );
