# Assessing Classification Accuracy

## [Signal Detection Theory](http://gureckislab.org/courses/fall19/labincp/chapters/10/00-sdt.html)

- Why do we need Signal Detection Theory?
- What are the roots of signal detection theory?

## What is an ROC Curve?
- How is it generated?
- What are its characteristics?

In [1]:
from cdsutils.mutils import *
from cdsutils.sdt import *
#%matplotlib inline

## Univariate Binary Decisions

We are going to explore the simplest decision case: a single observable variable with a continuous value, for example [temperature](https://medlineplus.gov/ency/article/001982.htm) or [white blood count](https://medlineplus.gov/ency/article/003643.htm), and a binary classification made from that variable. For example, the patient has __infection__ (which we will refer to as __positive__) or __no infection__ (which we will refer to as __negative__). Further, we are going to assume that greater (more positive) values of the observable variable are associated with the __positive disease state__.

## Sensitivity and Specificity

The __sensitivity__ (true positive fraction (TPF)) of a test is the probability that an actually positive case will be labeled as positive by the test. It can be computed as the number of __actually positive__ cases __labeled__ as positive by the test divided by the __total number of positive cases__.

The __specificity__ (true negative fraction (TNF)) of a test is the probability that an actually negative case will be labeled as negative by the test. It can be computed as the number of __actually negative__ cases __labeled__ as negative by the test divided by the __total number of negative cases__.
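The two definitions above can be sketched in a few lines of NumPy. This is a minimal illustration, not the code behind the course widgets: the function name `sensitivity_specificity`, the temperature values, and the 37.5 threshold are all hypothetical, and the sketch assumes that values at or above the threshold are labeled positive.

```python
import numpy as np

def sensitivity_specificity(values, is_positive, threshold):
    """Return (sensitivity, specificity), labeling values >= threshold as positive."""
    values = np.asarray(values, dtype=float)
    is_positive = np.asarray(is_positive, dtype=bool)
    labeled_positive = values >= threshold

    # Actually positive cases that the test labels positive
    true_positives = np.sum(labeled_positive & is_positive)
    # Actually negative cases that the test labels negative
    true_negatives = np.sum(~labeled_positive & ~is_positive)

    sensitivity = true_positives / np.sum(is_positive)
    specificity = true_negatives / np.sum(~is_positive)
    return sensitivity, specificity

# Hypothetical temperatures (degrees C) and true infection status
temps = [36.5, 36.8, 37.0, 37.4, 37.9, 38.2, 38.6, 39.1]
infected = [False, False, False, True, False, True, True, True]

sens, spec = sensitivity_specificity(temps, infected, threshold=37.5)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
# prints: sensitivity = 0.75, specificity = 0.75
```

With this toy data, three of the four infected patients are at or above the threshold and three of the four uninfected patients are below it, so both fractions are 3/4.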



In [2]:
SensSpecCheck()



## [Confusion Matrix](https://en.wikipedia.org/wiki/Confusion_matrix)

Computing the sensitivity and specificity is easier if we first compute the __confusion matrix__.
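As a rough sketch of why the confusion matrix helps, the example below counts the four outcomes once and then reads sensitivity and specificity off the rows. The helper name `confusion_matrix_2x2`, the row/column layout, and the ten example cases are assumptions made for illustration.

```python
import numpy as np

def confusion_matrix_2x2(is_positive, labeled_positive):
    """Return [[TP, FN], [FP, TN]].

    Rows are the actual state (positive, negative);
    columns are the test label (positive, negative).
    """
    actual = np.asarray(is_positive, dtype=bool)
    labeled = np.asarray(labeled_positive, dtype=bool)
    tp = int(np.sum(actual & labeled))    # actually positive, labeled positive
    fn = int(np.sum(actual & ~labeled))   # actually positive, labeled negative
    fp = int(np.sum(~actual & labeled))   # actually negative, labeled positive
    tn = int(np.sum(~actual & ~labeled))  # actually negative, labeled negative
    return np.array([[tp, fn], [fp, tn]])

# Hypothetical cases: 4 actually positive, 6 actually negative
actual = [True, True, True, True, False, False, False, False, False, False]
labeled = [True, True, True, False, True, False, False, False, False, False]

cm = confusion_matrix_2x2(actual, labeled)
(tp, fn), (fp, tn) = cm

sensitivity = tp / (tp + fn)  # first row: all actually positive cases
specificity = tn / (tn + fp)  # second row: all actually negative cases
print(cm)
print(f"sensitivity = {sensitivity:.2f}, specificity = {specificity:.2f}")
```

Each row of this layout sums to the number of actually positive or actually negative cases, which is exactly the denominator each definition needs.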

In [3]:
SensSpecCheck2()


## Key Points for Sensitivity and Specificity

- Sensitivity and specificity are __insensitive__ to the disease prevalence.
- Sensitivity and specificity depend on the arbitrary choice of the __threshold__.
- For an imperfect test, when I change the threshold to __increase sensitivity__ I will necessarily __decrease specificity__. Similarly, any threshold change to __increase specificity__ will necessarily __decrease sensitivity__. This will be explored more below with ROC curves.
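The threshold trade-off in the last point can be seen by sweeping a threshold across two overlapping populations. The sketch below is only an illustration: the Gaussian populations, their means, the sample sizes, and the threshold grid are all assumed values, not the ones used elsewhere in this notebook.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical test values: negatives centered at 0, positives centered at 1.5
neg = rng.normal(loc=0.0, scale=1.0, size=500)
pos = rng.normal(loc=1.5, scale=1.0, size=500)

# Values >= threshold are labeled positive
thresholds = np.linspace(-2.0, 4.0, 7)
sens = np.array([np.mean(pos >= t) for t in thresholds])  # fraction of positives caught
spec = np.array([np.mean(neg < t) for t in thresholds])   # fraction of negatives cleared

for t, se, sp in zip(thresholds, sens, spec):
    print(f"threshold={t:5.1f}  sensitivity={se:.2f}  specificity={sp:.2f}")
```

Reading down the printed table, raising the threshold never raises sensitivity and never lowers specificity, so any gain in one is paid for with the other wherever the two populations overlap.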


## PPV and NPV

The predictive value of a test depends on the prevalence of the disease in the population being tested. The positive predictive value (PPV) and the negative predictive value (NPV) capture this dependency.

The PPV is the probability that someone with a positive test is actually disease positive. PPV is computed as the number of actually positive cases with a positive test divided by the total number of cases with a positive test.

Similarly, the NPV is the probability that someone with a negative test is actually disease negative. NPV is computed as the number of actually negative cases with a negative test divided by the total number of cases with a negative test.
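To make the dependence on prevalence concrete, the sketch below applies Bayes' rule to a test with fixed sensitivity and specificity. The function name `predictive_values` and the 0.90 values for sensitivity and specificity are assumptions chosen for illustration.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a population with the given disease prevalence."""
    # Expected fraction of the whole population in each confusion-matrix cell
    tp = sensitivity * prevalence
    fn = (1 - sensitivity) * prevalence
    tn = specificity * (1 - prevalence)
    fp = (1 - specificity) * (1 - prevalence)

    ppv = tp / (tp + fp)  # actually positive among all positive tests
    npv = tn / (tn + fn)  # actually negative among all negative tests
    return ppv, npv

# Same hypothetical test (sensitivity = specificity = 0.90), three prevalences
for prevalence in (0.01, 0.10, 0.50):
    ppv, npv = predictive_values(0.90, 0.90, prevalence)
    print(f"prevalence={prevalence:.2f}  PPV={ppv:.3f}  NPV={npv:.3f}")
```

With these assumed numbers, the PPV is 0.5 at 10% prevalence and 0.9 at 50% prevalence, even though the test itself has not changed. At 1% prevalence most positive tests come from the much larger disease-negative group.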

## Receiver Operating Characteristic (ROC) Curves

Because sensitivity and specificity are dependent on the arbitrary choice of the threshold, it is often desirable to have a measure that summarizes a test across all possible choices of a threshold. A common way of doing this is with ROC curves, where we can compute the area under the curve (AUC) as a summary statistic of the test's performance. In the cells below, you can explore these ideas with different populations.
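One way to build such a curve by hand is to treat every observed value as a candidate threshold, plot the true positive fraction (sensitivity) against the false positive fraction (1 - specificity), and integrate with the trapezoid rule. The sketch below is a minimal NumPy version under those assumptions; the function names and the toy data are hypothetical, and this is not necessarily how the widget below computes its curve.

```python
import numpy as np

def roc_curve_points(values, is_positive):
    """Return (fpf, tpf) arrays tracing the ROC curve from (0, 0) to (1, 1)."""
    values = np.asarray(values, dtype=float)
    is_positive = np.asarray(is_positive, dtype=bool)

    # Start with +inf (nothing labeled positive), then each distinct value, descending
    thresholds = np.concatenate(([np.inf], np.unique(values)[::-1]))

    # Values >= threshold are labeled positive
    tpf = np.array([np.mean(values[is_positive] >= t) for t in thresholds])
    fpf = np.array([np.mean(values[~is_positive] >= t) for t in thresholds])
    return fpf, tpf

def auc_trapezoid(fpf, tpf):
    """Area under the ROC curve by the trapezoid rule."""
    return float(np.sum(np.diff(fpf) * (tpf[1:] + tpf[:-1]) / 2.0))

# Hypothetical temperatures (degrees C) and true infection status
temps = [36.5, 36.8, 37.0, 37.4, 37.9, 38.2, 38.6, 39.1]
infected = [False, False, False, True, False, True, True, True]

fpf, tpf = roc_curve_points(temps, infected)
auc = auc_trapezoid(fpf, tpf)
print(f"AUC = {auc:.4f}")
# prints: AUC = 0.9375
```

For this toy data the AUC of 0.9375 equals 15/16: of the 16 possible (positive, negative) pairs, 15 have the positive case with the higher temperature.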

### Let's Generate Some Random Populations

In my analyses I assume that __positive__ cases have a more __positive__ value of the test result.

In [4]:
ExploreStats()
