# False Positive (False alarm) test for SEED

This experiment is based on the paper "Detecting Volatility Shift in Data Streams".

## General idea for this experiment

A stationary binary data stream is drawn from a Bernoulli distribution and fed to the SEED drift detector under a given parameter setting. Because the stream contains no real drift, every change the detector reports is a false alarm and counts toward the false positive rate.
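To make "stationary" concrete, here is a minimal, dependency-free sketch of such a stream. The helper name `bernoulli_stream` and the fixed seed are assumptions for illustration only; the notebook itself draws its data with `scipy.stats.bernoulli.rvs`.

```python
import random

def bernoulli_stream(p, n, seed=None):
    """Return n independent 0/1 draws with P(1) = p (a stationary stream)."""
    rng = random.Random(seed)
    return [1 if rng.random() < p else 0 for _ in range(n)]

# Hypothetical example values: mu = 0.1, 100000 instances, fixed seed.
stream = bernoulli_stream(p=0.1, n=100_000, seed=0)

# In a stationary stream both halves share the same underlying mean,
# so any change a detector reports between them is a false alarm.
half = len(stream) // 2
first_mean = sum(stream[:half]) / half
second_mean = sum(stream[half:]) / half
print(first_mean, second_mean)
```

Both printed means should be close to the chosen $\mu$, differing only by sampling noise.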

## Details for this experiment

This experiment uses exactly the same combinations of the parameters $\mu$ and $\delta$ as the paper, and the other parameters take the best values reported there. However, this implementation runs only 100 trials per parameter combination instead of 1000, because the execution time on my PC is significantly longer (well over half an hour).

## Set up and configure the SEED change detector

In [1]:
# Import the required packages
from scipy.stats import bernoulli
from skmultiflow.drift_detection import SEED
from skmultiflow.drift_detection import ADWIN
import numpy as np

# We use the best SEED parameters: blocksize = 32, compression = 75,
# epsilon = 0.01 and alpha = 0.8

# delta = 0.05
seed05 = SEED(0.05)

# delta = 0.1
seed1 = SEED(0.1)

# delta = 0.3
seed3 = SEED(0.3)

## Implementation details

For each trial

    reset the false alarm count
    create a new data stream
    create a new drift detector instance
    
    for each data instance
    
        feed the data instance to the drift detector
        if a change is detected, increment the false alarm count
        
    calculate the trial's false positive rate (false alarm count / number of data instances)

Report the final false positive rate (sum of the per-trial rates over all trials)
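The procedure above can be sketched as one reusable function. This is only an illustration under stated assumptions: `run_trials` and `EveryKthAlarm` are hypothetical names, and the stand-in detector merely mimics the `add_element` / `detected_change` interface used in this notebook so that the sketch runs without SEED installed.

```python
import random

class EveryKthAlarm:
    """Hypothetical stand-in detector: signals a change on every k-th element."""
    def __init__(self, k):
        self.k = k
        self.seen = 0

    def add_element(self, value):
        self.seen += 1

    def detected_change(self):
        return self.seen % self.k == 0

def run_trials(make_detector, p, n_instances, n_trials, seed=0):
    """Return the per-trial false positive rates on stationary Bernoulli(p) data."""
    rng = random.Random(seed)
    rates = []
    for _ in range(n_trials):
        false_alarms = 0                                   # reset the count
        stream = [1 if rng.random() < p else 0             # new data stream
                  for _ in range(n_instances)]
        detector = make_detector()                         # new detector instance
        for x in stream:
            detector.add_element(x)
            if detector.detected_change():
                false_alarms += 1
        rates.append(false_alarms / n_instances)
    return rates

rates = run_trials(lambda: EveryKthAlarm(k=1000), p=0.1,
                   n_instances=10_000, n_trials=3)
print(rates, sum(rates))
```

Passing a factory such as `lambda: SEED(0.05)` instead of the stand-in would give every trial a fresh detector, as the experiment blocks below do by hand.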

## Experiment (each block tests one parameter combination)

In [15]:
# mu = 0.01 and delta = 0.05

false_rate = []
for i in range(100):
    false_alarm = 0
    data01 = bernoulli.rvs(size=100000, p=0.01)
    seed05 = SEED(0.05)  # pass delta explicitly, as in the other blocks
    for j in range(100000):
        seed05.add_element(data01[j])
        if seed05.detected_change():
            false_alarm += 1
    false_rate.append(false_alarm / 100000.0)
print("The false positive rate for mu = 0.01 and delta = 0.05 is", sum(false_rate))

The false positive rate for mu = 0.01 and delta = 0.05 is 8e-05
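Note how to read this number: each trial's rate is divided by the stream length, and the 100 rates are summed rather than averaged, so the printed value equals the total number of alarms over all trials divided by 100000. A short worked conversion follows; the variable names are illustrative only.

```python
printed_sum = 8e-05      # value printed by the block above
n_trials = 100           # trials per parameter combination
n_instances = 100_000    # data instances per trial

# Total alarms raised across all trials.
total_alarms = round(printed_sum * n_instances)

# The same count expressed per processed data instance.
per_instance_rate = total_alarms / (n_trials * n_instances)
print(total_alarms, per_instance_rate)
```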


In [14]:
# mu = 0.01 and delta = 0.1

false_rate = []
for i in range(100):
    false_alarm = 0
    data01 = bernoulli.rvs(size=100000, p=0.01)
    seed1 = SEED(delta=0.1)
    for j in range(100000):
        seed1.add_element(data01[j])
        if seed1.detected_change():
            false_alarm += 1
    false_rate.append(false_alarm / 100000.0)
print("The false positive rate for mu = 0.01 and delta = 0.1 is", sum(false_rate))

The false positive rate for mu = 0.01 and delta = 0.1 is 0.00027


In [13]:
# mu = 0.01 and delta = 0.3

false_rate = []
for i in range(100):
    false_alarm = 0
    data01 = bernoulli.rvs(size=100000, p=0.01)
    seed3 = SEED(0.3)
    for j in range(100000):
        seed3.add_element(data01[j])
        if seed3.detected_change():
            false_alarm += 1
    false_rate.append(false_alarm / 100000.0)
print("The false positive rate for mu = 0.01 and delta = 0.3 is", sum(false_rate))

The false positive rate for mu = 0.01 and delta = 0.3 is 0.0017900000000000019


In [12]:
# mu = 0.1 and delta = 0.05

false_rate = []
for i in range(100):
    false_alarm = 0
    data1 = bernoulli.rvs(size=100000, p=0.1)
    seed05 = SEED(0.05)
    for j in range(100000):
        seed05.add_element(data1[j])
        if seed05.detected_change():
            false_alarm += 1
    false_rate.append(false_alarm / 100000.0)
print("The false positive rate for mu = 0.1 and delta = 0.05 is", sum(false_rate))

The false positive rate for mu = 0.1 and delta = 0.05 is 0.0008200000000000007


In [11]:
# mu = 0.1 and delta = 0.1

false_rate = []
for i in range(100):
    false_alarm = 0
    data1 = bernoulli.rvs(size=100000, p=0.1)
    seed1 = SEED(0.1)
    for j in range(100000):
        seed1.add_element(data1[j])
        if seed1.detected_change():
            false_alarm += 1
    false_rate.append(false_alarm / 100000.0)
print("The false positive rate for mu = 0.1 and delta = 0.1 is", sum(false_rate))

The false positive rate for mu = 0.1 and delta = 0.1 is 0.002120000000000002


In [10]:
# mu = 0.1 and delta = 0.3

false_rate = []
for i in range(100):
    false_alarm = 0
    data1 = bernoulli.rvs(size=100000, p=0.1)
    seed3 = SEED(0.3)
    for j in range(100000):
        seed3.add_element(data1[j])
        if seed3.detected_change():
            false_alarm += 1
    false_rate.append(false_alarm / 100000.0)
print("The false positive rate for mu = 0.1 and delta = 0.3 is", sum(false_rate))

The false positive rate for mu = 0.1 and delta = 0.3 is 0.010140000000000005


In [9]:
# mu = 0.3 and delta = 0.05

false_rate = []
for i in range(100):
    false_alarm = 0
    data3 = bernoulli.rvs(size=100000, p=0.3)
    seed05 = SEED(0.05)
    for j in range(100000):
        seed05.add_element(data3[j])
        if seed05.detected_change():
            false_alarm += 1
    false_rate.append(false_alarm / 100000.0)
print("The false positive rate for mu = 0.3 and delta = 0.05 is", sum(false_rate))

The false positive rate for mu = 0.3 and delta = 0.05 is 0.0012800000000000014


In [7]:
# mu = 0.3 and delta = 0.1

false_rate = []
for i in range(100):
    false_alarm = 0
    data3 = bernoulli.rvs(size=100000, p=0.3)
    seed1 = SEED(0.1)
    for j in range(100000):
        seed1.add_element(data3[j])
        if seed1.detected_change():
            false_alarm += 1
    false_rate.append(false_alarm / 100000.0)
print("The false positive rate for mu = 0.3 and delta = 0.1 is", sum(false_rate))

The false positive rate for mu = 0.3 and delta = 0.1 is 0.004360000000000003


In [6]:
# mu = 0.3 and delta = 0.3

false_rate = []
for i in range(100):
    false_alarm = 0
    data3 = bernoulli.rvs(size=100000, p=0.3)
    seed3 = SEED(0.3)
    for j in range(100000):
        seed3.add_element(data3[j])
        if seed3.detected_change():
            false_alarm += 1
    false_rate.append(false_alarm / 100000.0)
print("The false positive rate for mu = 0.3 and delta = 0.3 is", sum(false_rate))

The false positive rate for mu = 0.3 and delta = 0.3 is 0.01625


In [4]:
# mu = 0.5 and delta = 0.05

false_rate = []
for i in range(100):
    false_alarm = 0
    data5 = bernoulli.rvs(size=100000, p=0.5)
    seed05 = SEED(0.05)
    for j in range(100000):
        seed05.add_element(data5[j])
        if seed05.detected_change():
            false_alarm += 1
    false_rate.append(false_alarm / 100000.0)
print("The false positive rate for mu = 0.5 and delta = 0.05 is", sum(false_rate))

The false positive rate for mu = 0.5 and delta = 0.05 is 0.0018800000000000021


In [3]:
# mu = 0.5 and delta = 0.1

false_rate = []
for i in range(100):
    false_alarm = 0
    data5 = bernoulli.rvs(size=100000, p=0.5)
    seed1 = SEED(0.1)
    for j in range(100000):
        seed1.add_element(data5[j])
        if seed1.detected_change():
            false_alarm += 1
    false_rate.append(false_alarm / 100000.0)
print("The false positive rate for mu = 0.5 and delta = 0.1 is", sum(false_rate))

The false positive rate for mu = 0.5 and delta = 0.1 is 0.004160000000000006


In [2]:
# mu = 0.5 and delta = 0.3

false_rate = []
for i in range(100):
    false_alarm = 0
    data5 = bernoulli.rvs(size=100000, p=0.5)
    seed3 = SEED(0.3)
    for j in range(100000):
        seed3.add_element(data5[j])
        if seed3.detected_change():
            false_alarm += 1
    false_rate.append(false_alarm / 100000.0)
print("The false positive rate for mu = 0.5 and delta = 0.3 is", sum(false_rate))

The false positive rate for mu = 0.5 and delta = 0.3 is 0.01777


## Result table of this experiment

| $\mu$ / $\delta$ | 0.05 | 0.1 | 0.3 |
| --- | --- | --- | --- |
| 0.01 | 0.0001 | 0.0003 | 0.0018 |
| 0.1 | 0.0008 | 0.0021 | 0.0101 |
| 0.3 | 0.0013 | 0.0044 | 0.0163 |
| 0.5 | 0.0019 | 0.0042 | 0.0178 |
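The table entries are the printed sums rounded to four decimal places. The sketch below reproduces them; it assumes half-up rounding and enters each printed value as a string with its floating-point noise dropped (for example, 0.0017900000000000019 is typed as "0.00179"). The names `raw` and `to_table_entry` are illustrative only.

```python
from decimal import Decimal, ROUND_HALF_UP

# Printed sums per mu (rows), in the delta order 0.05, 0.1, 0.3 (columns).
raw = {
    "0.01": ["0.00008", "0.00027", "0.00179"],
    "0.1":  ["0.00082", "0.00212", "0.01014"],
    "0.3":  ["0.00128", "0.00436", "0.01625"],
    "0.5":  ["0.00188", "0.00416", "0.01777"],
}

def to_table_entry(value):
    """Round a decimal string to four places, rounding halves up (assumed mode)."""
    return str(Decimal(value).quantize(Decimal("0.0001"), rounding=ROUND_HALF_UP))

table = {mu: [to_table_entry(v) for v in row] for mu, row in raw.items()}
for mu, row in table.items():
    print(f"| {mu} | " + " | ".join(row) + " |")
```

Using `Decimal` on strings avoids depending on how a value such as 0.01625 happens to be stored in binary floating point.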

Each block of code takes about 5 minutes to run.