# True change point detection with the ANGLE change detector

This experiment is based on the paper "Interpreting Intermittent Bugs in Mozilla Applications Using Change Angle".

## General idea for this experiment

The ANGLE drift detection algorithm is fed a stationary, noisy data stream followed by a gradually changing concept. Every true alarm is recorded and counted toward the true positive rate, and every detected true change point is recorded and used to compute the PWM rate. The detection delay and the execution time are also computed in this experiment.

## Details for this experiment

This experiment is conducted with exactly the same sets of slopes for the parameter $\mu$ and noise magnitudes $\sigma$. However, this implementation only tests the performance of ANGLE; it does not compare online and offline true change point detection. Because the minimum delay time is at least 300 steps, I used 300 steps as the bound for true change points instead of the 100 steps used in the paper.
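
Reading the generator functions in the next section, each stream can be summarized as follows. The symbols $x_t$, $u_t$, $v$ and $s$ are notation introduced here for illustration (they are not taken from the paper): $v$ is the value passed as `variance`, $s$ is the slope, and $u_t$ are independent draws from the uniform distribution on $[0, 1)$.

$$
x_t = 0.2 + v\,(2u_t - 1) + s \cdot \max(0,\; t - 98999), \qquad t = 0, 1, \dots, 99999
$$

Under this reading the noise term is uniform on $[-v, v]$, so its actual variance is $v^2/3$ and the `variance` argument acts as the half-width of the noise band. The drift starts at $t = 99000$, and the accumulated shift at the last step is $1000\,s$, for example $0.1$ for $s = 0.0001$.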

## Setup and generation of noisy data streams with different slopes

In [1]:
from scipy.stats import bernoulli
from skmultiflow.drift_detection import ANGLE
from operator import add
import numpy as np
import time
import random

# We use confidence parameter delta = 0.05 for ANGLE

In [2]:
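# data stream generated for slope = 0.0001
def stream1(variance):
    # 100,000 uniform draws in [0, 1)
    ran_nums = np.random.uniform(low=0, high=1, size=100000).tolist()
    # rescale to uniform noise in [0.2 - variance, 0.2 + variance]
    data_stream = [(i*2-1)*variance+0.2 for i in ran_nums]
    # `slope` accumulates the offset, which grows by 0.0001 per step from index 99000 onward
    slope = 0
    for i in range(99000, 100000):
        slope += 0.0001
        data_stream[i] += slope
    return data_stream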

In [3]:
# data stream generated for slope = 0.0002
def stream2(variance):
    ran_nums = np.random.uniform(low=0, high=1, size=100000).tolist()
    data_stream = [(i*2-1)*variance+0.2 for i in ran_nums]
    slope = 0
    for i in range(99000, 100000):
        slope += 0.0002
        data_stream[i] += slope
    return data_stream

In [4]:
# data stream generated for slope = 0.0003
def stream3(variance):
    ran_nums = np.random.uniform(low=0, high=1, size=100000).tolist()
    data_stream = [(i*2-1)*variance+0.2 for i in ran_nums]
    slope = 0
    for i in range(99000, 100000):
        slope += 0.0003
        data_stream[i] += slope
    return data_stream

In [5]:
# data stream generated for slope = 0.0004
def stream4(variance):
    ran_nums = np.random.uniform(low=0, high=1, size=100000).tolist()
    data_stream = [(i*2-1)*variance+0.2 for i in ran_nums]
    slope = 0
    for i in range(99000, 100000):
        slope += 0.0004
        data_stream[i] += slope
    return data_stream    
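
The four generators above differ only in the slope value, so they could be folded into one parametrized function. The following is a minimal sketch under stated assumptions, not the code used for the reported results: the name `make_stream` and its parameters `length`, `change_at`, `base` and `seed` are hypothetical, and the standard-library `random` module stands in for `np.random.uniform` so that the sketch has no dependencies. It multiplies the slope by the step count instead of accumulating it, which can differ from the loops above by floating-point rounding only.

```python
import random


def make_stream(slope, variance, length=100_000, change_at=99_000, base=0.2, seed=None):
    """Return a noisy stationary stream with a linear ramp starting at `change_at`.

    Hypothetical helper: mirrors stream1..stream4 with the slope as a parameter.
    """
    rng = random.Random(seed)  # seeded generator so a run can be reproduced
    # Uniform noise in [base - variance, base + variance)
    stream = [base + (2.0 * rng.random() - 1.0) * variance for _ in range(length)]
    # From `change_at` onward the offset grows by `slope` per step
    for t in range(change_at, length):
        stream[t] += slope * (t - change_at + 1)
    return stream


# Same shape of stream as stream2(0.005), but reproducible through the seed
data = make_stream(slope=0.0002, variance=0.005, seed=0)
print(len(data))
```

With such a helper, a call like `make_stream(0.0003, 0.01)` would play the role of `stream3(0.01)`.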

## Implementation details
For each trial:

    create a new drift detector instance
    create a new data stream

    for each data instance:

        feed the drift detector with the data
        if a change is detected, increment the true alarm count
        check whether the reported true change point lies within the specified number of steps

Calculate and report the PWM, the number of true alarms, the average delay time, and the execution time.
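
The steps above can be sketched as one reusable function, so that each experiment cell reduces to a single call. This is an illustrative sketch only: `run_trials`, `ThresholdDetector` and `toy_stream` are hypothetical names, and the detector interface (`add_element`, `detected_change`, `get_drift_location` returning a number of steps to look back) is assumed from how `ANGLE` is used in the cells below. `ThresholdDetector` is a toy stand-in so that the sketch runs without `skmultiflow`; it is not a drift detection algorithm.

```python
import time


def run_trials(make_detector, make_stream, n_trials=100, change_at=99_000, window=100):
    """Run the per-trial loop and return the summary statistics as a dict."""
    delays, offsets = [], []
    start = time.time()
    for _ in range(n_trials):
        detector = make_detector()  # fresh detector per trial
        data = make_stream()        # fresh stream per trial
        for j, x in enumerate(data):
            detector.add_element(x)
            # Only alarms raised at or after the real change count as true alarms
            if detector.detected_change() and j >= change_at:
                delays.append(j - change_at)
                change_point = j - detector.get_drift_location()
                if abs(change_point - change_at) <= window:
                    offsets.append(change_point - change_at)
                break
    alarms = len(delays)
    nan = float("nan")  # reported when a ratio has an empty denominator
    return {
        "alarms": alarms,
        "pwm": len(offsets) / alarms if alarms else nan,
        "mean_delay": sum(delays) / alarms if alarms else nan,
        "mean_offset": abs(sum(offsets) / len(offsets)) if offsets else nan,
        "seconds": time.time() - start,
    }


class ThresholdDetector:
    """Hypothetical stand-in for ANGLE: flags a change when a value exceeds `limit`."""

    def __init__(self, limit=0.255, lookback=10):
        self.limit, self.lookback, self._flag = limit, lookback, False

    def add_element(self, x):
        self._flag = x > self.limit

    def detected_change(self):
        return self._flag

    def get_drift_location(self):
        return self.lookback  # pretend the change began `lookback` steps ago


def toy_stream():
    # 50 flat values followed by a noiseless ramp of 0.01 per step
    return [0.2] * 50 + [0.2 + 0.01 * k for k in range(1, 51)]


result = run_trials(ThresholdDetector, toy_stream, n_trials=3, change_at=50, window=10)
print(result["alarms"], result["pwm"], result["mean_delay"], result["mean_offset"])
```

Passing the detector and the stream as factories keeps the loop independent of any particular detector. With the notebook's objects the call would look like `run_trials(ANGLE, lambda: stream1(0.005))`, assuming `ANGLE` exposes the three methods used here. Unlike the cells below, the sketch returns `nan` instead of raising a division error when no alarm is recorded.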

## Data stream with variance of 0.005

In [6]:
# slope = 0.0001
true_alarm = 0
true_change = 0
true_change_points = []
delay_time = []
start_time = time.time()

for i in range(100):
    angle = ANGLE()
    data = stream1(0.005)
    for j in range(100000):
        angle.add_element(data[j])
        if angle.detected_change() and j >= 99000:
            delay_time.append(j-99000)
            true_alarm += 1
            true_change_point = j - angle.get_drift_location()
            if true_change_point <=99100 and true_change_point >= 98900:
                true_change += 1
                true_change_points.append(true_change_point-99000)
            break

print("The PWM (within 100 steps) among 100 trials is", true_change * 1.0 / true_alarm)
print("The change detected number among 100 trials is", true_alarm)
print("The average delay time is", sum(delay_time) * 1.0 / true_alarm)
print("The mean value of steps in change angle is", abs(sum(true_change_points) * 1.0 / true_change))
print("The time for the whole tests in seconds is", time.time() - start_time)


The PWM (within 100 steps) among 100 trials is 0.67
The change detected number among 100 trials is 100
The average delay time is 295.0
The mean value of steps in change angle is 55.08955223880597
The time for the whole tests in seconds is 64.13860702514648


In [7]:
# slope = 0.0002
true_alarm = 0
true_change = 0
true_change_points = []
delay_time = []
start_time = time.time()

for i in range(100):
    angle = ANGLE()
    data = stream2(0.005)
    for j in range(100000):
        angle.add_element(data[j])
        if angle.detected_change() and j >= 99000:
            delay_time.append(j-99000)
            true_alarm += 1
            true_change_point = j - angle.get_drift_location()
            if true_change_point <=99100 and true_change_point >= 98900:
                true_change += 1
                true_change_points.append(true_change_point-99000)
            break

print("The PWM (within 100 steps) among 100 trials is", true_change * 1.0 / true_alarm)
print("The change detected number among 100 trials is", true_alarm)
print("The average delay time is", sum(delay_time) * 1.0 / true_alarm)
print("The mean value of steps in change angle is", abs(sum(true_change_points) * 1.0 / true_change))
print("The time for the whole tests in seconds is", time.time() - start_time)


The PWM (within 100 steps) among 100 trials is 1.0
The change detected number among 100 trials is 100
The average delay time is 231.0
The mean value of steps in change angle is 25.0
The time for the whole tests in seconds is 63.54783487319946


In [8]:
# slope = 0.0003
true_alarm = 0
true_change = 0
true_change_points = []
delay_time = []
start_time = time.time()

for i in range(100):
    angle = ANGLE()
    data = stream3(0.005)
    for j in range(100000):
        angle.add_element(data[j])
        if angle.detected_change() and j >= 99000:
            delay_time.append(j-99000)
            true_alarm += 1
            true_change_point = j - angle.get_drift_location()
            if true_change_point <=99100 and true_change_point >= 98900:
                true_change += 1
                true_change_points.append(true_change_point-99000)
            break

print("The PWM (within 100 steps) among 100 trials is", true_change * 1.0 / true_alarm)
print("The change detected number among 100 trials is", true_alarm)
print("The average delay time is", sum(delay_time) * 1.0 / true_alarm)
print("The mean value of steps in change angle is", abs(sum(true_change_points) * 1.0 / true_change))
print("The time for the whole tests in seconds is", time.time() - start_time)


The PWM (within 100 steps) among 100 trials is 0.51
The change detected number among 100 trials is 100
The average delay time is 174.04
The mean value of steps in change angle is 51.35294117647059
The time for the whole tests in seconds is 63.54183506965637


In [9]:
# slope = 0.0004
true_alarm = 0
true_change = 0
true_change_points = []
delay_time = []
start_time = time.time()

for i in range(100):
    angle = ANGLE()
    data = stream4(0.005)
    for j in range(100000):
        angle.add_element(data[j])
        if angle.detected_change() and j >= 99000:
            delay_time.append(j-99000)
            true_alarm += 1
            true_change_point = j - angle.get_drift_location()
            if true_change_point <=99100 and true_change_point >= 98900:
                true_change += 1
                true_change_points.append(true_change_point-99000)
            break

print("The PWM (within 100 steps) among 100 trials is", true_change * 1.0 / true_alarm)
print("The change detected number among 100 trials is", true_alarm)
print("The average delay time is", sum(delay_time) * 1.0 / true_alarm)
print("The mean value of steps in change angle is", abs(sum(true_change_points) * 1.0 / true_change))
print("The time for the whole tests in seconds is", time.time() - start_time)


The PWM (within 100 steps) among 100 trials is 1.0
The change detected number among 100 trials is 100
The average delay time is 167.0
The mean value of steps in change angle is 25.0
The time for the whole tests in seconds is 63.39467740058899


## Data stream with variance of 0.01

In [10]:
# slope = 0.0001
true_alarm = 0
true_change = 0
true_change_points = []
delay_time = []
start_time = time.time()

for i in range(100):
    angle = ANGLE()
    data = stream1(0.01)
    for j in range(100000):
        angle.add_element(data[j])
        if angle.detected_change() and j >= 99000:
            delay_time.append(j-99000)
            true_alarm += 1
            true_change_point = j - angle.get_drift_location()
            if true_change_point <=99100 and true_change_point >= 98900:
                true_change += 1
                true_change_points.append(true_change_point-99000)
            break

print("The PWM (within 100 steps) among 100 trials is", true_change * 1.0 / true_alarm)
print("The change detected number among 100 trials is", true_alarm)
print("The average delay time is", sum(delay_time) * 1.0 / true_alarm)
print("The mean value of steps in change angle is", abs(sum(true_change_points) * 1.0 / true_change))
print("The time for the whole tests in seconds is", time.time() - start_time)

The PWM (within 100 steps) among 100 trials is 0.73
The change detected number among 100 trials is 100
The average delay time is 315.16
The mean value of steps in change angle is 36.83561643835616
The time for the whole tests in seconds is 63.66015100479126


In [11]:
# slope = 0.0002
true_alarm = 0
true_change = 0
true_change_points = []
delay_time = []
start_time = time.time()

for i in range(100):
    angle = ANGLE()
    data = stream2(0.01)
    for j in range(100000):
        angle.add_element(data[j])
        if angle.detected_change() and j >= 99000:
            delay_time.append(j-99000)
            true_alarm += 1
            true_change_point = j - angle.get_drift_location()
            if true_change_point <=99100 and true_change_point >= 98900:
                true_change += 1
                true_change_points.append(true_change_point-99000)
            break

print("The PWM (within 100 steps) among 100 trials is", true_change * 1.0 / true_alarm)
print("The change detected number among 100 trials is", true_alarm)
print("The average delay time is", sum(delay_time) * 1.0 / true_alarm)
print("The mean value of steps in change angle is", abs(sum(true_change_points) * 1.0 / true_change))
print("The time for the whole tests in seconds is", time.time() - start_time)


The PWM (within 100 steps) among 100 trials is 1.0
The change detected number among 100 trials is 100
The average delay time is 231.0
The mean value of steps in change angle is 32.68
The time for the whole tests in seconds is 63.569565296173096


In [12]:
# slope = 0.0003
true_alarm = 0
true_change = 0
true_change_points = []
delay_time = []
start_time = time.time()

for i in range(100):
    angle = ANGLE()
    data = stream3(0.01)
    for j in range(100000):
        angle.add_element(data[j])
        if angle.detected_change() and j >= 99000:
            delay_time.append(j-99000)
            true_alarm += 1
            true_change_point = j - angle.get_drift_location()
            if true_change_point <=99100 and true_change_point >= 98900:
                true_change += 1
                true_change_points.append(true_change_point-99000)
            break

print("The PWM (within 100 steps) among 100 trials is", true_change * 1.0 / true_alarm)
print("The change detected number among 100 trials is", true_alarm)
print("The average delay time is", sum(delay_time) * 1.0 / true_alarm)
print("The mean value of steps in change angle is", abs(sum(true_change_points) * 1.0 / true_change))
print("The time for the whole tests in seconds is", time.time() - start_time)

The PWM (within 100 steps) among 100 trials is 0.93
The change detected number among 100 trials is 100
The average delay time is 194.84
The mean value of steps in change angle is 26.72043010752688
The time for the whole tests in seconds is 63.37738394737244


In [13]:
# slope = 0.0004
true_alarm = 0
true_change = 0
true_change_points = []
delay_time = []
start_time = time.time()

for i in range(100):
    angle = ANGLE()
    data = stream4(0.01)
    for j in range(100000):
        angle.add_element(data[j])
        if angle.detected_change() and j >= 99000:
            delay_time.append(j-99000)
            true_alarm += 1
            true_change_point = j - angle.get_drift_location()
            if true_change_point <=99100 and true_change_point >= 98900:
                true_change += 1
                true_change_points.append(true_change_point-99000)
            break

print("The PWM (within 100 steps) among 100 trials is", true_change * 1.0 / true_alarm)
print("The change detected number among 100 trials is", true_alarm)
print("The average delay time is", sum(delay_time) * 1.0 / true_alarm)
print("The mean value of steps in change angle is", abs(sum(true_change_points) * 1.0 / true_change))
print("The time for the whole tests in seconds is", time.time() - start_time)

The PWM (within 100 steps) among 100 trials is 1.0
The change detected number among 100 trials is 100
The average delay time is 167.0
The mean value of steps in change angle is 25.96
The time for the whole tests in seconds is 63.790430784225464


## Data stream with variance of 0.015

In [14]:
# slope = 0.0001
true_alarm = 0
true_change = 0
true_change_points = []
delay_time = []
start_time = time.time()

for i in range(100):
    angle = ANGLE()
    data = stream1(0.015)
    for j in range(100000):
        angle.add_element(data[j])
        if angle.detected_change() and j >= 99000:
            delay_time.append(j-99000)
            true_alarm += 1
            true_change_point = j - angle.get_drift_location()
            if true_change_point <=99100 and true_change_point >= 98900:
                true_change += 1
                true_change_points.append(true_change_point-99000)
            break

print("The PWM (within 100 steps) among 100 trials is", true_change * 1.0 / true_alarm)
print("The change detected number among 100 trials is", true_alarm)
print("The average delay time is", sum(delay_time) * 1.0 / true_alarm)
print("The mean value of steps in change angle is", abs(sum(true_change_points) * 1.0 / true_change))
print("The time for the whole tests in seconds is", time.time() - start_time)

The PWM (within 100 steps) among 100 trials is 0.78
The change detected number among 100 trials is 100
The average delay time is 322.52
The mean value of steps in change angle is 41.82051282051282
The time for the whole tests in seconds is 63.598029136657715


In [15]:
# slope = 0.0002
true_alarm = 0
true_change = 0
true_change_points = []
delay_time = []
start_time = time.time()

for i in range(100):
    angle = ANGLE()
    data = stream2(0.015)
    for j in range(100000):
        angle.add_element(data[j])
        if angle.detected_change() and j >= 99000:
            delay_time.append(j-99000)
            true_alarm += 1
            true_change_point = j - angle.get_drift_location()
            if true_change_point <=99100 and true_change_point >= 98900:
                true_change += 1
                true_change_points.append(true_change_point-99000)
            break

print("The PWM (within 100 steps) among 100 trials is", true_change * 1.0 / true_alarm)
print("The change detected number among 100 trials is", true_alarm)
print("The average delay time is", sum(delay_time) * 1.0 / true_alarm)
print("The mean value of steps in change angle is", abs(sum(true_change_points) * 1.0 / true_change))
print("The time for the whole tests in seconds is", time.time() - start_time)

The PWM (within 100 steps) among 100 trials is 0.93
The change detected number among 100 trials is 100
The average delay time is 231.0
The mean value of steps in change angle is 40.82795698924731
The time for the whole tests in seconds is 63.60980772972107


In [16]:
# slope = 0.0003
true_alarm = 0
true_change = 0
true_change_points = []
delay_time = []
start_time = time.time()

for i in range(100):
    angle = ANGLE()
    data = stream3(0.015)
    for j in range(100000):
        angle.add_element(data[j])
        if angle.detected_change() and j >= 99000:
            delay_time.append(j-99000)
            true_alarm += 1
            true_change_point = j - angle.get_drift_location()
            if true_change_point <=99100 and true_change_point >= 98900:
                true_change += 1
                true_change_points.append(true_change_point-99000)
            break

print("The PWM (within 100 steps) among 100 trials is", true_change * 1.0 / true_alarm)
print("The change detected number among 100 trials is", true_alarm)
print("The average delay time is", sum(delay_time) * 1.0 / true_alarm)
print("The mean value of steps in change angle is", abs(sum(true_change_points) * 1.0 / true_change))
print("The time for the whole tests in seconds is", time.time() - start_time)

The PWM (within 100 steps) among 100 trials is 0.99
The change detected number among 100 trials is 100
The average delay time is 198.68
The mean value of steps in change angle is 34.37373737373738
The time for the whole tests in seconds is 63.58575391769409


In [17]:
# slope = 0.0004
true_alarm = 0
true_change = 0
true_change_points = []
delay_time = []
start_time = time.time()

for i in range(100):
    angle = ANGLE()
    data = stream4(0.015)
    for j in range(100000):
        angle.add_element(data[j])
        if angle.detected_change() and j >= 99000:
            delay_time.append(j-99000)
            true_alarm += 1
            true_change_point = j - angle.get_drift_location()
            if true_change_point <=99100 and true_change_point >= 98900:
                true_change += 1
                true_change_points.append(true_change_point-99000)
            break

print("The PWM (within 100 steps) among 100 trials is", true_change * 1.0 / true_alarm)
print("The change detected number among 100 trials is", true_alarm)
print("The average delay time is", sum(delay_time) * 1.0 / true_alarm)
print("The mean value of steps in change angle is", abs(sum(true_change_points) * 1.0 / true_change))
print("The time for the whole tests in seconds is", time.time() - start_time)

The PWM (within 100 steps) among 100 trials is 1.0
The change detected number among 100 trials is 100
The average delay time is 167.0
The mean value of steps in change angle is 29.48
The time for the whole tests in seconds is 63.6290168762207


## Data stream with variance of 0.02

In [18]:
# slope = 0.0001
true_alarm = 0
true_change = 0
true_change_points = []
delay_time = []
start_time = time.time()

for i in range(100):
    angle = ANGLE()
    data = stream1(0.02)
    for j in range(100000):
        angle.add_element(data[j])
        if angle.detected_change() and j >= 99000:
            delay_time.append(j-99000)
            true_alarm += 1
            true_change_point = j - angle.get_drift_location()
            if true_change_point <=99100 and true_change_point >= 98900:
                true_change += 1
                true_change_points.append(true_change_point-99000)
            break

print("The PWM (within 100 steps) among 100 trials is", true_change * 1.0 / true_alarm)
print("The change detected number among 100 trials is", true_alarm)
print("The average delay time is", sum(delay_time) * 1.0 / true_alarm)
print("The mean value of steps in change angle is", abs(sum(true_change_points) * 1.0 / true_change))
print("The time for the whole tests in seconds is", time.time() - start_time)

The PWM (within 100 steps) among 100 trials is 0.72
The change detected number among 100 trials is 100
The average delay time is 325.72
The mean value of steps in change angle is 37.44444444444444
The time for the whole tests in seconds is 63.616175174713135


In [19]:
# slope = 0.0002
true_alarm = 0
true_change = 0
true_change_points = []
delay_time = []
start_time = time.time()

for i in range(100):
    angle = ANGLE()
    data = stream2(0.02)
    for j in range(100000):
        angle.add_element(data[j])
        if angle.detected_change() and j >= 99000:
            delay_time.append(j-99000)
            true_alarm += 1
            true_change_point = j - angle.get_drift_location()
            if true_change_point <=99100 and true_change_point >= 98900:
                true_change += 1
                true_change_points.append(true_change_point-99000)
            break

print("The PWM (within 100 steps) among 100 trials is", true_change * 1.0 / true_alarm)
print("The change detected number among 100 trials is", true_alarm)
print("The average delay time is", sum(delay_time) * 1.0 / true_alarm)
print("The mean value of steps in change angle is", abs(sum(true_change_points) * 1.0 / true_change))
print("The time for the whole tests in seconds is", time.time() - start_time)

The PWM (within 100 steps) among 100 trials is 0.9
The change detected number among 100 trials is 100
The average delay time is 231.0
The mean value of steps in change angle is 41.355555555555554
The time for the whole tests in seconds is 63.5576069355011


In [20]:
# slope = 0.0003
true_alarm = 0
true_change = 0
true_change_points = []
delay_time = []
start_time = time.time()

for i in range(100):
    angle = ANGLE()
    data = stream3(0.02)
    for j in range(100000):
        angle.add_element(data[j])
        if angle.detected_change() and j >= 99000:
            delay_time.append(j-99000)
            true_alarm += 1
            true_change_point = j - angle.get_drift_location()
            if true_change_point <=99100 and true_change_point >= 98900:
                true_change += 1
                true_change_points.append(true_change_point-99000)
            break

print("The PWM (within 100 steps) among 100 trials is", true_change * 1.0 / true_alarm)
print("The change detected number among 100 trials is", true_alarm)
print("The average delay time is", sum(delay_time) * 1.0 / true_alarm)
print("The mean value of steps in change angle is", abs(sum(true_change_points) * 1.0 / true_change))
print("The time for the whole tests in seconds is", time.time() - start_time)

The PWM (within 100 steps) among 100 trials is 0.99
The change detected number among 100 trials is 100
The average delay time is 199.0
The mean value of steps in change angle is 35.02020202020202
The time for the whole tests in seconds is 63.761653900146484


In [21]:
# slope = 0.0004
true_alarm = 0
true_change = 0
true_change_points = []
delay_time = []
start_time = time.time()

for i in range(100):
    angle = ANGLE()
    data = stream4(0.02)
    for j in range(100000):
        angle.add_element(data[j])
        if angle.detected_change() and j >= 99000:
            delay_time.append(j-99000)
            true_alarm += 1
            true_change_point = j - angle.get_drift_location()
            if true_change_point <=99100 and true_change_point >= 98900:
                true_change += 1
                true_change_points.append(true_change_point-99000)
            break

print("The PWM (within 100 steps) among 100 trials is", true_change * 1.0 / true_alarm)
print("The change detected number among 100 trials is", true_alarm)
print("The average delay time is", sum(delay_time) * 1.0 / true_alarm)
print("The mean value of steps in change angle is", abs(sum(true_change_points) * 1.0 / true_change))
print("The time for the whole tests in seconds is", time.time() - start_time)

The PWM (within 100 steps) among 100 trials is 0.98
The change detected number among 100 trials is 100
The average delay time is 167.0
The mean value of steps in change angle is 29.897959183673468
The time for the whole tests in seconds is 63.585938930511475


## Result table of this experiment

$\sigma$ = 0.005

| Slope | SEED TP rate (%) | SEED delay (steps) | ANGLE PWM | ANGLE mean TC point offset (steps) |
| --- | --- | --- | --- | --- |
| 0.0001 | 100 | 295.32 | 0.66 | 52.15 |
| 0.0002 | 100 | 231.00 | 1.00 | 25.32 |
| 0.0003 | 100 | 176.28 | 0.50 | 44.20 |
| 0.0004 | 100 | 167.00 | 1.00 | 25.00 |

$\sigma$ = 0.01

| Slope | SEED TP rate (%) | SEED delay (steps) | ANGLE PWM | ANGLE mean TC point offset (steps) |
| --- | --- | --- | --- | --- |
| 0.0001 | 100 | 316.76 | 0.80 | 38.20 |
| 0.0002 | 100 | 231.00 | 1.00 | 33.73 |
| 0.0003 | 100 | 193.24 | 0.93 | 26.51 |
| 0.0004 | 100 | 167.00 | 1.00 | 25.64 |

$\sigma$ = 0.015

| Slope | SEED TP rate (%) | SEED delay (steps) | ANGLE PWM | ANGLE mean TC point offset (steps) |
| --- | --- | --- | --- | --- |
| 0.0001 | 100 | 323.16 | 0.78 | 38.97 |
| 0.0002 | 100 | 231.00 | 0.93 | 42.93 |
| 0.0003 | 100 | 198.04 | 0.99 | 27.94 |
| 0.0004 | 100 | 167.00 | 1.00 | 27.24 |

$\sigma$ = 0.02

| Slope | SEED TP rate (%) | SEED delay (steps) | ANGLE PWM | ANGLE mean TC point offset (steps) |
| --- | --- | --- | --- | --- |
| 0.0001 | 100 | 326.04 | 0.72 | 44.01 |
| 0.0002 | 100 | 231.00 | 0.90 | 41.71 |
| 0.0003 | 100 | 198.68 | 0.98 | 35.78 |
| 0.0004 | 100 | 167.00 | 0.99 | 30.17 |