# Batch Testing Tutorial

This tutorial serves two purposes:

1. Familiarize readers with our code.
2. Reproduce the results in [Modeling and Computation of High Efficiency and Efficacy Multi-Step Batch Testing for Infectious Diseases](https://arxiv.org/abs/2006.16079).




In [1]:
import numpy as np
import pandas as pd
from numba import njit, jit
from sklearn.metrics import recall_score, precision_score
import time
import numba
import fast_btk as fbtk
from numba.core.errors import NumbaDeprecationWarning, NumbaPendingDeprecationWarning, NumbaPerformanceWarning
import warnings
warnings.filterwarnings('ignore')
warnings.filterwarnings("ignore", category=NumbaPerformanceWarning)
warnings.simplefilter('ignore', category=NumbaDeprecationWarning)
warnings.simplefilter('ignore', category=NumbaPendingDeprecationWarning)
%load_ext autoreload
%autoreload 2

# Data generation

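The function `data_gen` generates a population of a given size $N$ and infection rate $p$. Each row of the returned array holds a subject's index and infection status (1 = infected, 0 = not infected).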

In [2]:
np.random.seed(0)
a = fbtk.data_gen(size = 10, p = 0.1)
print(a)

[[0 0]
 [1 0]
 [2 0]
 [3 0]
 [4 0]
 [5 0]
 [6 0]
 [7 0]
 [8 1]
 [9 0]]
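To make the output format concrete, here is a minimal sketch of a generator that produces the same two-column layout. `data_gen_sketch` is a hypothetical stand-in written for this tutorial, not the implementation inside `fast_btk`, and it assumes each subject is infected independently with probability `p`.

```python
import numpy as np

def data_gen_sketch(size, p, rng=None):
    """Hypothetical stand-in for fbtk.data_gen (not the library's code)."""
    rng = np.random.default_rng() if rng is None else rng
    ids = np.arange(size)
    # Assumption: each subject is infected independently with probability p.
    status = (rng.random(size) < p).astype(np.int64)
    # Column 0 is the subject index, column 1 is the infection status.
    return np.column_stack((ids, status))

population = data_gen_sketch(10, 0.1, rng=np.random.default_rng(0))
print(population.shape)
```

Passing an explicit `rng` keeps the sketch reproducible without touching the global NumPy seed.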


# Conventional Test

`conventional_test` takes a subject array, the probability of type II error (false negative), the probability of type I error (false positive), the number of repetitions, and a flag for sequential testing. It returns the test results and the number of test kits consumed.


In [3]:
subject_array = fbtk.data_gen(10, 0.1)
test_result, consum = fbtk.conventional_test(subject_array, typeII_error = 0.15,
typeI_error=0.01, repeat= 1)
print(f'accuracy: {np.mean(subject_array[:,1] == test_result[:,1])}')
print(f'test consumption {consum}')

accuracy: 1.0
test consumption 10
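The role of the two error probabilities can be illustrated with a small simulation. The function `noisy_individual_test` below is a hypothetical helper, not part of `fast_btk`; it assumes one test per subject, with infected subjects missed with probability `typeII_error` and healthy subjects flagged with probability `typeI_error`.

```python
import numpy as np

def noisy_individual_test(status, typeII_error, typeI_error, rng):
    """Simulate one imperfect test per subject (illustrative only)."""
    status = np.asarray(status)
    u = rng.random(status.shape[0])
    # Infected subjects test positive unless a false negative occurs.
    positive_if_infected = u >= typeII_error
    # Healthy subjects test positive only when a false positive occurs.
    positive_if_healthy = u < typeI_error
    return np.where(status == 1, positive_if_infected, positive_if_healthy).astype(np.int64)

rng = np.random.default_rng(0)
truth = (rng.random(100_000) < 0.05).astype(np.int64)
pred = noisy_individual_test(truth, typeII_error=0.15, typeI_error=0.01, rng=rng)

sensitivity = pred[truth == 1].mean()
specificity = 1 - pred[truth == 0].mean()
print(f"sensitivity ~ {sensitivity:.3f}, specificity ~ {specificity:.3f}")
```

Under these assumptions the sensitivity should sit near `1 - typeII_error` and the specificity near `1 - typeI_error`.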


# Multi-step Batch Testing

`seq_test` takes the subject array, the stop rule, the batch size, the infection rate, the probability of type II error, the probability of type I error, the number of repetitions, the probability threshold, and a flag for sequential testing. It returns the test results, the total test-kit consumption, and the number of individual tests.

The following code generates a population of size 100000 with an infection rate of 0.01. The multi-step batch test is configured with `stop_rule = 3`, `repeat = 3`, and `seq = True`, that is, up to 3 sequential individual tests for 3 batch positives.

In [4]:
subject_array = fbtk.data_gen(100000, 0.01)
batch_size = fbtk.one_batch_test_int_solver(0.01, 0.15, 0.01)
test_result, consum, ind_consum = fbtk.seq_test(
    subject_array, batch_size=batch_size, stop_rule=3, p=0.01,
    typeII_error=0.15, typeI_error=0.01, repeat=3, seq=True)
print(f'accuracy: {np.mean(subject_array[:,1] == test_result[:,1])}')
print(f'test consumption {consum}')

accuracy: 0.99907
test consumption 27994.0
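The saving in test kits comes from pooling: a batch that tests negative clears all of its members with a single kit. The sketch below shows the simplest version of that idea, a single pooling round followed by individual tests for positive batches. It is a simplification written for illustration, not the multi-step logic of `fbtk.seq_test`; `pooled_test` and `one_round_batch_test` are hypothetical names, and the stop rule, repetitions, and probability threshold are not modeled.

```python
import numpy as np

def pooled_test(statuses, typeII_error, typeI_error, rng):
    """Return 1 if the pooled sample tests positive, else 0 (illustrative)."""
    truly_positive = bool(np.any(np.asarray(statuses) == 1))
    u = rng.random()
    if truly_positive:
        return int(u >= typeII_error)  # missed with probability typeII_error
    return int(u < typeI_error)        # false alarm with probability typeI_error

def one_round_batch_test(status, batch_size, typeII_error, typeI_error, rng):
    """Pool once, then test every member of a positive pool individually."""
    status = np.asarray(status)
    pred = np.zeros_like(status)
    tests_used = 0
    for start in range(0, len(status), batch_size):
        batch = status[start:start + batch_size]
        tests_used += 1  # one kit for the pooled sample
        if pooled_test(batch, typeII_error, typeI_error, rng):
            for offset, s in enumerate(batch):
                pred[start + offset] = pooled_test([s], typeII_error, typeI_error, rng)
                tests_used += 1  # one kit per individual follow-up
    return pred, tests_used

rng = np.random.default_rng(0)
truth = (rng.random(10_000) < 0.01).astype(np.int64)
pred, tests_used = one_round_batch_test(truth, 10, 0.15, 0.01, rng)
print(f"accuracy: {np.mean(pred == truth):.4f}, tests used: {tests_used}")
```

With a low infection rate most batches are negative, so the number of kits used stays well below one per subject.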


# Reproduce Results

The following code reproduces the results in Table 7 and Table 8 in the Appendix. We walk through Table 7 (a) and show its output.

In [6]:
# table 7 (a)
time_start = time.time()
np.random.seed(0)
df = pd.DataFrame([], columns = ['Infection_rate', 'Acc', 'Acc_SD', 'Sens', 'Sens_SD', 'Spec','Spec_SD','PPV', 'PPV_SD',
'NPV', 'NPV_SD', 'Test_consum', 'Test_consum_SD'])
for prob in [0.001, 0.01, 0.03, 0.05, 0.10]:
    temp_data = [fbtk.data_gen(100000, prob) for _ in range(100)]
    length = len(temp_data)
    acc = np.zeros(length)
    sens = np.zeros(length)
    spec = np.zeros(length)
    ppv = np.zeros(length)
    npv = np.zeros(length)
    test_consum = np.zeros(length)
    for i in range(length):
        pred, consum = fbtk.conventional_test(temp_data[i], typeII_error= 0.15, typeI_error=0.01)
        acc[i] = np.mean(pred[:,1] == temp_data[i][:, 1])
        sens[i] = recall_score(temp_data[i][:, 1], pred[:, 1])
        spec[i] = fbtk.specificity_score(temp_data[i][:, 1], pred[:, 1])
        ppv[i] = precision_score(temp_data[i][:, 1], pred[:, 1])
        npv[i] = fbtk.npv_score(temp_data[i][:, 1], pred[:, 1])
        test_consum[i] = consum
    result = {
        'acc': acc,
        'sens': sens,
        'spec': spec,
        'PPV': ppv,
        'NPV': npv,
        'test_consum': test_consum
    
    }
    result = pd.DataFrame(result)
    result_mean = result.mean()
    result_std = result.std()
    temp_df = [prob, result_mean['acc'], result_std['acc'], result_mean['sens'], result_std['sens'],
    result_mean['spec'], result_std['spec'], result_mean['PPV'], result_std['PPV'], result_mean['NPV'],
    result_std['NPV'], result_mean['test_consum'], result_std['test_consum']]
    temp_df = pd.DataFrame(temp_df)
    temp_df = temp_df.T
    temp_df.columns = df.columns
    df = pd.concat([df, temp_df])


  
time_end = time.time()
print('time cost:', time_end - time_start, 's')

time cost: 152.85753273963928 s


In [7]:
# Show the result
df

| | Infection_rate | Acc | Acc_SD | Sens | Sens_SD | Spec | Spec_SD | PPV | PPV_SD | NPV | NPV_SD | Test_consum | Test_consum_SD |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.001 | 0.989861 | 0.000327 | 0.852501 | 0.035635 | 0.990001 | 0.000326 | 0.079851 | 0.008803 | 0.999849 | 3.9e-05 | 100000.0 | 0.0 |
| 0 | 0.01 | 0.988598 | 0.000316 | 0.849064 | 0.01049 | 0.990009 | 0.0003 | 0.462084 | 0.010728 | 0.998461 | 0.000114 | 100000.0 | 0.0 |
| 0 | 0.03 | 0.985887 | 0.000393 | 0.849974 | 0.006703 | 0.990073 | 0.000319 | 0.725062 | 0.007459 | 0.995355 | 0.000219 | 100000.0 | 0.0 |
| 0 | 0.05 | 0.983061 | 0.000396 | 0.85093 | 0.005089 | 0.990012 | 0.000326 | 0.817558 | 0.005445 | 0.992142 | 0.0003 | 100000.0 | 0.0 |
| 0 | 0.1 | 0.975959 | 0.000442 | 0.849704 | 0.003048 | 0.98999 | 0.000335 | 0.904148 | 0.003147 | 0.983409 | 0.000367 | 100000.0 | 0.0 |
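The `Spec` and `NPV` columns come from `fbtk.specificity_score` and `fbtk.npv_score`, whose source is not shown here. The sketch below uses the standard confusion-matrix definitions as an assumption about what they compute; `specificity_sketch`, `npv_sketch`, and `expected_ppv` are hypothetical names. The last helper applies Bayes' rule to show why PPV is so low at small infection rates even though specificity is about 0.99.

```python
import numpy as np

def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels coded as 0/1."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))
    tn = int(np.sum((y_true == 0) & (y_pred == 0)))
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))
    return tp, fp, tn, fn

def specificity_sketch(y_true, y_pred):
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return tn / (tn + fp)  # share of healthy subjects correctly cleared

def npv_sketch(y_true, y_pred):
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return tn / (tn + fn)  # share of negative results that are correct

def expected_ppv(p, sens, spec):
    """PPV implied by prevalence p, sensitivity, and specificity."""
    return sens * p / (sens * p + (1 - spec) * (1 - p))

y_true = [1, 1, 0, 0, 0, 0]
y_pred = [1, 0, 1, 0, 0, 0]
print(specificity_sketch(y_true, y_pred), npv_sketch(y_true, y_pred))  # 0.75 0.75
print(f"expected PPV at p = 0.001: {expected_ppv(0.001, 0.85, 0.99):.3f}")
```

With sensitivity 0.85 and specificity 0.99, the implied PPV is roughly 0.08 at `p = 0.001` and roughly 0.90 at `p = 0.10`, in line with the simulated `PPV` column.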


For table 7 (b)

In [5]:
# table 7 (b)
time_start = time.time()
np.random.seed(0)
df = pd.DataFrame([], columns = ['Infection_rate', 'Sequential_test', 'Stop_rule', 'Repeat', 'Prob_threshold', 'Acc', 'Acc_SD', 'Sens', 'Sens_SD', 'Spec','Spec_SD','PPV', 'PPV_SD',
'NPV', 'NPV_SD', 'Test_consum', 'Test_consum_SD', 'Ind_consum', 'Ind_consum_SD', 'Batch_consum','Batch_consum_SD'])
for prob in [0.001, 0.01, 0.03, 0.05, 0.10]:
    temp_data = [fbtk.data_gen(100000, prob) for _ in range(100)]
    for i in [True]:
        for j in [1]:
            for k in [1]:
                
                kwargs = {'stop_rule': j, 'p': prob, 'batch_size': 10,
                'typeII_error': 0.15, 'typeI_error': 0.01, 'repeat': k,
                'prob_threshold': 0.3, 'seq': i}
                test_1 = fbtk.test_result(temp_data, fbtk.seq_test, **kwargs)
                temp_mean = test_1.mean()
                temp_std = test_1.std()
                temp = [kwargs['p'], kwargs['seq'], kwargs['stop_rule'], kwargs['repeat'], kwargs['prob_threshold'],temp_mean['acc'], temp_std['acc'], temp_mean['sens'], temp_std['sens'], temp_mean['spec'], temp_std['spec'], temp_mean['PPV'], temp_std['PPV'], temp_mean['NPV'], temp_std['NPV'], temp_mean['test_consum'], temp_std['test_consum'], temp_mean['ind_consum'], temp_std['ind_consum'], temp_mean['batch_consum'], temp_std['batch_consum']]
                temp_df = pd.DataFrame(temp)
                temp_df = temp_df.T
                temp_df.columns = ['Infection_rate', 'Sequential_test', 'Stop_rule', 'Repeat','Prob_threshold', 'Acc', 'Acc_SD', 'Sens', 'Sens_SD', 'Spec','Spec_SD','PPV', 'PPV_SD',
    'NPV', 'NPV_SD', 'Test_consum', 'Test_consum_SD', 'Ind_consum', 'Ind_consum_SD', 'Batch_consum','Batch_consum_SD']
                df = pd.concat([df, temp_df])
            
time_end = time.time()
print('time cost:', time_end - time_start, 's')

time cost: 112.00877380371094 s


In [6]:
df

| | Infection_rate | Sequential_test | Stop_rule | Repeat | Prob_threshold | Acc | Acc_SD | Sens | Sens_SD | Spec | ... | PPV | PPV_SD | NPV | NPV_SD | Test_consum | Test_consum_SD | Ind_consum | Ind_consum_SD | Batch_consum | Batch_consum_SD |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.001 | True | 1 | 1 | 0.3 | 0.999545 | 6.9e-05 | 0.72611 | 0.041315 | 0.999823 | ... | 0.807392 | 0.044873 | 0.999721 | 5.2e-05 | 11840.5 | 133.366473 | 1840.5 | 133.366473 | 10000.0 | 0.0 |
| 0 | 0.01 | True | 1 | 1 | 0.3 | 0.996432 | 0.000197 | 0.721298 | 0.01629 | 0.999193 | ... | 0.899801 | 0.010102 | 0.997208 | 0.000178 | 18981.1 | 259.295452 | 8981.1 | 259.295452 | 10000.0 | 0.0 |
| 0 | 0.03 | True | 1 | 1 | 0.3 | 0.989654 | 0.000349 | 0.724059 | 0.009308 | 0.997879 | ... | 0.913588 | 0.005681 | 0.99151 | 0.00031 | 33107.5 | 438.317202 | 23107.5 | 438.317202 | 10000.0 | 0.0 |
| 0 | 0.05 | True | 1 | 1 | 0.3 | 0.983083 | 0.000464 | 0.721876 | 0.007492 | 0.996817 | ... | 0.922653 | 0.003949 | 0.985541 | 0.000424 | 44693.3 | 500.375728 | 34693.3 | 500.375728 | 10000.0 | 0.0 |
| 0 | 0.1 | True | 1 | 1 | 0.3 | 0.967491 | 0.000608 | 0.72208 | 0.005245 | 0.994774 | ... | 0.938878 | 0.002826 | 0.969876 | 0.000563 | 65671.4 | 519.189384 | 55671.4 | 519.189384 | 10000.0 | 0.0 |
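The consumption columns can be sanity-checked by hand. In the output above, `Test_consum` equals `Batch_consum + Ind_consum`, and `Batch_consum` is 100000 / 10 = 10000. The sketch below estimates `Ind_consum` under the assumption that, with `stop_rule = 1` and `repeat = 1`, every member of a batch that tests positive receives exactly one individual test. `expected_consumption` is a hypothetical helper, and the assumption about how `seq_test` behaves in this setting is ours.

```python
def expected_consumption(n, batch_size, p, typeII_error, typeI_error):
    """Expected kits for one pooling round plus individual follow-ups."""
    n_batches = n / batch_size
    prob_clean = (1 - p) ** batch_size  # batch contains no infected subject
    prob_batch_positive = ((1 - typeII_error) * (1 - prob_clean)
                           + typeI_error * prob_clean)
    ind = n_batches * prob_batch_positive * batch_size
    return n_batches, ind, n_batches + ind

for p in [0.001, 0.01, 0.03, 0.05, 0.10]:
    batch, ind, total = expected_consumption(100_000, 10, p, 0.15, 0.01)
    print(f"p = {p}: batch {batch:.0f}, individual {ind:.0f}, total {total:.0f}")
```

Under this assumption the expected individual counts land within about one percent of the simulated `Ind_consum` means.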


In [218]:
df.to_csv('table7_b.csv')

For table 7 (c)

In [203]:
time_start = time.time()
np.random.seed(0)
df = pd.DataFrame([], columns = ['Infection_rate', 'Sequential_test', 'Stop_rule', 'Repeat', 'Prob_threshold', 'Acc', 'Acc_SD', 'Sens', 'Sens_SD', 'Spec','Spec_SD','PPV', 'PPV_SD',
'NPV', 'NPV_SD', 'Test_consum', 'Test_consum_SD', 'Ind_consum', 'Ind_consum_SD', 'Batch_consum','Batch_consum_SD'])
for prob in [0.001, 0.01, 0.03, 0.05, 0.10]:
    temp_data = [fbtk.data_gen(100000, prob) for _ in range(100)]
    for i in [True]:
        for j in [1]:
            for k in [3]:
                batch_size = fbtk.one_batch_test_int_solver(prob, 0.15, 0.01)
                kwargs = {'stop_rule': j, 'p': prob, 'batch_size': batch_size,
                'typeII_error': 0.15, 'typeI_error': 0.01, 'repeat': k,
                'prob_threshold': 0.3, 'seq': i}
                test_1 = fbtk.test_result(temp_data, fbtk.seq_test, **kwargs)
                temp_mean = test_1.mean()
                temp_std = test_1.std()
                temp = [kwargs['p'], kwargs['seq'], kwargs['stop_rule'], kwargs['repeat'], kwargs['prob_threshold'],temp_mean['acc'], temp_std['acc'], temp_mean['sens'], temp_std['sens'], temp_mean['spec'], temp_std['spec'], temp_mean['PPV'], temp_std['PPV'], temp_mean['NPV'], temp_std['NPV'], temp_mean['test_consum'], temp_std['test_consum'], temp_mean['ind_consum'], temp_std['ind_consum'], temp_mean['batch_consum'], temp_std['batch_consum']]
                temp_df = pd.DataFrame(temp)
                temp_df = temp_df.T
                temp_df.columns = ['Infection_rate', 'Sequential_test', 'Stop_rule', 'Repeat','Prob_threshold', 'Acc', 'Acc_SD', 'Sens', 'Sens_SD', 'Spec','Spec_SD','PPV', 'PPV_SD',
    'NPV', 'NPV_SD', 'Test_consum', 'Test_consum_SD', 'Ind_consum', 'Ind_consum_SD', 'Batch_consum','Batch_consum_SD']
                df = pd.concat([df, temp_df])
            
time_end = time.time()
print('time cost:', time_end - time_start, 's')

time cost: 132.43640065193176 s
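The table 7 (c) cell picks its batch size with `fbtk.one_batch_test_int_solver(prob, 0.15, 0.01)`. The solver's objective is not shown in this tutorial, so the sketch below is only a plausible illustration: a brute-force search for the batch size that minimizes the expected number of kits per subject in a single pooling round. `best_batch_size` and `kits_per_subject` are hypothetical names, and the result need not match what the library returns.

```python
def kits_per_subject(batch_size, p, typeII_error, typeI_error):
    """Expected kits per subject for one pooling round plus follow-ups."""
    prob_clean = (1 - p) ** batch_size
    prob_batch_positive = ((1 - typeII_error) * (1 - prob_clean)
                           + typeI_error * prob_clean)
    # 1 / batch_size for the pooled test, plus one follow-up per member
    # of a positive batch.
    return 1 / batch_size + prob_batch_positive

def best_batch_size(p, typeII_error, typeI_error, max_size=100):
    """Brute-force search over integer batch sizes (illustrative only)."""
    candidates = range(2, max_size + 1)
    return min(candidates,
               key=lambda b: kits_per_subject(b, p, typeII_error, typeI_error))

for p in [0.001, 0.01, 0.03, 0.05, 0.10]:
    b = best_batch_size(p, 0.15, 0.01)
    print(f"p = {p}: batch size {b}, "
          f"kits per subject {kits_per_subject(b, p, 0.15, 0.01):.3f}")
```

The search reflects a general trade-off: larger batches spread one pooled kit over more subjects but are more likely to test positive and trigger follow-ups, so the preferred size shrinks as the infection rate grows.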


In [233]:
# table 7 (d)
time_start = time.time()
np.random.seed(0)
df = pd.DataFrame([], columns = ['Infection_rate', 'Sq_Repeat', 'Ind_Repeat', 'Acc', 'Acc_SD', 'Sens', 'Sens_SD', 'Spec','Spec_SD','PPV', 'PPV_SD',
'NPV', 'NPV_SD', 'Test_consum', 'Test_consum_SD', 'Ind_consum', 'Ind_consum_SD', 'Batch_consum','Batch_consum_SD'])
for prob in [0.001, 0.01, 0.03, 0.05, 0.10]:
    temp_data = [fbtk.data_gen(100000, prob) for _ in range(100)]
    for i in [3]: # sq_repeat
        for j in [3]: # ind_repeat
            kwargs = {
                'side_length': 12,
                'typeII_error': 0.15,
                'typeI_error': 0.01,
                'sq_repeat': i,
                'ind_repeat': j
            }
            test_1 = fbtk.test_result(temp_data, fbtk.matrix_test, **kwargs)
            temp_mean = test_1.mean()
            temp_std = test_1.std()
            temp = [prob, kwargs['sq_repeat'], kwargs['ind_repeat'], temp_mean['acc'], temp_std['acc'], temp_mean['sens'], temp_std['sens'], temp_mean['spec'], temp_std['spec'], temp_mean['PPV'], temp_std['PPV'], temp_mean['NPV'], temp_std['NPV'], temp_mean['test_consum'], temp_std['test_consum'], temp_mean['ind_consum'], temp_std['ind_consum'], temp_mean['batch_consum'], temp_std['batch_consum']]
            temp_df = pd.DataFrame(temp)
            temp_df = temp_df.T
            temp_df.columns = ['Infection_rate', 'Sq_Repeat', 'Ind_Repeat', 'Acc', 'Acc_SD', 'Sens', 'Sens_SD', 'Spec','Spec_SD','PPV', 'PPV_SD',
'NPV', 'NPV_SD', 'Test_consum', 'Test_consum_SD', 'Ind_consum', 'Ind_consum_SD', 'Batch_consum','Batch_consum_SD']
            df = pd.concat([df, temp_df])

            
                
               
            
time_end = time.time()
print('time cost:', time_end - time_start, 's')

time cost: 153.5919063091278 s
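The table 7 (d) cell calls `fbtk.matrix_test` with `side_length = 12`. Its internals are not shown here. Assuming it follows the usual square-array pooling scheme, subjects are arranged in a `side_length` by `side_length` grid, every row and every column is pooled, and only subjects at the intersection of a positive row and a positive column go on to individual testing. The noise-free sketch below illustrates that assumed scheme; `matrix_candidates` is a hypothetical helper, and the error rates, `sq_repeat`, and `ind_repeat` are not modeled.

```python
import numpy as np

def matrix_candidates(status, side_length):
    """Flag subjects whose row pool and column pool are both positive."""
    grid = np.asarray(status).reshape(side_length, side_length)
    row_positive = grid.any(axis=1)  # one pooled test per row
    col_positive = grid.any(axis=0)  # one pooled test per column
    # A subject is a candidate only if both of its pools are positive.
    return np.outer(row_positive, col_positive).ravel()

status = np.zeros(144, dtype=np.int64)
status[[5, 30]] = 1  # two infected subjects in a 12 x 12 grid
candidates = matrix_candidates(status, 12)
print("pooled tests:", 2 * 12)
print("candidates for individual testing:", np.flatnonzero(candidates).tolist())
```

In this example 24 pooled tests narrow 144 subjects down to a handful of candidates, which include the two infected subjects plus the other intersections of their rows and columns.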


For table 7 (e)

In [208]:
time_start = time.time()
np.random.seed(0)
df = pd.DataFrame([], columns = ['Infection_rate', 'Sequential_test', 'Stop_rule', 'Repeat', 'Prob_threshold', 'Acc', 'Acc_SD', 'Sens', 'Sens_SD', 'Spec','Spec_SD','PPV', 'PPV_SD',
'NPV', 'NPV_SD', 'Test_consum', 'Test_consum_SD', 'Ind_consum', 'Ind_consum_SD', 'Batch_consum','Batch_consum_SD'])
for prob in [0.001, 0.01, 0.03, 0.05, 0.10]:
    temp_data = [fbtk.data_gen(100000, prob) for _ in range(100)]
    for i in [True]:
        for j in [3]: # stop_rule
            for k in [1]: # repeat
                batch_size = fbtk.one_batch_test_int_solver(prob, 0.15, 0.01)
                kwargs = {'stop_rule': j, 'p': prob, 'batch_size': batch_size,
                'typeII_error': 0.15, 'typeI_error': 0.01, 'repeat': k,
                'prob_threshold': 0.3, 'seq': i}
                test_1 = fbtk.test_result(temp_data, fbtk.seq_test, **kwargs)
                temp_mean = test_1.mean()
                temp_std = test_1.std()
                temp = [kwargs['p'], kwargs['seq'], kwargs['stop_rule'], kwargs['repeat'], kwargs['prob_threshold'],temp_mean['acc'], temp_std['acc'], temp_mean['sens'], temp_std['sens'], temp_mean['spec'], temp_std['spec'], temp_mean['PPV'], temp_std['PPV'], temp_mean['NPV'], temp_std['NPV'], temp_mean['test_consum'], temp_std['test_consum'], temp_mean['ind_consum'], temp_std['ind_consum'], temp_mean['batch_consum'], temp_std['batch_consum']]
                temp_df = pd.DataFrame(temp)
                temp_df = temp_df.T
                temp_df.columns = ['Infection_rate', 'Sequential_test', 'Stop_rule', 'Repeat','Prob_threshold', 'Acc', 'Acc_SD', 'Sens', 'Sens_SD', 'Spec','Spec_SD','PPV', 'PPV_SD',
    'NPV', 'NPV_SD', 'Test_consum', 'Test_consum_SD', 'Ind_consum', 'Ind_consum_SD', 'Batch_consum','Batch_consum_SD']
                df = pd.concat([df, temp_df])
            
time_end = time.time()
print('time cost:', time_end - time_start, 's')

time cost: 162.91796922683716 s


In [210]:
df.to_csv('table7_e.csv')

For table 7 (f)

In [8]:
time_start = time.time()
np.random.seed(0)
df = pd.DataFrame([], columns = ['Infection_rate', 'Sequential_test', 'Stop_rule', 'Repeat', 'Prob_threshold', 'Acc', 'Acc_SD', 'Sens', 'Sens_SD', 'Spec','Spec_SD','PPV', 'PPV_SD',
'NPV', 'NPV_SD', 'Test_consum', 'Test_consum_SD', 'Ind_consum', 'Ind_consum_SD', 'Batch_consum','Batch_consum_SD'])
for prob in [0.001, 0.01, 0.03, 0.05, 0.10]:
    temp_data = [fbtk.data_gen(100000, prob) for _ in range(100)]
    for i in [True]:
        for j in [3]: # stop_rule
            for k in [3]: # repeat
                batch_size = fbtk.one_batch_test_int_solver(prob, 0.15, 0.01)
                kwargs = {'stop_rule': j, 'p': prob, 'batch_size': batch_size,
                'typeII_error': 0.15, 'typeI_error': 0.01, 'repeat': k,
                'prob_threshold': 0.3, 'seq': i}
                test_1 = fbtk.test_result(temp_data, fbtk.seq_test, **kwargs)
                temp_mean = test_1.mean()
                temp_std = test_1.std()
                temp = [kwargs['p'], kwargs['seq'], kwargs['stop_rule'], kwargs['repeat'], kwargs['prob_threshold'],temp_mean['acc'], temp_std['acc'], temp_mean['sens'], temp_std['sens'], temp_mean['spec'], temp_std['spec'], temp_mean['PPV'], temp_std['PPV'], temp_mean['NPV'], temp_std['NPV'], temp_mean['test_consum'], temp_std['test_consum'], temp_mean['ind_consum'], temp_std['ind_consum'], temp_mean['batch_consum'], temp_std['batch_consum']]
                temp_df = pd.DataFrame(temp)
                temp_df = temp_df.T
                temp_df.columns = ['Infection_rate', 'Sequential_test', 'Stop_rule', 'Repeat','Prob_threshold', 'Acc', 'Acc_SD', 'Sens', 'Sens_SD', 'Spec','Spec_SD','PPV', 'PPV_SD',
    'NPV', 'NPV_SD', 'Test_consum', 'Test_consum_SD', 'Ind_consum', 'Ind_consum_SD', 'Batch_consum','Batch_consum_SD']
                df = pd.concat([df, temp_df])
            
time_end = time.time()
print('time cost:', time_end - time_start, 's')

time cost: 168.18494725227356 s


In [213]:
df.to_csv('table7_f.csv')
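The cells for tables 7 (b) through (f) repeat the same mean and standard-deviation bookkeeping. A helper along the lines of the hypothetical `summarize_runs` below could collapse that into one call per setting. It assumes the per-run results arrive as a DataFrame with one column per metric, as `fbtk.test_result` appears to return; the generated column names (`acc`, `acc_SD`, ...) follow the input and differ in case from the `Acc`, `Acc_SD` names used above.

```python
import pandas as pd

def summarize_runs(runs, **settings):
    """Build one summary row: settings, then <metric> and <metric>_SD pairs."""
    row = dict(settings)
    means, stds = runs.mean(), runs.std()
    for name in runs.columns:
        row[name] = means[name]
        row[f"{name}_SD"] = stds[name]
    return row

# Collect rows in a list and build the DataFrame once at the end.
rows = []
demo = pd.DataFrame({"acc": [0.98, 0.99, 1.00],
                     "test_consum": [100.0, 110.0, 120.0]})
rows.append(summarize_runs(demo, Infection_rate=0.01, Stop_rule=1))
summary = pd.DataFrame(rows)
print(summary)
```

Building the frame from a list of rows avoids concatenating onto an empty DataFrame inside the loop, and it yields a running index instead of the repeated `0` seen in the outputs above.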

In [236]:
# Appendix (a): same setup as table 7 (a), with typeII_error = 0.25 and typeI_error = 0.03
time_start = time.time()
np.random.seed(0)
df = pd.DataFrame([], columns = ['Infection_rate', 'Acc', 'Acc_SD', 'Sens', 'Sens_SD', 'Spec','Spec_SD','PPV', 'PPV_SD',
'NPV', 'NPV_SD', 'Test_consum', 'Test_consum_SD'])
for prob in [0.001, 0.01, 0.03, 0.05, 0.10]:
    temp_data = [fbtk.data_gen(100000, prob) for _ in range(100)]
    length = len(temp_data)
    acc = np.zeros(length)
    sens = np.zeros(length)
    spec = np.zeros(length)
    ppv = np.zeros(length)
    npv = np.zeros(length)
    test_consum = np.zeros(length)
    for i in range(length):
        pred, consum = fbtk.conventional_test(temp_data[i], typeII_error= 0.25, typeI_error=0.03)
        acc[i] = np.mean(pred[:,1] == temp_data[i][:, 1])
        sens[i] = recall_score(temp_data[i][:, 1], pred[:, 1])
        spec[i] = fbtk.specificity_score(temp_data[i][:, 1], pred[:, 1])
        ppv[i] = precision_score(temp_data[i][:, 1], pred[:, 1])
        npv[i] = fbtk.npv_score(temp_data[i][:, 1], pred[:, 1])
        test_consum[i] = consum
    result = {
        'acc': acc,
        'sens': sens,
        'spec': spec,
        'PPV': ppv,
        'NPV': npv,
        'test_consum': test_consum
    
    }
    result = pd.DataFrame(result)
    result_mean = result.mean()
    result_std = result.std()
    temp_df = [prob, result_mean['acc'], result_std['acc'], result_mean['sens'], result_std['sens'],
    result_mean['spec'], result_std['spec'], result_mean['PPV'], result_std['PPV'], result_mean['NPV'],
    result_std['NPV'], result_mean['test_consum'], result_std['test_consum']]
    temp_df = pd.DataFrame(temp_df)
    temp_df = temp_df.T
    temp_df.columns = df.columns
    df = pd.concat([df, temp_df])


  
time_end = time.time()
print('time cost:', time_end - time_start, 's')

time cost: 154.51919507980347 s


In [238]:
df.to_csv('appendix_a.csv')

In [239]:
# Appendix (b)
time_start = time.time()
np.random.seed(0)
df = pd.DataFrame([], columns = ['Infection_rate', 'Sequential_test', 'Stop_rule', 'Repeat', 'Prob_threshold', 'Acc', 'Acc_SD', 'Sens', 'Sens_SD', 'Spec','Spec_SD','PPV', 'PPV_SD',
'NPV', 'NPV_SD', 'Test_consum', 'Test_consum_SD', 'Ind_consum', 'Ind_consum_SD', 'Batch_consum','Batch_consum_SD'])
for prob in [0.001, 0.01, 0.03, 0.05, 0.10]:
    temp_data = [fbtk.data_gen(100000, prob) for _ in range(100)]
    for i in [True]:
        for j in [1]:
            for k in [1]:
                
                kwargs = {'stop_rule': j, 'p': prob, 'batch_size': 10,
                'typeII_error': 0.25, 'typeI_error': 0.03, 'repeat': k,
                'prob_threshold': 0.3, 'seq': i}
                test_1 = fbtk.test_result(temp_data, fbtk.seq_test, **kwargs)
                temp_mean = test_1.mean()
                temp_std = test_1.std()
                temp = [kwargs['p'], kwargs['seq'], kwargs['stop_rule'], kwargs['repeat'], kwargs['prob_threshold'],temp_mean['acc'], temp_std['acc'], temp_mean['sens'], temp_std['sens'], temp_mean['spec'], temp_std['spec'], temp_mean['PPV'], temp_std['PPV'], temp_mean['NPV'], temp_std['NPV'], temp_mean['test_consum'], temp_std['test_consum'], temp_mean['ind_consum'], temp_std['ind_consum'], temp_mean['batch_consum'], temp_std['batch_consum']]
                temp_df = pd.DataFrame(temp)
                temp_df = temp_df.T
                temp_df.columns = ['Infection_rate', 'Sequential_test', 'Stop_rule', 'Repeat','Prob_threshold', 'Acc', 'Acc_SD', 'Sens', 'Sens_SD', 'Spec','Spec_SD','PPV', 'PPV_SD',
    'NPV', 'NPV_SD', 'Test_consum', 'Test_consum_SD', 'Ind_consum', 'Ind_consum_SD', 'Batch_consum','Batch_consum_SD']
                df = pd.concat([df, temp_df])
            
time_end = time.time()
print('time cost:', time_end - time_start, 's')

time cost: 117.64089822769165 s


In [242]:
df.to_csv('appendix_b.csv')

In [245]:
# Appendix (c)
time_start = time.time()
np.random.seed(0)
df = pd.DataFrame([], columns = ['Infection_rate', 'Sequential_test', 'Stop_rule', 'Repeat', 'Prob_threshold', 'Acc', 'Acc_SD', 'Sens', 'Sens_SD', 'Spec','Spec_SD','PPV', 'PPV_SD',
'NPV', 'NPV_SD', 'Test_consum', 'Test_consum_SD', 'Ind_consum', 'Ind_consum_SD', 'Batch_consum','Batch_consum_SD'])
for prob in [0.001, 0.01, 0.03, 0.05, 0.10]:
    temp_data = [fbtk.data_gen(100000, prob) for _ in range(100)]
    for i in [True]:
        for j in [1]:
            for k in [3]:
                batch_size = fbtk.one_batch_test_int_solver(prob, 0.25, 0.03)
                kwargs = {'stop_rule': j, 'p': prob, 'batch_size': batch_size,
                'typeII_error': 0.25, 'typeI_error': 0.03, 'repeat': k,
                'prob_threshold': 0.3, 'seq': i}
                test_1 = fbtk.test_result(temp_data, fbtk.seq_test, **kwargs)
                temp_mean = test_1.mean()
                temp_std = test_1.std()
                temp = [kwargs['p'], kwargs['seq'], kwargs['stop_rule'], kwargs['repeat'], kwargs['prob_threshold'],temp_mean['acc'], temp_std['acc'], temp_mean['sens'], temp_std['sens'], temp_mean['spec'], temp_std['spec'], temp_mean['PPV'], temp_std['PPV'], temp_mean['NPV'], temp_std['NPV'], temp_mean['test_consum'], temp_std['test_consum'], temp_mean['ind_consum'], temp_std['ind_consum'], temp_mean['batch_consum'], temp_std['batch_consum']]
                temp_df = pd.DataFrame(temp)
                temp_df = temp_df.T
                temp_df.columns = ['Infection_rate', 'Sequential_test', 'Stop_rule', 'Repeat','Prob_threshold', 'Acc', 'Acc_SD', 'Sens', 'Sens_SD', 'Spec','Spec_SD','PPV', 'PPV_SD',
    'NPV', 'NPV_SD', 'Test_consum', 'Test_consum_SD', 'Ind_consum', 'Ind_consum_SD', 'Batch_consum','Batch_consum_SD']
                df = pd.concat([df, temp_df])
            
time_end = time.time()
print('time cost:', time_end - time_start, 's')

time cost: 132.2724049091339 s


In [247]:
df.to_csv('appendix_c.csv')

In [248]:
# Appendix (d)
time_start = time.time()
np.random.seed(0)
df = pd.DataFrame([], columns = ['Infection_rate', 'Sq_Repeat', 'Ind_Repeat', 'Acc', 'Acc_SD', 'Sens', 'Sens_SD', 'Spec','Spec_SD','PPV', 'PPV_SD',
'NPV', 'NPV_SD', 'Test_consum', 'Test_consum_SD', 'Ind_consum', 'Ind_consum_SD', 'Batch_consum','Batch_consum_SD'])
for prob in [0.001, 0.01, 0.03, 0.05, 0.10]:
    temp_data = [fbtk.data_gen(100000, prob) for _ in range(100)]
    for i in [3]: # sq_repeat
        for j in [3]: # ind_repeat
            kwargs = {
                'side_length': 12,
                'typeII_error': 0.25,
                'typeI_error': 0.03,
                'sq_repeat': i,
                'ind_repeat': j
            }
            test_1 = fbtk.test_result(temp_data, fbtk.matrix_test, **kwargs)
            temp_mean = test_1.mean()
            temp_std = test_1.std()
            temp = [prob, kwargs['sq_repeat'], kwargs['ind_repeat'], temp_mean['acc'], temp_std['acc'], temp_mean['sens'], temp_std['sens'], temp_mean['spec'], temp_std['spec'], temp_mean['PPV'], temp_std['PPV'], temp_mean['NPV'], temp_std['NPV'], temp_mean['test_consum'], temp_std['test_consum'], temp_mean['ind_consum'], temp_std['ind_consum'], temp_mean['batch_consum'], temp_std['batch_consum']]
            temp_df = pd.DataFrame(temp)
            temp_df = temp_df.T
            temp_df.columns = ['Infection_rate', 'Sq_Repeat', 'Ind_Repeat', 'Acc', 'Acc_SD', 'Sens', 'Sens_SD', 'Spec','Spec_SD','PPV', 'PPV_SD',
'NPV', 'NPV_SD', 'Test_consum', 'Test_consum_SD', 'Ind_consum', 'Ind_consum_SD', 'Batch_consum','Batch_consum_SD']
            df = pd.concat([df, temp_df])

            
                
               
            
time_end = time.time()
print('time cost:', time_end - time_start, 's')

time cost: 156.62915325164795 s


In [250]:
df.to_csv('appendix_d.csv')

In [251]:
# Appendix e
time_start = time.time()
np.random.seed(0)
df = pd.DataFrame([], columns = ['Infection_rate', 'Sequential_test', 'Stop_rule', 'Repeat', 'Prob_threshold', 'Acc', 'Acc_SD', 'Sens', 'Sens_SD', 'Spec','Spec_SD','PPV', 'PPV_SD',
'NPV', 'NPV_SD', 'Test_consum', 'Test_consum_SD', 'Ind_consum', 'Ind_consum_SD', 'Batch_consum','Batch_consum_SD'])
for prob in [0.001, 0.01, 0.03, 0.05, 0.10]:
    temp_data = [fbtk.data_gen(100000, prob) for _ in range(100)]
    for i in [True]:
        for j in [3]: # stop_rule
            for k in [1]: # repeat
                batch_size = fbtk.one_batch_test_int_solver(prob, 0.25, 0.03)
                kwargs = {'stop_rule': j, 'p': prob, 'batch_size': batch_size,
                'typeII_error': 0.25, 'typeI_error': 0.03, 'repeat': k,
                'prob_threshold': 0.3, 'seq': i}
                test_1 = fbtk.test_result(temp_data, fbtk.seq_test, **kwargs)
                temp_mean = test_1.mean()
                temp_std = test_1.std()
                temp = [kwargs['p'], kwargs['seq'], kwargs['stop_rule'], kwargs['repeat'], kwargs['prob_threshold'],temp_mean['acc'], temp_std['acc'], temp_mean['sens'], temp_std['sens'], temp_mean['spec'], temp_std['spec'], temp_mean['PPV'], temp_std['PPV'], temp_mean['NPV'], temp_std['NPV'], temp_mean['test_consum'], temp_std['test_consum'], temp_mean['ind_consum'], temp_std['ind_consum'], temp_mean['batch_consum'], temp_std['batch_consum']]
                temp_df = pd.DataFrame(temp)
                temp_df = temp_df.T
                temp_df.columns = ['Infection_rate', 'Sequential_test', 'Stop_rule', 'Repeat','Prob_threshold', 'Acc', 'Acc_SD', 'Sens', 'Sens_SD', 'Spec','Spec_SD','PPV', 'PPV_SD',
    'NPV', 'NPV_SD', 'Test_consum', 'Test_consum_SD', 'Ind_consum', 'Ind_consum_SD', 'Batch_consum','Batch_consum_SD']
                df = pd.concat([df, temp_df])
            
time_end = time.time()
print('time cost:', time_end - time_start, 's')

time cost: 167.72911620140076 s


In [254]:
# Appendix f
time_start = time.time()
np.random.seed(0)
df = pd.DataFrame([], columns = ['Infection_rate', 'Sequential_test', 'Stop_rule', 'Repeat', 'Prob_threshold', 'Acc', 'Acc_SD', 'Sens', 'Sens_SD', 'Spec','Spec_SD','PPV', 'PPV_SD',
'NPV', 'NPV_SD', 'Test_consum', 'Test_consum_SD', 'Ind_consum', 'Ind_consum_SD', 'Batch_consum','Batch_consum_SD'])
for prob in [0.001, 0.01, 0.03, 0.05, 0.10]:
    temp_data = [fbtk.data_gen(100000, prob) for _ in range(100)]
    for i in [True]:
        for j in [3]: # stop_rule
            for k in [3]: # repeat
                batch_size = fbtk.one_batch_test_int_solver(prob, 0.25, 0.03)
                kwargs = {'stop_rule': j, 'p': prob, 'batch_size': batch_size,
                'typeII_error': 0.25, 'typeI_error': 0.03, 'repeat': k,
                'prob_threshold': 0.3, 'seq': i}
                test_1 = fbtk.test_result(temp_data, fbtk.seq_test, **kwargs)
                temp_mean = test_1.mean()
                temp_std = test_1.std()
                temp = [kwargs['p'], kwargs['seq'], kwargs['stop_rule'], kwargs['repeat'], kwargs['prob_threshold'],temp_mean['acc'], temp_std['acc'], temp_mean['sens'], temp_std['sens'], temp_mean['spec'], temp_std['spec'], temp_mean['PPV'], temp_std['PPV'], temp_mean['NPV'], temp_std['NPV'], temp_mean['test_consum'], temp_std['test_consum'], temp_mean['ind_consum'], temp_std['ind_consum'], temp_mean['batch_consum'], temp_std['batch_consum']]
                temp_df = pd.DataFrame(temp)
                temp_df = temp_df.T
                temp_df.columns = ['Infection_rate', 'Sequential_test', 'Stop_rule', 'Repeat','Prob_threshold', 'Acc', 'Acc_SD', 'Sens', 'Sens_SD', 'Spec','Spec_SD','PPV', 'PPV_SD',
    'NPV', 'NPV_SD', 'Test_consum', 'Test_consum_SD', 'Ind_consum', 'Ind_consum_SD', 'Batch_consum','Batch_consum_SD']
                df = pd.concat([df, temp_df])
            
time_end = time.time()
print('time cost:', time_end - time_start, 's')

time cost: 177.40325736999512 s
