## Testing efficient methods for evaluation

* Using prequential, test-then-train and windowed evaluation. We show how the *_fast versions can improve performance.
* The *_fast methods rely on the MOA implementation ```PrequentialEvaluation(stream, learner, ...)``` from ```moa.evaluation.EfficientEvaluationLoops```
* Performance may be affected by how the data is read and where it sits (i.e. in memory or on disk).
  * When working with CSV files, the data is converted using NumpyStream and ends up stored in memory as an Instances object (ARFF). As a result, the *_fast evaluation methods do not yield any performance improvement.
  * When ARFF files are used, the data resides on disk, and each call to ```next_instance()``` reads a new row from the file. This is a potential bottleneck, and future implementations could explore ways to make data access more efficient (e.g. caching).

* [EXPERIMENTAL] Changed the behavior of *_evaluation() and *_evaluation_fast(): if a NumpyStream is used, the call defaults to *_evaluation(); otherwise, *_evaluation_fast() is used. **Setting optimise=False disables this behavior.**
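
The dispatch rule in the last bullet can be sketched as below. This is only an illustration of the described behavior, not the library's code: `NumpyStream` and `ARFFStream` are empty stand-in classes, and `choose_evaluation_loop` and its `requested` parameter are hypothetical names introduced for this example.

```python
# Hypothetical sketch of the dispatch rule described above; the names here
# are illustrative stand-ins, not the library's actual classes or functions.

class NumpyStream:
    """Stand-in for an in-memory stream built from a CSV file."""


class ARFFStream:
    """Stand-in for a stream that reads rows from an ARFF file on disk."""


def choose_evaluation_loop(stream, optimise=True, requested="python"):
    """Return which loop ("python" or "fast") would run for this stream."""
    if not optimise:
        # optimise=False: honour the function the caller actually invoked.
        return requested
    # In-memory streams gain nothing from the fast loop, so use the Python one.
    return "python" if isinstance(stream, NumpyStream) else "fast"


print(choose_evaluation_loop(NumpyStream()))                 # python
print(choose_evaluation_loop(ARFFStream()))                  # fast
print(choose_evaluation_loop(ARFFStream(), optimise=False))  # python
```

This is why the cells that time the *python* loop pass `optimise=False`: without it, a non-Numpy stream would be routed to the fast loop.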

**Notebook last updated on 08/12/2023**

In [1]:
# Create the JVM and add the MOA jar to the classpath
from prepare_jpype import start_jpype
start_jpype()

MOA jar path location (config.ini): ./jar/moa.jar
JVM Location (system): 
/Users/gomeshe/Library/Java/JavaVirtualMachines/openjdk-20.0.1/Contents/Home
JVM args: ['-Xmx8g', '-Xss10M']
Sucessfully started the JVM and added MOA jar to the class path


# File paths

In [2]:
hyper_arff_file_path = './data/Hyper100k.arff'
# Stream with 580k instances and around 100 features (ARFF and CSV versions)
covtfd_arff_file_path = './data/covtFD.arff'
covtfd_csv_file_path = './data/covtFD.csv'

## Real stream (arff) with 100k instances and ARF

```EvaluatePrequential -l (meta.AdaptiveRandomForest -l (ARFHoeffdingTree -k 6 -e 2000000 -g 50 -c 0.01) -s 30 -x (ADWINChangeDetector -a 0.001) -p (ADWINChangeDetector -a 0.01)) -s (ArffFileStream -f /Users/gomeshe/Desktop/data/Hyper100k.arff) -e BasicClassificationPerformanceEvaluator -f 1000000```

In [3]:
from stream import stream_from_file
from learners import MOAClassifier
from evaluation import test_then_train_evaluation_fast
from ensembles import AdaptiveRandomForest

learner = AdaptiveRandomForest(ensemble_size=30)

stream = stream_from_file(path_to_csv_or_arff=hyper_arff_file_path, class_index=-1)

results = test_then_train_evaluation_fast(stream=stream, learner=learner, max_instances=None, sample_frequency=None)

display(results)
display(results['cumulative'].metrics_per_window())
print(results['cumulative'].accuracy())

{'learner': 'AdaptiveRandomForest',
 'cumulative': <evaluation.ClassificationEvaluator at 0x12811f550>,
 'wallclock': 57.29525828361511,
 'cpu_time': 69.27170899999999}

Unnamed: 0,classified instances,classifications correct (percent),Kappa Statistic (percent),Kappa Temporal Statistic (percent),Kappa M Statistic (percent)


87.51299999999999


In [4]:
from stream import stream_from_file
from learners import MOAClassifier
from evaluation import test_then_train_evaluation
from ensembles import AdaptiveRandomForest

learner = AdaptiveRandomForest(ensemble_size=30)

stream = stream_from_file(path_to_csv_or_arff=hyper_arff_file_path, class_index=-1)

results = test_then_train_evaluation(stream=stream, learner=learner, max_instances=None, sample_frequency=None, optimise=False)

display(results)
display(results['cumulative'].metrics_per_window())
print(results['cumulative'].accuracy())

{'learner': 'AdaptiveRandomForest',
 'cumulative': <evaluation.ClassificationEvaluator at 0x11fc76b10>,
 'wallclock': 61.14156198501587,
 'cpu_time': 65.82365700000001}

Unnamed: 0,classified instances,classifications correct (percent),Kappa Statistic (percent),Kappa Temporal Statistic (percent),Kappa M Statistic (percent)


87.51299999999999


## Synthetic stream with 10 million instances (RTG) and Naive Bayes

```EvaluatePrequential -l bayes.NaiveBayes -e BasicClassificationPerformanceEvaluator -i 10000000 -f 10000000```
* With a synthetic generator and a massive number of instances (e.g. 10 million), the choice between the fast and the *python* version of the evaluation has a huge impact. In the runs below, wallclock time goes from about 13s (fast) to about 154s (*python*).
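
For reference, the *python* version has to run a loop of the following shape once per instance, which is a plausible reason the gap grows with the number of instances. The sketch below is a pure-Python toy, assuming a stream of `(features, label)` pairs; `MajorityClassLearner` and `run_test_then_train` are hypothetical names and do not reflect the library's API.

```python
from collections import Counter


class MajorityClassLearner:
    """Toy learner: always predicts the most frequent label seen so far."""

    def __init__(self):
        self.counts = Counter()

    def predict(self, x):
        if not self.counts:
            return None  # nothing seen yet, so no prediction
        return self.counts.most_common(1)[0][0]

    def train(self, x, y):
        self.counts[y] += 1


def run_test_then_train(stream, learner):
    """Test on each instance first, then train on it; return accuracy (%)."""
    correct = 0
    total = 0
    for x, y in stream:
        if learner.predict(x) == y:  # test step
            correct += 1
        learner.train(x, y)          # train step
        total += 1
    return 100.0 * correct / total if total else 0.0


toy_stream = [((0.1,), "a"), ((0.4,), "a"), ((0.9,), "b"), ((0.2,), "a")]
accuracy = run_test_then_train(toy_stream, MajorityClassLearner())
print(accuracy)  # 50.0
```

In the real setting each `predict` and `train` call crosses from Python into the JVM, whereas the fast version keeps the whole loop on the Java side.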

In [5]:
from stream import RandomTreeGenerator
from learners import MOAClassifier
from evaluation import test_then_train_evaluation_fast
from moa.classifiers.bayes import NaiveBayes

learner = MOAClassifier(moa_learner=NaiveBayes())
stream = RandomTreeGenerator()

results = test_then_train_evaluation_fast(stream=stream, learner=learner, max_instances=10000000, sample_frequency=None)

display(results)
display(results['cumulative'].metrics_per_window())
print(results['cumulative'].accuracy())

{'learner': 'NaiveBayes',
 'cumulative': <evaluation.ClassificationEvaluator at 0x107b54090>,
 'wallclock': 13.216187715530396,
 'cpu_time': 13.94817599999999}

Unnamed: 0,classified instances,classifications correct (percent),Kappa Statistic (percent),Kappa Temporal Statistic (percent),Kappa M Statistic (percent)


73.64807


In [6]:
from stream import RandomTreeGenerator
from learners import MOAClassifier
from evaluation import test_then_train_evaluation
from moa.classifiers.bayes import NaiveBayes

learner = MOAClassifier(moa_learner=NaiveBayes())
stream = RandomTreeGenerator()

results = test_then_train_evaluation(stream=stream, learner=learner, max_instances=10000000, sample_frequency=None, optimise=False)

display(results)
display(results['cumulative'].metrics_per_window())
print(results['cumulative'].accuracy())

{'learner': 'NaiveBayes',
 'cumulative': <evaluation.ClassificationEvaluator at 0x10801b850>,
 'wallclock': 154.34650802612305,
 'cpu_time': 151.51606600000002}

Unnamed: 0,classified instances,classifications correct (percent),Kappa Statistic (percent),Kappa Temporal Statistic (percent),Kappa M Statistic (percent)


73.64807


## Real stream (covtFD) with 580k+ instances and more than 100 features

* Using an ARFF file (which triggers ARFFFileReader internally), the fast version runs roughly 20% to 25% faster than the *python* one. In the runs below, wallclock time drops from about 57s (*python*) to about 43s (fast).
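
The relative improvement can be checked directly from the wallclock values reported in the two cells of this section. The helper name `runtime_reduction_percent` is introduced here only for illustration.

```python
def runtime_reduction_percent(baseline_seconds, fast_seconds):
    """Percentage of the baseline wallclock time saved by the faster run."""
    return 100.0 * (baseline_seconds - fast_seconds) / baseline_seconds


# Wallclock values copied from the two covtFD (ARFF) cells in this section.
python_wallclock = 57.021140813827515
fast_wallclock = 42.74231219291687

reduction = runtime_reduction_percent(python_wallclock, fast_wallclock)
print(f"{reduction:.1f}% less wallclock time")  # 25.0% less wallclock time
```

These are single runs, so the exact percentage will vary from one execution to the next.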

In [7]:
from stream import stream_from_file
from learners import MOAClassifier
from evaluation import test_then_train_evaluation_fast
from moa.classifiers.bayes import NaiveBayes

learner = MOAClassifier(moa_learner=NaiveBayes())

stream = stream_from_file(path_to_csv_or_arff=covtfd_arff_file_path, class_index=-1)

results = test_then_train_evaluation_fast(stream=stream, learner=learner, max_instances=None, sample_frequency=None)

display(results)
display(results['cumulative'].metrics_per_window())
print(results['cumulative'].accuracy())

{'learner': 'NaiveBayes',
 'cumulative': <evaluation.ClassificationEvaluator at 0x128156590>,
 'wallclock': 42.74231219291687,
 'cpu_time': 47.62111900000002}

Unnamed: 0,classified instances,classifications correct (percent),Kappa Statistic (percent),Kappa Temporal Statistic (percent),Kappa M Statistic (percent)


52.24272862303812


In [8]:
from stream import stream_from_file
from learners import MOAClassifier
from evaluation import test_then_train_evaluation
from moa.classifiers.bayes import NaiveBayes

learner = MOAClassifier(moa_learner=NaiveBayes())

stream = stream_from_file(path_to_csv_or_arff=covtfd_arff_file_path, class_index=-1)

results = test_then_train_evaluation(stream=stream, learner=learner, max_instances=None, sample_frequency=None, optimise=False)

display(results)
display(results['cumulative'].metrics_per_window())
print(results['cumulative'].accuracy())

{'learner': 'NaiveBayes',
 'cumulative': <evaluation.ClassificationEvaluator at 0x109963750>,
 'wallclock': 57.021140813827515,
 'cpu_time': 56.200113999999985}

Unnamed: 0,classified instances,classifications correct (percent),Kappa Statistic (percent),Kappa Temporal Statistic (percent),Kappa M Statistic (percent)


52.24272862303812


# Testing  ```prequential_evaluation_fast``` 
## SRP100 + real dataset with 100k instances

```EvaluatePrequential -l (meta.StreamingRandomPatches -s 100) -s (ArffFileStream -f ./data/RBFm/RBFm_100k.arff) -f 1000```
* When using a more demanding classifier (such as StreamingRandomPatches with 100 learners) with an ARFF file, the impact of the evaluation function becomes negligible: both versions take about the same wallclock time (roughly 228s and 242s in the runs below).

In [9]:
from stream import stream_from_file
from learners import MOAClassifier
from moa.classifiers.meta import StreamingRandomPatches
from evaluation import prequential_evaluation_fast

cl_SRP = MOAClassifier(moa_learner=StreamingRandomPatches(), CLI='-s 100', random_seed=1)

stream = stream_from_file(path_to_csv_or_arff=hyper_arff_file_path, class_index=-1)

results = prequential_evaluation_fast(stream=stream, learner=cl_SRP, max_instances=None, window_size=10000)

display(results)
display(results['windowed'].metrics_per_window())
print(results['cumulative'].accuracy())

{'learner': 'StreamingRandomPatches',
 'cumulative': <evaluation.ClassificationEvaluator at 0x12815ef10>,
 'windowed': <evaluation.ClassificationWindowedEvaluator at 0x12816db90>,
 'wallclock': 227.7408480644226,
 'cpu_time': 259.27962400000007}

Unnamed: 0,classified instances,classifications correct (percent),Kappa Statistic (percent),Kappa Temporal Statistic (percent),Kappa M Statistic (percent)
0,10000.0,87.48,74.961242,75.019952,74.808853
1,20000.0,88.91,77.796914,78.087335,77.362727
2,30000.0,88.73,77.468328,77.4284,77.585521
3,40000.0,89.73,79.457065,79.731597,79.298529
4,50000.0,89.88,79.758769,80.191818,79.51417
5,60000.0,90.13,80.252137,80.072683,79.836568
6,70000.0,89.99,79.981095,79.931836,79.98
7,80000.0,90.29,80.582614,80.707332,80.252186
8,90000.0,90.36,80.718084,80.350591,80.819737
9,100000.0,90.08,80.154011,80.227227,79.951496


89.558


In [10]:
from stream import stream_from_file
from learners import MOAClassifier
from moa.classifiers.meta import StreamingRandomPatches
from evaluation import prequential_evaluation

cl_SRP = MOAClassifier(moa_learner=StreamingRandomPatches(), CLI='-s 100', random_seed=1)

stream = stream_from_file(path_to_csv_or_arff=hyper_arff_file_path, class_index=-1)

results = prequential_evaluation(stream=stream, learner=cl_SRP, max_instances=None, window_size=10000, optimise=False)

display(results)
display(results['windowed'].metrics_per_window())
print(results['cumulative'].accuracy())

{'learner': 'StreamingRandomPatches',
 'cumulative': <evaluation.ClassificationEvaluator at 0x12815d110>,
 'windowed': <evaluation.ClassificationWindowedEvaluator at 0x12815fed0>,
 'wallclock': 241.547532081604,
 'cpu_time': 268.73538499999995}

Unnamed: 0,classified instances,classifications correct (percent),Kappa Statistic (percent),Kappa Temporal Statistic (percent),Kappa M Statistic (percent)
0,10000.0,87.48,74.961242,75.019952,74.808853
1,20000.0,88.91,77.796914,78.087335,77.362727
2,30000.0,88.73,77.468328,77.4284,77.585521
3,40000.0,89.73,79.457065,79.731597,79.298529
4,50000.0,89.88,79.758769,80.191818,79.51417
5,60000.0,90.13,80.252137,80.072683,79.836568
6,70000.0,89.99,79.981095,79.931836,79.98
7,80000.0,90.29,80.582614,80.707332,80.252186
8,90000.0,90.36,80.718084,80.350591,80.819737
9,100000.0,90.08,80.154011,80.227227,79.951496


89.558
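
As a sanity check on the outputs above, the cumulative accuracy should equal the mean of the per-window accuracies, assuming the windows are non-overlapping and of equal size (10000 instances each here). The values below are copied from the "classifications correct (percent)" column.

```python
# "classifications correct (percent)" column from the windowed table above.
windowed_accuracy = [87.48, 88.91, 88.73, 89.73, 89.88,
                     90.13, 89.99, 90.29, 90.36, 90.08]

# With equal-sized, non-overlapping windows, the mean of the per-window
# accuracies equals the cumulative accuracy over the whole stream.
mean_windowed = sum(windowed_accuracy) / len(windowed_accuracy)
print(round(mean_windowed, 3))  # 89.558
```

The result matches the cumulative accuracy printed by both the fast and the *python* runs.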


# Testing with a large CSV file

* There is not much difference between the evaluation functions when using a CSV file, since the data ends up loaded in memory.
* Loading the CSV to memory may take a while if it is a massive dataset (this one has about 600k instances and more than 100 features). 

In [11]:
%%time
from stream import stream_from_file

# Loads the CSV data into memory; this can take a few minutes (about 2.5 minutes wall time in the run below).
stream = stream_from_file(path_to_csv_or_arff=covtfd_csv_file_path, class_index=-1)

CPU times: user 7min 23s, sys: 6.77 s, total: 7min 29s
Wall time: 2min 38s


In [12]:
from learners import MOAClassifier
from evaluation import test_then_train_evaluation_fast
from moa.classifiers.bayes import NaiveBayes

learner = MOAClassifier(moa_learner=NaiveBayes())

results = test_then_train_evaluation_fast(stream=stream, learner=learner, max_instances=None, sample_frequency=None)

display(results)
display(results['cumulative'].metrics_per_window())
print(results['cumulative'].accuracy())

{'learner': 'NaiveBayes',
 'cumulative': <evaluation.ClassificationEvaluator at 0x1080b2350>,
 'wallclock': 42.66289830207825,
 'cpu_time': 42.10631599999988}

Unnamed: 0,classified instances,classifications correct (percent),Kappa Statistic (percent),Kappa Temporal Statistic (percent),Kappa M Statistic (percent)


57.97274061936866


In [13]:
from learners import MOAClassifier
from evaluation import test_then_train_evaluation
from moa.classifiers.bayes import NaiveBayes

learner = MOAClassifier(moa_learner=NaiveBayes())

results = test_then_train_evaluation(stream=stream, learner=learner, max_instances=None, sample_frequency=None, optimise=False)

display(results)
display(results['cumulative'].metrics_per_window())
print(results['cumulative'].accuracy())

{'learner': 'NaiveBayes',
 'cumulative': <evaluation.ClassificationEvaluator at 0x12815f3d0>,
 'wallclock': 43.035099029541016,
 'cpu_time': 42.23444399999994}

Unnamed: 0,classified instances,classifications correct (percent),Kappa Statistic (percent),Kappa Temporal Statistic (percent),Kappa M Statistic (percent)


57.97274061936866


## Using prequential_evaluation_fast

In [14]:
from learners import MOAClassifier
from evaluation import prequential_evaluation_fast
from moa.classifiers.bayes import NaiveBayes

learner = MOAClassifier(moa_learner=NaiveBayes())

results = prequential_evaluation_fast(stream=stream, learner=learner, max_instances=None, window_size=None)

display(results)
display(results['cumulative'].metrics_per_window())
print(results['cumulative'].accuracy())

{'learner': 'NaiveBayes',
 'cumulative': <evaluation.ClassificationEvaluator at 0x11fca2190>,
 'windowed': None,
 'wallclock': 43.3126118183136,
 'cpu_time': 42.54233700000009}

Unnamed: 0,classified instances,classifications correct (percent),Kappa Statistic (percent),Kappa Temporal Statistic (percent),Kappa M Statistic (percent)


57.97274061936866


In [15]:
from learners import MOAClassifier
from evaluation import prequential_evaluation
from moa.classifiers.bayes import NaiveBayes

learner = MOAClassifier(moa_learner=NaiveBayes())

results = prequential_evaluation(stream=stream, learner=learner, max_instances=None, window_size=None, optimise=False)

display(results)
display(results['cumulative'].metrics_per_window())
print(results['cumulative'].accuracy())

{'learner': 'NaiveBayes',
 'cumulative': <evaluation.ClassificationEvaluator at 0x12815f410>,
 'windowed': None,
 'wallclock': 42.468923807144165,
 'cpu_time': 42.22687700000006}

Unnamed: 0,classified instances,classifications correct (percent),Kappa Statistic (percent),Kappa Temporal Statistic (percent),Kappa M Statistic (percent)


57.97274061936866
