## Preprocessing using MOA

* Includes an example of how preprocessing filters from MOA can be applied to a stream.
* ```x()``` is currently read-only, so instances cannot be preprocessed directly in Python.
* **TODO**: Allow modifying ```x()``` so that Python-based preprocessing can be used.

## 1. Running OnlineBagging without any preprocessing
* This gives us a baseline to compare against.

In [1]:
from capymoa.stream import stream_from_file
from capymoa.classifier import OnlineBagging
from capymoa.evaluation import ClassificationEvaluator

DATA_PATH = "../data/"

# Opening a file as a stream
elec_stream = stream_from_file(path_to_csv_or_arff=DATA_PATH+"electricity.csv")

# Creating a learner
ob_learner = OnlineBagging(schema=elec_stream.get_schema(), ensemble_size=5)

# Creating the evaluator
ob_evaluator = ClassificationEvaluator(schema=elec_stream.get_schema())

# Test-then-train loop
while elec_stream.has_more_instances():
    instance = elec_stream.next_instance()
    
    prediction = ob_learner.predict(instance)
    ob_evaluator.update(instance.y_index, prediction)
    ob_learner.train(instance)

ob_evaluator.accuracy()

capymoa_root: /Users/ng98/Desktop/CODE/CapyMOA_Latest/src/capymoa
MOA jar path location (config.ini): /Users/ng98/Desktop/CODE/CapyMOA_Latest/src/capymoa/jar/moa.jar
JVM Location (system): 
JAVA_HOME: /Users/ng98/Library/Java/JavaVirtualMachines/openjdk-14.0.1/Contents/Home
JVM args: ['-Xmx8g', '-Xss10M']
Sucessfully started the JVM and added MOA jar to the class path


79.05190677966102

## 2. OnlineBagging using the preprocessing method from MOA
* Here we use MOA's ```NormalisationFilter``` to normalise instances online.
* The API is still a bit rough.
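
To make "normalising online" concrete, here is a minimal pure-Python sketch of one common approach: min-max scaling with running statistics that are updated as each instance arrives. This is an assumption made for illustration only; it is not confirmed to match what MOA's ```NormalisationFilter``` computes internally. `OnlineMinMaxNormaliser` is a hypothetical helper, and it returns a new list rather than modifying its input, which is the only option in Python while ```x()``` is read-only.

```python
class OnlineMinMaxNormaliser:
    """Hypothetical helper: scales each feature to [0, 1] using the min/max seen so far."""

    def __init__(self, n_features):
        self.mins = [float("inf")] * n_features
        self.maxs = [float("-inf")] * n_features

    def transform(self, x):
        # Update the running statistics with the new instance first
        self.mins = [min(lo, v) for lo, v in zip(self.mins, x)]
        self.maxs = [max(hi, v) for hi, v in zip(self.maxs, x)]
        # Build a new list; the input values are left untouched
        scaled = []
        for v, lo, hi in zip(x, self.mins, self.maxs):
            span = hi - lo
            # A feature with no observed spread yet maps to 0.0
            scaled.append((v - lo) / span if span > 0 else 0.0)
        return scaled


normaliser = OnlineMinMaxNormaliser(n_features=2)
rows = [[2.0, 10.0], [4.0, 30.0], [3.0, 20.0]]
normalised = [normaliser.transform(row) for row in rows]
print(normalised)
```

Unlike batch normalisation, the scaling of an instance depends only on the instances seen up to that point, so early instances are scaled with less reliable statistics than later ones.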

In [2]:
from capymoa.stream import Stream
from moa.streams.filters import StandardisationFilter, NormalisationFilter
from moa.streams import FilteredStream

# Open the stream from an ARFF file
elec_stream = stream_from_file(path_to_csv_or_arff=DATA_PATH+"electricity.arff")
# Create a FilteredStream that applies the NormalisationFilter
elec_stream_normalised = Stream(CLI=f"-s ({elec_stream.moa_stream.getCLICreationString(elec_stream.moa_stream.__class__)}) \
-f NormalisationFilter ", moa_stream=FilteredStream())

# Creating a learner
ob_learner = OnlineBagging(schema=elec_stream_normalised.get_schema(), ensemble_size=5)

# Creating the evaluator
ob_evaluator = ClassificationEvaluator(schema=elec_stream_normalised.get_schema())

while elec_stream_normalised.has_more_instances():
    instance = elec_stream_normalised.next_instance()
    
    prediction = ob_learner.predict(instance)
    ob_evaluator.update(instance.y_index, prediction)
    ob_learner.train(instance)
    # print(instance.x)

ob_evaluator.accuracy()

79.69412076271186