# Using sklearn with CapyMOA

* This notebook demonstrates how to use scikit-learn (sklearn) learners directly in CapyMOA.
* Ideally, the same approach should work with other learners as well.

In [1]:
from capymoa.datasets import ElectricityTiny

DATA_PATH = "../data/"

capymoa_root: /Users/ng98/Desktop/CODE/CapyMOA_Latest/src/capymoa
MOA jar path location (config.ini): /Users/ng98/Desktop/CODE/CapyMOA_Latest/src/capymoa/jar/moa.jar
JVM Location (system): 
JAVA_HOME: /Users/ng98/Library/Java/JavaVirtualMachines/openjdk-14.0.1/Contents/Home
JVM args: ['-Xmx8g', '-Xss10M']
Sucessfully started the JVM and added MOA jar to the class path


## 1. Using scikit-learn

* This example shows how a scikit-learn model can be used with CapyMOA's ```Instance``` representation inside an instance loop.

In [2]:
from sklearn import linear_model
from capymoa.evaluation import ClassificationEvaluator

# Creating a stream. Using the tiny version of the electricity dataset to speed
# up the process
elec_stream = ElectricityTiny()

# Creating a learner
sklearn_SGD = linear_model.SGDClassifier()

# Creating the evaluator
ob_evaluator = ClassificationEvaluator(schema=elec_stream.get_schema())

# elec_stream.schema.get_label_indexes() --> the class labels

# Counter for partial fits
partial_fit_count = 0
while elec_stream.has_more_instances():
    instance = elec_stream.next_instance()

    prediction = -1
    # scikit-learn does not allow calling predict on a model that has not been fit yet
    if partial_fit_count > 0:
        prediction = sklearn_SGD.predict([instance.x])[0]
    ob_evaluator.update(instance.y_index, prediction)
    sklearn_SGD.partial_fit([instance.x], [instance.y_index], classes=elec_stream.schema.get_label_indexes())
    partial_fit_count += 1

ob_evaluator.accuracy()

84.7
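
The two scikit-learn constraints that shape the loop above (no `predict` before the first fit, and `classes` supplied to `partial_fit`) can be checked in isolation. The following is a minimal sketch that uses only scikit-learn and NumPy; the synthetic data and the variable names are assumptions made for illustration and are not part of CapyMOA or the electricity dataset.

```python
import numpy as np
from sklearn.exceptions import NotFittedError
from sklearn.linear_model import SGDClassifier

# Synthetic stand-in for a stream: 20 instances, 3 features, binary labels
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
y = (X[:, 0] > 0).astype(int)

model = SGDClassifier(random_state=0)

# predict() before any fit raises NotFittedError, hence the guard in the loop
try:
    model.predict(X[:1])
    predicted_before_fit = True
except NotFittedError:
    predicted_before_fit = False

# The first partial_fit call must be given every class label up front,
# because a single instance cannot reveal the full label set
model.partial_fit(X[:1], y[:1], classes=[0, 1])

# After one partial_fit call, predict() works and returns a known class
first_prediction = model.predict(X[1:2])[0]
print(predicted_before_fit, first_prediction in (0, 1))
```

In the instance loop, `elec_stream.schema.get_label_indexes()` plays the role of the hard-coded `[0, 1]` used here.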

## 2. Using SKClassifier
* Instead of using sklearn's ```SGDClassifier``` directly, here we wrap it in CapyMOA's ```SKClassifier``` and run the same instance loop.


In [3]:
from sklearn import linear_model
from capymoa.base import SKClassifier
from capymoa.evaluation import ClassificationEvaluator

# Creating a stream
elec_stream = ElectricityTiny()

# Creating a learner
sklearn_SGD = SKClassifier(schema=elec_stream.get_schema(), sklearner=linear_model.SGDClassifier())

# Creating the evaluator
sklearn_SGD_evaluator = ClassificationEvaluator(schema=elec_stream.get_schema())

while elec_stream.has_more_instances():
    instance = elec_stream.next_instance()

    prediction = sklearn_SGD.predict(instance)
    sklearn_SGD_evaluator.update(instance.y_index, prediction)
    sklearn_SGD.train(instance)

sklearn_SGD_evaluator.accuracy()

84.7
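
Note that the loop above calls `predict` before the first `train` without any guard, so the wrapper must take care of the "not fit yet" case itself. The sketch below shows one plausible way an adapter could do this. It is not CapyMOA's `SKClassifier` implementation: the class name `MinimalSKWrapper`, its constructor arguments, and the choice to return `None` before the first training call are all assumptions made for illustration.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier


class MinimalSKWrapper:
    """Hypothetical adapter, NOT CapyMOA's SKClassifier implementation."""

    def __init__(self, sklearner, classes):
        self.sklearner = sklearner
        self.classes = classes
        self._trained = False

    def train(self, x, y):
        # Wrap the single instance into the batch shapes sklearn expects
        self.sklearner.partial_fit([x], [y], classes=self.classes)
        self._trained = True

    def predict(self, x):
        # Return None instead of raising before the first training call
        if not self._trained:
            return None
        return self.sklearner.predict([x])[0]


# Synthetic stand-in for a stream (assumption, not the electricity data)
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

learner = MinimalSKWrapper(SGDClassifier(random_state=1), classes=[0, 1])
predictions = []
for x_i, y_i in zip(X, y):
    predictions.append(learner.predict(x_i))  # test first
    learner.train(x_i, y_i)                   # then train
```

Moving the bookkeeping into the wrapper keeps the instance loop identical for every learner, which is the main convenience the wrapped version offers over the manual loop in Section 1.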

## 3. Using prequential evaluation + SKClassifier
* Instead of writing the instance loop ourselves, this example uses CapyMOA's ```prequential_evaluation()```.

In [4]:
from capymoa.evaluation import prequential_evaluation

# Creating a stream
elec_stream = ElectricityTiny()

# Creating a learner
sklearn_SGD = SKClassifier(schema=elec_stream.get_schema(), sklearner=linear_model.SGDClassifier())

results_sklearn_SGD = prequential_evaluation(stream=elec_stream, learner=sklearn_SGD, window_size=4500)

results_sklearn_SGD['cumulative'].accuracy()

84.7
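
Prequential evaluation is the same test-then-train loop as in the earlier sections, packaged as a function. The following pure-Python sketch illustrates the idea, including the difference between a cumulative metric and per-window metrics. It is not CapyMOA's implementation: `toy_prequential`, `MajorityClass`, and the returned tuple are hypothetical names and shapes chosen for illustration, and the `window_size` semantics shown here are an assumption.

```python
from collections import Counter


class MajorityClass:
    """Toy learner: predicts the most frequent label seen so far."""

    def __init__(self):
        self.counts = Counter()

    def predict(self, x):
        # None until at least one label has been seen
        return self.counts.most_common(1)[0][0] if self.counts else None

    def train(self, x, y):
        self.counts[y] += 1


def toy_prequential(stream, learner, window_size):
    """Test-then-train over (x, y) pairs.

    Returns the cumulative accuracy and a list of per-window accuracies,
    both as percentages.
    """
    correct_total = 0
    correct_window = 0
    seen = 0
    windowed = []
    for x, y in stream:
        hit = learner.predict(x) == y  # test first ...
        learner.train(x, y)            # ... then train
        correct_total += hit
        correct_window += hit
        seen += 1
        if seen % window_size == 0:
            windowed.append(100.0 * correct_window / window_size)
            correct_window = 0
    cumulative = 100.0 * correct_total / seen if seen else 0.0
    return cumulative, windowed


# Features are unused by the toy learner, so None stands in for them
stream = [(None, label) for label in [0, 0, 1, 0, 0, 1, 0, 0]]
cumulative, windowed = toy_prequential(stream, MajorityClass(), window_size=4)
print(cumulative, windowed)  # 62.5 [50.0, 75.0]
```

The cumulative value summarises the whole stream, while the per-window values show how accuracy changes over time.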