# Benchmark with River

## Assignment

* Go to River, API Benchmark section, and review the methods that are evaluated.
* Pre-select the available methods:
  * Regression: all methods.
  * Multiclass classification: select 3 methods and 3 datasets.
  * Binary classification: select the methods whose execution time is under 1000 seconds.
* Run the 3 types of methods with River to check that they execute correctly, and report the results of the existing benchmark.
* Present a table showing the best performance in bold, and add a comparative analysis per dataset.
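The last item of the assignment (a table with the best performance in bold) can be sketched as follows. This is a minimal illustration rather than part of River: the helper `best_in_bold` and the scores in the example are hypothetical, and the rows are assumed to be dicts with `model`, `dataset` and one metric column, as in the rows produced later in this notebook.

```python
def best_in_bold(rows, metric, higher_is_better=True):
    """Return the lines of a Markdown table (models x datasets).

    The best score of each dataset column is wrapped in ** ** (bold).
    `rows` is a list of dicts with the keys "model", "dataset" and `metric`.
    """
    models = sorted({r["model"] for r in rows})
    datasets = sorted({r["dataset"] for r in rows})
    scores = {(r["model"], r["dataset"]): r[metric] for r in rows}
    pick = max if higher_is_better else min

    # Best score per dataset, over the models that have a score for it
    best = {
        d: pick(scores[m, d] for m in models if (m, d) in scores)
        for d in datasets
    }

    lines = [
        "| Model | " + " | ".join(datasets) + " |",
        "|" + " --- |" * (len(datasets) + 1),
    ]
    for m in models:
        cells = []
        for d in datasets:
            value = scores.get((m, d))
            if value is None:
                cells.append("n/a")  # this model was not run on this dataset
            else:
                text = f"{value:.3f}"
                cells.append(f"**{text}**" if value == best[d] else text)
        lines.append(f"| {m} | " + " | ".join(cells) + " |")
    return lines


# Hypothetical scores, for illustration only (not benchmark results)
example = [
    {"model": "A", "dataset": "D1", "Accuracy": 0.9},
    {"model": "B", "dataset": "D1", "Accuracy": 0.8},
]
print("\n".join(best_in_bold(example, "Accuracy")))
```

For error metrics such as MAE or RMSE, pass `higher_is_better=False` so that the smallest value is the one in bold.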


## Approach

Given the large number of algorithms and datasets in this lab, we run the benchmarks in parallel, using a pool of worker processes (`multiprocessing`).

The code follows the way River itself runs its benchmarks.
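The parallel scheme can be sketched with the standard library alone: enumerate every (model, dataset) pair, then hand the pairs to a pool of processes. The helper names `build_runs`, `pick_n_workers` and `run_all` are hypothetical and only illustrate the idea. Capping the pool size at the number of CPU cores is an assumption of this sketch, not something the notebook does.

```python
import itertools
import multiprocessing
import os


def build_runs(model_names, n_datasets):
    """Enumerate every (model name, dataset index) pair to benchmark."""
    return list(itertools.product(model_names, range(n_datasets)))


def pick_n_workers(requested, n_runs):
    """Cap the pool size by the number of runs and of available CPU cores."""
    return max(1, min(requested, n_runs, os.cpu_count() or 1))


def run_all(worker, runs, requested_workers=50):
    """Call `worker(model_name, dataset_index)` for every run, in parallel.

    `worker` must be defined at module level so that it can be pickled
    and sent to the worker processes.
    """
    n_workers = pick_n_workers(requested_workers, len(runs))
    # The context manager stops the worker processes when the block exits
    with multiprocessing.Pool(processes=n_workers) as pool:
        return pool.starmap(worker, runs)
```

A pool larger than the number of cores does not make CPU-bound runs faster, which is why the sketch caps it.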

In [None]:
from __future__ import annotations

import sys

import copy
import itertools
import json
import logging
import multiprocessing

import pandas as pd
from config import MODELS, N_CHECKPOINTS, TRACKS
from tqdm import tqdm

from river import metrics
from river.evaluate import Track


# Logging: only warnings and errors are logged
logging.basicConfig(level=logging.WARNING)
logger = logging.getLogger(__name__)

# Multiclass classifiers also handle binary problems, so they join the binary track
MODELS["Binary classification"].update(MODELS["Multiclass classification"])
details = {}

In [14]:
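def run_dataset(model_str, no_dataset, track, n_checkpoints=N_CHECKPOINTS):
    """Run a model on a dataset and return one result row per checkpoint.

    Args:
        model_str (str or model): model name (a key of MODELS[track.name]) or a model instance
        no_dataset (int): index of the dataset in track.datasets
        track (Track or int): binary, multiclass or regression track, or its index in TRACKS
        n_checkpoints (int): number of checkpoints at which the metrics are reported

    Returns:
        results (list[dict]): one dict per checkpoint, with the keys
            step : int
            track : str
            model : str
            dataset : str
            ... metric values ...
            Memory in Mb : float
            Time in s : float (cumulative)
    """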
    if isinstance(track, int):
        track = TRACKS[track]
    
    if isinstance(model_str, str):    
        model_name = model_str    
        model = MODELS[track.name][model_name].clone()
    else:
        model_name = model_str.__class__.__name__
        model = model_str.clone()
    
    dataset = track.datasets[no_dataset]
    print(f"Processing {model_name} on {dataset.__class__.__name__}")

    results = []
    track = copy.deepcopy(track)
    time = 0.0
    
    # Run the model on the dataset with a progress bar and N_CHECKPOINTS checkpoints
    for i in tqdm(
        track.run(model, dataset, n_checkpoints=n_checkpoints),
        total=n_checkpoints,
        desc=f"{model_name} on {dataset.__class__.__name__}",
    ):
        time += i["Time"].total_seconds()
        res = {
            "step": i["Step"],
            "track": track.name,
            "model": model_name,
            "dataset": dataset.__class__.__name__,
        }
        for k, v in i.items():
            if isinstance(v, metrics.base.Metric):
                res[k] = v.get()
        res["Memory in Mb"] = i["Memory"] / 1024**2
        res["Time in s"] = time
        results.append(res)
    return results

In [None]:


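def run_track(models: list[str], track: Track | int, n_workers: int = 50, n_checkpoints: int = N_CHECKPOINTS, pickle=True):
    """Run every given model on every dataset of a track, in parallel.

    Args:
        models (list[str]): list of model names
        track (Track | int): a track, or its index in TRACKS
        n_workers (int, optional): number of parallel workers. Defaults to 50.
        n_checkpoints (int, optional): number of checkpoints per run.
        pickle (bool, optional): currently unused.
    """
    if isinstance(track, int):
        track = TRACKS[track]

    # One run per (model, dataset) pair
    runs = list(itertools.product(models, range(len(track.datasets)), [track], [n_checkpoints]))
    results = []

    # The context manager stops the worker processes once all runs are done
    with multiprocessing.Pool(processes=n_workers) as pool:
        for val in pool.starmap(run_dataset, runs):
            results.extend(val)

    # Save every checkpoint of the track in one CSV file named after the track
    csv_name = track.name.replace(" ", "_").lower()
    pd.DataFrame(results).to_csv(f"./{csv_name}.csv", index=False)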
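Each run writes one row per checkpoint, and the `Time in s` column is cumulative, so a comparison table only needs the last checkpoint of each (model, dataset) pair. A minimal sketch, assuming rows shaped like the dicts returned by `run_dataset`. The helper name `final_checkpoints` is hypothetical.

```python
def final_checkpoints(results):
    """Keep, for each (model, dataset) pair, the row with the largest step."""
    last = {}
    for row in results:
        key = (row["model"], row["dataset"])
        if key not in last or row["step"] > last[key]["step"]:
            last[key] = row
    # One row per pair, in order of first appearance of the pair
    return list(last.values())
```

The same selection can be made on the saved CSV file, for example after loading it back as a list of dicts.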


## Regression

In [None]:
from river.evaluate import RegressionTrack

track = RegressionTrack()

details[track.name] = {"Dataset": {}, "Model": {}}
for dataset in track.datasets:
    details[track.name]["Dataset"][dataset.__class__.__name__] = repr(dataset)
for model_name, model in MODELS[track.name].items():
    details[track.name]["Model"][model_name] = repr(model)
with open("details.json", "w") as f:
    json.dump(details, f, indent=2)
run_track(models=MODELS[track.name].keys(), track=track, n_workers=50)


Output (truncated; the "Processing ..." messages and the tqdm progress bars of the parallel workers are interleaved):

Datasets: ChickWeights, TrumpApproval
Models: [baseline] Mean predictor, Linear Regression, Linear Regression with l1 regularization, Linear Regression with l2 regularization, Passive-Aggressive Regressor, mode 1, Passive-Aggressive Regressor, mode 2, k-Nearest Neighbors, Hoeffding Tree, Hoeffding Adaptive Tree, Stochastic Gradient Tree, Adaptive Random Forest, Aggregated Mondrian Forest, Adaptive Model Rules, Streaming Random Patches, Bagging, Exponentially Weighted Average, River MLP

Runs shown as finished before the output is cut off:
[baseline] Mean predictor on ChickWeights: 53it [00:00, 1183.24it/s]
Linear Regression on TrumpApproval: 51it [00:00, 556.29it/s]
Passive-Aggressive Regressor, mode 1 on ChickWeights: 53it [00:00, 656.19it/s]
Linear Regression on ChickWeights: 53it [00:00, 618.50it/s]

## Binary Classification


In [11]:
from river.evaluate import BinaryClassificationTrack

track = BinaryClassificationTrack()

details[track.name] = {"Dataset": {}, "Model": {}}
for dataset in track.datasets:
    details[track.name]["Dataset"][dataset.__class__.__name__] = repr(dataset)
for model_name, model in MODELS[track.name].items():
    details[track.name]["Model"][model_name] = repr(model)
with open("details.json", "w") as f:
    json.dump(details, f, indent=2)
run_track(models=MODELS[track.name].keys(), track=track, n_workers=50)


Output (truncated; the "Processing ..." messages, the tqdm progress bars of the parallel workers and the Vowpal Wabbit diagnostic logs are interleaved):

Datasets: Bananas, Elec2, Phishing, SMTP
Models: [baseline] Last Class, Logistic regression, Aggregated Mondrian Forest, ALMA, sklearn SGDClassifier, Vowpal Wabbit logistic regression, Naive Bayes, Hoeffding Tree, Hoeffding Adaptive Tree, Adaptive Random Forest, Streaming Random Patches, k-Nearest Neighbors, ADWIN Bagging, AdaBoost, Bagging, Leveraging Bagging, Stacking, Voting
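The assignment asks to keep only the binary classifiers whose execution time is under 1000 seconds. A minimal sketch of that filter, assuming one row per (model, dataset) pair that holds the final cumulative `Time in s`. Requiring the limit to hold on every dataset is an interpretation made for this sketch, and the helper name `models_under_time_limit` is hypothetical.

```python
def models_under_time_limit(final_rows, limit_s=1000.0, time_key="Time in s"):
    """Return the models whose total time is under `limit_s` on every dataset."""
    worst = {}
    for row in final_rows:
        # Track the slowest run of each model across datasets
        worst[row["model"]] = max(worst.get(row["model"], 0.0), row[time_key])
    return sorted(model for model, t in worst.items() if t < limit_s)
```

A model that exceeds the limit on a single dataset is excluded. To judge each dataset separately, filter the rows directly instead.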

## Multiclass Classification

In [None]:
from river import ensemble, forest, neighbors, preprocessing
from river.evaluate import MultiClassClassificationTrack

track = MultiClassClassificationTrack()
track_no = 1  # index of the multiclass track in TRACKS
# The 3 selected methods. run_track only receives their names (the keys), and
# run_dataset looks each name up in MODELS[track.name].
models = {
    "Adaptive Random Forest": forest.ARFClassifier(seed=42),
    "k-Nearest Neighbors": preprocessing.StandardScaler() | neighbors.KNNClassifier(),
    "Streaming Random Patches": ensemble.SRPClassifier(),
}

details[track.name] = {"Dataset": {}, "Model": {}}
for dataset in track.datasets:
    details[track.name]["Dataset"][dataset.__class__.__name__] = repr(dataset)
# Record only the selected models, as configured in MODELS (the ones actually run)
for model_name in models:
    details[track.name]["Model"][model_name] = repr(MODELS[track.name][model_name])
with open("details.json", "w") as f:
    json.dump(details, f, indent=2)
run_track(models=models.keys(), track=track_no, n_workers=50, pickle=True)


Output (truncated; the run was stopped by a KeyboardInterrupt after about 45 s, and the tracebacks of the worker processes are cut off):

Datasets: ImageSegments, Insects, Keystroke
Models: Adaptive Random Forest, k-Nearest Neighbors, Streaming Random Patches

Runs finished before the interruption:
Adaptive Random Forest on ImageSegments: 51it [00:03, 13.79it/s]
k-Nearest Neighbors on ImageSegments: 51it [00:06,  7.43it/s]
Adaptive Random Forest on Keystroke: 100%|██████████| 50/50 [00:15<00:00,  3.32it/s]
Streaming Random Patches on ImageSegments: 51it [00:20,  2.46it/s]

Runs in progress at the interruption:
k-Nearest Neighbors on Keystroke:  60%|██████    | 30/50 [00:45<00:30,  1.51s/it]
Streaming Random Patches on Keystroke:  34%|███▍      | 17/50 [00:45<01:28,  2.67s/it]

KeyboardInterrupt:

dict_keys(['Adaptive Random Forest', 'AdaBoost', 'Streaming Random Patches'])
