# Stream Classification
---

## `NEweather` dataset

**Description:** The National Oceanic and Atmospheric Administration (NOAA)
has compiled a database of weather measurements from over 7,000 weather
stations worldwide. Records date back to the mid-1900s. Daily measurements
include a variety of features (temperature, pressure, wind speed, etc.) as
well as a series of indicators for precipitation and other weather-related
events. The `NEweather` dataset contains data from this database, specifically
from the Offutt Air Force Base in Bellevue, Nebraska, spanning 50 years
(1949-1999).

**Features:** 8 daily weather measurements
 
|        Attribute         | Description                  |
|:------------------------:|:-----------------------------|
| `temp`                   | Temperature                  |
| `dew_pnt`                | Dew Point                    |
| `sea_lvl_press`          | Sea Level Pressure           |
| `visibility`             | Visibility                   |
| `avg_wind_spd`           | Average Wind Speed           |
| `max_sustained_wind_spd` | Maximum Sustained Wind Speed |
| `max_temp`               | Maximum Temperature          |
| `min_temp`               | Minimum Temperature          |


**Class:** `rain` | 0: no rain, 1: rain
 
**Samples:** 18,159


In [3]:
import pandas as pd
from river.stream import iter_pandas
from river.metrics import Accuracy,BalancedAccuracy,CohenKappa,GeometricMean
from river.metrics.base import Metrics
from river.utils import Rolling
from river.evaluate import progressive_val_score

In [18]:
data = pd.read_csv("../datasets/NEweather.csv")
features = data.columns[:-1]

In [19]:
data.columns[-1]

'rain'

In this example, we load the data from a CSV file with `pandas.read_csv` and use the [`iter_pandas`](https://riverml.xyz/latest/api/stream/iter-pandas/) utility function to iterate over the `DataFrame` one row at a time.

In [20]:
stream = iter_pandas(X=data[features], y=data['rain'])
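
`iter_pandas` yields one `(features, label)` pair per row, with the features as a dictionary, and `progressive_val_score` consumes such pairs in test-then-train order: predict first, score the prediction, then learn from the true label. The following is a minimal standard-library sketch of that loop, not River's implementation. `MajorityClass`, `prequential_accuracy`, and the toy rows are hypothetical names and values made up for illustration.

```python
from collections import Counter


class MajorityClass:
    """Toy incremental model: predicts the most frequent label seen so far."""

    def __init__(self):
        self.counts = Counter()

    def predict_one(self, x):
        # No prediction is possible before the first label has been seen.
        return self.counts.most_common(1)[0][0] if self.counts else None

    def learn_one(self, x, y):
        self.counts[y] += 1


def prequential_accuracy(stream, model):
    """Test-then-train: each sample is used for testing before training."""
    correct = total = 0
    for x, y in stream:
        y_pred = model.predict_one(x)   # 1. predict with the current model
        if y_pred is not None:          # 2. score (this sketch skips empty predictions)
            correct += int(y_pred == y)
            total += 1
        model.learn_one(x, y)           # 3. only now reveal the label
    return correct / total if total else 0.0


# Hypothetical rows in the (dict, label) shape described above.
toy_stream = [
    ({"temp": 30.1, "dew_pnt": 25.0}, 0),
    ({"temp": 41.5, "dew_pnt": 39.2}, 0),
    ({"temp": 55.0, "dew_pnt": 54.1}, 1),
    ({"temp": 28.3, "dew_pnt": 20.7}, 0),
]
toy_accuracy = prequential_accuracy(toy_stream, MajorityClass())
```

Because every sample is scored before the model learns from it, no separate hold-out set is needed.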

## Naïve Bayes
---
[GaussianNB](https://riverml.xyz/0.18.0/api/naive-bayes/GaussianNB/) maintains a Gaussian distribution $G_{cf}$ for each class $c$ and each feature $f$. Each Gaussian is updated with the observed value of its feature; the details can be found in `proba.Gaussian`. The joint log-likelihood of a class is then obtained by summing the log probabilities of the feature values under that class's Gaussians.
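
The per-class, per-feature bookkeeping can be sketched with the standard library alone. This is an illustrative approximation, not River's code: `RunningGaussian` and `TinyGaussianNB` are hypothetical names, the running variance uses Welford's update, and adding the log class prior to the sum is an assumption of this sketch.

```python
import math
from collections import defaultdict


class RunningGaussian:
    """Mean and variance maintained incrementally (Welford's update)."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    @property
    def variance(self):
        return self.m2 / (self.n - 1) if self.n > 1 else 0.0

    def log_pdf(self, x):
        var = self.variance
        if var <= 0.0:
            return 0.0  # too little data: contribute nothing to the sum
        return -0.5 * math.log(2 * math.pi * var) - (x - self.mean) ** 2 / (2 * var)


class TinyGaussianNB:
    def __init__(self):
        self.class_counts = defaultdict(int)
        # gaussians[c][f] plays the role of G_cf in the text.
        self.gaussians = defaultdict(lambda: defaultdict(RunningGaussian))

    def learn_one(self, x, y):
        self.class_counts[y] += 1
        for feature, value in x.items():
            self.gaussians[y][feature].update(value)

    def joint_log_likelihood(self, x):
        total = sum(self.class_counts.values())
        return {
            c: math.log(n / total)
            + sum(self.gaussians[c][f].log_pdf(v) for f, v in x.items())
            for c, n in self.class_counts.items()
        }

    def predict_one(self, x):
        jll = self.joint_log_likelihood(x)
        return max(jll, key=jll.get) if jll else None


nb = TinyGaussianNB()
for temp, rain in [(10.0, 0), (12.0, 0), (11.0, 0), (30.0, 1), (32.0, 1), (31.0, 1)]:
    nb.learn_one({"temp": temp}, rain)  # hypothetical values
```

With these six made-up observations, each class ends up with its own mean and variance for `temp`, and a query is assigned to the class whose Gaussian explains it best.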

In [7]:
from river.naive_bayes import GaussianNB

model = GaussianNB()
metrics = Metrics(metrics=[Accuracy(),BalancedAccuracy(),GeometricMean(),CohenKappa()])

progressive_val_score(dataset=stream,
                      model=model,
                      metric=metrics,
                      print_every=1000)

[1,000] Accuracy: 71.27%, BalancedAccuracy: 71.96%, GeometricMean: 71.93%, CohenKappa: 40.19%
[2,000] Accuracy: 69.88%, BalancedAccuracy: 70.38%, GeometricMean: 70.37%, CohenKappa: 36.23%
[3,000] Accuracy: 68.99%, BalancedAccuracy: 69.64%, GeometricMean: 69.62%, CohenKappa: 34.23%
[4,000] Accuracy: 68.82%, BalancedAccuracy: 68.87%, GeometricMean: 68.87%, CohenKappa: 33.48%
[5,000] Accuracy: 69.09%, BalancedAccuracy: 67.97%, GeometricMean: 67.92%, CohenKappa: 32.70%
[6,000] Accuracy: 69.13%, BalancedAccuracy: 67.87%, GeometricMean: 67.80%, CohenKappa: 32.65%
[7,000] Accuracy: 69.15%, BalancedAccuracy: 67.89%, GeometricMean: 67.82%, CohenKappa: 32.62%
[8,000] Accuracy: 68.50%, BalancedAccuracy: 67.31%, GeometricMean: 67.25%, CohenKappa: 31.56%
[9,000] Accuracy: 68.65%, BalancedAccuracy: 66.69%, GeometricMean: 66.50%, CohenKappa: 30.97%
[10,000] Accuracy: 69.04%, BalancedAccuracy: 66.36%, GeometricMean: 66.01%, CohenKappa: 30.75%
[11,000] Accuracy: 69.52%, BalancedAccuracy: 66.52%, Geomet

Accuracy: 69.21%, BalancedAccuracy: 66.27%, GeometricMean: 65.80%, CohenKappa: 31.28%
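
To help read these four figures, the sketch below computes them for a binary problem from a confusion matrix using the standard textbook definitions: balanced accuracy as the mean of the per-class recalls, geometric mean as the square root of their product, and Cohen's kappa as agreement corrected for chance. The function name `binary_metrics` and the counts are hypothetical, and River's own implementations may differ in details such as edge-case handling.

```python
import math


def binary_metrics(tp, fp, fn, tn):
    """Compute the four reported metrics from binary confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    recall_pos = tp / (tp + fn)  # recall on class 1 (rain)
    recall_neg = tn / (tn + fp)  # recall on class 0 (no rain)
    balanced_accuracy = (recall_pos + recall_neg) / 2
    geometric_mean = math.sqrt(recall_pos * recall_neg)
    # Expected agreement if predictions were independent of the labels.
    p_chance = ((tp + fp) / total) * ((tp + fn) / total) + (
        (tn + fn) / total
    ) * ((tn + fp) / total)
    cohen_kappa = (accuracy - p_chance) / (1 - p_chance)
    return {
        "Accuracy": accuracy,
        "BalancedAccuracy": balanced_accuracy,
        "GeometricMean": geometric_mean,
        "CohenKappa": cohen_kappa,
    }


# Hypothetical counts for an imbalanced stream (50 rain days out of 200).
m = binary_metrics(tp=30, fp=10, fn=20, tn=140)
```

In this made-up example accuracy is 85%, while balanced accuracy is roughly 77% and kappa roughly 57%, which is why the notebook reports all four metrics rather than accuracy alone.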

## K-Nearest Neighbors
---
[KNN](https://riverml.xyz/0.18.0/api/neighbors/KNNClassifier/) is a non-parametric classification method that keeps track of the last `window_size` training samples. The predicted class label for a given query sample is obtained in two steps:

- Find the closest `n_neighbors` to the query sample in the data window.
- Aggregate the class labels of the `n_neighbors` to define the predicted class for the query sample.
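
The two steps above can be sketched with a bounded `deque` as the data window. This is a simplified illustration, not River's `KNNClassifier`: the class name `WindowKNN`, the Euclidean distance, and the unweighted majority vote are assumptions of the sketch, and the sample values are made up.

```python
import math
from collections import Counter, deque


class WindowKNN:
    def __init__(self, n_neighbors=5, window_size=1000):
        self.n_neighbors = n_neighbors
        # A bounded deque drops the oldest sample once window_size is reached.
        self.window = deque(maxlen=window_size)

    def learn_one(self, x, y):
        self.window.append((x, y))

    def predict_one(self, x):
        if not self.window:
            return None

        def distance(item):
            stored, _ = item
            return math.sqrt(sum((x[f] - stored.get(f, 0.0)) ** 2 for f in x))

        # Step 1: find the closest n_neighbors in the window.
        nearest = sorted(self.window, key=distance)[: self.n_neighbors]
        # Step 2: aggregate their labels (plain majority vote here).
        votes = Counter(label for _, label in nearest)
        return votes.most_common(1)[0][0]


knn = WindowKNN(n_neighbors=3, window_size=4)
for temp, rain in [(10.0, 0), (11.0, 0), (30.0, 1), (31.0, 1), (32.0, 1)]:
    knn.learn_one({"temp": temp}, rain)  # hypothetical values

# The oldest sample (temp=10.0) has already been forgotten.
prediction = knn.predict_one({"temp": 12.0})
```

Because the window holds only four samples, the query at `temp=12.0` is outvoted by the more recent class-1 samples. That forgetting is what lets a windowed KNN follow a changing stream.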

In [9]:
from river.neighbors import KNNClassifier

model = KNNClassifier(n_neighbors=5)
metrics = Metrics(metrics=[Accuracy(),BalancedAccuracy(),GeometricMean(),CohenKappa()])
stream = iter_pandas(X=data[features], y=data['rain'])

progressive_val_score(dataset=stream,
                      model=model,
                      metric=metrics,
                      print_every=1000)

[1,000] Accuracy: 77.18%, BalancedAccuracy: 71.52%, GeometricMean: 69.63%, CohenKappa: 45.32%
[2,000] Accuracy: 78.34%, BalancedAccuracy: 71.07%, GeometricMean: 68.71%, CohenKappa: 44.95%
[3,000] Accuracy: 78.86%, BalancedAccuracy: 70.49%, GeometricMean: 67.63%, CohenKappa: 44.34%
[4,000] Accuracy: 78.27%, BalancedAccuracy: 70.40%, GeometricMean: 67.72%, CohenKappa: 43.85%
[5,000] Accuracy: 78.04%, BalancedAccuracy: 70.34%, GeometricMean: 67.72%, CohenKappa: 43.62%
[6,000] Accuracy: 77.90%, BalancedAccuracy: 70.46%, GeometricMean: 68.00%, CohenKappa: 43.66%
[7,000] Accuracy: 78.20%, BalancedAccuracy: 70.89%, GeometricMean: 68.58%, CohenKappa: 44.44%
[8,000] Accuracy: 77.92%, BalancedAccuracy: 70.80%, GeometricMean: 68.54%, CohenKappa: 44.16%
[9,000] Accuracy: 78.10%, BalancedAccuracy: 71.04%, GeometricMean: 68.78%, CohenKappa: 44.74%
[10,000] Accuracy: 78.14%, BalancedAccuracy: 71.09%, GeometricMean: 68.87%, CohenKappa: 44.78%
[11,000] Accuracy: 78.33%, BalancedAccuracy: 71.29%, Geomet

Accuracy: 77.88%, BalancedAccuracy: 72.01%, GeometricMean: 70.26%, CohenKappa: 46.15%

## Hoeffding Tree
---

Tree-based models are popular due to their interpretability. The [Hoeffding Tree](https://riverml.xyz/0.18.0/api/tree/HoeffdingTreeClassifier/) uses a tree data structure to model the data. When a sample arrives, it traverses the tree until it reaches a leaf node. Internal nodes route a sample based on the values of its features. Leaf nodes are models that provide predictions for unlabeled samples and update their internal state using the labels of labeled samples.
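
The routing described above can be sketched as follows. The example covers only traversal and leaf updates. How and when a Hoeffding Tree decides to split a leaf is deliberately left out, and the class names, the majority-class leaf, and the `dew_pnt` threshold are all hypothetical.

```python
from collections import Counter


class Leaf:
    """Leaf model: here simply the majority class of the samples it has seen."""

    def __init__(self):
        self.counts = Counter()

    def predict(self):
        return self.counts.most_common(1)[0][0] if self.counts else None

    def learn(self, y):
        self.counts[y] += 1


class Split:
    """Internal node: routes a sample by comparing one feature to a threshold."""

    def __init__(self, feature, threshold, left, right):
        self.feature = feature
        self.threshold = threshold
        self.left = left
        self.right = right

    def route(self, x):
        return self.left if x[self.feature] <= self.threshold else self.right


def find_leaf(node, x):
    # Follow internal nodes until a leaf is reached.
    while isinstance(node, Split):
        node = node.route(x)
    return node


# Hand-built one-split tree with a hypothetical threshold.
tree = Split("dew_pnt", 50.0, Leaf(), Leaf())
for dew_pnt, rain in [(40.0, 0), (60.0, 1), (65.0, 1)]:
    find_leaf(tree, {"dew_pnt": dew_pnt}).learn(rain)  # labeled sample: update the leaf

wet = find_leaf(tree, {"dew_pnt": 62.0}).predict()  # unlabeled sample: ask the leaf
dry = find_leaf(tree, {"dew_pnt": 30.0}).predict()
```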

In [10]:
from river.tree import HoeffdingTreeClassifier

model = HoeffdingTreeClassifier()
metrics = Metrics(metrics=[Accuracy(),BalancedAccuracy(),GeometricMean(),CohenKappa()])
stream = iter_pandas(X=data[features], y=data['rain'])

progressive_val_score(dataset=stream,
                      model=model,
                      metric=metrics,
                      print_every=1000)

[1,000] Accuracy: 70.87%, BalancedAccuracy: 71.11%, GeometricMean: 71.10%, CohenKappa: 38.92%
[2,000] Accuracy: 69.73%, BalancedAccuracy: 68.12%, GeometricMean: 68.01%, CohenKappa: 33.45%
[3,000] Accuracy: 70.89%, BalancedAccuracy: 63.00%, GeometricMean: 60.16%, CohenKappa: 26.85%
[4,000] Accuracy: 71.29%, BalancedAccuracy: 61.85%, GeometricMean: 57.40%, CohenKappa: 25.57%
[5,000] Accuracy: 71.79%, BalancedAccuracy: 62.23%, GeometricMean: 57.58%, CohenKappa: 26.59%
[6,000] Accuracy: 72.13%, BalancedAccuracy: 62.56%, GeometricMean: 57.88%, CohenKappa: 27.40%
[7,000] Accuracy: 72.82%, BalancedAccuracy: 64.11%, GeometricMean: 60.42%, CohenKappa: 30.23%
[8,000] Accuracy: 72.58%, BalancedAccuracy: 64.31%, GeometricMean: 60.90%, CohenKappa: 30.45%
[9,000] Accuracy: 72.80%, BalancedAccuracy: 63.98%, GeometricMean: 59.98%, CohenKappa: 30.21%
[10,000] Accuracy: 72.85%, BalancedAccuracy: 63.64%, GeometricMean: 59.32%, CohenKappa: 29.69%
[11,000] Accuracy: 73.30%, BalancedAccuracy: 63.81%, Geomet

Accuracy: 73.55%, BalancedAccuracy: 65.87%, GeometricMean: 62.56%, CohenKappa: 34.07%

## Hoeffding Adaptive Tree
---
The [HAT](https://riverml.xyz/0.18.0/api/tree/HoeffdingAdaptiveTreeClassifier/) model uses [ADWIN](https://riverml.xyz/0.18.0/api/drift/ADWIN/) to detect changes. If a change is detected in a given branch, an alternate branch is created; it eventually replaces the original branch if it performs better on new data.
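
To give an intuition for change detection on a stream of prediction errors, here is a deliberately simplified detector. It is not ADWIN: it keeps a fixed-size window, compares only the older half with the newer half, and signals a change when the two means differ by more than a Hoeffding-style threshold. The class name `TwoWindowDetector` and its parameters are hypothetical.

```python
import math
from collections import deque


class TwoWindowDetector:
    """Simplified stand-in for a drift detector (NOT the ADWIN algorithm)."""

    def __init__(self, window_size=200, delta=0.002):
        self.window = deque(maxlen=window_size)
        self.delta = delta

    def update(self, value):
        """Add one observation (e.g. 0/1 error); return True if a change is flagged."""
        self.window.append(value)
        if len(self.window) < self.window.maxlen:
            return False  # wait until the window is full
        values = list(self.window)
        half = len(values) // 2
        old, new = values[:half], values[half:]
        mean_old = sum(old) / len(old)
        mean_new = sum(new) / len(new)
        # Hoeffding-style bound on the difference between two sample means.
        m = 1.0 / (1.0 / len(old) + 1.0 / len(new))
        epsilon = math.sqrt(math.log(4.0 / self.delta) / (2.0 * m))
        if abs(mean_old - mean_new) > epsilon:
            self.window.clear()  # forget the outdated data
            return True
        return False


# Hypothetical error stream: always correct, then always wrong after index 300.
detector = TwoWindowDetector(window_size=200, delta=0.002)
errors = [0] * 300 + [1] * 300
detections = [i for i, e in enumerate(errors) if detector.update(e)]
```

The detector fires shortly after the error rate jumps. In HAT, a signal of this kind on a branch is what triggers the growth of an alternate branch.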

In [11]:
from river.tree import HoeffdingAdaptiveTreeClassifier

model = HoeffdingAdaptiveTreeClassifier(seed=42)
metrics = Metrics(metrics=[Accuracy(),BalancedAccuracy(),GeometricMean(),CohenKappa()])
stream = iter_pandas(X=data[features], y=data['rain'])

progressive_val_score(dataset=stream, 
                      model=model, 
                      metric=metrics, 
                      print_every=1000)

[1,000] Accuracy: 70.67%, BalancedAccuracy: 70.72%, GeometricMean: 70.72%, CohenKappa: 38.32%
[2,000] Accuracy: 69.88%, BalancedAccuracy: 70.05%, GeometricMean: 70.05%, CohenKappa: 35.85%
[3,000] Accuracy: 69.72%, BalancedAccuracy: 67.38%, GeometricMean: 67.15%, CohenKappa: 32.06%
[4,000] Accuracy: 71.39%, BalancedAccuracy: 67.12%, GeometricMean: 66.31%, CohenKappa: 33.24%
[5,000] Accuracy: 71.97%, BalancedAccuracy: 66.39%, GeometricMean: 64.95%, CohenKappa: 32.83%
[6,000] Accuracy: 72.50%, BalancedAccuracy: 66.95%, GeometricMean: 65.52%, CohenKappa: 34.03%
[7,000] Accuracy: 73.17%, BalancedAccuracy: 67.60%, GeometricMean: 66.19%, CohenKappa: 35.37%
[8,000] Accuracy: 72.88%, BalancedAccuracy: 67.33%, GeometricMean: 65.88%, CohenKappa: 34.91%
[9,000] Accuracy: 72.94%, BalancedAccuracy: 66.52%, GeometricMean: 64.51%, CohenKappa: 33.97%
[10,000] Accuracy: 73.05%, BalancedAccuracy: 66.36%, GeometricMean: 64.21%, CohenKappa: 33.78%
[11,000] Accuracy: 73.63%, BalancedAccuracy: 66.66%, Geomet

Accuracy: 73.43%, BalancedAccuracy: 67.90%, GeometricMean: 66.25%, CohenKappa: 36.72%

## AdaptiveRandomForest
---



The three most important aspects of [ARF](https://riverml.xyz/0.18.0/api/forest/ARFClassifier/) are:
- inducing diversity through re-sampling
- inducing diversity through randomly selecting subsets of features for node splits
- drift detectors per base tree, which cause selective resets in response to drifts

It also allows training background trees, which start training if a warning is detected and replace the active tree if the warning escalates to a drift.
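
Diversity through re-sampling can be illustrated with online bagging: for every incoming sample, each tree draws a weight from a Poisson distribution and trains on the sample that many times. The sketch below shows only this weighting step. The helper names are hypothetical, and the rate `lam=6.0` is an illustrative choice rather than a value taken from this notebook.

```python
import math
import random


def poisson(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k = 0
    p = 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def sample_weights(n_models, lam, rng):
    """One training weight per ensemble member for a single incoming sample."""
    return [poisson(lam, rng) for _ in range(n_models)]


rng = random.Random(42)
# Weights that 3 hypothetical trees would receive for 2,000 consecutive samples.
weights_per_sample = [sample_weights(3, 6.0, rng) for _ in range(2000)]
mean_weight = sum(sum(w) for w in weights_per_sample) / (2000 * 3)
```

Since each tree sees the same stream with different weights, the trees diverge from one another even though they all process every sample.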

In [21]:
from river.forest import ARFClassifier

model = ARFClassifier(n_models=10)
metrics = Metrics(metrics=[Accuracy(),BalancedAccuracy(),GeometricMean(),CohenKappa()])
stream = iter_pandas(X=data[features], y=data['rain'])

progressive_val_score(dataset=stream,
                      model=model,
                      metric=metrics,
                      print_every=1000)

[1,000] Accuracy: 72.67%, BalancedAccuracy: 63.82%, GeometricMean: 58.52%, CohenKappa: 30.79%
[2,000] Accuracy: 75.19%, BalancedAccuracy: 64.95%, GeometricMean: 59.71%, CohenKappa: 33.59%
[3,000] Accuracy: 76.43%, BalancedAccuracy: 65.25%, GeometricMean: 59.63%, CohenKappa: 34.72%
[4,000] Accuracy: 76.67%, BalancedAccuracy: 65.59%, GeometricMean: 59.76%, CohenKappa: 35.73%
[5,000] Accuracy: 76.72%, BalancedAccuracy: 66.08%, GeometricMean: 60.64%, CohenKappa: 36.58%
[6,000] Accuracy: 76.95%, BalancedAccuracy: 67.16%, GeometricMean: 62.63%, CohenKappa: 38.40%
[7,000] Accuracy: 77.48%, BalancedAccuracy: 68.00%, GeometricMean: 63.89%, CohenKappa: 40.02%
[8,000] Accuracy: 77.26%, BalancedAccuracy: 68.22%, GeometricMean: 64.38%, CohenKappa: 40.20%
[9,000] Accuracy: 77.21%, BalancedAccuracy: 67.94%, GeometricMean: 63.79%, CohenKappa: 39.88%
[10,000] Accuracy: 77.46%, BalancedAccuracy: 68.16%, GeometricMean: 64.07%, CohenKappa: 40.38%
[11,000] Accuracy: 77.79%, BalancedAccuracy: 68.47%, Geomet

Accuracy: 77.93%, BalancedAccuracy: 70.56%, GeometricMean: 67.72%, CohenKappa: 44.52%

## StreamingRandomPatches
---
[SRP](https://riverml.xyz/0.18.0/api/ensemble/SRPClassifier/) is an ensemble method that simulates bagging, random subspaces, or both. The default algorithm combines bagging and random subspaces, which is known as Random Patches. The default base estimator is a Hoeffding Tree, but, unlike random forest variants, other base estimators can be used.
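
The random-subspace half of Random Patches can be sketched as follows. The sketch assumes that each ensemble member keeps one fixed feature subset for its lifetime and only ever sees that projection of a sample. The helper names and the subset size of 5 are hypothetical, and the bagging half (per-sample weighting) is omitted.

```python
import random

FEATURES = [
    "temp", "dew_pnt", "sea_lvl_press", "visibility",
    "avg_wind_spd", "max_sustained_wind_spd", "max_temp", "min_temp",
]


def draw_subspaces(features, n_models, subspace_size, seed):
    """Pick one random feature subset per ensemble member."""
    rng = random.Random(seed)
    return [sorted(rng.sample(features, subspace_size)) for _ in range(n_models)]


def project(x, subspace):
    """Keep only the features that belong to a member's subspace."""
    return {f: x[f] for f in subspace}


subspaces = draw_subspaces(FEATURES, n_models=3, subspace_size=5, seed=42)

# Hypothetical sample: each member receives its own view of it.
x = {f: float(i) for i, f in enumerate(FEATURES)}
views = [project(x, s) for s in subspaces]
```

Because the projection happens before the base estimator sees the sample, the estimator needs no knowledge of subspaces, which is why any incremental classifier can be plugged in.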

In [22]:
from river.ensemble import SRPClassifier
from river.tree import HoeffdingTreeClassifier

model = SRPClassifier(model=HoeffdingTreeClassifier(),
                      n_models=10,
                      seed=42)
metrics = Metrics(metrics=[Accuracy(),BalancedAccuracy(),GeometricMean(),CohenKappa()])
stream = iter_pandas(X=data[features], y=data['rain'])

progressive_val_score(dataset=stream, 
                      model=model, 
                      metric=metrics, 
                      print_every=1000)

[1,000] Accuracy: 72.97%, BalancedAccuracy: 65.15%, GeometricMean: 61.14%, CohenKappa: 33.02%
[2,000] Accuracy: 75.09%, BalancedAccuracy: 65.36%, GeometricMean: 60.68%, CohenKappa: 34.10%
[3,000] Accuracy: 76.53%, BalancedAccuracy: 65.39%, GeometricMean: 59.82%, CohenKappa: 35.02%
[4,000] Accuracy: 76.84%, BalancedAccuracy: 66.08%, GeometricMean: 60.65%, CohenKappa: 36.63%
[5,000] Accuracy: 76.92%, BalancedAccuracy: 66.32%, GeometricMean: 60.95%, CohenKappa: 37.13%
[6,000] Accuracy: 77.28%, BalancedAccuracy: 67.00%, GeometricMean: 61.96%, CohenKappa: 38.52%
[7,000] Accuracy: 77.81%, BalancedAccuracy: 67.72%, GeometricMean: 63.02%, CohenKappa: 40.00%
[8,000] Accuracy: 77.60%, BalancedAccuracy: 67.90%, GeometricMean: 63.44%, CohenKappa: 40.13%
[9,000] Accuracy: 77.72%, BalancedAccuracy: 68.02%, GeometricMean: 63.48%, CohenKappa: 40.54%
[10,000] Accuracy: 77.85%, BalancedAccuracy: 68.17%, GeometricMean: 63.71%, CohenKappa: 40.80%
[11,000] Accuracy: 78.16%, BalancedAccuracy: 68.37%, Geomet

Accuracy: 78.16%, BalancedAccuracy: 70.02%, GeometricMean: 66.51%, CohenKappa: 44.12%

## Concept Drift Impact

Concept drift can negatively impact learning methods if not properly handled. Many real-world applications suffer from **model degradation** because their models cannot adapt to changes in the data.
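
Degradation is easy to miss when a metric is averaged over the whole stream, which is why the cells in this part wrap the metrics in `Rolling` with a window of 500 samples. The standard-library sketch below contrasts a cumulative accuracy with a windowed one; the function name and the sequence of correct/incorrect flags are hypothetical.

```python
from collections import deque


def cumulative_and_rolling(correct_flags, window_size):
    """Track accuracy over the whole stream and over the last window_size samples."""
    window = deque(maxlen=window_size)
    hits = 0
    cumulative, rolling = [], []
    for i, flag in enumerate(correct_flags, start=1):
        hits += flag
        window.append(flag)
        cumulative.append(hits / i)
        rolling.append(sum(window) / len(window))
    return cumulative, rolling


# Hypothetical run: the model is right for 900 samples, then wrong after a drift.
flags = [1] * 900 + [0] * 100
cum, roll = cumulative_and_rolling(flags, window_size=100)
final_cumulative, final_rolling = cum[-1], roll[-1]
```

At the end of this made-up run the cumulative accuracy is still 90%, while the windowed accuracy has dropped to 0%, exposing the degradation.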

---
## `AGRAWAL` dataset

We will load the data from a csv file. The data was generated using the `AGRAWAL` data generator with 3 **gradual drifts** at the 5k, 10k, and 15k marks. It contains 9 features, 6 numeric and 3 categorical.

There are 10 functions for generating binary class labels from the features. These functions determine whether a **loan** should be approved.

| Feature    | Description            | Values                                                                |
|------------|------------------------|-----------------------------------------------------------------------|
| `salary`     | salary                 | uniformly distributed from 20k to 150k                                |
| `commission` | commission             | if (salary <   75k) then 0 else uniformly distributed from 10k to 75k |
| `age`        | age                    | uniformly distributed from 20 to 80                                   |
| `elevel`     | education level        | uniformly chosen from 0 to 4                                          |
| `car`        | car maker              | uniformly chosen from 1 to 20                                         |
| `zipcode`    | zip code of the town   | uniformly chosen from 0 to 8                                          |
| `hvalue`     | value of the house     | uniformly distributed from 50k x zipcode to 100k x zipcode            |
| `hyears`     | years house owned      | uniformly distributed from 1 to 30                                    |
| `loan`       | total loan amount      | uniformly distributed from 0 to 500k                                  |

**Class:** `class` | 0: no loan, 1: loan
 
**Samples:** 20,000

`elevel`, `car`, and `zipcode` are categorical features.
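
As a reading aid for the table above, the sketch below draws one feature dictionary following the listed ranges literally. It is not the `AGRAWAL` generator itself: the function name is hypothetical, treating `age` and `hyears` as integers is an assumption, and the labeling functions and drifts are not reproduced.

```python
import random


def sample_agrawal_features(rng):
    """Draw one sample whose features follow the ranges listed in the table."""
    salary = rng.uniform(20_000, 150_000)
    # Commission is zero below a 75k salary, otherwise uniform in [10k, 75k].
    commission = 0.0 if salary < 75_000 else rng.uniform(10_000, 75_000)
    zipcode = rng.randint(0, 8)
    return {
        "salary": salary,
        "commission": commission,
        "age": rng.randint(20, 80),
        "elevel": rng.randint(0, 4),   # categorical
        "car": rng.randint(1, 20),     # categorical
        "zipcode": zipcode,            # categorical
        # House value depends on the zip code, as stated in the table.
        "hvalue": rng.uniform(50_000 * zipcode, 100_000 * zipcode),
        "hyears": rng.randint(1, 30),
        "loan": rng.uniform(0, 500_000),
    }


rng = random.Random(42)
samples = [sample_agrawal_features(rng) for _ in range(1000)]
```

Note the two dependencies between features: `commission` depends on `salary`, and `hvalue` depends on `zipcode`.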

In [23]:
data = pd.read_csv("../datasets/agr_a_20k.csv")
features = data.columns[:-1]

## Naïve Bayes

In [13]:
from river.naive_bayes import GaussianNB

model = GaussianNB()
metrics = Rolling(Metrics(metrics=[Accuracy(),BalancedAccuracy(),GeometricMean(),CohenKappa()]),window_size=500)
stream = iter_pandas(X=data[features], y=data['class'])

progressive_val_score(dataset=stream,
                      model=model,
                      metric=metrics,
                      print_every=500)


[500] Accuracy: 81.76%, BalancedAccuracy: 74.69%, GeometricMean: 71.56%, CohenKappa: 54.66%
[1,000] Accuracy: 86.20%, BalancedAccuracy: 79.37%, GeometricMean: 76.72%, CohenKappa: 65.37%
[1,500] Accuracy: 87.80%, BalancedAccuracy: 82.87%, GeometricMean: 81.07%, CohenKappa: 71.19%
[2,000] Accuracy: 89.40%, BalancedAccuracy: 84.41%, GeometricMean: 82.96%, CohenKappa: 74.45%
[2,500] Accuracy: 89.80%, BalancedAccuracy: 84.49%, GeometricMean: 83.16%, CohenKappa: 74.70%
[3,000] Accuracy: 87.00%, BalancedAccuracy: 80.54%, GeometricMean: 78.15%, CohenKappa: 67.64%
[3,500] Accuracy: 87.60%, BalancedAccuracy: 79.87%, GeometricMean: 77.29%, CohenKappa: 67.25%
[4,000] Accuracy: 90.80%, BalancedAccuracy: 85.44%, GeometricMean: 84.19%, CohenKappa: 76.91%
[4,500] Accuracy: 87.00%, BalancedAccuracy: 80.91%, GeometricMean: 78.70%, CohenKappa: 68.03%
[5,000] Accuracy: 86.80%, BalancedAccuracy: 80.98%, GeometricMean: 78.86%, CohenKappa: 67.87%
[5,500] Accuracy: 47.60%, BalancedAccuracy: 52.32%, GeometricM

Accuracy: 57.80%, BalancedAccuracy: 60.46%, GeometricMean: 59.40%, CohenKappa: 18.97%

## Hoeffding Tree

In [15]:
from river.tree import HoeffdingTreeClassifier

model = HoeffdingTreeClassifier(nominal_attributes=['elevel', 'car', 'zipcode'])
metrics = Rolling(Metrics(metrics=[Accuracy(),BalancedAccuracy(),GeometricMean(),CohenKappa()]),window_size=500)
stream = iter_pandas(X=data[features], y=data['class'])

progressive_val_score(dataset=stream,
                      model=model,
                      metric=metrics,
                      print_every=500)


[500] Accuracy: 79.76%, BalancedAccuracy: 72.89%, GeometricMean: 69.86%, CohenKappa: 50.18%
[1,000] Accuracy: 83.60%, BalancedAccuracy: 75.91%, GeometricMean: 72.37%, CohenKappa: 58.36%
[1,500] Accuracy: 78.80%, BalancedAccuracy: 76.25%, GeometricMean: 75.74%, CohenKappa: 53.18%
[2,000] Accuracy: 87.00%, BalancedAccuracy: 90.15%, GeometricMean: 89.61%, CohenKappa: 73.49%
[2,500] Accuracy: 83.20%, BalancedAccuracy: 87.61%, GeometricMean: 86.73%, CohenKappa: 66.16%
[3,000] Accuracy: 88.60%, BalancedAccuracy: 91.44%, GeometricMean: 91.04%, CohenKappa: 76.38%
[3,500] Accuracy: 89.00%, BalancedAccuracy: 91.51%, GeometricMean: 91.28%, CohenKappa: 76.29%
[4,000] Accuracy: 91.40%, BalancedAccuracy: 92.69%, GeometricMean: 92.63%, CohenKappa: 81.10%
[4,500] Accuracy: 89.80%, BalancedAccuracy: 91.43%, GeometricMean: 91.29%, CohenKappa: 78.43%
[5,000] Accuracy: 90.80%, BalancedAccuracy: 92.17%, GeometricMean: 92.07%, CohenKappa: 80.49%
[5,500] Accuracy: 55.20%, BalancedAccuracy: 55.87%, GeometricM

Accuracy: 66.40%, BalancedAccuracy: 67.02%, GeometricMean: 66.97%, CohenKappa: 32.35%

## Hoeffding Adaptive Tree

In [16]:
from river.tree import HoeffdingAdaptiveTreeClassifier

model = HoeffdingAdaptiveTreeClassifier(nominal_attributes=['elevel', 'car', 'zipcode'], seed=42)
metrics = Rolling(Metrics(metrics=[Accuracy(),BalancedAccuracy(),GeometricMean(),CohenKappa()]),window_size=500)
stream = iter_pandas(X=data[features], y=data['class'])

progressive_val_score(dataset=stream, 
                      model=model, 
                      metric=metrics, 
                      print_every=500)

[500] Accuracy: 81.36%, BalancedAccuracy: 74.69%, GeometricMean: 71.90%, CohenKappa: 54.12%
[1,000] Accuracy: 83.80%, BalancedAccuracy: 78.03%, GeometricMean: 76.11%, CohenKappa: 60.55%
[1,500] Accuracy: 90.40%, BalancedAccuracy: 91.92%, GeometricMean: 91.77%, CohenKappa: 80.02%
[2,000] Accuracy: 91.80%, BalancedAccuracy: 92.79%, GeometricMean: 92.74%, CohenKappa: 82.41%
[2,500] Accuracy: 90.60%, BalancedAccuracy: 92.74%, GeometricMean: 92.55%, CohenKappa: 79.88%
[3,000] Accuracy: 92.20%, BalancedAccuracy: 93.55%, GeometricMean: 93.46%, CohenKappa: 83.24%
[3,500] Accuracy: 93.00%, BalancedAccuracy: 94.40%, GeometricMean: 94.33%, CohenKappa: 84.39%
[4,000] Accuracy: 95.20%, BalancedAccuracy: 96.49%, GeometricMean: 96.43%, CohenKappa: 89.33%
[4,500] Accuracy: 92.20%, BalancedAccuracy: 93.10%, GeometricMean: 93.05%, CohenKappa: 83.18%
[5,000] Accuracy: 92.40%, BalancedAccuracy: 93.94%, GeometricMean: 93.82%, CohenKappa: 83.88%
[5,500] Accuracy: 50.60%, BalancedAccuracy: 52.76%, GeometricM

Accuracy: 78.20%, BalancedAccuracy: 79.66%, GeometricMean: 79.42%, CohenKappa: 56.23%

## AdaptiveRandomForest

In [24]:
from river.forest import ARFClassifier

model = ARFClassifier(n_models=10,nominal_attributes=['elevel', 'car', 'zipcode'])
metrics = Rolling(Metrics(metrics=[Accuracy(),BalancedAccuracy(),GeometricMean(),CohenKappa()]),window_size=500)
stream = iter_pandas(X=data[features], y=data['class'])

progressive_val_score(dataset=stream,
                      model=model,
                      metric=metrics,
                      print_every=500)

[500] Accuracy: 71.94%, BalancedAccuracy: 65.23%, GeometricMean: 61.98%, CohenKappa: 32.59%
[1,000] Accuracy: 78.60%, BalancedAccuracy: 72.77%, GeometricMean: 70.67%, CohenKappa: 48.56%
[1,500] Accuracy: 79.20%, BalancedAccuracy: 73.30%, GeometricMean: 70.38%, CohenKappa: 50.67%
[2,000] Accuracy: 80.40%, BalancedAccuracy: 73.74%, GeometricMean: 70.75%, CohenKappa: 52.09%
[2,500] Accuracy: 80.40%, BalancedAccuracy: 72.83%, GeometricMean: 69.65%, CohenKappa: 50.42%
[3,000] Accuracy: 82.20%, BalancedAccuracy: 75.44%, GeometricMean: 72.65%, CohenKappa: 55.98%
[3,500] Accuracy: 82.40%, BalancedAccuracy: 74.85%, GeometricMean: 72.22%, CohenKappa: 54.62%
[4,000] Accuracy: 87.60%, BalancedAccuracy: 82.42%, GeometricMean: 81.21%, CohenKappa: 69.33%
[4,500] Accuracy: 85.20%, BalancedAccuracy: 80.57%, GeometricMean: 79.29%, CohenKappa: 64.89%
[5,000] Accuracy: 83.80%, BalancedAccuracy: 77.86%, GeometricMean: 75.56%, CohenKappa: 60.75%
[5,500] Accuracy: 49.20%, BalancedAccuracy: 53.10%, GeometricM

Accuracy: 69.60%, BalancedAccuracy: 63.31%, GeometricMean: 57.42%, CohenKappa: 29.25%

## StreamingRandomPatches
---
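Here we set the drift and warning detectors explicitly: both are `ADWIN` instances, with `delta=0.001` for drift detection and `delta=0.01` for warning detection.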

In [25]:
from river.ensemble import SRPClassifier
from river.tree import HoeffdingTreeClassifier
from river.drift import ADWIN

model = SRPClassifier(model=HoeffdingTreeClassifier(nominal_attributes=['elevel', 'car', 'zipcode']),
                      n_models=10,
                      drift_detector=ADWIN(delta=0.001),
                      warning_detector=ADWIN(delta=0.01),
                      seed=42)

metrics = Rolling(Metrics(metrics=[Accuracy(),BalancedAccuracy(),GeometricMean(),CohenKappa()]),window_size=500)
stream = iter_pandas(X=data[features], y=data['class'])

progressive_val_score(dataset=stream, 
                      model=model, 
                      metric=metrics, 
                      print_every=500)

[500] Accuracy: 88.58%, BalancedAccuracy: 85.31%, GeometricMean: 84.74%, CohenKappa: 73.36%
[1,000] Accuracy: 94.20%, BalancedAccuracy: 93.08%, GeometricMean: 93.02%, CohenKappa: 86.82%
[1,500] Accuracy: 93.40%, BalancedAccuracy: 92.11%, GeometricMean: 92.00%, CohenKappa: 85.40%
[2,000] Accuracy: 94.60%, BalancedAccuracy: 93.48%, GeometricMean: 93.42%, CohenKappa: 87.85%
[2,500] Accuracy: 93.60%, BalancedAccuracy: 93.00%, GeometricMean: 92.98%, CohenKappa: 85.44%
[3,000] Accuracy: 93.80%, BalancedAccuracy: 92.81%, GeometricMean: 92.76%, CohenKappa: 86.00%
[3,500] Accuracy: 93.60%, BalancedAccuracy: 93.03%, GeometricMean: 93.02%, CohenKappa: 85.15%
[4,000] Accuracy: 96.20%, BalancedAccuracy: 96.03%, GeometricMean: 96.03%, CohenKappa: 91.28%
[4,500] Accuracy: 93.60%, BalancedAccuracy: 92.56%, GeometricMean: 92.50%, CohenKappa: 85.62%
[5,000] Accuracy: 93.80%, BalancedAccuracy: 92.90%, GeometricMean: 92.86%, CohenKappa: 86.17%
[5,500] Accuracy: 57.60%, BalancedAccuracy: 56.61%, GeometricM

Accuracy: 65.00%, BalancedAccuracy: 55.39%, GeometricMean: 37.54%, CohenKappa: 12.69%