This notebook gives a few examples of how to use the ```CascadeSVC``` class. As shown below, it is used the same way as the ```SVC``` class from ```scikit-learn```, on which it is based: it can be plugged into a ```Pipeline```, its hyperparameters can be tuned with ```GridSearchCV```, and so on.

Below are the required imports to run this notebook.

In [6]:
import pandas as pd
from sklearn.datasets import make_moons
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV
from cascadesvc.cascadesvc import CascadeSVC
from sklearn.metrics import roc_auc_score, make_scorer
from lets_plot import LetsPlot, ggplot, aes, geom_point
LetsPlot.setup_html()

Let us generate a dataset containing 100,000 samples, 2 continuous features and a binary response, using the ```make_moons``` function.

In [7]:
X, y = make_moons(n_samples = 100000, noise = 0.15)
data = pd.DataFrame(X,columns=["x0","x1"]).assign(y=y.astype("str"))
ggplot(data,aes(x="x0",y="x1",color="y"))+geom_point(alpha=0.5)

It is often recommended to scale the data before fitting an SVM; ```scikit-learn``` provides a convenient way to do so with the ```Pipeline``` class. The SVM can be trained as shown below.

In [8]:
svc = Pipeline(
    [
        ("scaler", StandardScaler()),
        ("svm", SVC(probability=True, kernel="rbf", C=1, gamma=1))
    ]
)
svc.fit(X, y)

A Cascade SVM can be trained the same way:

In [9]:
svc = Pipeline(
    [
        ("scaler", StandardScaler()),
        ("svm", CascadeSVC(probability=True, kernel="rbf", C=1, gamma=1))
    ]
)
svc.fit(X, y)

Cascade layer 1
Total number of instances: 100000
Cascade layer 2
Done


In this example, with this set of hyperparameters, training the ```CascadeSVC``` pipeline takes 18 times less time than training the ```SVC``` pipeline.
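A comparison of this kind can be measured with a small timing helper. The sketch below is only an illustration: ```time_fit``` and ```_SleepyEstimator``` are hypothetical names that do not come from ```cascadesvc``` or ```scikit-learn```, and the stand-in estimator merely sleeps so that the example runs without any dataset.

```python
import time


def time_fit(estimator, X, y):
    """Return the wall-clock seconds taken by estimator.fit(X, y)."""
    start = time.perf_counter()
    estimator.fit(X, y)
    return time.perf_counter() - start


class _SleepyEstimator:
    """Hypothetical stand-in exposing a fit method; it only sleeps."""

    def __init__(self, seconds):
        self.seconds = seconds

    def fit(self, X, y):
        time.sleep(self.seconds)
        return self


# Time a "slow" and a "fast" estimator, then report the ratio.
slow = time_fit(_SleepyEstimator(0.2), None, None)
fast = time_fit(_SleepyEstimator(0.01), None, None)
speedup = slow / fast
print(f"speed-up: {speedup:.1f}x")
```

In the notebook, the two pipelines built above would be passed in place of the stand-ins. Both are currently bound to the name ```svc```, so they would need distinct names for the comparison. A single run is noisy, so repeating the measurement a few times gives a more reliable ratio.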

Now, let us tune it with ```GridSearchCV```.

In [10]:
auc = make_scorer(roc_auc_score, response_method="predict_proba")
gridsearch = GridSearchCV(svc, param_grid = {"svm__C": [0.1,1,10], "svm__gamma": [0.1,1,10]}, scoring = auc, verbose = 1, n_jobs = -1)
gridsearch.fit(X, y)
res = pd.DataFrame({
    "C" : gridsearch.cv_results_["param_svm__C"],
    "gamma" : gridsearch.cv_results_["param_svm__gamma"],
    "mean_test_score" : gridsearch.cv_results_["mean_test_score"],
    "std_test_score" : gridsearch.cv_results_["std_test_score"]
})
print(res)

Fitting 5 folds for each of 9 candidates, totalling 45 fits
Cascade layer 1
Total number of instances: 100000
Cascade layer 2
Done
      C  gamma  mean_test_score  std_test_score
0   0.1    0.1         0.671417        0.003880
1   0.1    1.0         0.999491        0.000084
2   0.1   10.0         0.995811        0.000759
3   1.0    0.1         0.999434        0.000082
4   1.0    1.0         0.999372        0.000094
5   1.0   10.0         0.996301        0.000868
6  10.0    0.1         0.999445        0.000082
7  10.0    1.0         0.996724        0.001294
8  10.0   10.0         0.997104        0.000764


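A maximum mean AUC of 0.999491 was obtained for hyperparameter values C = 0.1 and gamma = 1. Several other candidates, including C = 10 and gamma = 0.1 (0.999445), lie within one standard deviation of this score. Since ```GridSearchCV``` refits the best candidate on the full dataset, ```gridsearch.best_estimator_``` can now be used to predict values and get metrics for new data.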
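The ranking can be double-checked from the printed table. The sketch below copies the scores by hand into a plain list (an assumption made to keep it self-contained); in the notebook itself, ```gridsearch.best_params_``` and ```gridsearch.best_score_``` give the top candidate directly.

```python
# Rows are (C, gamma, mean_test_score, std_test_score),
# copied by hand from the table printed above.
results = [
    (0.1, 0.1, 0.671417, 0.003880),
    (0.1, 1.0, 0.999491, 0.000084),
    (0.1, 10.0, 0.995811, 0.000759),
    (1.0, 0.1, 0.999434, 0.000082),
    (1.0, 1.0, 0.999372, 0.000094),
    (1.0, 10.0, 0.996301, 0.000868),
    (10.0, 0.1, 0.999445, 0.000082),
    (10.0, 1.0, 0.996724, 0.001294),
    (10.0, 10.0, 0.997104, 0.000764),
]

# Candidate with the highest mean cross-validated AUC.
best = max(results, key=lambda row: row[2])
print(f"best: C={best[0]}, gamma={best[1]}, mean AUC={best[2]}")

# Candidates whose mean AUC is within one standard deviation of the best.
close = [(c, g) for c, g, mean, _ in results if best[2] - mean <= best[3]]
print("within one std of the best:", close)
```

The second list is a reminder that the top few candidates are practically tied on this dataset, so the exact winner may change from one run to the next.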

In [12]:
Xtest, ytest = make_moons(n_samples = 100000, noise = 0.15)
prob = gridsearch.best_estimator_.predict_proba(Xtest)[:,1]
print("Confusion matrix:")
print(pd.crosstab(ytest, prob>0.5))
print(f"AUC: {roc_auc_score(ytest, prob)}")

Confusion matrix:
col_0  False  True 
row_0              
0      49532    468
1        453  49547
AUC: 0.999505427


The AUC obtained on this test set is close to the mean AUC previously obtained through cross-validation.
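The confusion matrix printed above can also be summarised with a few threshold-dependent metrics. The sketch below is a worked calculation on the four counts, copied by hand from that output; the variable names are illustrative.

```python
# Counts copied from the confusion matrix above
# (rows: true class, columns: prob > 0.5).
tn, fp = 49532, 468   # true class 0
fn, tp = 453, 49547   # true class 1

total = tn + fp + fn + tp
accuracy = (tp + tn) / total      # share of correct predictions
sensitivity = tp / (tp + fn)      # true positive rate
specificity = tn / (tn + fp)      # true negative rate

print(f"accuracy:    {accuracy:.5f}")
print(f"sensitivity: {sensitivity:.5f}")
print(f"specificity: {specificity:.5f}")
```

Unlike the AUC, these three values depend on the 0.5 threshold applied to the predicted probabilities.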