# Training with a Searched Policy Example

In this example, we'll demonstrate how to train a model using a policy schedule.
 
We'll start by loading the data, one-hot encoding the labels in the format required by `PyBoost`, training a base classifier for initial weights, and initializing the model.

In [1]:
# import packages
from sklearn.preprocessing import OneHotEncoder
from py_boost import GradientBoosting
import numpy as np
import copy
from sklearn.preprocessing import PolynomialFeatures
from gcforest import *
from data.dataset import *
from sklearn.model_selection import KFold, StratifiedKFold
import random
import ray
from hyperparameters import *

## load data
For simplicity, all datasets used in our experiments are loaded through a single `get_data` function. Due to space constraints, only two datasets are included: `adult` and `kdd`. You can load one like this:

In [2]:
# get data
X_train, y_train, X_test, y_test=get_data('adult')

## initialize model 
The initialization of the model consists of two steps. First, we perform one-hot encoding of the labels for use by the base classifier `PyBoost`. Then, we train a PyBoost model to obtain the necessary weight values for initialization.

In [3]:
# one-hot encoding
one_hot = OneHotEncoder(sparse_output=False).fit(y_train.reshape(-1,1))
# get weights
clf = GradientBoosting('bce', lr=0.3, colsample=0.3, verbose=-1)
clf.fit(X_train, one_hot.transform(y_train.reshape(-1,1)))
weights = clf.get_feature_importance()
weights = [w / sum(weights) for w in weights]
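
As a standalone illustration of the two preprocessing steps in the cell above, the sketch below one-hot encodes a small label list and rescales a list of importance scores so that they sum to 1. It uses plain Python and made-up values; `one_hot_encode` and `normalize` are hypothetical helper names, not part of `py_boost` or `gcforest`.

```python
def one_hot_encode(labels):
    """Return (classes, rows), where each row is the one-hot vector of one label."""
    classes = sorted(set(labels))
    index = {c: i for i, c in enumerate(classes)}
    rows = [
        [1.0 if index[y] == j else 0.0 for j in range(len(classes))]
        for y in labels
    ]
    return classes, rows


def normalize(weights):
    """Rescale non-negative scores so that they sum to 1."""
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    return [w / total for w in weights]


labels = [0, 1, 1, 0]                 # made-up binary labels
classes, encoded = one_hot_encode(labels)
importances = [2.0, 1.0, 1.0]         # made-up feature importances
weights = normalize(importances)

print(classes)     # [0, 1]
print(encoded[1])  # [0.0, 1.0]
print(weights)     # [0.5, 0.25, 0.25]
```

The normalization computes the total once, which avoids re-summing the list for every element.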

## train model
To train the model, we assume that the boosting parameters for each layer were already obtained during model initialization. We then call the `fit` function for training. Note that our `fit` function does not support an evaluation set.

In [4]:
# searched policy schedule
aug_policy_schedule = aug_schedule['adult']
# initialize gcForest with the searched schedule (num_classes=2: adult is binary)
df = gcForest(
    encoder=one_hot, weights=weights, classifier='sketch',
    means=np.mean(X_train, axis=0), std=np.std(X_train, axis=0),
    random_state=42, num_estimator=150, num_forests=1, max_features=0.3,
    num_classes=2, n_fold=5, max_layer=15,
    aug_policy_schedule=aug_policy_schedule, aug_type='cutmix',
)
# fit gcForest
df.fit(X_train, y_train)

layer index:0
val  acc:86.88615214520439 
layer index:1
val  acc:86.6558152390897 
layer index:2
val  acc:86.26577807806885 
layer index:3
val  acc:86.96600227265748 
layer index:4
val  acc:86.62203249285956 
layer index:5
val  acc:86.50532846042812 
layer index:6
val  acc:86.4500476029606 
layer index:7
val  acc:85.9433064095083 
layer index:8
val  acc:85.62083474094776 
layer index:9
val  acc:86.67117103283069 
layer index:10
val  acc:86.92914836767912 
layer index:11
val  acc:86.92607720893093 
layer index:12
val  acc:86.91379257393814 
layer index:13
val  acc:86.99057154264304 
layer index:14
val  acc:86.95985995516108 
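
Before passing a schedule to the model, a quick structural check can catch mistakes early. The sketch below assumes that a schedule is a list with one `(aug_type, value, value)` tuple per layer, which matches the searched policy printed at the end of this example; the meaning of the two numeric fields is not specified here, and `check_schedule` is a hypothetical helper, not part of `gcforest`.

```python
def check_schedule(schedule, max_layer):
    """Validate an assumed layout: one (aug_type, number, number) tuple per layer."""
    if len(schedule) != max_layer:
        raise ValueError(f"expected {max_layer} entries, got {len(schedule)}")
    for layer, entry in enumerate(schedule):
        if len(entry) != 3:
            raise ValueError(f"layer {layer}: expected 3 fields, got {len(entry)}")
        aug_type, first, second = entry
        if not isinstance(aug_type, str):
            raise TypeError(f"layer {layer}: aug_type must be a string")
        if not all(isinstance(v, (int, float)) for v in (first, second)):
            raise TypeError(f"layer {layer}: numeric fields expected")
    return True


# Toy three-layer schedule with made-up values.
toy_schedule = [("cutmix", 0, 0.4), ("cutmix", 0.05, 0.7), ("cutmix", 0.1, 0.6)]
print(check_schedule(toy_schedule, max_layer=3))  # True
```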


## Making Predictions

You can use the model for predictions using the `predict` and `predict_proba` functions, similar to Scikit-Learn:

In [5]:
# predict 
y_pred=df.predict(X_test)
y_pred_proba=df.predict_proba(X_test)

## Evaluating the Model

To evaluate the model, you can use the custom `score` function, which reports the accuracy of each layer and of the overall ensemble. To apply other evaluation criteria, take the raw outputs from the `predict_proba` function and compute the metric yourself.


In [6]:
from sklearn.metrics import accuracy_score
accuracy = accuracy_score(y_test, y_pred)
print("Model accuracy:", accuracy)

Model accuracy: 0.8753147841041705
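
As an example of another evaluation criterion built on raw probabilities, the sketch below computes the log loss in plain Python. It assumes that column `k` of the probability matrix corresponds to class label `k`, and it renormalizes each row in case the rows do not sum exactly to 1. The labels and probabilities are made up; `log_loss` here is a local helper, not the model's API.

```python
import math


def log_loss(y_true, proba, eps=1e-15):
    """Mean negative log-likelihood of the true class after row normalization."""
    total = 0.0
    for y, row in zip(y_true, proba):
        p = row[y] / sum(row)            # renormalize the row
        p = min(max(p, eps), 1 - eps)    # clip to avoid log(0)
        total -= math.log(p)
    return total / len(y_true)


y_true = [0, 1]                          # made-up labels
proba = [[0.8, 0.2], [0.5, 0.5]]         # made-up class probabilities
print(round(log_loss(y_true, proba), 4))  # 0.4581
```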


In [8]:
# alternatively, use the model's own score function
df.score(X_test, y_test)

layer  0
test acc:  87.41477796204164
ensemble acc:  87.53147841041705
layer  1
test acc:  87.34721454456114
ensemble acc:  87.34721454456114
layer  2
test acc:  87.44548860635096
ensemble acc:  87.46391499293655
layer  3
test acc:  87.36564093114674
ensemble acc:  87.51919415269333
layer  4
test acc:  87.33493028683742
ensemble acc:  87.46391499293655
layer  5
test acc:  87.39635157545605
ensemble acc:  87.44548860635096
layer  6
test acc:  87.28579325594251
ensemble acc:  87.46391499293655
layer  7
test acc:  87.44548860635096
ensemble acc:  87.47005712179842
layer  8
test acc:  87.05853448805357
ensemble acc:  87.49462563724587
layer  9
test acc:  87.42706221976538
ensemble acc:  87.5069098949696
layer  10
test acc:  87.43320434862724
ensemble acc:  87.46391499293655
layer  11
test acc:  87.51919415269333
ensemble acc:  87.48234137952214
layer  12
test acc:  87.41477796204164
ensemble acc:  87.47619925066029
layer  13
test acc:  87.44548860635096
ensemble acc:  87.51305202383146
lay

array([0, 0, 0, ..., 1, 0, 1])

## grid search
The following demonstrates how to run a grid search for each layer. The grid search finds the best boosting parameters for the current layer. Our implementation searches one layer at a time, so the search is greedy and may not yield the best overall result.

We have implemented two equivalent versions that differ only in whether they use the parallel computing library `ray` to speed up the search.

In [16]:
# search the best policy for one layer without ray
df = gcForest(encoder=one_hot, weights=weights, classifier='sketch',means=np.mean(X_train, axis=0),std=np.std(X_train, axis=0), random_state=42, num_estimator=150, num_forests=1, max_features=0.3, num_classes=2, n_fold=5, max_layer=15, aug_type='cutmix')
df.load_data(X_train,y_train) 
df.get_best_policy(X_train, y_train)  # or df.get_best_policy()


--------------
layer 0,   X_train shape:(32561, 14)...
 
cutmix   0   0.1   87.14105832130463  
cutmix   0   0.2   87.14105832130463  
cutmix   0   0.3   87.14105832130463  
cutmix   0   0.4   87.14105832130463  
cutmix   0   0.5   87.14105832130463  
cutmix   0   0.6   87.14105832130463  
cutmix   0   0.7   87.14105832130463  
cutmix   0.05   0.2   87.13184484506003  
cutmix   0.05   0.3   87.20862381376493  
cutmix   0.05   0.4   87.17791222628297  
cutmix   0.05   0.5   87.15027179754922  
cutmix   0.05   0.6   87.36525290992292  
cutmix   0.05   0.7   87.32532784619636  
cutmix   0.1   0.2   87.07963514634072  
cutmix   0.1   0.3   87.1594852737938  
cutmix   0.1   0.4   87.17176990878659  
cutmix   0.1   0.5   86.98750038389484  
cutmix   0.1   0.6   87.15334295629741  
cutmix   0.1   0.7   87.07349282884432  
cutmix   0.2   0.4   87.30382973495901  
cutmix   0.2   0.5   87.23933540124689  
cutmix   0.2   0.6   87.18405454377937  
cutmix   0.2   0.7   87.10113325757808  
cutmix  

('cutmix', 0.05, 0.6)
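
The layer-by-layer search can be sketched in a few lines. This is a toy illustration of greedy grid search, not the `gcForest` implementation: `greedy_search` and `toy_evaluate` are hypothetical names, the grid values are made up, and the scoring function stands in for the cross-validated accuracy that the real search prints.

```python
from itertools import product


def greedy_search(num_layers, grid, evaluate):
    """Pick the best-scoring candidate for each layer in turn.

    Earlier picks are fixed and passed to `evaluate`, so later layers
    cannot revise them; this is what makes the search greedy.
    """
    chosen = []
    for layer in range(num_layers):
        best = max(grid, key=lambda cand: evaluate(layer, chosen, cand))
        chosen.append(best)
    return chosen


def toy_evaluate(layer, chosen, cand):
    """Made-up score that peaks at (0.1 * layer, 0.5); higher is better."""
    first, second = cand
    return -abs(first - 0.1 * layer) - abs(second - 0.5)


grid = list(product([0, 0.1, 0.2], [0.3, 0.5, 0.7]))
print(greedy_search(3, grid, toy_evaluate))
# [(0, 0.5), (0.1, 0.5), (0.2, 0.5)]
```

Because each layer is optimized only against the layers already chosen, a combination that scores slightly worse at an early layer but better overall is never explored.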

In [17]:
# search the best policy for one layer with ray
df = gcForest(encoder=one_hot, weights=weights, classifier='sketch',means=np.mean(X_train, axis=0),std=np.std(X_train, axis=0), random_state=42, num_estimator=150, num_forests=1, max_features=0.3, num_classes=2, n_fold=5, max_layer=15, aug_type='cutmix',ray=True)
df.load_data(X_train,y_train)
df.get_best_policy(X_train, y_train)  # arguments default to each layer's data


--------------
layer 0,   X_train shape:(32561, 14)...
 


2023-10-09 22:44:46,864	INFO worker.py:1621 -- Started a local Ray instance.


cutmix   0   0.1   87.14105832130463  
cutmix   0   0.2   87.14412948005283  
cutmix   0   0.3   87.14105832130463  
cutmix   0   0.4   87.14105832130463  
cutmix   0   0.5   87.14105832130463  
cutmix   0   0.6   87.14105832130463  
cutmix   0   0.7   87.14105832130463  
cutmix   0.05   0.2   87.21169497251313  
cutmix   0.05   0.3   87.20555265501673  
cutmix   0.05   0.4   87.16255643254199  
cutmix   0.05   0.5   87.15027179754922  
cutmix   0.05   0.6   87.14412948005283  
cutmix   0.05   0.7   87.13798716255643  
cutmix   0.1   0.2   87.06427935259974  
cutmix   0.1   0.3   87.13184484506003  
cutmix   0.1   0.4   87.09806209882989  
cutmix   0.1   0.5   87.12877368631185  
cutmix   0.1   0.6   87.1564141150456  
cutmix   0.1   0.7   86.99364270139124  
cutmix   0.2   0.4   86.92300605018274  
cutmix   0.2   0.5   87.10420441632628  
cutmix   0.2   0.6   87.13491600380824  
cutmix   0.2   0.7   87.12877368631185  
cutmix   0.3   0.4   87.11648905131906  
cutmix   0.3   0.5   86.8

('cutmix', 0.05, 0.2)
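
To show why the serial and parallel versions are interchangeable, the sketch below evaluates a small grid both ways. It uses `ThreadPoolExecutor` from the Python standard library as a stand-in for `ray`, so it illustrates the pattern rather than the actual implementation; `evaluate`, `search_serial`, and `search_parallel` are hypothetical names and the scoring function is made up. With a deterministic score both versions return the same candidate. The two logged runs above select different policies, which suggests that the real evaluation is not deterministic across runs.

```python
from concurrent.futures import ThreadPoolExecutor


def evaluate(cand):
    """Made-up deterministic score that peaks at (0.05, 0.6)."""
    first, second = cand
    return -abs(first - 0.05) - abs(second - 0.6)


def search_serial(grid):
    """Score candidates one after another and return the best."""
    return max(grid, key=evaluate)


def search_parallel(grid, workers=4):
    """Score candidates concurrently, then return the best."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        scores = list(pool.map(evaluate, grid))  # map preserves input order
    best_index = max(range(len(grid)), key=scores.__getitem__)
    return grid[best_index]


grid = [(a, b) for a in (0, 0.05, 0.1) for b in (0.2, 0.4, 0.6)]
assert search_serial(grid) == search_parallel(grid)
print(search_parallel(grid))  # (0.05, 0.6)
```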

# augDF
The following code demonstrates how to easily invoke `augDF` for the complete training process.

In [4]:
from aug import aug_DF
use_ray=False
if use_ray:
    ray.init(ignore_reinit_error=True)
aug=aug_DF(classifier='sketch',num_classes=2, max_layer=15, gpu_id = 0, aug_type='cutmix',ray=use_ray)
aug_policy_schedule = aug.fit(X_train, y_train)
if use_ray:
    ray.shutdown()


--------------
layer 0,   X_train shape:(32561, 14)...
 
cutmix   0   0.1   87.36525290992292  
cutmix   0   0.2   87.36525290992292  
cutmix   0   0.3   87.35911059242653  
cutmix   0   0.4   87.36525290992292  
cutmix   0   0.5   87.36525290992292  
cutmix   0   0.6   87.36832406867111  
cutmix   0   0.7   87.36525290992292  
cutmix   0.05   0.2   87.24854887749149  
cutmix   0.05   0.3   87.32225668744817  
cutmix   0.05   0.4   87.23933540124689  
cutmix   0.05   0.5   87.17791222628297  
cutmix   0.05   0.6   87.27004698872885  
cutmix   0.05   0.7   87.29768741746261  
cutmix   0.1   0.2   87.29768741746261  
cutmix   0.1   0.3   87.33454132244096  
cutmix   0.1   0.4   87.36218175117472  
cutmix   0.1   0.5   87.39903565615307  
cutmix   0.1   0.6   87.18098338503117  
cutmix   0.1   0.7   87.23012192500231  
cutmix   0.2   0.4   87.14412948005283  
cutmix   0.2   0.5   87.19633917877215  
cutmix   0.2   0.6   87.27926046497343  
cutmix   0.2   0.7   87.3069008937072  
cutmix  

In [5]:
print(aug.predict_proba(X_test))

[[0.93147813 0.0018552 ]
 [0.71058433 0.22274902]
 [0.61192051 0.3214128 ]
 ...
 [0.16618344 0.76714989]
 [0.8864263  0.04690703]
 [0.24409576 0.68923756]]
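
Class labels can be recovered from these probabilities by taking the index of the largest value in each row. The sketch below does this in plain Python for three of the rows printed above, assuming that column `k` corresponds to class label `k`; `argmax_rows` is a hypothetical helper.

```python
def argmax_rows(proba):
    """Return the index of the largest entry in each row."""
    return [max(range(len(row)), key=row.__getitem__) for row in proba]


# Three rows copied from the predict_proba output above.
proba = [
    [0.93147813, 0.0018552],
    [0.71058433, 0.22274902],
    [0.16618344, 0.76714989],
]
print(argmax_rows(proba))  # [0, 0, 1]
```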


In [6]:
aug.score(X_test, y_test)

layer  0
test acc:  87.45777286407468
layer  1
test acc:  87.36564093114674
layer  2
test acc:  87.44548860635096
ensemble acc:  87.41477796204164
layer  3
test acc:  87.42706221976538
ensemble acc:  87.48234137952214
layer  4
test acc:  87.3226460291137
ensemble acc:  87.45163073521283
layer  5
test acc:  87.39020944659418
ensemble acc:  87.39635157545605
layer  6
test acc:  87.49462563724587
ensemble acc:  87.46391499293655
layer  7
test acc:  87.48234137952214
ensemble acc:  87.45163073521283
layer  8
test acc:  87.41477796204164
ensemble acc:  87.48234137952214
layer  9
test acc:  87.40249370431792
ensemble acc:  87.47619925066029
layer  10
test acc:  87.39635157545605
ensemble acc:  87.45163073521283
layer  11
test acc:  87.51919415269333
ensemble acc:  87.47619925066029
layer  12
test acc:  87.54376266814077
ensemble acc:  87.49462563724587
layer  13
test acc:  87.5744733124501
ensemble acc:  87.47005712179842
layer  14
test acc:  87.56833118358823
ensemble acc:  87.4884835083840

array([0, 0, 0, ..., 1, 0, 1])

In [None]:
aug.get_searched_policy()

[('cutmix', 0, 0.4),
 ('cutmix', 0, 0.4),
 ('cutmix', 0.05, 0.7),
 ('cutmix', 0.05, 0.7),
 ('cutmix', 0.05, 0.7),
 ('cutmix', 0.1, 0.6),
 ('cutmix', 0.05, 0.7),
 ('cutmix', 0.05, 0.7),
 ('cutmix', 0, 0.7),
 ('cutmix', 0.05, 0.4),
 ('cutmix', 0.1, 0.7),
 ('cutmix', 0.05, 0.6),
 ('cutmix', 0.05, 0.6),
 ('cutmix', 0.05, 0.6),
 ('cutmix', 0.05, 0.6)]
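
The searched schedule is an ordinary Python list, so it can be inspected with standard tools. As a small sketch, the code below counts how often each policy appears in the list printed above.

```python
from collections import Counter

# Searched policy copied from the output above, one entry per layer.
searched_policy = [
    ('cutmix', 0, 0.4), ('cutmix', 0, 0.4),
    ('cutmix', 0.05, 0.7), ('cutmix', 0.05, 0.7), ('cutmix', 0.05, 0.7),
    ('cutmix', 0.1, 0.6),
    ('cutmix', 0.05, 0.7), ('cutmix', 0.05, 0.7),
    ('cutmix', 0, 0.7),
    ('cutmix', 0.05, 0.4),
    ('cutmix', 0.1, 0.7),
    ('cutmix', 0.05, 0.6), ('cutmix', 0.05, 0.6),
    ('cutmix', 0.05, 0.6), ('cutmix', 0.05, 0.6),
]

counts = Counter(searched_policy)
print(len(searched_policy))   # 15
print(counts.most_common(1))  # [(('cutmix', 0.05, 0.7), 5)]
```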