# Hierarchical Models: Experimental Results

## Endpoint descriptions
1. **EPA category** (classes 1-4)  
2. **GHS category** (classes 1-5)  
3. **LD50** (regression, mmol/kg)  
4. **Toxicity** (binary classification: `1` if LD50 < 2000 mg/kg)  
5. **High toxicity** (binary classification: `1` if LD50 < 50 mg/kg)  
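
The two binary endpoints are plain thresholds on LD50 expressed in mg/kg. A minimal sketch of that labelling rule follows; the LD50 values and the column names `LD50_mgkg`, `toxic` and `very_toxic` are made up for illustration and are not taken from the dataset.

```python
import pandas as pd

# Hypothetical LD50 values in mg/kg (illustrative only, not from the dataset)
ld50_mgkg = pd.Series([12.0, 49.9, 50.0, 730.0, 1999.0, 2000.0, 5400.0], name="LD50_mgkg")

labels = pd.DataFrame({
    "LD50_mgkg": ld50_mgkg,
    # 1 if LD50 < 2000 mg/kg, else 0
    "toxic": (ld50_mgkg < 2000).astype(int),
    # 1 if LD50 < 50 mg/kg, else 0
    "very_toxic": (ld50_mgkg < 50).astype(int),
})
print(labels)
```

Both comparisons are strict, so a compound at exactly 50 or 2000 mg/kg falls into the `0` class.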

## Algorithms used
- **Random Forest**  
- **SVM/SVR**  
- **XGBoost**  
- **kNN**  

All models were tuned using **5-fold cross-validation**.
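
The search procedure itself is not shown in this notebook, which imports both `GridSearchCV` and `RandomizedSearchCV`. The sketch below shows what a 5-fold search could look like with `GridSearchCV`; the synthetic data and the parameter grid are assumptions for illustration, not the grid that was actually searched.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, KFold
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for the real feature matrix and labels
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Hypothetical search space; the grids actually used are not given here
param_grid = {
    "n_neighbors": [5, 15, 45],
    "p": [1, 2],
    "weights": ["uniform", "distance"],
}

cv = KFold(n_splits=5, shuffle=True, random_state=0)
search = GridSearchCV(KNeighborsClassifier(), param_grid, cv=cv, scoring="accuracy")
search.fit(X, y)

print(search.best_params_)
print(search.best_score_)  # mean score over the 5 validation folds
```

`best_score_` is the mean validation score of the best parameter combination, which is presumably what the `Best Score` columns below report.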

---

### Endpoint 1: Toxicity (LD50 < 2000 mg/kg)
**Task type**: Binary classification  

| Algorithm      | Optimal hyperparameters                                                                                        | Best Score       |
|----------------|---------------------------------------------------------------------------------------------------------------|------------------|
| **kNN**        | `n_neighbors=151`, `p=1`, `weights='distance'`                                                                | 0.8786442988615353 |
| **SVM**        | `C=10`, `gamma=0.001`, `kernel='rbf'`                                                                         | 0.8816975779192351 |
| **Random Forest** | `n_estimators=500`, `min_samples_split=10`, `min_samples_leaf=6`, `max_features='log2'`, `max_depth=65`, `bootstrap=True` | 0.8814752606339683 |
| **XGBoost**    | `subsample=0.6`, `n_estimators=500`, `min_child_weight=1`, `max_depth=3`, `learning_rate=0.01`, `gamma=5`, `colsample_bytree=0.6` | 0.8817171651619816 |

---

### Endpoint 2: LD50 prediction (regression)
**Task type**: Regression  

| Algorithm      | Optimal hyperparameters                                                                                        | Best Score       |
|----------------|---------------------------------------------------------------------------------------------------------------|------------------|
| **Random Forest** | `n_estimators=1500`, `min_samples_split=5`, `min_samples_leaf=4`, `max_features='sqrt'`, `max_depth=80`, `bootstrap=False` | 0.3066968307723924 |
| **XGBoost**    | `subsample=0.6`, `n_estimators=1500`, `min_child_weight=1`, `max_depth=10`, `learning_rate=0.01`, `gamma=1`, `colsample_bytree=0.9` | 0.3072981768556339 |
| **SVR**        | `C=1`, `gamma=0.01`, `kernel='rbf'`                                                                           | 0.30772790955034346 |
| **kNN**        | `n_neighbors=45`, `p=2`, `weights='distance'`                                                                 | 0.3181178447216808 |

---

### Endpoint 3: GHS category (multiclass classification)
**Task type**: Multiclass classification (5 classes)  

| Algorithm      | Optimal hyperparameters                                                                                        | Best Score       |
|----------------|---------------------------------------------------------------------------------------------------------------|------------------|
| **Random Forest** | `n_estimators=500`, `min_samples_split=2`, `min_samples_leaf=2`, `max_features='sqrt'`, `max_depth=65`, `bootstrap=False` | 0.6389376458324004 |
| **XGBoost**    | `subsample=0.9`, `n_estimators=500`, `min_child_weight=3`, `max_depth=10`, `learning_rate=0.01`, `gamma=0`, `colsample_bytree=0.6` | 0.6362382063966948 |
| **SVM**        | `C=0.1`, `kernel='linear'`                                                                                    | 0.6338661563414092 |
| **kNN**        | `n_neighbors=55`, `p=1`, `weights='distance'`                                                                 | 0.6290994018747665 |

---

**Note**:  
- `Best Score` values are reported unrounded.  
- For regression (LD50), the metric is the coefficient of determination (R²).  
- For classification, the metric is accuracy.
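
For reference, the two metrics named in the note can be written out directly. This is a small NumPy sketch of the standard definitions, using made-up numbers; the notebook itself relies on scikit-learn's `r2_score` and `accuracy_score`.

```python
import numpy as np

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return float(np.mean(np.asarray(y_true) == np.asarray(y_pred)))

# Illustrative values only
print(r2([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8]))
print(accuracy([1, 0, 1, 1], [1, 0, 0, 1]))
```

R² is higher-is-better and at most 1, so among regressors a larger value indicates a better fit.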

In [10]:
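from utils import *  # project-local helpers (not shown here), e.g. prepare_input, Classification_meta_features, report_clf_models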

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import math
import itertools
from pprint import pprint
import joblib

import statistics

# Models
from xgboost import XGBClassifier, XGBRegressor
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
from sklearn.svm import SVC, SVR
from sklearn.neighbors import KNeighborsClassifier, KNeighborsRegressor

from sklearn.preprocessing import LabelEncoder, LabelBinarizer
from sklearn.model_selection import KFold, cross_validate, GridSearchCV, cross_val_score, RandomizedSearchCV 
from sklearn.model_selection import cross_val_predict

from sklearn.pipeline import Pipeline

from sklearn.metrics import make_scorer

# regression metrics
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# classification metrics
from sklearn.metrics import confusion_matrix, classification_report
from sklearn.metrics import accuracy_score, balanced_accuracy_score, roc_auc_score, f1_score, matthews_corrcoef

from sklearn.base import BaseEstimator
from sklearn.base import ClassifierMixin
from sklearn.base import TransformerMixin
from sklearn.base import clone
from sklearn.model_selection._split import check_cv

In [11]:
train_labels = pd.read_csv('../data/processed/train_labels.csv', index_col = 'CASRN')
train_labels.shape

(8221, 6)

In [12]:
train_Hfeatures = pd.read_csv('../data/Hmodel_features_combined/train_Hfeatures.csv', index_col = 'CASRN')
train_Hfeatures.shape

(8221, 100)

## Endpoint 1: Toxic

In [13]:
%%time
#yes
endpoint = 'Toxic'
descriptor = 'Hmodel'
algorithm = 'RF'
name = f'{endpoint}_{algorithm}_{descriptor}'

encoder_toxic = joblib.load('../encoder_models/encoder_toxic.joblib')

# model
rf_clf = RandomForestClassifier(random_state =42, n_jobs=6,
                              n_estimators = 500, min_samples_split = 10, min_samples_leaf=6,
                              max_features = 'log2', max_depth=65, bootstrap= True)

#input
a, b,c,d,e = prepare_input(train_labels, train_Hfeatures, target = 'toxic', encoder = encoder_toxic)

# results
BCM_mf,  BCM_oof, BCM_base_model, cv_score  = Classification_meta_features(rf_clf, a, c, b, d, e,cv=10,n_jobs=1, 
                                                      col_names = [f'{name}-0', f'{name}-1'])
# report the results
report_clf_models(cv_score)

# Save results
# BCM_mf.to_csv(f'../data/Hmodel_features/{name}.csv') # no need for this 
np.save(f'../results/Hierarchical_models/{name}.npy', BCM_oof)
joblib.dump(BCM_base_model, f'../models/Hierarchical_models/{name}.pkl')
joblib.dump(cv_score, f'../results/Hierarchical_models/{name}_CVScore')


Accuracy: 0.798 std: 0.014
Balance Accuracy: 0.792 std: 0.014
matthews_corrcoef: 0.588 std: 0.028
f1_score: 0.798 std: 0.014
AUROC: 0.792 std: 0.014
CPU times: total: 5min 50s
Wall time: 1min 4s


['../results/Hierarchical_models/Toxic_RF_Hmodel_CVScore']
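
`Classification_meta_features` comes from the local `utils` module and its body is not shown. Judging by its return values (meta-features, out-of-fold predictions, a fitted model, CV scores), it probably follows the usual stacking pattern. Below is a rough sketch of that pattern using scikit-learn's `cross_val_predict`; the synthetic data, the stratified shuffled folds and the final refit are assumptions, not the helper's confirmed behaviour.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_predict

# Synthetic stand-in for the training features and binary labels
X, y = make_classification(n_samples=200, n_features=8, random_state=0)
clf = RandomForestClassifier(n_estimators=50, random_state=42)

cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
# Each row is predicted by a model that never saw that row during fitting
oof = cross_val_predict(clf, X, y, cv=cv, method="predict_proba")

name = "Toxic_RF_Hmodel"
meta_features = pd.DataFrame(oof, columns=[f"{name}-0", f"{name}-1"])

# Refit on all training rows; this is the model one would save for test-time use
base_model = clf.fit(X, y)
print(meta_features.head())
```

Out-of-fold probabilities are used as meta-features because in-sample predictions from a model fitted on the same rows would be overly optimistic.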

In [14]:
%%time
#yes
endpoint = 'Toxic'
descriptor = 'Hmodel'
algorithm = 'SVM'
name = f'{endpoint}_{algorithm}_{descriptor}'

encoder_toxic = joblib.load('../encoder_models/encoder_toxic.joblib')

# model
clf = SVC(random_state=42, probability=True,
          C = 10, gamma = 0.001, kernel = 'rbf')

#input
a, b,c,d,e = prepare_input(train_labels, train_Hfeatures, target = 'toxic', encoder = encoder_toxic)

# results
BCM_mf,  BCM_oof, BCM_model, cv_score  = Classification_meta_features(clf, a, c, b, d, e,cv=10,n_jobs=6, 
                                                      col_names = [f'{name}-0', f'{name}-1'])
# report the results
report_clf_models(cv_score)

# Save results
# BCM_mf.to_csv(f'../data/Hmodel_features/{name}.csv') # no need for this 
np.save(f'../results/Hierarchical_models/{name}.npy', BCM_oof)
joblib.dump(BCM_model, f'../models/Hierarchical_models/{name}.pkl')
joblib.dump(cv_score, f'../results/Hierarchical_models/{name}_CVScore')


Accuracy: 0.799 std: 0.016
Balance Accuracy: 0.792 std: 0.016
matthews_corrcoef: 0.589 std: 0.032
f1_score: 0.798 std: 0.016
AUROC: 0.792 std: 0.016
CPU times: total: 14.3 s
Wall time: 2min 13s


['../results/Hierarchical_models/Toxic_SVM_Hmodel_CVScore']
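
In the binary results above, AUROC and balanced accuracy are numerically identical. One situation that produces this is passing hard 0/1 predictions, rather than probabilities, to `roc_auc_score`: with a single threshold the ROC area reduces to (TPR + TNR) / 2. Whether `report_clf_models` does this is not visible here, so the sketch below (on synthetic scores) is only a possible explanation.

```python
import numpy as np
from sklearn.metrics import balanced_accuracy_score, roc_auc_score

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=500)
# Synthetic scores that are noisily correlated with the label
proba = np.clip(0.5 * y_true + rng.normal(0.25, 0.25, size=500), 0, 1)
y_pred = (proba >= 0.5).astype(int)

auc_from_labels = roc_auc_score(y_true, y_pred)  # hard labels
auc_from_proba = roc_auc_score(y_true, proba)    # continuous scores
bal_acc = balanced_accuracy_score(y_true, y_pred)

print(auc_from_labels, bal_acc, auc_from_proba)
```

If a threshold-independent AUROC is wanted, the positive-class column of `predict_proba` would be the input to use.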

In [15]:
%%time
#yes
endpoint = 'Toxic'
descriptor = 'Hmodel'
algorithm = 'xgboost'
name = f'{endpoint}_{algorithm}_{descriptor}'

encoder_toxic = joblib.load('../encoder_models/encoder_toxic.joblib')

# model
clf = XGBClassifier(random_state =123, n_jobs=6,
                    subsample = 0.6, n_estimators = 500, min_child_weight=1,
                    max_depth = 3, learning_rate=0.01, gamma= 5,
                    colsample_bytree = 0.6)


#input
a, b,c,d,e = prepare_input(train_labels, train_Hfeatures, target = 'toxic', encoder = encoder_toxic)

# results
BCM_mf,  BCM_oof, BCM_model, cv_score  = Classification_meta_features(clf, a, c, b, d, e,cv=10,n_jobs=1, 
                                                      col_names = [f'{name}-0', f'{name}-1'])
# report the results
report_clf_models(cv_score)

# Save results
# BCM_mf.to_csv(f'../data/Hmodel_features/{name}.csv') # no need for this 
np.save(f'../results/Hierarchical_models/{name}.npy', BCM_oof)
joblib.dump(BCM_model, f'../models/Hierarchical_models/{name}.pkl')
joblib.dump(cv_score, f'../results/Hierarchical_models/{name}_CVScore')


Accuracy: 0.8 std: 0.013
Balance Accuracy: 0.794 std: 0.013
matthews_corrcoef: 0.592 std: 0.027
f1_score: 0.8 std: 0.013
AUROC: 0.794 std: 0.013
CPU times: total: 1min 43s
Wall time: 17.6 s


['../results/Hierarchical_models/Toxic_xgboost_Hmodel_CVScore']

In [16]:
%%time
#yes
endpoint = 'Toxic'
descriptor = 'Hmodel'
algorithm = 'knn'
name = f'{endpoint}_{algorithm}_{descriptor}'

encoder_toxic = joblib.load('../encoder_models/encoder_toxic.joblib')

# model
clf = KNeighborsClassifier(n_neighbors = 151, p=1, weights = 'distance')

#input
a, b,c,d,e = prepare_input(train_labels, train_Hfeatures, target = 'toxic', encoder = encoder_toxic)

# results
BCM_mf,  BCM_oof, BCM_model, cv_score  = Classification_meta_features(clf, a, c, b, d, e,cv=10,n_jobs=6, 
                                                      col_names = [f'{name}-0', f'{name}-1'])
# report the results
report_clf_models(cv_score)

# Save results
# BCM_mf.to_csv(f'../data/Hmodel_features/{name}.csv') # no need for this 
np.save(f'../results/Hierarchical_models/{name}.npy', BCM_oof)
joblib.dump(BCM_model, f'../models/Hierarchical_models/{name}.pkl')
joblib.dump(cv_score, f'../results/Hierarchical_models/{name}_CVScore')


Accuracy: 0.794 std: 0.016
Balance Accuracy: 0.788 std: 0.016
matthews_corrcoef: 0.58 std: 0.033
f1_score: 0.794 std: 0.016
AUROC: 0.788 std: 0.016
CPU times: total: 1.42 s
Wall time: 2.48 s


['../results/Hierarchical_models/Toxic_knn_Hmodel_CVScore']

## Endpoint 2: EPA

In [17]:
%%time
#yes
endpoint = 'EPA'
descriptor = 'Hmodel'
algorithm = 'RF'
name = f'{endpoint}_{algorithm}_{descriptor}'

encoder_epa = joblib.load('../encoder_models/encoder_epa.joblib')

# model
clf = RandomForestClassifier(random_state =42, n_jobs=6,
                              n_estimators = 500, min_samples_split = 2, min_samples_leaf=2,
                              max_features = 'sqrt', max_depth=65, bootstrap= False)

#input
a, b,c,d,e = prepare_input(train_labels, train_Hfeatures, target = 'EPA_category', encoder = encoder_epa)

# results
MCM_mf,  MCM_oof, MCM_model, cv_score  = Classification_meta_features(clf, a, c, b, d, e,cv=10,n_jobs=1, 
                                                      col_names = [f'{name}-1', f'{name}-2', f'{name}-3', f'{name}-4'])
# report the results
report_clf_models(cv_score)

# Save results
np.save(f'../results/Hierarchical_models/{name}.npy', MCM_oof)
joblib.dump(MCM_model, f'../models/Hierarchical_models/{name}.pkl')
joblib.dump(cv_score, f'../results/Hierarchical_models/{name}_CVScore')


Accuracy: 0.659 std: 0.014
Balance Accuracy: 0.588 std: 0.018
matthews_corrcoef: 0.458 std: 0.023
f1_score: 0.648 std: 0.016
AUROC: 0.708 std: 0.012
CPU times: total: 20min 31s
Wall time: 3min 32s


['../results/Hierarchical_models/EPA_RF_Hmodel_CVScore']
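
The column names `-1` to `-4` assume that the probability columns come out in category order. In scikit-learn, `predict_proba` columns follow `classes_`, and a `LabelEncoder` sorts the original labels, so the mapping can be checked explicitly. The sketch below uses made-up categories and a freshly fitted encoder; the contents of the saved `encoder_epa.joblib` are not known here.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import LabelEncoder

# Hypothetical EPA categories (1-4), illustrative only
raw = np.array([4, 1, 3, 2, 1, 4, 2, 3, 3, 1, 2, 4])
encoder = LabelEncoder().fit(raw)
y = encoder.transform(raw)  # categories 1..4 become codes 0..3

# Toy one-column feature matrix derived from the labels
X = raw.reshape(-1, 1).astype(float) + np.linspace(0, 0.1, raw.size).reshape(-1, 1)
clf = KNeighborsClassifier(n_neighbors=1).fit(X, y)
proba = clf.predict_proba(X)

# Map each probability column back to its original category
columns = [f"EPA_knn_Hmodel-{c}" for c in encoder.inverse_transform(clf.classes_)]
print(columns)
```

Deriving the names from `classes_` instead of hard-coding them guards against a class missing from the training data.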

In [18]:
%%time
#yes
endpoint = 'EPA'
descriptor = 'Hmodel'
algorithm = 'xgboost'
name = f'{endpoint}_{algorithm}_{descriptor}'

encoder_epa = joblib.load('../encoder_models/encoder_epa.joblib')

# model
clf = XGBClassifier(random_state =123, n_jobs=6,
                    subsample = 0.9, n_estimators = 500, min_child_weight=3,
                    max_depth = 10, learning_rate=0.01, gamma= 0,
                    colsample_bytree = 0.6)

#input
a, b,c,d,e = prepare_input(train_labels, train_Hfeatures, target = 'EPA_category', encoder = encoder_epa)

# results
MCM_mf,  MCM_oof, MCM_model, cv_score  = Classification_meta_features(clf, a, c, b, d, e,cv=10,n_jobs=1, 
                                                      col_names = [f'{name}-1', f'{name}-2', f'{name}-3', f'{name}-4'])
# report the results
report_clf_models(cv_score)

# Save results
np.save(f'../results/Hierarchical_models/{name}.npy', MCM_oof)
joblib.dump(MCM_model, f'../models/Hierarchical_models/{name}.pkl')
joblib.dump(cv_score, f'../results/Hierarchical_models/{name}_CVScore')


Accuracy: 0.659 std: 0.013
Balance Accuracy: 0.59 std: 0.014
matthews_corrcoef: 0.458 std: 0.019
f1_score: 0.648 std: 0.015
AUROC: 0.709 std: 0.011
CPU times: total: 57min 27s
Wall time: 9min 41s


['../results/Hierarchical_models/EPA_xgboost_Hmodel_CVScore']

In [19]:
%%time
#yes
endpoint = 'EPA'
descriptor = 'Hmodel'
algorithm = 'knn'
name = f'{endpoint}_{algorithm}_{descriptor}'

encoder_epa = joblib.load('../encoder_models/encoder_epa.joblib')

# model
clf = KNeighborsClassifier(n_neighbors = 55, weights = 'distance', p=1)

#input
a, b,c,d,e = prepare_input(train_labels, train_Hfeatures, target = 'EPA_category', encoder = encoder_epa)

# results
MCM_mf,  MCM_oof, MCM_model, cv_score  = Classification_meta_features(clf, a, c, b, d, e,cv=10,n_jobs=6, 
                                                      col_names = [f'{name}-1', f'{name}-2', f'{name}-3', f'{name}-4'])
# report the results
report_clf_models(cv_score)

# Save results
np.save(f'../results/Hierarchical_models/{name}.npy', MCM_oof)
joblib.dump(MCM_model, f'../models/Hierarchical_models/{name}.pkl')
joblib.dump(cv_score, f'../results/Hierarchical_models/{name}_CVScore')


Accuracy: 0.655 std: 0.013
Balance Accuracy: 0.573 std: 0.011
matthews_corrcoef: 0.447 std: 0.016
f1_score: 0.639 std: 0.015
AUROC: 0.699 std: 0.009
CPU times: total: 1.02 s
Wall time: 4.25 s


['../results/Hierarchical_models/EPA_knn_Hmodel_CVScore']

In [20]:
%%time
#yes
endpoint = 'EPA'
descriptor = 'Hmodel'
algorithm = 'SVM'
name = f'{endpoint}_{algorithm}_{descriptor}'

encoder_epa = joblib.load('../encoder_models/encoder_epa.joblib')

# model
clf = SVC(random_state=42, probability=True,
          C = 0.1, kernel = 'linear')

#input
a, b,c,d,e = prepare_input(train_labels, train_Hfeatures, target = 'EPA_category', encoder = encoder_epa)

# results
MCM_mf,  MCM_oof, MCM_model, cv_score  = Classification_meta_features(clf, a, c, b, d, e,cv=10,n_jobs=6, 
                                                      col_names = [f'{name}-1', f'{name}-2', f'{name}-3', f'{name}-4'])
# report the results
report_clf_models(cv_score)

# Save results
np.save(f'../results/Hierarchical_models/{name}.npy', MCM_oof)
joblib.dump(MCM_model, f'../models/Hierarchical_models/{name}.pkl')
joblib.dump(cv_score, f'../results/Hierarchical_models/{name}_CVScore')


Accuracy: 0.65 std: 0.012
Balance Accuracy: 0.571 std: 0.011
matthews_corrcoef: 0.441 std: 0.017
f1_score: 0.636 std: 0.014
AUROC: 0.698 std: 0.01
CPU times: total: 16.1 s
Wall time: 2min 16s


['../results/Hierarchical_models/EPA_SVM_Hmodel_CVScore']

## Endpoint 3: LD50

In [21]:
%%time
#yes
endpoint = 'LD50'
descriptor = 'Hmodel'
algorithm = 'RF'
name = f'{endpoint}_{algorithm}_{descriptor}'

# model
rf_reg = RandomForestRegressor(random_state =42, n_jobs=6,
                              n_estimators = 1500, min_samples_split = 5, min_samples_leaf=4,
                              max_features = 'sqrt', max_depth=80, bootstrap= False)

#input
a, b,c,d,e = prepare_input(train_labels, train_Hfeatures, target = 'logLD50_mmolkg')

# results
RM_mf, RM_oof, RM_model, cv_score = Regression_meta_features(rf_reg, a, c, b, 
                                                       d, e,cv=10, n_jobs = 1, col_names = [f'{name}'])
# report the results
report_cv_reg_models(cv_score)

# Save results
np.save(f'../results/Hierarchical_models/{name}.npy', RM_oof)
joblib.dump(RM_model, f'../models/Hierarchical_models/{name}.pkl')
joblib.dump(cv_score, f'../results/Hierarchical_models/{name}_CVScore')


RMSE: 0.549 std: 0.024
R2: 0.629 std: 0.029
MAE: 0.398 std: 0.019
MSE: 0.302 std: 0.027
CPU times: total: 36min 43s
Wall time: 6min 21s


['../results/Hierarchical_models/LD50_RF_Hmodel_CVScore']
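
The regression report lists the mean and standard deviation of each metric over the 10 folds. `Regression_meta_features` and `report_cv_reg_models` live in `utils` and are not shown, so the following is only a sketch of how such a summary can be produced with `cross_validate`, on synthetic data and with an arbitrary regressor.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.model_selection import KFold, cross_validate
from sklearn.neighbors import KNeighborsRegressor

# Synthetic stand-in for the features and the log-LD50 target
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

scoring = {
    "RMSE": "neg_root_mean_squared_error",
    "R2": "r2",
    "MAE": "neg_mean_absolute_error",
    "MSE": "neg_mean_squared_error",
}
cv = KFold(n_splits=10, shuffle=True, random_state=0)
scores = cross_validate(KNeighborsRegressor(n_neighbors=5), X, y, cv=cv, scoring=scoring)

summary = {}
for metric in scoring:
    fold_values = scores[f"test_{metric}"]
    if metric != "R2":
        fold_values = -fold_values  # sklearn negates error metrics so that higher is better
    summary[metric] = (fold_values.mean(), fold_values.std())
    print(f"{metric}: {summary[metric][0]:.3f} std: {summary[metric][1]:.3f}")
```

Because each fold's MSE is the square of its RMSE, the mean MSE is always at least the squared mean RMSE, which matches the reported 0.302 against 0.549² ≈ 0.301.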

In [22]:
%%time
#yes
endpoint = 'LD50'
descriptor = 'Hmodel'
algorithm = 'xgboost'
name = f'{endpoint}_{algorithm}_{descriptor}'

# model
reg = XGBRegressor(random_state =123, n_jobs=6, objective ='reg:squarederror',
                    subsample = 0.6, n_estimators = 1500, min_child_weight=1,
                    max_depth = 10, learning_rate=0.01, gamma= 1,
                    colsample_bytree = 0.9)

#input
a, b,c,d,e = prepare_input(train_labels, train_Hfeatures, target = 'logLD50_mmolkg')

# results
RM_mf, RM_oof, RM_model, cv_score = Regression_meta_features(reg, a, c, b, 
                                                       d, e,cv=10, n_jobs = 1, col_names = [f'{name}'])
# report the results
report_cv_reg_models(cv_score)

# Save results
np.save(f'../results/Hierarchical_models/{name}.npy', RM_oof)
joblib.dump(RM_model, f'../models/Hierarchical_models/{name}.pkl')
joblib.dump(cv_score, f'../results/Hierarchical_models/{name}_CVScore')


RMSE: 0.551 std: 0.026
R2: 0.627 std: 0.031
MAE: 0.399 std: 0.02
MSE: 0.305 std: 0.029
CPU times: total: 14min 8s
Wall time: 2min 23s


['../results/Hierarchical_models/LD50_xgboost_Hmodel_CVScore']

In [23]:
%%time
#yes
endpoint = 'LD50'
descriptor = 'Hmodel'
algorithm = 'knn'
name = f'{endpoint}_{algorithm}_{descriptor}'

# model
reg = KNeighborsRegressor(p = 2, n_neighbors = 45, weights = 'distance')

#input
a, b,c,d,e = prepare_input(train_labels, train_Hfeatures, target = 'logLD50_mmolkg')

# results
RM_mf, RM_oof, RM_model, cv_score = Regression_meta_features(reg, a, c, b, 
                                                       d, e,cv=10, n_jobs = 6, col_names = [f'{name}'])
# report the results
report_cv_reg_models(cv_score)

# Save results
np.save(f'../results/Hierarchical_models/{name}.npy', RM_oof)
joblib.dump(RM_model, f'../models/Hierarchical_models/{name}.pkl')
joblib.dump(cv_score, f'../results/Hierarchical_models/{name}_CVScore')


RMSE: 0.561 std: 0.025
R2: 0.613 std: 0.03
MAE: 0.408 std: 0.018
MSE: 0.316 std: 0.028
CPU times: total: 1.67 s
Wall time: 3.23 s


['../results/Hierarchical_models/LD50_knn_Hmodel_CVScore']

In [24]:
%%time
#yes
endpoint = 'LD50'
descriptor = 'Hmodel'
algorithm = 'SVM'
name = f'{endpoint}_{algorithm}_{descriptor}'

# model
reg = SVR(C = 1, gamma = 0.01, kernel = 'rbf')

#input
a, b,c,d,e = prepare_input(train_labels, train_Hfeatures, target = 'logLD50_mmolkg')

# results
RM_mf, RM_oof, RM_model, cv_score = Regression_meta_features(reg, a, c, b, 
                                                       d, e,cv=10, n_jobs = 6, col_names = [f'{name}'])
# report the results
report_cv_reg_models(cv_score)

# Save results
np.save(f'../results/Hierarchical_models/{name}.npy', RM_oof)
joblib.dump(RM_model, f'../models/Hierarchical_models/{name}.pkl')
joblib.dump(cv_score, f'../results/Hierarchical_models/{name}_CVScore')


RMSE: 0.553 std: 0.026
R2: 0.625 std: 0.03
MAE: 0.398 std: 0.019
MSE: 0.306 std: 0.029
CPU times: total: 2.89 s
Wall time: 12.8 s


['../results/Hierarchical_models/LD50_SVM_Hmodel_CVScore']

## Get the predictions on the test set

In [25]:
test_Hfeatures = pd.read_csv('../data/Hmodel_features_combined/test_Hfeatures.csv', index_col = 'CASRN')
test_Hfeatures.shape

(2849, 100)

In [27]:
%%time

index = test_Hfeatures.index

model_path = '../models/Hierarchical_models/'
result_path = '../results/Hierarchical_testset_preds/'

endpoints = ['Toxic', 'EPA', 'LD50']
descriptors = ['Hmodel']
algorithms = ['RF', 'SVM', 'knn', 'xgboost']

feature = test_Hfeatures.values.astype('float32')

for e in endpoints:
    for d in descriptors:
        for a in algorithms:
            name = f'{e}_{a}_{d}'
            print(f'{name}: computing....')
            model = joblib.load(f'{model_path}{name}.pkl')

            # classifiers are saved as class probabilities, the regressor as point predictions
            if e == 'Toxic':
                predictions = model.predict_proba(feature)
                columns = [f'{name}-0', f'{name}-1']
            elif e == 'EPA':
                predictions = model.predict_proba(feature)
                columns = [f'{name}-1', f'{name}-2', f'{name}-3', f'{name}-4']
            else:  # 'LD50'
                predictions = model.predict(feature)
                columns = [name]

            df = pd.DataFrame(predictions, columns=columns, index=index)
            df.to_csv(f'{result_path}{name}.csv')
            print(f'{name}: saved')

Toxic_RF_Hmodel: computing....
Toxic_RF_Hmodel: saved
Toxic_SVM_Hmodel: computing....
Toxic_SVM_Hmodel: saved
Toxic_knn_Hmodel: computing....
Toxic_knn_Hmodel: saved
Toxic_xgboost_Hmodel: computing....
Toxic_xgboost_Hmodel: saved
EPA_RF_Hmodel: computing....
EPA_RF_Hmodel: saved
EPA_SVM_Hmodel: computing....
EPA_SVM_Hmodel: saved
EPA_knn_Hmodel: computing....
EPA_knn_Hmodel: saved
EPA_xgboost_Hmodel: computing....
EPA_xgboost_Hmodel: saved
LD50_RF_Hmodel: computing....
LD50_RF_Hmodel: saved
LD50_SVM_Hmodel: computing....
LD50_SVM_Hmodel: saved
LD50_knn_Hmodel: computing....
LD50_knn_Hmodel: saved
LD50_xgboost_Hmodel: computing....
LD50_xgboost_Hmodel: saved
CPU times: total: 17.1 s
Wall time: 7.16 s
