# Train models: Example 1

> Note: This notebook assumes that you have annotated/labelled data within your SimBA project. It also assumes
that you have set the hyperparameter, sampling, and evaluation settings a priori, either in the SimBA project_config.ini (when training a single model, as in Example 1 below)
or as a set of valid CSV config files inside the project_folder/configs directory of your SimBA project (when grid searching models, as in Example 2 below). Also, this notebook was written using a small dataset, so you may expect longer runtimes on your end.

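Before training, it can help to confirm that the inputs described in the note are actually on disk. The following is a minimal sketch using only the Python standard library; `check_training_inputs` is a hypothetical helper written for this notebook, not part of SimBA, and it assumes the `configs` directory sits next to `project_config.ini` inside `project_folder`. It checks only that the files exist, not that their settings are valid.

```python
import configparser
from pathlib import Path


def check_training_inputs(config_path):
    """Report which training inputs exist on disk (hypothetical helper, not a SimBA function)."""
    config_path = Path(config_path)
    report = {
        "config_exists": config_path.is_file(),
        "config_sections": [],
        "grid_search_csvs": [],
    }
    if report["config_exists"]:
        # List the section names only; no setting values are validated here.
        parser = configparser.ConfigParser()
        parser.read(config_path)
        report["config_sections"] = parser.sections()
    # Assumption: grid-search CSV files live in <project_folder>/configs.
    configs_dir = config_path.parent / "configs"
    if configs_dir.is_dir():
        report["grid_search_csvs"] = sorted(p.name for p in configs_dir.glob("*.csv"))
    return report
```

Calling `check_training_inputs(CONFIG_PATH)` once `CONFIG_PATH` is defined returns a dictionary; an empty `grid_search_csvs` list suggests that Example 2 would have nothing to train.
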
In [1]:
from simba.model.train_rf import TrainRandomForestClassifier
from simba.model.grid_search_rf import GridSearchRandomForestClassifier

In [2]:
### Define the path to the SimBA project config ini.
CONFIG_PATH = '/Users/simon/Desktop/envs/simba/troubleshooting/two_black_animals_14bp/project_folder/project_config.ini'


### Example 1: TRAIN A SINGLE MODEL

In [3]:
###Create an instance of a model trainer based on the settings in the project config
model_trainer = TrainRandomForestClassifier(config_path=CONFIG_PATH)

Reading in 4 annotated files...
Dataset size: 13.876192MB / 0.013876GB
Number of features in dataset: 454
Number of Attack frames in dataset: 272.0 (3.91%)

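The last line of the output above reports the class balance: 272 Attack frames make up 3.91% of the dataset, so the behavior is rare relative to non-Attack frames. As a rough illustration of that arithmetic, the sketch below parses the printed line and estimates the total frame count. `parse_class_balance` is a hypothetical helper, not part of SimBA, and the total is only approximate because the printed percentage is rounded.

```python
import re

LOG_LINE = "Number of Attack frames in dataset: 272.0 (3.91%)"


def parse_class_balance(line):
    """Extract the classifier name, frame count, and percentage from the printed line."""
    match = re.search(r"Number of (.+?) frames in dataset: ([\d.]+) \(([\d.]+)%\)", line)
    if match is None:
        raise ValueError(f"Unrecognized line: {line!r}")
    present = float(match.group(2))
    percent = float(match.group(3))
    # The percentage is rounded to two decimals, so the total is an estimate.
    approx_total = round(present / (percent / 100))
    return {
        "classifier": match.group(1),
        "present_frames": present,
        "percent_present": percent,
        "approx_total_frames": approx_total,
    }


balance = parse_class_balance(LOG_LINE)
print(balance["approx_total_frames"])  # roughly 6957 frames in total
```

With so few positive frames, it may be worth reviewing the under-sampling and over-sampling settings in the project config before training.
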

In [4]:
###Run the model trainer based on the settings in the project config
model_trainer.run()

Training and evaluating model...
Fitting Attack model...


[Parallel(n_jobs=-1)]: Using backend ThreadingBackend with 8 concurrent workers.
[Parallel(n_jobs=-1)]: Done  34 tasks      | elapsed:    0.1s
[Parallel(n_jobs=-1)]: Done 184 tasks      | elapsed:    0.7s
[Parallel(n_jobs=-1)]: Done 434 tasks      | elapsed:    1.6s
[Parallel(n_jobs=-1)]: Done 784 tasks      | elapsed:    2.9s
[Parallel(n_jobs=-1)]: Done 1234 tasks      | elapsed:    4.5s
[Parallel(n_jobs=-1)]: Done 1784 tasks      | elapsed:    6.6s
[Parallel(n_jobs=-1)]: Done 2000 out of 2000 | elapsed:    7.3s finished


In [5]:
### Save the model
model_trainer.save()

SIMBA COMPLETE: Classifier Attack saved in models/generated_models directory (elapsed time: 10.5654s) 	complete
SIMBA COMPLETE: Evaluation files are in models/generated_models/model_evaluations folders 	complete


### Example 2: TRAIN MULTIPLE MODELS: ONE FOR EACH SETTINGS FILE PRESENT IN THE PROJECT_FOLDER/CONFIGS DIRECTORY.


In [6]:
###Create an instance of a grid model trainer.
model_trainer = GridSearchRandomForestClassifier(config_path=CONFIG_PATH)

Reading in 4 annotated files...
Reading complete Together_2 (elapsed time: 0.3268s)...
Reading complete Together_3 (elapsed time: 0.3219s)...
Reading complete Together_1 (elapsed time: 0.2962s)...
Reading complete Together_4 (elapsed time: 0.2478s)...


In [7]:
###Run the grid search model trainer. Note: Each model is saved without the need to call the save function (as when training a single model above).
model_trainer.run()

Training model 1/1 (Attack)...
MODEL 1 settings
+------------------------+---------+
| Setting                | value   |
| Model name             | Attack  |
+------------------------+---------+
| Ensemble method        | RF      |
+------------------------+---------+
| Estimators (trees)     | 2000    |
+------------------------+---------+
| Max features           | sqrt    |
+------------------------+---------+
| Under sampling setting | None    |
+------------------------+---------+
| Under sampling ratio   | nan     |
+------------------------+---------+
| Over sampling setting  | None    |
+------------------------+---------+
| Over sampling ratio    | nan     |
+------------------------+---------+
| criterion              | gini    |
+------------------------+---------+
| Min sample leaf        | 1       |
+------------------------+---------+ 	TABLE
# 454 features.
Fitting Attack model...


[Parallel(n_jobs=-1)]: Using backend ThreadingBackend with 8 concurrent workers.
[Parallel(n_jobs=-1)]: Done  34 tasks      | elapsed:    0.1s
[Parallel(n_jobs=-1)]: Done 184 tasks      | elapsed:    0.7s
[Parallel(n_jobs=-1)]: Done 434 tasks      | elapsed:    1.7s
[Parallel(n_jobs=-1)]: Done 784 tasks      | elapsed:    3.1s
[Parallel(n_jobs=-1)]: Done 1234 tasks      | elapsed:    5.3s
[Parallel(n_jobs=-1)]: Done 1784 tasks      | elapsed:    8.2s
[Parallel(n_jobs=-1)]: Done 2000 out of 2000 | elapsed:    9.5s finished
[Parallel(n_jobs=8)]: Using backend ThreadingBackend with 8 concurrent workers.
[Parallel(n_jobs=8)]: Done  34 tasks      | elapsed:    0.0s
[Parallel(n_jobs=8)]: Done 184 tasks      | elapsed:    0.1s
[Parallel(n_jobs=8)]: Done 434 tasks      | elapsed:    0.1s


Creating classification report visualization...


[Parallel(n_jobs=8)]: Done 784 tasks      | elapsed:    0.2s
[Parallel(n_jobs=8)]: Done 1234 tasks      | elapsed:    0.3s
[Parallel(n_jobs=8)]: Done 1784 tasks      | elapsed:    0.4s
[Parallel(n_jobs=8)]: Done 2000 out of 2000 | elapsed:    0.6s finished
[Parallel(n_jobs=8)]: Using backend ThreadingBackend with 8 concurrent workers.
[Parallel(n_jobs=8)]: Done  34 tasks      | elapsed:    0.0s
[Parallel(n_jobs=8)]: Done 184 tasks      | elapsed:    0.0s
[Parallel(n_jobs=8)]: Done 434 tasks      | elapsed:    0.1s
[Parallel(n_jobs=8)]: Done 784 tasks      | elapsed:    0.2s
[Parallel(n_jobs=8)]: Done 1234 tasks      | elapsed:    0.3s
[Parallel(n_jobs=8)]: Done 1784 tasks      | elapsed:    0.4s
[Parallel(n_jobs=8)]: Done 2000 out of 2000 | elapsed:    0.4s finished


Saving model meta data file...
Classifier Attack_0 saved in models/validations/model_files directory ...
SIMBA COMPLETE: All models and evaluations complete. The models/evaluation files are in models/validations folders 	complete

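For readers who want to relate the "MODEL 1 settings" table above to scikit-learn, the sketch below builds a random forest with the same printed values. This is an assumption-laden illustration, not SimBA's internal code: the dictionary keys are scikit-learn parameter names rather than SimBA config keys, `build_forest` is a hypothetical helper, and `n_jobs=-1` with `verbose=1` are chosen because they would produce `[Parallel(n_jobs=-1)]` progress lines like those in the output. No resampling is applied, matching the `None` under-sampling and over-sampling settings.

```python
from sklearn.ensemble import RandomForestClassifier

# Values copied from the printed settings table; the keys are scikit-learn
# parameter names (an assumption about how the table maps onto the library).
ATTACK_SETTINGS = {
    "n_estimators": 2000,     # Estimators (trees)
    "max_features": "sqrt",   # Max features
    "criterion": "gini",      # criterion
    "min_samples_leaf": 1,    # Min sample leaf
}


def build_forest(settings, n_jobs=-1, verbose=1):
    """Create an unfitted random forest from a settings dictionary (hypothetical helper)."""
    return RandomForestClassifier(n_jobs=n_jobs, verbose=verbose, **settings)


clf = build_forest(ATTACK_SETTINGS)
```

The returned classifier is unfitted; calling `clf.fit(X, y)` on a feature matrix and a binary Attack label vector would train it.
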