# Example 13 - Train, Val, Test pipeline (different thresholds)

In this notebook, we'll train models at different thresholds, validate them at each threshold, and test the model at the selected threshold.

## Step 1. Train models on different thresholds

In this step, we'll train co-training models at different thresholds, using either CNN or HC features.

In [1]:
from train_co_training_CNN import TrainCoTrainingCNN
from train_co_training_HC import TrainCoTrainingHC

In [2]:
# path
pkl_dir = "pkl/"
train_dir = "npy/train/"

In [3]:
# parameters
threshold = -66
iou_threshold = 0.5
neg_pos_ratio = 2
unlabel_label_ratio = 1
sel_feats = ['total_water_column', 'depth', 'latitude', 'longitude'] # contextual features
left_feats = ['Sv_18kHz_min', 'Sv_18kHz_p5', 'Sv_18kHz_p25', 'Sv_18kHz_p50', 'Sv_18kHz_p75', 'Sv_18kHz_p95', 'Sv_18kHz_max', 'Sv_18kHz_std', 'Sv_38kHz_min', 'Sv_38kHz_p5', 'Sv_38kHz_p25', 'Sv_38kHz_p50', 'Sv_38kHz_p75', 'Sv_38kHz_p95', 'Sv_38kHz_max', 'Sv_38kHz_std', 'Sv_120kHz_min', 'Sv_120kHz_p5', 'Sv_120kHz_p25', 'Sv_120kHz_p50', 'Sv_120kHz_p75', 'Sv_120kHz_p95', 'Sv_120kHz_max', 'Sv_120kHz_std', 'Sv_200kHz_min', 'Sv_200kHz_p5', 'Sv_200kHz_p25', 'Sv_200kHz_p50', 'Sv_200kHz_p75', 'Sv_200kHz_p95', 'Sv_200kHz_max', 'Sv_200kHz_std', 'Sv_ref_18kHz', 'Sv_ref_120kHz', 'Sv_ref_200kHz', 'length', 'thickness', 'area', 'perimeter', 'rectangularity', 'compact', 'circularity', 'elongation']
right_feats = sel_feats
k = 10
u = 200
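The long `left_feats` list above follows a regular pattern, so it can also be generated instead of typed by hand. The sketch below is one way to do that; the names `frequencies`, `stats`, `sv_feats`, `ref_feats`, `geometry_feats`, and `left_feats_generated` are introduced here for illustration and do not appear in the project code.

```python
# Rebuild the hand-typed feature list from its parts (a sketch, not the author's code).
frequencies = [18, 38, 120, 200]  # kHz channels that appear in left_feats
stats = ["min", "p5", "p25", "p50", "p75", "p95", "max", "std"]

# Per-frequency Sv summary statistics, e.g. "Sv_18kHz_min"
sv_feats = [f"Sv_{f}kHz_{s}" for f in frequencies for s in stats]

# The original list has a Sv_ref_* entry for every channel except 38 kHz
ref_feats = [f"Sv_ref_{f}kHz" for f in (18, 120, 200)]

# Remaining entries of the original list, copied in the same order
geometry_feats = ["length", "thickness", "area", "perimeter",
                  "rectangularity", "compact", "circularity", "elongation"]

left_feats_generated = sv_feats + ref_feats + geometry_feats
print(len(left_feats_generated))  # 43
```

Generating the list this way makes it harder to drop or misspell one of the 43 names when a channel or statistic is added.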

In [4]:
# co-training with CNN features
co_training_CNN = TrainCoTrainingCNN(pkl_dir, train_dir, threshold, neg_pos_ratio, unlabel_label_ratio, sel_feats, k, u)
co_training_CNN.train_co_training()

719 1439 2159


  return torch.max_pool2d(input, kernel_size, stride, padding, dilation, ceil_mode)


  epoch    train_loss    valid_acc    valid_loss     dur
-------  ------------  -----------  ------------  ------
      1        1.8867       0.5903        4.2807  1.3718
      2        0.3115       0.6759        0.7315  1.0474
      3        0.2825       0.6713        0.8974  1.0327
      4        0.2875       0.5810        0.8761  1.0284
      5        0.2567       0.7477        0.7048  1.0365
      6        0.2617       0.7083        0.6900  1.0308
      7        0.2169       0.6736        1.2957  1.0338
      8        0.1962       0.6944        0.6888  1.0426
      9        0.1740       0.6875        1.2930  1.0468
     10        0.1434       0.7593        0.5378  1.0352
     11        0.1352       0.6944        0.8338  1.0541
     12        0.0981       0.6875        1.0381  1.0392
     13        0.

      6        0.0011       0.8519        0.4309  1.8188
      7        0.0006       0.8466        0.4880  1.8262
      8        0.0004       0.8439        0.5577  1.8253
      9        0.0003       0.8399        0.5922  1.8273
     10        0.0003       0.8386        0.6173  1.8155
     11        0.0003       0.8386        0.6396  1.8194
     12        0.0002       0.8360        0.6595  1.8212
     13        0.0002       0.8373        0.6774  1.8126
     14        0.0002       0.8373        0.6935  1.8218
     15        0.0002       0.8373        0.7081  1.8192
     16        0.0002       0.8360        0.7216  1.8110
     17        0.0001       0.8360        0.7342  1.8192
     18        0.0001       0.8347        0.7460  1.8225
     19        0.0001       0.8347        0.7570  1.8459
     20        0.0001       0.8333        0.7674  1.8203
Re-initial

In [4]:
# co-training with HC features
co_training_HC = TrainCoTrainingHC(pkl_dir, threshold, neg_pos_ratio, unlabel_label_ratio, left_feats, right_feats, k, u)
co_training_HC.train_co_training()

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  self.obj[item] = s


715 1430 2145
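The cells above train for a single threshold (`-66`). To cover several thresholds, the same calls can be wrapped in a loop. The following is a minimal sketch under stated assumptions: `train_for_thresholds` and `_StubTrainer` are hypothetical names introduced here, the stub only stands in for the project classes so the example runs on its own, and every threshold other than `-66` is a placeholder.

```python
from typing import Callable, Dict, Iterable


def train_for_thresholds(thresholds: Iterable[int],
                         make_trainer: Callable[[int], object]) -> Dict[int, object]:
    """Build and train one co-training model per threshold."""
    trainers = {}
    for t in thresholds:
        trainer = make_trainer(t)    # e.g. a TrainCoTrainingHC built for threshold t
        trainer.train_co_training()  # same method the notebook calls
        trainers[t] = trainer
    return trainers


class _StubTrainer:
    """Stand-in (hypothetical) so the sketch runs without the project modules."""

    def __init__(self, threshold: int):
        self.threshold = threshold
        self.trained = False

    def train_co_training(self) -> None:
        self.trained = True


# -66 is the notebook's value; the other thresholds are placeholders.
trained = train_for_thresholds([-70, -66, -62], _StubTrainer)
print(sorted(trained))  # [-70, -66, -62]
```

In the notebook itself, the factory could be `lambda t: TrainCoTrainingHC(pkl_dir, t, neg_pos_ratio, unlabel_label_ratio, left_feats, right_feats, k, u)`, which mirrors the constructor call used above.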


## Step 2. Validate model

In this step, we'll run the saved model on the validation dataset and check its performance.

In [4]:
from predict_co_training_CNN import PredictCoTrainingCNN
from predict_co_training_HC import PredictCoTrainingHC

In [5]:
# add parameters
mode = "val"

In [6]:
# co-training with CNN features

In [7]:
# co-training with HC features
co_training_HC = PredictCoTrainingHC(pkl_dir, mode, threshold, iou_threshold, left_feats, right_feats)
result_HC = co_training_HC()

0/238
sucessfully generate features!
sucessfully get labels!
1/238
2/238
sucessfully generate features!
sucessfully get labels!
3/238
4/238
sucessfully generate features!
sucessfully get labels!
5/238
6/238
sucessfully generate features!
sucessfully get labels!
7/238
sucessfully generate features!
sucessfully get labels!
8/238
sucessfully generate features!
9/238
10/238
sucessfully generate features!
sucessfully get labels!
11/238
sucessfully generate features!
sucessfully get labels!
12/238
sucessfully generate features!
sucessfully get labels!
13/238
sucessfully generate features!
sucessfully get labels!
14/238
sucessfully generate features!
sucessfully get labels!
15/238
sucessfully generate features!
sucessfully get labels!
16/238
sucessfully generate features!
sucessfully get labels!
17/238
sucessfully generate features!
sucessfully get labels!
18/238
sucessfully generate features!
sucessfully get labels!
19/238
sucessfully generate features!
sucessfully get labels!
20/238
21/238


184/238
sucessfully generate features!
sucessfully get labels!
185/238
186/238
187/238
sucessfully generate features!
sucessfully get labels!
188/238
189/238
sucessfully generate features!
sucessfully get labels!
190/238
sucessfully generate features!
sucessfully get labels!
191/238
sucessfully generate features!
192/238
sucessfully generate features!
sucessfully get labels!
193/238
194/238
195/238
196/238
sucessfully generate features!
197/238
sucessfully generate features!
sucessfully get labels!
198/238
sucessfully generate features!
199/238
sucessfully generate features!
sucessfully get labels!
200/238
sucessfully generate features!
sucessfully get labels!
201/238
202/238
203/238
sucessfully generate features!
sucessfully get labels!
204/238
sucessfully generate features!
sucessfully get labels!
205/238
sucessfully generate features!
sucessfully get labels!
206/238
sucessfully generate features!
sucessfully get labels!
207/238
208/238
sucessfully generate features!
sucessfully get 

In [8]:
import pickle
with open(pkl_dir + f"outputs_HC_{threshold}.pkl", "wb") as handle:
    pickle.dump(result_HC, handle)
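The saved predictions can be read back later with `pickle.load`, for example to recompute metrics without rerunning prediction. The sketch below shows the round trip on a placeholder object written to a temporary directory; `result` and `reloaded` are illustrative names, and the real structure of `result_HC` is not shown in this notebook.

```python
import os
import pickle
import tempfile

threshold = -66
result = {"example": [1, 2, 3]}  # placeholder standing in for result_HC

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, f"outputs_HC_{threshold}.pkl")

    # Write the object, as the cell above does
    with open(path, "wb") as handle:
        pickle.dump(result, handle)

    # Read it back; note the "rb" mode
    with open(path, "rb") as handle:
        reloaded = pickle.load(handle)

print(reloaded == result)  # True
```

Only unpickle files that you created yourself or otherwise trust, since loading a pickle can execute arbitrary code.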

In [9]:
from compute_metrics import FinalMetrics
metrics = FinalMetrics(pkl_dir, mode, result_HC, iou_threshold)
final_metrics = metrics()
print(final_metrics)

{'recall_pixel': 0.6492424845093827, 'recall_object': 0.7163461538461539, 'precision_all': 0.9184027777777778, 'precision_labeled': 0.9796954314720813}
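Once validation metrics exist for several thresholds, one threshold has to be selected for testing. The notebook does not state the selection rule, so the sketch below assumes the simplest one: pick the threshold that maximizes a chosen metric. `select_threshold` and `metrics_by_threshold` are hypothetical names, and the numbers are placeholders for illustration, not measured results.

```python
from typing import Dict


def select_threshold(metrics_by_threshold: Dict[int, Dict[str, float]],
                     metric: str = "recall_object") -> int:
    """Return the threshold whose validation metrics maximize `metric`."""
    return max(metrics_by_threshold,
               key=lambda t: metrics_by_threshold[t][metric])


# Placeholder numbers for illustration only, not measured results.
metrics_by_threshold = {
    -70: {"recall_object": 0.60, "precision_all": 0.80},
    -66: {"recall_object": 0.70, "precision_all": 0.90},
    -62: {"recall_object": 0.65, "precision_all": 0.95},
}

best = select_threshold(metrics_by_threshold)
print(best)  # -66
```

The metric keys match those in the `final_metrics` dictionary printed above; choosing a different `metric` (for example `"precision_all"`) can change which threshold is selected.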
