5th-place winning solution for Humpback Whale Identification
Latest commit 49217cd by Weimin, Wang, Mar 22, 2019: add in comp data

Humpback whale re-identification using Siamese neural nets

Code for the 5th-place winning solution to the Humpback Whale Identification contest


  • Hardware: GPU NVIDIA 1080 Ti
  • Software: Python 3.6, keras==2.2.4, keras-retinanet==0.5.0, albumentations, pyvips, scipy, numpy, pandas, tqdm, lap, sklearn


Input data location

  1. Put the training and test images in separate folders:
  • ../data/train/
  • ../data/test/
  2. Place train.csv and sample_submission.csv at:
  • ../data/train.csv
  • ../data/sample_submission.csv
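
Before starting the pipeline, it can help to confirm that the inputs are where the scripts expect them. The following is a minimal sketch, not part of the repository: the helper name `missing_inputs` is hypothetical, and it assumes the relative paths listed above are resolved from the directory the scripts are run in.

```python
from pathlib import Path

# Paths copied from the list above; they are relative to the working directory.
EXPECTED_DIRS = ["../data/train", "../data/test"]
EXPECTED_FILES = ["../data/train.csv", "../data/sample_submission.csv"]


def missing_inputs(base="."):
    """Return the expected input paths that do not exist relative to `base`.

    `missing_inputs` is a hypothetical helper written for this README.
    """
    base = Path(base)
    missing = [d for d in EXPECTED_DIRS if not (base / d).is_dir()]
    missing += [f for f in EXPECTED_FILES if not (base / f).is_file()]
    return missing


if __name__ == "__main__":
    problems = missing_inputs()
    if problems:
        print("Missing inputs:", ", ".join(problems))
    else:
        print("All input data found.")
```

An empty list means all four locations exist; anything else names the paths that still need to be created or filled.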

Part 1 - Bounding box models

  • Requires: ../modified_data/p2bb_v5.pkl
  • Requires: ../modified_data/retinanet/cropping_train_v2.csv - some boxes for playground competition
  1. python3 retinanet/
  2. python3 retinanet/
  3. python3 retinanet/
  4. python3 retinanet/
  5. python3 retinanet/

As a result, we obtain the following files:

  • ../modified_data/p2bb_averaged_v1.pkl - boxes for train/test images
  • ../modified_data/p2bb_averaged_playground_v1.pkl - boxes for playground images

Part 2 - Siamese Nets with DenseNet121 and SE-ResNext50

Generate KFold splits

  1. python3

As a result, we have two files with different KFold splits:

  • ../modified_data/kfold/new_4_folds_split_train_val_v1.pkl - kfold split v1 (used by DenseNet121)
  • ../modified_data/kfold/new_4_folds_split_train_val_v2.pkl - kfold split v2 (used by SE-ResNext50)
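
To illustrate what a 4-fold train/validation split stored as a pickle might look like, here is a minimal sketch. It is not the repository's split script: the function `make_kfold_split`, the use of different seeds to obtain two split versions, the image names, and the list-of-(train, val) layout are all assumptions made for illustration. The real .pkl files may use a different structure.

```python
import pickle
import random


def make_kfold_split(items, n_folds=4, seed=0):
    """Shuffle `items` and return one (train, val) pair per fold.

    Hypothetical helper: every item lands in exactly one validation fold.
    """
    items = list(items)
    random.Random(seed).shuffle(items)
    # Deal the shuffled items round-robin into n_folds groups.
    folds = [items[i::n_folds] for i in range(n_folds)]
    splits = []
    for i in range(n_folds):
        val = folds[i]
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        splits.append((train, val))
    return splits


# Made-up image names; a real list would come from ../data/train.csv.
images = [f"img_{i:03d}.jpg" for i in range(10)]

# Assumption: two split versions differ only in the shuffle seed.
splits_v1 = make_kfold_split(images, seed=1)
splits_v2 = make_kfold_split(images, seed=2)

# Round-trip through pickle, the storage format of the split files.
restored = pickle.loads(pickle.dumps(splits_v1))
```

Using a different split for each backbone means the two models see different validation sets, which may make their errors less correlated when they are later combined.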

Part with siamese nets (DenseNet121)

  1. python3 siamese_net_v5_densenet121/
  2. python3 siamese_net_v5_densenet121/
  3. python3 siamese_net_v5_densenet121/ 0
  4. python3 siamese_net_v5_densenet121/ 1
  5. python3 siamese_net_v5_densenet121/ 2
  6. python3 siamese_net_v5_densenet121/ 3
  7. python3 siamese_net_v5_densenet121/ 0
  8. python3 siamese_net_v5_densenet121/ 1
  9. python3 siamese_net_v5_densenet121/ 2
  10. python3 siamese_net_v5_densenet121/ 3
  11. python3 siamese_net_v5_densenet121/

Part with siamese nets (SE-ResNext50)

  1. python3 siamese_net_v6_se_resnext/
  2. python3 siamese_net_v6_se_resnext/
  3. python3 siamese_net_v6_se_resnext/ 0
  4. python3 siamese_net_v6_se_resnext/ 1
  5. python3 siamese_net_v6_se_resnext/ 2
  6. python3 siamese_net_v6_se_resnext/ 3
  7. python3 siamese_net_v6_se_resnext/

Create tables of model predictions for use in the ensemble

  1. python3

As a result, we will have four files with prediction matrices, which will be used for the ensemble:

  • ../features/cv-analysis-fs14-LB959-densenet121-512px-sparse.pkl
  • ../features/cv-analysis-fs14-LB959-densenet121-512px-sparse-test.pkl
  • ../features/cv-analysis-fs16-LB959-seresnext50-384px-sparse.pkl
  • ../features/cv-analysis-fs16-LB959-seresnext50-384px-sparse-test.pkl

Part 3 - Siamese Nets with customized ConvNets model

Create kfold splits

  • python

Train Siamese Nets with k-fold approach

Train one siamese net per fold, four in total. Each training run requires two GPUs, so running all four in parallel requires eight GPUs. With fewer GPUs, run the four trainings in sequence.

  • python --CUDA_VISIBLE_DEVICES 0,1 --RUN_FOLD 0
  • python --CUDA_VISIBLE_DEVICES 2,3 --RUN_FOLD 1
  • python --CUDA_VISIBLE_DEVICES 4,5 --RUN_FOLD 2
  • python --CUDA_VISIBLE_DEVICES 6,7 --RUN_FOLD 3
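
The choice between parallel and sequential training can be sketched as a small launcher. This is an illustration only: `train_fold.py` is a hypothetical placeholder (the real script name is not given above), and the functions `fold_commands` and `run_folds` are not part of the repository. Only the `--CUDA_VISIBLE_DEVICES` and `--RUN_FOLD` flags are taken from the commands listed above.

```python
import subprocess

SCRIPT = "train_fold.py"  # hypothetical placeholder, not the real script name


def fold_commands(script, parallel, n_folds=4, gpus_per_fold=2):
    """Build one command per fold.

    In parallel mode each fold gets its own pair of GPUs (0,1 then 2,3 ...).
    In sequential mode every fold reuses the first pair.
    """
    commands = []
    for fold in range(n_folds):
        first = fold * gpus_per_fold if parallel else 0
        devices = ",".join(str(g) for g in range(first, first + gpus_per_fold))
        commands.append([
            "python", script,
            "--CUDA_VISIBLE_DEVICES", devices,
            "--RUN_FOLD", str(fold),
        ])
    return commands


def run_folds(commands, parallel):
    """Run the commands and return their exit codes."""
    if parallel:
        procs = [subprocess.Popen(cmd) for cmd in commands]
        return [p.wait() for p in procs]
    return [subprocess.run(cmd).returncode for cmd in commands]


if __name__ == "__main__":
    for cmd in fold_commands(SCRIPT, parallel=True):
        print(" ".join(cmd))
```

Calling `run_folds(fold_commands(SCRIPT, parallel=False), parallel=False)` would train the folds one after another on GPUs 0 and 1.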

Run inference for the customized ConvNets siamese nets

Once the training runs above are done, use the logs to find the best saved weights for each model, then run the inference below to generate the final averaged test-vs-train score matrix:

  • python --model_weights_1 ../path_to_your_best_weights_1 --model_weights_2 ../path_to_your_best_weights_2 --model_weights_3 ../path_to_your_best_weights_3 --model_weights_4 ../path_to_your_best_weights_4
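
The averaging of the four per-fold score matrices can be pictured with the sketch below. It assumes each fold model yields a matrix of the same shape, with one row per test image and one column per train image, and that the final matrix is a plain element-wise mean. The function name `average_score_matrices` and the toy values are hypothetical, and the actual inference script may store or weight the matrices differently.

```python
def average_score_matrices(matrices):
    """Element-wise mean of equally shaped matrices given as nested lists."""
    if not matrices:
        raise ValueError("need at least one matrix")
    n_rows, n_cols = len(matrices[0]), len(matrices[0][0])
    for m in matrices:
        if len(m) != n_rows or any(len(row) != n_cols for row in m):
            raise ValueError("all matrices must have the same shape")
    # zip(*matrices) groups matching rows; zip(*rows) groups matching cells.
    return [
        [sum(cells) / len(cells) for cells in zip(*rows)]
        for rows in zip(*matrices)
    ]


# Toy example: 2 test images x 3 train images, one matrix per fold model.
fold_scores = [
    [[0.0, 0.5, 1.0], [1.0, 0.0, 0.5]],
    [[0.5, 0.5, 1.0], [1.0, 0.5, 0.5]],
    [[1.0, 0.5, 0.0], [0.0, 0.5, 0.5]],
    [[0.5, 0.5, 0.0], [0.0, 1.0, 0.5]],
]
averaged = average_score_matrices(fold_scores)
```

With real data the matrices would be large, so an array library such as numpy would be the natural choice. Nested lists are used here only to keep the sketch dependency-free.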

Part 4 - Ensemble all three models and apply post-processing steps

  1. Check that the prediction files from all three models have been generated inside ../features/, then run:
  • python
  2. The final submission will be generated at:
  • ../submission/final_submit_with_post_proc.csv
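
For readers unfamiliar with how a test-vs-train score matrix becomes a submission file, here is a rough sketch. It is not the ensemble or post-processing used in this solution. It assumes that higher scores mean greater similarity, that each whale's score is the best score over its train images, that `new_whale` is inserted at a fixed threshold, and that the CSV has `Image` and `Id` columns with up to five space-separated labels, as in sample_submission.csv. All names and values are hypothetical.

```python
import csv
import io


def top_labels(scores, train_ids, new_whale_threshold=0.5, k=5):
    """Rank whale ids for one test image from its row of the score matrix."""
    best = {}
    for score, whale_id in zip(scores, train_ids):
        if whale_id == "new_whale":
            continue  # unlabeled train images carry no identity information
        if score > best.get(whale_id, float("-inf")):
            best[whale_id] = score
    # Assumption: "new_whale" competes with known ids at a fixed score.
    best["new_whale"] = new_whale_threshold
    ranked = sorted(best, key=lambda w: best[w], reverse=True)
    return ranked[:k]


def write_submission(rows, fileobj):
    """Write (image name, label list) pairs as an Image,Id CSV."""
    writer = csv.writer(fileobj)
    writer.writerow(["Image", "Id"])
    for image, labels in rows:
        writer.writerow([image, " ".join(labels)])


# Toy example with made-up ids and scores.
train_ids = ["w_a", "w_b", "w_a", "new_whale"]
labels = top_labels([0.9, 0.3, 0.4, 0.99], train_ids)

buffer = io.StringIO()
write_submission([("test_0.jpg", labels)], buffer)
```

In this toy example `w_a` ranks first with 0.9, `new_whale` second at the 0.5 threshold, and `w_b` last with 0.3.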