soumya997/kaggle-GBR-Experimentations

Introduction:

The goal of this competition was to accurately identify starfish in real time by building an object detection model trained on underwater videos of coral reefs. My teammate and I started early and contributed a few notebooks and discussion threads, but as the deadline for the overlapping Sartorius - Cell Instance Segmentation competition approached, we had to shift our focus toward that. We got back to this competition after 31st Dec. It took some time to digest everything that was happening [I am still reading the solutions, and some code 😅]. This was a very successful competition [as was Sartorius], with a total of 2,026 teams and 61,174 submissions. People shared their ideas and code generously, and because of that it was a great start for a beginner like me; I got to learn a lot. With the help of the community our team finished in the top 3% of the LB with a silver medal [we were expecting bronze]. Here I will discuss my experiments. I created a lot of notebooks (NBs) in this competition, and I have tried to compile the most important ones here. You will find some forked NBs (with modifications) and some independent NBs. If I missed something, please create an issue and ask there.

Inference image:

[inference images: model predictions on sample frames]

Inference video:

img_pred_seq45518.mp4

See more here

A Few Ideas That We Tried:

  • Most of these ideas were proposed on the discussion forums.
  • When I got to yolov5 it was clear that yolov5 was the way to go. But there was conflicting advice on whether to use yolov5s6 or yolov5m6, because different people were getting better results on different ones.
    • I first started with the two-stage detector FasterRCNN. I tried different backbones and hyper-params with different augmentation techniques [geometric, color, and combined]: ResNet101, ResNet50, MobileNet, EfficientNetB3, and SwinTransformer. Check out the amazing timmFasterRcnn repository by @mrinath; it helped with using EfficientNet backbones via timm (a minimal backbone-swap sketch follows the pipeline diagram below).

    • I started with a 3-fold yolov5s6 using a video-based split (sketched after the diagram). I was using this repository, which makes minor changes over the ultralytics yolov5 to track the F2 score. From my analysis it was most likely that video_id 2 would give the best F2, because it has the most data and the most variance in the data. I tried different hyper-parameters and different training image resolutions, and tried ensembling after training each fold. I did the same with yolov5m6. I found that Adam was working better than SGD. I also ran some experiments with custom augmentation using albumentations.

    • After seeing some discussion on yolov5 model freezing, I tried that as well; for this the best split was a sequence-based group fold. For more, check out the ultralytics docs (a freezing sketch follows the diagram). I trained both yolov5s6 and yolov5m6 with an image size of roughly 3000.

    • Along with yolov5, tracking did a good job of raising the CV/LB, so I tried that too, using norfair (sketched after the diagram). I saw discussions on other trackers, like Deep SORT, but ended up with norfair as it was giving decent results and I did not have much time.

    • As a postprocessing technique I also used classification on the predicted bounding boxes, which helped as well (a crop-and-classify sketch follows the diagram). I tried different models: a plain CNN, DenseNet121, ResNet [50, 101], EfficientNet [B3], and an ensemble. I used this code for the ensembling; it is also added to this repo. Our demo pipeline looks like this:

```mermaid
graph TD;
    A(Competition Data)--> B(video split vid_0);
    A(Competition Data)--> C(video split vid_1);
    A(Competition Data)--> D(video split vid_2);
    B(video split vid_0)--> E(Train yolov5s6 img-3584);
    C(video split vid_1)--> M(Train yolov5s6 img-3584);
    D(video split vid_2)--> N(Train yolov5s6 img-3584);
    E(Train yolov5s6 img-3584)--> K(TTA w/ inf imgsize-6200);
    M(Train yolov5s6 img-3584)--> K(TTA w/ inf imgsize-6200);
    N(Train yolov5s6 img-3584)--> K(TTA w/ inf imgsize-6200);
    K(TTA w/ inf imgsize-6200)--eg conf:0.30, thr:0.50--> G(WBF);
    G(WBF)--> F(Classification);
    F(Classification)--> H(norfair Tracking);
    H(norfair Tracking)-->final
```
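Below are minimal sketches of several steps described above. First, the backbone swapping from the FasterRCNN experiments, in the spirit of the timmFasterRcnn repository: the model name, anchor sizes, and pooler settings here are illustrative assumptions, not the exact competition config.

```python
import timm
import torch
from torchvision.models.detection import FasterRCNN
from torchvision.models.detection.anchor_utils import AnchorGenerator
from torchvision.ops import MultiScaleRoIAlign

class TimmBackbone(torch.nn.Module):
    """Wrap a timm feature extractor so FasterRCNN sees one feature map."""
    def __init__(self, name: str):
        super().__init__()
        self.body = timm.create_model(name, pretrained=True,
                                      features_only=True, out_indices=(4,))
        self.out_channels = self.body.feature_info.channels()[-1]

    def forward(self, x):
        return self.body(x)[-1]   # last-stage feature map only

# one anchor/pooler spec per feature map (we expose a single map, named "0")
anchor_gen = AnchorGenerator(sizes=((16, 32, 64, 128),),
                             aspect_ratios=((0.5, 1.0, 2.0),))
roi_pool = MultiScaleRoIAlign(featmap_names=["0"], output_size=7, sampling_ratio=2)

# num_classes=2: background + starfish
model = FasterRCNN(TimmBackbone("efficientnet_b3"), num_classes=2,
                   rpn_anchor_generator=anchor_gen, box_roi_pool=roi_pool)
```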
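The video-based split is just a grouped K-fold over video_id, so frames from one video never leak across folds; a minimal sketch, assuming the competition's standard train.csv columns.

```python
import pandas as pd
from sklearn.model_selection import GroupKFold

df = pd.read_csv("train.csv")          # competition annotations
gkf = GroupKFold(n_splits=3)           # 3 videos -> leave-one-video-out

df["fold"] = -1
for fold, (_, val_idx) in enumerate(gkf.split(df, groups=df["video_id"])):
    df.loc[val_idx, "fold"] = fold     # all frames of a video share one fold

# sequence-based group fold used for the freezing runs:
# gkf.split(df, groups=df["sequence"])
```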
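yolov5 exposes layer freezing through train.py's --freeze argument; under the hood it simply disables gradients on the first N submodules, roughly as below (the layer count of 10 is an illustrative choice).

```python
import torch

# load a pretrained checkpoint via torch.hub (downloads on first use)
model = torch.hub.load("ultralytics/yolov5", "yolov5s6", pretrained=True)

freeze = [f"model.{x}." for x in range(10)]   # e.g. first 10 layers ~ the backbone
for name, param in model.named_parameters():
    # keep requires_grad on only for parameters outside the frozen prefixes
    param.requires_grad = not any(f in name for f in freeze)
```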
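A minimal sketch of the norfair step, tracking box centres from frame to frame; the distance threshold and the per-frame detections container are illustrative assumptions.

```python
import numpy as np
from norfair import Detection, Tracker

def euclidean(detection, tracked_object):
    # distance between a new detection and a track's current estimate
    return np.linalg.norm(detection.points - tracked_object.estimate)

tracker = Tracker(distance_function=euclidean, distance_threshold=30)

frame_detections = []   # placeholder: per-frame lists of [x1, y1, x2, y2, score]
for boxes in frame_detections:
    detections = [
        Detection(points=np.array([[(x1 + x2) / 2, (y1 + y2) / 2]]),
                  scores=np.array([score]))
        for x1, y1, x2, y2, score in boxes
    ]
    tracked = tracker.update(detections=detections)
    # tracks that persist across frames can be re-emitted even when the
    # detector misses a frame, which is where the F2 boost comes from
```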
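The box-classification postprocessing crops each predicted box and rescores it with an image classifier; a minimal sketch with DenseNet121 as a stand-in model and an illustrative keep threshold.

```python
import timm
import torch
import torchvision.transforms.functional as TF

clf = timm.create_model("densenet121", pretrained=True, num_classes=2).eval()

@torch.no_grad()
def filter_boxes(image, boxes, keep_thresh=0.5):
    """image: CHW float tensor; boxes: rows of [x1, y1, x2, y2, score]."""
    kept = []
    for x1, y1, x2, y2, score in boxes:
        crop = TF.resized_crop(image, int(y1), int(x1),
                               int(y2 - y1), int(x2 - x1), [224, 224])
        prob = clf(crop.unsqueeze(0)).softmax(-1)[0, 1]   # P(starfish)
        if prob > keep_thresh:
            kept.append([x1, y1, x2, y2, score * float(prob)])
    return kept
```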
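The WBF node fuses the per-fold predictions; a minimal sketch with the ensemble_boxes package and dummy detections, reusing the conf:0.30 / thr:0.50 values from the edge label in the diagram.

```python
from ensemble_boxes import weighted_boxes_fusion

# one list per fold/model: boxes in [x1, y1, x2, y2], normalised to [0, 1];
# dummy values stand in for real per-fold predictions
boxes_list  = [[[0.10, 0.10, 0.20, 0.20]], [[0.11, 0.09, 0.21, 0.19]]]
scores_list = [[0.90], [0.75]]
labels_list = [[0], [0]]

boxes, scores, labels = weighted_boxes_fusion(
    boxes_list, scores_list, labels_list,
    iou_thr=0.50,       # thr:0.50 on the diagram edge: fuse overlapping boxes
    skip_box_thr=0.30,  # conf:0.30 on the diagram edge: drop weak detections
)
```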

IMP NBs:


YOLO Inference NBs:


NB tracking

| Training NB | Inference NB |
| --- | --- |
| reef-frcnn-resnet50-on-4k.ipynb (kaggle) | reef-frcnn-resnet50-on-4k-infr.ipynb |
| frcnn-with-efficientnetv3-timm.ipynb | --- |
| fasterrcnn-resnet101.ipynb | --- |
| learning-to-torch-fasterrcnn-pytorch.ipynb (kaggle) | --- |
| FasterRCNN[train]-color+geo aug-480p-SGD-90:10-e20 (kaggle) | learning-to-torch-fasterrcnn-infer.ipynb (kaggle) |
| trained yolov5s6 [img1920,bs2,e7] on fold 1 w/ evolve params | yolov5-inference-nb.ipynb |
| somu: yolov5s6[train1] base model [1 of 5 folds] | --- |
| somu: yolov5s6 video_fold vid_id:1 |  |
| somu: yolov5s6 video_fold vid2 |  |
| somu: yolov5s6 video_fold vid2 [evolve] ❌ |  |
| yolov5m6 frcnn_albumentations vid2 [no cp] |  |
| yolov5m6 vid:2 adam |  |
| classification starfish |  |
| FasterRCNN[train]coloraug-480p-SGD/AdamW-90:10-e20 |  |
| yolov5 Albumentations [train] |  |
| Evaluate F2 for YoloX and Norfair tracking |  |
| yolov5[train1] base model [1 of 5 folds] |  |
| somu: yolov5s6 video_fold vid2 |  |
| somu: yolov5s6 video_fold vid2, copypaste:0.5 |  |
| resume yolov5m6 vid2 |  |
| somu: yolov5s6 video_fold vid0 |  |
| yolov5m6 video fold: vid_id=2, img=3k, e11, bs2 |  |
| yolov5m6 vid:2 adam, 3k img |  |
| yolov5s6 vid:2 adam, 3584 img, cp:0.5 |  |
| FasterRCNN: resnet50,90/10,e12,bs8,SGD,cnf0.15,i480 |  |
| FasterRCNN[train]-geoaug-480p-SGD-90:10 |  |

Reef Experiments

1. FasterRCNN resnet50:

Inference NB: https://www.kaggle.com/soumya9977/learning-to-torch-fasterrcnn-infer


Experiment log FasterRCNN:

| Version | Model | Config | File used | Link | CV/LB |
| --- | --- | --- | --- | --- | --- |
| v9 | fasterRCNN | resnet50, 90/10, e12, bs8, SGD, cnf0.15, i480 | fasterrcnn_resnet50_fpn-e10.pt | NB | 0.461/0.285 |
| v12 | fasterRCNN | same as above | fasterrcnn_resnet50_fpn-e9.pt | NB | 0.461/~0.285 |
| v10 | fasterRCNN | same as above | fasterrcnn_resnet50_fpn-e11.pt | NB | 0.459/0.285 |
| v13 | fasterRCNN | same as above | fasterrcnn_resnet50_fpn-e8.pt | NB | 0.460/0.288 |
| v11 | fasterRCNN | same as above | fasterrcnn_resnet50_fpn-e6.pt | NB | 0.457/0.291 |
| v16 | fasterRCNN | resnet50, 90/10, e12, bs8, SGD, cnf0.15, i480, geo aug | fasterrcnn_resnet50_fpn-e6.pt | NB | 0.467/0.274 |
| v17 | fasterRCNN | resnet50, 90/10, e20, bs8, SGD, cnf0.15, i480, color aug | fasterrcnn_resnet50_fpn-e6.pt | NB | 0.407/0.382 |
| v18 | fasterRCNN | resnet50, 90/10, e20, bs8, SGD, cnf0.15, i480, color aug | fasterrcnn_resnet50_fpn-e20.pt | NB | 0.407/0.291 |
| v20 | fasterRCNN | color+geo aug, 480p, SGD, 90:10, e20, multi conf, new train loop, bs8 | fasterrcnn_resnet50_fpn-e11.pt | NB | 0.338/? |
| v21 | fasterRCNN | color+geo aug, 480p, SGD, 90:10, e20, multi conf, new train loop, bs8 | fasterrcnn_resnet50_fpn-e20.pt | NB | 0.338/0.184 |
| v22 | fasterRCNN | resnet50, 90/10, e20, bs8, SGD, cnf0.15, i480, color aug [inf imgSize 2400] | fasterrcnn_resnet50_fpn-e6.pt | NB | 0.407/0.00 [problem in the code] |
| v23 | fasterRCNN | resnet50, 90/10, e16, bs8, AdamW, cnf0.15, i480, color aug [save_multy: future_resume] | fasterrcnn_resnet50_fpn-e7.pt | NB | 0.382/? |
| v24 | fasterRCNN | resnet50, 90/10, e16, bs8, AdamW, cnf0.1, i480, color aug [save_multy: future_resume] | fasterrcnn_resnet50_fpn-e7.pt | NB | 0.382/? |
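For reference, the CV/LB numbers in these tables are the competition's F2 metric, which weights recall four times as heavily as precision (the leaderboard additionally averages it over a sweep of IoU matching thresholds). A minimal per-threshold sketch:

```python
def f2(tp: int, fp: int, fn: int) -> float:
    """F-beta with beta=2: recall counts four times as much as precision."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = 4 * precision + recall
    return 5 * precision * recall / denom if denom else 0.0

print(f2(tp=40, fp=10, fn=10))   # P = R = 0.8 -> F2 = 0.8
```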

2. YOLOV5 table:

Experiment log YOLOV5:

| Version | Config | IoU & conf | Img size [train/test] | Checkpoint used | CV/LB |
| --- | --- | --- | --- | --- | --- |
| starfish-v17 [tracking, tta] | 1/5 fold, yolov5s6, 3000, e11, bs2 | 0.4, 0.28 | 1920 x 3 | best.pt | 0.81/0.571 |
| starfish-v16 [tracking, tta] | Good Moon model, yolov5s6 | 0.4, 0.28 | 6400 | f2_sub.pt | ?/0.647 |
| starfish-v15 [tracking, tta] | Good Moon model, yolov5s6 | 0.4, 0.28 | 1920 x 3 | f2_sub.pt | ?/0.641 |
| Leon-V5 - v3 [no tracking, tta] | Good Moon model, yolov5s6 | 0.4, 0.28 | 10000 | f2_sub.pt | ?/0.432 |
| Leon-V5 - v4 [no tracking, tta] | Good Moon model, yolov5s6 | 0.4, 0.20 | 10000 | f2_sub.pt | ?/0.424 |
| Leon-V5 - v1 [no tracking, tta] | Good Moon model, yolov5s6 | 0.50, 0.30 | 6400 | f2_sub.pt | ?/0.665 |
| Leon-V5 - v2 [no tracking, tta] | Good Moon model, yolov5s6 | 0.4, 0.28 | 6400 | f2_sub.pt | ?/0.665 |
| starfish-v13 [tracking, tta] | 1/5 fold, yolov5s5, 3000, e11, bs2 | 0.4, 0.28 | 1920 x 3 | best.pt | 0.76/0.588 |
| starfish-v12 [tracking, tta] | 1/5 fold, yolov5s5, 3000, e11, bs2 | 0.4, 0.15 | 1920 x 3 | best.pt | 0.76/0.580 |
| starfish-v07 [tracking, tta] | 1/5 fold, yolov5s5, 3000, e11, bs2 | 0.4, 0.28 | 1920 x 3 | best.pt | 0.76/0.588 |

Model config log:

| Version | Config | Checkpoint for sub | CV/LB |
| --- | --- | --- | --- |
| v4 | sheep's model fold2, CONF=0.28, IOU=0.40 | best | ?/0.616 |
| v5 | yolov5s5: albu[frcnn], imgsize=3600, bs=2, e11, CONF=0.28, IOU=0.40 | best.pt | ?/0.552 |
| v7 | same as above | epoch6.pt | 0.73871/0.552 |
| v8 | yolov5s5: ammarnassanalhajali yolov5 | best.pt | ?/? |
| v10 | yolov5s5: [BASE MODEL] imgsize=3600, bs=2, e11, CONF=0.28, IOU=0.40 | best.pt | 0.76../0.588 |
| v12 | yolov5s5: [BASE MODEL] imgsize=3600, bs=2, e11, CONF=0.15, IOU=0.40 | best.pt | 0.76../0.580 |
| v12 | yolov5s5: [BASE MODEL] imgsize=3600, bs=2, e11, CONF=0.15, IOU=0.40 | epoch7.pt | 0.76../? |
| v15 | yolov5s6: [Good Moon Model] CONF=0.28, IOU=0.40, img size=1980x2 | f2_sub2.pt | 0.76../? |
| v16 | yolov5s6: [Good Moon Model] CONF=0.28, IOU=0.40, img size=6400 | f2_sub2.pt | 0.76../? |
| v16 | yolov5s6: imgsize=3600, bs=2, e11, CONF=0.28, IOU=0.40 | best.pt | 0.81../? |
| v18 | yolov5s6: [Good Moon Model] CONF=0.30, IOU=0.50, img size=6400 | f2_sub2.pt | 0.76../? |
| v21 | yolov5s6: video-based split, vid:2, CONF=0.30, IOU=0.50, inf img size=6400, train 3584 | 6th epoch (best F2) | 0.89/0.620 |
| v23 | yolov5s6: video-based split, vid:1, CONF=0.30, IOU=0.50, inf img size=6400, train 3584 | 7th epoch (best F2) | 0.72/0.610 |
| v32 | yolov5s6: [Good Moon Model] CONF=0.30, IOU=0.50, img size=6400 | '../input/yolov5s6/yolov5s6_sub9.pt' | ?/0.680 |
| v33 | yolov5m6: 3584 img, adam, e6, bs2, vid:2 | last.pt | 0.87/0.558[p] |
| v34 | yolov5m6: 3k img, e6, bs2, vid:2 | epoch5.pt | 0.869/0.558[p] |
| v35 | yolov5s6: 3584 img, video_fold vid2, copypaste:0.5, e10 | epoch7.pt | 0.88/0.625[p] |
| v36 | yolov5m6: resume training, 3584 img, video_fold vid2, e11 | epoch9.pt | 0.88/0.623[p] |

3. YOLOV5m6 experiments:

  • yolov5m6try1-epoch5 = 0.600
  • yolov5m6try1-epoch3 = 0.562
  • yolov5s6-vid_id:0 = 0.555 [yolov5s6]

Future work:

  • Write an optuna script for tuning the yolov5 inference hyper-params (a rough sketch follows this list).
  • Add the remaining NBs and writeups.
  • Turn the notebook code into Python scripts.
  • Add model prediction videos.
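A rough sketch of the optuna item above; `evaluate_f2` is a hypothetical helper that would run validation inference at the proposed thresholds and return the F2 score.

```python
import optuna

def objective(trial: optuna.Trial) -> float:
    conf = trial.suggest_float("conf", 0.05, 0.50)   # detector confidence cut-off
    iou = trial.suggest_float("iou", 0.30, 0.70)     # NMS IoU threshold
    # hypothetical helper: run validation inference and score it with F2
    return evaluate_f2(conf_thres=conf, iou_thres=iou)

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=50)
print(study.best_params)
```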

Random Advice to Myself:

  • Don't participate in multiple competitions with overlapping timelines [one at a time].
  • Try to explore more fields [NLP, audio, tabular].
  • Use better ways to track experiments [WandB/Google Sheets].
  • In the middle of the competition, a LB-boosting trick [increasing the inference image size to 3x, 4x ... 10x] was shared, and it changed the momentum of the competition. It suddenly turned into a GPU war: people with more resources and compute were getting high LB scores. This really affected me because I did not have any good compute beyond Kaggle and Colab, and I felt like giving up. But through that I learned two life lessons:
    • Keep patience.
    • Either you go all the way or you don't go anywhere.

Afterward I understood that anyone can achieve anything if they have the patience to keep working irrespective of the outcome. As for the second point, I came to believe that if you stop a process in the middle, it gives you nothing but regret, because you started with a motivation, right? I used to look back at the posters I made during the competition and the timelines I drew up for different experimentation ideas, and the progress I had made motivated me to keep going all the way. So my advice is to create posters/TODOs/learning blogs while you are in the middle of the work; they will keep you motivated throughout the journey. When you feel like giving up, look at those posters/blogs and ask yourself: if I was going to give up after coming this far, why did I even start? They will remind you of your reason for starting.