soumya997/kaggle-GBR-Experimentations

Introduction:

The goal of this competition was to accurately identify starfish in real time by building an object detection model trained on underwater videos of coral reefs. My teammate and I started early and contributed a few notebooks and discussion threads, but as the deadline for the overlapping Sartorius - Cell Instance Segmentation competition approached, we had to shift our focus toward that. We got back to this competition after 31st Dec. It took some time to digest everything that was happening [I am still reading the solutions, and some code 😅]. This was a very successful competition [as was Sartorius], with a total of 2,026 teams and 61,174 submissions. People shared their ideas and code generously, and because of that it was a great start for a beginner like me; I got to learn a lot. With the help of the community our team finished in the top 3% of the LB with a silver medal [we were expecting bronze]. Here I will discuss my experiments. I created a lot of notebooks (NBs) in this competition, and I have tried to compile the most important ones here. You will find some forked NBs (with modifications) and some independent NBs. If I missed something, please create an issue and ask there.

Inference image:

[inference images: model predictions on sample frames]

Inference video:

img_pred_seq45518.mp4

See more here

A Few Ideas That We Tried:

  • Most of these ideas were proposed on the discussion forums.
  • When I got to yolov5 it was clear that yolov5 was the way to go. But there was conflicting advice on whether to use yolov5s6 or yolov5m6, because different people were getting better results on different ones.
    • I first started with the two-stage detector FasterRCNN. I tried different backbones and hyper-params with different augmentation techniques [geometric, color, and combined]: ResNet101, ResNet50, MobileNet, EfficientNetB3, and SwinTransformer. Check out the amazing timmFasterRcnn repository by @mrinath; it helped with using EfficientNet backbones via timm (a minimal backbone-swap sketch follows the pipeline diagram below).

    • I started with a 3-fold yolov5s6 using a video-based split (sketched after the diagram). I was using this repository, which makes minor changes over the ultralytics yolov5 to track the F2 score. From my analysis it was most likely that video_id 2 would give the best F2, because it has the most data and the most variance in the data. I tried different hyper-parameters and different training image resolutions, and tried ensembling after training each fold. I did the same with yolov5m6. I found that Adam was working better than SGD. I also ran some experiments with custom augmentation using albumentations.

    • After seeing some discussion on yolov5 model freezing, I tried that as well; for this the best split was a sequence-based group fold. For more, check out the ultralytics docs (a freezing sketch follows the diagram). I trained both yolov5s6 and yolov5m6 with an image size of roughly 3000.

    • Along with yolov5, tracking did a good job of raising the CV/LB, so I tried that too, using norfair (sketched after the diagram). I saw discussions on other trackers, like Deep SORT, but ended up with norfair as it was giving decent results and I did not have much time.

    • As a postprocessing technique I also used classification on the predicted bounding boxes, which helped as well (a crop-and-classify sketch follows the diagram). I tried different models: a plain CNN, DenseNet121, ResNet [50, 101], EfficientNet [B3], and an ensemble. I used this code for the ensembling; it is also added to this repo. Our demo pipeline looks like this:

```mermaid
graph TD;
    A(Competition Data)--> B(video split vid_0);
    A(Competition Data)--> C(video split vid_1);
    A(Competition Data)--> D(video split vid_2);
    B(video split vid_0)--> E(Train yolov5s6 img-3584);
    C(video split vid_1)--> M(Train yolov5s6 img-3584);
    D(video split vid_2)--> N(Train yolov5s6 img-3584);
    E(Train yolov5s6 img-3584)--> K(TTA w/ inf imgsize-6200);
    M(Train yolov5s6 img-3584)--> K(TTA w/ inf imgsize-6200);
    N(Train yolov5s6 img-3584)--> K(TTA w/ inf imgsize-6200);
    K(TTA w/ inf imgsize-6200)--eg conf:0.30, thr:0.50--> G(WBF);
    G(WBF)--> F(Classification);
    F(Classification)--> H(norfair Tracking);
    H(norfair Tracking)-->final
```
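Below are minimal sketches of several steps described above. First, the backbone swapping from the FasterRCNN experiments, in the spirit of the timmFasterRcnn repository: the model name, anchor sizes, and pooler settings here are illustrative assumptions, not the exact competition config.

```python
import timm
import torch
from torchvision.models.detection import FasterRCNN
from torchvision.models.detection.anchor_utils import AnchorGenerator
from torchvision.ops import MultiScaleRoIAlign

class TimmBackbone(torch.nn.Module):
    """Wrap a timm feature extractor so FasterRCNN sees one feature map."""
    def __init__(self, name: str):
        super().__init__()
        self.body = timm.create_model(name, pretrained=True,
                                      features_only=True, out_indices=(4,))
        self.out_channels = self.body.feature_info.channels()[-1]

    def forward(self, x):
        return self.body(x)[-1]   # last-stage feature map only

# one anchor/pooler spec per feature map (we expose a single map, named "0")
anchor_gen = AnchorGenerator(sizes=((16, 32, 64, 128),),
                             aspect_ratios=((0.5, 1.0, 2.0),))
roi_pool = MultiScaleRoIAlign(featmap_names=["0"], output_size=7, sampling_ratio=2)

# num_classes=2: background + starfish
model = FasterRCNN(TimmBackbone("efficientnet_b3"), num_classes=2,
                   rpn_anchor_generator=anchor_gen, box_roi_pool=roi_pool)
```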
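The video-based split is just a grouped K-fold over video_id, so frames from one video never leak across folds; a minimal sketch, assuming the competition's standard train.csv columns.

```python
import pandas as pd
from sklearn.model_selection import GroupKFold

df = pd.read_csv("train.csv")          # competition annotations
gkf = GroupKFold(n_splits=3)           # 3 videos -> leave-one-video-out

df["fold"] = -1
for fold, (_, val_idx) in enumerate(gkf.split(df, groups=df["video_id"])):
    df.loc[val_idx, "fold"] = fold     # all frames of a video share one fold

# sequence-based group fold used for the freezing runs:
# gkf.split(df, groups=df["sequence"])
```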
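yolov5 exposes layer freezing through train.py's --freeze argument; under the hood it simply disables gradients on the first N submodules, roughly as below (the layer count of 10 is an illustrative choice).

```python
import torch

# load a pretrained checkpoint via torch.hub (downloads on first use)
model = torch.hub.load("ultralytics/yolov5", "yolov5s6", pretrained=True)

freeze = [f"model.{x}." for x in range(10)]   # e.g. first 10 layers ~ the backbone
for name, param in model.named_parameters():
    # keep requires_grad on only for parameters outside the frozen prefixes
    param.requires_grad = not any(f in name for f in freeze)
```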
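A minimal sketch of the norfair step, tracking box centres from frame to frame; the distance threshold and the per-frame detections container are illustrative assumptions.

```python
import numpy as np
from norfair import Detection, Tracker

def euclidean(detection, tracked_object):
    # distance between a new detection and a track's current estimate
    return np.linalg.norm(detection.points - tracked_object.estimate)

tracker = Tracker(distance_function=euclidean, distance_threshold=30)

frame_detections = []   # placeholder: per-frame lists of [x1, y1, x2, y2, score]
for boxes in frame_detections:
    detections = [
        Detection(points=np.array([[(x1 + x2) / 2, (y1 + y2) / 2]]),
                  scores=np.array([score]))
        for x1, y1, x2, y2, score in boxes
    ]
    tracked = tracker.update(detections=detections)
    # tracks that persist across frames can be re-emitted even when the
    # detector misses a frame, which is where the F2 boost comes from
```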
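The box-classification postprocessing crops each predicted box and rescores it with an image classifier; a minimal sketch with DenseNet121 as a stand-in model and an illustrative keep threshold.

```python
import timm
import torch
import torchvision.transforms.functional as TF

clf = timm.create_model("densenet121", pretrained=True, num_classes=2).eval()

@torch.no_grad()
def filter_boxes(image, boxes, keep_thresh=0.5):
    """image: CHW float tensor; boxes: rows of [x1, y1, x2, y2, score]."""
    kept = []
    for x1, y1, x2, y2, score in boxes:
        crop = TF.resized_crop(image, int(y1), int(x1),
                               int(y2 - y1), int(x2 - x1), [224, 224])
        prob = clf(crop.unsqueeze(0)).softmax(-1)[0, 1]   # P(starfish)
        if prob > keep_thresh:
            kept.append([x1, y1, x2, y2, score * float(prob)])
    return kept
```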
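The WBF node fuses the per-fold predictions; a minimal sketch with the ensemble_boxes package and dummy detections, reusing the conf:0.30 / thr:0.50 values from the edge label in the diagram.

```python
from ensemble_boxes import weighted_boxes_fusion

# one list per fold/model: boxes in [x1, y1, x2, y2], normalised to [0, 1];
# dummy values stand in for real per-fold predictions
boxes_list  = [[[0.10, 0.10, 0.20, 0.20]], [[0.11, 0.09, 0.21, 0.19]]]
scores_list = [[0.90], [0.75]]
labels_list = [[0], [0]]

boxes, scores, labels = weighted_boxes_fusion(
    boxes_list, scores_list, labels_list,
    iou_thr=0.50,       # thr:0.50 on the diagram edge: fuse overlapping boxes
    skip_box_thr=0.30,  # conf:0.30 on the diagram edge: drop weak detections
)
```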

IMP NBs:


YOLO Inference NBs:


NB tracking

| Training NB | Inference NB |
| --- | --- |
| reef-frcnn-resnet50-on-4k.ipynb (kaggle) | reef-frcnn-resnet50-on-4k-infr.ipynb |
| frcnn-with-efficientnetv3-timm.ipynb | --- |
| fasterrcnn-resnet101.ipynb | --- |
| learning-to-torch-fasterrcnn-pytorch.ipynb (kaggle) | --- |
| FasterRCNN[train]-color+geo aug-480p-SGD-90:10-e20 (kaggle) | learning-to-torch-fasterrcnn-infer.ipynb (kaggle) |
| trained yolov5s6 [img1920,bs2,e7] on fold 1 w/ evolve params | yolov5-inference-nb.ipynb |
| somu: yolov5s6[train1] base model [1 of 5 folds] | --- |
| somu: yolov5s6 video_fold vid_id:1 |  |
| somu: yolov5s6 video_fold vid2 |  |
| somu: yolov5s6 video_fold vid2 [evolve] ❌ |  |
| yolov5m6 frcnn_albumentations vid2 [no cp] |  |
| yolov5m6 vid:2 adam |  |
| classification starfish |  |
| FasterRCNN[train]coloraug-480p-SGD/AdamW-90:10-e20 |  |
| yolov5 Albumentations [train] |  |
| Evaluate F2 for YoloX and Norfair tracking |  |
| yolov5[train1] base model [1 of 5 folds] |  |
| somu: yolov5s6 video_fold vid2 |  |
| somu: yolov5s6 video_fold vid2, copypaste:0.5 |  |
| resume yolov5m6 vid2 |  |
| somu: yolov5s6 video_fold vid0 |  |
| yolov5m6 video fold: vid_id=2, img=3k, e11, bs2 |  |
| yolov5m6 vid:2 adam, 3k img |  |
| yolov5s6 vid:2 adam, 3584 img, cp:0.5 |  |
| FasterRCNN: resnet50,90/10,e12,bs8,SGD,cnf0.15,i480 |  |
| FasterRCNN[train]-geoaug-480p-SGD-90:10 |  |

Reef Experiments

1. FasterRCNN resnet50:

Inference NB: https://www.kaggle.com/soumya9977/learning-to-torch-fasterrcnn-infer


Experiment log FasterRCNN:

| Version | Model | Config | File used | Link | CV/LB |
| --- | --- | --- | --- | --- | --- |
| v9 | fasterRCNN | resnet50, 90/10, e12, bs8, SGD, cnf0.15, i480 | fasterrcnn_resnet50_fpn-e10.pt | NB | 0.461/0.285 |
| v12 | fasterRCNN | same as above | fasterrcnn_resnet50_fpn-e9.pt | NB | 0.461/~0.285 |
| v10 | fasterRCNN | same as above | fasterrcnn_resnet50_fpn-e11.pt | NB | 0.459/0.285 |
| v13 | fasterRCNN | same as above | fasterrcnn_resnet50_fpn-e8.pt | NB | 0.460/0.288 |
| v11 | fasterRCNN | same as above | fasterrcnn_resnet50_fpn-e6.pt | NB | 0.457/0.291 |
| v16 | fasterRCNN | resnet50, 90/10, e12, bs8, SGD, cnf0.15, i480, geo aug | fasterrcnn_resnet50_fpn-e6.pt | NB | 0.467/0.274 |
| v17 | fasterRCNN | resnet50, 90/10, e20, bs8, SGD, cnf0.15, i480, color aug | fasterrcnn_resnet50_fpn-e6.pt | NB | 0.407/0.382 |
| v18 | fasterRCNN | resnet50, 90/10, e20, bs8, SGD, cnf0.15, i480, color aug | fasterrcnn_resnet50_fpn-e20.pt | NB | 0.407/0.291 |
| v20 | fasterRCNN | color+geo aug, 480p, SGD, 90:10, e20, multi conf, new train loop, bs8 | fasterrcnn_resnet50_fpn-e11.pt | NB | 0.338/? |
| v21 | fasterRCNN | color+geo aug, 480p, SGD, 90:10, e20, multi conf, new train loop, bs8 | fasterrcnn_resnet50_fpn-e20.pt | NB | 0.338/0.184 |
| v22 | fasterRCNN | resnet50, 90/10, e20, bs8, SGD, cnf0.15, i480, color aug [inf imgSize 2400] | fasterrcnn_resnet50_fpn-e6.pt | NB | 0.407/0.00 [problem in the code] |
| v23 | fasterRCNN | resnet50, 90/10, e16, bs8, AdamW, cnf0.15, i480, color aug [save_multy: future_resume] | fasterrcnn_resnet50_fpn-e7.pt | NB | 0.382/? |
| v24 | fasterRCNN | resnet50, 90/10, e16, bs8, AdamW, cnf0.1, i480, color aug [save_multy: future_resume] | fasterrcnn_resnet50_fpn-e7.pt | NB | 0.382/? |
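For reference, the CV/LB numbers in these tables are the competition's F2 metric, which weights recall four times as heavily as precision (the leaderboard additionally averages it over a sweep of IoU matching thresholds). A minimal per-threshold sketch:

```python
def f2(tp: int, fp: int, fn: int) -> float:
    """F-beta with beta=2: recall counts four times as much as precision."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = 4 * precision + recall
    return 5 * precision * recall / denom if denom else 0.0

print(f2(tp=40, fp=10, fn=10))   # P = R = 0.8 -> F2 = 0.8
```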

2. YOLOV5 table:

Experiment log YOLOV5:

| Version | Config | IoU & conf | Img size [train/test] | Checkpoint used | CV/LB |
| --- | --- | --- | --- | --- | --- |
| starfish-v17 [tracking, tta] | 1/5 fold, yolov5s6, 3000, e11, bs2 | 0.4, 0.28 | 1920 x 3 | best.pt | 0.81/0.571 |
| starfish-v16 [tracking, tta] | Good Moon model, yolov5s6 | 0.4, 0.28 | 6400 | f2_sub.pt | ?/0.647 |
| starfish-v15 [tracking, tta] | Good Moon model, yolov5s6 | 0.4, 0.28 | 1920 x 3 | f2_sub.pt | ?/0.641 |
| Leon-V5 - v3 [no tracking, tta] | Good Moon model, yolov5s6 | 0.4, 0.28 | 10000 | f2_sub.pt | ?/0.432 |
| Leon-V5 - v4 [no tracking, tta] | Good Moon model, yolov5s6 | 0.4, 0.20 | 10000 | f2_sub.pt | ?/0.424 |
| Leon-V5 - v1 [no tracking, tta] | Good Moon model, yolov5s6 | 0.50, 0.30 | 6400 | f2_sub.pt | ?/0.665 |
| Leon-V5 - v2 [no tracking, tta] | Good Moon model, yolov5s6 | 0.4, 0.28 | 6400 | f2_sub.pt | ?/0.665 |
| starfish-v13 [tracking, tta] | 1/5 fold, yolov5s5, 3000, e11, bs2 | 0.4, 0.28 | 1920 x 3 | best.pt | 0.76/0.588 |
| starfish-v12 [tracking, tta] | 1/5 fold, yolov5s5, 3000, e11, bs2 | 0.4, 0.15 | 1920 x 3 | best.pt | 0.76/0.580 |
| starfish-v07 [tracking, tta] | 1/5 fold, yolov5s5, 3000, e11, bs2 | 0.4, 0.28 | 1920 x 3 | best.pt | 0.76/0.588 |

Model config log:

| Version | Config | Checkpoint for sub | CV/LB |
| --- | --- | --- | --- |
| v4 | sheep's model fold2, CONF=0.28, IOU=0.40 | best | ?/0.616 |
| v5 | yolov5s5: albu[frcnn], imgsize=3600, bs=2, e11, CONF=0.28, IOU=0.40 | best.pt | ?/0.552 |
| v7 | same as above | epoch6.pt | 0.73871/0.552 |
| v8 | yolov5s5: ammarnassanalhajali yolov5 | best.pt | ?/? |
| v10 | yolov5s5: [BASE MODEL] imgsize=3600, bs=2, e11, CONF=0.28, IOU=0.40 | best.pt | 0.76../0.588 |
| v12 | yolov5s5: [BASE MODEL] imgsize=3600, bs=2, e11, CONF=0.15, IOU=0.40 | best.pt | 0.76../0.580 |
| v12 | yolov5s5: [BASE MODEL] imgsize=3600, bs=2, e11, CONF=0.15, IOU=0.40 | epoch7.pt | 0.76../? |
| v15 | yolov5s6: [Good Moon Model] CONF=0.28, IOU=0.40, img size=1980x2 | f2_sub2.pt | 0.76../? |
| v16 | yolov5s6: [Good Moon Model] CONF=0.28, IOU=0.40, img size=6400 | f2_sub2.pt | 0.76../? |
| v16 | yolov5s6: imgsize=3600, bs=2, e11, CONF=0.28, IOU=0.40 | best.pt | 0.81../? |
| v18 | yolov5s6: [Good Moon Model] CONF=0.30, IOU=0.50, img size=6400 | f2_sub2.pt | 0.76../? |
| v21 | yolov5s6: video-based split, vid:2, CONF=0.30, IOU=0.50, inf img size=6400, train 3584 | 6th epoch (best F2) | 0.89/0.620 |
| v23 | yolov5s6: video-based split, vid:1, CONF=0.30, IOU=0.50, inf img size=6400, train 3584 | 7th epoch (best F2) | 0.72/0.610 |
| v32 | yolov5s6: [Good Moon Model] CONF=0.30, IOU=0.50, img size=6400 | '../input/yolov5s6/yolov5s6_sub9.pt' | ?/0.680 |
| v33 | yolov5m6: 3584 img, adam, e6, bs2, vid:2 | last.pt | 0.87/0.558[p] |
| v34 | yolov5m6: 3k img, e6, bs2, vid:2 | epoch5.pt | 0.869/0.558[p] |
| v35 | yolov5s6: 3584 img, video_fold vid2, copypaste:0.5, e10 | epoch7.pt | 0.88/0.625[p] |
| v36 | yolov5m6: resume training, 3584 img, video_fold vid2, e11 | epoch9.pt | 0.88/0.623[p] |

3. YOLOV5m6 experiments:

  • yolov5m6try1-epoch5 = 0.600
  • yolov5m6try1-epoch3 = 0.562
  • yolov5s6-vid_id:0 = 0.555 [yolov5s6]

Future work:

  • Write an optuna script for tuning the yolov5 inference hyper-params (a rough sketch follows this list).
  • Add the remaining NBs and writeups.
  • Turn the notebook code into Python scripts.
  • Add model prediction videos.
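A rough sketch of the optuna item above; `evaluate_f2` is a hypothetical helper that would run validation inference at the proposed thresholds and return the F2 score.

```python
import optuna

def objective(trial: optuna.Trial) -> float:
    conf = trial.suggest_float("conf", 0.05, 0.50)   # detector confidence cut-off
    iou = trial.suggest_float("iou", 0.30, 0.70)     # NMS IoU threshold
    # hypothetical helper: run validation inference and score it with F2
    return evaluate_f2(conf_thres=conf, iou_thres=iou)

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=50)
print(study.best_params)
```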

Random Advice to Myself:

  • Don't participate in multiple competitions with overlapping timelines [one at a time].
  • Try to explore more fields [NLP, audio, tabular].
  • Use better ways to track experiments [WandB/Google Sheets].
  • In the middle of the competition, a LB-boosting trick [increasing the inference image size to 3x, 4x ... 10x] was shared, and it changed the momentum of the competition. It suddenly turned into a GPU war: people with more resources and compute were getting high LB scores. This really affected me because I did not have any good compute beyond Kaggle and Colab, and I felt like giving up. But through that I learned two life lessons:
    • Keep patience.
    • Either you go all the way or you don't go anywhere.

Afterward I understood that anyone can achieve anything if they have the patience to keep working irrespective of the outcome. As for the second point, I came to believe that if you stop a process in the middle, it gives you nothing but regret, because you started with a motivation, right? I used to look back at the posters I made during the competition and the timelines I drew up for different experimentation ideas, and the progress I had made motivated me to keep going all the way. So my advice is to create posters/TODOs/learning blogs while you are in the middle of the work; they will keep you motivated throughout the journey. When you feel like giving up, look at those posters/blogs and ask yourself: if I was going to give up after coming this far, why did I even start? They will remind you of your reason for starting.