Use YoloR with swin transformer as backbone. #71

farazBhatti · 2022-09-01T09:01:03Z

@leondgarse I am trying to run inference using YOLOR with a Swin backbone, but I am getting the following results. What could be the issue?

from keras_cv_attention_models import yolor, swin_transformer_v2, test_images
from keras_cv_attention_models.coco import data

# Build the Swin backbone and wrap it in a YOLOR detector
bb = swin_transformer_v2.SwinTransformerV2Small_window16(input_shape=(256, 256, 3), num_classes=1000)
model = yolor.YOLOR(backbone=bb)

# Detect
imm = test_images.dog_cat()
preds = model(model.preprocess_input(imm))
bboxs, labels, confidences = model.decode_predictions(preds)[0]

# Show
data.show_image_with_bboxes(imm, bboxs, labels, confidences)

resulting output


leondgarse · 2022-09-02T02:09:21Z

I think this is the same issue as #70.
You have to train it first: in this combination the backbone and the detection header each use their own separately pre-trained weights, so we cannot expect a good result.
I may try training it myself, but I have very little spare time recently...

farazBhatti · 2022-09-02T04:54:43Z

@leondgarse, which training script should I use to train the above-mentioned combination?

leondgarse · 2022-09-03T11:40:56Z

This is a command I've just tested. Detailed usage for coco_train_script.py is explained in COCO training and evaluating.

CUDA_VISIBLE_DEVICES='0' ./coco_train_script.py --backbone swin_transformer_v2.SwinTransformerV2Small_window16 \
--det_header yolor.YOLOR --anchors_mode yolor -s yolor_swin

Here is a test result after running only 9 epochs:

from keras_cv_attention_models import yolor, swin_transformer_v2, test_images

bb = swin_transformer_v2.SwinTransformerV2Small_window16(input_shape=(256, 256, 3), pretrained=None, num_classes=0)
model = yolor.YOLOR(backbone=bb, input_shape=(256, 256, 3), rescale_mode='torch')  # Default rescale_mode from coco_train_script.py is "torch"
model.load_weights('checkpoints/yolor_swin_latest.h5')  # Load the trained weights

# Detect
imm = test_images.dog_cat()
preds = model(model.preprocess_input(imm))
bboxs, labels, confidences = model.decode_predictions(preds)[0]

# Show
from keras_cv_attention_models.coco import data
data.show_image_with_bboxes(imm, bboxs, labels, confidences)
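
The rescale_mode='torch' argument matters because inference has to normalize images the same way training did. As a rough illustration only, the sketch below assumes "torch" follows the common convention that Keras documents for its own "torch" preprocessing mode: scale pixels to [0, 1], then normalize each channel with the ImageNet mean and standard deviation. The helper name torch_rescale_pixel is hypothetical and is not part of keras_cv_attention_models; the library's exact implementation has not been checked here.

```python
# Assumption: "torch" mode = scale to [0, 1], then ImageNet mean/std per channel.
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def torch_rescale_pixel(rgb):
    """Normalize one RGB pixel whose channel values are in [0, 255].

    Hypothetical helper for illustration; not a library function.
    """
    return tuple(
        (channel / 255.0 - mean) / std
        for channel, mean, std in zip(rgb, IMAGENET_MEAN, IMAGENET_STD)
    )


if __name__ == "__main__":
    # A mid-gray pixel ends up close to zero in every channel.
    print(torch_rescale_pixel((124, 116, 104)))
```

If weights trained with one rescale mode are loaded into a model built with another, the inputs land in a different numeric range and detections degrade, which is why the mode is passed explicitly above.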

leondgarse · 2022-09-03T11:45:44Z

Uh, right, pull the latest code first, as I made a small change for a Swin float16 / float32 issue.

farazBhatti · 2022-09-03T11:47:48Z

OK, sure. I am working on it now. Would it be possible for you to share your trained model?

leondgarse · 2022-09-03T12:35:58Z

If you really want, this yolor_swin.h5 is a model trained for 10 epochs.

farazBhatti · 2022-09-03T12:40:02Z

@leondgarse, thanks very much. I am currently training this model on Colab; I'll share it once it's done. :)

hamzakhalil798 · 2022-09-21T10:37:15Z

@leondgarse

Hey, thank you so much for sharing your work and helping to resolve these queries. I want to train the model you provided above, but I am severely lacking in compute resources. It would be very generous of you to train the model a little more whenever feasible and share it, if possible.
Regards

leondgarse · 2022-09-22T12:07:20Z

Yes, my training actually finished earlier, using input_shape 512. But it only reaches test AP 0.4204, which is not very satisfying given that the total FLOPs is 63.25G...

CUDA_VISIBLE_DEVICES='0' ./coco_train_script.py \
--backbone swin_transformer_v2.SwinTransformerV2Small_window16 --det_header yolor.YOLOR \
--anchors_mode yolor -i 512 -b 32 -p adamw

The model weights are uploaded as YOLOR_SwinTransformerV2Small_window16_512_epoch_89_val_ap_ar_0.4204.h5. Basic usage is the same as the previous one, just with input_shape=512.

from keras_cv_attention_models import yolor, swin_transformer_v2, test_images

bb = swin_transformer_v2.SwinTransformerV2Small_window16(input_shape=(512, 512, 3), pretrained=None, num_classes=0)
model = yolor.YOLOR(backbone=bb, input_shape=(512, 512, 3), rescale_mode='torch')  # Default rescale_mode from coco_train_script.py is "torch"
model.load_weights('YOLOR_SwinTransformerV2Small_window16_512_epoch_89_val_ap_ar_0.4204.h5')  # Load the trained weights

# Detect
imm = test_images.dog_cat()
preds = model(model.preprocess_input(imm))
bboxs, labels, confidences = model.decode_predictions(preds)[0]

# Show
from keras_cv_attention_models.coco import data
data.show_image_with_bboxes(imm, bboxs, labels, confidences)

Eval

CUDA_VISIBLE_DEVICES='1' ./coco_eval_script.py \
-m checkpoints/YOLOR_SwinTransformerV2Small_window16_512_epoch_89_val_ap_ar_0.4204.h5 \
--nms_method hard --nms_iou_or_sigma 0.65 --nms_max_output_size 300 --nms_topk -1
# Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.426
# Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.618
# Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.456
# Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.216
# Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.473
# Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.608
# Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.336
# Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.526
# Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.562
# Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.316
# Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.628
# Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.778
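
For readers unfamiliar with the flags in the eval command above, here is a minimal pure-Python sketch of what "hard" NMS with an IoU threshold of 0.65 and a cap of 300 outputs generally means: keep the highest-scoring box, drop every remaining box that overlaps a kept box by more than the threshold, and repeat. The functions iou and hard_nms are hypothetical illustrations, not the code used by coco_eval_script.py, and the corner-style box layout is an assumption.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (top, left, bottom, right)."""
    top = max(box_a[0], box_b[0])
    left = max(box_a[1], box_b[1])
    bottom = min(box_a[2], box_b[2])
    right = min(box_a[3], box_b[3])
    inter = max(0.0, bottom - top) * max(0.0, right - left)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def hard_nms(boxes, scores, iou_threshold=0.65, max_output_size=300):
    """Return indices of kept boxes, highest score first (illustrative sketch)."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for idx in order:
        # Keep a box only if it does not overlap any already kept box too much.
        if all(iou(boxes[idx], boxes[k]) <= iou_threshold for k in keep):
            keep.append(idx)
            if len(keep) == max_output_size:
                break
    return keep


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (0, 0, 10, 9), (20, 20, 30, 30)]
    scores = [0.9, 0.8, 0.7]
    print(hard_nms(boxes, scores))
```

In this toy input the second box overlaps the first with IoU 0.9, so it is suppressed at threshold 0.65, while the third box does not overlap and is kept.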

hamzakhalil798 · 2022-09-22T14:22:00Z

@leondgarse
Thank you so much!!!
It means a lot.
You've made my day.

farazBhatti changed the title ~~Use YoloR with swin trqansformer as backbone.~~ Use YoloR with swin transformer as backbone. Sep 2, 2022

farazBhatti closed this as completed Sep 3, 2022
