Fine-tuned SGGen model mAP result #204

Open
narchitect opened this issue Jan 4, 2024 · 4 comments

Comments

@narchitect

Hello everyone,

I hope you can provide some insights on a matter we've been grappling with. We've been working with the pretrained Faster R-CNN model provided in this repository, attempting to fine-tune it for our specific dataset. However, due to the necessity of removing bbox layers when training SGGen, our bbox detection layers end up being trained solely on our dataset, without the benefit of pretrained values. Consequently, our mAP (mean Average Precision) struggles to exceed 10%.
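The situation described above, where layers whose shape depends on the number of classes cannot reuse pretrained values, can be sketched roughly as follows. This is only an illustration on a hypothetical toy model (`ToyDetector`, `cls_head` and `load_matching_weights` are made-up names, not this repository's API, and the class counts are arbitrary). It copies every checkpoint tensor whose name and shape still match and reports which layers are left at their random initialisation.

```python
import torch
from torch import nn


class ToyDetector(nn.Module):
    """Hypothetical stand-in for a detector: shared features plus a class-dependent head."""

    def __init__(self, num_classes: int):
        super().__init__()
        self.backbone = nn.Linear(8, 16)
        self.cls_head = nn.Linear(16, num_classes)  # shape depends on num_classes


def load_matching_weights(model: nn.Module, checkpoint: dict) -> list:
    """Copy every checkpoint tensor whose name and shape match; return the skipped keys."""
    own_state = model.state_dict()
    compatible = {
        name: tensor
        for name, tensor in checkpoint.items()
        if name in own_state and own_state[name].shape == tensor.shape
    }
    skipped = [name for name in checkpoint if name not in compatible]
    # strict=False tolerates the keys that were filtered out above.
    model.load_state_dict(compatible, strict=False)
    return skipped


pretrained = ToyDetector(num_classes=151)  # arbitrary size for the pretraining label set
finetuned = ToyDetector(num_classes=24)    # assumption: 23 classes plus background
skipped_keys = load_matching_weights(finetuned, pretrained.state_dict())
print("layers left at random initialisation:", skipped_keys)
```

Only the class-dependent head is skipped here; everything else keeps its pretrained values, which is why the head ends up being learned from the small dataset alone.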

Just to provide some context, our dataset comprises 377 similar images and includes 23 different classes, which, admittedly, doesn't make for an ideal scenario.

As a result, we've observed that the best mAP we could achieve using the SGGen model from this repository is approximately 25%. Given the challenges posed by our less-than-optimal data quality, we believe that achieving an mAP of 12% in fine-tuned models that require bbox detection, like SGGen, is the best we can realistically hope for.

Now, I'd like to reach out to the community to ask if anyone has experience fine-tuning SGGen models and whether they've achieved mAP values higher than 25%. We're particularly interested in understanding if a 10% mAP should be considered acceptable in this context.

Thank you in advance for sharing your insights and experiences. We look forward to your valuable input!


@Maelic

Maelic commented Feb 14, 2024

Hi @narchitect ,

The mAP you are referring to here measures the performance of your object detector alone (i.e. bounding box regression + classification). I would suggest switching from Faster-RCNN to another detector that performs better in few-shot settings, which seems to be your case. Faster-RCNN is a pretty old and weak detector at this point, especially for few-shot; a more recent model pretrained on a larger dataset, such as Swin Transformer, DETR or ViT, will do better.
Then you can train an SGGen model by freezing the weights of your object detector and replacing the RPN layers, as I explained here.
Do not replace the full backbone layers, or you will have to change the feature extractor as well, which is more complex.

@Lxy811

Lxy811 commented Mar 12, 2024


How do I freeze the weights of the object detector, and how would I implement that in code?

@narchitect
Author

Thank you so much for your reply! I was also planning to use another object detection model. I hope to get better results soon. Thanks again!

@Maelic

Maelic commented Apr 2, 2024


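How to freeze the weights depends on your detector. Here you can do something simpler: force the detector into eval mode with something like model.rpn.eval() or model.backbone.eval() somewhere before your training loop. Note that in PyTorch eval() only switches layers such as batch norm and dropout to inference behavior; to ensure no gradients are computed for those weights, also set requires_grad to False on their parameters.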
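A minimal sketch of freezing part of a model in PyTorch, using a hypothetical toy model (`ToySGGen`, `relation_head` and `freeze` are illustrative names, not this repository's API). It assumes the detector is an ordinary `nn.Module`, and it combines `requires_grad_(False)`, which stops gradients from being stored for the frozen parameters, with `eval()`, which fixes batch norm and dropout behavior.

```python
import torch
from torch import nn


def freeze(module: nn.Module) -> None:
    """Stop gradient updates and fix BatchNorm/Dropout behavior for a sub-module."""
    module.requires_grad_(False)  # no gradients are stored for these parameters
    module.eval()                 # BatchNorm uses running stats, Dropout is disabled


class ToySGGen(nn.Module):
    """Hypothetical model: a detector part to freeze and a relation head to train."""

    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(8, 16), nn.BatchNorm1d(16), nn.ReLU())
        self.relation_head = nn.Linear(16, 5)

    def forward(self, x):
        return self.relation_head(self.backbone(x))


model = ToySGGen()
model.train()           # puts every sub-module in training mode
freeze(model.backbone)  # call after model.train(), which would otherwise undo eval()

# Hand only the parameters that still require gradients to the optimizer.
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.SGD(trainable, lr=0.01)

before = model.backbone[0].weight.clone()
loss = model(torch.randn(4, 8)).sum()
loss.backward()
optimizer.step()
print("backbone unchanged:", torch.equal(before, model.backbone[0].weight))
```

One thing to watch for: many training loops call `model.train()` at the start of every epoch, which puts the frozen sub-modules back into training mode, so `eval()` may need to be re-applied after each such call.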


3 participants