
Train with dota-train-dataset (1024, 14384 files), the mAP on dota-val-dataset is 70.84 #4

Closed
hukaixuan19970627 opened this issue Jun 13, 2021 · 5 comments


@hukaixuan19970627
Contributor

Thank you for your code. I'm learning how to use it, but I've run into some problems and hope to get your help.
config: orientedreppoints_r50_demo.py
changes:
img_per_gpu=2 -> img_per_gpu=4
workers_per_gpu=2 -> workers_per_gpu=4
lr=0.01 -> lr=0.005
environment: 2 GPUs (Tesla P40)
mAP on val: 70.84
class APs: [89.43 73.79 40.19 66.33 73.53 82.06 88.16 90.86 60.59 86.46 65.51 64.86 71.29 57.60 51.94]
My question: when I use your checkpoint (trained on the trainval dataset) to detect on the dota-val-dataset, the mAP is about 82.
But the mAP of 70.84 (checkpoint trained on the dota-train-dataset, tested on val) feels lower than I expected (73 ~ 75). Is this normal?
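For reference, the edits above amount to something like the following in the config. This is a minimal sketch; the exact field names (e.g. imgs_per_gpu in mmdetection 1.x-style configs) are an assumption and should be verified against orientedreppoints_r50_demo.py itself:

    # Sketch of the changes described above; field names assume an
    # mmdetection 1.x-style config and may differ in the actual file.
    data = dict(
        imgs_per_gpu=4,      # was 2
        workers_per_gpu=4,   # was 2
        # ... dataset settings left unchanged ...
    )
    optimizer = dict(type='SGD', lr=0.005, momentum=0.9,
                     weight_decay=0.0001)  # lr was 0.01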

@LiWentomng
Owner

LiWentomng commented Jun 14, 2021

Training on the train set and evaluating on the val set, my results reach mAP 73.37.
class APs: [89.89954584 75.09381718 51.91760568 69.30359075 75.60788996 82.47240929 88.02548317 90.72148874 66.22466264 87.10500443 69.58421786 68.80032583 72.45845151 61.51307246 51.88949827]
My trained model is here (password: aabb). You can try it.

I guess your results come down to these three aspects:

  1. My train set includes 15749 files (subsize=1024×1024, gap=200); the number of your files is smaller than that. My script prepare_dota1_train_val.py prepares the train and val datasets, and you can refer to it.

  2. The learning rate is a sensitive factor for model training. My device environment is as follows: 8 RTX 2080 Ti GPUs, 2 imgs per GPU.
    You can try a learning rate of 0.006 or 0.008.

  3. You can also add "RandomRotate" to the config to get a better mAP, as in the sketch after this list:
    dict(type='RandomRotate', rate=0.5, angles=[30, 60, 90, 120, 150], auto_bound=False)
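Here is a sketch of where that entry would sit in the config's train_pipeline. The surrounding transforms are illustrative placeholders for whatever the demo config already defines, not copied from the repo:

    # Hedged sketch of a train_pipeline with RandomRotate added; only the
    # RandomRotate entry is taken from the suggestion above.
    train_pipeline = [
        dict(type='LoadImageFromFile'),
        dict(type='LoadAnnotations', with_bbox=True),
        # ... the resize / flip transforms from the original config ...
        dict(type='RandomRotate', rate=0.5,
             angles=[30, 60, 90, 120, 150], auto_bound=False),
        # ... normalize / pad / format-bundle / collect transforms ...
    ]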

If you have any questions about this problem, please let me know. I'll try to help you get the normal results.

@LiWentomng
Owner

@hukaixuan19970627

@hukaixuan19970627
Contributor Author

Yeah, the learning rate does have a significant impact on the results. I got mAP 65 when the environment was 2 Tesla P40s, 4 imgs per GPU, lr=0.01 (trained on the dota-train-dataset, tested on the dota-val-dataset).
My dota-train-dataset includes 14384 files (subsize=1024×1024, gap=100); maybe that's what makes the difference in results.

@hukaixuan19970627
Contributor Author

Have you tried mixed precision training?
I added fp16 = dict(loss_scale=512.) to the config file, but the mAP is just 4.78.
btw: the mAP is 74.98 with the same config file and FP32 training.

@LiWentomng
Owner

I haven't tried mixed precision training with this model.
As far as I know, the Tesla P40 may not support FP16.
Besides, on a GPU that supports it, loss_scale=512 adjusts the magnification of the loss and gradients during training; the appropriate range is 0-1000. I guess the model parameters are not being updated because the gradients become too small under fp16. Maybe a larger value will give a better result.
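If you retry fp16 along those lines, the config change would look like this sketch (a larger static scale per the suggestion above; the 'dynamic' variant only exists in newer mmcv releases, so treat it as an assumption to check against your install):

    fp16 = dict(loss_scale=1024.)        # larger static scale to try
    # fp16 = dict(loss_scale='dynamic')  # only if your mmcv supports it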
