Hi,
I am training a model with a PSP decode head on my custom dataset, which is highly imbalanced. I am using dice loss due to the imbalanced nature of the dataset, but my gradients are exploding and are on the order of ~5000. I am using the Adam optimizer with LR 3e-4 and weight decay 0.0001, as per the mmsegmentation documentation. I even tried gradient clipping with

```python
optimizer_config = dict(
    _delete_=True, grad_clip=dict(max_norm=35, norm_type=2))
```

but the lowest the gradient norm goes is around ~3000. Can anyone tell me how I can stabilize the training process and get my gradients down to 1.0? My targets are in the range [0, 1] (binary semantic segmentation task).
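For reference, this is the full optimizer section of my config (a minimal sketch following mmsegmentation's mmcv-style config conventions; everything apart from the values quoted above is illustrative):

```python
# Minimal sketch of the optimizer settings described above, following
# mmsegmentation's mmcv-style config conventions. The values are the ones
# quoted in this issue; the surrounding structure is illustrative.
optimizer = dict(type='Adam', lr=3e-4, weight_decay=0.0001)
optimizer_config = dict(
    _delete_=True,  # drop the inherited optimizer_config from the base config
    grad_clip=dict(max_norm=35, norm_type=2))  # clip total L2 grad norm at 35
```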
Any help will be highly appreciated.
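For reference, this is roughly how I am reading the gradient norms quoted above (a minimal PyTorch sketch; `model` is a stand-in for my actual network):

```python
import torch

def total_grad_norm(model: torch.nn.Module, norm_type: float = 2.0) -> float:
    # Total norm over all parameter gradients, computed the same way
    # grad_clip with norm_type=2 measures it before clipping.
    grads = [p.grad.detach() for p in model.parameters() if p.grad is not None]
    if not grads:
        return 0.0
    return torch.norm(
        torch.stack([g.norm(norm_type) for g in grads]), norm_type).item()
```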