Hi,
I am training a model with a PSP decode head on my custom dataset, which is highly imbalanced. I am using dice loss due to the imbalanced nature of the dataset, but my gradients are exploding and are on the order of ~5000. I am using the Adam optimizer with LR 3e-4 and weight decay 0.0001, as per the mmsegmentation documentation. I even tried gradient clipping with

```python
optimizer_config = dict(
    _delete_=True, grad_clip=dict(max_norm=35, norm_type=2))
```

but the lowest the gradient norm goes is around ~3000. Can anyone tell me how I can stabilize the training process and get my gradients down to 1.0? My targets are in the range [0, 1] (binary semantic segmentation task).
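For reference, this is the full optimizer section of my config (a minimal sketch following mmsegmentation's mmcv-style config conventions; everything apart from the values quoted above is illustrative):

```python
# Minimal sketch of the optimizer settings described above, following
# mmsegmentation's mmcv-style config conventions. The values are the ones
# quoted in this issue; the surrounding structure is illustrative.
optimizer = dict(type='Adam', lr=3e-4, weight_decay=0.0001)
optimizer_config = dict(
    _delete_=True,  # drop the inherited optimizer_config from the base config
    grad_clip=dict(max_norm=35, norm_type=2))  # clip total L2 grad norm at 35
```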
Any help will be highly appreciated.
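For reference, this is roughly how I am reading the gradient norms quoted above (a minimal PyTorch sketch; `model` is a stand-in for my actual network):

```python
import torch

def total_grad_norm(model: torch.nn.Module, norm_type: float = 2.0) -> float:
    # Total norm over all parameter gradients, computed the same way
    # grad_clip with norm_type=2 measures it before clipping.
    grads = [p.grad.detach() for p in model.parameters() if p.grad is not None]
    if not grads:
        return 0.0
    return torch.norm(
        torch.stack([g.norm(norm_type) for g in grads]), norm_type).item()
```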