This repository has been archived by the owner on Nov 21, 2023. It is now read-only.
I am getting a "Loss is NaN" error while training on custom data as well as on the PASCAL VOC benchmark dataset.
Here is the error:
INFO net.py: 271: labels_int32 : (512,) => cls_prob : (512, 10) ------|
INFO net.py: 271: bbox_pred : (512, 40) => loss_bbox : () ------- (op: SmoothL1Loss)
INFO net.py: 271: bbox_targets : (512, 40) => loss_bbox : () ------|
INFO net.py: 271: bbox_inside_weights : (512, 40) => loss_bbox : () ------|
INFO net.py: 271: bbox_outside_weights : (512, 40) => loss_bbox : () ------|
INFO net.py: 271: cls_prob : (512, 10) => accuracy_cls : () ------- (op: Accuracy)
INFO net.py: 271: labels_int32 : (512,) => accuracy_cls : () ------|
INFO net.py: 271: fpn_res2_2_sum : (1, 256, 336, 152) => _[mask]_roi_feat_fpn2 : (8, 256, 14, 14) ------- (op: RoIAlign)
INFO net.py: 271: mask_rois_fpn2 : (8, 5) => _[mask]_roi_feat_fpn2 : (8, 256, 14, 14) ------|
INFO net.py: 271: fpn_res3_7_sum : (1, 256, 168, 76) => _[mask]_roi_feat_fpn3 : (12, 256, 14, 14) ------- (op: RoIAlign)
INFO net.py: 271: mask_rois_fpn3 : (12, 5) => _[mask]_roi_feat_fpn3 : (12, 256, 14, 14) ------|
INFO net.py: 271: fpn_res4_35_sum : (1, 256, 84, 38) => _[mask]_roi_feat_fpn4 : (9, 256, 14, 14) ------- (op: RoIAlign)
INFO net.py: 271: mask_rois_fpn4 : (9, 5) => _[mask]_roi_feat_fpn4 : (9, 256, 14, 14) ------|
INFO net.py: 271: fpn_res5_2_sum : (1, 256, 42, 19) => _[mask]_roi_feat_fpn5 : (23, 256, 14, 14) ------- (op: RoIAlign)
INFO net.py: 271: mask_rois_fpn5 : (23, 5) => _[mask]_roi_feat_fpn5 : (23, 256, 14, 14) ------|
INFO net.py: 271: _[mask]_roi_feat_fpn2 : (8, 256, 14, 14) => _[mask]_roi_feat_shuffled : (52, 256, 14, 14) ------- (op: Concat)
INFO net.py: 271: _[mask]_roi_feat_fpn3 : (12, 256, 14, 14) => _[mask]_roi_feat_shuffled : (52, 256, 14, 14) ------|
INFO net.py: 271: _[mask]_roi_feat_fpn4 : (9, 256, 14, 14) => _[mask]_roi_feat_shuffled : (52, 256, 14, 14) ------|
INFO net.py: 271: _[mask]_roi_feat_fpn5 : (23, 256, 14, 14) => _[mask]_roi_feat_shuffled : (52, 256, 14, 14) ------|
INFO net.py: 271: _[mask]_roi_feat_shuffled : (52, 256, 14, 14) => _[mask]_roi_feat : (52, 256, 14, 14) ------- (op: BatchPermutation)
INFO net.py: 271: mask_rois_idx_restore_int32 : (52,) => _[mask]_roi_feat : (52, 256, 14, 14) ------|
INFO net.py: 271: _[mask]_roi_feat : (52, 256, 14, 14) => _[mask]_fcn1 : (52, 256, 14, 14) ------- (op: Conv)
INFO net.py: 271: _[mask]_fcn1 : (52, 256, 14, 14) => _[mask]_fcn1 : (52, 256, 14, 14) ------- (op: Relu)
INFO net.py: 271: _[mask]_fcn1 : (52, 256, 14, 14) => _[mask]_fcn2 : (52, 256, 14, 14) ------- (op: Conv)
INFO net.py: 271: _[mask]_fcn2 : (52, 256, 14, 14) => _[mask]_fcn2 : (52, 256, 14, 14) ------- (op: Relu)
INFO net.py: 271: _[mask]_fcn2 : (52, 256, 14, 14) => _[mask]_fcn3 : (52, 256, 14, 14) ------- (op: Conv)
INFO net.py: 271: _[mask]_fcn3 : (52, 256, 14, 14) => _[mask]_fcn3 : (52, 256, 14, 14) ------- (op: Relu)
INFO net.py: 271: _[mask]_fcn3 : (52, 256, 14, 14) => _[mask]_fcn4 : (52, 256, 14, 14) ------- (op: Conv)
INFO net.py: 271: _[mask]_fcn4 : (52, 256, 14, 14) => _[mask]_fcn4 : (52, 256, 14, 14) ------- (op: Relu)
INFO net.py: 271: _[mask]_fcn4 : (52, 256, 14, 14) => conv5_mask : (52, 256, 28, 28) ------- (op: ConvTranspose)
INFO net.py: 271: conv5_mask : (52, 256, 28, 28) => conv5_mask : (52, 256, 28, 28) ------- (op: Relu)
INFO net.py: 271: conv5_mask : (52, 256, 28, 28) => mask_fcn_logits : (52, 10, 28, 28) ------- (op: Conv)
INFO net.py: 271: mask_fcn_logits : (52, 10, 28, 28) => loss_mask : () ------- (op: SigmoidCrossEntropyLoss)
INFO net.py: 271: masks_int32 : (52, 7840) => loss_mask : () ------|
INFO net.py: 275: End of model: generalized_rcnn
../anaconda2/lib/python2.7/site-packages/numpy/lib/function_base.py:4033: RuntimeWarning: Invalid value encountered in median
r = func(a, **kwargs)
json_stats: {"accuracy_cls": 0.898438, "eta": "21 days, 12:25:49", "iter": 0, "loss": NaN, "loss_bbox": -0.071702, "loss_cls": 2.302585, "loss_mask": NaN, "loss_rpn_bbox_fpn2": 0.000000, "loss_rpn_bbox_fpn3": 0.000000, "loss_rpn_bbox_fpn4": NaN, "loss_rpn_bbox_fpn5": 0.000000, "loss_rpn_bbox_fpn6": 0.000000, "loss_rpn_cls_fpn2": 0.000000, "loss_rpn_cls_fpn3": NaN, "loss_rpn_cls_fpn4": 0.000000, "loss_rpn_cls_fpn5": 0.000000, "loss_rpn_cls_fpn6": 0.000000, "lr": 0.000333, "mb_qsize": 64, "mem": 7174, "time": 7.150576}
CRITICAL train_net.py: 159: Loss is NaN, exiting...
I have already tried lowering the base learning rate.
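To see at a glance which loss terms are affected, the json_stats line can be parsed with a few lines of Python. This is only a minimal sketch: nan_losses is a hypothetical helper, not part of Detectron, and the sample line is abbreviated from the log above. It relies on Python's json module accepting the bare NaN token by default.

```python
import json
import math


def nan_losses(log_line):
    """Return the sorted names of json_stats entries whose value is NaN.

    Hypothetical helper for illustration; it is not part of Detectron.
    """
    prefix = "json_stats: "
    if not log_line.startswith(prefix):
        raise ValueError("not a json_stats line")
    # json.loads parses the non-standard NaN token into float('nan') by default.
    stats = json.loads(log_line[len(prefix):])
    return sorted(
        key for key, value in stats.items()
        if isinstance(value, float) and math.isnan(value)
    )


# Abbreviated copy of the json_stats line from the log above.
line = (
    'json_stats: {"iter": 0, "loss": NaN, "loss_bbox": -0.071702, '
    '"loss_cls": 2.302585, "loss_mask": NaN, "loss_rpn_bbox_fpn4": NaN, '
    '"loss_rpn_cls_fpn3": NaN}'
)
print(nan_losses(line))
```

On the full line from the log, the NaN entries are loss, loss_mask, loss_rpn_bbox_fpn4 and loss_rpn_cls_fpn3; loss_bbox is finite but negative (-0.071702).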