@chingyaoc When I train the model, I get a negative loss. The batch size is 128 on 4 GPUs.
Epoch: [0][600/781] Time 0.605 ( 0.451) Data 0.000 ( 0.026) LR 1.4405e-06 (7.2023e-07) Loss -1.3995e-01 (5.4105e-03)
Epoch: [0][610/781] Time 0.600 ( 0.453) Data 0.000 ( 0.025) LR 1.4645e-06 (7.3223e-07) Loss -2.5937e-01 (1.3689e-03)
Epoch: [0][620/781] Time 0.594 ( 0.455) Data 0.000 ( 0.025) LR 1.4885e-06 (7.4424e-07) Loss -1.5513e-01 (-1.7201e-03)
Epoch: [0][630/781] Time 0.595 ( 0.458) Data 0.000 ( 0.024) LR 1.5125e-06 (7.5624e-07) Loss -2.2810e-01 (-5.3600e-03)
Epoch: [0][640/781] Time 0.596 ( 0.460) Data 0.000 ( 0.024) LR 1.5365e-06 (7.6825e-07) Loss -2.3734e-01 (-8.6938e-03)
Epoch: [0][650/781] Time 0.599 ( 0.462) Data 0.000 ( 0.024) LR 1.5605e-06 (7.8025e-07) Loss -1.6923e-01 (-1.2377e-02)
Epoch: [0][660/781] Time 0.298 ( 0.462) Data 0.000 ( 0.023) LR 1.5845e-06 (7.9225e-07) Loss -2.4142e-01 (-1.5588e-02)
Epoch: [0][670/781] Time 0.294 ( 0.459) Data 0.000 ( 0.023) LR 1.6085e-06 (8.0426e-07) Loss -2.9520e-01 (-1.9183e-02)
Epoch: [0][680/781] Time 0.298 ( 0.457) Data 0.001 ( 0.023) LR 1.6325e-06 (8.1626e-07) Loss -3.0449e-01 (-2.3525e-02)
Epoch: [0][690/781] Time 0.302 ( 0.455) Data 0.000 ( 0.022) LR 1.6565e-06 (8.2827e-07) Loss -2.5155e-01 (-2.7540e-02)
Epoch: [0][700/781] Time 0.296 ( 0.452) Data 0.000 ( 0.022) LR 1.6805e-06 (8.4027e-07) Loss -3.9192e-01 (-3.1093e-02)
Epoch: [0][710/781] Time 0.292 ( 0.450) Data 0.000 ( 0.022) LR 1.7045e-06 (8.5227e-07) Loss -2.4313e-01 (-3.4482e-02)
Epoch: [0][720/781] Time 0.295 ( 0.448) Data 0.000 ( 0.021) LR 1.7286e-06 (8.6428e-07) Loss -3.2008e-01 (-3.8349e-02)
Epoch: [0][730/781] Time 0.295 ( 0.446) Data 0.000 ( 0.021) LR 1.7526e-06 (8.7628e-07) Loss -2.6550e-01 (-4.2157e-02)