
Loss value returns NaN in LeNetMnistWithCustomCallbacks example #8

Closed
devcrocod opened this issue Sep 22, 2020 · 8 comments

@devcrocod
Contributor

During training, the loss of every batch becomes NaN, and the same happens on the test data. As a result, the accuracy is 0.098.
This does not happen on every run.
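One way to surface this failure early from a custom callback is to check each batch loss for NaN or Infinity and abort the run as soon as divergence is detected. A minimal sketch in plain Kotlin — the callback hook and its wiring are hypothetical, not the actual KotlinDL API:

```kotlin
// Guard a hypothetical custom callback could call from its batch-end hook:
// flags the loss the moment it stops being a finite number.
fun lossHasDiverged(lossValue: Double): Boolean =
    lossValue.isNaN() || lossValue.isInfinite()

fun main() {
    // Batch losses reproducing the failure pattern from the log output.
    val batchLosses = listOf(682.77, 463139.56, 2.8075E14, Double.POSITIVE_INFINITY, Double.NaN)
    for ((batch, loss) in batchLosses.withIndex()) {
        if (lossHasDiverged(loss)) {
            println("Aborting: loss diverged at batch $batch (loss=$loss)")
            return
        }
        println("Batch $batch loss=$loss")
    }
}
```

Stopping at the first non-finite loss avoids wasting the rest of the epoch once the weights are already destroyed.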

@devcrocod
Contributor Author

Output:

Extracting 60000 images of 28x28 from train-images-idx3-ubyte.gz
Extracting 60000 labels from train-labels-idx1-ubyte.gz
Extracting 10000 images of 28x28 from t10k-images-idx3-ubyte.gz
Extracting 10000 labels from t10k-labels-idx1-ubyte.gz
21:49:39.788 [main] DEBUG api.keras.Sequential - Conv2D(filters=32, kernelSize=[5, 5], strides=[1, 1, 1, 1], dilations=[1, 1, 1, 1], activation=Relu, kernelInitializer=HeNormal(seed=12) VarianceScaling(scale=2.0, mode=FAN_IN, distribution=TRUNCATED_NORMAL, seed=12), biasInitializer=api.keras.initializers.Zeros@be35cd9, kernelShape=[5, 5, 1, 32], padding=SAME); outputShape: [-1, 28, 28, 32]
21:49:39.789 [main] DEBUG api.keras.Sequential - MaxPool2D(poolSize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding=SAME); outputShape: [-1, 14, 14, 32]
21:49:39.790 [main] DEBUG api.keras.Sequential - Conv2D(filters=64, kernelSize=[5, 5], strides=[1, 1, 1, 1], dilations=[1, 1, 1, 1], activation=Relu, kernelInitializer=HeNormal(seed=12) VarianceScaling(scale=2.0, mode=FAN_IN, distribution=TRUNCATED_NORMAL, seed=12), biasInitializer=api.keras.initializers.Zeros@1b6e1eff, kernelShape=[5, 5, 32, 64], padding=SAME); outputShape: [-1, 14, 14, 64]
21:49:39.790 [main] DEBUG api.keras.Sequential - MaxPool2D(poolSize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding=SAME); outputShape: [-1, 7, 7, 64]
21:49:39.791 [main] DEBUG api.keras.Sequential - Flatten; outputShape: [3136]
21:49:39.792 [main] DEBUG api.keras.Sequential - Dense(outputSize=512, activation=Relu, kernelInitializer=HeNormal(seed=12) VarianceScaling(scale=2.0, mode=FAN_IN, distribution=TRUNCATED_NORMAL, seed=12), biasInitializer=Constant(constantValue=0.1), kernelShape=[3136, 512], biasShape=[512]); outputShape: [512]
21:49:39.793 [main] DEBUG api.keras.Sequential - Dense(outputSize=10, activation=Linear, kernelInitializer=HeNormal(seed=12) VarianceScaling(scale=2.0, mode=FAN_IN, distribution=TRUNCATED_NORMAL, seed=12), biasInitializer=Constant(constantValue=0.1), kernelShape=[512, 10], biasShape=[10]); outputShape: [10]
Name: default_data_placeholder; Type: Placeholder; Out #tensors:  1
Name: conv2d_1_conv2d_kernel; Type: VariableV2; Out #tensors:  1
Name: conv2d_1_conv2d_bias; Type: VariableV2; Out #tensors:  1
Name: Const; Type: Const; Out #tensors:  1
Name: Const_1; Type: Const; Out #tensors:  1
Name: StatelessTruncatedNormal; Type: StatelessTruncatedNormal; Out #tensors:  1
Name: Const_2; Type: Const; Out #tensors:  1
Name: Cast; Type: Cast; Out #tensors:  1
Name: Init_conv2d_1_conv2d_kernel; Type: Mul; Out #tensors:  1
Name: Assign_conv2d_1_conv2d_kernel; Type: Assign; Out #tensors:  1
Name: Const_3; Type: Const; Out #tensors:  1
Name: Init_conv2d_1_conv2d_bias/Zero; Type: Const; Out #tensors:  1
Name: Init_conv2d_1_conv2d_bias/Fill; Type: Fill; Out #tensors:  1
Name: Assign_conv2d_1_conv2d_bias; Type: Assign; Out #tensors:  1
Name: conv2d_3_conv2d_kernel; Type: VariableV2; Out #tensors:  1
Name: conv2d_3_conv2d_bias; Type: VariableV2; Out #tensors:  1
Name: Const_4; Type: Const; Out #tensors:  1
Name: Const_5; Type: Const; Out #tensors:  1
Name: StatelessTruncatedNormal_1; Type: StatelessTruncatedNormal; Out #tensors:  1
Name: Const_6; Type: Const; Out #tensors:  1
Name: Cast_1; Type: Cast; Out #tensors:  1
Name: Init_conv2d_3_conv2d_kernel; Type: Mul; Out #tensors:  1
Name: Assign_conv2d_3_conv2d_kernel; Type: Assign; Out #tensors:  1
Name: Const_7; Type: Const; Out #tensors:  1
Name: Init_conv2d_3_conv2d_bias/Zero; Type: Const; Out #tensors:  1
Name: Init_conv2d_3_conv2d_bias/Fill; Type: Fill; Out #tensors:  1
Name: Assign_conv2d_3_conv2d_bias; Type: Assign; Out #tensors:  1
Name: Const_8; Type: Const; Out #tensors:  1
Name: dense_6_dense_kernel; Type: VariableV2; Out #tensors:  1
Name: dense_6_dense_bias; Type: VariableV2; Out #tensors:  1
Name: Const_9; Type: Const; Out #tensors:  1
Name: Const_10; Type: Const; Out #tensors:  1
Name: StatelessTruncatedNormal_2; Type: StatelessTruncatedNormal; Out #tensors:  1
Name: Const_11; Type: Const; Out #tensors:  1
Name: Cast_2; Type: Cast; Out #tensors:  1
Name: Init_dense_6_dense_kernel; Type: Mul; Out #tensors:  1
Name: Assign_dense_6_dense_kernel; Type: Assign; Out #tensors:  1
Name: Const_12; Type: Const; Out #tensors:  1
Name: Const_13; Type: Const; Out #tensors:  1
Name: Init_dense_6_dense_bias; Type: Fill; Out #tensors:  1
Name: Assign_dense_6_dense_bias; Type: Assign; Out #tensors:  1
Name: dense_7_dense_kernel; Type: VariableV2; Out #tensors:  1
Name: dense_7_dense_bias; Type: VariableV2; Out #tensors:  1
Name: Const_14; Type: Const; Out #tensors:  1
Name: Const_15; Type: Const; Out #tensors:  1
Name: StatelessTruncatedNormal_3; Type: StatelessTruncatedNormal; Out #tensors:  1
Name: Const_16; Type: Const; Out #tensors:  1
Name: Cast_3; Type: Cast; Out #tensors:  1
Name: Init_dense_7_dense_kernel; Type: Mul; Out #tensors:  1
Name: Assign_dense_7_dense_kernel; Type: Assign; Out #tensors:  1
Name: Const_17; Type: Const; Out #tensors:  1
Name: Const_18; Type: Const; Out #tensors:  1
Name: Init_dense_7_dense_bias; Type: Fill; Out #tensors:  1
Name: Assign_dense_7_dense_bias; Type: Assign; Out #tensors:  1
Name: Placeholder; Type: Placeholder; Out #tensors:  1
Name: Conv2d; Type: Conv2D; Out #tensors:  1
Name: BiasAdd; Type: BiasAdd; Out #tensors:  1
Name: Activation_conv2d_1; Type: Relu; Out #tensors:  1
Name: Const_19; Type: Const; Out #tensors:  1
Name: Const_20; Type: Const; Out #tensors:  1
Name: MaxPool; Type: MaxPoolV2; Out #tensors:  1
Name: Conv2d_1; Type: Conv2D; Out #tensors:  1
Name: BiasAdd_1; Type: BiasAdd; Out #tensors:  1
Name: Activation_conv2d_3; Type: Relu; Out #tensors:  1
Name: Const_21; Type: Const; Out #tensors:  1
Name: Const_22; Type: Const; Out #tensors:  1
Name: MaxPool_1; Type: MaxPoolV2; Out #tensors:  1
Name: Reshape; Type: Reshape; Out #tensors:  1
Name: MatMul; Type: MatMul; Out #tensors:  1
Name: Add; Type: Add; Out #tensors:  1
Name: Activation_dense_6; Type: Relu; Out #tensors:  1
Name: MatMul_1; Type: MatMul; Out #tensors:  1
Name: Add_1; Type: Add; Out #tensors:  1
Name: SquaredDifference; Type: SquaredDifference; Out #tensors:  1
Name: Const_23; Type: Const; Out #tensors:  1
Name: Mean; Type: Mean; Out #tensors:  1
Name: Const_24; Type: Const; Out #tensors:  1
Name: default_training_loss; Type: Sum; Out #tensors:  1
Name: Gradients/OnesLike; Type: OnesLike; Out #tensors:  1
Name: Gradients/Shape; Type: Shape; Out #tensors:  1
Name: Gradients/Const; Type: Const; Out #tensors:  1
Name: Gradients/Const_1; Type: Const; Out #tensors:  1
Name: Gradients/Size; Type: Size; Out #tensors:  1
Name: Gradients/Add; Type: Add; Out #tensors:  1
Name: Gradients/Mod; Type: Mod; Out #tensors:  1
Name: Gradients/Range; Type: Range; Out #tensors:  1
Name: Gradients/OnesLike_1; Type: OnesLike; Out #tensors:  1
Name: Gradients/DynamicStitch; Type: DynamicStitch; Out #tensors:  1
Name: Gradients/Const_2; Type: Const; Out #tensors:  1
Name: Gradients/Maximum; Type: Maximum; Out #tensors:  1
Name: Gradients/Div; Type: Div; Out #tensors:  1
Name: Gradients/Reshape; Type: Reshape; Out #tensors:  1
Name: Gradients/Tile; Type: Tile; Out #tensors:  1
Name: Gradients/Shape_1; Type: Shape; Out #tensors:  1
Name: Gradients/Const_3; Type: Const; Out #tensors:  1
Name: Gradients/Const_4; Type: Const; Out #tensors:  1
Name: Gradients/Size_1; Type: Size; Out #tensors:  1
Name: Gradients/Add_1; Type: Add; Out #tensors:  1
Name: Gradients/Mod_1; Type: Mod; Out #tensors:  1
Name: Gradients/Range_1; Type: Range; Out #tensors:  1
Name: Gradients/OnesLike_2; Type: OnesLike; Out #tensors:  1
Name: Gradients/DynamicStitch_1; Type: DynamicStitch; Out #tensors:  1
Name: Gradients/Const_5; Type: Const; Out #tensors:  1
Name: Gradients/Maximum_1; Type: Maximum; Out #tensors:  1
Name: Gradients/Div_1; Type: Div; Out #tensors:  1
Name: Gradients/Reshape_1; Type: Reshape; Out #tensors:  1
Name: Gradients/Tile_1; Type: Tile; Out #tensors:  1
Name: Gradients/Shape_2; Type: Shape; Out #tensors:  1
Name: Gradients/Shape_3; Type: Shape; Out #tensors:  1
Name: Gradients/Const_6; Type: Const; Out #tensors:  1
Name: Gradients/Prod; Type: Prod; Out #tensors:  1
Name: Gradients/Prod_1; Type: Prod; Out #tensors:  1
Name: Gradients/Const_7; Type: Const; Out #tensors:  1
Name: Gradients/Maximum_2; Type: Maximum; Out #tensors:  1
Name: Gradients/Div_2; Type: Div; Out #tensors:  1
Name: Gradients/Cast; Type: Cast; Out #tensors:  1
Name: Gradients/Div_3; Type: Div; Out #tensors:  1
Name: Gradients/Const_8; Type: Const; Out #tensors:  1
Name: Gradients/Cast_1; Type: Cast; Out #tensors:  1
Name: Gradients/Subtract; Type: Sub; Out #tensors:  1
Name: Gradients/Multiply; Type: Mul; Out #tensors:  1
Name: Gradients/Multiply_1; Type: Mul; Out #tensors:  1
Name: Gradients/Negate; Type: Neg; Out #tensors:  1
Name: Gradients/Shape_4; Type: Shape; Out #tensors:  1
Name: Gradients/Shape_5; Type: Shape; Out #tensors:  1
Name: Gradients/BroadcastGradientArgs; Type: BroadcastGradientArgs; Out #tensors:  2
Name: Gradients/Sum; Type: Sum; Out #tensors:  1
Name: Gradients/Reshape_2; Type: Reshape; Out #tensors:  1
Name: Gradients/Sum_1; Type: Sum; Out #tensors:  1
Name: Gradients/Reshape_3; Type: Reshape; Out #tensors:  1
Name: Gradients/Identity; Type: Identity; Out #tensors:  1
Name: Gradients/Identity_1; Type: Identity; Out #tensors:  1
Name: Gradients/Shape_6; Type: Shape; Out #tensors:  1
Name: Gradients/Shape_7; Type: Shape; Out #tensors:  1
Name: Gradients/BroadcastGradientArgs_1; Type: BroadcastGradientArgs; Out #tensors:  2
Name: Gradients/Sum_2; Type: Sum; Out #tensors:  1
Name: Gradients/Reshape_4; Type: Reshape; Out #tensors:  1
Name: Gradients/Sum_3; Type: Sum; Out #tensors:  1
Name: Gradients/Reshape_5; Type: Reshape; Out #tensors:  1
Name: Gradients/MatMul; Type: MatMul; Out #tensors:  1
Name: Gradients/MatMul_1; Type: MatMul; Out #tensors:  1
Name: Gradients/ReluGrad; Type: ReluGrad; Out #tensors:  1
Name: Gradients/Identity_2; Type: Identity; Out #tensors:  1
Name: Gradients/Identity_3; Type: Identity; Out #tensors:  1
Name: Gradients/Shape_8; Type: Shape; Out #tensors:  1
Name: Gradients/Shape_9; Type: Shape; Out #tensors:  1
Name: Gradients/BroadcastGradientArgs_2; Type: BroadcastGradientArgs; Out #tensors:  2
Name: Gradients/Sum_4; Type: Sum; Out #tensors:  1
Name: Gradients/Reshape_6; Type: Reshape; Out #tensors:  1
Name: Gradients/Sum_5; Type: Sum; Out #tensors:  1
Name: Gradients/Reshape_7; Type: Reshape; Out #tensors:  1
Name: Gradients/MatMul_2; Type: MatMul; Out #tensors:  1
Name: Gradients/MatMul_3; Type: MatMul; Out #tensors:  1
Name: Gradients/Shape_10; Type: Shape; Out #tensors:  1
Name: Gradients/Reshape_8; Type: Reshape; Out #tensors:  1
Name: Gradients/MaxPoolGradV2; Type: MaxPoolGradV2; Out #tensors:  1
Name: Gradients/ReluGrad_1; Type: ReluGrad; Out #tensors:  1
Name: Gradients/BiasAddGrad; Type: BiasAddGrad; Out #tensors:  1
Name: Gradients/Identity_4; Type: Identity; Out #tensors:  1
Name: Gradients/Shape_11; Type: Shape; Out #tensors:  1
Name: Gradients/Conv2DBackpropInput; Type: Conv2DBackpropInput; Out #tensors:  1
Name: Gradients/Shape_12; Type: Shape; Out #tensors:  1
Name: Gradients/Conv2DBackpropFilter; Type: Conv2DBackpropFilter; Out #tensors:  1
Name: Gradients/MaxPoolGradV2_1; Type: MaxPoolGradV2; Out #tensors:  1
Name: Gradients/ReluGrad_2; Type: ReluGrad; Out #tensors:  1
Name: Gradients/BiasAddGrad_1; Type: BiasAddGrad; Out #tensors:  1
Name: Gradients/Identity_5; Type: Identity; Out #tensors:  1
Name: Gradients/Shape_13; Type: Shape; Out #tensors:  1
Name: Gradients/Conv2DBackpropInput_1; Type: Conv2DBackpropInput; Out #tensors:  1
Name: Gradients/Shape_14; Type: Shape; Out #tensors:  1
Name: Gradients/Conv2DBackpropFilter_1; Type: Conv2DBackpropFilter; Out #tensors:  1
Name: Const_25; Type: Const; Out #tensors:  1
Name: ApplyGradientDescent; Type: ApplyGradientDescent; Out #tensors:  1
Name: Const_26; Type: Const; Out #tensors:  1
Name: ApplyGradientDescent_1; Type: ApplyGradientDescent; Out #tensors:  1
Name: Const_27; Type: Const; Out #tensors:  1
Name: ApplyGradientDescent_2; Type: ApplyGradientDescent; Out #tensors:  1
Name: Const_28; Type: Const; Out #tensors:  1
Name: ApplyGradientDescent_3; Type: ApplyGradientDescent; Out #tensors:  1
Name: Const_29; Type: Const; Out #tensors:  1
Name: ApplyGradientDescent_4; Type: ApplyGradientDescent; Out #tensors:  1
Name: Const_30; Type: Const; Out #tensors:  1
Name: ApplyGradientDescent_5; Type: ApplyGradientDescent; Out #tensors:  1
Name: Const_31; Type: Const; Out #tensors:  1
Name: ApplyGradientDescent_6; Type: ApplyGradientDescent; Out #tensors:  1
Name: Const_32; Type: Const; Out #tensors:  1
Name: ApplyGradientDescent_7; Type: ApplyGradientDescent; Out #tensors:  1

21:49:39.814 [main] DEBUG api.keras.Sequential - Initialization of TensorFlow Graph variables
Train begins
Epoch 1 begins.
Training batch 0 begins.
21:49:40.380 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: 682.76715 metricValue: 0.102 }
Training batch 0 ends with loss 682.7671508789062.
Training batch 1 begins.
21:49:40.644 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: 463139.56 metricValue: 0.13 }
Training batch 1 ends with loss 463139.5625.
Training batch 2 begins.
21:49:40.909 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: 2.80750587E14 metricValue: 0.09 }
Training batch 2 ends with loss 2.80750586855424E14.
Training batch 3 begins.
21:49:41.154 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: Infinity metricValue: 0.094 }
Training batch 3 ends with loss Infinity.
Training batch 4 begins.
21:49:41.414 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.086 }
Training batch 4 ends with loss NaN.
Training batch 5 begins.
21:49:41.664 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.102 }
Training batch 5 ends with loss NaN.
Training batch 6 begins.
21:49:41.904 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.094 }
Training batch 6 ends with loss NaN.
Training batch 7 begins.
21:49:42.167 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.094 }
Training batch 7 ends with loss NaN.
Training batch 8 begins.
21:49:42.416 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.098 }
Training batch 8 ends with loss NaN.
Training batch 9 begins.
21:49:42.687 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.102 }
Training batch 9 ends with loss NaN.
Training batch 10 begins.
21:49:42.954 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.124 }
Training batch 10 ends with loss NaN.
Training batch 11 begins.
21:49:43.194 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.102 }
Training batch 11 ends with loss NaN.
Training batch 12 begins.
21:49:43.424 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.094 }
Training batch 12 ends with loss NaN.
Training batch 13 begins.
21:49:43.666 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.104 }
Training batch 13 ends with loss NaN.
Training batch 14 begins.
21:49:43.914 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.108 }
Training batch 14 ends with loss NaN.
Training batch 15 begins.
21:49:44.221 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.1 }
Training batch 15 ends with loss NaN.
Training batch 16 begins.
21:49:44.465 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.112 }
Training batch 16 ends with loss NaN.
Training batch 17 begins.
21:49:44.717 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.08 }
Training batch 17 ends with loss NaN.
Training batch 18 begins.
21:49:44.977 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.112 }
Training batch 18 ends with loss NaN.
Training batch 19 begins.
21:49:45.235 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.108 }
Training batch 19 ends with loss NaN.
Training batch 20 begins.
21:49:45.488 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.112 }
Training batch 20 ends with loss NaN.
Training batch 21 begins.
21:49:45.736 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.082 }
Training batch 21 ends with loss NaN.
Training batch 22 begins.
21:49:45.986 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.1 }
Training batch 22 ends with loss NaN.
Training batch 23 begins.
21:49:46.227 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.116 }
Training batch 23 ends with loss NaN.
Training batch 24 begins.
21:49:46.467 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.09 }
Training batch 24 ends with loss NaN.
Training batch 25 begins.
21:49:46.706 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.096 }
Training batch 25 ends with loss NaN.
Training batch 26 begins.
21:49:46.943 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.114 }
Training batch 26 ends with loss NaN.
Training batch 27 begins.
21:49:47.189 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.098 }
Training batch 27 ends with loss NaN.
Training batch 28 begins.
21:49:47.436 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.096 }
Training batch 28 ends with loss NaN.
Training batch 29 begins.
21:49:47.687 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.086 }
Training batch 29 ends with loss NaN.
Training batch 30 begins.
21:49:47.927 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.092 }
Training batch 30 ends with loss NaN.
Training batch 31 begins.
21:49:48.167 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.128 }
Training batch 31 ends with loss NaN.
Training batch 32 begins.
21:49:48.402 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.092 }
Training batch 32 ends with loss NaN.
Training batch 33 begins.
21:49:48.642 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.08 }
Training batch 33 ends with loss NaN.
Training batch 34 begins.
21:49:48.882 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.098 }
Training batch 34 ends with loss NaN.
Training batch 35 begins.
21:49:49.123 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.084 }
Training batch 35 ends with loss NaN.
Training batch 36 begins.
21:49:49.358 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.088 }
Training batch 36 ends with loss NaN.
Training batch 37 begins.
21:49:49.595 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.104 }
Training batch 37 ends with loss NaN.
Training batch 38 begins.
21:49:49.838 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.104 }
Training batch 38 ends with loss NaN.
Training batch 39 begins.
21:49:50.075 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.126 }
Training batch 39 ends with loss NaN.
Training batch 40 begins.
21:49:50.314 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.088 }
Training batch 40 ends with loss NaN.
Training batch 41 begins.
21:49:50.552 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.104 }
Training batch 41 ends with loss NaN.
Training batch 42 begins.
21:49:50.798 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.084 }
Training batch 42 ends with loss NaN.
Training batch 43 begins.
21:49:51.032 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.116 }
Training batch 43 ends with loss NaN.
Training batch 44 begins.
21:49:51.269 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.098 }
Training batch 44 ends with loss NaN.
Training batch 45 begins.
21:49:51.502 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.102 }
Training batch 45 ends with loss NaN.
Training batch 46 begins.
21:49:51.739 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.088 }
Training batch 46 ends with loss NaN.
Training batch 47 begins.
21:49:51.977 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.094 }
Training batch 47 ends with loss NaN.
Training batch 48 begins.
21:49:52.221 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.086 }
Training batch 48 ends with loss NaN.
Training batch 49 begins.
21:49:52.462 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.096 }
Training batch 49 ends with loss NaN.
Training batch 50 begins.
21:49:52.702 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.088 }
Training batch 50 ends with loss NaN.
Training batch 51 begins.
21:49:52.948 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.09 }
Training batch 51 ends with loss NaN.
Training batch 52 begins.
21:49:53.189 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.098 }
Training batch 52 ends with loss NaN.
Training batch 53 begins.
21:49:53.429 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.086 }
Training batch 53 ends with loss NaN.
Training batch 54 begins.
21:49:53.668 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.106 }
Training batch 54 ends with loss NaN.
Training batch 55 begins.
21:49:53.905 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.088 }
Training batch 55 ends with loss NaN.
Training batch 56 begins.
21:49:54.144 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.11 }
Training batch 56 ends with loss NaN.
Training batch 57 begins.
21:49:54.379 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.1 }
Training batch 57 ends with loss NaN.
Training batch 58 begins.
21:49:54.618 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.116 }
Training batch 58 ends with loss NaN.
Training batch 59 begins.
21:49:54.857 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.096 }
Training batch 59 ends with loss NaN.
Training batch 60 begins.
21:49:55.097 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.114 }
Training batch 60 ends with loss NaN.
Training batch 61 begins.
21:49:55.351 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.086 }
Training batch 61 ends with loss NaN.
Training batch 62 begins.
21:49:55.610 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.106 }
Training batch 62 ends with loss NaN.
Training batch 63 begins.
21:49:55.855 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.07 }
Training batch 63 ends with loss NaN.
Training batch 64 begins.
21:49:56.101 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.082 }
Training batch 64 ends with loss NaN.
Training batch 65 begins.
21:49:56.346 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.092 }
Training batch 65 ends with loss NaN.
Training batch 66 begins.
21:49:56.587 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.11 }
Training batch 66 ends with loss NaN.
Training batch 67 begins.
21:49:56.826 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.092 }
Training batch 67 ends with loss NaN.
Training batch 68 begins.
21:49:57.062 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.102 }
Training batch 68 ends with loss NaN.
Training batch 69 begins.
21:49:57.306 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.092 }
Training batch 69 ends with loss NaN.
Training batch 70 begins.
21:49:57.546 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.088 }
Training batch 70 ends with loss NaN.
Training batch 71 begins.
21:49:57.790 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.112 }
Training batch 71 ends with loss NaN.
Training batch 72 begins.
21:49:58.037 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.092 }
Training batch 72 ends with loss NaN.
Training batch 73 begins.
21:49:58.285 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.098 }
Training batch 73 ends with loss NaN.
Training batch 74 begins.
21:49:58.535 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.07 }
Training batch 74 ends with loss NaN.
Training batch 75 begins.
21:49:58.779 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.108 }
Training batch 75 ends with loss NaN.
Training batch 76 begins.
21:49:59.022 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.11 }
Training batch 76 ends with loss NaN.
Training batch 77 begins.
21:49:59.278 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.092 }
Training batch 77 ends with loss NaN.
Training batch 78 begins.
21:49:59.518 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.094 }
Training batch 78 ends with loss NaN.
Training batch 79 begins.
21:49:59.755 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.116 }
Training batch 79 ends with loss NaN.
Training batch 80 begins.
21:50:00.005 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.136 }
Training batch 80 ends with loss NaN.
Training batch 81 begins.
21:50:00.248 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.084 }
Training batch 81 ends with loss NaN.
Training batch 82 begins.
21:50:00.493 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.124 }
Training batch 82 ends with loss NaN.
Training batch 83 begins.
21:50:00.734 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.094 }
Training batch 83 ends with loss NaN.
Training batch 84 begins.
21:50:00.982 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.1 }
Training batch 84 ends with loss NaN.
Training batch 85 begins.
21:50:01.222 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.096 }
Training batch 85 ends with loss NaN.
Training batch 86 begins.
21:50:01.459 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.096 }
Training batch 86 ends with loss NaN.
Training batch 87 begins.
21:50:01.700 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.074 }
Training batch 87 ends with loss NaN.
Training batch 88 begins.
21:50:01.955 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.092 }
Training batch 88 ends with loss NaN.
Training batch 89 begins.
21:50:02.200 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.112 }
Training batch 89 ends with loss NaN.
Training batch 90 begins.
21:50:02.441 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.098 }
Training batch 90 ends with loss NaN.
Training batch 91 begins.
21:50:02.676 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.094 }
Training batch 91 ends with loss NaN.
Training batch 92 begins.
21:50:02.922 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.09 }
Training batch 92 ends with loss NaN.
Training batch 93 begins.
21:50:03.177 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.112 }
Training batch 93 ends with loss NaN.
Training batch 94 begins.
21:50:03.429 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.118 }
Training batch 94 ends with loss NaN.
Training batch 95 begins.
21:50:03.673 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.09 }
Training batch 95 ends with loss NaN.
Training batch 96 begins.
21:50:03.920 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.096 }
Training batch 96 ends with loss NaN.
Training batch 97 begins.
21:50:04.165 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.096 }
Training batch 97 ends with loss NaN.
Training batch 98 begins.
21:50:04.408 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.112 }
Training batch 98 ends with loss NaN.
Training batch 99 begins.
21:50:04.646 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.102 }
Training batch 99 ends with loss NaN.
Training batch 100 begins.
21:50:04.886 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.106 }
Training batch 100 ends with loss NaN.
Training batch 101 begins.
21:50:05.128 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.094 }
Training batch 101 ends with loss NaN.
Training batch 102 begins.
21:50:05.376 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.11 }
Training batch 102 ends with loss NaN.
Training batch 103 begins.
21:50:05.625 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.094 }
Training batch 103 ends with loss NaN.
Training batch 104 begins.
21:50:05.898 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.1 }
Training batch 104 ends with loss NaN.
Training batch 105 begins.
21:50:06.145 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.098 }
Training batch 105 ends with loss NaN.
Training batch 106 begins.
21:50:06.384 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.11 }
Training batch 106 ends with loss NaN.
Training batch 107 begins.
21:50:06.646 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.096 }
Training batch 107 ends with loss NaN.
Training batch 108 begins.
21:50:06.892 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.092 }
Training batch 108 ends with loss NaN.
Training batch 109 begins.
21:50:07.145 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.104 }
Training batch 109 ends with loss NaN.
Training batch 110 begins.
21:50:07.395 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.096 }
Training batch 110 ends with loss NaN.
Training batch 111 begins.
21:50:07.639 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.096 }
Training batch 111 ends with loss NaN.
Training batch 112 begins.
21:50:07.877 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.094 }
Training batch 112 ends with loss NaN.
Training batch 113 begins.
21:50:08.121 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.098 }
Training batch 113 ends with loss NaN.
Training batch 114 begins.
21:50:08.363 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.104 }
Training batch 114 ends with loss NaN.
Training batch 115 begins.
21:50:08.605 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.102 }
Training batch 115 ends with loss NaN.
Training batch 116 begins.
21:50:08.845 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.096 }
Training batch 116 ends with loss NaN.
Training batch 117 begins.
21:50:09.090 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.096 }
Training batch 117 ends with loss NaN.
Training batch 118 begins.
21:50:09.335 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.104 }
Training batch 118 ends with loss NaN.
Training batch 119 begins.
21:50:09.573 [main] DEBUG api.keras.Sequential - Batch stat: { lossValue: NaN metricValue: 0.092 }
Training batch 119 ends with loss NaN.
21:50:09.574 [main] INFO  api.keras.Sequential - epochs: 1 loss: NaN metric: 0.098999985
Epoch 1 ends.
Train ends with last loss NaN
Test begins
Test batch 0 begins.
Test batch 0 ends with loss NaN..
Test batch 1 begins.
Test batch 1 ends with loss NaN..
Test batch 2 begins.
Test batch 2 ends with loss NaN..
Test batch 3 begins.
Test batch 3 ends with loss NaN..
Test batch 4 begins.
Test batch 4 ends with loss NaN..
Test batch 5 begins.
Test batch 5 ends with loss NaN..
Test batch 6 begins.
Test batch 6 ends with loss NaN..
Test batch 7 begins.
Test batch 7 ends with loss NaN..
Test batch 8 begins.
Test batch 8 ends with loss NaN..
Test batch 9 begins.
Test batch 9 ends with loss NaN..
Train ends with last loss NaN
Accuracy: 0.09800000488758087
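A side note on the numbers above: once the loss goes NaN, the weight updates become NaN too, so the reported accuracy of ~0.098 is simply chance level for the 10 MNIST classes (1/10). Frameworks commonly guard against this with a callback such as Keras's `TerminateOnNaN`. Below is a minimal, library-independent sketch of such a guard; the `NanGuard` class and `onBatchEnd` method are hypothetical names, not part of this project's API:

```java
/** Hypothetical NaN guard, analogous in spirit to Keras's TerminateOnNaN callback. */
public class NanGuard {
    private boolean stopped = false;

    /** Call after each training batch; returns true once training should stop. */
    public boolean onBatchEnd(int batch, double lossValue) {
        if (Double.isNaN(lossValue) || Double.isInfinite(lossValue)) {
            System.out.println("Batch " + batch + " produced invalid loss " + lossValue
                    + "; stopping training early.");
            stopped = true;
        }
        return stopped;
    }

    public boolean isStopped() {
        return stopped;
    }
}
```

Stopping at the first NaN batch makes the failure visible immediately instead of burning a full epoch on useless updates, as happens in the log above.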

@zaleslaw
Collaborator

@devcrocod Please load and try the new version 0.0.9. This should be fixed.

@devcrocod
Contributor Author

I get a new exception:

Exception in thread "main" java.lang.IllegalArgumentException: Negative dimension size caused by subtracting 2 from 1 for 'MaxPool_4' (op: 'MaxPoolV2') with input shapes: [?,1,1,128], [4], [4] and with computed input tensors: input[1] = <1 2 2 1>, input[2] = <1 2 2 1>.
	at org.tensorflow.GraphOperationBuilder.finish(Native Method)
	at org.tensorflow.GraphOperationBuilder.build(GraphOperationBuilder.java:42)
	at org.tensorflow.GraphOperationBuilder.build(GraphOperationBuilder.java:21)
	at org.tensorflow.op.nn.MaxPool.create(MaxPool.java:85)
	at org.tensorflow.op.NnOps.maxPool(NnOps.java:621)
	at api.core.layer.twodim.MaxPool2D.transformInput(MaxPool2D.kt:57)
	at api.core.Sequential.transformInputWithNNModel(Sequential.kt:714)
	at api.core.Sequential.compile(Sequential.kt:246)
	at api.core.Sequential.compile(Sequential.kt:214)
	at api.core.TrainableModel.compile$default(TrainableModel.kt:89)
	at examples.keras.mnist.VGGMnistKt.main(VGGMnist.kt:175)
	at examples.keras.mnist.VGGMnistKt.main(VGGMnist.kt)
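For context on this exception: TensorFlow computes each pooled dimension as floor((in − pool) / stride) + 1 under VALID padding and ceil(in / stride) under SAME padding. Repeated stride-2 pools shrink a 28×28 MNIST input 28 → 14 → 7 → 4 → 2 → 1, and applying one more 2×2 pool to a 1×1 feature map makes the VALID formula go negative (1 − 2 = −1), which matches the "Negative dimension size caused by subtracting 2 from 1" message above — a VGG-depth pooling stack is too deep for 28×28 inputs. A hypothetical sketch of the shape arithmetic (not the library's actual code):

```java
/** Hypothetical helper reproducing TensorFlow's documented pooling shape rules. */
public class PoolDims {
    /** VALID padding: floor((in - pool) / stride) + 1; fails when in < pool. */
    public static int validOut(int in, int pool, int stride) {
        if (in < pool) {
            throw new IllegalArgumentException(
                    "Negative dimension size caused by subtracting " + pool + " from " + in);
        }
        return (in - pool) / stride + 1;
    }

    /** SAME padding: ceil(in / stride), independent of the pool size. */
    public static int sameOut(int in, int stride) {
        return (in + stride - 1) / stride;
    }
}
```

For example, `sameOut(28, 2)` gives 14 and `sameOut(2, 2)` gives 1, while `validOut(1, 2, 2)` fails with the same kind of negative-dimension error as the stack trace.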

@zaleslaw
Collaborator

zaleslaw commented Oct 22, 2020 via email

@zaleslaw
Collaborator

Hmm, it looks like you've got a new exception in a different example, VGGMnist.kt (at examples.keras.mnist.VGGMnistKt.main(VGGMnist.kt:175)), not in LeNetMnistWithCustomCallbacks.

Please check that LeNetMnistWithCustomCallbacks runs without problems.

@zaleslaw
Collaborator

zaleslaw commented Nov 3, 2020

@devcrocod Could you please update your project to version 0.0.10 and check on your machine that this issue is fixed?

@devcrocod
Contributor Author

LeNetMnistWithCustomCallbacks and VGGMnist run without exceptions. The loss function in LeNetMnistWithCustomCallbacks returns correct values. 👍

@zaleslaw zaleslaw closed this as completed Nov 3, 2020
@zaleslaw
Collaborator

zaleslaw commented Nov 3, 2020 via email
