Segmentation fault (core dumped) during training #1214

sumeetssaurav · 2019-12-04T21:27:22Z

python keras_retinanet/bin/train.py --freeze-backbone --epochs 100 --batch-size 1 --steps 989 --snapshot-path ./CARPK/snapshots --weights ./CARPK/snapshots/resnet50_coco_best_v2.1.0.h5 --random-transform --config ./CARPK/config_carpk.ini --tensorboard-dir ./CARPK/log csv ./CARPK/annotations_CARPK_train.csv ./CARPK/carpk_classes.csv --val-annotations ./CARPK/annotations_CARPK_test.csv

Using TensorFlow backend.
/home/vision/.virtualenvs/dl4cv/lib/python3.5/site-packages/tensorflow/python/framework/dtypes.py:516: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint8 = np.dtype([("qint8", np.int8, 1)])
/home/vision/.virtualenvs/dl4cv/lib/python3.5/site-packages/tensorflow/python/framework/dtypes.py:517: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/home/vision/.virtualenvs/dl4cv/lib/python3.5/site-packages/tensorflow/python/framework/dtypes.py:518: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint16 = np.dtype([("qint16", np.int16, 1)])
/home/vision/.virtualenvs/dl4cv/lib/python3.5/site-packages/tensorflow/python/framework/dtypes.py:519: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/home/vision/.virtualenvs/dl4cv/lib/python3.5/site-packages/tensorflow/python/framework/dtypes.py:520: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint32 = np.dtype([("qint32", np.int32, 1)])
/home/vision/.virtualenvs/dl4cv/lib/python3.5/site-packages/tensorflow/python/framework/dtypes.py:525: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
np_resource = np.dtype([("resource", np.ubyte, 1)])
/home/vision/.virtualenvs/dl4cv/lib/python3.5/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:541: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint8 = np.dtype([("qint8", np.int8, 1)])
/home/vision/.virtualenvs/dl4cv/lib/python3.5/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:542: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/home/vision/.virtualenvs/dl4cv/lib/python3.5/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:543: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint16 = np.dtype([("qint16", np.int16, 1)])
/home/vision/.virtualenvs/dl4cv/lib/python3.5/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:544: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/home/vision/.virtualenvs/dl4cv/lib/python3.5/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:545: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint32 = np.dtype([("qint32", np.int32, 1)])
/home/vision/.virtualenvs/dl4cv/lib/python3.5/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:550: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
np_resource = np.dtype([("resource", np.ubyte, 1)])
Creating model, this may take a second...
WARNING:tensorflow:From /home/vision/.virtualenvs/dl4cv/lib/python3.5/site-packages/keras/backend/tensorflow_backend.py:4070: The name tf.nn.max_pool is deprecated. Please use tf.nn.max_pool2d instead.

/home/vision/.virtualenvs/dl4cv/lib/python3.5/site-packages/keras/engine/saving.py:1316: UserWarning: Skipping loading of weights for layer regression_submodel due to mismatch in shape ((3, 3, 256, 120) vs (36, 256, 3, 3)).
weight_values[i].shape))
/home/vision/.virtualenvs/dl4cv/lib/python3.5/site-packages/keras/engine/saving.py:1316: UserWarning: Skipping loading of weights for layer regression_submodel due to mismatch in shape ((120,) vs (36,)).
weight_values[i].shape))
/home/vision/.virtualenvs/dl4cv/lib/python3.5/site-packages/keras/engine/saving.py:1316: UserWarning: Skipping loading of weights for layer classification_submodel due to mismatch in shape ((3, 3, 256, 30) vs (720, 256, 3, 3)).
weight_values[i].shape))
/home/vision/.virtualenvs/dl4cv/lib/python3.5/site-packages/keras/engine/saving.py:1316: UserWarning: Skipping loading of weights for layer classification_submodel due to mismatch in shape ((30,) vs (720,)).
weight_values[i].shape))
2019-12-05 02:52:59.216689: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-12-05 02:52:59.221714: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcuda.so.1
2019-12-05 02:52:59.285383: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-12-05 02:52:59.285852: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x9804760 executing computations on platform CUDA. Devices:
2019-12-05 02:52:59.285867: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): GeForce GTX 1080, Compute Capability 6.1
2019-12-05 02:52:59.306494: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3696000000 Hz
2019-12-05 02:52:59.307797: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x99312d0 executing computations on platform Host. Devices:
2019-12-05 02:52:59.307843: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): ,
2019-12-05 02:52:59.308035: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-12-05 02:52:59.308793: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties:
name: GeForce GTX 1080 major: 6 minor: 1 memoryClockRate(GHz): 1.7335
pciBusID: 0000:01:00.0
2019-12-05 02:52:59.308920: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcudart.so.10.0'; dlerror: libcudart.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-10.2/lib64
2019-12-05 02:52:59.308996: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcublas.so.10.0'; dlerror: libcublas.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-10.2/lib64
2019-12-05 02:52:59.309068: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcufft.so.10.0'; dlerror: libcufft.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-10.2/lib64
2019-12-05 02:52:59.309126: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcurand.so.10.0'; dlerror: libcurand.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-10.2/lib64
2019-12-05 02:52:59.309168: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcusolver.so.10.0'; dlerror: libcusolver.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-10.2/lib64
2019-12-05 02:52:59.309210: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcusparse.so.10.0'; dlerror: libcusparse.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-10.2/lib64
2019-12-05 02:52:59.311758: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudnn.so.7
2019-12-05 02:52:59.311772: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1663] Cannot dlopen some GPU libraries. Skipping registering GPU devices...
2019-12-05 02:52:59.311789: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1181] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-12-05 02:52:59.311796: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1187] 0
2019-12-05 02:52:59.311802: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1200] 0: N
2019-12-05 02:52:59.602293: W tensorflow/compiler/jit/mark_for_compilation_pass.cc:1412] (One-time warning): Not using XLA:CPU for cluster because envvar TF_XLA_FLAGS=--tf_xla_cpu_global_jit was not set. If you want XLA:CPU, either set that envvar, or use experimental_jit_scope to enable XLA:CPU. To confirm that XLA is active, pass --vmodule=xla_compilation_cache=1 (as a proper command-line flag, not via TF_XLA_FLAGS) or set the envvar XLA_FLAGS=--xla_hlo_profile.
tracking <tf.Variable 'Variable:0' shape=(30, 4) dtype=float32> anchors
tracking <tf.Variable 'Variable_1:0' shape=(30, 4) dtype=float32> anchors
tracking <tf.Variable 'Variable_2:0' shape=(30, 4) dtype=float32> anchors
tracking <tf.Variable 'Variable_3:0' shape=(30, 4) dtype=float32> anchors
tracking <tf.Variable 'Variable_4:0' shape=(30, 4) dtype=float32> anchors
WARNING:tensorflow:From keras_retinanet/bin/../../keras_retinanet/backend/tensorflow_backend.py:104: add_dispatch_support..wrapper (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
Model: "retinanet"

Layer (type) Output Shape Param # Connected to

input_1 (InputLayer) (None, None, None, 3 0

padding_conv1 (ZeroPadding2D) (None, None, None, 3 0 input_1[0][0]

conv1 (Conv2D) (None, None, None, 6 9408 padding_conv1[0][0]

bn_conv1 (BatchNormalization) (None, None, None, 6 256 conv1[0][0]

conv1_relu (Activation) (None, None, None, 6 0 bn_conv1[0][0]

pool1 (MaxPooling2D) (None, None, None, 6 0 conv1_relu[0][0]

res2a_branch2a (Conv2D) (None, None, None, 6 4096 pool1[0][0]

bn2a_branch2a (BatchNormalizati (None, None, None, 6 256 res2a_branch2a[0][0]

res2a_branch2a_relu (Activation (None, None, None, 6 0 bn2a_branch2a[0][0]

padding2a_branch2b (ZeroPadding (None, None, None, 6 0 res2a_branch2a_relu[0][0]

res2a_branch2b (Conv2D) (None, None, None, 6 36864 padding2a_branch2b[0][0]

bn2a_branch2b (BatchNormalizati (None, None, None, 6 256 res2a_branch2b[0][0]

res2a_branch2b_relu (Activation (None, None, None, 6 0 bn2a_branch2b[0][0]

res2a_branch2c (Conv2D) (None, None, None, 2 16384 res2a_branch2b_relu[0][0]

res2a_branch1 (Conv2D) (None, None, None, 2 16384 pool1[0][0]

bn2a_branch2c (BatchNormalizati (None, None, None, 2 1024 res2a_branch2c[0][0]

bn2a_branch1 (BatchNormalizatio (None, None, None, 2 1024 res2a_branch1[0][0]

res2a (Add) (None, None, None, 2 0 bn2a_branch2c[0][0]
bn2a_branch1[0][0]

res2a_relu (Activation) (None, None, None, 2 0 res2a[0][0]

res2b_branch2a (Conv2D) (None, None, None, 6 16384 res2a_relu[0][0]

bn2b_branch2a (BatchNormalizati (None, None, None, 6 256 res2b_branch2a[0][0]

res2b_branch2a_relu (Activation (None, None, None, 6 0 bn2b_branch2a[0][0]

padding2b_branch2b (ZeroPadding (None, None, None, 6 0 res2b_branch2a_relu[0][0]

res2b_branch2b (Conv2D) (None, None, None, 6 36864 padding2b_branch2b[0][0]

bn2b_branch2b (BatchNormalizati (None, None, None, 6 256 res2b_branch2b[0][0]

res2b_branch2b_relu (Activation (None, None, None, 6 0 bn2b_branch2b[0][0]

res2b_branch2c (Conv2D) (None, None, None, 2 16384 res2b_branch2b_relu[0][0]

bn2b_branch2c (BatchNormalizati (None, None, None, 2 1024 res2b_branch2c[0][0]

res2b (Add) (None, None, None, 2 0 bn2b_branch2c[0][0]
res2a_relu[0][0]

res2b_relu (Activation) (None, None, None, 2 0 res2b[0][0]

res2c_branch2a (Conv2D) (None, None, None, 6 16384 res2b_relu[0][0]

bn2c_branch2a (BatchNormalizati (None, None, None, 6 256 res2c_branch2a[0][0]

res2c_branch2a_relu (Activation (None, None, None, 6 0 bn2c_branch2a[0][0]

padding2c_branch2b (ZeroPadding (None, None, None, 6 0 res2c_branch2a_relu[0][0]

res2c_branch2b (Conv2D) (None, None, None, 6 36864 padding2c_branch2b[0][0]

bn2c_branch2b (BatchNormalizati (None, None, None, 6 256 res2c_branch2b[0][0]

res2c_branch2b_relu (Activation (None, None, None, 6 0 bn2c_branch2b[0][0]

res2c_branch2c (Conv2D) (None, None, None, 2 16384 res2c_branch2b_relu[0][0]

bn2c_branch2c (BatchNormalizati (None, None, None, 2 1024 res2c_branch2c[0][0]

res2c (Add) (None, None, None, 2 0 bn2c_branch2c[0][0]
res2b_relu[0][0]

res2c_relu (Activation) (None, None, None, 2 0 res2c[0][0]

res3a_branch2a (Conv2D) (None, None, None, 1 32768 res2c_relu[0][0]

bn3a_branch2a (BatchNormalizati (None, None, None, 1 512 res3a_branch2a[0][0]

res3a_branch2a_relu (Activation (None, None, None, 1 0 bn3a_branch2a[0][0]

padding3a_branch2b (ZeroPadding (None, None, None, 1 0 res3a_branch2a_relu[0][0]

res3a_branch2b (Conv2D) (None, None, None, 1 147456 padding3a_branch2b[0][0]

bn3a_branch2b (BatchNormalizati (None, None, None, 1 512 res3a_branch2b[0][0]

res3a_branch2b_relu (Activation (None, None, None, 1 0 bn3a_branch2b[0][0]

res3a_branch2c (Conv2D) (None, None, None, 5 65536 res3a_branch2b_relu[0][0]

res3a_branch1 (Conv2D) (None, None, None, 5 131072 res2c_relu[0][0]

bn3a_branch2c (BatchNormalizati (None, None, None, 5 2048 res3a_branch2c[0][0]

bn3a_branch1 (BatchNormalizatio (None, None, None, 5 2048 res3a_branch1[0][0]

res3a (Add) (None, None, None, 5 0 bn3a_branch2c[0][0]
bn3a_branch1[0][0]

res3a_relu (Activation) (None, None, None, 5 0 res3a[0][0]

res3b_branch2a (Conv2D) (None, None, None, 1 65536 res3a_relu[0][0]

bn3b_branch2a (BatchNormalizati (None, None, None, 1 512 res3b_branch2a[0][0]

res3b_branch2a_relu (Activation (None, None, None, 1 0 bn3b_branch2a[0][0]

padding3b_branch2b (ZeroPadding (None, None, None, 1 0 res3b_branch2a_relu[0][0]

res3b_branch2b (Conv2D) (None, None, None, 1 147456 padding3b_branch2b[0][0]

bn3b_branch2b (BatchNormalizati (None, None, None, 1 512 res3b_branch2b[0][0]

res3b_branch2b_relu (Activation (None, None, None, 1 0 bn3b_branch2b[0][0]

res3b_branch2c (Conv2D) (None, None, None, 5 65536 res3b_branch2b_relu[0][0]

bn3b_branch2c (BatchNormalizati (None, None, None, 5 2048 res3b_branch2c[0][0]

res3b (Add) (None, None, None, 5 0 bn3b_branch2c[0][0]
res3a_relu[0][0]

res3b_relu (Activation) (None, None, None, 5 0 res3b[0][0]

res3c_branch2a (Conv2D) (None, None, None, 1 65536 res3b_relu[0][0]

bn3c_branch2a (BatchNormalizati (None, None, None, 1 512 res3c_branch2a[0][0]

res3c_branch2a_relu (Activation (None, None, None, 1 0 bn3c_branch2a[0][0]

padding3c_branch2b (ZeroPadding (None, None, None, 1 0 res3c_branch2a_relu[0][0]

res3c_branch2b (Conv2D) (None, None, None, 1 147456 padding3c_branch2b[0][0]

bn3c_branch2b (BatchNormalizati (None, None, None, 1 512 res3c_branch2b[0][0]

res3c_branch2b_relu (Activation (None, None, None, 1 0 bn3c_branch2b[0][0]

res3c_branch2c (Conv2D) (None, None, None, 5 65536 res3c_branch2b_relu[0][0]

bn3c_branch2c (BatchNormalizati (None, None, None, 5 2048 res3c_branch2c[0][0]

res3c (Add) (None, None, None, 5 0 bn3c_branch2c[0][0]
res3b_relu[0][0]

res3c_relu (Activation) (None, None, None, 5 0 res3c[0][0]

res3d_branch2a (Conv2D) (None, None, None, 1 65536 res3c_relu[0][0]

bn3d_branch2a (BatchNormalizati (None, None, None, 1 512 res3d_branch2a[0][0]

res3d_branch2a_relu (Activation (None, None, None, 1 0 bn3d_branch2a[0][0]

padding3d_branch2b (ZeroPadding (None, None, None, 1 0 res3d_branch2a_relu[0][0]

res3d_branch2b (Conv2D) (None, None, None, 1 147456 padding3d_branch2b[0][0]

bn3d_branch2b (BatchNormalizati (None, None, None, 1 512 res3d_branch2b[0][0]

res3d_branch2b_relu (Activation (None, None, None, 1 0 bn3d_branch2b[0][0]

res3d_branch2c (Conv2D) (None, None, None, 5 65536 res3d_branch2b_relu[0][0]

bn3d_branch2c (BatchNormalizati (None, None, None, 5 2048 res3d_branch2c[0][0]

res3d (Add) (None, None, None, 5 0 bn3d_branch2c[0][0]
res3c_relu[0][0]

res3d_relu (Activation) (None, None, None, 5 0 res3d[0][0]

res4a_branch2a (Conv2D) (None, None, None, 2 131072 res3d_relu[0][0]

bn4a_branch2a (BatchNormalizati (None, None, None, 2 1024 res4a_branch2a[0][0]

res4a_branch2a_relu (Activation (None, None, None, 2 0 bn4a_branch2a[0][0]

padding4a_branch2b (ZeroPadding (None, None, None, 2 0 res4a_branch2a_relu[0][0]

res4a_branch2b (Conv2D) (None, None, None, 2 589824 padding4a_branch2b[0][0]

bn4a_branch2b (BatchNormalizati (None, None, None, 2 1024 res4a_branch2b[0][0]

res4a_branch2b_relu (Activation (None, None, None, 2 0 bn4a_branch2b[0][0]

res4a_branch2c (Conv2D) (None, None, None, 1 262144 res4a_branch2b_relu[0][0]

res4a_branch1 (Conv2D) (None, None, None, 1 524288 res3d_relu[0][0]

bn4a_branch2c (BatchNormalizati (None, None, None, 1 4096 res4a_branch2c[0][0]

bn4a_branch1 (BatchNormalizatio (None, None, None, 1 4096 res4a_branch1[0][0]

res4a (Add) (None, None, None, 1 0 bn4a_branch2c[0][0]
bn4a_branch1[0][0]

res4a_relu (Activation) (None, None, None, 1 0 res4a[0][0]

res4b_branch2a (Conv2D) (None, None, None, 2 262144 res4a_relu[0][0]

bn4b_branch2a (BatchNormalizati (None, None, None, 2 1024 res4b_branch2a[0][0]

res4b_branch2a_relu (Activation (None, None, None, 2 0 bn4b_branch2a[0][0]

padding4b_branch2b (ZeroPadding (None, None, None, 2 0 res4b_branch2a_relu[0][0]

res4b_branch2b (Conv2D) (None, None, None, 2 589824 padding4b_branch2b[0][0]

bn4b_branch2b (BatchNormalizati (None, None, None, 2 1024 res4b_branch2b[0][0]

res4b_branch2b_relu (Activation (None, None, None, 2 0 bn4b_branch2b[0][0]

res4b_branch2c (Conv2D) (None, None, None, 1 262144 res4b_branch2b_relu[0][0]

bn4b_branch2c (BatchNormalizati (None, None, None, 1 4096 res4b_branch2c[0][0]

res4b (Add) (None, None, None, 1 0 bn4b_branch2c[0][0]
res4a_relu[0][0]

res4b_relu (Activation) (None, None, None, 1 0 res4b[0][0]

res4c_branch2a (Conv2D) (None, None, None, 2 262144 res4b_relu[0][0]

bn4c_branch2a (BatchNormalizati (None, None, None, 2 1024 res4c_branch2a[0][0]

res4c_branch2a_relu (Activation (None, None, None, 2 0 bn4c_branch2a[0][0]

padding4c_branch2b (ZeroPadding (None, None, None, 2 0 res4c_branch2a_relu[0][0]

res4c_branch2b (Conv2D) (None, None, None, 2 589824 padding4c_branch2b[0][0]

bn4c_branch2b (BatchNormalizati (None, None, None, 2 1024 res4c_branch2b[0][0]

res4c_branch2b_relu (Activation (None, None, None, 2 0 bn4c_branch2b[0][0]

res4c_branch2c (Conv2D) (None, None, None, 1 262144 res4c_branch2b_relu[0][0]

bn4c_branch2c (BatchNormalizati (None, None, None, 1 4096 res4c_branch2c[0][0]

res4c (Add) (None, None, None, 1 0 bn4c_branch2c[0][0]
res4b_relu[0][0]

res4c_relu (Activation) (None, None, None, 1 0 res4c[0][0]

res4d_branch2a (Conv2D) (None, None, None, 2 262144 res4c_relu[0][0]

bn4d_branch2a (BatchNormalizati (None, None, None, 2 1024 res4d_branch2a[0][0]

res4d_branch2a_relu (Activation (None, None, None, 2 0 bn4d_branch2a[0][0]

padding4d_branch2b (ZeroPadding (None, None, None, 2 0 res4d_branch2a_relu[0][0]

res4d_branch2b (Conv2D) (None, None, None, 2 589824 padding4d_branch2b[0][0]

bn4d_branch2b (BatchNormalizati (None, None, None, 2 1024 res4d_branch2b[0][0]

res4d_branch2b_relu (Activation (None, None, None, 2 0 bn4d_branch2b[0][0]

res4d_branch2c (Conv2D) (None, None, None, 1 262144 res4d_branch2b_relu[0][0]

bn4d_branch2c (BatchNormalizati (None, None, None, 1 4096 res4d_branch2c[0][0]

res4d (Add) (None, None, None, 1 0 bn4d_branch2c[0][0]
res4c_relu[0][0]

res4d_relu (Activation) (None, None, None, 1 0 res4d[0][0]

res4e_branch2a (Conv2D) (None, None, None, 2 262144 res4d_relu[0][0]

bn4e_branch2a (BatchNormalizati (None, None, None, 2 1024 res4e_branch2a[0][0]

res4e_branch2a_relu (Activation (None, None, None, 2 0 bn4e_branch2a[0][0]

padding4e_branch2b (ZeroPadding (None, None, None, 2 0 res4e_branch2a_relu[0][0]

res4e_branch2b (Conv2D) (None, None, None, 2 589824 padding4e_branch2b[0][0]

bn4e_branch2b (BatchNormalizati (None, None, None, 2 1024 res4e_branch2b[0][0]

res4e_branch2b_relu (Activation (None, None, None, 2 0 bn4e_branch2b[0][0]

res4e_branch2c (Conv2D) (None, None, None, 1 262144 res4e_branch2b_relu[0][0]

bn4e_branch2c (BatchNormalizati (None, None, None, 1 4096 res4e_branch2c[0][0]

res4e (Add) (None, None, None, 1 0 bn4e_branch2c[0][0]
res4d_relu[0][0]

res4e_relu (Activation) (None, None, None, 1 0 res4e[0][0]

res4f_branch2a (Conv2D) (None, None, None, 2 262144 res4e_relu[0][0]

bn4f_branch2a (BatchNormalizati (None, None, None, 2 1024 res4f_branch2a[0][0]

res4f_branch2a_relu (Activation (None, None, None, 2 0 bn4f_branch2a[0][0]

padding4f_branch2b (ZeroPadding (None, None, None, 2 0 res4f_branch2a_relu[0][0]

res4f_branch2b (Conv2D) (None, None, None, 2 589824 padding4f_branch2b[0][0]

bn4f_branch2b (BatchNormalizati (None, None, None, 2 1024 res4f_branch2b[0][0]

res4f_branch2b_relu (Activation (None, None, None, 2 0 bn4f_branch2b[0][0]

res4f_branch2c (Conv2D) (None, None, None, 1 262144 res4f_branch2b_relu[0][0]

bn4f_branch2c (BatchNormalizati (None, None, None, 1 4096 res4f_branch2c[0][0]

res4f (Add) (None, None, None, 1 0 bn4f_branch2c[0][0]
res4e_relu[0][0]

res4f_relu (Activation) (None, None, None, 1 0 res4f[0][0]

res5a_branch2a (Conv2D) (None, None, None, 5 524288 res4f_relu[0][0]

bn5a_branch2a (BatchNormalizati (None, None, None, 5 2048 res5a_branch2a[0][0]

res5a_branch2a_relu (Activation (None, None, None, 5 0 bn5a_branch2a[0][0]

padding5a_branch2b (ZeroPadding (None, None, None, 5 0 res5a_branch2a_relu[0][0]

res5a_branch2b (Conv2D) (None, None, None, 5 2359296 padding5a_branch2b[0][0]

bn5a_branch2b (BatchNormalizati (None, None, None, 5 2048 res5a_branch2b[0][0]

res5a_branch2b_relu (Activation (None, None, None, 5 0 bn5a_branch2b[0][0]

res5a_branch2c (Conv2D) (None, None, None, 2 1048576 res5a_branch2b_relu[0][0]

res5a_branch1 (Conv2D) (None, None, None, 2 2097152 res4f_relu[0][0]

bn5a_branch2c (BatchNormalizati (None, None, None, 2 8192 res5a_branch2c[0][0]

bn5a_branch1 (BatchNormalizatio (None, None, None, 2 8192 res5a_branch1[0][0]

res5a (Add) (None, None, None, 2 0 bn5a_branch2c[0][0]
bn5a_branch1[0][0]

res5a_relu (Activation) (None, None, None, 2 0 res5a[0][0]

res5b_branch2a (Conv2D) (None, None, None, 5 1048576 res5a_relu[0][0]

bn5b_branch2a (BatchNormalizati (None, None, None, 5 2048 res5b_branch2a[0][0]

res5b_branch2a_relu (Activation (None, None, None, 5 0 bn5b_branch2a[0][0]

padding5b_branch2b (ZeroPadding (None, None, None, 5 0 res5b_branch2a_relu[0][0]

res5b_branch2b (Conv2D) (None, None, None, 5 2359296 padding5b_branch2b[0][0]

bn5b_branch2b (BatchNormalizati (None, None, None, 5 2048 res5b_branch2b[0][0]

res5b_branch2b_relu (Activation (None, None, None, 5 0 bn5b_branch2b[0][0]

res5b_branch2c (Conv2D) (None, None, None, 2 1048576 res5b_branch2b_relu[0][0]

bn5b_branch2c (BatchNormalizati (None, None, None, 2 8192 res5b_branch2c[0][0]

res5b (Add) (None, None, None, 2 0 bn5b_branch2c[0][0]
res5a_relu[0][0]

res5b_relu (Activation) (None, None, None, 2 0 res5b[0][0]

res5c_branch2a (Conv2D) (None, None, None, 5 1048576 res5b_relu[0][0]

bn5c_branch2a (BatchNormalizati (None, None, None, 5 2048 res5c_branch2a[0][0]

res5c_branch2a_relu (Activation (None, None, None, 5 0 bn5c_branch2a[0][0]

padding5c_branch2b (ZeroPadding (None, None, None, 5 0 res5c_branch2a_relu[0][0]

res5c_branch2b (Conv2D) (None, None, None, 5 2359296 padding5c_branch2b[0][0]

bn5c_branch2b (BatchNormalizati (None, None, None, 5 2048 res5c_branch2b[0][0]

res5c_branch2b_relu (Activation (None, None, None, 5 0 bn5c_branch2b[0][0]

res5c_branch2c (Conv2D) (None, None, None, 2 1048576 res5c_branch2b_relu[0][0]

bn5c_branch2c (BatchNormalizati (None, None, None, 2 8192 res5c_branch2c[0][0]

res5c (Add) (None, None, None, 2 0 bn5c_branch2c[0][0]
res5b_relu[0][0]

res5c_relu (Activation) (None, None, None, 2 0 res5c[0][0]

C5_reduced (Conv2D) (None, None, None, 2 524544 res5c_relu[0][0]

P5_upsampled (UpsampleLike) (None, None, None, 2 0 C5_reduced[0][0]
res4f_relu[0][0]

C4_reduced (Conv2D) (None, None, None, 2 262400 res4f_relu[0][0]

P4_merged (Add) (None, None, None, 2 0 P5_upsampled[0][0]
C4_reduced[0][0]

P4_upsampled (UpsampleLike) (None, None, None, 2 0 P4_merged[0][0]
res3d_relu[0][0]

C3_reduced (Conv2D) (None, None, None, 2 131328 res3d_relu[0][0]

P6 (Conv2D) (None, None, None, 2 4718848 res5c_relu[0][0]

P3_merged (Add) (None, None, None, 2 0 P4_upsampled[0][0]
C3_reduced[0][0]

C6_relu (Activation) (None, None, None, 2 0 P6[0][0]

P3 (Conv2D) (None, None, None, 2 590080 P3_merged[0][0]

P4 (Conv2D) (None, None, None, 2 590080 P4_merged[0][0]

P5 (Conv2D) (None, None, None, 2 590080 C5_reduced[0][0]

P7 (Conv2D) (None, None, None, 2 590080 C6_relu[0][0]

regression_submodel (Model) (None, None, 4) 2636920 P3[0][0]
P4[0][0]
P5[0][0]
P6[0][0]
P7[0][0]

classification_submodel (Model) (None, None, 1) 2429470 P3[0][0]
P4[0][0]
P5[0][0]
P6[0][0]
P7[0][0]

regression (Concatenate) (None, None, 4) 0 regression_submodel[1][0]
regression_submodel[2][0]
regression_submodel[3][0]
regression_submodel[4][0]
regression_submodel[5][0]

classification (Concatenate) (None, None, 1) 0 classification_submodel[1][0]
classification_submodel[2][0]
classification_submodel[3][0]
classification_submodel[4][0]
classification_submodel[5][0]

Total params: 36,624,982
Trainable params: 13,063,830
Non-trainable params: 23,561,152

None
WARNING:tensorflow:From /home/vision/.virtualenvs/dl4cv/lib/python3.5/site-packages/keras/backend/tensorflow_backend.py:422: The name tf.global_variables is deprecated. Please use tf.compat.v1.global_variables instead.

WARNING:tensorflow:From /home/vision/.virtualenvs/dl4cv/lib/python3.5/site-packages/keras/callbacks/tensorboard_v1.py:200: The name tf.summary.merge_all is deprecated. Please use tf.compat.v1.summary.merge_all instead.

WARNING:tensorflow:From /home/vision/.virtualenvs/dl4cv/lib/python3.5/site-packages/keras/callbacks/tensorboard_v1.py:203: The name tf.summary.FileWriter is deprecated. Please use tf.compat.v1.summary.FileWriter instead.

System Specification

OS: Ubuntu 16.04
keras = 2.3.0
Tensorflow = 1.14.0
CUDA = 10.2

However, I am able to run the demo test notebook without any segmentation fault.
I also tried training on COCO from scratch and again got a segmentation fault.

I tried the suggestions in every related issue, but without success.

Could anyone please help me out?

Thank you
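
Separately from the segmentation fault, the log above shows TensorFlow trying to open CUDA 10.0 libraries (libcudart.so.10.0 and others) while LD_LIBRARY_PATH points at /usr/local/cuda-10.2/lib64, followed by "Skipping registering GPU devices". A minimal sketch for checking which of those libraries the dynamic loader can find is below. The helper names can_load and check_libs are hypothetical, and the library names are copied from the log rather than from any TensorFlow API.

```python
import ctypes

# Library names copied from the "Could not dlopen library" lines in the log.
CUDA_LIBS = [
    "libcudart.so.10.0",
    "libcublas.so.10.0",
    "libcufft.so.10.0",
    "libcurand.so.10.0",
    "libcusolver.so.10.0",
    "libcusparse.so.10.0",
]


def can_load(lib_name):
    """Return True if the dynamic loader can open the shared library."""
    try:
        ctypes.CDLL(lib_name)
    except OSError:
        return False
    return True


def check_libs(names=CUDA_LIBS):
    """Map each library name to whether it could be loaded."""
    return {name: can_load(name) for name in names}
```

Calling check_libs() in the same shell environment used for training shows which entries come back False. That only reports loader visibility; it says nothing about the cause of the crash.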

sumeetssaurav · 2019-12-04T22:03:27Z

Got it solved!!
In case anybody else faces a similar issue, please follow these steps:

1. Add sys.settrace at the first line of the script. For example, in my case I added the lines below to the train.py script:

import sys
sys.settrace

2. gdb python
3. (gdb) run keras_retinanet/bin/train.py -----------
4. (gdb) backtrace

Finally, I found the conflicting packages. In my case the cv2 package was conflicting with the Cython package.
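
A lighter-weight alternative to the gdb session above, offered as a sketch and not as what was done in this thread: Python's standard faulthandler module prints the Python-level traceback when the interpreter receives a fatal signal such as SIGSEGV. This assumes Python 3.3 or later; the helper name enable_crash_traces is hypothetical.

```python
import faulthandler
import sys


def enable_crash_traces(stream=None):
    """Dump Python tracebacks of all threads if the process crashes."""
    # faulthandler needs a real file descriptor; default to stderr.
    if stream is None:
        stream = sys.stderr
    faulthandler.enable(file=stream, all_threads=True)
    return faulthandler.is_enabled()

# Intended use: call enable_crash_traces() at the very top of the training
# script, before importing cv2 or keras, so the traceback names the import
# or call that was running when the crash happened.
```

The same handler can be switched on without editing the script by starting the interpreter with the -X faulthandler option.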

alessandrodignani · 2019-12-06T18:07:34Z

I have the same issue. From your answer, I understood that the problem is a conflict between cv2 and Cython, but how did you solve it?

I added sys.settrace just below import sys, but it didn't change anything.
Since settrace is a function, I also tried calling it as sys.settrace(), but it requires a function as a parameter.
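
For reference, sys.settrace only has an effect when it is called with a trace function; the bare expression sys.settrace does nothing. A minimal sketch of a working call follows, where trace_calls and demo are hypothetical names chosen for illustration.

```python
import sys

seen_calls = []


def trace_calls(frame, event, arg):
    """Record the name of every Python function that gets called."""
    if event == "call":
        seen_calls.append(frame.f_code.co_name)
    # Returning None disables per-line tracing inside the called frame.
    return None


def demo():
    return 1 + 1


sys.settrace(trace_calls)  # install the hook
demo()
sys.settrace(None)         # remove the hook again
print(seen_calls)
```

A hook like this only sees Python-level calls, so it can narrow down which Python function was entered last, but the crash inside native code still needs a tool such as gdb.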

ssusie · 2019-12-19T20:58:48Z

I think sys.settrace and the following steps are for debugging.

I have downgraded opencv-python to 3.4.2.17 and the code worked.

venergiac · 2020-02-03T08:30:12Z

Thanks, I got the same error and solved it by upgrading opencv-python to 4.2.0.32:

pip install -U opencv-python
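
Since both fixes in this thread change the installed opencv-python version, it may help to record which versions are active before and after. The sketch below also runs on older interpreters. The helpers module_version and report_versions are hypothetical, and the approach relies on the __version__ attribute, which cv2 and Cython expose but not every package does.

```python
import importlib


def module_version(name):
    """Return the module's __version__, or None if it cannot be imported."""
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None
    # Some packages do not define __version__; report that explicitly.
    return getattr(module, "__version__", "unknown")


def report_versions(names=("cv2", "Cython")):
    """Map each module name to its version string (None if not installed)."""
    return {name: module_version(name) for name in names}
```

Running print(report_versions()) inside the same virtualenv used for training shows the cv2 and Cython versions that are actually being imported.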

sumeetssaurav closed this as completed Dec 4, 2019
