test error with main_SemanticKITTI.py #59

Closed
hahakid opened this issue Apr 16, 2020 · 6 comments

Comments

@hahakid

hahakid commented Apr 16, 2020

Sorry to trouble you.
When I try vis mode,
labels = flat_inputs[21] (main_SemanticKITTI.py, line 238) causes a problem: I find that flat_inputs only contains items 0-19.

When I try test mode, I get the following error:
/home/kid/anaconda3/envs/rlnet/bin/python /home/kid/workspace/RandLA-Net/main_SemanticKITTI.py --gpu 0 --mode test --test_area 14
/home/kid/anaconda3/envs/rlnet/lib/python3.5/site-packages/tensorflow/python/framework/dtypes.py:523: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint8 = np.dtype([("qint8", np.int8, 1)])
/home/kid/anaconda3/envs/rlnet/lib/python3.5/site-packages/tensorflow/python/framework/dtypes.py:524: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/home/kid/anaconda3/envs/rlnet/lib/python3.5/site-packages/tensorflow/python/framework/dtypes.py:525: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint16 = np.dtype([("qint16", np.int16, 1)])
/home/kid/anaconda3/envs/rlnet/lib/python3.5/site-packages/tensorflow/python/framework/dtypes.py:526: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/home/kid/anaconda3/envs/rlnet/lib/python3.5/site-packages/tensorflow/python/framework/dtypes.py:527: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint32 = np.dtype([("qint32", np.int32, 1)])
/home/kid/anaconda3/envs/rlnet/lib/python3.5/site-packages/tensorflow/python/framework/dtypes.py:532: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
np_resource = np.dtype([("resource", np.ubyte, 1)])
Initiating input pipelines
WARNING:tensorflow:From /home/kid/workspace/RandLA-Net/RandLANet.py:265: softmax_cross_entropy_with_logits (from tensorflow.python.ops.nn_ops) is deprecated and will be removed in a future version.
Instructions for updating:

Future major versions of TensorFlow will allow gradients to flow
into the labels input on backprop by default.

See tf.nn.softmax_cross_entropy_with_logits_v2.

/home/kid/anaconda3/envs/rlnet/lib/python3.5/site-packages/tensorflow/python/ops/gradients_impl.py:108: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
"Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
Model restored from results/SemanticKITTI/snapshots/snap-277357
step 0
Traceback (most recent call last):
File "/home/kid/anaconda3/envs/rlnet/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1292, in _do_call
return fn(*args)
File "/home/kid/anaconda3/envs/rlnet/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1277, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "/home/kid/anaconda3/envs/rlnet/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1367, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[901120,16,16] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[{{node layers/Softmax}} = SoftmaxT=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

 [[{{node layers/Encoder_layer_2LFAatt_pooling_1fc/Tensordot/Shape/_929}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_1963_layers/Encoder_layer_2LFAatt_pooling_1fc/Tensordot/Shape", tensor_type=DT_INT32, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/kid/workspace/RandLA-Net/main_SemanticKITTI.py", line 226, in <module>
tester.test(model, dataset)
File "/home/kid/workspace/RandLA-Net/tester_SemanticKITTI.py", line 78, in test
stacked_probs, labels, point_inds, cloud_inds = self.sess.run(ops, {model.is_training: False})
File "/home/kid/anaconda3/envs/rlnet/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 887, in run
run_metadata_ptr)
File "/home/kid/anaconda3/envs/rlnet/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1110, in _run
feed_dict_tensor, options, run_metadata)
File "/home/kid/anaconda3/envs/rlnet/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1286, in _do_run
run_metadata)
File "/home/kid/anaconda3/envs/rlnet/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1308, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[901120,16,16] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[{{node layers/Softmax}} = SoftmaxT=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

 [[{{node layers/Encoder_layer_2LFAatt_pooling_1fc/Tensordot/Shape/_929}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_1963_layers/Encoder_layer_2LFAatt_pooling_1fc/Tensordot/Shape", tensor_type=DT_INT32, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

Caused by op 'layers/Softmax', defined at:
File "/home/kid/workspace/RandLA-Net/main_SemanticKITTI.py", line 213, in <module>
model = Network(dataset, cfg)
File "/home/kid/workspace/RandLA-Net/RandLANet.py", line 52, in __init__
self.logits = self.inference(self.inputs, self.is_training)
File "/home/kid/workspace/RandLA-Net/RandLANet.py", line 115, in inference
'Encoder_layer_' + str(i), is_training)
File "/home/kid/workspace/RandLA-Net/RandLANet.py", line 272, in dilated_res_block
f_pc = self.building_block(xyz, f_pc, neigh_idx, d_out, name + 'LFA', is_training)
File "/home/kid/workspace/RandLA-Net/RandLANet.py", line 285, in building_block
f_pc_agg = self.att_pooling(f_concat, d_out // 2, name + 'att_pooling_1', is_training)
File "/home/kid/workspace/RandLA-Net/RandLANet.py", line 352, in att_pooling
att_scores = tf.nn.softmax(att_activation, axis=1)
File "/home/kid/anaconda3/envs/rlnet/lib/python3.5/site-packages/tensorflow/python/util/deprecation.py", line 488, in new_func
return func(*args, **kwargs)
File "/home/kid/anaconda3/envs/rlnet/lib/python3.5/site-packages/tensorflow/python/ops/nn_ops.py", line 1746, in softmax
return _softmax(logits, gen_nn_ops.softmax, axis, name)
File "/home/kid/anaconda3/envs/rlnet/lib/python3.5/site-packages/tensorflow/python/ops/nn_ops.py", line 1707, in _softmax
output = compute_op(logits)
File "/home/kid/anaconda3/envs/rlnet/lib/python3.5/site-packages/tensorflow/python/ops/gen_nn_ops.py", line 7138, in softmax
"Softmax", logits=logits, name=name)
File "/home/kid/anaconda3/envs/rlnet/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/home/kid/anaconda3/envs/rlnet/lib/python3.5/site-packages/tensorflow/python/util/deprecation.py", line 488, in new_func
return func(*args, **kwargs)
File "/home/kid/anaconda3/envs/rlnet/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 3272, in create_op
op_def=op_def)
File "/home/kid/anaconda3/envs/rlnet/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1768, in __init__
self._traceback = tf_stack.extract_stack()

ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[901120,16,16] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[{{node layers/Softmax}} = SoftmaxT=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

 [[{{node layers/Encoder_layer_2LFAatt_pooling_1fc/Tensordot/Shape/_929}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_1963_layers/Encoder_layer_2LFAatt_pooling_1fc/Tensordot/Shape", tensor_type=DT_INT32, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

Process finished with exit code 1

@QingyongHu
Owner

Hi, @hahakid, thanks for your interest in our work!

This is mainly due to insufficient GPU memory. You can try changing val_batch_size to a smaller value and re-running the code.

Hope this helps!
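
For a rough sense of scale, the size of the tensor named in the OOM message can be worked out from its shape. The sketch below is only back-of-the-envelope arithmetic: it assumes "type float" means 4-byte float32, and the helper name tensor_mib is made up for illustration. The second calculation further assumes that the first dimension scales linearly with val_batch_size, which the thread does not confirm.

```python
def tensor_mib(shape, bytes_per_element=4):
    """Size in MiB of a dense tensor with the given shape (float32 by default)."""
    n_elements = 1
    for dim in shape:
        n_elements *= dim
    return n_elements * bytes_per_element / 2**20


# Shape reported in the ResourceExhaustedError above.
oom_shape = (901120, 16, 16)
print(tensor_mib(oom_shape))   # 880.0

# Assumption: halving val_batch_size halves the first dimension.
half_shape = (oom_shape[0] // 2, 16, 16)
print(tensor_mib(half_shape))  # 440.0
```

This counts one intermediate tensor only. The network holds many such tensors at once, so the total memory needed is considerably larger than this single allocation.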

@hahakid
Author

hahakid commented Apr 16, 2020

@QingyongHu thanks for your kind reply. I will try a smaller batch size.
My first question was why labels = flat_inputs[21] (main_SemanticKITTI.py, line 238) causes a problem. I find that flat_inputs only contains items 0-19, so this index does not match for KITTI.

@QingyongHu
Owner

Hi @hahakid, can you share a screenshot of the error message and the debug information?

@hahakid
Author

hahakid commented Apr 16, 2020

@QingyongHu
/home/kid/anaconda3/envs/rlnet/bin/python /home/kid/workspace/RandLA-Net/main_SemanticKITTI.py --gpu 0 --mode vis --test_area 08
/home/kid/anaconda3/envs/rlnet/lib/python3.5/site-packages/tensorflow/python/framework/dtypes.py:523: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint8 = np.dtype([("qint8", np.int8, 1)])
/home/kid/anaconda3/envs/rlnet/lib/python3.5/site-packages/tensorflow/python/framework/dtypes.py:524: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/home/kid/anaconda3/envs/rlnet/lib/python3.5/site-packages/tensorflow/python/framework/dtypes.py:525: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint16 = np.dtype([("qint16", np.int16, 1)])
/home/kid/anaconda3/envs/rlnet/lib/python3.5/site-packages/tensorflow/python/framework/dtypes.py:526: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/home/kid/anaconda3/envs/rlnet/lib/python3.5/site-packages/tensorflow/python/framework/dtypes.py:527: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint32 = np.dtype([("qint32", np.int32, 1)])
/home/kid/anaconda3/envs/rlnet/lib/python3.5/site-packages/tensorflow/python/framework/dtypes.py:532: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
np_resource = np.dtype([("resource", np.ubyte, 1)])
Initiating input pipelines
Traceback (most recent call last):
File "/home/kid/workspace/RandLA-Net/main_SemanticKITTI.py", line 239, in <module>
labels = flat_inputs[21]
IndexError: tuple index out of range

Process finished with exit code 1
I also took a screenshot in debug mode: flat_inputs only contains 20 items, but index 21 is assigned to labels. I also tried labels = flat_inputs[19], but that did not work.
Screenshot from 2020-04-17 00-46-44

@QingyongHu
Owner

Hi @hahakid, sorry for my mistake. Please change this line to:
labels = flat_inputs[17]

@hahakid
Author

hahakid commented Apr 17, 2020

@QingyongHu It works, thanks for your outstanding work. By the way, what do the entries at [18] and [19] mean?
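
One way to see why index 17 works where 21 does not is to sketch a possible layout of flat_inputs. This is an assumption, not something confirmed in this thread: it supposes the tuple holds four per-layer groups (hypothetically named xyz, neigh_idx, sub_idx, interp_idx), followed by four single entries (features, labels, point indices, cloud indices). The last three match the order of the values returned by sess.run in tester_SemanticKITTI.py above. Under that assumption, 4 layers give 20 items with labels at index 17, and 5 layers would give labels at index 21.

```python
def flat_input_layout(num_layers):
    """Hypothetical index layout of flat_inputs for a given number of layers.

    Per-layer groups map to (start, stop) slices; single entries map to an index.
    All names here are illustrative assumptions.
    """
    per_layer = ["xyz", "neigh_idx", "sub_idx", "interp_idx"]
    singles = ["features", "labels", "input_inds", "cloud_inds"]

    layout = {}
    for k, name in enumerate(per_layer):
        layout[name] = (k * num_layers, (k + 1) * num_layers)

    base = len(per_layer) * num_layers
    for offset, name in enumerate(singles):
        layout[name] = base + offset

    layout["total"] = base + len(singles)
    return layout


kitti = flat_input_layout(4)
print(kitti["total"], kitti["labels"])             # 20 17
print(kitti["input_inds"], kitti["cloud_inds"])    # 18 19
print(flat_input_layout(5)["labels"])              # 21
```

If this layout holds, [18] and [19] would be the point indices and cloud indices of each batch, but only the repository code can confirm that.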
