Debug issue on Graph Generation #59

kshitijrajsharma · 2023-01-31T12:40:36Z

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/celery/app/trace.py", line 451, in trace_task
    R = retval = fun(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/celery/app/trace.py", line 734, in protected_call
    return self.run(*args, **kwargs)
  File "/app/core/tasks.py", line 95, in train_model
    raise ex
  File "/app/core/tasks.py", line 58, in train_model
    final_accuracy, final_model_path = train(
  File "/usr/local/lib/python3.8/dist-packages/hot_fair_utilities/training/train.py", line 56, in train
    run_main_train_code(cfg)
  File "/usr/local/lib/python3.8/dist-packages/hot_fair_utilities/training/run_training.py", line 279, in run_main_train_code
    history = the_model.fit(
  File "/usr/local/lib/python3.8/dist-packages/keras/utils/traceback_utils.py", line 67, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/eager/execute.py", line 54, in quick_execute
    tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
tensorflow.python.framework.errors_impl.ResourceExhaustedError: Graph execution error:

OOM when allocating tensor with shape[16,64,64,64] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[{{node model/decoder_stage2b_relu/Relu-0-1-TransposeNCHWToNHWC-LayoutOptimizer}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.
[Op:__inference_train_function_2701887]
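
For scale, here is a rough sketch of how much memory the failing allocation requested. The shape is taken from the OOM message above; the 4-byte element size assumes that "type float" means float32, and `tensor_bytes` is a hypothetical helper written for this note, not part of TensorFlow or hot_fair_utilities.

```python
from math import prod

# Shape copied from the OOM message: shape[16,64,64,64].
shape = (16, 64, 64, 64)
# Assumption: "type float" in the message means 4-byte float32.
bytes_per_element = 4


def tensor_bytes(shape, bytes_per_element):
    """Return the size in bytes of a dense tensor with the given shape."""
    return prod(shape) * bytes_per_element


size = tensor_bytes(shape, bytes_per_element)
print(f"{size} bytes = {size / 2**20:.0f} MiB")
```

Under these assumptions the request is only 16 MiB, which suggests the GPU was already close to full when this tensor was requested, rather than this single tensor being too large on its own.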

kshitijrajsharma · 2023-03-29T11:56:50Z

Couldn't reproduce this issue either; will reopen if encountered again.

kshitijrajsharma added the "bug" (Something isn't working) and "component : backend" labels Jan 31, 2023

kshitijrajsharma closed this as completed Mar 29, 2023
