ColTran OOM Error even with small dataset #838

Closed
ketan-lambat opened this issue Sep 29, 2021 · 5 comments

Comments

@ketan-lambat

I get an out-of-memory (OOM) error when running the script for the ColTran spatial upsampler.

I get the same error with ImageNet subsets of 100, 100, 50 and even 10 images, in both modes (colorize and recolorize).

The /configs/spatial_upsampler.py file already sets config.batch_size = 1, so the batch size cannot be reduced further.
The first two steps, colorizer and color_upsampler, run fine.
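
As a rough sanity check on why a batch size of 1 may still not fit, the failing allocation reported in the error log below (1342177280 bytes, requested by op Softmax) can be converted to an element count. The sketch assumes float32 activations and, purely as a hypothesis, an attention tensor with 256 x 256 x 256 entries per head and per image. Neither assumption is confirmed by the log, and the helper names are made up for illustration.

```python
# Size of the failed allocation, copied from the bfc_allocator warning in the log.
FAILED_ALLOC_BYTES = 1342177280

BYTES_PER_FLOAT32 = 4  # assumption: the Softmax output is float32


def bytes_to_gib(num_bytes: int) -> float:
    """Convert a byte count to GiB (2**30 bytes)."""
    return num_bytes / 2**30


def element_count(num_bytes: int, bytes_per_element: int = BYTES_PER_FLOAT32) -> int:
    """Number of elements in a dense tensor of the given size."""
    return num_bytes // bytes_per_element


gib = bytes_to_gib(FAILED_ALLOC_BYTES)
elements = element_count(FAILED_ALLOC_BYTES)

# Hypothetical shape: one 256-wide attention row for every pixel of a 256x256 image.
ASSUMED_ELEMENTS_PER_HEAD_AND_IMAGE = 256 * 256 * 256
heads_times_images = elements / ASSUMED_ELEMENTS_PER_HEAD_AND_IMAGE

print(f"{gib:.2f} GiB, {elements} float32 elements")
print(f"batch size x heads under the assumed shape: {heads_times_images:g}")
```

Under those assumptions the product of batch size and attention heads would be 20, which hints that the effective batch size during sampling may be larger than the configured 1. That would need to be checked against custom_colorize.py.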

Command used

%%time
!rm -rf $IMG_DIR/.ipynb_checkpoints/ $STORE_DIR/stage2/.ipynb_checkpoints
!python -m coltran.custom_colorize --config=coltran/configs/spatial_upsampler.py --logdir=$LOGDIR/spatial_upsampler --img_dir=$IMG_DIR --store_dir=$STORE_DIR --gen_data_dir=$STORE_DIR/stage2 --mode=$MODE

Environment variables

os.environ["LOGDIR"] = "/content/drive/MyDrive/Colab_Work/HONORS/ColorTrans/coltran_pretrained/google-research/coltran/logdir/coltran"
os.environ['IMG_DIR'] = "/content/drive/MyDrive/Colab_Work/HONORS/ColorTrans/coltran_pretrained/imagenet_50/RGB/train" 
os.environ['STORE_DIR'] = "/content/drive/MyDrive/Colab_Work/HONORS/ColorTrans/coltran_pretrained/op_imgnet50_recolor_01"
os.environ['MODE'] = "recolorize"
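
The allocator warning in the error log below suggests trying TF_GPU_ALLOCATOR=cuda_malloc_async in case memory fragmentation is the cause. A minimal sketch of setting it in the same way as the variables above follows. Whether this allocator helps, or is even supported on a Tesla K80, is untested here, and the helper function is only illustrative. The variable has to be set before the cell that launches coltran.custom_colorize runs.

```python
import os

# Variable name and value are taken verbatim from the hint in the TensorFlow warning.
ALLOCATOR_VAR = "TF_GPU_ALLOCATOR"
ALLOCATOR_VALUE = "cuda_malloc_async"


def enable_async_allocator(env=os.environ):
    """Set the allocator hint in the given environment mapping and return the old value."""
    previous = env.get(ALLOCATOR_VAR)
    env[ALLOCATOR_VAR] = ALLOCATOR_VALUE
    return previous


previous_value = enable_async_allocator()
print(f"{ALLOCATOR_VAR}={os.environ[ALLOCATOR_VAR]} (was {previous_value!r})")
```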

I am running this on Google Colab.

nvidia-smi

Wed Sep 29 06:34:34 2021       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.63.01    Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla K80           Off  | 00000000:00:04.0 Off |                    0 |
| N/A   40C    P0    65W / 149W |      0MiB / 11441MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
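
To read the headroom off the table above programmatically, the memory column can be parsed with a regular expression. This is a small sketch that works on the GPU row as pasted; the function name is illustrative, and the pattern only assumes the "usedMiB / totalMiB" layout shown above.

```python
import re
from typing import Tuple

# The GPU row from the nvidia-smi table above, copied as text.
SMI_ROW = "| N/A   40C    P0    65W / 149W |      0MiB / 11441MiB |      0%      Default |"

MEMORY_PATTERN = re.compile(r"(\d+)MiB\s*/\s*(\d+)MiB")


def parse_memory_usage(text: str) -> Tuple[int, int]:
    """Return (used_mib, total_mib) from the first 'used / total' memory field in text."""
    match = MEMORY_PATTERN.search(text)
    if match is None:
        raise ValueError("no 'usedMiB / totalMiB' field found")
    return int(match.group(1)), int(match.group(2))


used_mib, total_mib = parse_memory_usage(SMI_ROW)
free_mib = total_mib - used_mib
print(f"used={used_mib} MiB, total={total_mib} MiB, free={free_mib} MiB")
```

The card is idle before the run, so the OOM is not caused by another process holding memory. TensorFlow reports 10819 MB of it as usable when it creates the device.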

Error

2021-09-29 06:33:47.409836: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-09-29 06:33:47.895533: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-09-29 06:33:47.896507: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-09-29 06:33:47.906032: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-09-29 06:33:47.906959: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-09-29 06:33:47.907786: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-09-29 06:33:53.182315: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-09-29 06:33:53.183181: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-09-29 06:33:53.184079: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-09-29 06:33:53.185008: W tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:39] Overriding allow_growth setting because the TF_FORCE_GPU_ALLOW_GROWTH environment variable is set. Original config value was 0.
2021-09-29 06:33:53.185080: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1510] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 10819 MB memory:  -> device: 0, name: Tesla K80, pci bus id: 0000:00:04.0, compute capability: 3.7
2021-09-29 06:33:54.140517: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:185] None of the MLIR Optimization Passes are enabled (registered 2)
2021-09-29 06:34:01.320702: I tensorflow/stream_executor/cuda/cuda_dnn.cc:369] Loaded cuDNN version 8005
WARNING:tensorflow:From /usr/local/lib/python3.7/dist-packages/tensorflow/python/training/moving_averages.py:457: Variable.initialized_value (from tensorflow.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Use Variable.read_value. Variables in 2.X are initialized automatically both in eager and graph (inside tf.defun) contexts.
W0929 06:34:01.986204 140536467093376 deprecation.py:345] From /usr/local/lib/python3.7/dist-packages/tensorflow/python/training/moving_averages.py:457: Variable.initialized_value (from tensorflow.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Use Variable.read_value. Variables in 2.X are initialized automatically both in eager and graph (inside tf.defun) contexts.
I0929 06:34:02.988495 140536467093376 train_utils.py:91] Built with exponential moving average.
I0929 06:34:02.995397 140536467093376 train_utils.py:185] Restoring from /content/drive/MyDrive/Colab_Work/HONORS/ColorTrans/coltran_pretrained/google-research/coltran/logdir/coltran/spatial_upsampler.
I0929 06:34:10.659995 140536467093376 custom_colorize.py:207] Producing sample after 300000 training steps.
I0929 06:34:10.660418 140536467093376 custom_colorize.py:210] 10
2021-09-29 06:34:24.825106: W tensorflow/core/common_runtime/bfc_allocator.cc:457] Allocator (GPU_0_bfc) ran out of memory trying to allocate 1.25GiB (rounded to 1342177280)requested by op Softmax
If the cause is memory fragmentation maybe the environment variable 'TF_GPU_ALLOCATOR=cuda_malloc_async' will improve the situation. 
Current allocation summary follows.
Current allocation summary follows.
2021-09-29 06:34:24.825193: I tensorflow/core/common_runtime/bfc_allocator.cc:1004] BFCAllocator dump for GPU_0_bfc
2021-09-29 06:34:24.825235: I tensorflow/core/common_runtime/bfc_allocator.cc:1011] Bin (256): 	Total Chunks: 25, Chunks in use: 25. 6.2KiB allocated for chunks. 6.2KiB in use in bin. 113B client-requested in use in bin.
2021-09-29 06:34:24.825262: I tensorflow/core/common_runtime/bfc_allocator.cc:1011] Bin (512): 	Total Chunks: 1, Chunks in use: 0. 768B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2021-09-29 06:34:24.825290: I tensorflow/core/common_runtime/bfc_allocator.cc:1011] Bin (1024): 	Total Chunks: 3, Chunks in use: 3. 3.2KiB allocated for chunks. 3.2KiB in use in bin. 3.0KiB client-requested in use in bin.
2021-09-29 06:34:24.825318: I tensorflow/core/common_runtime/bfc_allocator.cc:1011] Bin (2048): 	Total Chunks: 75, Chunks in use: 74. 151.0KiB allocated for chunks. 149.0KiB in use in bin. 148.0KiB client-requested in use in bin.
2021-09-29 06:34:24.825340: I tensorflow/core/common_runtime/bfc_allocator.cc:1011] Bin (4096): 	Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2021-09-29 06:34:24.825395: I tensorflow/core/common_runtime/bfc_allocator.cc:1011] Bin (8192): 	Total Chunks: 13, Chunks in use: 12. 104.0KiB allocated for chunks. 96.0KiB in use in bin. 96.0KiB client-requested in use in bin.
2021-09-29 06:34:24.825421: I tensorflow/core/common_runtime/bfc_allocator.cc:1011] Bin (16384): 	Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2021-09-29 06:34:24.825452: I tensorflow/core/common_runtime/bfc_allocator.cc:1011] Bin (32768): 	Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2021-09-29 06:34:24.825473: I tensorflow/core/common_runtime/bfc_allocator.cc:1011] Bin (65536): 	Total Chunks: 2, Chunks in use: 0. 247.0KiB allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2021-09-29 06:34:24.825494: I tensorflow/core/common_runtime/bfc_allocator.cc:1011] Bin (131072): 	Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2021-09-29 06:34:24.825514: I tensorflow/core/common_runtime/bfc_allocator.cc:1011] Bin (262144): 	Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2021-09-29 06:34:24.825536: I tensorflow/core/common_runtime/bfc_allocator.cc:1011] Bin (524288): 	Total Chunks: 10, Chunks in use: 8. 5.50MiB allocated for chunks. 4.50MiB in use in bin. 4.00MiB client-requested in use in bin.
2021-09-29 06:34:24.825559: I tensorflow/core/common_runtime/bfc_allocator.cc:1011] Bin (1048576): 	Total Chunks: 77, Chunks in use: 76. 79.00MiB allocated for chunks. 78.00MiB in use in bin. 77.19MiB client-requested in use in bin.
2021-09-29 06:34:24.825581: I tensorflow/core/common_runtime/bfc_allocator.cc:1011] Bin (2097152): 	Total Chunks: 3, Chunks in use: 3. 7.75MiB allocated for chunks. 7.75MiB in use in bin. 7.75MiB client-requested in use in bin.
2021-09-29 06:34:24.825603: I tensorflow/core/common_runtime/bfc_allocator.cc:1011] Bin (4194304): 	Total Chunks: 1, Chunks in use: 0. 7.50MiB allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2021-09-29 06:34:24.825623: I tensorflow/core/common_runtime/bfc_allocator.cc:1011] Bin (8388608): 	Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2021-09-29 06:34:24.825645: I tensorflow/core/common_runtime/bfc_allocator.cc:1011] Bin (16777216): 	Total Chunks: 1, Chunks in use: 0. 26.75MiB allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2021-09-29 06:34:24.825665: I tensorflow/core/common_runtime/bfc_allocator.cc:1011] Bin (33554432): 	Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2021-09-29 06:34:24.825687: I tensorflow/core/common_runtime/bfc_allocator.cc:1011] Bin (67108864): 	Total Chunks: 1, Chunks in use: 0. 73.50MiB allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2021-09-29 06:34:24.825709: I tensorflow/core/common_runtime/bfc_allocator.cc:1011] Bin (134217728): 	Total Chunks: 5, Chunks in use: 0. 829.50MiB allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2021-09-29 06:34:24.825730: I tensorflow/core/common_runtime/bfc_allocator.cc:1011] Bin (268435456): 	Total Chunks: 13, Chunks in use: 9. 9.56GiB allocated for chunks. 6.56GiB in use in bin. 6.56GiB client-requested in use in bin.
2021-09-29 06:34:24.825754: I tensorflow/core/common_runtime/bfc_allocator.cc:1027] Bin for 1.25GiB was 256.00MiB, Chunk State: 
2021-09-29 06:34:24.825782: I tensorflow/core/common_runtime/bfc_allocator.cc:1033]   Size: 384.00MiB | Requested Size: 2.0KiB | in_use: 0 | bin_num: 20, prev:   Size: 640.00MiB | Requested Size: 640.00MiB | in_use: 1 | bin_num: -1
2021-09-29 06:34:24.825807: I tensorflow/core/common_runtime/bfc_allocator.cc:1033]   Size: 640.00MiB | Requested Size: 640.00MiB | in_use: 0 | bin_num: 20, next:   Size: 640.00MiB | Requested Size: 640.00MiB | in_use: 1 | bin_num: -1
2021-09-29 06:34:24.825831: I tensorflow/core/common_runtime/bfc_allocator.cc:1033]   Size: 896.00MiB | Requested Size: 2.0KiB | in_use: 0 | bin_num: 20, prev:   Size: 1.25GiB | Requested Size: 1.25GiB | in_use: 1 | bin_num: -1
2021-09-29 06:34:24.825855: I tensorflow/core/common_runtime/bfc_allocator.cc:1033]   Size: 1.12GiB | Requested Size: 2.0KiB | in_use: 0 | bin_num: 20, prev:   Size: 1.25GiB | Requested Size: 1.25GiB | in_use: 1 | bin_num: -1
2021-09-29 06:34:24.825874: I tensorflow/core/common_runtime/bfc_allocator.cc:1040] Next region of size 2097152
2021-09-29 06:34:24.825896: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703ae0000 of size 1280 next 1
2021-09-29 06:34:24.825915: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703ae0500 of size 256 next 2
2021-09-29 06:34:24.825934: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703ae0600 of size 256 next 3
2021-09-29 06:34:24.825953: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703ae0700 of size 256 next 4
2021-09-29 06:34:24.825971: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703ae0800 of size 256 next 5
2021-09-29 06:34:24.826010: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703ae0900 of size 786432 next 6
2021-09-29 06:34:24.826044: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703ba0900 of size 256 next 10
2021-09-29 06:34:24.826092: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703ba0a00 of size 786432 next 13
2021-09-29 06:34:24.826110: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703c60a00 of size 256 next 11
2021-09-29 06:34:24.826127: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703c60b00 of size 256 next 9
2021-09-29 06:34:24.826160: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703c60c00 of size 256 next 15
2021-09-29 06:34:24.826192: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703c60d00 of size 256 next 16
2021-09-29 06:34:24.826211: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703c60e00 of size 256 next 21
2021-09-29 06:34:24.826229: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703c60f00 of size 256 next 22
2021-09-29 06:34:24.826262: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703c61000 of size 256 next 26
2021-09-29 06:34:24.826294: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703c61100 of size 256 next 27
2021-09-29 06:34:24.826318: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703c61200 of size 1024 next 28
2021-09-29 06:34:24.826336: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703c61600 of size 256 next 206
2021-09-29 06:34:24.826354: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] Free  at 703c61700 of size 768 next 30
2021-09-29 06:34:24.826383: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703c61a00 of size 256 next 33
2021-09-29 06:34:24.826402: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703c61b00 of size 256 next 34
2021-09-29 06:34:24.826420: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703c61c00 of size 8192 next 38
2021-09-29 06:34:24.826456: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703c63c00 of size 256 next 39
2021-09-29 06:34:24.826474: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703c63d00 of size 256 next 40
2021-09-29 06:34:24.826492: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703c63e00 of size 256 next 48
2021-09-29 06:34:24.826509: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703c63f00 of size 256 next 50
2021-09-29 06:34:24.826527: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703c64000 of size 2048 next 47
2021-09-29 06:34:24.826545: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703c64800 of size 2048 next 52
2021-09-29 06:34:24.826563: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703c65000 of size 256 next 49
2021-09-29 06:34:24.826580: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703c65100 of size 256 next 46
2021-09-29 06:34:24.826598: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703c65200 of size 2048 next 54
2021-09-29 06:34:24.826615: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703c65a00 of size 2048 next 56
2021-09-29 06:34:24.826639: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703c66200 of size 2048 next 57
2021-09-29 06:34:24.826657: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703c66a00 of size 2048 next 58
2021-09-29 06:34:24.826675: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703c67200 of size 8192 next 44
2021-09-29 06:34:24.826692: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703c69200 of size 2048 next 59
2021-09-29 06:34:24.826710: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703c69a00 of size 2048 next 65
2021-09-29 06:34:24.826728: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703c6a200 of size 2048 next 66
2021-09-29 06:34:24.826745: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703c6aa00 of size 2048 next 69
2021-09-29 06:34:24.826763: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703c6b200 of size 2048 next 71
2021-09-29 06:34:24.826787: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703c6ba00 of size 2048 next 72
2021-09-29 06:34:24.826806: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703c6c200 of size 8192 next 64
2021-09-29 06:34:24.826823: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703c6e200 of size 2048 next 73
2021-09-29 06:34:24.826841: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703c6ea00 of size 2048 next 79
2021-09-29 06:34:24.826858: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703c6f200 of size 2048 next 80
2021-09-29 06:34:24.826876: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703c6fa00 of size 2048 next 82
2021-09-29 06:34:24.826893: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703c70200 of size 2048 next 84
2021-09-29 06:34:24.826910: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703c70a00 of size 2048 next 85
2021-09-29 06:34:24.826928: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703c71200 of size 8192 next 78
2021-09-29 06:34:24.826946: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703c73200 of size 2048 next 86
2021-09-29 06:34:24.826964: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703c73a00 of size 2048 next 92
2021-09-29 06:34:24.826989: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703c74200 of size 2048 next 93
2021-09-29 06:34:24.827024: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703c74a00 of size 2048 next 95
2021-09-29 06:34:24.827041: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703c75200 of size 2048 next 97
2021-09-29 06:34:24.827059: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703c75a00 of size 2048 next 98
2021-09-29 06:34:24.827077: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703c76200 of size 8192 next 91
2021-09-29 06:34:24.827095: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703c78200 of size 2048 next 99
2021-09-29 06:34:24.827112: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703c78a00 of size 2048 next 105
2021-09-29 06:34:24.827130: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703c79200 of size 2048 next 106
2021-09-29 06:34:24.827148: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703c79a00 of size 2048 next 108
2021-09-29 06:34:24.827166: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] Free  at 703c7a200 of size 2048 next 110
2021-09-29 06:34:24.827184: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703c7aa00 of size 2048 next 111
2021-09-29 06:34:24.827203: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] Free  at 703c7b200 of size 8192 next 104
2021-09-29 06:34:24.827220: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703c7d200 of size 2048 next 112
2021-09-29 06:34:24.827239: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703c7da00 of size 2048 next 118
2021-09-29 06:34:24.827271: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703c7e200 of size 2048 next 119
2021-09-29 06:34:24.827307: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703c7ea00 of size 2048 next 121
2021-09-29 06:34:24.827341: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703c7f200 of size 2048 next 123
2021-09-29 06:34:24.827359: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703c7fa00 of size 3072 next 117
2021-09-29 06:34:24.827387: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703c80600 of size 256 next 126
2021-09-29 06:34:24.827405: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703c80700 of size 256 next 12
2021-09-29 06:34:24.827428: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] Free  at 703c80800 of size 126208 next 208
2021-09-29 06:34:24.827447: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703c9f500 of size 2048 next 209
2021-09-29 06:34:24.827465: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703c9fd00 of size 2048 next 211
2021-09-29 06:34:24.827482: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703ca0500 of size 2048 next 213
2021-09-29 06:34:24.827499: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703ca0d00 of size 2048 next 215
2021-09-29 06:34:24.827517: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703ca1500 of size 2048 next 217
2021-09-29 06:34:24.827535: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703ca1d00 of size 2048 next 219
2021-09-29 06:34:24.827552: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703ca2500 of size 2048 next 221
2021-09-29 06:34:24.827570: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703ca2d00 of size 2048 next 223
2021-09-29 06:34:24.827587: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703ca3500 of size 2048 next 225
2021-09-29 06:34:24.827604: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703ca3d00 of size 2048 next 227
2021-09-29 06:34:24.827622: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703ca4500 of size 2048 next 229
2021-09-29 06:34:24.827639: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703ca4d00 of size 2048 next 231
2021-09-29 06:34:24.827656: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703ca5500 of size 2048 next 234
2021-09-29 06:34:24.827673: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703ca5d00 of size 1024 next 236
2021-09-29 06:34:24.827691: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703ca6100 of size 2048 next 238
2021-09-29 06:34:24.827708: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703ca6900 of size 2048 next 239
2021-09-29 06:34:24.827726: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703ca7100 of size 2048 next 240
2021-09-29 06:34:24.827744: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703ca7900 of size 2048 next 241
2021-09-29 06:34:24.827761: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703ca8100 of size 2048 next 242
2021-09-29 06:34:24.827778: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703ca8900 of size 2048 next 243
2021-09-29 06:34:24.827796: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703ca9100 of size 2048 next 244
2021-09-29 06:34:24.827813: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703ca9900 of size 2048 next 245
2021-09-29 06:34:24.827830: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703caa100 of size 2048 next 246
2021-09-29 06:34:24.827847: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703caa900 of size 2048 next 247
2021-09-29 06:34:24.827864: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703cab100 of size 2048 next 248
2021-09-29 06:34:24.827881: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703cab900 of size 2048 next 249
2021-09-29 06:34:24.827898: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703cac100 of size 2048 next 250
2021-09-29 06:34:24.827915: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703cac900 of size 2048 next 251
2021-09-29 06:34:24.827934: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703cad100 of size 2048 next 252
2021-09-29 06:34:24.827953: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703cad900 of size 2048 next 253
2021-09-29 06:34:24.827971: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703cae100 of size 2048 next 254
2021-09-29 06:34:24.827988: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703cae900 of size 2048 next 255
2021-09-29 06:34:24.828011: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703caf100 of size 2048 next 256
2021-09-29 06:34:24.901942: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703caf900 of size 2048 next 257
2021-09-29 06:34:24.901999: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703cb0100 of size 2048 next 258
2021-09-29 06:34:24.902042: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703cb0900 of size 2048 next 259
2021-09-29 06:34:24.902076: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703cb1100 of size 2048 next 260
2021-09-29 06:34:24.902096: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703cb1900 of size 2048 next 261
2021-09-29 06:34:24.902116: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703cb2100 of size 8192 next 267
2021-09-29 06:34:24.902137: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703cb4100 of size 8192 next 272
2021-09-29 06:34:24.902159: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703cb6100 of size 8192 next 276
2021-09-29 06:34:24.902181: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703cb8100 of size 8192 next 281
2021-09-29 06:34:24.902218: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703cba100 of size 8192 next 286
2021-09-29 06:34:24.902241: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703cbc100 of size 8192 next 291
2021-09-29 06:34:24.902264: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703cbe100 of size 2048 next 293
2021-09-29 06:34:24.902285: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703cbe900 of size 2048 next 25
2021-09-29 06:34:24.902321: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703cbf100 of size 8192 next 29
2021-09-29 06:34:24.902358: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] Free  at 703cc1100 of size 126720 next 18446744073709551615
2021-09-29 06:34:24.902399: I tensorflow/core/common_runtime/bfc_allocator.cc:1040] Next region of size 4194304
2021-09-29 06:34:24.902443: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703ce0000 of size 256 next 8
2021-09-29 06:34:24.902466: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] Free  at 703ce0100 of size 524288 next 18
2021-09-29 06:34:24.902488: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703d60100 of size 524288 next 17
2021-09-29 06:34:24.902509: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703de0100 of size 1048576 next 42
2021-09-29 06:34:24.902532: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 703ee0100 of size 2096896 next 18446744073709551615
2021-09-29 06:34:24.902553: I tensorflow/core/common_runtime/bfc_allocator.cc:1040] Next region of size 67108864
2021-09-29 06:34:24.902594: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] Free  at 704ae0000 of size 7864320 next 204
2021-09-29 06:34:24.902623: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 705260000 of size 3932160 next 203
2021-09-29 06:34:24.902645: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] Free  at 705620000 of size 28049408 next 182
2021-09-29 06:34:24.902666: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 7070e0000 of size 1048576 next 210
2021-09-29 06:34:24.902687: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 7071e0000 of size 1048576 next 212
2021-09-29 06:34:24.902708: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 7072e0000 of size 1048576 next 214
2021-09-29 06:34:24.902739: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 7073e0000 of size 1048576 next 216
2021-09-29 06:34:24.902760: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 7074e0000 of size 1048576 next 218
2021-09-29 06:34:24.902782: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 7075e0000 of size 1048576 next 220
2021-09-29 06:34:24.902802: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 7076e0000 of size 1048576 next 222
2021-09-29 06:34:24.902823: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 7077e0000 of size 1048576 next 224
2021-09-29 06:34:24.902845: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 7078e0000 of size 1048576 next 226
2021-09-29 06:34:24.902868: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 7079e0000 of size 1048576 next 228
2021-09-29 06:34:24.902889: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 707ae0000 of size 1048576 next 230
2021-09-29 06:34:24.902926: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 707be0000 of size 1048576 next 232
2021-09-29 06:34:24.902964: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 707ce0000 of size 1572864 next 233
2021-09-29 06:34:24.902997: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 707e60000 of size 2097152 next 235
2021-09-29 06:34:24.903035: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 708060000 of size 524288 next 237
2021-09-29 06:34:24.903059: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 7080e0000 of size 524288 next 262
2021-09-29 06:34:24.903082: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 708160000 of size 524288 next 263
2021-09-29 06:34:24.903105: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 7081e0000 of size 1048576 next 264
2021-09-29 06:34:24.903128: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 7082e0000 of size 1048576 next 265
2021-09-29 06:34:24.903151: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 7083e0000 of size 1048576 next 266
2021-09-29 06:34:24.903174: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 7084e0000 of size 1048576 next 268
2021-09-29 06:34:24.903197: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 7085e0000 of size 1048576 next 269
2021-09-29 06:34:24.903219: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 7086e0000 of size 1048576 next 270
2021-09-29 06:34:24.903242: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 7087e0000 of size 1048576 next 271
2021-09-29 06:34:24.903280: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 7088e0000 of size 1048576 next 273
2021-09-29 06:34:24.903318: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 7089e0000 of size 1048576 next 18446744073709551615
2021-09-29 06:34:24.903341: I tensorflow/core/common_runtime/bfc_allocator.cc:1040] Next region of size 134217728
2021-09-29 06:34:24.903364: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] Free  at 708ae0000 of size 134217728 next 18446744073709551615
2021-09-29 06:34:24.903385: I tensorflow/core/common_runtime/bfc_allocator.cc:1040] Next region of size 268435456
2021-09-29 06:34:24.903425: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] Free  at 710ae0000 of size 135266304 next 51
2021-09-29 06:34:24.903450: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 718be0000 of size 1048576 next 53
2021-09-29 06:34:24.903475: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 718ce0000 of size 1048576 next 55
2021-09-29 06:34:24.903496: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 718de0000 of size 1048576 next 60
2021-09-29 06:34:24.903519: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 718ee0000 of size 1048576 next 61
2021-09-29 06:34:24.903541: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 718fe0000 of size 1048576 next 62
2021-09-29 06:34:24.903578: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 7190e0000 of size 1048576 next 70
2021-09-29 06:34:24.903600: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 7191e0000 of size 1048576 next 63
2021-09-29 06:34:24.903622: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 7192e0000 of size 1048576 next 68
2021-09-29 06:34:24.903644: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 7193e0000 of size 1048576 next 74
2021-09-29 06:34:24.903665: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 7194e0000 of size 1048576 next 75
2021-09-29 06:34:24.903687: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 7195e0000 of size 1048576 next 76
2021-09-29 06:34:24.903708: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 7196e0000 of size 1048576 next 83
2021-09-29 06:34:24.903729: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 7197e0000 of size 1048576 next 77
2021-09-29 06:34:24.903752: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 7198e0000 of size 524288 next 296
2021-09-29 06:34:24.903773: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] Free  at 719960000 of size 524288 next 81
2021-09-29 06:34:24.903795: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 7199e0000 of size 1048576 next 87
2021-09-29 06:34:24.903816: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 719ae0000 of size 1048576 next 88
2021-09-29 06:34:24.903837: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 719be0000 of size 1048576 next 89
2021-09-29 06:34:24.903859: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 719ce0000 of size 1048576 next 96
2021-09-29 06:34:24.903881: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 719de0000 of size 1048576 next 90
2021-09-29 06:34:24.903902: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 719ee0000 of size 1048576 next 94
2021-09-29 06:34:24.903925: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 719fe0000 of size 1048576 next 100
2021-09-29 06:34:24.903946: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 71a0e0000 of size 1048576 next 101
2021-09-29 06:34:24.903967: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 71a1e0000 of size 1048576 next 102
2021-09-29 06:34:24.903989: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 71a2e0000 of size 1048576 next 109
2021-09-29 06:34:24.904022: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 71a3e0000 of size 1048576 next 103
2021-09-29 06:34:24.904062: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 71a4e0000 of size 1048576 next 107
2021-09-29 06:34:24.904085: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 71a5e0000 of size 1048576 next 113
2021-09-29 06:34:24.904107: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 71a6e0000 of size 1048576 next 114
2021-09-29 06:34:24.904130: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 71a7e0000 of size 1048576 next 115
2021-09-29 06:34:24.904153: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 71a8e0000 of size 1048576 next 122
2021-09-29 06:34:24.904176: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 71a9e0000 of size 1048576 next 116
2021-09-29 06:34:24.904199: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 71aae0000 of size 1048576 next 120
2021-09-29 06:34:24.904222: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 71abe0000 of size 1048576 next 274
2021-09-29 06:34:24.904245: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 71ace0000 of size 1048576 next 275
2021-09-29 06:34:24.904267: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 71ade0000 of size 1048576 next 277
2021-09-29 06:34:24.904290: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 71aee0000 of size 1048576 next 278
2021-09-29 06:34:24.904313: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 71afe0000 of size 1048576 next 279
2021-09-29 06:34:24.904350: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 71b0e0000 of size 1048576 next 280
2021-09-29 06:34:24.904387: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 71b1e0000 of size 1048576 next 282
2021-09-29 06:34:24.904423: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 71b2e0000 of size 1048576 next 283
2021-09-29 06:34:24.904449: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 71b3e0000 of size 1048576 next 284
2021-09-29 06:34:24.904473: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 71b4e0000 of size 1048576 next 285
2021-09-29 06:34:24.904494: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 71b5e0000 of size 1048576 next 287
2021-09-29 06:34:24.904516: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 71b6e0000 of size 1048576 next 288
2021-09-29 06:34:24.904538: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 71b7e0000 of size 1048576 next 289
2021-09-29 06:34:24.904559: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 71b8e0000 of size 1048576 next 290
2021-09-29 06:34:24.904581: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 71b9e0000 of size 1048576 next 292
2021-09-29 06:34:24.904603: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 71bae0000 of size 1048576 next 294
2021-09-29 06:34:24.904625: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 71bbe0000 of size 1572864 next 295
2021-09-29 06:34:24.904647: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 71bd60000 of size 2097152 next 125
2021-09-29 06:34:24.904669: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 71bf60000 of size 1048576 next 37
2021-09-29 06:34:24.904691: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 71c060000 of size 1048576 next 32
2021-09-29 06:34:24.904712: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] Free  at 71c160000 of size 77070336 next 18446744073709551615
2021-09-29 06:34:24.904759: I tensorflow/core/common_runtime/bfc_allocator.cc:1040] Next region of size 268435456
2021-09-29 06:34:24.904795: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 720ae0000 of size 524288 next 35
2021-09-29 06:34:24.904818: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] Free  at 720b60000 of size 1048576 next 23
2021-09-29 06:34:24.904840: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 720c60000 of size 1048576 next 124
2021-09-29 06:34:24.904861: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 720d60000 of size 1048576 next 129
2021-09-29 06:34:24.904882: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] Free  at 720e60000 of size 264765440 next 18446744073709551615
2021-09-29 06:34:24.904903: I tensorflow/core/common_runtime/bfc_allocator.cc:1040] Next region of size 536870912
2021-09-29 06:34:24.904925: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 730ae0000 of size 335544320 next 205
2021-09-29 06:34:24.904947: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] Free  at 744ae0000 of size 201326592 next 18446744073709551615
2021-09-29 06:34:24.904969: I tensorflow/core/common_runtime/bfc_allocator.cc:1040] Next region of size 1073741824
2021-09-29 06:34:24.904991: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 750ae0000 of size 671088640 next 202
2021-09-29 06:34:24.905024: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] Free  at 778ae0000 of size 402653184 next 18446744073709551615
2021-09-29 06:34:24.905047: I tensorflow/core/common_runtime/bfc_allocator.cc:1040] Next region of size 2147483648
2021-09-29 06:34:24.905069: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] Free  at 7914e0000 of size 671088640 next 200
2021-09-29 06:34:24.905091: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 7b94e0000 of size 671088640 next 201
2021-09-29 06:34:24.905112: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 7e14e0000 of size 671088640 next 198
2021-09-29 06:34:24.905133: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] Free  at 8094e0000 of size 134217728 next 18446744073709551615
2021-09-29 06:34:24.905154: I tensorflow/core/common_runtime/bfc_allocator.cc:1040] Next region of size 4294967296
2021-09-29 06:34:24.905176: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 8114e0000 of size 671088640 next 197
2021-09-29 06:34:24.905212: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 8394e0000 of size 671088640 next 196
2021-09-29 06:34:24.905234: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 8614e0000 of size 671088640 next 195
2021-09-29 06:34:24.905258: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 8894e0000 of size 1342177280 next 193
2021-09-29 06:34:24.905280: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] Free  at 8d94e0000 of size 939524096 next 18446744073709551615
2021-09-29 06:34:24.905301: I tensorflow/core/common_runtime/bfc_allocator.cc:1040] Next region of size 2547712000
2021-09-29 06:34:24.905323: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] InUse at 9114e0000 of size 1342177280 next 191
2021-09-29 06:34:24.905346: I tensorflow/core/common_runtime/bfc_allocator.cc:1060] Free  at 9614e0000 of size 1205534720 next 18446744073709551615
2021-09-29 06:34:24.905368: I tensorflow/core/common_runtime/bfc_allocator.cc:1065]      Summary of in-use Chunks by size: 
2021-09-29 06:34:24.905426: I tensorflow/core/common_runtime/bfc_allocator.cc:1068] 25 Chunks of size 256 totalling 6.2KiB
2021-09-29 06:34:24.905453: I tensorflow/core/common_runtime/bfc_allocator.cc:1068] 2 Chunks of size 1024 totalling 2.0KiB
2021-09-29 06:34:24.905479: I tensorflow/core/common_runtime/bfc_allocator.cc:1068] 1 Chunks of size 1280 totalling 1.2KiB
2021-09-29 06:34:24.905504: I tensorflow/core/common_runtime/bfc_allocator.cc:1068] 73 Chunks of size 2048 totalling 146.0KiB
2021-09-29 06:34:24.905529: I tensorflow/core/common_runtime/bfc_allocator.cc:1068] 1 Chunks of size 3072 totalling 3.0KiB
2021-09-29 06:34:24.905553: I tensorflow/core/common_runtime/bfc_allocator.cc:1068] 12 Chunks of size 8192 totalling 96.0KiB
2021-09-29 06:34:24.905577: I tensorflow/core/common_runtime/bfc_allocator.cc:1068] 6 Chunks of size 524288 totalling 3.00MiB
2021-09-29 06:34:24.905601: I tensorflow/core/common_runtime/bfc_allocator.cc:1068] 2 Chunks of size 786432 totalling 1.50MiB
2021-09-29 06:34:24.905625: I tensorflow/core/common_runtime/bfc_allocator.cc:1068] 73 Chunks of size 1048576 totalling 73.00MiB
2021-09-29 06:34:24.905648: I tensorflow/core/common_runtime/bfc_allocator.cc:1068] 2 Chunks of size 1572864 totalling 3.00MiB
2021-09-29 06:34:24.905671: I tensorflow/core/common_runtime/bfc_allocator.cc:1068] 1 Chunks of size 2096896 totalling 2.00MiB
2021-09-29 06:34:24.905695: I tensorflow/core/common_runtime/bfc_allocator.cc:1068] 2 Chunks of size 2097152 totalling 4.00MiB
2021-09-29 06:34:24.905717: I tensorflow/core/common_runtime/bfc_allocator.cc:1068] 1 Chunks of size 3932160 totalling 3.75MiB
2021-09-29 06:34:24.905742: I tensorflow/core/common_runtime/bfc_allocator.cc:1068] 1 Chunks of size 335544320 totalling 320.00MiB
2021-09-29 06:34:24.905766: I tensorflow/core/common_runtime/bfc_allocator.cc:1068] 6 Chunks of size 671088640 totalling 3.75GiB
2021-09-29 06:34:24.905789: I tensorflow/core/common_runtime/bfc_allocator.cc:1068] 2 Chunks of size 1342177280 totalling 2.50GiB
2021-09-29 06:34:24.905812: I tensorflow/core/common_runtime/bfc_allocator.cc:1072] Sum Total of in-use chunks: 6.65GiB
2021-09-29 06:34:24.905834: I tensorflow/core/common_runtime/bfc_allocator.cc:1074] total_region_allocated_bytes_: 11345264640 memory_limit_: 11345264640 available bytes: 0 curr_region_allocation_bytes_: 17179869184
2021-09-29 06:34:24.905892: I tensorflow/core/common_runtime/bfc_allocator.cc:1080] Stats: 
Limit:                     11345264640
InUse:                      7141325056
MaxInUse:                   8483502336
NumAllocs:                        1162
MaxAllocSize:               1342177280
Reserved:                            0
PeakReserved:                        0
LargestFreeBlock:                    0

2021-09-29 06:34:24.905930: W tensorflow/core/common_runtime/bfc_allocator.cc:468] *__**_****_*******________********************************************_______*************__________
2021-09-29 06:34:24.905992: W tensorflow/core/framework/op_kernel.cc:1692] OP_REQUIRES failed at softmax_op_gpu.cu.cc:219 : Resource exhausted: OOM when allocating tensor with shape[5,256,256,4,256] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
Traceback (most recent call last):
  File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/content/drive/MyDrive/Colab_Work/HONORS/ColorTrans/coltran_pretrained/google-research/coltran/custom_colorize.py", line 244, in <module>
    app.run(main)
  File "/usr/local/lib/python3.7/dist-packages/absl/app.py", line 300, in run
    _run_main(main, args)
  File "/usr/local/lib/python3.7/dist-packages/absl/app.py", line 251, in _run_main
    sys.exit(main(argv))
  File "/content/drive/MyDrive/Colab_Work/HONORS/ColorTrans/coltran_pretrained/google-research/coltran/custom_colorize.py", line 227, in main
    out = model.sample(gray_cond=gray, inputs=prev_gen, mode='argmax')
  File "/content/drive/MyDrive/Colab_Work/HONORS/ColorTrans/coltran_pretrained/google-research/coltran/models/upsampler.py", line 254, in sample
    logits = self.upsampler(inputs, gray_cond, training=False)
  File "/content/drive/MyDrive/Colab_Work/HONORS/ColorTrans/coltran_pretrained/google-research/coltran/models/upsampler.py", line 245, in upsampler
    context = self.encoder(channel, training=training)
  File "/usr/local/lib/python3.7/dist-packages/keras/engine/base_layer.py", line 1037, in __call__
    outputs = call_fn(inputs, *args, **kwargs)
  File "/content/drive/MyDrive/Colab_Work/HONORS/ColorTrans/coltran_pretrained/google-research/coltran/models/layers.py", line 669, in call
    output = layer(inputs)
  File "/usr/local/lib/python3.7/dist-packages/keras/engine/base_layer.py", line 1037, in __call__
    outputs = call_fn(inputs, *args, **kwargs)
  File "/content/drive/MyDrive/Colab_Work/HONORS/ColorTrans/coltran_pretrained/google-research/coltran/models/layers.py", line 612, in call
    weights = tf.nn.softmax(alphas)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/util/dispatch.py", line 206, in wrapper
    return target(*args, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/ops/nn_ops.py", line 3820, in softmax_v2
    return _wrap_2d_function(logits, gen_nn_ops.softmax, axis, name)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/ops/nn_ops.py", line 3739, in _wrap_2d_function
    return compute_op(inputs, name=name)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/ops/gen_nn_ops.py", line 10864, in softmax
    _ops.raise_from_not_ok_status(e, name)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/ops.py", line 6941, in raise_from_not_ok_status
    six.raise_from(core._status_to_exception(e.code, message), None)
  File "<string>", line 3, in raise_from
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[5,256,256,4,256] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc [Op:Softmax]
CPU times: user 370 ms, sys: 82.2 ms, total: 453 ms
Wall time: 1min

Please guide me on how to solve this.

@MechCoder
Contributor

I hardcoded the batch size for the spatial upsampler to 5 here. Could you change it and see if it works? (https://github.com/google-research/google-research/blob/master/coltran/custom_colorize.py#L168)

You could also set the --batch_size flag instead.
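
The shape in the error message shows why the hardcoded batch size matters. Here is a rough check, assuming 4 bytes per float32 element; the helper `tensor_bytes` is hypothetical and only for illustration, it is not part of ColTran or TensorFlow:

```python
from math import prod

def tensor_bytes(shape, bytes_per_element=4):
    """Size in bytes of a dense tensor (float32 = 4 bytes per element)."""
    return prod(shape) * bytes_per_element

# Shape reported by the failing Softmax op; the leading 5 is the batch size.
batch5 = tensor_bytes([5, 256, 256, 4, 256])
# Same tensor with the batch size reduced to 1.
batch1 = tensor_bytes([1, 256, 256, 4, 256])

print(batch5, batch5 / 2**30)  # size in bytes, then in GiB
print(batch1, batch1 / 2**20)  # size in bytes, then in MiB
```

With batch size 5 this single tensor comes to 1342177280 bytes (1.25 GiB), which is the same number as MaxAllocSize in the allocator stats above. With batch size 1 it drops to 256 MiB.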

@MechCoder
Contributor

Did this fix your issue? Thanks!

@ketan-lambat
Author

Yes, I missed updating it here.
Thank you.

@haiderasad

haiderasad commented Mar 13, 2022

Hi, I am having the same problem with the spatial upsampler. It gives an OOM error, so I have to run it on the CPU by uncommenting line 6 of custom_colorize. I get the following error on the GPU:

OOM when allocating tensor with shape[1,256,256,4,256] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc [Op:Softmax]

2022-03-13 23:04:55.443444: W tensorflow/core/common_runtime/bfc_allocator.cc:462] Allocator (GPU_0_bfc) ran out of memory trying to allocate 256.00MiB (rounded to 268435456)requested by op Softmax

I did try batch size 1 in custom_colorize, but I still get the same error.

@MechCoder
Contributor

MechCoder commented Mar 14, 2022

@haiderasad Can you try the hack mentioned in this blogpost? (https://habr.com/ru/company/ruvds/blog/563858/)

from tensorflow.keras import mixed_precision
mixed_precision.set_global_policy('mixed_float16')

You can save some memory by changing from float32 to float16.
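
As a rough estimate of what that could save, assume the softmax tensor itself ends up stored in float16. Whether it does depends on how the model code casts its tensors, which is not confirmed in this thread, so treat the result as an upper bound on the saving for this one tensor:

```python
import struct
from math import prod

shape = [1, 256, 256, 4, 256]  # shape from the batch-size-1 error above

float32_size = struct.calcsize("f")  # bytes per single-precision float
float16_size = struct.calcsize("e")  # bytes per IEEE half-precision float

elements = prod(shape)
mib32 = elements * float32_size / 2**20
mib16 = elements * float16_size / 2**20
print(f"float32: {mib32} MiB, float16: {mib16} MiB")
```

Under that assumption the 256 MiB allocation from the error would shrink to 128 MiB.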
