
Terminate called after throwing an instance of 'std::bad_alloc' #28

Closed
nghiaht opened this issue Apr 22, 2020 · 4 comments

Comments

@nghiaht

nghiaht commented Apr 22, 2020

Hi, I'm using Docker to build the repo and expose it on port 5000.
I used samples/test_examples/low_resolution/astronaut.png as a test image and POSTed it to /model/predict.

The Docker container then stopped, showing the following logs:

2020-04-22 10:10:28.347782: W tensorflow/core/framework/cpu_allocator_impl.cc:81] Allocation of 450785280 exceeds 10% of system memory.
2020-04-22 10:10:30.802358: W tensorflow/core/framework/cpu_allocator_impl.cc:81] Allocation of 450785280 exceeds 10% of system memory.
2020-04-22 10:10:31.210874: W tensorflow/core/framework/cpu_allocator_impl.cc:81] Allocation of 450785280 exceeds 10% of system memory.
2020-04-22 10:10:31.210834: W tensorflow/core/framework/cpu_allocator_impl.cc:81] Allocation of 450785280 exceeds 10% of system memory.
terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
Aborted (core dumped)

My memory info:

free -m
              total        used        free      shared  buff/cache   available
Mem:           2000         301        1228           0         470        1533
Swap:          1952         165        1787
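For scale, each of the warnings above reports a single allocation request of roughly 430 MB, which is a large share of the ~1.5 GB the container has available:

```python
# Each "Allocation of 450785280" warning in the log is one tensor buffer request.
alloc_bytes = 450785280
print(f"{alloc_bytes / 1024**2:.0f} MB per allocation")
```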

I searched around and tried setting TensorFlow's logging level, with no improvement. I also read about adjusting the batch size, but I don't know whether it can be adjusted in max-image-resolution-enhancer.

Please give me suggestions.
Thanks for your help!

@xuhdev
Contributor

xuhdev commented Apr 22, 2020

Looks like you have used up all your memory: Allocation of 450785280 exceeds 10% of system memory. Perhaps @feihugis knows how to reduce memory usage in TensorFlow?

@feihugis

feihugis commented Apr 23, 2020

@nghiaht As mentioned here, 2GB of memory will not be enough for this model. Could you try increasing the Docker memory limit as described here?

@nghiaht
Author

nghiaht commented Apr 24, 2020

> @nghiaht As mentioned here, 2GB of memory will not be enough for this model. Could you try increasing the Docker memory limit as described here?

Thanks for your reply. I knew that, by default, a container has no resource constraints and can use as much of a given resource as the host's kernel scheduler allows.
I increased the memory limit from 2GB to 4GB and the endpoint became usable, with fewer crashes than at 2GB.
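For reference, the memory limit can be set directly on `docker run`. This is a hypothetical invocation; the image name and port mapping are assumptions to be adapted to your build:

```shell
# -m / --memory caps the container's RAM; --memory-swap caps RAM + swap.
# Replace max-image-resolution-enhancer with the image name you actually built.
docker run -it -p 5000:5000 -m 4g --memory-swap 4g max-image-resolution-enhancer
```

Setting `--memory-swap` equal to `-m` disables swap inside the container, so an out-of-memory condition fails fast instead of thrashing.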

I also reverted to 2GB and tried adjusting TensorFlow's settings:

max-image-resolution-enhancer/core/srgan_controller.py - Line 63

import tensorflow as tf

config = tf.ConfigProto()
config.gpu_options.allow_growth = True    # allocate GPU memory on demand instead of all at once
config.inter_op_parallelism_threads = 1   # run independent ops one at a time
config.intra_op_parallelism_threads = 1   # use a single thread within each op

The defaults were 4 and 5; I just googled around and adjusted them. Luckily the endpoint now works and can upscale images without crashing, though slowly.

To summarize:

  • Have more memory available.
  • OR limit TensorFlow's thread settings (in my case, the production VM has limited resources).

@nghiaht nghiaht closed this as completed Apr 24, 2020
@feihugis

@nghiaht Thanks for letting us know your solution. Glad the issue is resolved.
