
Terminate called after throwing an instance of 'std::bad_alloc' #28

Closed
nghiaht opened this issue Apr 22, 2020 · 4 comments

Comments

@nghiaht

nghiaht commented Apr 22, 2020

Hi, I'm using Docker to build the repo and expose it on port 5000.
I used samples/test_examples/low_resolution/astronaut.png as a test image and POSTed it to /model/predict.

The Docker container then stopped, showing the following logs:

2020-04-22 10:10:28.347782: W tensorflow/core/framework/cpu_allocator_impl.cc:81] Allocation of 450785280 exceeds 10% of system memory.
2020-04-22 10:10:30.802358: W tensorflow/core/framework/cpu_allocator_impl.cc:81] Allocation of 450785280 exceeds 10% of system memory.
2020-04-22 10:10:31.210874: W tensorflow/core/framework/cpu_allocator_impl.cc:81] Allocation of 450785280 exceeds 10% of system memory.
2020-04-22 10:10:31.210834: W tensorflow/core/framework/cpu_allocator_impl.cc:81] Allocation of 450785280 exceeds 10% of system memory.
terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
Aborted (core dumped)

My memory info:

free -m
              total        used        free      shared  buff/cache   available
Mem:           2000         301        1228           0         470        1533
Swap:          1952         165        1787
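For scale, each of the warnings above reports a single allocation request of roughly 430 MB, which is a large share of the ~1.5 GB the container has available:

```python
# Each "Allocation of 450785280" warning in the log is one tensor buffer request.
alloc_bytes = 450785280
print(f"{alloc_bytes / 1024**2:.0f} MB per allocation")
```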

I searched around and tried setting TensorFlow's logging level, with no improvement. I also read about adjusting the batch size, but I don't know whether it can be adjusted in max-image-resolution-enhancer.

Please give me suggestions.
Thanks for your help!

@xuhdev
Contributor

xuhdev commented Apr 22, 2020

Looks like you have used up all your memory: Allocation of 450785280 exceeds 10% of system memory. Perhaps @feihugis knows how to reduce memory usage in TensorFlow?

@feihugis

feihugis commented Apr 23, 2020

@nghiaht As mentioned here, 2GB of memory will not be enough for this model. Could you try increasing the Docker memory limit as described here?

@nghiaht
Author

nghiaht commented Apr 24, 2020

> @nghiaht As mentioned here, 2GB of memory will not be enough for this model. Could you try increasing the Docker memory limit as described here?

Thanks for your reply. I knew that, by default, a container has no resource constraints and can use as much of a given resource as the host's kernel scheduler allows.
I increased the memory limit from 2GB to 4GB and the endpoint became usable, with fewer crashes than at 2GB.
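For reference, the memory limit can be set directly on `docker run`. This is a hypothetical invocation; the image name and port mapping are assumptions to be adapted to your build:

```shell
# -m / --memory caps the container's RAM; --memory-swap caps RAM + swap.
# Replace max-image-resolution-enhancer with the image name you actually built.
docker run -it -p 5000:5000 -m 4g --memory-swap 4g max-image-resolution-enhancer
```

Setting `--memory-swap` equal to `-m` disables swap inside the container, so an out-of-memory condition fails fast instead of thrashing.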

I also reverted to 2GB and tried adjusting TensorFlow's settings:

max-image-resolution-enhancer/core/srgan_controller.py - Line 63

import tensorflow as tf

config = tf.ConfigProto()
config.gpu_options.allow_growth = True    # allocate GPU memory on demand instead of all at once
config.inter_op_parallelism_threads = 1   # run independent ops one at a time
config.intra_op_parallelism_threads = 1   # use a single thread within each op

The defaults were 4 and 5; I just googled around and adjusted them. Luckily the endpoint now works and can upscale images without crashing, though slowly.

To summarize:

  • Have more memory available.
  • OR limit TensorFlow's thread settings (in my case, the production VM has limited resources).

@nghiaht nghiaht closed this as completed Apr 24, 2020
@feihugis

@nghiaht Thanks for letting us know your solution. Glad the issue is resolved.
