
Is it possible to run inference on GPU with 1GB DRAM? #2

Closed
alexlyzhov opened this issue May 1, 2017 · 1 comment

@alexlyzhov

I am still able to run FlowNet2-CS, but FlowNet2-CSS and FlowNet2 fail with "Check failed: error == cudaSuccess (2 vs. 0) out of memory". When I query free memory with cudaMemGetInfo(), I see 950 MB free before running run-flownet.py, and the FlowNet2 weights occupy only 650 MB.

Might it still be possible to fit the model into memory with some tricks?
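
For reference, a minimal sketch of the same free-memory check from Python, assuming PyCUDA is installed (cudaMemGetInfo() is the underlying CUDA C call):

```python
# Minimal sketch: query free/total GPU memory, assuming PyCUDA is
# available. cudaMemGetInfo() is the equivalent CUDA C API call.
import pycuda.autoinit          # creates a CUDA context on the default device
import pycuda.driver as cuda

free_b, total_b = cuda.mem_get_info()
print("free: %d MB / total: %d MB" % (free_b // 2**20, total_b // 2**20))
```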

@nikolausmayer
Contributor

Hi nikkou,

The straightforward way to use less memory is to feed in smaller images, but that is probably not what you want. It is possible to reduce the raw memory requirements of the larger networks, but it is not very easy.

  • You could run the network layer-by-layer instead of the full network at once. This requires some familiarity with Caffe and some scripting (set up a tiny one-layer network, load that layer's weights, feed in the input, save the output to disk, continue with the next layer, and so on); see the sketch after this list. It will also be slower.

  • You could recompile Caffe to use a lower-precision float representation (e.g. FP16). Accuracy of the results will suffer, but possibly not by much. Note that this is certainly more work than the first approach. It might not save enough memory for the full FlowNet2.
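
A rough sketch of the first (layer-by-layer) approach with pycaffe. Everything specific here is an assumption rather than something this repo provides: the full prototxt is presumed split into small per-stage files whose layer names match the original (so copy_from() picks up the right weights), each stage is presumed to expose one input blob named 'data' and a single output blob, and all file names are hypothetical.

```python
# Rough sketch of layer-by-layer inference with pycaffe, under the
# assumptions stated above. Only one small stage lives on the GPU at
# a time; intermediate activations are staged through disk.
import numpy as np
import caffe

caffe.set_mode_gpu()

WEIGHTS = 'FlowNet2_weights.caffemodel'               # hypothetical path
STAGES = ['stage_00.prototxt', 'stage_01.prototxt']   # one file per chunk

# Seed input: a preprocessed blob saved to disk beforehand (placeholder here).
np.save('blob.npy', np.zeros((1, 3, 384, 512), dtype=np.float32))

for proto in STAGES:
    net = caffe.Net(proto, caffe.TEST)
    net.copy_from(WEIGHTS)        # Caffe copies weights by matching layer names
    blob = np.load('blob.npy')
    net.blobs['data'].reshape(*blob.shape)
    net.reshape()                 # propagate the new input shape
    net.blobs['data'].data[...] = blob
    net.forward()
    out = net.outputs[0]          # assumes a single output blob per stage
    np.save('blob.npy', net.blobs[out].data)
    del net                       # drop the net so its GPU memory is released

flow = np.load('blob.npy')        # output of the final stage
```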

Best,
Nikolaus
