Improve ncnn memory estimation #2352
If you're able to figure something out, that knowledge could also be used to improve VRAM estimation for PyTorch, since they should have the same kind of VRAM usage. Very interested to see what comes of this, if anything.
So this is basically what I was doing, aside from model size. My process was to generate synthetic ESRGAN models at 1x, 2x, 4x, and 8x scale, and permute those scales with varying nf (number of convolution filters, which pertains to layer weight), nb (number of blocks, basically how many times the model runs through a sequence of layers), and image size. If I recall correctly, I also directly used maximum layer weight (as in, the largest weight for a single convolution layer in the model). Scale, nf, nb, and layer weight were all taken from the param file. I then generated a series of plots of VRAM usage against each of these variables to look for trends/effects.
What I found, as I recall, was that nb was irrelevant, while nf, scale, and image size each had an effect, and I believe there was an interaction effect between some or all of those three as well. The problem was that my GPU is only 8GB, so my data was incomplete in a way that made it impossible to fully extrapolate the trends. At higher scales and image sizes, I simply could not process the test images. That also doesn't get into the potential differences between model arches, since this was really a test of very basic convolution models. This is why it would be valuable to have these tests rerun by someone with a ton of memory to work with.
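For anyone rerunning those tests on a larger GPU, a minimal sketch of that kind of measurement loop might look like the following. The toy model, parameter ranges, and helper names are hypothetical stand-ins (a plain conv stack with upsampling, not a real ESRGAN network or chaiNNer code), and peak memory is read from PyTorch's allocator stats:

```python
import itertools
import torch
import torch.nn as nn

def make_toy_model(nf: int, nb: int, scale: int) -> nn.Module:
    """Very rough stand-in for an nf/nb/scale-parameterised SR model (not real ESRGAN)."""
    layers = [nn.Conv2d(3, nf, 3, padding=1)]
    layers += [nn.Conv2d(nf, nf, 3, padding=1) for _ in range(nb)]
    layers += [
        nn.Upsample(scale_factor=scale, mode="nearest"),  # upconv-style upscaling
        nn.Conv2d(nf, nf, 3, padding=1),
        nn.Conv2d(nf, 3, 3, padding=1),
    ]
    return nn.Sequential(*layers)

def peak_vram_bytes(model: nn.Module, size: int) -> int:
    """Run one inference on a size x size image and return peak allocated VRAM."""
    device = torch.device("cuda")
    model = model.to(device).eval()
    x = torch.randn(1, 3, size, size, device=device)
    torch.cuda.reset_peak_memory_stats(device)
    with torch.no_grad():
        model(x)
    torch.cuda.synchronize(device)
    return torch.cuda.max_memory_allocated(device)

results = []
for nf, nb, scale, size in itertools.product((32, 64), (6, 23), (1, 2, 4, 8), (128, 256, 512)):
    try:
        vram = peak_vram_bytes(make_toy_model(nf, nb, scale), size)
    except torch.cuda.OutOfMemoryError:
        vram = None  # the "incomplete data" case described above: the point simply doesn't fit
    torch.cuda.empty_cache()
    results.append((nf, nb, scale, size, vram))

for row in results:
    print(row)
```

Plotting VRAM against each variable (and pairs of variables) from `results` would reproduce the kind of trend/interaction analysis described above.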
Testing with something like Compact, or maybe even something simpler, would I think let you see a much clearer trend. Compact is a really simple arch by comparison, but it also doesn't use upconv like ESRGAN, it uses pixelshuffle, so the VRAM usage when upscaling is going to be different. Anyway, I think what matters most is the max tensor size of the model, given the specific image. I don't know how to verify that, though.
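One rough way to put a number on that idea is to estimate the largest single activation tensor a Compact-style model produces for a given input and compare it against measured VRAM. The formula below is a back-of-the-envelope assumption (fp32 activations, ignoring weights, workspace buffers, and allocator overhead), not verified against ncnn's actual behaviour:

```python
# Largest activation for a Compact-style (SRVGGNet-like) model: either the
# nf-channel feature maps in the conv body, or the (out_nc * scale^2)-channel
# tensor right before the pixel shuffle. Purely illustrative.
def max_activation_bytes(h: int, w: int, nf: int, scale: int,
                         out_nc: int = 3, bytes_per_elem: int = 4) -> int:
    body_tensor = nf * h * w                      # feature maps inside the conv body
    pre_shuffle = out_nc * scale * scale * h * w  # last conv output, before PixelShuffle
    return max(body_tensor, pre_shuffle) * bytes_per_elem

# e.g. a 512x512 input through a 64-filter, 4x Compact-like model:
print(max_activation_bytes(512, 512, nf=64, scale=4) / 2**20, "MiB")
```

If measured VRAM tracks this estimate (plus a roughly constant overhead) across image sizes and nf values, that would support the "max tensor size is what matters" hypothesis.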
There are a bunch of configurable options in ncnn that affect memory usage. Winograd convolution is the main one: it makes things much faster but also uses more memory. It would be nice to know exactly what the impact of each of those is.
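To measure that, something along these lines could toggle the relevant options and compare peak usage per run. This uses the ncnn Python bindings, and the option fields mirror `ncnn::Option`; treat the exact names and their memory impact as assumptions to verify against the ncnn version actually in use:

```python
import ncnn

def load_net(param_path: str, bin_path: str, winograd: bool) -> ncnn.Net:
    """Load a model with explicit memory-relevant options (sketch, not chaiNNer code)."""
    net = ncnn.Net()
    net.opt.use_winograd_convolution = winograd  # main suspect for extra memory use
    net.opt.use_sgemm_convolution = True         # im2col+GEMM path, has its own workspace cost
    net.opt.use_fp16_storage = True              # halves blob/weight storage where supported
    net.load_param(param_path)
    net.load_model(bin_path)
    return net

# Run the same image through load_net(..., winograd=True) and
# load_net(..., winograd=False) and compare peak memory for each.
```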
Splitting out from #2070 since this deserves its own issue.
Quoting @theflyingzamboni:
I'm hoping that there may be some way to parse the ncnn parameter file in a way that reveals the memory usage, so that we don't need to use heuristics and black-box RE as described above; I haven't investigated this yet.
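A first pass at that could be a small parser over the plain-text .param format, just to get per-layer types and parameters into Python for an analytical estimate. The format assumptions in the comments (magic number, then layer/blob counts, then one layer per line) reflect my understanding of ncnn's param files and should be double-checked against the ncnn wiki before building on this:

```python
# Rough sketch: parse an ncnn .param file into a list of layer dicts.
# Assumed line format: "type name num_inputs num_outputs in_blobs... out_blobs... k=v k=v ..."
def parse_param_file(path: str):
    with open(path, "r", encoding="utf-8") as f:
        lines = [ln.strip() for ln in f if ln.strip()]
    assert lines[0] == "7767517", "not a plain-text ncnn param file"
    layer_count, blob_count = map(int, lines[1].split())
    layers = []
    for line in lines[2:2 + layer_count]:
        fields = line.split()
        layer_type, name = fields[0], fields[1]
        num_in, num_out = int(fields[2]), int(fields[3])
        blobs = fields[4:4 + num_in + num_out]
        params = dict(f.split("=", 1) for f in fields[4 + num_in + num_out:])
        layers.append({"type": layer_type, "name": name,
                       "inputs": blobs[:num_in], "outputs": blobs[num_in:],
                       "params": params})
    return layers

# e.g. list each Convolution layer's num_output (param id 0), which corresponds to nf:
for layer in parse_param_file("model.param"):
    if layer["type"] == "Convolution":
        print(layer["name"], "num_output =", layer["params"].get("0"))
```

That only exposes what the param file declares (layer types, filter counts, kernel sizes); turning it into a memory estimate still requires knowing how blob shapes propagate for a given input size, which is where the empirical testing above comes in.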