
Improve ncnn memory estimation #2352

Open
JeremyRand opened this issue Nov 29, 2023 · 4 comments

Comments

@JeremyRand
Contributor

Splitting out from #2070 since this deserves its own issue.

Quoting @theflyingzamboni:

Since you have so much RAM, I'm wondering if you'd be willing to run some tests. I've done some testing on NCNN VRAM estimation in the past (though only with ESRGAN), and I can already tell you that this isn't really correct, and neither is what we already have in place. I've only got 8GB VRAM, so I could never extend my tests far enough to gather fully complete data.

What I can say is that:

  1. Model size is not strongly correlated with VRAM usage. This is because a model can be made larger simply by performing more convolutions, which does not matter for VRAM usage because they are done in sequence. The only thing model size definitely correlates with is how much VRAM it takes to store the model itself.
  2. Individual weight sizes have a correlation with VRAM usage when running a model.
  3. Scale needs to be accounted for, which our estimation does not currently do. This estimation is based on a 4x scale, but an 8x model will blow past it.

I abandoned this back in the day because there seemed to be further factors I couldn't account for with the data I had, but maybe we can finally figure it out. Unfortunately, I seem to have deleted the set of different scale/nf/nb ESRGAN models I generated for these tests. If I can remember how I generated them all, I could send them to you.
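
To put rough numbers on point 3 above: the output tensor (and the last upsampled feature maps) grow with the square of the scale factor, so a heuristic calibrated at 4x will badly underestimate 8x. A quick back-of-the-envelope sketch, where RGB channels and fp16 storage are illustrative assumptions:

```python
# Why scale must be in the estimate: output tensor size grows
# quadratically with the scale factor. RGB channels and fp16
# storage are assumptions for illustration only.

def output_tensor_bytes(h, w, scale, channels=3, bytes_per_elem=2):
    """Bytes needed to hold the final output tensor of an upscaler."""
    return (h * scale) * (w * scale) * channels * bytes_per_elem

for scale in (2, 4, 8):
    mib = output_tensor_bytes(512, 512, scale) / 2**20
    print(f"{scale}x on a 512x512 input: ~{mib:.0f} MiB for the output alone")
# 4x -> ~24 MiB, 8x -> ~96 MiB: an estimate calibrated at 4x
# underestimates 8x by roughly a factor of four.
```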

I'm hoping that there may be some way to parse the ncnn parameter file in a way that reveals the memory usage, so that we don't need to rely on heuristics and black-box reverse engineering as described above; I haven't investigated this yet.
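
For reference, the .param text format is simple enough to parse directly: a magic number (7767517), a layer/blob count line, then one line per layer with its type, name, blob names, and id=value parameters. A minimal sketch, where "model.param" is a placeholder path (id 6 is weight_data_size for Convolution layers per the ncnn operation docs):

```python
# A minimal sketch of parsing an ncnn .param file for the values a
# heuristic would need. Each layer line looks like:
#   <type> <name> <n_in> <n_out> <input blobs...> <output blobs...> <k=v ...>

def parse_ncnn_param(path):
    with open(path) as f:
        rows = [line.split() for line in f if line.strip()]
    assert rows[0] == ["7767517"], "not an ncnn .param file"
    layer_count, blob_count = map(int, rows[1])
    layers = []
    for tok in rows[2:2 + layer_count]:
        layer_type, name = tok[0], tok[1]
        n_in, n_out = int(tok[2]), int(tok[3])
        blobs = tok[4:4 + n_in + n_out]
        # Layer-specific params are id=value pairs; array params
        # (negative ids) are kept as raw strings here.
        params = dict(kv.split("=", 1) for kv in tok[4 + n_in + n_out:])
        layers.append((layer_type, name, blobs[:n_in], blobs[n_in:], params))
    return layers

# e.g. the largest single Convolution weight, as used in the tests below
# (id 6 is weight_data_size for Convolution layers):
layers = parse_ncnn_param("model.param")
conv_weights = [int(p["6"]) for t, _, _, _, p in layers
                if t == "Convolution" and "6" in p]
print("largest conv weight (elements):", max(conv_weights))
```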

@joeyballentine
Member

If you're able to figure something out, that knowledge could also be used to improve VRAM estimation for PyTorch, since the two should have the same kind of VRAM usage. Very interested to see what comes of this, if anything.

@theflyingzamboni
Collaborator

theflyingzamboni commented Nov 29, 2023

I'm hoping that there may be some way to parse the ncnn parameter file in a way that reveals the memory usage, so that we don't need to rely on heuristics and black-box reverse engineering as described above; I haven't investigated this yet.

So this is basically what I was doing, aside from model size. My process was to generate synthetic ESRGAN models at 1x, 2x, 4x, and 8x scale, and permute those scales with varying nf (number of convolution filters, which determines the size of each layer's weights), nb (number of blocks, basically how many times the model runs through a sequence of layers), and image size. If I recall correctly, I also directly used maximum layer weight (as in, the largest weight for a single convolution layer in the model). Scale, nf, nb, and layer weight were all taken from the param file.

I then generated a series of plots against VRAM usage for each of these variables to look for trends/effects. What I found, as I recall, was that nb was irrelevant, while nf, scale, and image size each had an effect, and I believe there was an interaction effect between some or all of those three as well. The problem was that my GPU only has 8GB, so my data was incomplete in a way that made it impossible to fully extrapolate the trends. At higher scales and image sizes, I simply could not process the test images. That also doesn't get into the potential differences between model arches, since this was really a test of very basic convolution models. This is why it would be valuable to have those tests rerun by someone with a ton of memory to work with.
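
For anyone rerunning this, the fitting step could look something like the sketch below. The data rows are fake placeholders; real values would come from the rerun measurements, and you'd want far more rows than coefficients:

```python
# A hedged sketch of the curve-fitting step described above: fit peak
# VRAM against nf, scale, and input pixel count with interaction terms.
import numpy as np

# columns: nf, scale, input pixels, measured peak VRAM (MiB) -- fake rows
runs = np.array([
    [32.0, 2.0, 512 * 512,  900.0],
    [64.0, 4.0, 512 * 512, 2100.0],
    [64.0, 8.0, 256 * 256, 1800.0],
    # ... many more measurements needed in practice ...
])
nf, scale, px, vram = runs.T

# Design matrix with main effects plus the interactions the plots hinted at.
X = np.column_stack([
    np.ones_like(nf),     # intercept
    nf, scale**2, px,     # main effects (scale enters squared: output area)
    nf * px,              # nf x image-size interaction
    (scale**2) * px,      # scale x image-size interaction (output pixels)
])
coef, *_ = np.linalg.lstsq(X, vram, rcond=None)
print("fitted coefficients:", coef)
```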

@joeyballentine
Member

Testing with something like Compact, or maybe something even simpler, would I think show a much clearer trend. Compact is a really simple arch by comparison, but it also doesn't use upconv like ESRGAN; it uses pixelshuffle, so the VRAM usage when upscaling is going to be different.

Anyway, I think what matters most is the max tensor size of the model, given the specific image. I don't know how to verify that, though.
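
One way to sanity-check that idea, at least for a plain sequential conv net like Compact: walk the layers and track the largest intermediate tensor. A rough sketch, where stride-1 padded convs and a single trailing PixelShuffle are simplifying assumptions (this ignores branching, Winograd workspace, and allocator overhead, so it's a lower bound at best):

```python
def max_tensor_elems(h, w, in_ch, conv_channels, scale):
    """Largest single tensor (in elements) for a plain sequential
    conv net: stride-1 padded convs, then one PixelShuffle."""
    peak = h * w * in_ch
    for out_ch in conv_channels:
        peak = max(peak, h * w * out_ch)  # spatial dims unchanged
    # PixelShuffle rearranges c*scale^2 channels into scale-x spatial
    # resolution, so element count is unchanged; the final output is
    # (h*scale) x (w*scale) x 3.
    peak = max(peak, (h * scale) * (w * scale) * 3)
    return peak

# Compact-like toy: 16 convs at nf=64, final conv to 3*scale^2 channels
elems = max_tensor_elems(512, 512, 3, [64] * 16 + [3 * 4**2], scale=4)
print(f"peak single tensor: ~{elems * 4 / 2**20:.0f} MiB at fp32")
```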

@JeremyRand
Contributor Author

There are a bunch of configurable options in ncnn that affect memory usage. Winograd convolution is the main one -- it makes things much faster but also uses more memory. It would be nice to know exactly what the impact of those are.
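
One way to quantify it would be to measure empirically rather than model it: load the same model with the option on and off and compare peak memory under identical inputs. A sketch using the ncnn Python bindings, where the option names mirror ncnn's C++ Option struct and the model paths are placeholders:

```python
import ncnn

def load_net(use_winograd):
    net = ncnn.Net()
    # Toggle the Winograd fast-convolution path before loading the model.
    net.opt.use_winograd_convolution = use_winograd
    net.load_param("model.param")  # placeholder path
    net.load_model("model.bin")    # placeholder path
    return net

# Run the same upscale with load_net(True) vs load_net(False) while
# watching VRAM/RSS (nvidia-smi, or a memory profiler) to measure the
# Winograd overhead directly.
```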
