
Make ncnn memory budget configurable (v2) #2351

Merged: 2 commits merged into chaiNNer-org:main on Dec 5, 2023

Conversation

@JeremyRand (Contributor):

#1867 didn't support automatic estimation of tile size, which was not great for UX. This PR adds that missing support by allowing the user to choose a custom memory budget. Choosing a memory budget is likely to be better UX than choosing a tile size, since users are likely to know how much RAM or VRAM they have better than they know what tile size will work with their hardware and the model they picked. This should also work fine with Vulkan (and is likely to help with this UX issue), though I wasn't able to test that.

This PR is a rewritten version of #2070 and supersedes it; chaiNNer has had enough code churn since then that it was easier to rewrite it as a separate PR. This PR was substantially easier for me to do than messing with rebasing #2088, and is hopefully also easier to review (though #2088 is still worth doing eventually).

@JeremyRand (Contributor, Author):

@RunDevelopment Your concern from the previous PR about exposing the 1 PiB number to the UI has hopefully been addressed; 0 is now special-cased to mean "no limit".
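The "0 means no limit" special-casing described here could look roughly like the following sketch. The function name, the setting's GiB units, and the 1 PiB sentinel value are assumptions for illustration, not chaiNNer's actual identifiers:

```python
# Hypothetical sketch of the "0 means no limit" special case.
# The function name and the 1 PiB sentinel are illustrative,
# not chaiNNer's real identifiers.

PIB = 1024 ** 5  # 1 PiB, used internally as "effectively unlimited"


def effective_memory_budget(user_budget_gib: int) -> int:
    """Translate the user-facing GiB setting into a byte budget.

    A value of 0 is special-cased to mean "no limit", so the UI never
    has to display the huge internal sentinel value.
    """
    if user_budget_gib == 0:
        return PIB
    return user_budget_gib * 1024 ** 3
```

This keeps the sentinel as an internal implementation detail while the UI only ever deals in small, human-readable numbers.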

@joeyballentine (Member):

Thanks for taking the time to redo this and sorry for all the code churn making this an issue. I'll review this in a bit

@JeremyRand (Contributor, Author):

> Thanks for taking the time to redo this and sorry for all the code churn making this an issue. I'll review this in a bit

The code churn isn't a bad thing in this case; it cut the effort involved in writing and testing the new PR by about an order of magnitude compared to how the old PR worked.

@joeyballentine (Member) previously approved these changes Nov 30, 2023:


Code itself looks good. I'll test it out tomorrow when I have some time.

@joeyballentine (Member):

Do you have a good test case for this? It doesn't seem to work for me. @theflyingzamboni would you mind also testing this?

@JeremyRand (Contributor, Author):

@joeyballentine I took a 1600x1200 image and upscaled it 4x in ncnn CPU mode with automatic tiling. With no limit (i.e. the budget set to 0), the max memory usage near the end of the upscale was 83 GiB. I then set a much lower limit (I think around 20 GiB?), restarted chaiNNer, and did another upscale; the memory usage never went above that limit (IIRC it came within 10% of the limit, though I don't remember the exact numbers).

Hopefully that helps you narrow down what's different about your tests and mine.

@theflyingzamboni (Collaborator):


This appears to have little effect when used with a GPU. I set the memory budget to 2 GB and upscaled a 2048x2048 image, and it still split into 512x512 tiles and used around 6 GB. It only drops down to 256x256 tiles if I set the budget to 1 GB, and even then it really uses more like 2 GB. This is with an fp16 ESRGAN model, so the bin is ~32 MB.

Also, is there anything to enforce the budget limit during tile splitting if memory is mis-estimated?

@JeremyRand (Contributor, Author):

> This appears to have little effect when used with a GPU. I set the memory budget to 2 GB and upscaled a 2048x2048 image, and it still split into 512x512 tiles and used around 6 GB. It only drops down to 256x256 tiles if I set the budget to 1 GB, and even then it really uses more like 2 GB. This is with an fp16 ESRGAN model, so the bin is ~32 MB.

@theflyingzamboni That sounds like the memory usage estimate for your model on a GPU is faulty. This PR doesn't change that logic for Vulkan mode (unless I slipped up while refactoring -- please double-check me on this); all it does in Vulkan mode is lower the memory budget (which gets compared to that estimate) if the user picks one that's lower than what the GPU says it supports. So my tentative guess is that the issue you're seeing is out of scope for this PR (although it would be great to fix it in a follow-up PR; see #2352).
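The Vulkan-mode behavior described here, where the user's setting can only lower the GPU-reported budget and never raise it, could be sketched as follows. The function name and byte units are illustrative, not chaiNNer's actual code:

```python
def vulkan_memory_budget(user_budget_bytes: int, gpu_reported_bytes: int) -> int:
    """Clamp the budget used for tile-size estimation in Vulkan mode.

    The user's setting can only lower the limit below what the GPU
    reports; the memory-usage estimate itself is computed elsewhere
    and is unchanged. (Illustrative sketch, not chaiNNer's actual
    implementation.)
    """
    return min(user_budget_bytes, gpu_reported_bytes)
```

Under this scheme a faulty per-model estimate would produce bad tile sizes regardless of the budget, which matches the out-of-scope diagnosis above.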

> Also, is there anything to enforce the budget limit during tile splitting if memory is mis-estimated?

This PR doesn't change that behavior either (again, unless I made a mistake -- I might have). AFAIK the pre-existing behavior is that in GPU mode an OOM error will be caught and the tile size lowered, while in CPU mode Very Bad Things (TM) will happen; both should still be the case with this PR applied.
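The pre-existing GPU-mode fallback described above (catch the out-of-memory error, halve the tile size, retry) might look roughly like this sketch. `OutOfMemoryError` and the helper names are stand-ins, not ncnn's or chaiNNer's real identifiers:

```python
class OutOfMemoryError(RuntimeError):
    """Stand-in for whatever error the GPU backend raises on allocation failure."""


def upscale_with_fallback(image, tile_size, upscale_tiled, min_tile=32):
    """Retry a tiled upscale with progressively smaller tiles whenever an
    out-of-memory error is caught (GPU mode). Illustrative sketch only;
    CPU mode has no such safety net, per the discussion above.
    """
    while tile_size >= min_tile:
        try:
            return upscale_tiled(image, tile_size)
        except OutOfMemoryError:
            tile_size //= 2  # halve the tile size and try again
    raise OutOfMemoryError("out of memory even at the smallest tile size")
```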

@theflyingzamboni (Collaborator) commented Dec 3, 2023:

Regarding your first point, that may indeed be outside the scope of the PR. If you do want to handle it in another PR, I wonder if we should only show the memory budget setting when the provider is set to CPU. Otherwise we have an option that looks like it should apply universally but is definitely not functioning as intended on GPU.

Looking at the estimation code, however, I believe it is bugged. When I manually divide the estimate to get the amount required for a 256x256 tile, I get 1.84 GB, which is certainly too much for a 1 GB budget. So that calculation causes us to generate a tile size that uses more memory than the budget allows. This isn't necessarily a problem with the memory estimation itself (though we know that's not super accurate either); it's that we're not even producing a correct tile size from the budget using the estimate we have.
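The tile-size selection being discussed (pick the largest tile whose estimated memory use fits within the budget) can be sketched as below. The power-of-two search and the per-tile cost function are assumptions for illustration, not chaiNNer's actual estimator:

```python
def max_tile_size(budget_bytes, estimate_bytes, min_tile=32, max_tile=1024):
    """Return the largest power-of-two tile size whose estimated memory
    use fits within the budget, falling back to min_tile.
    (Illustrative sketch, not chaiNNer's actual algorithm.)
    """
    size = max_tile
    while size > min_tile and estimate_bytes(size) > budget_bytes:
        size //= 2
    return size


# Made-up quadratic cost of 28,000 bytes per pixel of tile area: a
# 256x256 tile then estimates to ~1.84 GB, roughly the figure quoted
# above, so a 1 GB budget should force the tile size down to 128.
cost = lambda size: size * size * 28_000
```

The bug described would then be any code path where the selected size's estimate still exceeds the budget, rather than an inaccuracy in the estimate itself.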

For the latter point, I was wondering whether there is some means by which we could set a memory allocation limit so that the OOM point would align with the budget. I'm not sure NCNN provides the tools for that, though.

@JeremyRand (Contributor, Author):

> Regarding your first point, that may indeed be outside the scope of the PR. If you do want to handle it in another PR, I wonder if we should only show the memory budget setting when the provider is set to CPU. Otherwise we have an option that looks like it should apply universally but is definitely not functioning as intended on GPU.

@theflyingzamboni I think it's still useful to have this available in Vulkan mode, even if the user needs to fiddle with the actual number to get the desired results. Part of my motivation for this feature was to facilitate using chaiNNer simultaneously with something else that uses nontrivial memory (or, for that matter, multiple chaiNNer instances at once), and for that purpose the main requirement is that the user can decrease the budget, not so much that the exact number matches reality.

I would be totally fine with adding a warning to the text indicating that the user might have to fiddle with the number because it's not exact yet. Let me know if you'd like me to do that. I also don't feel incredibly strongly about whether this is enabled for Vulkan yet, so if @joeyballentine wants it disabled completely in Vulkan mode until the issues are worked out, I'll defer to him.

> Looking at the estimation code, however, I believe it is bugged. When I manually divide the estimate to get the amount required for a 256x256 tile, I get 1.84 GB, which is certainly too much for a 1 GB budget. So that calculation causes us to generate a tile size that uses more memory than the budget allows. This isn't necessarily a problem with the memory estimation itself (though we know that's not super accurate either); it's that we're not even producing a correct tile size from the budget using the estimate we have.

That's really interesting; it hadn't even occurred to me that there could be a bug there. Can we factor that out into its own GitHub issue so it doesn't get lost here?

@joeyballentine (Member):

A warning and an explanation as to why it's inaccurate would be good enough for me.

@JeremyRand (Contributor, Author):

> A warning and an explanation as to why it's inaccurate would be good enough for me.

@joeyballentine OK, works for me. I'll add a warning; hopefully will have it pushed by tomorrow.

@theflyingzamboni (Collaborator):

> That's really interesting, it hadn't even occurred to me that there could be a bug there. Can we factor that out into its own GitHub issue so it doesn't get lost here?

Actually, scratch this; I was miscalculating.

@theflyingzamboni (Collaborator):

One other suggestion: Would you change it to allow one decimal place? It would be nice to be a little more specific than a full GB.

@JeremyRand (Contributor, Author):

> One other suggestion: Would you change it to allow one decimal place? It would be nice to be a little more specific than a full GB.

@theflyingzamboni No objection in principle, but SettingsParser doesn't have any mechanism for retrieving float settings, so we'd have to abuse either an int or a str, which would not be great for type safety. So I'd prefer to make that change in a subsequent PR, once SettingsParser supports either float or decimal types.
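One possible interim workaround for the int-only limitation (purely illustrative; SettingsParser's real API isn't shown and the helper name is made up) would be to store the setting as an integer number of tenths of a GiB:

```python
def budget_bytes_from_tenths(tenths_of_gib: int) -> int:
    """Decode a budget stored as an integer count of tenths of a GiB,
    so the UI could offer one decimal place without float settings.
    E.g. 25 -> 2.5 GiB. (Hypothetical workaround, not chaiNNer code.)
    """
    return tenths_of_gib * (1024 ** 3) // 10
```

The type-safety concern above still applies, since the stored int no longer means what its name suggests; a proper float or decimal setting type is the cleaner fix.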

@JeremyRand (Contributor, Author):

> A warning and an explanation as to why it's inaccurate would be good enough for me.

@joeyballentine Warning added, let me know if the wording needs any changes.

@joeyballentine (Member):


Looks good to me. Thanks!

@joeyballentine joeyballentine merged commit 747a82f into chaiNNer-org:main Dec 5, 2023
14 checks passed
@JeremyRand JeremyRand deleted the budget-v2 branch December 5, 2023 23:02