
Improvements to VRAM usage when loading HF models #105

Merged
merged 4 commits into from
Dec 1, 2022

Conversation

mrwyattii
Contributor

@mrwyattii mrwyattii commented Nov 21, 2022

MII can now optionally load models into system memory first via the new load_with_sys_mem config option. This solves a problem where a model takes up nearly all of the GPU memory and kernel injection requires some additional space, causing OOM errors even though the GPU has enough memory to hold the model itself.

This PR also adds DeepSpeed's DTypeEnum to the MII config, along with unit tests covering the new option.
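The option described above would be enabled through the MII config passed at deployment time. A minimal sketch follows; `load_with_sys_mem` is the option this PR adds, while the `mii.deploy()` call shape, the model name, and the deployment name are illustrative assumptions, not taken from this PR.

```python
# Sketch: enabling system-memory staging for an MII deployment.
# Weights are first loaded into system RAM, leaving GPU headroom
# for kernel injection before the model is moved onto the device.
mii_config = {
    "dtype": "fp16",            # exposed via DeepSpeed's DTypeEnum
    "load_with_sys_mem": True,  # load to system memory first (this PR)
}

# Hypothetical deployment call (requires deepspeed-mii and a GPU):
# import mii
# mii.deploy(
#     task="text-generation",
#     model="bigscience/bloom-560m",
#     deployment_name="bloom_deployment",
#     mii_config=mii_config,
# )
```

With `load_with_sys_mem` left at its default (off), loading behaves as before: weights go straight to GPU memory.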

@mrwyattii mrwyattii merged commit 06714bb into main Dec 1, 2022
@mrwyattii mrwyattii deleted the mrwyattii/address-poor-vram-usage branch December 1, 2022 22:58