
Save tensors in context of memory_efficient_linear #3413

Merged (2 commits, May 1, 2023)

Conversation

tohtana (Contributor) commented Apr 30, 2023

By default, torch.nn.functional.linear is replaced with LinearFunctionForZeroStage3. However, LinearFunctionForZeroStage3 causes a memory leak in some use cases.

In PEFT's LoRA, mentioned in #3002, the weight is passed through transpose first:

```python
result = F.linear(x, transpose(self.weight, self.fan_in_fan_out), bias=self.bias)
```

LinearFunctionForZeroStage3 saves the weight in a map keyed by the object's ID. However, the transpose creates a new weight tensor on every iteration, so the ID changes each time and the map of saved weights grows without bound across iterations.
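A minimal sketch of the failure mode (the dictionary below is an illustrative stand-in, not DeepSpeed's actual cache): because each call to transpose produces a fresh tensor object, an id-keyed map never sees the same key twice and accumulates one entry per forward pass.

```python
import torch

# Illustrative stand-in for an id-keyed weight cache; the real
# LinearFunctionForZeroStage3 internals may differ.
saved_weights = {}

weight = torch.randn(4, 4)

for step in range(3):
    transposed = weight.t()  # a new tensor object on every call
    saved_weights[id(transposed)] = transposed
    print(step, id(transposed), len(saved_weights))

# len(saved_weights) keeps growing: the id changes each iteration,
# so old entries are never reused or evicted -- a memory leak.
```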

This PR simply saves the weight and bias in the context instead of keying them by ID.
I don't understand the intention behind using IDs to store the weight. If saving by ID instead of saving the tensors is a crucial part of this module, we will need another approach to fix this.
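For reference, a minimal sketch of the save-on-context approach, assuming a standard torch.autograd.Function; this is not the actual DeepSpeed implementation:

```python
import torch
import torch.nn.functional as F

class LinearSketch(torch.autograd.Function):
    """Sketch of a linear function that keeps its inputs on the
    autograd context rather than in a global id-keyed map."""

    @staticmethod
    def forward(ctx, input, weight, bias=None):
        # Tensors saved on ctx live exactly as long as this autograd
        # node, so a fresh (transposed) weight each iteration cannot
        # accumulate anywhere.
        ctx.save_for_backward(input, weight, bias)
        return F.linear(input, weight, bias)

    @staticmethod
    def backward(ctx, grad_output):
        input, weight, bias = ctx.saved_tensors
        grad_input = grad_output @ weight
        # Flatten any leading batch dimensions before the matmul.
        grad_weight = grad_output.flatten(0, -2).t() @ input.flatten(0, -2)
        grad_bias = grad_output.flatten(0, -2).sum(0) if bias is not None else None
        return grad_input, grad_weight, grad_bias
```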

tohtana (Contributor, Author) commented May 1, 2023

@tjruwase Thank you for merging this PR!

As I mentioned, I didn't understand the intention behind saving tensors in a global map.
I would be happy to fix this again if we find any problem with the PR.

tohtana deleted the tohtana/leak_mem_efficient_linear branch May 1, 2023 23:20
tjruwase (Contributor) commented May 1, 2023

@tohtana, I think your solution is the correct one.

tohtana (Contributor, Author) commented May 2, 2023

Thank you for your review, @tjruwase!
