[gemini] add GeminiMemoryManger #832

Merged
merged 1 commit into from
Apr 24, 2022
Conversation

@1SAA (Contributor) commented Apr 22, 2022

  • refactor StatefulTensor and the tensor utilities

  • add unit tests for GeminiMemoryManager

@1SAA 1SAA requested review from feifeibear and ver217 April 22, 2022 05:58
colossalai/gemini/tensor_utils.py (resolved)
colossalai/gemini/tensorful_state.py (outdated, resolved)
else:
    # when from_state is FREE, the tensor is new to manager
    # we should add its memory
    manager.total_mem[device_type] += size
Contributor

How do you know the tensor is new to the manager?

Contributor Author

Now, every payload holds its own memory.
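
A minimal sketch of the accounting discussed in this thread, assuming a simplified manager that only tracks byte counts per device and per state. `ToyMemoryManager`, `State`, and `record_transition` are hypothetical names for illustration, not the ColossalAI API; only the `total_mem[device_type] += size` step is taken from the diff excerpt above.

```python
from collections import defaultdict
from enum import Enum, auto


class State(Enum):
    # Hypothetical subset of tensor states; FREE means "no payload attached".
    FREE = auto()
    HOLD = auto()
    COMPUTE = auto()


class ToyMemoryManager:
    """Toy ledger (an assumption, not the real GeminiMemoryManager)."""

    def __init__(self):
        # Bytes held per device, regardless of state.
        self.total_mem = defaultdict(int)
        # Bytes held per device, broken down by state.
        self.state_mem = defaultdict(lambda: defaultdict(int))

    def record_transition(self, device_type, size, from_state, to_state):
        if from_state is to_state:
            return
        if from_state is State.FREE:
            # The payload was just allocated, so it is new to the manager.
            self.total_mem[device_type] += size
        else:
            self.state_mem[device_type][from_state] -= size
        if to_state is State.FREE:
            # The payload is being dropped, so its memory leaves the ledger.
            self.total_mem[device_type] -= size
        else:
            self.state_mem[device_type][to_state] += size


mgr = ToyMemoryManager()
mgr.record_transition("cuda", 1024, State.FREE, State.HOLD)
mgr.record_transition("cuda", 1024, State.HOLD, State.COMPUTE)
```

In this sketch the FREE branch is only correct if each payload is counted exactly once when it is created, which is the property the reply above relies on.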

del self._payload

# record new payload
StatefulTensor.trans_state_update(tensor, TensorState.FREE, self.state)
Contributor

We must call this function whenever the payload is newly allocated or deleted.
Therefore trans_state_update is coupled with reset_payload and move_to.
This has some negative effects.
First, the function cannot be reused elsewhere, because the caller has to keep in mind that the tensor has just been allocated or deleted.
Second, nobody else should call the function. Doing so is dangerous and will mess up your plan.

Contributor Author

I'm gonna change them to private functions.
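
One way the proposed change could look, as a rough sketch only: the bookkeeping helper becomes private, and the two public methods that create or drop a payload are its only callers. `ToyStatefulTensor`, `TOTAL_MEM`, and `_record` are hypothetical names, and the payload is modelled as a `(device, size)` pair rather than a real tensor.

```python
from collections import defaultdict

# Hypothetical module-level ledger: bytes currently held per device.
TOTAL_MEM = defaultdict(int)


class ToyStatefulTensor:
    """Toy stand-in; the payload is modelled as (device, size_in_bytes)."""

    def __init__(self):
        self._payload = None

    def _record(self, payload, sign):
        # Private: only valid at the moment a payload is allocated (+1)
        # or dropped (-1). Outside callers should never use it.
        device, size = payload
        TOTAL_MEM[device] += sign * size

    def reset_payload(self, device, size):
        # Drop the old payload from the ledger, then record the new one.
        if self._payload is not None:
            self._record(self._payload, -1)
        self._payload = (device, size)
        self._record(self._payload, +1)

    def move_to(self, device):
        # Moving is modelled as dropping the old copy and allocating a new one.
        if self._payload is None or self._payload[0] == device:
            return
        size = self._payload[1]
        self._record(self._payload, -1)
        self._payload = (device, size)
        self._record(self._payload, +1)


t = ToyStatefulTensor()
t.reset_payload("cpu", 64)
t.move_to("cuda")
```

Keeping the helper private does not remove the coupling the reviewer points out, but it narrows the places where the ledger can drift out of sync to these two methods.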

* refactor StatefulTensor and the tensor utilities

* add unit tests for GeminiMemoryManager
@1SAA
Contributor Author

1SAA commented Apr 24, 2022

My unit test results with torch 1.10:
[two images]

@1SAA 1SAA merged commit e5ea3fd into hpcaitech:main Apr 24, 2022