-
Notifications
You must be signed in to change notification settings - Fork 4.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[tensor] refactor chunk mgr and impl MemStatsCollectorV2 #1077
Conversation
@@ -236,10 +235,9 @@ def access_chunk(self, tensor: torch.Tensor) -> None: | |||
self.accessed_chunks.add(chunk) | |||
self.total_mem[chunk.device_type] += chunk.mem | |||
|
|||
def release_chunk(self, tensor: torch.Tensor) -> None: | |||
def release_chunk(self, chunk: Chunk) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will the API changing affects our old code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I update all relevant code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You'd better update the version.txt in this PR. And post a new release.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Chunk manager's methods should not be called by users directly. I think user's code won't be influenced by this PR.
def get_chunks(self, tensors: Iterable[torch.Tensor]) -> FrozenSet[Chunk]: | ||
return frozenset([self.get_chunk(tensor) for tensor in tensors]) | ||
|
||
def add_extern_static_tensor(self, tensor: torch.Tensor) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you need to make sure the static tensor is not registered as a chunk managed tensor later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. Done.
Refactor chunk mgr for easily monitoring memory usage.