[tensor] refactor chunk mgr and impl MemStatsCollectorV2 #1077

ver217 · 2022-06-07T08:19:41Z

Refactor the chunk manager to make memory usage easier to monitor.

feifeibear · 2022-06-07T08:30:05Z

colossalai/tensor/chunk.py

@@ -236,10 +235,9 @@ def access_chunk(self, tensor: torch.Tensor) -> None:
        self.accessed_chunks.add(chunk)
        self.total_mem[chunk.device_type] += chunk.mem

-    def release_chunk(self, tensor: torch.Tensor) -> None:
+    def release_chunk(self, chunk: Chunk) -> None:


Will this API change affect our existing code?

I have updated all the relevant code.

You should also update version.txt in this PR and publish a new release.

The chunk manager's methods are not meant to be called by users directly, so I don't think user code will be affected by this PR.
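
For any internal call site that still holds a tensor, the signature change in the hunk above means the chunk has to be resolved first. A minimal sketch of that migration follows. It uses a stand-in class, not the real ChunkManager: `_Chunk`, `_MiniChunkManager`, `register`, and the `id()`-keyed map are illustrative assumptions, and a plain `object()` stands in for a `torch.Tensor`. Only the names `get_chunk`, `release_chunk`, and `accessed_chunks` come from the diff.

```python
from typing import Dict, Set


class _Chunk:
    """Hypothetical stand-in for a chunk; it only records a size in bytes."""

    def __init__(self, mem: int) -> None:
        self.mem = mem


class _MiniChunkManager:
    """Illustrative stand-in, NOT the real ColossalAI ChunkManager."""

    def __init__(self) -> None:
        # Assumption: map tensors to chunks by object identity.
        self._tensor_to_chunk: Dict[int, _Chunk] = {}
        self.accessed_chunks: Set[_Chunk] = set()

    def register(self, tensor: object, chunk: _Chunk) -> None:
        # Hypothetical helper so the example has something to look up.
        self._tensor_to_chunk[id(tensor)] = chunk

    def get_chunk(self, tensor: object) -> _Chunk:
        return self._tensor_to_chunk[id(tensor)]

    def release_chunk(self, chunk: _Chunk) -> None:
        # New-style signature: takes the chunk itself rather than a tensor.
        self.accessed_chunks.discard(chunk)


mgr = _MiniChunkManager()
param = object()                  # stands in for a torch.Tensor
chunk = _Chunk(mem=1024)
mgr.register(param, chunk)
mgr.accessed_chunks.add(chunk)    # pretend the chunk is currently in use

# Old call site: mgr.release_chunk(param)
# New call site: resolve the chunk first, then release it.
mgr.release_chunk(mgr.get_chunk(param))
```

Taking a chunk instead of a tensor lets a caller that already holds the chunk skip the lookup, at the cost of one extra `get_chunk` call where only the tensor is at hand.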

feifeibear · 2022-06-07T08:31:28Z

colossalai/tensor/chunk.py

+    def get_chunks(self, tensors: Iterable[torch.Tensor]) -> FrozenSet[Chunk]:
+        return frozenset([self.get_chunk(tensor) for tensor in tensors])
+
+    def add_extern_static_tensor(self, tensor: torch.Tensor) -> None:


I think you need to make sure the static tensor is not later registered as a chunk-managed tensor.
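
One way the check requested above could look is sketched below. This is an assumption about a possible guard, not what the PR implements: `_StaticAwareManager`, `register_managed`, and both `id()` sets are hypothetical, and a plain `object()` stands in for a `torch.Tensor`. Only the method name `add_extern_static_tensor` comes from the diff.

```python
from typing import Set


class _StaticAwareManager:
    """Illustrative stand-in, NOT the real ColossalAI ChunkManager."""

    def __init__(self) -> None:
        # Assumption: track tensors by identity. This is only safe while the
        # tensors stay alive, because id() values can be reused after GC.
        self._static_ids: Set[int] = set()
        self._managed_ids: Set[int] = set()

    def add_extern_static_tensor(self, tensor: object) -> None:
        if id(tensor) in self._managed_ids:
            raise ValueError("tensor is already chunk-managed")
        self._static_ids.add(id(tensor))

    def register_managed(self, tensor: object) -> None:
        # Hypothetical entry point for placing a tensor under chunk management.
        if id(tensor) in self._static_ids:
            raise ValueError("tensor was registered as an external static tensor")
        self._managed_ids.add(id(tensor))


mgr = _StaticAwareManager()
static_buf = object()             # stands in for a static torch.Tensor
managed_param = object()          # stands in for a regular parameter

mgr.add_extern_static_tensor(static_buf)
mgr.register_managed(managed_param)   # unrelated tensor: accepted

try:
    mgr.register_managed(static_buf)  # same tensor again: rejected
    rejected = False
except ValueError:
    rejected = True
```

Checking in both directions keeps the two sets disjoint, so memory accounted for as static is never counted a second time as chunk memory.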

ver217 added 4 commits June 7, 2022 15:37

polish chunk manager

024c4cc

polish unit test

9b737a4

impl add_extern_static_tensor for chunk mgr

d7d868b

add mem stats collector v2

1ed838e

ver217 added the Run Build and Test label Jun 7, 2022

feifeibear reviewed Jun 7, 2022

polish code

b47695e

feifeibear approved these changes Jun 7, 2022

ver217 added 5 commits June 9, 2022 11:17

Merge branch 'main' of github.com:hpcaitech/ColossalAI into feature/g…

db120e8

…emini

polish unit test

de5ec15

Merge branch 'main' of github.com:hpcaitech/ColossalAI into feature/g…

0979982

…emini

polish code

312f4ad

polish get chunks

7fae5ef

ver217 merged commit be01db3 into hpcaitech:main Jun 9, 2022

ver217 deleted the feature/gemini branch June 9, 2022 12:56
