
[doc] added documentation to chunk and chunk manager #1094

Merged
FrankLeeeee merged 4 commits into hpcaitech:main from FrankLeeeee:doc/chunk on Jun 10, 2022

Conversation

@FrankLeeeee
Contributor

While reading through the source of chunk.py, I found that it lacks documentation, so I added it for better readability.

@FrankLeeeee
Contributor Author

By the way, there are some APIs whose semantics I found misleading:

  1. Shouldn't chunk.is_free actually be is_empty?
  2. chunk.access looks more like chunk.synchronize to me.

@ver217

@ver217
Member

ver217 commented Jun 10, 2022

> By the way, there are some APIs whose semantics I found misleading:
>
>   1. Shouldn't chunk.is_free actually be is_empty?
>   2. chunk.access looks more like chunk.synchronize to me.

  1. Yes.
  2. access includes moving the data to CUDA (and broadcasting it). The broadcast is not always needed, since ZeRO may not be applied.

@FrankLeeeee
Contributor Author

FrankLeeeee commented Jun 10, 2022

According to the code below, broadcast is always called when chunk.access() is invoked.

def access(self) -> None:
    """
    Broadcast the chunk to synchronize the tensors across data parallel processes.
    """
    # recover the chunk on non-owner processes
    # and broadcast the chunk from the source to all processes
    if not self.is_src_rank:
        self.data.storage().resize_(self.size)
    self.data.data = self.data.to(get_current_device())
    dist.broadcast(self.data, self.global_src_rank, group=gpc.get_group(ParallelMode.DATA))

    # update tensor meta info
    self._update_tensors_ptr()
    if not self.is_src_rank:
        self._update_tensors_state(TensorState.HOLD, prev_state=TensorState.FREE)

@ver217
Member

ver217 commented Jun 10, 2022

> According to the code below, broadcast is always called when chunk.access() is invoked.
>
> def access(self) -> None:
>     """
>     Broadcast the chunk to synchronize the tensors across data parallel processes.
>     """
>     # recover the chunk on non-owner processes
>     # and broadcast the chunk from the source to all processes
>     if not self.is_src_rank:
>         self.data.storage().resize_(self.size)
>     self.data.data = self.data.to(get_current_device())
>     dist.broadcast(self.data, self.global_src_rank, group=gpc.get_group(ParallelMode.DATA))
>
>     # update tensor meta info
>     self._update_tensors_ptr()
>     if not self.is_src_rank:
>         self._update_tensors_state(TensorState.HOLD, prev_state=TensorState.FREE)

Oh, yes. But chunk_manager.access_chunk() doesn't always broadcast.
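For illustration, a minimal sketch of the manager-level guard being described here, using hypothetical names (enable_distributed_storage, accessed_chunks, chunk.move_device) rather than the actual ChunkManager code:

def access_chunk(self, chunk: Chunk) -> None:
    # Hypothetical sketch, not the actual ChunkManager implementation.
    if chunk in self.accessed_chunks:
        # the chunk is already materialized on this process
        return
    if self.enable_distributed_storage:
        # the chunk is sharded across data parallel ranks (ZeRO):
        # recover storage on non-owner ranks and broadcast from the source
        chunk.access()
    else:
        # every rank already holds a full copy, so no collective is needed;
        # just make sure the chunk lives on the current device
        chunk.move_device(get_current_device())
    self.accessed_chunks.add(chunk)

Under this assumption, chunk.access() (and its broadcast) only runs when chunk storage is distributed, which matches the point that the broadcast is not always needed.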

@FrankLeeeee
Contributor Author

FrankLeeeee commented Jun 10, 2022

@ver217 I have replaced is_free with is_empty. As for the access API, it is not a big deal, but perhaps you can give it a clearer name: access suggests that the caller gets some value in return and does not convey what this API actually does.

@ver217
Member

ver217 commented Jun 10, 2022

> @ver217 I have replaced is_free with is_empty. As for the access API, it is not a big deal, but perhaps you can give it a clearer name: access suggests that the caller gets some value in return and does not convey what this API actually does.

OK

@FrankLeeeee merged commit cb18922 into hpcaitech:main on Jun 10, 2022
@FrankLeeeee deleted the doc/chunk branch on June 13, 2022