
Add pipeline parallelism for Grok #87

Merged: 7 commits merged on Jun 5, 2024
Conversation

@hx89 (Contributor) commented May 20, 2024

No description provided.

google-cla bot commented May 20, 2024

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@@ -834,8 +834,8 @@ def task(self) -> pax_fiddle.Config[tasks_lib.SingleTask]:
     assert self.NUM_STAGES is not None
     assert self.NUM_LAYERS % (self.NUM_STAGES * self.CIRCULAR_REPEAT) == 0
     assert self.NUM_MICROBATCHES is not None or self.MICROBATCH_SIZE is not None
-    assert self.ICI_MESH_SHAPE is not None and len(self.ICI_MESH_SHAPE) == 4
-    assert self.DCN_MESH_SHAPE is not None and len(self.DCN_MESH_SHAPE) == 4
+    assert self.ICI_MESH_SHAPE is not None and len(self.ICI_MESH_SHAPE) >= 4
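The relaxed check lets a configuration carry more than four ICI mesh axes (for example an extra expert-parallel axis) while still rejecting meshes with too few axes. A minimal illustration with made-up mesh shapes; the values and the extra-axis interpretation are assumptions, not taken from this PR:

```python
# Illustrative only: mesh shapes and axis meanings below are assumptions.
ici_mesh_4_axes = [8, 1, 1, 1]      # a regular 4-axis pipeline-parallel mesh
ici_mesh_5_axes = [8, 1, 1, 1, 2]   # one extra axis, e.g. for expert parallelism

assert len(ici_mesh_4_axes) >= 4    # passes under both the old == 4 and new >= 4 checks
assert len(ici_mesh_5_axes) >= 4    # passes only with the relaxed >= 4 check
```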
Contributor commented:
Can we put this >= 4 assert behind the USE_EXPERT_PARALLEL flag?
Someone using regular PP with an incorrect mesh should still get stopped by these assertions.
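A minimal sketch of what such a gate could look like, assuming a boolean USE_EXPERT_PARALLEL attribute on the experiment class; the class name, default values, and mesh shapes below are assumptions, not the code that was merged:

```python
# Hedged sketch: gate the relaxed mesh-rank check behind USE_EXPERT_PARALLEL so
# regular pipeline-parallel configs still fail fast on a wrong mesh shape.
class PipelinedExperimentSketch:
  NUM_STAGES = 8
  ICI_MESH_SHAPE = [8, 1, 1, 1]   # illustrative 4-axis mesh
  DCN_MESH_SHAPE = [1, 1, 1, 1]
  USE_EXPERT_PARALLEL = False     # flag named in the review comment above

  def check_mesh_shapes(self):
    assert self.ICI_MESH_SHAPE is not None
    assert self.DCN_MESH_SHAPE is not None
    if self.USE_EXPERT_PARALLEL:
      # Expert parallelism may add an extra mesh axis, so only require >= 4.
      assert len(self.ICI_MESH_SHAPE) >= 4
      assert len(self.DCN_MESH_SHAPE) >= 4
    else:
      # Plain pipeline parallelism keeps the strict 4-axis requirement.
      assert len(self.ICI_MESH_SHAPE) == 4
      assert len(self.DCN_MESH_SHAPE) == 4


PipelinedExperimentSketch().check_mesh_shapes()  # passes with the defaults above
```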

Contributor commented:

cc @hx89

@hx89 (Contributor Author) commented:

Sounds good, I've updated the PR.

@zhangqiaorjc added the "pull ready" label (Used to import PR as CL) on May 28, 2024
copybara-service bot merged commit 7f8f9eb into google:main on Jun 5, 2024
3 checks passed