Column and Row Parallel Linear for Apex Tensor Parallel #44

Closed
conceptofmind opened this issue Mar 29, 2023 · 1 comment

Comments

@conceptofmind
Contributor

conceptofmind commented Mar 29, 2023

Hi,

I am exploring tensor parallelism for training and was wondering whether you had any input on the correct use of RowParallelLinear for the feedforward output projection.

For example:

Column Parallel over q, k, v, and ff inner.

self.fused_attn_ff_proj = apex.transformer.tensor_parallel.ColumnParallelLinear(
    dim,
    sum(self.fused_dims),
    bias=False,
    gather_output=False,
    init_method=nn.init.xavier_uniform_
)
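
With gather_output=False, each rank keeps only its slice of the fused output, so the last dimension seen locally is sum(fused_dims) divided by the tensor-parallel world size. A minimal sketch of the resulting per-rank split sizes, in plain Python with no Apex calls. The helper name local_split_sizes and the example dimensions are hypothetical, and it assumes every fused dimension divides evenly by the world size:

```python
def local_split_sizes(fused_dims, world_size):
    """Per-rank sizes for splitting a column-parallel fused projection output."""
    for d in fused_dims:
        if d % world_size != 0:
            raise ValueError(f"dim {d} is not divisible by world size {world_size}")
    return tuple(d // world_size for d in fused_dims)


# Hypothetical sizes: q, k, v, and a SwiGLU inner projection (value + gate).
fused_dims = (512, 64, 64, 4096)
sizes = local_split_sizes(fused_dims, world_size=2)
print(sizes, sum(sizes))
```

The local output would then be split with these per-rank sizes rather than with the full fused_dims.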

Row Parallel over attn out.

self.attn_out = apex.transformer.tensor_parallel.RowParallelLinear(
    attn_inner_dim,
    dim,
    bias=False,
    input_is_parallel=True,
    init_method=nn.init.xavier_uniform_
)
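
As a rough illustration of what input_is_parallel=True implies: each rank multiplies its slice of the input features by the matching slice of weight rows, and the partial results are summed (an all-reduce in a real run). The sketch below simulates ranks with a list in NumPy, does not call Apex, and uses hypothetical sizes:

```python
import numpy as np

rng = np.random.default_rng(0)
world_size, attn_inner_dim, dim = 2, 8, 4  # hypothetical sizes

x = rng.standard_normal((3, attn_inner_dim))    # full attention output
w = rng.standard_normal((attn_inner_dim, dim))  # weight stored as (in, out) here

# Shard the input features and the matching weight rows across "ranks".
x_shards = np.split(x, world_size, axis=1)
w_shards = np.split(w, world_size, axis=0)

# Each rank computes a partial product; summing stands in for the all-reduce.
partials = [xs @ ws for xs, ws in zip(x_shards, w_shards)]
row_parallel_out = sum(partials)

print(np.allclose(row_parallel_out, x @ w))
```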

I am not 100% sure whether this should be Row Parallel as well.

self.ff_out = nn.Sequential(
    SwiGLU(),
    apex.transformer.tensor_parallel.RowParallelLinear(
        ff_inner_dim,
        dim,
        bias=False,
        input_is_parallel=True,
        init_method=nn.init.xavier_uniform_
    )
)

Normally I would just do Column Parallel, SwiGLU, Row Parallel in a standard FeedForward, but it is not clear to me how to handle this case, where the attention and feedforward input projections are fused and the feedforward tail is separate.
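
The Column Parallel, SwiGLU, Row Parallel ordering can be checked numerically without Apex. The sketch below simulates ranks with a loop in NumPy and compares the sharded result against an unsharded reference. It assumes SwiGLU splits its input into value and gate halves along the last dimension (the SwiGLU class above is not shown, so this definition is an assumption), and all sizes and variable names are hypothetical. The point it illustrates is that each rank needs matching value and gate columns for the local SwiGLU to be valid:

```python
import numpy as np


def swiglu(h):
    # Assumed definition: split the last dim into value and gate halves.
    value, gate = np.split(h, 2, axis=-1)
    return value * (gate / (1.0 + np.exp(-gate)))  # value * silu(gate)


rng = np.random.default_rng(0)
world_size, dim, ff_inner = 2, 4, 8  # hypothetical sizes

x = rng.standard_normal((3, dim))
w_value = rng.standard_normal((dim, ff_inner))
w_gate = rng.standard_normal((dim, ff_inner))
w_out = rng.standard_normal((ff_inner, dim))

# Unsharded reference: fused projection -> SwiGLU -> output projection.
reference = swiglu(x @ np.concatenate([w_value, w_gate], axis=1)) @ w_out

# Simulated tensor parallelism: each "rank" holds matching value/gate columns
# and the corresponding rows of the output weight.
shard = ff_inner // world_size
out = np.zeros_like(reference)
for rank in range(world_size):
    cols = slice(rank * shard, (rank + 1) * shard)
    w_in_local = np.concatenate([w_value[:, cols], w_gate[:, cols]], axis=1)
    hidden_local = swiglu(x @ w_in_local)  # column-parallel, no gather
    out += hidden_local @ w_out[cols, :]   # row-parallel partial sum

print(np.allclose(out, reference))
```

Because SwiGLU is elementwise once value and gate are paired, no communication is needed between the column-parallel projection and the row-parallel one; only the final partial sums have to be reduced.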

Any input would be greatly appreciated.

Thank you,

Enrico

@conceptofmind
Contributor Author

conceptofmind commented Apr 26, 2023

Decided to go with FSDP instead.
