Fix bug in PP output layer shape
This is a mostly harmless bug: since the output shape of the last layer is not used for send/recv purposes, the runtime value overrides it no matter what value you configured it with.

However, since in/out shape validation was added to the pipeline lib in torch, this now raises an error and has to be fixed.

ghstack-source-id: 950e41529b7b506085ab280d8a492e345eaefd24
Pull Request resolved: #354
wconstab committed May 22, 2024
1 parent 6807909 commit 638ec48
Showing 1 changed file with 5 additions and 1 deletion.
torchtitan/parallelisms/parallelize_llama.py (5 additions, 1 deletion)
@@ -209,7 +209,11 @@ def pipeline_llama_manual(
     batch_size = job_config.training.batch_size
     local_seq_len = int(job_config.training.seq_len // parallel_dims.tp)
     layers_io_shape = (batch_size, local_seq_len, model_config.dim)
-    output_layer_shape = (batch_size, local_seq_len, model_config.vocab_size)
+    output_layer_shape = (
+        batch_size,
+        job_config.training.seq_len,
+        model_config.vocab_size,
+    )
     if pp_rank == 0:
         # first layer
         input = torch.randint(
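
Below is a minimal, self-contained sketch (not taken from the torchtitan sources) of how the two shape tuples in the diff relate. The concrete numbers stand in for `job_config`, `parallel_dims`, and `model_config` values, and the meta-device example tensors are an assumption used purely for illustration.

```python
# Hypothetical sketch of the shape logic in the diff above; the numeric
# values are assumptions, not values from the repository.
import torch

batch_size = 8          # job_config.training.batch_size
seq_len = 2048          # job_config.training.seq_len
tp = 4                  # parallel_dims.tp
dim = 4096              # model_config.dim
vocab_size = 32000      # model_config.vocab_size

# Activations passed between transformer layers are sequence-sharded across
# tensor-parallel ranks, so the per-rank (local) sequence length is used.
local_seq_len = seq_len // tp
layers_io_shape = (batch_size, local_seq_len, dim)

# The final output layer produces logits over the full sequence length, which
# is why the fix replaces local_seq_len with the unsharded seq_len here.
output_layer_shape = (batch_size, seq_len, vocab_size)

# Meta-device tensors carry only shape/dtype metadata; example tensors like
# these are the kind of thing a manual pipeline stage can use to pre-declare
# its I/O, and what shape validation would compare against runtime outputs.
layer_io_example = torch.empty(layers_io_shape, device="meta")
last_stage_output_example = torch.empty(output_layer_shape, device="meta")

print(layer_io_example.shape)            # torch.Size([8, 512, 4096])
print(last_stage_output_example.shape)   # torch.Size([8, 2048, 32000])
```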
