## ❓ Questions and Help

### What is your question?
Is there an example demonstrating how to generate using Megatron LM that was trained using model parallelism? The Megatron LM page shows how to run evaluation but there's no information on running generation.
### What is your question?

### What have you tried?
I tried running the command below but got an error.
Command:

```shell
fairseq-generate \
    $DATA_PATH \
    --path $MODEL_PATH \
    --task language_modeling \
    --gen-subset test \
    --max-sentences 8 \
    --criterion cross_entropy \
    --beam 1 \
    --sampling \
    --sampling-topp 0.9 \
    --temperature 0.01 \
    --prefix-size 200 \
    --distributed-world-size 8 \
    --results-path $RESULTS_PATH \
    --model-parallel-size 8
```
Error:

```
/opt/conda/conda-bld/pytorch_1579022034529/work/aten/src/THC/THCTensorScatterGather.cu:100: void THCudaTensor_gatherKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 2]: block: [0,0,0], thread: [3,0,0] Assertion indexValue >= 0 && indexValue < src.sizes[dim] failed.
```
After some debugging, I found that this line in the code triggers the error, but I'm unsure of the root cause. It's possible there is a setup issue on my end (data preprocessing, etc.). An example of how to set up and run generation with a model-parallel Megatron LM would be great. Thank you.
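For context, the assertion `indexValue >= 0 && indexValue < src.sizes[dim]` means a gather index fell outside the source tensor's size along the gathered dimension; one common way this happens at generation time is token ids in the binarized data exceeding the checkpoint's vocabulary size. Below is a minimal, hypothetical sanity check I used while debugging (the function name `check_ids` and the toy vocabulary size are illustrative, not fairseq APIs):

```python
def check_ids(token_ids, vocab_size):
    """Return the token ids that would trip the gather assertion,
    i.e. ids not satisfying 0 <= id < vocab_size."""
    return [t for t in token_ids if not (0 <= t < vocab_size)]

# Toy example with a hypothetical vocabulary of 50257 entries:
# id 50300 is out of range and would cause an out-of-bounds gather.
print(check_ids([12, 50256, 50300], 50257))  # → [50300]
```

If this returns a non-empty list for the test set, the data dictionary and the model checkpoint were likely built with different vocabularies.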