Commit 56af8df
HF <-> megatron checkpoint reshaping and conversion for GPT (#19317)
* HF <-> megatron checkpoint conversion handling reshaping from different tensor and parallel sizes
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* addressing comments
* add doc strings and 🐛 fixes
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
1 parent 41ec5d0 commit 56af8df
1 file changed, +900 -0 lines changed
- src/transformers/models/megatron_gpt2