Conversation

@wenxindongwork
  1. Llama3 tokenizers don't have a default pad_token id; this PR makes the pad token fall back to other special tokens (see the first sketch after this list).
  2. Added Llama3 FSDP to PredefinedShardingStrategy (see the second sketch after this list).
  3. Refactored the llama3 loading and saving tests into the same test suites as the gemma models.
  4. Fixed some bugs in the model-weights similarity metric calculation (an illustrative metric is sketched third, after this list).
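
The fallback itself isn't quoted in this conversation, so the following is a minimal sketch of the idea, assuming a Hugging Face-style tokenizer; the model name and the fallback order are illustrative, not necessarily what the PR implements:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B")

# Llama3 tokenizers ship without a pad token; fall back to another
# special token so padded batching works. The fallback order here is
# an assumption.
if tokenizer.pad_token_id is None:
    tokenizer.pad_token = (
        tokenizer.eos_token or tokenizer.bos_token or tokenizer.unk_token
    )
```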
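
The PredefinedShardingStrategy constructor isn't shown here either, so rather than guess its API, the JAX sketch below only illustrates the kind of FSDP layout such a strategy would map Llama3 parameters to; the mesh axis name and parameter names are hypothetical:

```python
import jax
import numpy as np
from jax.sharding import Mesh, NamedSharding, PartitionSpec as P

# FSDP in JAX terms: a single mesh axis spanning all devices, with each
# weight matrix sharded along one dimension of that axis. The axis name
# "fsdp" and the parameter names below are illustrative.
mesh = Mesh(np.array(jax.devices()), axis_names=("fsdp",))
llama3_fsdp_layout = {
    "token_embedding": NamedSharding(mesh, P("fsdp", None)),
    "mlp_kernel": NamedSharding(mesh, P(None, "fsdp")),
}
```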
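
The metric itself and the bugs fixed aren't described beyond that line; as a point of reference, a common choice for comparing the same tensor across two checkpoints is per-tensor cosine similarity, sketched below (not the PR's actual implementation):

```python
import numpy as np

def weight_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two weight tensors, flattened.

    Illustrative only: the PR's real metric is not shown in this
    conversation.
    """
    a = a.ravel().astype(np.float64)
    b = b.ravel().astype(np.float64)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(np.dot(a, b) / denom) if denom else 0.0
```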

TODO:

  • MaxText Llama3.1 model saving is failing.

@wenxindongwork merged commit 5957e12 into main on Mar 10, 2025
1 check passed