### Describe the feature

Training LLMs with the [BF16 Optimizer](https://huggingface.co/blog/bloom-megatron-deepspeed#bf16optimizer) rather than FP16 is necessary for stable training, as the experience of the BLOOM, OPT, and Megatron-Turing training efforts has shown.
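
To illustrate the numeric range difference behind this request, here is a minimal standard-library sketch. The helpers `to_fp16` and `to_bf16` are hypothetical names introduced only for this example; bfloat16 is emulated by truncating a float32 encoding to its top 16 bits (real hardware typically rounds to nearest instead), and none of this reflects how the BF16 Optimizer itself is implemented.

```python
import struct


def to_fp16(x: float) -> float:
    """Round-trip x through IEEE 754 half precision (5 exponent bits, 10 mantissa bits)."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        # struct refuses to pack finite values outside the fp16 range;
        # treat that as overflow to infinity for this illustration.
        return float("inf") if x > 0 else float("-inf")


def to_bf16(x: float) -> float:
    """Emulate bfloat16 (8 exponent bits, 7 mantissa bits) by truncating float32."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Keep the sign, the full float32 exponent, and the top 7 mantissa bits.
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))[0]


if __name__ == "__main__":
    for value in (70000.0, 1e-8):
        print(f"{value!r}: fp16={to_fp16(value)!r}, bf16={to_bf16(value)!r}")
```

In this sketch a value of 70000.0 overflows in fp16 but stays finite in bf16, and 1e-8 underflows to zero in fp16 but survives in bf16 at reduced precision. That wider exponent range is why bf16 training can usually drop the loss scaling that fp16 training relies on.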