**Is your feature request related to a problem? Please describe.**
AMP (automatic mixed precision) creates instability during training.

**Describe the solution you'd like**
Extend the AMP dtype selection so that bf16 can be chosen, as HuggingFace suggests [here](https://huggingface.co/docs/transformers/v4.16.2/en/performance#bf16).
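
A rough sketch of what a selectable AMP dtype might look like, assuming the training loop is built on PyTorch and wraps its forward pass in `torch.autocast`. The option names (`"fp16"`, `"bf16"`) and the helpers `resolve_amp_dtype` and `autocast_context` are hypothetical, made up for illustration; they are not existing APIs of this project.

```python
from contextlib import nullcontext

# Hypothetical option names; the project's real configuration keys may differ.
AMP_DTYPES = {"fp16": "float16", "bf16": "bfloat16"}


def resolve_amp_dtype(name):
    """Map a user-facing option to a torch dtype attribute name (None disables AMP)."""
    if name is None:
        return None
    key = name.lower()
    if key not in AMP_DTYPES:
        raise ValueError(
            f"unsupported AMP dtype {name!r}; expected one of {sorted(AMP_DTYPES)}"
        )
    return AMP_DTYPES[key]


def autocast_context(name, device_type="cuda"):
    """Return an autocast context for the requested dtype, or a no-op context."""
    dtype_name = resolve_amp_dtype(name)
    if dtype_name is None:
        return nullcontext()
    import torch  # imported lazily so option parsing works without PyTorch installed

    return torch.autocast(device_type=device_type, dtype=getattr(torch, dtype_name))
```

The forward pass would then run inside `with autocast_context("bf16"):`. Since bf16 has the same exponent range as fp32, loss scaling is generally not needed with it, so a gradient scaler could likely be skipped on that path.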