Megatron

Megatron is a package that supplements Keras by adding Transformer and multi-headed attention mechanisms.

Note: Megatron is likely to be deprecated if Keras/TensorFlow adds a native attention mechanism.
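For readers unfamiliar with the term, the following is a minimal NumPy sketch of what a multi-headed attention mechanism computes. It is not Megatron's API: the function name `multi_head_attention`, its parameters, and the single unbatched input of shape `(seq_len, d_model)` are assumptions chosen for illustration, and masking, biases, and dropout are left out.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def multi_head_attention(x, w_q, w_k, w_v, w_o, num_heads):
    """Illustrative multi-head self-attention (hypothetical helper).

    x:                  array of shape (seq_len, d_model)
    w_q, w_k, w_v, w_o: projection matrices of shape (d_model, d_model)
    Returns the output of shape (seq_len, d_model) and the attention
    weights of shape (num_heads, seq_len, seq_len).
    """
    seq_len, d_model = x.shape
    if d_model % num_heads != 0:
        raise ValueError("d_model must be divisible by num_heads")
    d_head = d_model // num_heads

    def split_heads(t):
        # (seq_len, d_model) -> (num_heads, seq_len, d_head)
        return t.reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)

    q = split_heads(x @ w_q)
    k = split_heads(x @ w_k)
    v = split_heads(x @ w_v)

    # Scaled dot-product attention, computed independently per head.
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)
    weights = softmax(scores, axis=-1)
    context = weights @ v  # (num_heads, seq_len, d_head)

    # Concatenate the heads and apply the output projection.
    merged = context.transpose(1, 0, 2).reshape(seq_len, d_model)
    return merged @ w_o, weights


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(5, 8))
    w_q, w_k, w_v, w_o = (rng.normal(size=(8, 8)) for _ in range(4))
    out, attn = multi_head_attention(x, w_q, w_k, w_v, w_o, num_heads=2)
    print(out.shape, attn.shape)
```

Each head attends over the full sequence using its own slice of the projected queries, keys, and values, which is why `d_model` must divide evenly by `num_heads` in this sketch.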