List the top 5 takeaways from the landmark paper "Attention Is All You Need".
1. The paper proposes the Transformer, a model architecture that relies entirely on attention mechanisms to compute representations of its input and output, dispensing with the recurrent and convolutional layers that were standard in earlier sequence models (a sketch of one encoder layer appears at the end of this list).

2. Self-attention lets each position in a sequence attend directly to every other position in the same sequence, so the model captures long-range dependencies in a constant number of sequential steps. It also parallelizes across positions far better than recurrent models do, which makes training much faster (see the attention sketch at the end of this list).

3. The Transformer set new state-of-the-art results in machine translation, reaching 28.4 BLEU on WMT 2014 English-to-German and 41.8 BLEU on English-to-French while training at a fraction of the cost of earlier recurrent and convolutional models. This demonstrated the potential of purely attention-based models.

4. The paper popularized the attention mechanism and spurred further work applying it to problems across NLP and other domains; attention is now a standard building block of deep learning models.

5. The paper is among the most influential works in deep learning: it moved the field toward attention-based models, motivated further research into building more powerful models around attention, and is now one of the most-cited papers in machine learning.
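
As a concrete reference for point 2, here is a minimal NumPy sketch of scaled dot-product attention, the core operation of the Transformer. The formula softmax(QK^T / sqrt(d_k)) V is Equation 1 of the paper; the toy sizes, variable names, and random inputs below are illustrative assumptions, not the paper's setup.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    # Eq. (1) in the paper: Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)      # (n, n): every position scores every position
    weights = softmax(scores, axis=-1)   # each row is a distribution over positions
    return weights @ V, weights

# Toy self-attention: n = 4 positions, d = 8 dimensions (illustrative sizes only).
rng = np.random.default_rng(0)
n, d = 4, 8
X = rng.normal(size=(n, d))              # stand-in for embedded input tokens
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
out, weights = scaled_dot_product_attention(X @ Wq, X @ Wk, X @ Wv)
print(weights.shape)   # (4, 4): each position attends to all 4 positions
```

The n x n weight matrix is what gives self-attention its global view: row i holds the weights with which position i mixes information from every position in the sequence.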
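
And for point 1, a sketch of a single encoder layer built from that operation: self-attention followed by a position-wise feed-forward network, each wrapped in a residual connection and layer normalization, with no recurrence or convolution anywhere. This uses single-head attention and toy dimensions for brevity; the paper uses 8 heads, d_model = 512, and d_ff = 2048, plus learned layer-norm gain and bias parameters omitted here.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # Scaled dot-product attention, as in the previous sketch.
    return softmax(Q @ K.T / np.sqrt(Q.shape[-1])) @ V

def layer_norm(x, eps=1e-5):
    # Normalize each position's feature vector (learned gain/bias omitted).
    return (x - x.mean(axis=-1, keepdims=True)) / (x.std(axis=-1, keepdims=True) + eps)

def encoder_layer(X, params):
    Wq, Wk, Wv, W1, b1, W2, b2 = params
    # Sub-layer 1: self-attention, then residual connection and layer norm.
    X = layer_norm(X + attention(X @ Wq, X @ Wk, X @ Wv))
    # Sub-layer 2: position-wise feed-forward net (ReLU), applied to each position independently.
    F = np.maximum(0.0, X @ W1 + b1) @ W2 + b2
    return layer_norm(X + F)

# Toy dimensions: d_model = 8, d_ff = 16 (the paper uses 512 and 2048).
rng = np.random.default_rng(0)
n, d, d_ff = 4, 8, 16
params = (rng.normal(size=(d, d)), rng.normal(size=(d, d)), rng.normal(size=(d, d)),
          rng.normal(size=(d, d_ff)), np.zeros(d_ff),
          rng.normal(size=(d_ff, d)), np.zeros(d))
X = rng.normal(size=(n, d))
print(encoder_layer(X, params).shape)   # (4, 8): no loop over positions, no convolution
```

Note the design point this makes concrete: the only place positions interact is the attention call, so the whole layer can be computed for all positions at once rather than step by step as in an RNN.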