
Commit 3bddc07

fix missing word

gbrlfaria committed Sep 11, 2023
1 parent eece968

Showing 1 changed file with 1 addition and 1 deletion.
_posts/2023-09-06-neural-networks-computational-power.md — 2 changes: 1 addition & 1 deletion
@@ -377,7 +377,7 @@ Chiang and Cholak <d-cite key="chiang2022overcoming" /> followed up by observing
 Based on this observation, the authors constructed a soft-attention Transformer with layer normalization that can robustly recognize the Parity language.
 Yao et al. <d-cite key="yao2021bounded" /> also followed up, demonstrating that Transformers can recognize Dyck languages with bounded nesting depth.
 According to the authors, this setup could better represent the hierarchical structure of natural language, which could help explain the success of Transformers in natural language processing (NLP).
-Meanwhile, Bhattamishra et al. <d-cite key="bhattamishra2020ability" /> demonstrated that uniform-attention Transformers---in which each layer can attend to multiple tokens with uniform intensity---can recognize the Shuffle-Dyck-$$k$$<d-footnote>The Shuffle-Dyck-$k$ is a Dyck-$k$ language where each type of bracket must be well-balanced while their relative order is unconstrained. For instance, "({)}" is a valid string in the Shuffle-Dyck-2 language.</d-footnote> language.
+Meanwhile, Bhattamishra et al. <d-cite key="bhattamishra2020ability" /> demonstrated that uniform-attention Transformers---in which each layer can attend to multiple tokens with uniform intensity---can recognize the Shuffle-Dyck-$$k$$<d-footnote>The Shuffle-Dyck-$k$ language is a Dyck-$k$ language where each type of bracket must be well-balanced while their relative order is unconstrained. For instance, "({)}" is a valid string in the Shuffle-Dyck-2 language.</d-footnote> language.
 Furthermore, they showed that Transformers can simulate a less powerful kind of counter automaton called the Simplified Stateless Counter Machine (SSCM), establishing a first lower bound on the computational power of these models.
 They supported this observation by empirically demonstrating that Transformers can learn several regular and counter languages.
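For readers unfamiliar with the language the footnote defines, here is a minimal Python sketch (not from the post; the function name and bracket alphabet are illustrative assumptions) of a Shuffle-Dyck-$$k$$ membership check. It uses one independent counter per bracket type, mirroring the counter-automaton view the diff alludes to.

```python
# Illustrative sketch (not from the post): Shuffle-Dyck-k membership.
# Each bracket type must balance on its own; the relative order of
# different types is unconstrained, so one counter per type suffices.

def is_shuffle_dyck(s: str, pairs=("()", "{}")) -> bool:
    counts = [0] * len(pairs)
    for ch in s:
        for i, (opener, closer) in enumerate(pairs):
            if ch == opener:
                counts[i] += 1
            elif ch == closer:
                counts[i] -= 1
                if counts[i] < 0:  # a closer appeared before its opener
                    return False
    return all(c == 0 for c in counts)  # every type fully balanced

print(is_shuffle_dyck("({)}"))  # True: valid in Shuffle-Dyck-2
print(is_shuffle_dyck("({})"))  # True: ordinary Dyck-2 strings also qualify
print(is_shuffle_dyck("({("))   # False: unmatched openers remain
```

That the check reduces to per-type counting is also why such languages fall within reach of simple counter machines like the SSCM mentioned above.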

