
Commit 3bddc07

fix missing word

gbrlfaria committed Sep 11, 2023
1 parent eece968

Showing 1 changed file with 1 addition and 1 deletion.
_posts/2023-09-06-neural-networks-computational-power.md — 2 changes: 1 addition & 1 deletion
@@ -377,7 +377,7 @@ Chiang and Cholak <d-cite key="chiang2022overcoming" /> followed up by observing
 Based on this observation, the authors constructed a soft-attention Transformer with layer normalization that can robustly recognize the Parity language.
 Yao et al. <d-cite key="yao2021bounded" /> also followed up, demonstrating that Transformers can recognize Dyck languages with bounded nesting depth.
 According to the authors, this setup could better represent the hierarchical structure of natural language, which could help explain the success of Transformers in natural language processing (NLP).
-Meanwhile, Bhattamishra et al. <d-cite key="bhattamishra2020ability" /> demonstrated that uniform-attention Transformers---in which each layer can attend to multiple tokens with uniform intensity---can recognize the Shuffle-Dyck-$$k$$<d-footnote>The Shuffle-Dyck-$k$ is a Dyck-$k$ language where each type of bracket must be well-balanced while their relative order is unconstrained. For instance, "({)}" is a valid string in the Shuffle-Dyck-2 language.</d-footnote> language.
+Meanwhile, Bhattamishra et al. <d-cite key="bhattamishra2020ability" /> demonstrated that uniform-attention Transformers---in which each layer can attend to multiple tokens with uniform intensity---can recognize the Shuffle-Dyck-$$k$$<d-footnote>The Shuffle-Dyck-$k$ language is a Dyck-$k$ language where each type of bracket must be well-balanced while their relative order is unconstrained. For instance, "({)}" is a valid string in the Shuffle-Dyck-2 language.</d-footnote> language.
 Furthermore, they showed that Transformers can simulate a less powerful kind of counter automaton called the Simplified Stateless Counter Machine (SSCM), establishing a first lower bound on the computational power of these models.
 They supported this observation by empirically demonstrating that Transformers can learn several regular and counter languages.
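For readers unfamiliar with the language the footnote defines, here is a minimal Python sketch (not from the post; the function name and bracket alphabet are illustrative assumptions) of a Shuffle-Dyck-$$k$$ membership check. It uses one independent counter per bracket type, mirroring the counter-automaton view the diff alludes to.

```python
# Illustrative sketch (not from the post): Shuffle-Dyck-k membership.
# Each bracket type must balance on its own; the relative order of
# different types is unconstrained, so one counter per type suffices.

def is_shuffle_dyck(s: str, pairs=("()", "{}")) -> bool:
    counts = [0] * len(pairs)
    for ch in s:
        for i, (opener, closer) in enumerate(pairs):
            if ch == opener:
                counts[i] += 1
            elif ch == closer:
                counts[i] -= 1
                if counts[i] < 0:  # a closer appeared before its opener
                    return False
    return all(c == 0 for c in counts)  # every type fully balanced

print(is_shuffle_dyck("({)}"))  # True: valid in Shuffle-Dyck-2
print(is_shuffle_dyck("({})"))  # True: ordinary Dyck-2 strings also qualify
print(is_shuffle_dyck("({("))   # False: unmatched openers remain
```

That the check reduces to per-type counting is also why such languages fall within reach of simple counter machines like the SSCM mentioned above.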

