Commit

Update neox_args.py (#1081)
* Update neox_args.py

These attention configuration options were missing from the docs. This will fix that.

* Update NeoXArgs docs automatically

---------

Co-authored-by: github-actions <github-actions@github.com>
jahatef and github-actions committed Nov 16, 2023
1 parent d8028f8 commit 10bf788
Showing 2 changed files with 3 additions and 3 deletions.
4 changes: 2 additions & 2 deletions configs/neox_arguments.md
@@ -111,7 +111,7 @@ Logging Arguments

- **git_hash**: str

-Default = c0fd5d9
+Default = b18f25c

current git hash of repository

@@ -334,7 +334,7 @@ Model Arguments
The first item in the list specifies the attention type(s), and should be a list of strings. The second item
specifies the number of times to repeat those attention types in the full list.

-attention type choices: [global, local, sparse_fixed, sparse_variable, bslongformer, bigbird]
+attention type choices: [global, local, sparse_fixed, sparse_variable, bslongformer, bigbird, "gmlp", "amlp", "flash"]

So a 12 layer network with only global attention could be specified like:
[[[`global`], 12]]
2 changes: 1 addition & 1 deletion megatron/neox_arguments/neox_args.py
@@ -175,7 +175,7 @@ class NeoXArgsModel(NeoXArgsTemplate):
The first item in the list specifies the attention type(s), and should be a list of strings. The second item
specifies the number of times to repeat those attention types in the full list.
-attention type choices: [global, local, sparse_fixed, sparse_variable, bslongformer, bigbird]
+attention type choices: [global, local, sparse_fixed, sparse_variable, bslongformer, bigbird, "gmlp", "amlp", "flash"]
So a 12 layer network with only global attention could be specified like:
[[[`global`], 12]]
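
For readers unfamiliar with the nested list format described in the docstring, here is a minimal sketch of how such a config could be expanded into one attention type per layer. The helper `expand_attention_config` and the `ATTENTION_TYPES` set are hypothetical names written for illustration; this is not the GPT-NeoX implementation, and it assumes the repeat count is always an integer.

```python
# Illustrative sketch only: expand_attention_config and ATTENTION_TYPES are
# hypothetical names, not part of the GPT-NeoX codebase.

# Attention type choices as listed in the updated docstring.
ATTENTION_TYPES = {
    "global", "local", "sparse_fixed", "sparse_variable",
    "bslongformer", "bigbird", "gmlp", "amlp", "flash",
}


def expand_attention_config(attention_config, num_layers):
    """Expand [[types, repeats], ...] into a flat per-layer list.

    Each item is a pair: a list of attention type strings, and the number
    of times to repeat that list (assumed here to be an integer).
    """
    layers = []
    for types, repeats in attention_config:
        for attention_type in types:
            if attention_type not in ATTENTION_TYPES:
                raise ValueError(f"unknown attention type: {attention_type!r}")
        # Repeat the whole group, preserving the order of types inside it.
        layers.extend(list(types) * repeats)
    if len(layers) != num_layers:
        raise ValueError(
            f"config describes {len(layers)} layers, expected {num_layers}"
        )
    return layers


if __name__ == "__main__":
    # The 12 layer, global-only example from the docstring.
    print(expand_attention_config([[["global"], 12]], 12))
    # Alternating two of the listed types across 4 layers.
    print(expand_attention_config([[["flash", "local"], 2]], 4))
```

Under this reading, a group of several types repeated N times contributes N copies of the group in order, so `[[["flash", "local"], 2]]` describes four layers alternating between the two types.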