
Add support for GPTNeoX models #32

Merged · 17 commits · Oct 3, 2023

Commits on Sep 27, 2023

  1. [add] gpt-neox support

     naubull2 committed Sep 27, 2023 (7baa4f7)

  2. [update] readme

     naubull2 committed Sep 27, 2023 (41977df)
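The first commit wires in the GPT-NeoX architecture. As a hedged, hypothetical sketch of how a GPT-NeoX checkpoint is typically loaded for fine-tuning with the Hugging Face API (the model name and dtype below are assumptions, not taken from this PR):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical checkpoint; any GPT-NeoX model (e.g. the Pythia family) loads the same way.
model_name = "EleutherAI/pythia-1.4b"

tokenizer = AutoTokenizer.from_pretrained(model_name)
# torch_dtype casts the trained weights only; runtime buffers such as the
# rotary cos/sin cache are not covered (see the Oct 2 fix further down).
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16)
```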

Commits on Sep 28, 2023

  1. [fix] some of the bugs preventing the fine-tune run

     + There are still bugs in the attention dimension mismatch
     naubull2 committed Sep 28, 2023 (9c9d0a2)
  2. [fix] dimension discrepancy between the attention mask and the query length

     + group batch attention is skipped to avoid this problem for now (see the sketch below)
     naubull2 committed Sep 28, 2023 (a5111ef)
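The mask/query-length mismatch above typically appears when the attention mask is built for the full key length while the queries cover only part of the sequence. A minimal, hypothetical illustration of the kind of defensive slicing that resolves it (function name and shapes are assumptions, not code from this PR):

```python
import torch

def align_mask(attention_mask: torch.Tensor, q_len: int) -> torch.Tensor:
    # Expecting a [batch, 1, query_len, key_len] additive mask; keep only the
    # last q_len query rows when the mask was built for a longer sequence.
    if attention_mask.size(-2) != q_len:
        attention_mask = attention_mask[..., -q_len:, :]
    return attention_mask
```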
  3. 5862050

  4. Merge branch 'forked-only'

     naubull2 committed Sep 28, 2023 (0cf0dfd)

Commits on Sep 30, 2023

  1. 1532c4b

  2. 6fdffbb

  3. fe97f86

Commits on Oct 2, 2023

  1. [add] torch autocast for flash attention safety

     + flash attention only supports fp16/bf16 (see the sketch below)
     naubull2 committed Oct 2, 2023 (9e30a15)
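Flash attention kernels only accept half-precision (fp16/bf16) inputs, so this commit guards the attention call with autocast. A hedged sketch of that pattern (the wrapper name and the bf16 choice are illustrative, not the exact code added here):

```python
import torch

def flash_safe_attention(attn_fn, q, k, v, **kwargs):
    # Downcast fp32 activations to bf16 only for the attention region,
    # since the flash kernels reject fp32 inputs.
    with torch.autocast(device_type="cuda", dtype=torch.bfloat16):
        return attn_fn(q, k, v, **kwargs)
```

Note that a later commit in this same batch rolls the torch.cuda autocast context back after it caused a half-precision error.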
  2. [fix] HF built-in rotary embedding is not compatible with flash-attention

     + the cos/sin cache tensor is not a trained parameter, so it is not cast
       along with the other model parameters through `torch_dtype` (see the sketch below)
     naubull2 committed Oct 2, 2023 (3f9c47c)
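Because the rotary cos/sin cache is a buffer rather than a trained parameter, `torch_dtype` leaves it in fp32, and flash attention then receives mixed-dtype inputs. A minimal sketch of the usual remedy, casting the cache to the query's dtype before the rotary embedding is applied (helper names follow the standard rotary formulation and are assumptions, not the commit's exact code):

```python
import torch

def rotate_half(x: torch.Tensor) -> torch.Tensor:
    # Standard rotary-embedding helper: swap and negate the two halves of the last dim.
    x1, x2 = x.chunk(2, dim=-1)
    return torch.cat((-x2, x1), dim=-1)

def apply_rotary(q: torch.Tensor, k: torch.Tensor,
                 cos: torch.Tensor, sin: torch.Tensor):
    # The cos/sin cache is a buffer, not a parameter, so torch_dtype does not
    # cast it; match it to the query dtype/device before use.
    cos = cos.to(dtype=q.dtype, device=q.device)
    sin = sin.to(dtype=q.dtype, device=q.device)
    return (q * cos) + (rotate_half(q) * sin), (k * cos) + (rotate_half(k) * sin)
```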
  3. b21e949

  4. [rollback] torch.cuda autocast causes a half-precision error

     + Works fine without the torch.cuda autocast context, so roll it back.
     naubull2 committed Oct 2, 2023 (b224273)

Commits on Oct 3, 2023

  1. 9123e42

  2. 7203de2

  3. 8a11ef8

  4. [remove] unused comments

     naubull2 committed Oct 3, 2023 (02e4c1c)