
Add support for GPTNeoX models #32

Merged · 17 commits · Oct 3, 2023

Commits on Sep 27, 2023

  1. [add] gpt-neox support

     naubull2 committed Sep 27, 2023 (7baa4f7)

  2. [update] readme

     naubull2 committed Sep 27, 2023 (41977df)
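The first commit wires in the GPT-NeoX architecture. As a hedged, hypothetical sketch of how a GPT-NeoX checkpoint is typically loaded for fine-tuning with the Hugging Face API (the model name and dtype below are assumptions, not taken from this PR):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical checkpoint; any GPT-NeoX model (e.g. the Pythia family) loads the same way.
model_name = "EleutherAI/pythia-1.4b"

tokenizer = AutoTokenizer.from_pretrained(model_name)
# torch_dtype casts the trained weights only; runtime buffers such as the
# rotary cos/sin cache are not covered (see the Oct 2 fix further down).
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16)
```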

Commits on Sep 28, 2023

  1. [fix] some of the bugs preventing the fine-tune run

     + There are still bugs in the attention dimension mismatch
     naubull2 committed Sep 28, 2023 (9c9d0a2)
  2. [fix] dimension discrepancy between the attention mask and the query length

     + group batch attention is skipped to avoid this problem for now (see the sketch below)
     naubull2 committed Sep 28, 2023 (a5111ef)
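The mask/query-length mismatch above typically appears when the attention mask is built for the full key length while the queries cover only part of the sequence. A minimal, hypothetical illustration of the kind of defensive slicing that resolves it (function name and shapes are assumptions, not code from this PR):

```python
import torch

def align_mask(attention_mask: torch.Tensor, q_len: int) -> torch.Tensor:
    # Expecting a [batch, 1, query_len, key_len] additive mask; keep only the
    # last q_len query rows when the mask was built for a longer sequence.
    if attention_mask.size(-2) != q_len:
        attention_mask = attention_mask[..., -q_len:, :]
    return attention_mask
```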
  3. 5862050

  4. Merge branch 'forked-only'

     naubull2 committed Sep 28, 2023 (0cf0dfd)

Commits on Sep 30, 2023

  1. 1532c4b

  2. 6fdffbb

  3. fe97f86

Commits on Oct 2, 2023

  1. [add] torch autocast for flash attention safety

     + flash attention only supports fp16/bf16 (see the sketch below)
     naubull2 committed Oct 2, 2023 (9e30a15)
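Flash attention kernels only accept half-precision (fp16/bf16) inputs, so this commit guards the attention call with autocast. A hedged sketch of that pattern (the wrapper name and the bf16 choice are illustrative, not the exact code added here):

```python
import torch

def flash_safe_attention(attn_fn, q, k, v, **kwargs):
    # Downcast fp32 activations to bf16 only for the attention region,
    # since the flash kernels reject fp32 inputs.
    with torch.autocast(device_type="cuda", dtype=torch.bfloat16):
        return attn_fn(q, k, v, **kwargs)
```

Note that a later commit in this same batch rolls the torch.cuda autocast context back after it caused a half-precision error.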
  2. [fix] HF built-in rotary embedding is not compatible with flash-attention

     + the cos/sin cache tensor is not a trained parameter, so it is not cast
       along with the other model parameters through `torch_dtype` (see the sketch below)
     naubull2 committed Oct 2, 2023 (3f9c47c)
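Because the rotary cos/sin cache is a buffer rather than a trained parameter, `torch_dtype` leaves it in fp32, and flash attention then receives mixed-dtype inputs. A minimal sketch of the usual remedy, casting the cache to the query's dtype before the rotary embedding is applied (helper names follow the standard rotary formulation and are assumptions, not the commit's exact code):

```python
import torch

def rotate_half(x: torch.Tensor) -> torch.Tensor:
    # Standard rotary-embedding helper: swap and negate the two halves of the last dim.
    x1, x2 = x.chunk(2, dim=-1)
    return torch.cat((-x2, x1), dim=-1)

def apply_rotary(q: torch.Tensor, k: torch.Tensor,
                 cos: torch.Tensor, sin: torch.Tensor):
    # The cos/sin cache is a buffer, not a parameter, so torch_dtype does not
    # cast it; match it to the query dtype/device before use.
    cos = cos.to(dtype=q.dtype, device=q.device)
    sin = sin.to(dtype=q.dtype, device=q.device)
    return (q * cos) + (rotate_half(q) * sin), (k * cos) + (rotate_half(k) * sin)
```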
  3. b21e949

  4. [rollback] torch.cuda autocast causes a half-precision error

     + Works fine without the torch.cuda autocast context, so roll it back.
     naubull2 committed Oct 2, 2023 (b224273)

Commits on Oct 3, 2023

  1. 9123e42

  2. 7203de2

  3. 8a11ef8

  4. [remove] unused comments

     naubull2 committed Oct 3, 2023 (02e4c1c)