Add Falcon support (new) #592

zhuohan123 · 2023-07-27T01:20:16Z

This PR replaces PR #321

This PR also reverts an earlier all-reduce optimization that stored every all-reduce result in a shared buffer. That optimization can produce wrong results in distributed settings for models with two parallel all-reduce branches.
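To illustrate why a shared all-reduce buffer is risky for two parallel branches, here is a minimal pure-Python simulation. It is not vLLM code: `SharedBufferAllReduce`, `fresh_all_reduce`, and `parallel_block` are hypothetical names, and the aliasing shown is only one plausible way the wrong results could arise, assuming both branch outputs are still needed after the second all-reduce.

```python
# Illustrative simulation only; not vLLM code. All names are hypothetical.

class SharedBufferAllReduce:
    """Simulated all-reduce that writes every result into one reused buffer."""

    def __init__(self, size):
        self._buffer = [0.0] * size

    def __call__(self, per_rank_values):
        # Sum element-wise across ranks, writing into the shared buffer in place.
        for i in range(len(self._buffer)):
            self._buffer[i] = sum(rank[i] for rank in per_rank_values)
        # The caller receives an alias of the buffer, not a copy.
        return self._buffer


def fresh_all_reduce(per_rank_values):
    """Simulated all-reduce that returns a newly allocated result."""
    return [sum(vals) for vals in zip(*per_rank_values)]


def parallel_block(all_reduce, attn_partials, mlp_partials):
    """Two branches are reduced separately, then their outputs are added."""
    attn_out = all_reduce(attn_partials)
    mlp_out = all_reduce(mlp_partials)  # overwrites attn_out if it is aliased
    return [a + m for a, m in zip(attn_out, mlp_out)]


# Partial results held by two simulated ranks for each branch.
attn_partials = [[1.0, 2.0], [3.0, 4.0]]
mlp_partials = [[10.0, 20.0], [30.0, 40.0]]

expected = parallel_block(fresh_all_reduce, attn_partials, mlp_partials)
buggy = parallel_block(SharedBufferAllReduce(2), attn_partials, mlp_partials)

print(expected)  # [44.0, 66.0]
print(buggy)     # [80.0, 120.0]: the attention result was overwritten
```

With a freshly allocated result per call, the two branch outputs stay independent. With the shared buffer, the second all-reduce silently replaces the first branch's values before they are consumed.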

Correctness check:


zhuohan123 · 2023-07-28T22:17:25Z

@WoosukKwon This PR is ready for review.

WoosukKwon

@zhuohan123 Awesome! Thanks a million for this hard work! Left minor comments.

vllm/transformers_utils/config.py

vllm/transformers_utils/configs/__init__.py

vllm/config.py

vllm/model_executor/layers/attention.py

vllm/model_executor/parallel_utils/tensor_parallel/layers.py

csrc/pos_encoding_kernels.cu

vllm/model_executor/models/falcon.py

gesanqiu · 2023-08-03T02:47:34Z

This PR seems to lead to wrong outputs for BigCoder models when TP=2.

zhuohan123 added 6 commits July 26, 2023 06:20

[WIP] implement falcon based on the official hf code, still not correct.

d997df9

Fix falcon bugs

cc337eb

Test falcon-7b

bd39d40

Add RWConfig for compatibility

484050c

Fix for falcon 40b

0641074

Fix Falcon-40B correctness

bc7514e

zhuohan123 force-pushed the falcon-zhuohan branch from b492519 to bc7514e July 28, 2023 00:31

zhuohan123 changed the title ~~[WIP] Add Falcon support (new)~~ Add Falcon support (new) Jul 28, 2023

zhuohan123 added 6 commits July 28, 2023 06:31

Remove error-prone all-reduce optimization

e4ea68c

Reduce one extra all reduce

ced54cd

format

8ace2bf

Delete falcon_hf.py

6f0562d

Fix config type

db8eccc

Merge branch 'falcon-zhuohan' of https://github.com/vllm-project/vllm …

ffb8abc

…into falcon-zhuohan

zhuohan123 requested a review from WoosukKwon July 28, 2023 22:05

zhuohan123 added 3 commits July 28, 2023 15:10

format

6888020

Add comments and modify readme

8f05a21

Merge branch 'main' into falcon-zhuohan

91e3b06

This was referenced Jul 29, 2023

[WIP] Add Falcon #321

Closed

New release? #638

Closed

WoosukKwon approved these changes Aug 2, 2023


Fix review comments

ca89819

zhuohan123 merged commit 1b0bd0f into main Aug 2, 2023
2 checks passed

zhuohan123 mentioned this pull request Aug 2, 2023

Anyone adapting falcon 40B&7B models now? #356

Closed

zhuohan123 deleted the falcon-zhuohan branch August 4, 2023 05:50

casper-hansen mentioned this pull request Sep 5, 2023

15% less speed after rotary_embedding_neox_kernel update in #592 #956

Closed

hongxiayang pushed a commit to hongxiayang/vllm that referenced this pull request Feb 13, 2024

Add Falcon support (new) (vllm-project#592)

a43f48d

sjchoi1 pushed a commit to casys-kaist-internal/vllm that referenced this pull request May 7, 2024

Add Falcon support (new) (vllm-project#592)

7fed772
