Compiled model with torch.compile, unfortunately without performance improvements #2131
Conversation
Using your code, I ran vicuna-7b on one L40 with torch.__version__ == 2.1.0+cu121 and vllm == 0.2.2. It seems that using torch.compile gives no performance improvement over before.
For the latest version, v0.2.7, is there any meaningful acceleration from the compiler?
A follow-up of #42, cc @zhuohan123.
torch.jit.script and TorchScript can't be used, as the forward methods use parameters not compatible with it (https://pytorch.org/docs/stable/jit_language_reference.html#supported-type). torch.jit.trace looks even more challenging. I was only able to make it run by using torch.compile with a minimal @torch.compiler.disable addition. Unfortunately, I only see performance degradation (RTX 3090, llama). This PR can be considered a first step toward using torch.compiler for further improvements. BTW, the onnxrt backend returns