Intel ARC/XPU Improvements #2052

Steve-Tech · 2023-07-22T06:47:06Z

Why are these changes needed?

I wasn't clearing the XPU cache after conversations before; that's fixed now.
The --gpus arg should now work to select between multiple Intel GPUs (I only have one, so I can't really test it).
IPEX was broken in the model workers for the web UI (weirdly, just moving import torch up a few lines fixed it).
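
For reference, here is a minimal sketch of the first two changes. It is not the exact code in this PR. The helper names `set_visible_xpus` and `empty_device_cache` are made up for illustration, and it assumes the Intel runtime reads `XPU_VISIBLE_DEVICES` before the device is first initialised and that `torch.xpu.empty_cache()` is available (through IPEX or a PyTorch build with XPU support).

```python
import gc
import os


def set_visible_xpus(gpus, num_gpus=1):
    """Expose only the listed Intel GPUs, e.g. gpus="0,1" (hypothetical helper)."""
    if not gpus:
        return False
    ids = [g.strip() for g in gpus.split(",") if g.strip()]
    if len(ids) < num_gpus:
        raise ValueError(
            f"num_gpus={num_gpus} but only {len(ids)} id(s) given in gpus={gpus!r}"
        )
    # Assumption: this has to be set before torch/IPEX first touches the device.
    os.environ["XPU_VISIBLE_DEVICES"] = ",".join(ids)
    return True


def empty_device_cache(device):
    """Release cached accelerator memory after a conversation (hypothetical helper).

    Returns True if a device cache was actually emptied.
    """
    gc.collect()
    try:
        import torch
    except ImportError:
        return False
    if device == "cuda" and torch.cuda.is_available():
        torch.cuda.empty_cache()
        return True
    if device == "xpu" and hasattr(torch, "xpu") and torch.xpu.is_available():
        torch.xpu.empty_cache()
        return True
    return False
```

In a worker, `set_visible_xpus(args.gpus, args.num_gpus)` would run once at startup, and `empty_device_cache(device)` after each generation finishes.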

Also, float16 was broken on Intel for a while, but it seems to be fixed now. Should I switch to float16 so it's consistent with CUDA and Metal, or just stick with bfloat16?
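
Whichever way that goes, the choice could live in one place so it is easy to flip later. A rough sketch, assuming a hypothetical helper `pick_dtype_name` and a hypothetical `prefer_float16` switch (neither is in FastChat), and assuming float32 on CPU purely for the example:

```python
def pick_dtype_name(device, prefer_float16=False):
    """Return the name of the torch dtype to load the model with (hypothetical)."""
    if device == "cpu":
        return "float32"  # assumption for this sketch only
    if device == "xpu" and not prefer_float16:
        return "bfloat16"  # current behaviour on Intel described above
    return "float16"  # what the question above calls consistent with cuda and metal


# The name can be resolved once torch is imported, e.g.:
#   dtype = getattr(torch, pick_dtype_name("xpu"))
# and the same dtype can then be passed to the optimize call as well.
```

Switching Intel to float16 would then be a one-line change of the default.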

Related issue number (if applicable)

N/A

Checks

I've run format.sh to lint the changes in this PR.
I've included any doc changes needed.
I've made sure the relevant tests are passing (if applicable).

Steve-Tech added 6 commits July 22, 2023 15:49

Empty XPU Cache

2217d46

A little cleanup around XPU code

aaaab51

Set XPU_VISIBLE_DEVICES from gpus arg

55691ae

I can't test this without multiple GPUs, but it doesn't break anything for one GPU.

IPEX breaks if import torch isn't before fastapi in model_workers

3a97720

Empty XPU Cache in other spots

c34cc80

Re-add dtype to torch.xpu.optimize

2b352d2

Turns out this does actually speed things up.

merrymercy requested changes Jul 23, 2023

fastchat/serve/model_worker.py (outdated, resolved)

fastchat/serve/multi_model_worker.py (outdated, resolved)

fastchat/serve/multi_model_worker.py (outdated, resolved)

merrymercy added 3 commits July 22, 2023 21:02

Update fastchat/serve/model_worker.py

1cffaca

Update fastchat/serve/multi_model_worker.py

061f32a

Update fastchat/serve/multi_model_worker.py

fc1ce38

merrymercy reviewed Jul 23, 2023

fastchat/serve/model_worker.py (outdated, resolved)

Update fastchat/serve/model_worker.py

aa015b6

merrymercy merged commit 2c7f31a into lm-sys:main Jul 23, 2023
1 check passed
