model=mistralai/Mistral-7B-Instruct-v0.1
volume=$PWD/data # share a volume with the Docker container to avoid downloading weights every run
docker run --gpus all --shm-size 1g -p 8080:80 -v $volume:/data ghcr.io/predibase/lorax:latest --model-id $model --num-shard 2
Then try to generate:
❯ curl 127.0.0.1:8080/generate \
-X POST \
-d '{"inputs": "[INST] Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May? [/INST]", "parameters": {"max_new_tokens": 64, "adapter_id": "vineetsharma/qlora-adapter-Mistral-7B-Instruct-v0.1-gsm8k"}}' \
-H 'Content-Type: application/json'
{"error":"Request failed during generation: Server error: local variable 'lora_b' referenced before assignment","error_type":"generation"}
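The server error ("local variable 'lora_b' referenced before assignment") is the classic Python pattern of a variable bound only inside a conditional branch. A hedged minimal sketch of that failure mode (the function and names are illustrative, not LoRAX's actual code):

```python
# Hypothetical reproduction of the UnboundLocalError pattern; illustrative only.
def apply_lora(input_cols, shard_rows):
    # lora_b is only bound on the branch where the shapes line up, so a
    # failed shape match falls through to an unbound local reference:
    if input_cols == shard_rows:
        lora_b = "computed adapter output"
    return lora_b

try:
    apply_lora(4096, 2048)  # mismatched shapes, as in the report above
except UnboundLocalError as err:
    print(err)
```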
Basically, the issue is that when trying to multiply by the first lora_a matrix, the weight arrives sharded with shape [2048, r] while the input is not sharded and has shape [49, 4096].
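The mismatch can be reproduced in isolation with plain NumPy (a minimal sketch using the shapes from the error above; `tp_degree`, `lora_a_full`, and the fix shown are illustrative, not LoRAX's implementation):

```python
import numpy as np

r = 16                      # LoRA rank (illustrative value)
hidden = 4096               # Mistral-7B hidden size
tp_degree = 2               # --num-shard 2

x = np.zeros((49, hidden))              # unsharded input: [49, 4096]
lora_a_full = np.zeros((hidden, r))     # full lora_a: [4096, r]

# With tensor parallelism, lora_a is split along the hidden dimension,
# so each shard holds only [4096 / 2, r] = [2048, r]:
lora_a_shard = lora_a_full[: hidden // tp_degree]

# The unsharded input cannot be multiplied against the sharded weight:
try:
    x @ lora_a_shard                    # (49, 4096) @ (2048, r) -> error
except ValueError as e:
    print("shape mismatch:", e)

# One consistent fix is to shard the input along the same dimension
# (row-parallel style), so each rank computes a partial [49, r] product
# that would then be all-reduced across ranks:
x_shard = x[:, : hidden // tp_degree]   # [49, 2048]
partial = x_shard @ lora_a_shard        # [49, r] partial result per rank
```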
Expected behavior
Generation completed successfully
Hey @markovalexander and @abhibst, thanks for your patience with this. I just put up #47, which should address this issue. Feel free to test it out. I'll also try to land this tonight, so new Docker images should be available shortly (within the next couple of hours).
System Info
Model info:
2 A100 GPUs
NVIDIA-SMI 535.104.05, Driver Version 535.104.05, CUDA Version 12.2 (reported outside Docker)
Information
Tasks
Reproduction
Run the Mistral example with Docker on 2 GPUs, then try to generate (see the commands and error output above).