[LOADING TIME ISSUE] 2 hrs of loading time for Mixtral 8x7B in AWS SageMaker #565
Comments
You seem to have quantization params, but PEFT is not enabled, which effectively disables quantization. Try enabling PEFT. Also, the time it takes to load the model depends on download time and the hardware used.
I have tried with PEFT as well. On the hardware side, I used 8x A10G GPUs and the model is already downloaded. The delay I am facing is because the model is loaded into RAM first and only later onto the GPUs. I tried fine-tuning Mixtral 8x7B in both cases. Any advice on reducing the model load time? Preferably loading directly to the GPUs. If this is expected behaviour, I am curious why the model is loaded into RAM first and then moved to the GPUs later.
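For reference, a minimal sketch of how the underlying transformers API can place quantized weights onto the GPUs as they are read, rather than materializing the full model in CPU RAM first. This is not the autotrain-advanced internal loading path; only the model ID is taken from this issue, the rest is illustrative and assumes transformers, accelerate and bitsandbytes are installed.

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# 4-bit quantization config, roughly matching the --quantization int4 CLI flag.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mixtral-8x7B-Instruct-v0.1",
    quantization_config=bnb_config,
    device_map="auto",        # dispatch layers across the visible GPUs as shards load
    low_cpu_mem_usage=True,   # stream checkpoint shards instead of staging the whole model in RAM
)

With device_map="auto", each checkpoint shard is dispatched to its target GPU as it is read from disk, so load time is then dominated by storage read speed, which is consistent with the SSD question below.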
It takes time to load, but a few minutes, not hours. Are you using an SSD?
Mixtral on H100s: https://huggingface.co/blog/abhishek/autotrain-mixtral-dgx-cloud-local
This issue is stale because it has been open for 15 days with no activity.
This issue was closed because it has been inactive for 2 days since being marked as stale.
Prerequisites
Backend
Other cloud providers
Interface Used
CLI
CLI Command
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 autotrain llm \
  --train \
  --model mistralai/Mixtral-8x7B-Instruct-v0.1 \
  --project-name finetuning_8x7b \
  --data-path ./ \
  --text-column text \
  --lr ${LEARNING_RATE} \
  --batch-size 4 \
  --epochs 6 \
  --block-size 1024 \
  --warmup-ratio 0.1 \
  --lora-r 16 \
  --lora-alpha 32 \
  --lora-dropout 0.05 \
  --weight-decay 0.01 \
  --gradient-accumulation 4 \
  --quantization int4 \
  --mixed-precision fp16
UI Screenshots & Parameters
No response
Error Logs
Loading the model takes about 2 hours before the fine-tuning job starts for large models like Mixtral 8x7B.
The instance used was an AWS SageMaker ml.g5.48xlarge.
Is this expected behaviour? 7B and 13B models loaded within a few minutes, but this one takes about 2 hours.
I used the Google Colab notebook for reference.
Additional Information
No response