
[LOADING TIME ISSUE] 2 hrs of loading time for Mixtral 8x7B on AWS SageMaker #565

Closed
2 tasks done
RishubAmpliforce opened this issue Apr 1, 2024 · 7 comments
Labels: bug (Something isn't working), stale

Comments

@RishubAmpliforce

Prerequisites

  • I have read the documentation.
  • I have checked other issues for similar problems.

Backend

Other cloud providers

Interface Used

CLI

CLI Command

CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 autotrain llm \
  --train \
  --model mistralai/Mixtral-8x7B-Instruct-v0.1 \
  --project-name finetuning_8x7b \
  --data-path ./ \
  --text-column text \
  --lr ${LEARNING_RATE} \
  --batch-size 4 \
  --epochs 6 \
  --block-size 1024 \
  --warmup-ratio 0.1 \
  --lora-r 16 \
  --lora-alpha 32 \
  --lora-dropout 0.05 \
  --weight-decay 0.01 \
  --gradient-accumulation 4 \
  --quantization int4 \
  --mixed-precision fp16

UI Screenshots & Parameters

No response

Error Logs

Loading the model takes about 2 hours before the fine-tuning job starts for large models like Mixtral 8x7B.

The instance used was an AWS SageMaker ml.g5.48xlarge.

Is this expected behaviour? 7B and 13B models loaded within a few minutes, but this one is taking about 2 hours.

I used the Google Colab notebook for reference.

Additional Information

No response

@RishubAmpliforce RishubAmpliforce added the bug Something isn't working label Apr 1, 2024
@abhishekkrthakur
Member

You seem to have quantization params, but PEFT is not enabled, which effectively disables quantization. Try adding the --peft param.

Also, the time to load the model depends on the download time and the hardware used.
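
For reference, the command from above with the suggested --peft flag added would look like this (all other flags unchanged from the original command):

CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 autotrain llm \
  --train \
  --model mistralai/Mixtral-8x7B-Instruct-v0.1 \
  --project-name finetuning_8x7b \
  --data-path ./ \
  --text-column text \
  --lr ${LEARNING_RATE} \
  --batch-size 4 \
  --epochs 6 \
  --block-size 1024 \
  --warmup-ratio 0.1 \
  --peft \
  --lora-r 16 \
  --lora-alpha 32 \
  --lora-dropout 0.05 \
  --weight-decay 0.01 \
  --gradient-accumulation 4 \
  --quantization int4 \
  --mixed-precision fp16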

@RishubAmpliforce
Author

I have tried with PEFT as well.

Regarding hardware, I used 8x A10G GPUs, and the model was already downloaded.

The delay I am facing is because the model is loaded into RAM first and only later onto the GPUs.


I tried fine-tuning Mixtral 8x7B in both cases.

Any advice on reducing the model load time? Preferably loading directly to the GPU.

If this is expected behaviour, I am curious why the model is loaded into RAM first and only moved to the GPU later.
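
One rough way to narrow down where the time goes, assuming shell access on the instance, is to watch disk throughput and GPU memory while the job starts: if disk reads dominate, loading is I/O-bound; if the GPUs stay idle while system RAM fills up, the CPU-side staging is the bottleneck.

# terminal 1: extended disk I/O statistics in MB, refreshed every 5 seconds (needs the sysstat package)
iostat -xm 5
# terminal 2: GPU memory and utilization, refreshed every 5 seconds
watch -n 5 nvidia-smi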

@abhishekkrthakur
Member

It takes time to load, but it should take a few minutes, not hours. Are you using an SSD?
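
A quick way to check, assuming a standard Linux environment on the instance (ROTA=0 means a non-rotational disk, i.e. SSD/NVMe):

# list block devices and whether they are rotational
lsblk -d -o NAME,ROTA,SIZE,MODEL
# confirm which filesystem backs the Hugging Face cache (default location; adjust if HF_HOME is set)
df -h ~/.cache/huggingface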

@abhishekkrthakur
Member

Mixtral on H100s: https://huggingface.co/blog/abhishek/autotrain-mixtral-dgx-cloud-local

@RishubAmpliforce
Author


I am using an SSD.


This issue is stale because it has been open for 15 days with no activity.

@github-actions github-actions bot added the stale label Apr 21, 2024

github-actions bot commented May 1, 2024

This issue was closed because it has been inactive for 2 days since being marked as stale.

@github-actions github-actions bot closed this as completed May 1, 2024