[LOADING TIME ISSUE] 2 hrs of loading time for Mixtral 8x7B in AWS SageMaker #565
Comments
You seem to have quantization params, but PEFT is not enabled, which effectively disables quantization. Try enabling PEFT. Also, the time it takes to load the model depends on download time and the hardware used.
I have tried with PEFT as well. On the hardware side, I used 8x A10G GPUs and the model is already downloaded. The delay I am facing is because the model is loaded into RAM first and only later onto the GPUs. I tried fine-tuning Mixtral 8x7B in both cases. Any advice on reducing the model load time? Preferably loading directly to the GPUs. If this is expected behaviour, I am curious why the model is loaded into RAM first and then moved to the GPUs later.
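For reference, a minimal sketch of how the underlying transformers API can place quantized weights onto the GPUs as they are read, rather than materializing the full model in CPU RAM first. This is not the autotrain-advanced internal loading path; only the model ID is taken from this issue, the rest is illustrative and assumes transformers, accelerate and bitsandbytes are installed.

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# 4-bit quantization config, roughly matching the --quantization int4 CLI flag.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mixtral-8x7B-Instruct-v0.1",
    quantization_config=bnb_config,
    device_map="auto",        # dispatch layers across the visible GPUs as shards load
    low_cpu_mem_usage=True,   # stream checkpoint shards instead of staging the whole model in RAM
)

With device_map="auto", each checkpoint shard is dispatched to its target GPU as it is read from disk, so load time is then dominated by storage read speed, which is consistent with the SSD question below.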
It takes time to load, but a few minutes, not hours. Are you using an SSD?
Mixtral on H100s: https://huggingface.co/blog/abhishek/autotrain-mixtral-dgx-cloud-local
This issue is stale because it has been open for 15 days with no activity.
This issue was closed because it has been inactive for 2 days since being marked as stale.
Prerequisites
Backend
Other cloud providers
Interface Used
CLI
CLI Command
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 autotrain llm \
  --train \
  --model mistralai/Mixtral-8x7B-Instruct-v0.1 \
  --project-name finetuning_8x7b \
  --data-path ./ \
  --text-column text \
  --lr ${LEARNING_RATE} \
  --batch-size 4 \
  --epochs 6 \
  --block-size 1024 \
  --warmup-ratio 0.1 \
  --lora-r 16 \
  --lora-alpha 32 \
  --lora-dropout 0.05 \
  --weight-decay 0.01 \
  --gradient-accumulation 4 \
  --quantization int4 \
  --mixed-precision fp16
UI Screenshots & Parameters
No response
Error Logs
Loading the model takes about 2 hours before the fine-tuning job starts for large models like Mixtral 8x7B.
The instance used was an AWS SageMaker ml.g5.48xlarge.
Is this expected behaviour? 7B and 13B models loaded within a few minutes, but this one takes about 2 hours.
I used the Google Colab notebook for reference.
Additional Information
No response