I get the following error while trying to train the Llama3 model. I'd appreciate any thoughts. Thanks.
Prerequisites
Backend
Hugging Face Space/Endpoints
Interface Used
UI
CLI Command
No response
UI Screenshots & Parameters
Error Logs
Device 0: NVIDIA A10G - 307.6MiB/22.49GiB
INFO | 2024-04-23 11:12:23 | autotrain.app:handle_form:454 - hardware: Local
INFO | 2024-04-23 11:11:16 | autotrain.app:fetch_params:212 - Task: llm:sft
INFO | 2024-04-23 11:10:40 | autotrain.app::154 - AutoTrain started successfully
WARNING | 2024-04-23 11:10:39 | autotrain.trainers.common:init:170 - Parameters not supplied by user and set to default: model, warmup_ratio, optimizer, scheduler, push_to_hub, tags_column, weight_decay, save_strategy, token, repo_id, batch_size, max_grad_norm, data_path, max_seq_length, seed, save_total_limit, username, gradient_accumulation, logging_steps, lr, train_split, tokens_column, valid_split, evaluation_strategy, epochs, auto_find_batch_size, project_name
WARNING | 2024-04-23 11:10:39 | autotrain.trainers.common:init:170 - Parameters not supplied by user and set to default: adam_beta2, warmup_steps, scheduler, class_image_path, adam_epsilon, checkpoints_total_limit, revision, text_encoder_use_attention_mask, image_path, seed, prior_preservation, xl, adam_beta1, prior_loss_weight, validation_images, prior_generation_precision, tokenizer_max_length, model, logging, push_to_hub, rank, center_crop, allow_tf32, local_rank, num_validation_images, token, validation_prompt, repo_id, scale_lr, checkpointing_steps, sample_batch_size, class_labels_conditioning, class_prompt, max_grad_norm, adam_weight_decay, num_class_images, username, tokenizer, resume_from_checkpoint, lr_power, num_cycles, pre_compute_text_embeddings, validation_epochs, epochs, dataloader_num_workers, project_name
WARNING | 2024-04-23 11:10:39 | autotrain.trainers.common:init:170 - Parameters not supplied by user and set to default: model, push_to_hub, task, numerical_columns, num_trials, token, repo_id, id_column, data_path, time_limit, seed, username, train_split, valid_split, categorical_columns, target_columns, project_name
WARNING | 2024-04-23 11:10:39 | autotrain.trainers.common:init:170 - Parameters not supplied by user and set to default: scheduler, lora_alpha, lora_dropout, max_target_length, target_column, text_column, data_path, seed, save_total_limit, peft, gradient_accumulation, model, warmup_ratio, optimizer, push_to_hub, weight_decay, lora_r, token, repo_id, batch_size, max_grad_norm, quantization, max_seq_length, username, logging_steps, lr, train_split, valid_split, evaluation_strategy, epochs, auto_find_batch_size, project_name
WARNING | 2024-04-23 11:10:39 | autotrain.trainers.common:init:170 - Parameters not supplied by user and set to default: model, warmup_ratio, optimizer, scheduler, push_to_hub, image_column, weight_decay, save_strategy, token, target_column, repo_id, batch_size, max_grad_norm, data_path, seed, save_total_limit, username, gradient_accumulation, logging_steps, lr, train_split, valid_split, evaluation_strategy, epochs, auto_find_batch_size, project_name
WARNING | 2024-04-23 11:10:39 | autotrain.trainers.common:init:170 - Parameters not supplied by user and set to default: model, warmup_ratio, optimizer, scheduler, push_to_hub, weight_decay, save_strategy, token, target_column, repo_id, text_column, batch_size, max_grad_norm, data_path, max_seq_length, seed, save_total_limit, username, gradient_accumulation, logging_steps, lr, train_split, valid_split, evaluation_strategy, epochs, auto_find_batch_size, project_name
WARNING | 2024-04-23 11:10:39 | autotrain.trainers.common:init:170 - Parameters not supplied by user and set to default: trainer, scheduler, use_flash_attention_2, lora_alpha, lora_dropout, merge_adapter, model_ref, text_column, data_path, dpo_beta, add_eos_token, seed, save_total_limit, prompt_text_column, gradient_accumulation, model, warmup_ratio, optimizer, push_to_hub, model_max_length, weight_decay, lora_r, token, repo_id, disable_gradient_checkpointing, rejected_text_column, batch_size, max_grad_norm, username, logging_steps, evaluation_strategy, train_split, valid_split, lr, max_prompt_length, auto_find_batch_size, project_name
INFO | 2024-04-23 11:10:39 | autotrain.app::31 - Starting AutoTrain...
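For context, the WARNING lines above appear to be informational rather than fatal: they are all emitted at the same timestamp from the same source (autotrain.trainers.common:init:170), once per supported trainer, listing every parameter that was left unset and filled from a default, and the app goes on to start successfully. A minimal sketch of that general pattern, with hypothetical class and field names (an illustration only, not AutoTrain's actual implementation):

```python
# Illustration of a "set to default" warning pattern; hypothetical names,
# not the real code behind autotrain.trainers.common.
from dataclasses import dataclass, fields


@dataclass
class TrainingParams:
    # A few stand-in fields for the dozens listed in the warnings above.
    model: str = "gpt2"
    lr: float = 3e-5
    batch_size: int = 8


def build_params(**user_kwargs) -> TrainingParams:
    # Any field the caller did not pass keeps its default, and its name
    # is collected into the warning message.
    unset = [f.name for f in fields(TrainingParams) if f.name not in user_kwargs]
    if unset:
        print("WARNING: Parameters not supplied by user and set to default: "
              + ", ".join(unset))
    return TrainingParams(**user_kwargs)


params = build_params(lr=1e-4)  # warns about: model, batch_size
```

In other words, these warnings by themselves should not explain a training failure.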
Your installed package nvidia-ml-py is corrupted. Skip patch functions nvmlDeviceGetMemoryInfo. You may get incorrect or incomplete results. Please consider reinstall package nvidia-ml-py via pip3 install --force-reinstall nvidia-ml-py nvitop.
Your installed package nvidia-ml-py is corrupted. Skip patch functions nvmlDeviceGet{Compute,Graphics,MPSCompute}RunningProcesses. You may get incorrect or incomplete results. Please consider reinstall package nvidia-ml-py via pip3 install --force-reinstall nvidia-ml-py nvitop.
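The two nvidia-ml-py messages above come from nvitop and point at a broken NVML binding rather than at the training run itself. After running the suggested pip3 install --force-reinstall nvidia-ml-py nvitop, one quick way to confirm NVML works again is a sketch like the following; it calls nvmlDeviceGetMemoryInfo, the function the warning says was skipped (pynvml is the module shipped by the nvidia-ml-py package):

```python
# NVML sanity check after force-reinstalling nvidia-ml-py.
import pynvml

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)  # GPU 0, e.g. the A10G above
mem = pynvml.nvmlDeviceGetMemoryInfo(handle)   # the call nvitop failed to patch
print(f"GPU 0 memory: {mem.used / 2**20:.1f} MiB used / {mem.total / 2**30:.2f} GiB total")
pynvml.nvmlShutdown()
```

If this prints without raising an error, the corruption warnings should no longer appear on the next AutoTrain start.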
Additional Information
No response