torchrun command not found #553
Comments
PS: could you tell us what you are using this script for, pretraining or fine-tuning? Thanks.
Update: For users with multiple GPUs, manually setting slots=1 and running
A problem that just came up, in:
~/examples/Aquila/Aquila-chat/aquila_chat.py
~/_init_.py is None. Editing it to add "from . import env_trainer_v1" has no effect. What else can be done?
I am posting here because the issues at https://github.com/FlagAI-Open/Aquila2 are closed.
Run:
bash dist_trigger_docker.sh hostfile Aquila-chat.yaml aquila-7b aquila_experiment
Error:
envs: 1 * 4090
hostfile: 192.168.1.5 slots=1
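Since the issue title reports that the torchrun command was not found, one thing worth checking is whether the launcher is on PATH at all. The snippet below is only a minimal sketch, not part of FlagAI: resolve_torchrun is a hypothetical helper name, and it assumes PyTorch is installed for the interpreter running the script. It relies on torchrun being a console-script entry point for the torch.distributed.run module, so the module form can stand in when the executable is missing from PATH.

```python
import shutil
import sys


def resolve_torchrun(which=shutil.which):
    """Return an argv prefix that starts the PyTorch distributed launcher.

    Hypothetical helper for illustration. Prefers the `torchrun` executable
    when it is on PATH; otherwise falls back to running the launcher as a
    module of the current interpreter.
    """
    path = which("torchrun")
    if path is not None:
        return [path]
    # `torchrun` is the console-script entry point for torch.distributed.run,
    # so the module form works even when the script is not on PATH.
    return [sys.executable, "-m", "torch.distributed.run"]


if __name__ == "__main__":
    # Print the command prefix that would be used on this machine.
    print(" ".join(resolve_torchrun()))
```

If this prints the module form, the torchrun script is probably installed in a directory that is not on PATH, for example a different virtual environment or a user-level bin directory.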
Editing ~/FlagAI/flagai/env_args.py to add the following line has no effect:
self.parser.add_argument('--local_rank', default=0, type=int, help='start training from saved checkpoint')
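For comparison, here is a minimal standalone sketch of a local-rank argument that tolerates different launchers. It is not FlagAI's env_args.py: build_parser and parse_local_rank are hypothetical names, and whether this applies to the failure above is an assumption. It relies on two things: torchrun exports the LOCAL_RANK environment variable to each worker, and some launcher versions pass the flag with a hyphen (--local-rank) rather than an underscore.

```python
import argparse
import os


def build_parser():
    """Build a parser that reads the local rank from a flag or the environment."""
    parser = argparse.ArgumentParser()
    # Accept both spellings of the flag. When neither is given, fall back to
    # the LOCAL_RANK environment variable that torchrun sets, else 0.
    parser.add_argument(
        "--local_rank", "--local-rank",
        dest="local_rank",
        type=int,
        default=int(os.environ.get("LOCAL_RANK", 0)),
        help="local rank of this process on its node",
    )
    return parser


def parse_local_rank(argv=None):
    """Return the local rank, ignoring flags this sketch does not define."""
    args, _unknown = build_parser().parse_known_args(argv)
    return args.local_rank


if __name__ == "__main__":
    print(parse_local_rank())
```

Using parse_known_args keeps the sketch from failing on the other training flags a real script would receive.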
What more can be done? I don't feel comfortable downgrading at this time. Are there any other options?
Originally posted by @Micla-SHL in #511 (comment)