train_dreambooth_inpaint.py throwing "returned non-zero exit status 1" error #2444

Closed
lakshman111 opened this issue Feb 21, 2023 · 7 comments
Labels: bug (Something isn't working), stale (Issues that haven't received updates)

lakshman111 commented Feb 21, 2023

Describe the bug

@thedarkzeno @patil-suraj

I've been running train_dreambooth_inpaint.py from https://github.com/huggingface/diffusers/tree/main/examples/research_projects/dreambooth_inpaint for the last few days with the same environment configuration. Today I ran it again and got:

Traceback (most recent call last):
  File "train_dreambooth_inpaint.py", line 825, in <module>
    main()
  File "train_dreambooth_inpaint.py", line 421, in main
    accelerator = Accelerator(
TypeError: __init__() got an unexpected keyword argument 'accelerator_project_config'
Traceback (most recent call last):
  File "/usr/local/envs/laksh/bin/accelerate", line 8, in <module>
    sys.exit(main())
  File "/usr/local/envs/laksh/lib/python3.8/site-packages/accelerate/commands/accelerate_cli.py", line 45, in main
    args.func(args)
  File "/usr/local/envs/laksh/lib/python3.8/site-packages/accelerate/commands/launch.py", line 1097, in launch_command
    simple_launcher(args)
  File "/usr/local/envs/laksh/lib/python3.8/site-packages/accelerate/commands/launch.py", line 552, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/usr/local/envs/laksh/bin/python', 'train_dreambooth_inpaint.py', '--pretrained_model_name_or_path=runwayml/stable-diffusion-inpainting', '--instance_data_dir=/workspace/diffusers/examples/research_projects/dreambooth_inpaint/modern_512_images', '--output_dir=/workspace/diffusers/examples/research_projects/dreambooth_inpaint/modern_512_400steps_model', '--instance_prompt=a photo of mmodern furniture', '--resolution=512', '--train_batch_size=1', '--gradient_accumulation_steps=1', '--learning_rate=5e-6', '--lr_scheduler=constant', '--lr_warmup_steps=0', '--max_train_steps=400']' returned non-zero exit status 1.
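
For readers puzzled by the pair of tracebacks: they appear to describe a single failure seen twice. The training script dies with the TypeError, and the launcher then reports that its child process exited with status 1. The sketch below reproduces that pattern with the standard library only. It is not the accelerate launcher's code; the child program and the function name `init` are made up for illustration.

```python
import subprocess
import sys

# Hypothetical child program: calls a function with a keyword it does not
# accept, which raises TypeError and makes the interpreter exit with status 1.
child_code = "def init():\n    pass\ninit(accelerator_project_config=None)\n"
cmd = [sys.executable, "-c", child_code]

try:
    # check=True raises CalledProcessError on any non-zero exit status.
    subprocess.run(cmd, check=True, capture_output=True, text=True)
    returncode, child_stderr = 0, ""
except subprocess.CalledProcessError as err:
    returncode, child_stderr = err.returncode, err.stderr

print("exit status:", returncode)
print("child stderr:")
print(child_stderr)
```

The useful information is in the child's own traceback (here, the TypeError); the "returned non-zero exit status 1" message only says that the child failed.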

Reproduction

  1. SSH into the GPU machine
  2. Create a conda environment: conda create -n YOUR_ENV_NAME python=3.8
  3. Activate the environment: source activate YOUR_ENV_NAME
  4. cd into the examples/research_projects/dreambooth_inpaint directory of a diffusers checkout (https://github.com/huggingface/diffusers/tree/main/examples/research_projects/dreambooth_inpaint)
  5. pip install -r requirements.txt
  6. pip install git+https://github.com/huggingface/diffusers
  7. Run the first code snippet from the README with custom inputs (my command is below)

`export MODEL_NAME="runwayml/stable-diffusion-inpainting"
export INSTANCE_DIR="/workspace/diffusers/examples/research_projects/dreambooth_inpaint/modern_512_images"
export OUTPUT_DIR="/workspace/diffusers/examples/research_projects/dreambooth_inpaint/modern_512_400steps_model"

accelerate launch train_dreambooth_inpaint.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --instance_data_dir=$INSTANCE_DIR \
  --output_dir=$OUTPUT_DIR \
  --instance_prompt="a photo of mmodern furniture" \
  --resolution=512 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=1 \
  --learning_rate=5e-6 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --max_train_steps=400`

Logs

No response

System Info

  • diffusers version: 0.14.0.dev0
  • Platform: Linux-5.15.0-47-generic-x86_64-with-glibc2.17
  • Python version: 3.8.16
  • PyTorch version (GPU?): 1.13.1+cu117 (True)
  • Huggingface_hub version: 0.12.1
  • Transformers version: 4.26.1
  • Accelerate version: 0.16.0
  • xFormers version: not installed
  • Using GPU in script?: yes
  • Using distributed or parallel set-up in script?: no
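
Since the script had worked for days with what was believed to be the same environment, it can help to record the installed package versions before and after a failure. A minimal sketch using only the standard library follows; the helper name `installed_versions` is hypothetical, and the package list is just the subset of the libraries named above.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version string."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    return versions


if __name__ == "__main__":
    # Distribution names taken from the System Info list in this report.
    for name, version in installed_versions(
        ["diffusers", "accelerate", "transformers", "torch"]
    ).items():
        print(f"{name}: {version}")
```

Saving this output next to each training run makes it easier to see whether a reinstall from the main branch changed anything between runs.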
lakshman111 added the bug label Feb 21, 2023

lakshman111 (Author) commented

Solved! I changed the keyword argument "accelerator_project_config" to "project_config" in the Accelerator(...) call and it worked. It looks like the parameter Accelerate expects is "project_config": https://github.com/huggingface/accelerate/blob/main/src/accelerate/accelerator.py

I haven't submitted a PR for this yet. I've never submitted one, so I still need to read through the contributing guidelines.
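
A rough way to confirm which keyword a given constructor accepts, without reading the library source, is to inspect its signature. The sketch below is an illustration only: `FakeAccelerator` is a stand-in class written for this example, not the real `accelerate.Accelerator`, and the helper name `project_config_kwarg` is hypothetical.

```python
import inspect


def project_config_kwarg(cls, candidates=("project_config", "accelerator_project_config")):
    """Return the first candidate keyword that cls.__init__ accepts, else None."""
    params = inspect.signature(cls.__init__).parameters
    for name in candidates:
        if name in params:
            return name
    return None


class FakeAccelerator:
    """Stand-in whose constructor takes 'project_config', as in the fix above."""

    def __init__(self, gradient_accumulation_steps=1, project_config=None):
        self.gradient_accumulation_steps = gradient_accumulation_steps
        self.project_config = project_config


class OlderFakeAccelerator:
    """Stand-in for a constructor that accepts neither keyword."""

    def __init__(self, gradient_accumulation_steps=1):
        self.gradient_accumulation_steps = gradient_accumulation_steps


kwargs = {"gradient_accumulation_steps": 1}
name = project_config_kwarg(FakeAccelerator)
if name is not None:
    # Placeholder value; a real script would pass its own configuration object.
    kwargs[name] = {"total_limit": 3}
accelerator = FakeAccelerator(**kwargs)
print(name)
```

With the real library installed, the same helper could be pointed at the actual class instead of the stand-in to check the parameter name before editing the training script.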

sayakpaul (Member) commented

Thanks and glad that it got resolved!

kadirnar (Contributor) commented Mar 4, 2023

> Solved! I changed "accelerator_project_config" to "project_config" and it worked. Looks like the variable Accelerate is looking for is "project_config": https://github.com/huggingface/accelerate/blob/main/src/accelerate/accelerator.py
>
> Didn't get around to submitting a PR yet. I've never submitted one, so need to still read through the guidelines on how to do that.

Thank you. This error should be fixed in the script right away. @thedarkzeno @patil-suraj

sayakpaul (Member) commented

@lakshman111 could we close this issue?

kadirnar (Contributor) commented

> @lakshman111 could we close this issue?

No. The error still persists. It needs to be fixed. If you want, I can do it.

patrickvonplaten (Contributor) commented

> > @lakshman111 could we close this issue?
>
> No. The error still persists. It needs to be fixed. If you want, I can do it.

That'd be very nice!

github-actions bot commented Apr 7, 2023

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

github-actions bot added the stale label Apr 7, 2023