how to use #2

Closed
hosnasn1987 opened this issue Oct 28, 2023 · 7 comments

Comments

@hosnasn1987

Hi,

Could you please guide me on how to use your script to fine-tune Stable Diffusion inpainting with my own dataset?

Thank you

@sshh12
Owner

sshh12 commented Oct 28, 2023

Hey!

You'll first want to create a dataset. See https://github.com/sshh12/terrain-diffusion/blob/main/scripts/build_text2rgb_dataset.py for an example of the standard Hugging Face dataset format.

Then follow the instructions in https://github.com/sshh12/terrain-diffusion/blob/main/scripts/train_text_to_image_lora_sd2_inpaint.py to actually train it.

Hope this helps!

@hosnasn1987
Author

hosnasn1987 commented Oct 29, 2023 via email

@sshh12
Owner

sshh12 commented Oct 29, 2023

Sure, what's the dataset you are trying to use?

@hosnasn1987
Author

hosnasn1987 commented Oct 29, 2023 via email

@hosnasn1987
Author

hosnasn1987 commented Oct 29, 2023 via email

@sshh12
Owner

sshh12 commented Oct 31, 2023

Ah ok, interesting. Yeah, I think this should work, although I'll suggest one possibly easier alternative first:

  1. Take the input image, mask the original chair, and delete it from the image.
  2. Find a picture of your chair at a similar angle (assuming you can identify the angle of the original and have enough pictures to cover every angle) and paste it in using its mask. Delete a small buffer between the pasted chair and the original image, then use a pretrained inpainting model to fill in any gaps.
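
The two steps above can be sketched with Pillow roughly as follows. This is only an illustration under assumptions, not code from this repo: `build_inpaint_inputs`, its arguments, and `buffer_px` are hypothetical names, both masks are assumed to be grayscale images (white = chair), and the replacement chair is assumed to be already scaled and positioned on a canvas the same size as the scene.

```python
from PIL import Image, ImageChops, ImageFilter


def build_inpaint_inputs(scene, old_mask, new_chair, new_mask, buffer_px=8):
    """Return (composite, hole) for a pretrained inpainting model.

    scene, new_chair: RGB images of identical size (hypothetical inputs).
    old_mask, new_mask: mode "L" masks, 255 where the chair is.
    """
    # Step 1: delete the original chair by blanking the masked pixels.
    blank = Image.new("RGB", scene.size, (0, 0, 0))
    composite = Image.composite(blank, scene, old_mask)

    # Step 2: paste the replacement chair through its own mask.
    composite.paste(new_chair, (0, 0), new_mask)

    # Grow the new chair's mask by buffer_px to get a ring around it.
    # MaxFilter needs an odd window size, hence 2 * buffer_px + 1.
    grown = new_mask.filter(ImageFilter.MaxFilter(2 * buffer_px + 1))

    # Region to inpaint: the old chair area plus the buffer ring,
    # minus the pixels now covered by the pasted chair.
    hole = ImageChops.lighter(old_mask, grown)
    hole = ImageChops.subtract(hole, new_mask)
    return composite, hole
```

The returned `composite` and `hole` would then be passed as the image and mask to whatever pretrained inpainting model you use; the right `buffer_px` likely depends on image resolution and how cleanly the chair was cut out.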

As for training a new inpainting model, this should work for that use case as well.

I would start by creating the right dataset format. You should be able to just adapt this script to write out the images + metadata to a folder: https://github.com/sshh12/terrain-diffusion/blob/main/scripts/build_text2rgb_dataset.py

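The core part of the script, run for each image, is just:

import json
import os
from PIL import Image

# Re-save the source image as a PNG with a zero-padded, ID-based name
img = Image.open(data["rgb_fn"])
save_fn = f"{id_:06d}.png"
img.save(os.path.join(train_dir, save_fn))

# Write one JSON line linking the saved file name to its caption
meta = dict(file_name=save_fn, text=data["caption"])
metacsv.write(f"{json.dumps(meta)}\n")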

If you don't have captions for the chairs, just use "a picture of a chair" or something like that.
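
As a rough, standard-library-only sketch of that loop with the fallback caption applied: `build_dataset`, `DEFAULT_CAPTION`, and the `records` argument are hypothetical names, and the `metadata.jsonl` file name is an assumption based on the Hugging Face image-folder convention rather than something stated in this thread. Unlike the script, this version copies files as-is instead of re-encoding them to PNG.

```python
import json
import os
import shutil

DEFAULT_CAPTION = "a picture of a chair"  # fallback when no caption exists


def build_dataset(records, train_dir, default_caption=DEFAULT_CAPTION):
    """Copy images into train_dir and write one JSON line per image.

    records: iterable of dicts with "rgb_fn" and an optional "caption".
    Returns the number of images written.
    """
    os.makedirs(train_dir, exist_ok=True)
    meta_path = os.path.join(train_dir, "metadata.jsonl")  # assumed name
    count = 0
    with open(meta_path, "w", encoding="utf-8") as metafile:
        for id_, data in enumerate(records):
            # Keep the original extension; fall back to .png if there is none.
            ext = os.path.splitext(data["rgb_fn"])[1].lower() or ".png"
            save_fn = f"{id_:06d}{ext}"
            shutil.copyfile(data["rgb_fn"], os.path.join(train_dir, save_fn))

            # Use the generic caption when none was provided.
            caption = data.get("caption") or default_caption
            meta = dict(file_name=save_fn, text=caption)
            metafile.write(json.dumps(meta) + "\n")
            count += 1
    return count
```

Called as `build_dataset([{"rgb_fn": "chairs/front.png"}], "data/train")` (paths made up for illustration), this would write `000000.png` plus a metadata line captioned "a picture of a chair".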

Once the dataset is formatted, training should be as easy as running the command in this file, https://github.com/sshh12/terrain-diffusion/blob/main/scripts/train_text_to_image_lora_sd2_inpaint.py, but with the path to your dataset.
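
Before kicking off training, it may be worth sanity-checking the folder. The sketch below is not part of this repo: `check_dataset` is a hypothetical helper, and it assumes the metadata lives in a `metadata.jsonl` file next to the images, with `file_name` and `text` keys as in the snippet above.

```python
import json
import os


def check_dataset(train_dir, meta_name="metadata.jsonl"):
    """Return a list of human-readable problems (empty if none found)."""
    meta_path = os.path.join(train_dir, meta_name)
    if not os.path.isfile(meta_path):
        return [f"missing {meta_name}"]

    problems = []
    with open(meta_path, encoding="utf-8") as metafile:
        for line_no, line in enumerate(metafile, start=1):
            if not line.strip():
                continue  # ignore blank lines
            try:
                row = json.loads(line)
            except json.JSONDecodeError:
                problems.append(f"line {line_no}: not valid JSON")
                continue
            if not isinstance(row, dict):
                problems.append(f"line {line_no}: expected a JSON object")
                continue

            # Every row should point at an image that exists on disk.
            file_name = row.get("file_name")
            if not file_name:
                problems.append(f"line {line_no}: missing file_name")
            elif not os.path.isfile(os.path.join(train_dir, file_name)):
                problems.append(f"line {line_no}: image not found: {file_name}")

            # Every row should carry a non-empty caption.
            if not row.get("text"):
                problems.append(f"line {line_no}: missing text")
    return problems
```

An empty list means every metadata line parsed, has a caption, and points to an existing image file; it does not check that the images themselves are readable.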

@hosnasn1987
Author

hi

thank you for your complete answer

@sshh12 sshh12 closed this as completed Nov 19, 2023