Is the model image2image or text2image? #6

leangovern · 2023-07-25T08:54:12Z

Hi, I am reading the code. The paper says that the model uses img2img and textual inversion to generate data, but I could not find an img2img model in this code, only a txt2img model. Did I miss it? I am confused and hoping for your reply!

brandontrabucco · 2023-07-25T23:35:40Z

Hello leangovern,

Thanks for your interest in our code! The underlying Stable Diffusion model is used as a [txt+img]2img model in our codebase. That is, given a real image to use as a guide, (1) we first add a small amount of random noise to that real image, and (2) we use Stable Diffusion to denoise that image, conditioned on a text prompt.

DA-Fusion uses Textual Inversion to learn the text prompt, but the prompt can also be written by hand (prompt engineering) if desired.

The image input helps to preserve the structure of the real image, and the text prompt input allows you to specify how you want the image to be augmented.

This class (https://github.com/brandontrabucco/da-fusion/blob/main/semantic_aug/augmentations/textual_inversion.py#L86) has the relevant code if you are interested in the implementation details.
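As a rough illustration of step (1) above, here is a minimal sketch of how a guide image can be partially noised before denoising. It is not the code from this repository: the linear beta schedule, the `strength` parameter, and the function names `linear_alpha_bar` and `add_noise` are assumptions made for illustration, and it operates on a flat list of pixel values, whereas Stable Diffusion noises latents. Step (2), the text-conditioned denoising, needs the trained model and is omitted.

```python
import math
import random


def linear_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for a linear beta schedule.

    The schedule values are illustrative (DDPM-style), not taken from the repo.
    """
    alpha_bar, running = [], 1.0
    for i in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * i / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bar.append(running)
    return alpha_bar


def add_noise(pixels, strength, alpha_bar, rng):
    """Noise `pixels` up to timestep t = int(strength * num_steps).

    strength = 0 returns the image unchanged; strength = 1 gives (almost) pure
    noise. Returns the noised values and the timestep t at which a denoiser
    would start.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be in [0, 1]")
    t = int(strength * len(alpha_bar))
    if t == 0:
        return list(pixels), 0
    a = alpha_bar[t - 1]
    # Closed-form forward process: x_t = sqrt(a) * x_0 + sqrt(1 - a) * eps
    noisy = [math.sqrt(a) * x + math.sqrt(1.0 - a) * rng.gauss(0.0, 1.0)
             for x in pixels]
    return noisy, t
```

A small `strength` keeps most of the real image's structure, while a large one lets the text prompt dominate. A denoiser conditioned on the prompt would then run from timestep `t` back down to 0.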

Let me know how I can help if you have other questions!

-Brandon

leangovern · 2023-07-26T02:39:20Z

Hello Brandon,

Thank you very much for your detailed reply, which is of great help to me. Thank you very much for your work, which is a great inspiration to me. Thank you very much, I will continue to read your code and apply it to my work. Thank you very much.

-Govern

leangovern · 2023-07-26T13:15:04Z

Hi Brandon,

I am back again. My English is poor, so I will try my best to express myself. I have a question about your paper. You mentioned that Stable Diffusion may leak internet images when it generates pictures. If we use it to generate synthetic data, this may cause a dilemma: the synthetic data may be very effective, but the comparison may be unfair, because Stable Diffusion was trained on a huge number of images. So, if I understand correctly, we should erase the concept of the class we are going to generate, to ensure that we rely on the generative ability of SD rather than on the extra training data. Hoping for your reply!

-Govern, with respect

brandontrabucco · 2023-08-03T01:10:03Z

Hi again,

You got it! Though in downstream applications, the model leaking internet images is not usually a big issue. We evaluated the model in our paper this way because we are interested in studying how the model generalizes to novel classes.

-Brandon

leangovern · 2023-08-03T11:54:54Z

Thank you, Brandon, your work is so inspiring!

JiaojiaoYe1994 · 2024-06-25T16:42:19Z

Thank you for your work! I am curious whether there is a quantitative evaluation of the synthetic images in this work?

brandontrabucco self-assigned this Jul 27, 2023
