caption-diffusion

Requirements

To install dependencies for Blended Diffusion, run
$ pip install ftfy regex matplotlib lpips kornia opencv-python torch==1.9.0+cu111 torchvision==0.10.0+cu111 -f https://download.pytorch.org/whl/torch_stable.html
We recommend using a virtual environment such as conda.

To install dependencies for BLIP, run
$ pip install -r BLIP/requirements.txt

For automatic mask proposal, you will need a pretrained BLIP checkpoint as well as a 256x256 ImageNet-trained unconditional diffusion model.
Note that the BLIP decoder checkpoint is downloaded automatically.


Auto-mask running script example

$ python3 main.py -p 'v-neck t shirts' -i 'validation/fashion-shop/basic_crew_neck_tee.jpg' --mask_auto --output_path 'output/' --vit --mask_thresh 0.35 --style_lambda 0.01 --ot
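The `--mask_auto` flow proposes a mask by backpropagation and then binarizes it at `--mask_thresh`. A minimal sketch of that loop, with a toy quadratic objective standing in for the real BLIP caption-matching loss (`propose_mask`, `loss_fn`, and the tensor shapes here are hypothetical illustration, not the repository's API):

```python
import torch

def propose_mask(loss_fn, shape=(1, 1, 64, 64), n_iter=50, lr=0.1,
                 thresh=0.35, flip=False):
    """Optimize a soft mask by backprop, then binarize at `thresh`.

    Sketch of the --mask_auto idea: `loss_fn` scores a candidate soft
    mask (in the real pipeline this would be the BLIP caption loss).
    `n_iter`, `lr`, `thresh`, and `flip` mirror --mask_n_iter,
    --mask_lr, --mask_thresh, and --mask_flip.
    """
    logits = torch.zeros(shape, requires_grad=True)
    opt = torch.optim.Adam([logits], lr=lr)
    for _ in range(n_iter):
        opt.zero_grad()
        mask = torch.sigmoid(logits)   # keep the soft mask in [0, 1]
        loss = loss_fn(mask)
        if flip:                       # --mask_flip: maximize instead
            loss = -loss
        loss.backward()
        opt.step()
    # --mask_thresh: binarize the converged soft mask
    return (torch.sigmoid(logits) > thresh).float()

# toy objective: pull the soft mask toward a fixed random target
target = torch.rand(1, 1, 64, 64)
mask = propose_mask(lambda m: ((m - target) ** 2).mean())
```

The returned mask is strictly binary, which is why `--blur` exists as a separate post-processing option to soften its edges before blending.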

Hyperparameters

| Name | Role | Type |
| --- | --- | --- |
| mask_auto | whether to generate the mask automatically; otherwise a manual mask must be supplied via --mask | store_true |
| mask_n_iter | number of backprop iterations for mask generation | int |
| mask_lr | learning rate for mask backprop | float |
| mask_flip | whether to flip loss signs for maximum likelihood during mask proposal | store_true |
| mask_lambda | relative loss weight between pseudo caption and target caption | float |
| mask_thresh | threshold for binarizing the proposed mask | float |
| mask_base_cap | manual base caption input; the BLIP pseudo caption is not used | str |
| vit | whether to use the BLIP ViT encoder for style loss; VGG is used otherwise | store_true |
| pseudo_cap | whether to generate a pseudo caption for loss guidance (vector difference) | store_true |
| blur | whether to apply Gaussian blur to the proposed mask | store_true |
| ot | whether to use optimal transport (feature L2) instead of Gram matrix style loss | store_true |
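To make the `--ot` switch concrete, here is a minimal sketch of the two style-loss variants it toggles between: the default Gram-matrix loss versus a plain feature-space L2 distance. The function names and feature shapes are illustrative assumptions, not the repository's actual interface:

```python
import torch

def gram_style_loss(f1, f2):
    """Default style loss: L2 between Gram matrices of (B, C, H, W) features."""
    def gram(f):
        b, c, h, w = f.shape
        x = f.reshape(b, c, h * w)
        # channel-by-channel correlations, normalized by feature size
        return x @ x.transpose(1, 2) / (c * h * w)
    return ((gram(f1) - gram(f2)) ** 2).mean()

def ot_style_loss(f1, f2):
    """--ot variant: direct feature L2 instead of Gram-matrix statistics."""
    return ((f1 - f2) ** 2).mean()

# hypothetical encoder features (BLIP ViT with --vit, VGG otherwise)
f_out = torch.rand(1, 8, 16, 16)
f_ref = torch.rand(1, 8, 16, 16)
loss_gram = gram_style_loss(f_out, f_ref)
loss_ot = ot_style_loss(f_out, f_ref)
```

The Gram matrix discards spatial layout and compares only channel statistics, while the feature L2 variant also penalizes positional mismatch between the two feature maps.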
