TTL text2img

Text-to-image generation with diffusion models works by iteratively refining an image from noise, guided by a textual description. Each denoising step improves image quality while keeping the result coherent with the text, yielding visually appealing, contextually relevant images that faithfully reflect the prompt.
Demo on Hugging Face: TTL-text2img
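The iterative-refinement idea can be illustrated with a toy, unconditional sketch (the real model replaces the denoiser stand-in below with a learned network conditioned on the text prompt; names and the shrink-toward-zero "denoiser" are illustrative assumptions, not the repo's API):

```python
import random

def toy_reverse_diffusion(n=8, steps=50, seed=0):
    """Toy reverse-diffusion loop: start from noise, refine step by step."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]  # start from pure noise
    for t in range(steps, 0, -1):
        frac = t / steps
        # stand-in for the learned denoiser: shrink toward zero
        predicted_clean = [v * (1 - frac) for v in x]
        # re-inject noise at every step except the last
        noise = [rng.gauss(0, 1) if t > 1 else 0.0 for _ in range(n)]
        x = [c + frac * z for c, z in zip(predicted_clean, noise)]
    return x

sample = toy_reverse_diffusion()
```

Each iteration blends the current sample toward the denoiser's prediction while gradually reducing the amount of re-injected noise, which is the core loop a conditioned diffusion model also follows.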

Installation

Instructions for installing and setting up TTL-text2img:

git clone https://github.com/TranafLee/TTL-text2img.git
cd TTL-text2img/
# create a virtual environment to keep global install clean
python -m venv .venv 
source .venv/bin/activate
# install all necessary packages
pip install -r requirements.txt

Usage

Stage 1: Fine-tune the base model

Your dataset folder must be located at ./data. Images and their caption text files must live together in that folder: if an image is named 001.jpg, its corresponding caption file must be named 001.txt, and so on.

python train.py \
  --data_dir './data' \
  --train_upsample False \
  --project_name 'base_tuning_wandb' \
  --batch_size 4 \
  --learning_rate 1e-04 \
  --side_x 64 \
  --side_y 64 \
  --resize_ratio 1.0 \
  --uncond_p 0.2 \
  --resume_ckpt 'ckpt_to_resume_from.pt' \
  --checkpoints_dir 'my_local_checkpoint_directory'

Stage 2: Fine-tune the super-resolution model

python train.py \
  --data_dir '/userdir/data/mscoco' \
  --train_upsample True \
  --image_to_upsample './images/low_res_img.png' \
  --upscale_factor 4 \
  --side_x 64 \
  --side_y 64 \
  --uncond_p 0.0 \
  --resume_ckpt 'ckpt_to_resume_from.pt' \
  --checkpoints_dir 'my_local_checkpoint_directory'
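With --upscale_factor 4, the super-resolution stage enlarges the base model's output by 4× in each dimension. A toy nearest-neighbor upscaler illustrates the shape arithmetic (the learned model does this with diffusion; this sketch only shows how the factor affects dimensions):

```python
def nearest_neighbor_upscale(img, factor=4):
    """Repeat each pixel `factor` times horizontally and vertically."""
    out = []
    for row in img:
        wide = [px for px in row for _ in range(factor)]
        # copy the widened row `factor` times to avoid aliasing
        out.extend([list(wide) for _ in range(factor)])
    return out

low = [[0, 1], [2, 3]]          # a 2x2 "image"
high = nearest_neighbor_upscale(low, factor=4)  # becomes 8x8
```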

Full Usage

usage: train.py [-h] 
                [--data_dir DATA_DIR] 
                [--batch_size BATCH_SIZE]
                [--learning_rate LEARNING_RATE]
                [--adam_weight_decay ADAM_WEIGHT_DECAY] 
                [--side_x SIDE_X]
                [--side_y SIDE_Y] 
                [--resize_ratio RESIZE_RATIO]
                [--uncond_p UNCOND_P] 
                [--train_upsample]
                [--resume_ckpt RESUME_CKPT]
                [--checkpoints_dir CHECKPOINTS_DIR] [--use_fp16]
                [--device DEVICE] 
                [--log_frequency LOG_FREQUENCY]
                [--freeze_transformer] 
                [--freeze_diffusion]
                [--project_name PROJECT_NAME] [--activation_checkpointing]
                [--use_captions] 
                [--epochs EPOCHS] 
                [--test_prompt TEST_PROMPT]
                [--test_batch_size TEST_BATCH_SIZE]
                [--test_guidance_scale TEST_GUIDANCE_SCALE] 
                [--use_webdataset]
                [--wds_image_key WDS_IMAGE_KEY]
                [--wds_caption_key WDS_CAPTION_KEY]
                [--wds_dataset_name WDS_DATASET_NAME] 
                [--seed SEED]
                [--cudnn_benchmark] 
                [--upscale_factor UPSCALE_FACTOR]

Reference

OpenAI/glide-text2im
OpenAI/guided-diffusion
