axzml/ChatSD

ChatSD


ChatSD is designed to make image generation tasks easy.

ChatSD is built on an LLM (Large Language Model) and a Stable Diffusion model. When you talk to ChatSD, it interprets your intent, translates it into an appropriate prompt, and passes that prompt to the Stable Diffusion model to generate images.
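The flow described above (instruction, then prompt, then image) can be sketched as follows. This is a minimal illustration, not ChatSD's actual code: `chat_to_image`, `toy_llm`, and `toy_diffusion` are hypothetical names, and the two toy functions are stand-ins for the LLM and the diffusion model so that the sketch runs without downloading anything.

```python
from typing import Callable, List


def chat_to_image(
    user_input: str,
    llm: Callable[[str], str],
    diffusion: Callable[[str, int], List[str]],
    num_images: int = 1,
) -> List[str]:
    """Turn a chat-style instruction into images via an intermediate prompt."""
    # Step 1: the language model rewrites the instruction as a prompt.
    prompt = llm(user_input)
    # Step 2: the diffusion model renders that prompt.
    return diffusion(prompt, num_images)


def toy_llm(instruction: str) -> str:
    # Stand-in (assumption): a real LLM would infer intent; this only
    # strips a fixed polite wrapper and appends a style keyword.
    subject = instruction.replace("Generate an image of ", "").replace(" for me", "")
    return f"{subject}, highly detailed"


def toy_diffusion(prompt: str, num_images: int) -> List[str]:
    # Stand-in (assumption): a real model would return image objects;
    # this returns text labels so the data flow stays visible.
    return [f"<image {i} for '{prompt}'>" for i in range(num_images)]


images = chat_to_image(
    "Generate an image of cat for me", toy_llm, toy_diffusion, num_images=2
)
print(images)
```

Passing the two models in as callables keeps the orchestration independent of any particular LLM or diffusion backend, which fits the stated goal of supporting more models later.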

ChatSD currently uses ChatGLM-6B and Openjourney; it may support more LLMs and diffusion models in the future. (Note: this is a project for me to better understand LLMs, diffusion models, and LangChain.)

Quick Start

  1. Clone the project and go to the project workspace:
# clone the project
git clone ....

# go to directory
cd ChatSD
  2. Create a conda environment named chatsd and activate it:
# create an environment named `chatsd` and activate it
conda env create -f environment.yaml
conda activate chatsd

Note: if you want to remove the environment, execute:

conda deactivate
conda remove -n chatsd --all
  3. Install the CUDA version of torch:

Refer to https://pytorch.org/ and execute:

# CUDA version 11.8
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
  4. Run the main.py script:
python main.py

Note: running the script downloads pretrained models from Hugging Face. If the download is interrupted by an unstable network, re-run the script as many times as needed to continue downloading the models.
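Re-running by hand works, but the retry can also be automated. The helper below is a sketch under assumptions, not part of ChatSD: `run_with_retries` is a hypothetical name, and it simply re-invokes a command until it exits with status 0 or the attempt limit is reached.

```python
import subprocess
import sys
import time
from typing import List


def run_with_retries(cmd: List[str], max_attempts: int = 5, delay_s: float = 0.0) -> int:
    """Re-run `cmd` until it succeeds; return the number of attempts used.

    Raises RuntimeError if every attempt exits with a non-zero status.
    """
    for attempt in range(1, max_attempts + 1):
        result = subprocess.run(cmd)
        if result.returncode == 0:
            return attempt
        time.sleep(delay_s)  # pause before the next attempt
    raise RuntimeError(f"command failed after {max_attempts} attempts: {cmd}")


# Demo with a command that always succeeds. For ChatSD one would pass
# [sys.executable, "main.py"] from the project directory instead.
attempts = run_with_retries([sys.executable, "-c", "pass"])
print(attempts)
```

A non-zero `delay_s` gives a flaky connection a moment to recover between attempts.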

To pass your own instructions to ChatSD, execute:

python main.py --input "Generate an image of cat for me" --grid_rows 2 --grid_cols 2 --image_output_dir "images"
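The source of main.py is not shown here, so the following is only a guess at how these four flags might be declared with `argparse`. The flag names come from the command above; the defaults, help strings, and the `build_parser` name are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the four flags used in the command above (illustrative only)."""
    parser = argparse.ArgumentParser(description="ChatSD-style CLI (sketch)")
    parser.add_argument("--input", type=str, default=None,
                        help="instruction text for the assistant")
    # The defaults below are assumptions, not ChatSD's documented defaults.
    parser.add_argument("--grid_rows", type=int, default=1,
                        help="number of rows in the output grid")
    parser.add_argument("--grid_cols", type=int, default=1,
                        help="number of columns in the output grid")
    parser.add_argument("--image_output_dir", type=str, default="images",
                        help="directory where images are written")
    return parser


# Parse the example command line; one flag uses the `=` form on purpose.
args = build_parser().parse_args([
    "--input", "Generate an image of cat for me",
    "--grid_rows", "2",
    "--grid_cols=2",
    "--image_output_dir", "images",
])
print(args.input, args.grid_rows, args.grid_cols, args.image_output_dir)
```

`argparse` accepts both `--grid_rows 2` and `--grid_rows=2`, so either spelling works on the command line if the script parses its flags this way.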

Examples

I wanted to generate a logo for this project, so I ran the following command four times:

python main.py --input "logo of cat, cute, happy, smile" --grid_rows=3 --grid_cols=3

and the results are:

[Image: cat_logo]
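The README does not say how the grid is assembled. As one plausible reading, assuming each run yields `grid_rows × grid_cols` images tiled row by row, the sketch below works out how many candidate images the four runs produce and where each tile would sit on the canvas. `grid_offsets`, the 512-pixel tile size, and the row-major layout are all assumptions for illustration.

```python
from typing import List, Tuple


def grid_offsets(rows: int, cols: int, tile_w: int, tile_h: int) -> List[Tuple[int, int]]:
    """Top-left (x, y) pixel offset of each tile, in row-major order."""
    return [(c * tile_w, r * tile_h) for r in range(rows) for c in range(cols)]


rows, cols, runs = 3, 3, 4
images_per_run = rows * cols          # 9, assuming one image per grid cell
total_images = runs * images_per_run  # 36 candidates across the four runs

tile = 512  # assumed tile size in pixels
offsets = grid_offsets(rows, cols, tile_w=tile, tile_h=tile)
canvas_size = (cols * tile, rows * tile)
print(total_images, canvas_size, offsets[:4])
```

Under these assumptions, each offset is the position at which an imaging library would paste the corresponding tile onto a canvas of size `canvas_size`.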

Acknowledgement

I am grateful to the following open-source projects. Thanks to all the developers; your efforts make the world a better place:

visual-chatgpt, Hugging Face, LangChain, Stable Diffusion, ChatGLM-6B, clip-interrogator, text2image-prompt-generator, prompt-generator, openjourney
