# Prompt generator

Parses prompt descriptions from images, and can extend those descriptions to regenerate new images. Supports Chinese input by extending the prompt description through ChatGLM.
## ✅ Models used in this project

All models are lazy-loaded: they are downloaded and loaded only when used, and do not occupy video memory otherwise.
- image-to-text
- text-to-text
- Chinese extension: ChatGLM-6B
- translation
🚩 This project runs independently and is not integrated into automatic1111/webui, so it can be shut down at any time to free video memory.
- Online demo: Hugging Face demo
- The image-to-text features require GPU deployment
- Some models run on CPU (translation, text-to-text) to prevent GPU out-of-memory errors
- Supports two prompt generation styles: stable diffusion and midjourney
- Uses ChatGLM-6B-int4 to save video memory
The ChatGLM model must be downloaded separately (use the int4 version) and placed under the program's `models` directory.

- v1.0 extraction code: 79sk
- v1.5 extraction code: eb33
- v1.8 extraction code: 7hbt
- offline model extraction code: 6ti4
- `webui.bat`: main features
- `webui_chat.bat`: main features + chatGLM chat interface
- `webui_imagetools.bat`: image processing tools
- `webui_offline.bat`: offline mode
  - edit the model paths in `settings.offline.toml`
  - `git clone` the models into the `models` directory (they cannot be copied directly from the cache)
- `webui_venv.bat`: use this if you installed the venv environment manually; defaults to the `venv` directory
- The first run automatically downloads the models into `.cache/huggingface` in the user directory
```shell
cd image2text_prompt_generator
git pull
```

Alternatively, download the zip from github and unpack it over the program directory.
## Usage
- microsoft: generates a simple description (stable diffusion)
- mj: generates a random description (midjourney)
- gpt2 650k and gpt_neo_125M: generate more complex descriptions
- Chinese-to-English translation
- Chinese input is expanded into a complex description through ChatGLM-6B-int4
- then translated into English
- model generation is optimized through the prompt
- clip: for multiple people and complex scenes; high video memory usage (>8 GB)
- blip: for simple characters and scenes
- wd14: for characters
- prompt generation automatically merges the blip or clip output with the wd14 tags
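The merge step can be pictured as simple tag deduplication. This is an illustrative sketch only, not the project's actual merging code:

```python
def merge_prompt(caption: str, wd14_tags: list[str]) -> str:
    """Append WD14 tags to a blip/clip caption, skipping tags already present.

    Illustrative sketch; the project's real merge logic may differ.
    """
    seen = {part.strip().lower() for part in caption.split(",")}
    extra = [t for t in wd14_tags if t.strip().lower() not in seen]
    return ", ".join([caption.strip()] + extra)

print(merge_prompt("1girl, smile", ["1girl", "long_hair"]))  # 1girl, smile, long_hair
```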
- batch background removal
- face pasting (for refining clothes)
- head cutout
- batch rename (regex)
- tagging (Clip + WD14 tagging and translation)
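As an illustration of what regex-based batch renaming does, here is a hypothetical helper (not the tool's own code) that maps old names to new ones:

```python
import re

def batch_rename(names, pattern, repl):
    """Map each file name to its regex-renamed form (sketch of the idea only)."""
    return {name: re.sub(pattern, repl, name) for name in names}

# e.g. zero-pad sequential names: img_1.png -> img_001.png
renamed = batch_rename(
    ["img_1.png", "img_12.png"],
    r"img_(\d+)",
    lambda m: f"img_{int(m.group(1)):03d}",
)
print(renamed["img_1.png"])  # img_001.png
```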
| Quantization level | Minimum GPU memory (inference) | Minimum GPU memory (parameter-efficient fine-tuning) |
|---|---|---|
| FP16 (no quantization) | 13 GB | 14 GB |
| INT8 | 8 GB | 9 GB |
| INT4 | 6 GB | 7 GB |
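The thresholds in the table can be turned into a small helper that picks the highest-precision ChatGLM variant your GPU can run. A sketch using only the inference column:

```python
# Minimum inference VRAM in GB per ChatGLM-6B quantization level,
# taken from the table above (highest precision first).
REQUIREMENTS = [("FP16", 13), ("INT8", 8), ("INT4", 6)]

def pick_quantization(vram_gb: float):
    """Return the highest-precision variant that fits in vram_gb, or None."""
    for name, need in REQUIREMENTS:
        if vram_gb >= need:
            return name
    return None

print(pick_quantization(8))  # INT8: enough for INT8 inference, not FP16
```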
Adapted from the chatGPTBox project, with some of the prompts modified.

- start it with `api.bat`
- configure the chatGPTBox plugin's custom model to http://localhost:8000/chat/completions
- download the plugin from the release page
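The endpoint can then be called like any OpenAI-style chat API. A minimal sketch using only the standard library; the request and response field names are an assumption based on the chat/completions convention, so adjust them to what the server actually returns:

```python
import json
import urllib.request

API_URL = "http://localhost:8000/chat/completions"  # the endpoint from the text above

def build_payload(prompt: str) -> dict:
    # OpenAI-style message list (assumed shape).
    return {"messages": [{"role": "user", "content": prompt}]}

def ask(prompt: str) -> str:
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        # Assumed OpenAI-style response shape.
        return json.load(resp)["choices"][0]["message"]["content"]
```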
- Without cuda support, clip is not recommended
- With less than 6 GB of video memory, ChatGLM is not recommended
## Configuration files

`settings.toml`:

```toml
[server]
port = 7869 # port
host = '127.0.0.1' # change to "0.0.0.0" for LAN access
enable_queue = true # required by the chat feature; if it errors, disable your proxy
queue_size = 10
show_api = false
debug = true

[chatglm]
model = "THUDM/chatglm-6b-int4" # THUDM/chatglm-6b-int4 THUDM/chatglm-6b-int8 THUDM/chatglm-6b
# local model
# model = "./models/chatglm-6b-int8"
device = "cuda" # cpu mps cuda
enable_chat = false # whether to enable the chat feature
local_files_only = false # whether to use only local models
```
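For reference, settings like these can be read back in Python. A sketch using `tomllib` (standard library in Python 3.11+; on the 3.10 this project targets, the third-party `tomli` package provides the same API):

```python
try:
    import tomllib  # Python 3.11+
except ModuleNotFoundError:  # Python 3.10: pip install tomli
    import tomli as tomllib

EXAMPLE = """
[server]
port = 7869
host = '127.0.0.1'

[chatglm]
device = "cuda"
enable_chat = false
"""

settings = tomllib.loads(EXAMPLE)
print(settings["server"]["port"])     # 7869
print(settings["chatglm"]["device"])  # cuda
```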
Please refer to "ChatGLM: loading the model locally". `git clone` the models into the `models` directory (do not copy them directly from the cache), then update the model paths in `settings-offline.toml`.

- On windows, it is best to use absolute paths, and they must not contain Chinese characters
- On linux/mac, relative paths are fine
- See the model directory structure reference
`settings-offline.toml`:

```toml
[generator]
enable = true # whether to enable the generator feature
device = "cuda" # cpu mps cuda
fix_sd_prompt = true # whether to fix the sd prompt

# models
microsoft_model = "./Promptist"
gpt2_650k_model = "./gpt2-650k-stable-diffusion-prompt-generator"
gpt_neo_125m_model = "./StableDiffusion-Prompt-Generator-GPT-Neo-125M"
mj_model = "./text2image-prompt-generator"
local_files_only = true # whether to use only local models

[translate]
enable = true # whether to enable the translation feature
device = "cuda" # cpu mps cuda
local_files_only = true # whether to use only local models
zh2en_model = "./models/opus-mt-zh-en"
en2zh_model = "./models/opus-mt-en-zh"
cache_dir = "./data/translate_cache" # translation cache directory

[chatglm]
# local model: https://github.com/THUDM/ChatGLM-6B#从本地加载模型
model = ".\\models\\chatglm-6b-int4" # ./chatglm-6b-int4 ./chatglm-6b-int8 ./chatglm-6b
## how to configure an absolute path on windows
# model = "E:\\zhangsan\\models\\chatglm-6b-int4"
device = "cuda" # cpu mps cuda
enable_chat = true # whether to enable the chat feature
local_files_only = true # whether to use only local models
```
To prevent the C drive from filling up, the cache directory can be moved to another disk.
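One way to do this is to point the Hugging Face cache at another drive before launching; `HF_HOME` is Hugging Face's standard cache environment variable (the paths here are placeholders):

```shell
# linux / mac
export HF_HOME=/mnt/d/hf_cache

# windows (cmd), before running webui.bat:
# set HF_HOME=D:\hf_cache
```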
## Manual installation
First, make sure your computer has Python 3.10. If you have not installed Python, go to the official site (https://www.python.org/downloads/) to download and install it.

Next, download and unzip the tool's installation package.

Open a command-line window (Windows users: press Win + R, type "cmd" in the run box and press Enter), and change into the directory containing the installation package.

Run the following commands to install the required dependencies:
```shell
git clone https://github.com/zhongpei/image2text_prompt_generator
cd image2text_prompt_generator

# create a virtual environment
python -m venv venv

# activate the environment: linux & mac
source ./venv/bin/activate
# activate the environment: windows
.\venv\Scripts\activate

# gpu acceleration
pip install torch==2.0.0+cu118 torchvision==0.15.1+cu118 --extra-index-url https://download.pytorch.org/whl/cu118
pip install --upgrade -r requirements.txt
```
This will automatically install the required Python dependencies. Once installed, you can start the tool by running:
```shell
# activate the environment: linux & mac
source ./venv/bin/activate
# activate the environment: windows
.\venv\Scripts\activate

# run the program
python app.py
```
This will launch the tool and open its home page in your browser. If the browser does not open automatically, enter http://localhost:7869/ manually.

The tool is now installed and running; follow this documentation to start processing your image data.
- v2.0 LangChain (local file question answering)
- v1.8 labeling tool
- v1.7 local tag translation cache, translation cache, API
- v1.6 image tools
- v1.5 added chatGLM model
- v1.0 added webui
- web
  - configuration file
- image2text
  - clip
  - blip
  - wd14
- text2text
  - ChatGLM
  - gpt2 650k
  - gpt_neo_125M
  - mj
- cutout tool
  - background removal
  - head cutout
  - face masking
  - batch file renaming
  - load directory tags and translate
- translate
  - f2m, f2f
  - WD14 tag translation local cache
  - translation cache
- label
  - clip + w14 mixed batch image tagging
- LangChain
  - index
  - question and answer