This repository contains the scripts that are used to generate the art found on @bmetaldiffusion
Black metal diffusion is using Llama 2 and Stable Diffusion XL to generate images. A server with the following minimum specs is required.
- OS: Ubuntu server 22.04.1 LTS
- CPU: Intel Core i3-6100
- RAM: 16 GB
- GPU: Nvidia 3060 12 GB
- Disk: at least 50 GB of free space to store the models
You have to manually download the Llama 2 model from TheBloke. Create a folder called models
and download in there https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGUF/resolve/main/llama-2-7b-chat.Q4_K_M.gguf
The scripts require Python >=3.10. It is recommented to create a virtual environment before you install the requirements.
First you need to install llama-cpp-python
with GPU support enabled.
CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python
The rest of the requirements can be installed using the requirements file.
pip install -r requirements.txt
Start the application using the bmd.py
script.
Use the generate-prompt
subcommand to generate a prompt. This subcommand will use
Llama 2 to convert a song verse into a stable diffusion prompt. Use only a single
verse, instead of the whole song lyrics, for best results.
python bmd.py generate-prompt
Use the generate-image
subcommand to generate an image. This subcommand will use the
prompt generated by the previous script and Stable Diffusion XL to generate an image.
python bmd.py generate-image
If you want to use the Stable Diffusion XL refiner add the --refiner
command line argument.
python bmd.py generate-image --refiner
Keep in mind that the first time you run the scripts, they will download from Hugging Face the Stable Diffusion XL and Stable Diffusion XL refiner models.
- if your system doesn't have the latest cuda drivers you will have to download
manually a version of
torch
that matches the drivers of your system.
pip install torch==2.0.1+cu117 --extra-index-url https://download.pytorch.org/whl/cu117