We propose a framework that synthesizes artistic landscape sketches using a diffusion model-based approach. Furthermore, we suggest a three-channel perspective map (3CPM) that mimics the artistic skill used by real artists. We build our framework on Stable Diffusion and employ ControlNet to feed the 3CPM into Stable Diffusion as a conditioning input. Additionally, we adopt the Low-Rank Adaptation (LoRA) method to fine-tune our framework, which enhances sketch quality and resolves the color-remaining problem, an artifact frequently observed in sketch images generated by diffusion models. We implement a bimodal sketch-generation interface: text to sketch and image to sketch. In both cases, a guide token is used so that our method synthesizes an artistic sketch. Finally, we evaluate our framework with qualitative and quantitative schemes. The various sketch images synthesized by our framework demonstrate the effectiveness of our approach.
Clone the AUTOMATIC1111 Stable Diffusion web UI:

```bash
git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui.git
```

Alternatively, you can follow the installation instructions in the stable-diffusion-webui repository.
Download the weights (about 10GB+) from the download link and place each file in the corresponding directory:

- Stable Diffusion weight: `<git repo path>/stable-diffusion-webui/models/Stable-diffusion`
- LoRA weight: `<git repo path>/stable-diffusion-webui/models/Lora`
- ControlNet (3CPM) weight: `<git repo path>/stable-diffusion-webui/models/ControlNet`
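For reference, a minimal placement sketch is shown below; the weight filenames are placeholders (the actual names come from the download link), and `<git repo path>` should be replaced with the location of your clone:

```bash
# Placeholder filenames -- substitute the actual files obtained from the download link.
WEBUI="<git repo path>/stable-diffusion-webui"
cp base-model.safetensors      "$WEBUI/models/Stable-diffusion/"
cp sketch-lora.safetensors     "$WEBUI/models/Lora/"
cp 3cpm-controlnet.safetensors "$WEBUI/models/ControlNet/"
```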
Launch the web UI with the API enabled:

```bash
cd stable-diffusion-webui
bash webui.sh --api
```
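Once the server is up, the web UI API can be exercised directly. Below is a minimal sketch assuming the default port 7860; `GUIDE_TOKEN` and `sketch-lora` are placeholders for the guide token and LoRA weight shipped with this repo, and conditioning on a 3CPM additionally goes through the ControlNet extension's request payload (not shown):

```bash
# Minimal txt2img request against the webui API (default port 7860).
# GUIDE_TOKEN and sketch-lora are placeholders, not the repo's actual names.
curl -s http://127.0.0.1:7860/sdapi/v1/txt2img \
  -H "Content-Type: application/json" \
  -d '{
        "prompt": "GUIDE_TOKEN a mountain landscape, <lora:sketch-lora:0.8>",
        "steps": 20,
        "width": 512,
        "height": 512
      }' \
  | python -c "import sys, json, base64; open('out.png', 'wb').write(base64.b64decode(json.load(sys.stdin)['images'][0]))"
```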
With the web UI running, start the sketch-generation interface from the repository root:

```bash
cd <git repo>
python main.py
```