# QRt: QR Code Art Generator

This project is a part of the elective course on Generative Artificial Intelligence at Innopolis University.

It is a collaborative effort by our team members:

- Polina Zelenskaya, p.zelenskaya@innopolis.university
- Leila Khaertdinova, l.khaertdinova@innopolis.university
- Karina Denisova, k.denisova@innopolis.university
---
This notebook is the evaluation part of our pipeline. The code below was designed to run in a CUDA-compatible environment with at least 16 GB of available memory on Ubuntu. For that reason, Google Colab with the `T4 GPU` runtime works well.

In [None]:
!sudo apt-get install zbar-tools
!pip install -q git+https://github.com/huggingface/diffusers accelerate transformers==4.30.0 qrcode pyzbar

Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
zbar-tools is already the newest version (0.23.92-4build2).
0 upgraded, 0 newly installed, 0 to remove and 10 not upgraded.

In [None]:
import qrcode
from PIL import Image
from pyzbar.pyzbar import decode
import torch
import transformers
from diffusers import UniPCMultistepScheduler
from diffusers import DPMSolverMultistepScheduler
from diffusers import ControlNetModel
from diffusers import StableDiffusionControlNetPipeline

## Create a simple QR-code generator for a text/link

In [None]:
class BasicQR:
    """
    Main class to read/generate basic (target) QR-codes that are readable by any device
    """

    @staticmethod
    def generate(text: str, box_size: int = 10, border: int = 4) -> Image.Image:
        """
        Generates a valid QR code for the given `text`
            > QR version is determined automatically
            > Error correction is set to the lowest level (for smaller image sizes)
            > Black/white colors are used
        """

        # define generator
        qr = qrcode.QRCode(
            error_correction=qrcode.constants.ERROR_CORRECT_L,
            box_size=box_size,
            border=border
        )

        # add text and determine qr version
        qr.add_data(text)
        qr.make(fit=True)  # automatically determine qr version

        # generate and convert to PIL.Image
        return qr.make_image().convert('RGB')

    @staticmethod
    def read(img: Image.Image, print_info: bool = False) -> "str | int":
        """
        Reads text data from the given QR code
            > Relies on zbar-tools [install them using apt-get]
            > If several QR codes are found, returns only the first one (this functionality is enough for this task)
            > The reader is robust enough that even Stable Diffusion-generated ones work nicely
            > If no valid QR code is found, returns -1
        """

        # decode image
        decoded = decode(img)

        # if nothing found, print warning and return `-1`
        if not decoded:
            if print_info:
                print('Failed to find any QR')
            return -1
        else:
            if print_info:
                print('The QR for:')
            return decoded[0].data.decode("utf-8")
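
Because the QR version is chosen automatically, the size of the target image depends on how much text is encoded. The sketch below relates the version to the image size, assuming the standard QR layout of `17 + 4 * version` modules per side and the `box_size` / `border` defaults used in `BasicQR.generate`. The function names are illustrative and are not part of the notebook or of the `qrcode` library.

```python
def qr_modules_per_side(version: int) -> int:
    """Number of modules along one side of a QR symbol of the given version (1..40)."""
    if not 1 <= version <= 40:
        raise ValueError("QR versions range from 1 to 40")
    return 17 + 4 * version


def qr_image_side_px(version: int, box_size: int = 10, border: int = 4) -> int:
    """Side length in pixels, counting the quiet zone (border) on both sides."""
    return (qr_modules_per_side(version) + 2 * border) * box_size


# Print version, modules per side, and pixel size for a few small versions.
for v in (1, 3, 5):
    print(v, qr_modules_per_side(v), qr_image_side_px(v))
```

With the defaults, short prompts produce target images of a few hundred pixels per side, which is smaller than the 768x768 output requested later in `generate`.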

## Define the models

We use a Stable Diffusion model together with two ControlNet models, loaded from the following checkpoints:

1. ControlNet brightness: `ioclab/control_v1p_sd15_brightness`
2. ControlNet tile: `lllyasviel/control_v11f1e_sd15_tile`
3. Stable Diffusion: `SG161222/Realistic_Vision_V2.0`

In [None]:
class StableDiffusionWithControlNet:
    """
    Stable diffusion model with control net for QR-code generation
    """

    def __init__(
        self,
        device: str = 'cuda',
        brightness: str = "ioclab/control_v1p_sd15_brightness",
        tile: str = "lllyasviel/control_v11f1e_sd15_tile",
        stable_diffusion_cp: str = "SG161222/Realistic_Vision_V2.0"
    ):

        # load controlnet models
        self.controlnet_brightness = ControlNetModel.from_pretrained(brightness)
        self.controlnet_tile = ControlNetModel.from_pretrained(tile)

        # load stable diffusion
        self.stable_diffusion = StableDiffusionControlNetPipeline.from_pretrained(
            stable_diffusion_cp,
            controlnet=[self.controlnet_brightness, self.controlnet_tile]
            ).to(device)

        # define scheduler
        self.stable_diffusion.scheduler = DPMSolverMultistepScheduler.from_config(
            self.stable_diffusion.scheduler.config, use_karras_sigmas=True
        )

    def generate(
        self,
        prompt: str,
        qr_text: str,
        width: int = 768,
        height: int = 768,
        num_images_per_prompt: int = 1,
        num_inference_steps: int = 40
    ):
        """
        Generates a QR code based on prompt (given to Stable Diffusion) and qr_text (fed to the basic QR generator)
            > prompt - text that describes image style
            > qr_text - text that should be stored in qr
            > width - width of output image (larger sizes work better)
            > height - height of output image (larger sizes work better)
            > num_images_per_prompt - number of output images
            > num_inference_steps - number of inference steps (preferably 30, but 50+ provides better accuracy)
        """

        # define weights and guidance
        controlnets_weights = [0.35, 0.6]
        guidance_starts = [0, 0.3]
        guidance_stops = [1, 0.7]

        # generate target qr image
        qr_img = BasicQR.generate(qr_text)

        # generate images
        results = self.stable_diffusion(
            prompt,
            image=[qr_img, qr_img],
            num_inference_steps=num_inference_steps,
            width=width, height=height,
            num_images_per_prompt=num_images_per_prompt,
            control_guidance_start=guidance_starts,
            control_guidance_end=guidance_stops,
            controlnet_conditioning_scale=controlnets_weights
        )

        img = results.images[0]

        return img


model = StableDiffusionWithControlNet('cuda')
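
The guidance windows in `generate` mean that the two ControlNets are not applied for the same number of denoising steps. The sketch below illustrates this, assuming a ControlNet is applied at step `i` of `N` only when `i / N >= start` and `(i + 1) / N <= end`. This is our reading of how `diffusers` handles `control_guidance_start` and `control_guidance_end`, and the exact rule may differ between versions. The helper `active_steps` is hypothetical and is not part of the pipeline.

```python
def active_steps(start: float, end: float, num_steps: int) -> list[int]:
    """Step indices at which a ControlNet with the given guidance window is applied."""
    return [
        i for i in range(num_steps)
        if i / num_steps >= start and (i + 1) / num_steps <= end
    ]


num_inference_steps = 40
# Same values as guidance_starts / guidance_stops / controlnets_weights in generate().
windows = {"brightness": (0.0, 1.0), "tile": (0.3, 0.7)}
weights = {"brightness": 0.35, "tile": 0.6}

for name, (start, end) in windows.items():
    steps = active_steps(start, end, num_inference_steps)
    print(f"{name}: weight {weights[name]}, {len(steps)} steps, {steps[0]}..{steps[-1]}")
```

Under this assumption, the brightness ControlNet guides all 40 steps with a lower weight, while the tile ControlNet acts with a higher weight only during steps 12 to 27.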



## Read all prompts

We provide our own evaluation prompts, partly written by ourselves and partly generated with phind.com: a list of 100 different nature-related prompts that you can download from [here](https://drive.google.com/file/d/19Ris16qbeu_JQ-965h2QoOfGMy8Zhgbu/view?usp=sharing). Please make sure you save the file to `sample_data/`, or change the `prompts_path` variable, for the code below to work.

In [None]:
!gdown "https://drive.google.com/uc?id=19Ris16qbeu_JQ-965h2QoOfGMy8Zhgbu" -O sample_data/qr-prompts.txt

Downloading...
From: https://drive.google.com/uc?id=19Ris16qbeu_JQ-965h2QoOfGMy8Zhgbu
To: /content/sample_data/qr-prompts.txt
  0% 0.00/3.44k [00:00<?, ?B/s]100% 3.44k/3.44k [00:00<00:00, 14.8MB/s]


In [None]:
import os

prompts_path = os.path.join(os.getcwd(), 'sample_data', 'qr-prompts.txt')
with open(prompts_path, 'r', encoding='UTF-8') as file:
    prompts = list(map(str.strip, file.readlines()))

prompts[:10]

['A cute fox in a bamboo forest',
 'A majestic lion in the savannah',
 'A playful dolphin in the ocean',
 'A serene deer in a meadow',
 'A colorful butterfly in a garden',
 'A majestic eagle soaring in the sky',
 'A peaceful river flowing through a lush green valley',
 'A vibrant sunset over a mountain range',
 'A snowy mountain peak under a clear blue sky',
 'A bustling city at night with skyscrapers']
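
Since the loader keeps every line of the file, including empty ones, a quick sanity check of the prompt list can be useful before spending GPU time on it. The following is a minimal sketch, not part of the original notebook. `check_prompts` is a hypothetical helper that reports blank entries and repeated prompts.

```python
from collections import Counter


def check_prompts(prompts: list[str]) -> dict:
    """Report blank lines and repeated prompts in a list of prompt strings."""
    non_blank = [p for p in prompts if p]
    counts = Counter(non_blank)
    return {
        "total": len(prompts),
        "blank": len(prompts) - len(non_blank),
        "duplicates": {p: n for p, n in counts.items() if n > 1},
    }


# Small illustrative sample; in the notebook you would pass the loaded `prompts` list.
sample = [
    "A cute fox in a bamboo forest",
    "",
    "A quiet forest at dawn",
    "A quiet forest at dawn",
]
print(check_prompts(sample))
```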

## Evaluation

Evaluate each prompt. For greater diversity, each prompt is also used as the QR payload: we generate a QR code that encodes the prompt text, and then make art from that same QR code.

In [None]:
images = []
for prompt in prompts:
    images.append(model.generate(prompt=prompt, qr_text=prompt))


In [None]:
correct = 0
total = 0

for img, prompt in zip(images, prompts):
    img = img.resize((128,128))
    transcribed = str(BasicQR.read(img))

    if transcribed == '-1':
        print('Failed to identify prompt:', prompt)

    total += 1
    correct += int(prompt.strip() == transcribed.strip())


print('Total number of samples:', total)
print('Correct samples:', correct)
print('Accuracy of readable results:', correct / total)

Failed to identify prompt: A majestic mountain peak covered in snow
Failed to identify prompt: A quiet forest at dawn
Total number of samples: 100
Correct samples: 98
Accuracy of readable results: 0.98


Our approach achieves an **accuracy** of **0.98**: 98 of the 100 generated images decode back to the exact prompt text, and only 2 could not be read at all.
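
The evaluation loop above treats an unreadable code and a wrongly decoded one the same way. A minimal sketch of a summary that separates the two cases is shown below. It assumes the reader returns the integer `-1` for unreadable images, as `BasicQR.read` does. The `summarize` function is hypothetical and is not part of the notebook.

```python
def summarize(expected: list[str], decoded: list) -> dict:
    """Split results into exact matches, unreadable codes, and wrong decodes."""
    exact, unreadable, mismatched = 0, [], []
    for want, got in zip(expected, decoded):
        if got == -1:
            # the reader found no QR code in the image
            unreadable.append(want)
        elif str(got).strip() == want.strip():
            exact += 1
        else:
            # a QR code was found, but it carries different text
            mismatched.append((want, got))
    total = exact + len(unreadable) + len(mismatched)
    return {
        "total": total,
        "accuracy": exact / total if total else 0.0,
        "unreadable": unreadable,
        "mismatched": mismatched,
    }


# Toy data for illustration only; these are not results from the notebook.
report = summarize(["a", "b", "c", "d"], ["a", -1, "x", "d "])
print(report)
```

Checking for the integer sentinel before converting to a string also avoids confusing an unreadable image with a QR code whose text happens to be `-1`.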

In [None]:
import os
import hashlib


def md5_hash(img):
    return hashlib.md5(img.tobytes()).hexdigest()


os.makedirs('generated/', exist_ok=True)

for (img, prompt) in zip(images, prompts):
    img.save(f'generated/{prompt}-{md5_hash(img)}.png')
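
The loop above uses each prompt verbatim as part of the file name, which works for the current prompts but would break for text containing characters such as `/`. The sketch below shows one possible way to build safer names. `safe_filename` and its `max_len` parameter are hypothetical, and in the notebook the `payload` argument would be `img.tobytes()`.

```python
import hashlib
import re


def safe_filename(prompt: str, payload: bytes, max_len: int = 60) -> str:
    """Build a filesystem-friendly PNG name from a prompt and the image bytes."""
    # collapse every run of non-alphanumeric characters into a single dash
    slug = re.sub(r"[^A-Za-z0-9]+", "-", prompt).strip("-").lower()
    slug = slug[:max_len].strip("-")
    # short content hash keeps names unique when prompts repeat
    digest = hashlib.md5(payload).hexdigest()[:8]
    return f"{slug or 'image'}-{digest}.png"


name = safe_filename("A bustling city / street at night!", b"abc")
print(name)
```

Keeping the hash suffix preserves the behaviour of the original loop, where repeated prompts still produce distinct files.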

In [None]:
!zip -r /content/generated-qr-codes-100.zip /content/generated

  adding: content/generated/ (stored 0%)
  adding: content/generated/A vibrant city skyline at dawn-54ffe2e595606b645e4557bba6bb1e2a.png (deflated 0%)
  adding: content/generated/A serene forest filled with trees-85284b15a4704388a94a5b9335043bc4.png (deflated 0%)
  adding: content/generated/A bustling city street at night-f9b8d4baa3a6de05066fc0ff6e3f97df.png (deflated 0%)
  adding: content/generated/A peaceful forest in the evening-ce59b615d3301b480459acada4dbbfd5.png (deflated 0%)
  adding: content/generated/A serene forest filled with trees-690bebe0690dda5b03d118244572287d.png (deflated 0%)
  adding: content/generated/A serene forest filled with animals-a0558fe549ac87c32de02c150d4b8349.png (deflated 0%)
  adding: content/generated/A bustling city street at morning-e9f510d94be9ba339a6e5805999459fc.png (deflated 0%)
  adding: content/generated/A beautiful sunrise over a serene lake-bf315b8631efcdab9368d3c2c75061c3.png (deflated 0%)
  adding: content/generated/A colorful butterfly in a 