# Creating Artistic QR Codes at Scale Using LangChain and ControlNet

## Summary
We built a tool that can generate artistic QR codes for a specific website/url with the use of [Deep Lake](https://www.activeloop.ai/), [LangChain](https://python.langchain.com/docs/get_started/introduction.html), [Stable Diffusion](https://www.activeloop.ai/resources/glossary/stable-diffusion/) and [ControlNet](https://github.com/Mikubill/sd-webui-controlnet) via [AUTOMATIC1111](https://github.com/AUTOMATIC1111/stable-diffusion-webui) and [ComfyUI](https://github.com/comfyanonymous/ComfyUI). If you want to try the code directly from our notebook, just download it from the [repository](https://github.com/efenocchi/QRCodeGenerator/blob/main/QRCode_article.ipynb).

Deep Lake is a database for AI, designed to efficiently store and search large-scale AI data, including audio, video, and embeddings from text documents; we use the latter in this article. It offers storage optimized for deep learning applications, featuring data streaming, vector search, data versioning, and seamless integration with other popular frameworks such as LangChain. LangChain is a toolkit designed to simplify the development of large language model (LLM) workflows; in our case, we focus on its ability to summarize and answer questions from large-scale documents such as web pages.

Stable Diffusion is a recent development in the field of image synthesis, with exciting potential for reducing high computational demand. It is primarily used for text-to-image generation, but it is capable of a variety of other tasks, such as image modification, inpainting, outpainting, upscaling, and image-to-image generation conditioned on text input. Meanwhile, ControlNet is a neural network architecture that adds control over these diffusion models by integrating extra conditions. These conditions include edge and line detection, human poses, image segmentation, depth maps, image styles, or simple user scribbles. By applying these techniques, it is possible to condition our output image on QR codes as well. If you are interested in more details, we recommend reading the [original ControlNet article](https://arxiv.org/abs/2302.05543).


By combining all of this, we can generate QR codes at scale that are distinctive and more likely to attract attention. There are many ways to approach this problem, and in this article we present those that we believe have the highest potential to impact advertising in the future. These are the steps that we are going to walk you through:

## Steps
1. Scraping the Content From a Website and Splitting It Into Documents
2. Saving the Documents Along With Their Embeddings to Deep Lake
3. Extracting the Most Relevant Documents
4. Creating Prompts to Generate an Image Based on Documents
    - 4.1 Custom summary prompt + LLMChain
    - 4.2 QA retrieval + LLM
5. Summarizing the Created Prompts
6. Generating Simple QR From URL and inserting custom logo
7. Generating Artistic QR Codes for Activeloop
    - 7.1. Txt2Img
        - 7.1.1 Content prompt
        - 7.1.2 Portrait prompt
        - 7.1.3 Deep Lake prompt
    - 7.2. Img2Img with logo
        - 7.2.1 Content prompt
        - 7.2.2 Portrait prompt
        - 7.2.3 Deep Lake prompt
8. Generating Artistic QR Codes for E-commerce
    - 8.1. Img2Img with logo - Tommy Hilfiger
    - 8.2. Img2Img with logo - Patagonia
9. Hands on with ComfyUI
10. Limitations of Our Approach
11. Conclusion
12. FAQs      

Before we start, we need to install requirements, import LangChain and set the following API tokens:
- Apify token - web scraping/crawling
- Activeloop token - Deep Lake Vector Store
- OpenAI token - language model

In [None]:
!pip install langchain deeplake openai qrcode apify_client tiktoken langchain-openai
!apt install libzbar0
!pip install qreader opencv-python python-dotenv

In [None]:
import cv2
import numpy as np
import sys
import time

In [None]:
# Install dependencies
## pip install openai langchain deeplake apify-client tiktoken pydantic==1.10.8

# Import libraries
from langchain_community.vectorstores import DeepLake
from langchain_openai import OpenAIEmbeddings
from langchain_community.utilities import ApifyWrapper
from langchain.text_splitter import CharacterTextSplitter
#from langchain.document_loaders.base import Document
from langchain.docstore.document import Document
from langchain.chains import RetrievalQA
from langchain_openai import OpenAI
from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate
import os
from dotenv import load_dotenv

load_dotenv()

# Set API tokens
os.environ['OPENAI_API_KEY'] = os.getenv('OPENAI_API_KEY')
os.environ['ACTIVELOOP_TOKEN'] = os.getenv('ACTIVELOOP_TOKEN')
os.environ["APIFY_API_TOKEN"] = os.getenv('APIFY_API_TOKEN')

### Step 1: Scraping the Content From a Website and Splitting It Into Documents
First of all, we need to collect the data that will serve as the content for generating QR codes. Since the goal is to personalize the code to a specific website, we provide a simple pipeline that can crawl data from a given URL. As an example, we use https://www.activeloop.ai/, from which we scraped 20 pages, but you could use any other website as long as doing so does not violate its Terms of Use. If you wish to use another type of content, LangChain provides many other [File loaders](https://js.langchain.com/docs/modules/indexes/document_loaders/examples/file_loaders/) and [Website loaders](https://js.langchain.com/docs/modules/indexes/document_loaders/examples/web_loaders/), and you can personalize QR codes for them too!


In [None]:
# We use crawler from ApifyWrapper(), which is available in Langchain
# For convenience, we set 20 maximum pages to crawl with a timeout of 300 seconds.
apify = ApifyWrapper()
loader = apify.call_actor(
    actor_id="apify/website-content-crawler",
    run_input={"startUrls": [{"url": "https://www.activeloop.ai/"}], "maxCrawlPages": 20},
    dataset_mapping_function=lambda item: Document(
        page_content=item["text"] or "", metadata={"source": item["url"]}
    ),
    timeout_secs=300,
)

# Now the pages are loaded and split into chunks with a maximum size of 1000 characters
# (CharacterTextSplitter measures chunk_size in characters, not tokens)
pages = loader.load()
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0, separator = ".")
docs = text_splitter.split_documents(pages)

In [None]:
docs
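
To make the splitting step more concrete, here is a minimal pure-Python sketch of separator-based chunking. It is a simplified illustration and not LangChain's implementation; the function name `split_on_separator` and the sample text are made up for this example. It shows why the chunk size is counted in characters: sentences are merged greedily until the next one would push the chunk over the limit.

```python
from typing import List


def split_on_separator(text: str, chunk_size: int = 1000, separator: str = ".") -> List[str]:
    """Greedily merge separator-delimited pieces into chunks of at most chunk_size characters."""
    pieces = [p.strip() for p in text.split(separator) if p.strip()]
    chunks: List[str] = []
    current = ""
    for piece in pieces:
        candidate = piece if not current else current + separator + " " + piece
        # A single oversized piece is kept whole rather than cut mid-sentence
        if len(candidate) <= chunk_size or not current:
            current = candidate
        else:
            chunks.append(current)
            current = piece
    if current:
        chunks.append(current)
    return chunks


sample = "Deep Lake stores tensors. It streams data. LangChain builds chains."
for chunk in split_on_separator(sample, chunk_size=45):
    print(len(chunk), repr(chunk))
```

With a small `chunk_size` the first two sentences share a chunk and the third starts a new one, which mirrors the behavior we rely on when splitting pages on `"."`.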


### Step 2: Saving the Documents Along With Their Embeddings to Deep Lake
Once the website is scraped and the pages are split into documents, it's time to generate the embeddings and save them to Deep Lake. This means that we can come back to our previously scraped data at any time without recalculating the embeddings. To do that, set `activeloop_org` to your Activeloop organization ID.

In [None]:
activeloop_org = "YOUR_ACTIVELOOP_ORG_ID"
# initialize the embedding model
embeddings = OpenAIEmbeddings()

# initialize the database, can also be used to load the database
db = DeepLake(
    dataset_path=f"hub://{activeloop_org}/scraped-websites",
    embedding=embeddings,
    overwrite=False,
)

# save the documents
db.add_documents(docs)

### Step 3: Extracting the Most Relevant Documents
Since we want to generate an image in the context of the given website that can have hundreds of pages, it is useful to filter documents that are the most relevant for our query, in order to save money on chained API calls to LLM. For this, we are going to leverage [Deep Lake Vector Store](https://docs.activeloop.ai/tutorials/vector-store/deep-lake-vector-store-in-langchain) similarity search as well as retrieval functionality.

To pre-filter the documents, choose the query that works best for the type of information you have scraped:


In [None]:
query_for_company = 'Business core of the company'

result = db.similarity_search(query_for_company, k=10)
result
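
For intuition, similarity search boils down to ranking stored embedding vectors by how close they are to the query embedding. The sketch below uses cosine similarity on toy 2-dimensional vectors; the vectors and the helper names `cosine_similarity` and `top_k` are illustrative assumptions, and the metric and indexing that Deep Lake actually uses may differ.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec: Sequence[float], doc_vecs: List[Sequence[float]], k: int = 10) -> List[int]:
    """Return the indices of the k document vectors most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), idx) for idx, vec in enumerate(doc_vecs)]
    scored.sort(reverse=True)
    return [idx for _, idx in scored[:k]]


# Toy embeddings standing in for real OpenAI embeddings
toy_docs = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
print(top_k([1.0, 0.0], toy_docs, k=2))
```

The `k=10` in our call plays the same role as `k` here: it caps how many documents are passed on to the LLM, which is what keeps the API cost under control.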

For the question-answering pipeline, we can then define the retriever:

In [None]:
retriever = db.as_retriever(
    search_kwargs={"k":10}
)

### Step 4: Creating Prompts to Generate an Image Based on Documents
The goal is to understand the content and generate prompts in an automated way, so that the process can be scalable. We start by initializing the LLM with a default `gpt-3.5-turbo-instruct` model and set medium temperature to introduce some randomness.

In [None]:
# Initialize LLM
llm = OpenAI(temperature=0.5)

One of the many advantages of LangChain is its prompt templates, which significantly help with clarity and readability. To make the output description more precise, it also helps to provide examples in the prompt.

In [None]:
query = "You are a prompt generator. Based on the content, write a detailed one sentence description that can be used to generate an image"

In [None]:
prompt_template = """{query}:

Content: {text}
"""

# set the prompt template
PROMPT = PromptTemplate(
    template=prompt_template,
    input_variables=["text"],
    partial_variables={"query": query}
)

The `query` is used to indicate some information we expect to output, `text` is the content provided to LLM, whereby it is supposed to provide a detailed description of the image. Additionally, to have more control over the output, we also create an alternative prompt that can generate a specific image type.

Using this, we then experimented with 2 following approaches, that differ in what kind of `text` is provided.


#### Option 1: Custom summary prompt with LLMChain

The idea is simple: we chain the description prompt on each filtered document and then apply it once again on the summarized descriptions. In other words, `text` will be a variable that is iterated during the `LLMChain` operation.

In [None]:
# Initialize the chain
chain = LLMChain(llm=llm, prompt=PROMPT)

# Filter the most relevant documents
result = db.similarity_search(query_for_company, k=10)
# Run the Chain
image_prompt = chain.invoke(result)
image_prompt = image_prompt["text"]
image_prompt
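
The describe-each-document-then-summarize idea can be sketched without any API calls. In the example below, `describe_then_summarize`, `DESCRIBE`, and `fake_llm` are hypothetical names, and `fake_llm` is only a stand-in that records the prompts it receives; with a real model, `llm` would be any callable that sends the prompt to the model and returns its text.

```python
from typing import Callable, List

# Simplified stand-in for the prompt template used in this section
DESCRIBE = "Write a one sentence image description for this content:\n{text}"


def describe_then_summarize(texts: List[str], llm: Callable[[str], str]) -> str:
    """Describe every document separately, then describe the joined descriptions."""
    # Map step: one description per document
    descriptions = [llm(DESCRIBE.format(text=t)) for t in texts]
    # Reduce step: apply the same prompt once more to the collected descriptions
    return llm(DESCRIBE.format(text="\n".join(descriptions)))


calls: List[str] = []


def fake_llm(prompt: str) -> str:
    """Stand-in for a real LLM: records the prompt and returns a numbered placeholder."""
    calls.append(prompt)
    return f"description #{len(calls)}"


final = describe_then_summarize(["doc about data lakes", "doc about vector search"], fake_llm)
print(final, "after", len(calls), "LLM calls")
```

Two documents lead to three calls: one per document plus one final call whose prompt contains both intermediate descriptions. This is why pre-filtering to the most relevant documents matters for cost.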

#### Option 2: Retrieval Question-Answering with LLM

Here we initialize QA retriever, which will allow us to ask to explain a particular concept on the filtered documents.

In [None]:
qa = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type='stuff',
    retriever=retriever
)

chain_answer = qa.invoke("Explain what is Deep Lake")
chain_answer

The `result` field of `chain_answer` is then used as `text` in the `PromptTemplate` without the need for any chain.

In [None]:
answer = llm.invoke(PROMPT.format(text=chain_answer["result"]))
answer

### Step 5: Summarizing the Created Prompts
We experimented with different prompt setups in the previous section, and yet there is more to explore. If you are interested in perfecting your LLM prompts even further, we have an [amazing course](https://learn.activeloop.ai/courses/take/langchain/multimedia/46317727-intro-to-prompt-engineering-tips-and-tricks) that will give you many useful tips and tricks. Mastering prompts for image generation is, however, more art than science. Nevertheless, by providing the LLM with examples, we can see that it does a pretty good job of generating very specific image descriptions. Here are 3 different types of prompts that we were able to generate with our approach:

#### 1. Content prompt
This prompt summarizes all relevant documents scraped from Activeloop into a general but detailed image description: `high-tech, futuristic, AI-driven, advanced, complex, computer-generated, robot, machine learning, data visualization, interactive, cutting-edge technology, automation, precision, efficiency, innovation, digital transformation, smart technology, science fiction-inspired.`

#### 2. Portrait prompt
In addition to the previous prompt, we also condition on the type of the image, which in this case is a detailed description of a portrait: `High quality portrait of a developer working on LangChain, surrounded by computer screens and programming tools, with a focus on the keyboard and coding on the screen.`

#### 3. Deep Lake prompt
Here we show a Question-Answering example with a detailed image description of Deep Lake: `An aerial view of a serene, glassy lake surrounded by trees and mountains, with giant blocks of data floating on the surface, each block representing a different data type such as images, videos, audio, and tabular data, all stored as tensors, while a team of data scientists in a nearby cabin focus on their work to build advanced deep learning models, powered by GPUs that are seamlessly integrated with Deep Lake.`

### Step 6: Generating Simple QR From URL and inserting custom logo
Before we generate the art, we need to prepare a simple QR code for ControlNet, which can be created directly from Python code. It is important to set the error correction level to 'H', which increases the probability of the QR code being readable, as roughly 30% of the code can be covered or destroyed by an image. To generate a QR code with a logo, we created a function that takes the logo image and places it on the previously generated QR code. Note also that some URLs might be too long to produce a QR code that is simple and reliable enough for scanning. For this purpose, we can use URL shorteners such as [bit.ly](https://bit.ly).

In [None]:
import qrcode
from PIL import Image

def create_qrcode(url: str):
    # error correction level H tolerates the most damage to the code
    QRcode = qrcode.QRCode(
        error_correction=qrcode.constants.ERROR_CORRECT_H
    )

    # adding URL or text to QRcode (the url argument is used as passed in)
    QRcode.add_data(url)

    # render the QR code on a white background
    QRimg = QRcode.make_image(
        back_color="white").convert('RGB')
    return QRimg

def qr_with_logo(logo_path: str, QRimg: Image.Image, output_image_name: str):
    logo = Image.open(logo_path)

    # taking base width
    basewidth = 100

    # adjust image size
    wpercent = (basewidth/float(logo.size[0]))
    hsize = int((float(logo.size[1])*float(wpercent)))
    logo = logo.resize((basewidth, hsize))

    # compute the position that centers the logo on the QR code
    pos = ((QRimg.size[0] - logo.size[0]) // 2,
        (QRimg.size[1] - logo.size[1]) // 2)
    QRimg.paste(logo, pos)

    # save the QR code generated
    QRimg.save(output_image_name)
    return QRimg
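
As a rough sanity check on logo size, we can estimate what fraction of the code the logo hides and compare it with the roughly 30% budget of error correction level 'H'. The helper `logo_coverage` and the 410 px image size below are illustrative assumptions; the default `border_px=40` assumes the `qrcode` defaults of a 4-module quiet zone at 10 px per module. Treat the result as a proxy only: the 30% figure refers to recoverable codewords rather than pixels, and a logo that touches the finder patterns can break scanning at much lower coverage.

```python
def logo_coverage(qr_px: int, logo_w: int, logo_h: int, border_px: int = 40) -> float:
    """Fraction of the QR data area (image minus quiet border) hidden by a logo."""
    data_side = qr_px - 2 * border_px
    if data_side <= 0:
        raise ValueError("QR image is smaller than its border")
    return (logo_w * logo_h) / (data_side * data_side)


# Hypothetical 410x410 px QR image with a 100x100 px logo (basewidth = 100)
coverage = logo_coverage(qr_px=410, logo_w=100, logo_h=100)
print(f"logo hides {coverage:.1%} of the data area")
if coverage > 0.30:
    print("logo is likely too large for error correction level H")
```

In practice it is safer to stay well below the budget, because the artistic generation step will distort additional modules on top of what the logo covers.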



### Step 7: Generating Artistic QR Codes for Activeloop
First of all, we need to keep in mind that this is still a very fresh and unexplored topic, and the more visually pleasing the QR codes you want to generate, the higher the risk that a scanner cannot read them. This results in an endless cycle of adjusting parameters to find the most general setup. Many approaches can be applied, but their main difference is in the ControlNet units. We had the most success with the [brightness and tile preprocessors](https://huggingface.co/lllyasviel/ControlNet-v1-1/tree/main), as well as the [qrcode preprocessor](https://huggingface.co/DionTimmer/controlnet_qrcode). Sometimes, adding a depth preprocessor was also helpful. A great guide on how to set up the Stable Diffusion web UI with the ControlNet extension to generate your first QR codes can be found, for example, [here](https://www.youtube.com/watch?v=HOY5J9UT_lY). Nevertheless, there is no single setup that works 100% of the time, and a lot of experimenting is needed, especially in fine-tuning each control's strength/start/end to achieve a desirable output.

For example, in most of the QR codes we used the following setup:
- Negative prompt: ugly, disfigured, low quality, blurry, nsfw
- Steps: 20
- Sampler: DPM++ 2M Karras
- CFG scale: 9
- Size: 768x768
- Model: dreamshaper_631BakedVae
- ControlNet
    - 0: preprocessor: none, model: control_v1p_sd15_qrcode, weight: 1.1, starting/ending: (0, 1), resize mode: Crop and Resize, pixel perfect: False, control mode: Balanced
    - 1: preprocessor: none, model: control_v1p_sd15_brightness, weight: 0.3, starting/ending: (0, 1), resize mode: Crop and Resize, pixel perfect: False, control mode: Balanced
    
In the case of Img2Img, we also need to add an inpaint mask so that the logo is left unchanged.
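
Because so much of the work is tuning each unit's weight and start/end, it can help to write the candidate settings down as data and enumerate them systematically. The sketch below only builds the list of combinations; `ControlUnit` and `sweep` are hypothetical names for this example, the value grids are arbitrary, and how each combination is passed to the web UI is left open.

```python
from dataclasses import dataclass
from itertools import product
from typing import Iterator, Sequence, Tuple


@dataclass(frozen=True)
class ControlUnit:
    """One ControlNet unit: model name, weight, and the start/end of its influence."""
    model: str
    weight: float
    start: float = 0.0
    end: float = 1.0


def sweep(
    qr_weights: Sequence[float],
    brightness_weights: Sequence[float],
    ends: Sequence[float],
) -> Iterator[Tuple[ControlUnit, ControlUnit]]:
    """Yield every (qrcode unit, brightness unit) combination to try."""
    for wq, wb, end in product(qr_weights, brightness_weights, ends):
        yield (
            ControlUnit("control_v1p_sd15_qrcode", wq, 0.0, end),
            ControlUnit("control_v1p_sd15_brightness", wb, 0.0, end),
        )


# The baseline from the list above (1.1 / 0.3 / end 1.0) is one point in this grid
configs = list(sweep([0.9, 1.1, 1.3], [0.2, 0.3], [0.8, 1.0]))
print(len(configs), "combinations, first:", configs[0])
```

Keeping the settings in a frozen dataclass makes each run easy to log next to the generated image, so a readable result can be traced back to the exact parameters that produced it.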


### Download and run AUTOMATIC1111
To try this interface, clone the [official repository](https://github.com/AUTOMATIC1111/stable-diffusion-webui) into your workspace and, on Windows, double-click the `webui-user.bat` file (on Linux or macOS, run `webui.sh` instead).

In [None]:
!git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui

### Integrate ControlNet in AUTOMATIC1111 

To install ControlNet, follow the instructions given in the [official repository](https://github.com/Mikubill/sd-webui-controlnet):

- Open "Extensions" tab.
- Open "Install from URL" tab in the tab.
- Enter https://github.com/Mikubill/sd-webui-controlnet.git to "URL for extension's git repository".
- Press "Install" button.
- Wait for 5 seconds, and you will see the message "Installed into stable-diffusion-webui\extensions\sd-webui-controlnet. Use Installed tab to restart".
- Go to "Installed" tab, click "Check for updates", and then click "Apply and restart UI". (The next time you can also use these buttons to update ControlNet.)
- Completely restart A1111 webui including your terminal. (If you do not know what is a "terminal", you can reboot your computer to achieve the same effect.)
- Download models (see below).
- After you put models in the correct folder, you may need to refresh to see the models. The refresh button is right to your "Model" dropdown.

If you visit the official page of [AUTOMATIC1111](https://github.com/AUTOMATIC1111/stable-diffusion-webui), you can see the interface from which you can select the settings you prefer.

<img width="600" src="https://raw.githubusercontent.com/AUTOMATIC1111/stable-diffusion-webui/master/screenshot.png">


In the upper part of the interface you can choose the type of task to perform (`txt2img`, `img2img`, and so on), while the middle part contains the settings you can change. If you install the ControlNet extension, you will also find the options that control how strongly ControlNet influences the final result.

<img width="830" src="images/controlnet_interface_1.PNG">
<img width="800" src="images/controlnet_interface_2.PNG">


#### Txt2Img - generating QR code from a simple QR and previously created prompt
##### Content prompt
| | |
|:-------------------------:|:-------------------------:|
|<img width="400" src="images/7.1.1.1.png"> | <img width="400" src="images/7.1.1.2.png">|
|<img width="400" src="images/7.1.1.3.png"> | <img width="400" src="images/7.1.1.4.png">|

##### Portrait prompt
| | |
|:-------------------------:|:-------------------------:|
|<img width="400" src="images/7.1.2.1.png"> | <img width="400" src="images/7.1.2.2.png">|
|<img width="400" src="images/7.1.2.3.png"> | <img width="400" src="images/7.1.2.4.png">|

##### Deep Lake prompt
| | |
|:-------------------------:|:-------------------------:|
|<img width="400" src="images/7.1.3.1.png"> | <img width="400" src="images/7.1.3.2.png">|
|<img width="400" src="images/7.1.3.3.png"> | <img width="400" src="images/7.1.3.4.png">|
        
#### Img2Img with logo - generating QR code from a QR with logo and previously created prompt
##### Content prompt
| | |
|:-------------------------:|:-------------------------:|
|<img width="400" src="images/7.2.1.1.png"> | <img width="400" src="images/7.2.1.2.png">|
|<img width="400" src="images/7.2.1.3.png"> | <img width="400" src="images/7.2.1.4.png">|

##### Portrait prompt
| | |
|:-------------------------:|:-------------------------:|
|<img width="400" src="images/7.2.2.1.png"> | <img width="400" src="images/7.2.2.2.png">|
|<img width="400" src="images/7.2.2.3.png"> | <img width="400" src="images/7.2.2.4.png">|

##### Deep Lake prompt
| | |
|:-------------------------:|:-------------------------:|
|<img width="400" src="images/7.2.3.1.png"> | <img width="400" src="images/7.2.3.2.png">|
|<img width="400" src="images/7.2.3.3.png"> | <img width="400" src="images/7.2.3.4.png">|



### Step 8: Generating Artistic QR Codes for E-commerce

The idea here is a little different from the previous examples in the context of [Activeloop](https://www.activeloop.ai/).
Now, we focus on product advertising, and we want to generate a QR code for a single URL and its product only. The challenge is to generate the QR code while keeping the product as similar to the original as possible, to avoid misleading information. To do this, we experimented with many preprocessors such as `tile`, `depth`, `reference_only`, `lineart`, or `styles`, but we found most of them too unreliable and their outputs too far from the original input. At this moment, we believe that the most useful is the `tile` preprocessor, which can preserve a lot of information. The disadvantage is, however, that it does not allow for many changes during the control phase, and the QR fit can sometimes be questionable. In practice, this is done by adding another ControlNet unit:
- 2: preprocessor: none, model: control_v11f1e_sd15_tile, weight: 1.0, starting/ending: (0, 1), resize mode: Crop and Resize, pixel perfect: False, control mode: Balanced

Since the `tile` input image control is very strong, there's not much else we can do. Styles are one of the few extra adjustments possible, and a very useful style cheat sheet can be found [here](https://supagruen.github.io/StableDiffusion-CheatSheet/). For our purposes, however, we did not end up using any of them.


As before, we generated the prompts automatically from the given websites. We randomly selected 2 products: in the first case (Tommy Hilfiger) we added a logo to the initial basic QR code, while in the second case (Patagonia) we only masked the logo that is already present on the product. For comparison, we also provide the original input images (Sources: [Patagonia](https://eu.patagonia.com/cz/en/product/mens-capilene-cool-daily-graphic-shirt/45235.html?dwvar_45235_color=SSMX&cgid=mens-shirts-tech-tops), [Tommy Hilfiger](https://uk.tommy.com/tommy-hilfiger-x-vacation-flag-embroidery-t-shirt-mw0mw33438ybl)).

#### Img2Img with logo - generating Tommy Hilfiger QR code

| | |
|:-------------------------:|:-------------------------:|
| <img width="400" src="images/8.1.1.png"> | <img width="400" src="images/8.1.2.png"> |
| <img width="400" src="images/8.1.3.png"> | <img width="400" src="images/8.1.4.png"> |

#### Img2Img with logo - generating Patagonia QR code
| | |
|:-------------------------:|:-------------------------:|
| <img width="400" src="images/8.2.1.png"> | <img width="400" src="images/8.2.2.png"> |
| <img width="400" src="images/8.2.3.png"> | <img width="400" src="images/8.2.4.png"> |

### Step 9: Hands on with ComfyUI
An alternative to the approach above is to use ComfyUI, a powerful and modular Stable Diffusion GUI.
The main idea is to build a schema (workflow) in the GUI and transform this schema into code with an extension called `ComfyUI-to-Python-Extension`.
We need to load the Stable Diffusion and ControlNet checkpoints we want to use.

In our case we experimented with:
- Diffusion models: `v1-5-pruned-emaonly`, `dreamshaper_8` and `revAnimated_v122EOL` 
- ControlNet models: `control_v1p_sd15_brightness`, `control_v1p_sd15_qrcode`, `control_v11f1e_sd15_tile` and `control_v11f1p_sd15_depth`.

Install ComfyUI

In [None]:
!git clone https://github.com/comfyanonymous/ComfyUI.git
%cd ComfyUI
!pip install -r requirements.txt

Add the ComfyUI plugin to transform the schema into code

In [None]:
!git clone https://github.com/pydn/ComfyUI-to-Python-Extension.git
%cd ComfyUI-to-Python-Extension
!pip install -r requirements.txt

Download all the checkpoints in the right folder

In [None]:
# use raw.githubusercontent.com so that wget saves the files themselves, not the GitHub HTML pages
!wget https://raw.githubusercontent.com/efenocchi/QRCodeGenerator/main/3_workflow_qr_codes2.json https://raw.githubusercontent.com/efenocchi/QRCodeGenerator/main/2_workflow_qr_codes.json -P /content/ComfyUI/ComfyUI-to-Python-Extension
!wget https://raw.githubusercontent.com/efenocchi/QRCodeGenerator/main/workflow_api.py -P /content/ComfyUI/ComfyUI-to-Python-Extension
!wget https://huggingface.co/stabilityai/sd-vae-ft-mse-original/resolve/main/vae-ft-mse-840000-ema-pruned.safetensors -P /content/ComfyUI/models/vae
!wget https://huggingface.co/lllyasviel/ControlNet-v1-1/resolve/main/control_v11f1p_sd15_depth.pth -P /content/ComfyUI/models/controlnet
!wget https://huggingface.co/latentcat/latentcat-controlnet/resolve/main/models/control_v1p_sd15_brightness.safetensors -P /content/ComfyUI/models/controlnet
!wget https://huggingface.co/lllyasviel/ControlNet-v1-1/resolve/main/control_v11f1e_sd15_tile.pth -P /content/ComfyUI/models/controlnet
!wget https://huggingface.co/autismanon/modeldump/resolve/main/dreamshaper_8.safetensors -P /content/ComfyUI/models/checkpoints
# quote the URL so the shell does not treat "&" as a command separator
!wget --content-disposition "https://civitai.com/api/download/models/46846?type=Model&format=SafeTensor&size=full&fp=fp32" -P /content/ComfyUI/models/checkpoints

If you have problems downloading the model from CivitAI, try downloading it manually after logging in, or download it directly from Hugging Face. In the latter case, make sure to use the right file name when loading the model in the following steps.

In [None]:
# !wget https://huggingface.co/emmajoanne/models/resolve/main/revAnimated_v122.safetensors -P /content/ComfyUI/models/checkpoints

Run the ComfyUI GUI.

If you encounter problems, run this command from the terminal. For advice and error resolution, refer to the [official repository](https://github.com/comfyanonymous/ComfyUI).

Since this is not a ComfyUI guide and some things may not be clear, if you want to know more, consult the official repository or some free guides like [this one](https://www.youtube.com/watch?v=LNOlk8oz1nY&ab_channel=OlivioSarikas).

<span style="color:red">This command has been put for informational purposes only, to proceed with image generation you will not need to create the scheme from scratch with ComfyUI but directly execute the cells shown below.</span>

In [None]:
#!python main.py

Below is one of the schemas we used. It is composed of 1 diffusion model, `v1-5-pruned-emaonly`, and 2 different ControlNet models, `control_v1p_sd15_brightness` and `control_v11f1e_sd15_tile`.

![schema_2_controlnet.webp](images/schema_2_controlnet.webp)

As illustrated, the QR Code was generated by combining the basic QR Code image with a textual input, making it easy to merge the initial image with the generated one. In this case the positive prompt was simply "a cyborg character" and the negative one "ugly, artefacts, bad". This pipeline produced the images shown below:

<img width="300" src="images/qrcode_detail1.webp"> <img width="300" src="images/qrcode_detail2.webp">

<img width="300" src="images/qrcode_detail3.webp"> <img width="300" src="images/qrcode_detail4.webp">


To export the schema and transform it into Python code, follow these steps:
1) Enable Dev mode Options: click on the settings button (located above the Queue Prompt text in the window that appears when you start ComfyUI) and select the "Enable Dev mode Options" box.
2) Export the schema via the button "Save (API Format)"
3) Put this schema in the ComfyUI-to-Python-Extension folder
4) Run the python file `comfyui_to_python.py`

<span style="color:red">As with the previous command, this explanation is provided for informational purposes only. To proceed with image generation, you do not need to create the schema from scratch with ComfyUI and convert it into Python code, because this step has already been done for you.</span>
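
Before converting an exported schema, it can be useful to check which nodes it contains. The sketch below assumes that a file saved with "Save (API Format)" is a JSON object mapping node ids to entries with a `class_type` and an `inputs` field; the embedded `workflow_json` is a tiny hand-written stand-in, not one of our exported workflows, and `summarize_workflow` is a hypothetical helper, so verify the layout against your own export.

```python
import json
from collections import Counter

# Hand-written stand-in for an exported workflow (file names are illustrative)
workflow_json = """
{
  "4": {"class_type": "CheckpointLoaderSimple",
        "inputs": {"ckpt_name": "v1-5-pruned-emaonly.safetensors"}},
  "10": {"class_type": "ControlNetLoader",
         "inputs": {"control_net_name": "control_v11f1e_sd15_tile.pth"}},
  "11": {"class_type": "ControlNetLoader",
         "inputs": {"control_net_name": "control_v1p_sd15_brightness.safetensors"}}
}
"""


def summarize_workflow(raw: str) -> Counter:
    """Count how many nodes of each class type appear in the workflow."""
    nodes = json.loads(raw)
    return Counter(node["class_type"] for node in nodes.values())


for class_type, count in summarize_workflow(workflow_json).items():
    print(f"{count} x {class_type}")
```

A quick summary like this makes it easy to confirm that the number of `ControlNetLoader` nodes matches the number of ControlNet checkpoints you downloaded.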


The following functions are used to create the QR code from a text and to load a logo in the center of it:

Go to the main ComfyUI folder

In [None]:
# if you are in Colab you can simply run
# % cd /content/ComfyUI
%cd ..
!ls

In [None]:
img = create_qrcode('https://www.activeloop.com/')
img.save("activeloop_qr.jpg")
img_with_logo = qr_with_logo("activeloop_logo.jpg", img, "activeloop_qr_with_logo.jpg")
img_with_logo

Set the correct path to be able to work with ComfyUI

In [None]:
import os
import random
import sys
from typing import Sequence, Mapping, Any, Union
import torch


def get_value_at_index(obj: Union[Sequence, Mapping], index: int) -> Any:
    """Returns the value at the given index of a sequence or mapping.

    If the object is a sequence (like list or string), returns the value at the given index.
    If the object is a mapping (like a dictionary), returns the value at the index-th key.

    Some return a dictionary, in these cases, we look for the "results" key

    Args:
        obj (Union[Sequence, Mapping]): The object to retrieve the value from.
        index (int): The index of the value to retrieve.

    Returns:
        Any: The value at the given index.

    Raises:
        IndexError: If the index is out of bounds for the object and the object is not a mapping.
    """
    try:
        return obj[index]
    except KeyError:
        return obj["result"][index]


def find_path(name: str, path: str = None) -> str:
    """
    Recursively looks at parent folders starting from the given path until it finds the given name.
    Returns the path as a Path object if found, or None otherwise.
    """
    # If no path is given, use the current working directory
    if path is None:
        path = os.getcwd()

    # Check if the current directory contains the name
    if name in os.listdir(path):
        path_name = os.path.join(path, name)
        print(f"{name} found: {path_name}")
        return path_name

    # Get the parent directory
    parent_directory = os.path.dirname(path)

    # If the parent directory is the same as the current directory, we've reached the root and stop the search
    if parent_directory == path:
        return None

    # Recursively call the function with the parent directory
    return find_path(name, parent_directory)


def add_comfyui_directory_to_sys_path() -> None:
    """
    Add 'ComfyUI' to the sys.path
    """
    comfyui_path = find_path("ComfyUI")
    if comfyui_path is not None and os.path.isdir(comfyui_path):
        sys.path.append(comfyui_path)
        print(f"'{comfyui_path}' added to sys.path")


# def add_extra_model_paths() -> None:
#     """
#     Parse the optional extra_model_paths.yaml file and add the parsed paths to the sys.path.
#     """
#     from main import load_extra_path_config

#     extra_model_paths = find_path("extra_model_paths.yaml")

#     if extra_model_paths is not None:
#         load_extra_path_config(extra_model_paths)
#     else:
#         print("Could not find the extra_model_paths config file.")


add_comfyui_directory_to_sys_path()
# add_extra_model_paths()
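
To see why `get_value_at_index` has a `KeyError` fallback, consider the two output shapes it has to handle. The sketch below repeats the helper so that it runs on its own; the sample outputs `tuple_output` and `dict_output` are made-up placeholders standing in for what ComfyUI nodes return.

```python
from typing import Any, Mapping, Sequence, Union


def get_value_at_index(obj: Union[Sequence, Mapping], index: int) -> Any:
    """Same logic as the helper above, repeated here to keep the example self-contained."""
    try:
        return obj[index]
    except KeyError:
        return obj["result"][index]


# Most nodes return a plain tuple, e.g. a checkpoint loader returning (model, clip, vae)
tuple_output = ("model", "clip", "vae")
# Some nodes return a dictionary whose values live under the "result" key
dict_output = {"ui": {}, "result": ("latent",)}

print(get_value_at_index(tuple_output, 1))
print(get_value_at_index(dict_output, 0))
```

Indexing a dictionary with an integer that is not one of its keys raises `KeyError`, which is what routes the second call to the `"result"` entry. Note that an out-of-range index on a tuple still raises `IndexError`.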

Load all the models that you will need during the generation phase. In this case we need one model to generate the image from the text and another 2-3 models to integrate the generated image into the image of our QR code.


In [None]:
from nodes import (
    KSampler,
    CLIPTextEncode,
    ControlNetApplyAdvanced,
    VAEDecode,
    CheckpointLoaderSimple,
    LoadImage,
    ControlNetLoader,
    NODE_CLASS_MAPPINGS,
    EmptyLatentImage,
    VAELoader,
    SaveImage,
)
controlnetapplyadvanced = ControlNetApplyAdvanced()
ksampler = KSampler()
vaedecode = VAEDecode()
saveimage = SaveImage()
vaeloader = VAELoader()

def load_checkpoints(diffuser_model:str = "dreamshaper_8.safetensors", controlnet_1:str = "control_v11f1e_sd15_tile.pth", controlnet_2: str = "control_v1p_sd15_brightness.safetensors", controlnet_3: str = None):
    with torch.inference_mode():

        checkpointloadersimple = CheckpointLoaderSimple()
        checkpointloadersimple_4 = checkpointloadersimple.load_checkpoint(
            ckpt_name=diffuser_model
        )
        emptylatentimage = EmptyLatentImage()
        emptylatentimage_5 = emptylatentimage.generate(
            width=768, height=768, batch_size=4
        )
        controlnetloader = ControlNetLoader()
        controlnetloader_10 = controlnetloader.load_controlnet(
                control_net_name=controlnet_1
        )

        controlnetloader_11 = controlnetloader.load_controlnet(
            control_net_name=controlnet_2
        )
        controlnetloader_12 = None
        if controlnet_3 is not None:
            controlnetloader_12 = controlnetloader.load_controlnet(
                control_net_name=controlnet_3
            )
        vaeloader_24 = vaeloader.load_vae(
            vae_name="vae-ft-mse-840000-ema-pruned.safetensors"
        )
        return checkpointloadersimple_4, emptylatentimage_5, controlnetloader_10, controlnetloader_11, controlnetloader_12, vaeloader_24

Choose which models to use by passing the file names exactly as they appear in the "checkpoints" and "controlnet" folders. You can use either two or three ControlNets: pass file names for `controlnet_1` and `controlnet_2`, or for `controlnet_1`, `controlnet_2` and `controlnet_3`. By default, only two are used.


In [None]:
checkpointloadersimple_4, emptylatentimage_5, controlnetloader_10, controlnetloader_11, controlnetloader_12, vaeloader_24 = load_checkpoints()
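
Before loading, it can help to confirm that the model files are where the loaders expect them. The sketch below is only an illustration: the helper `missing_model_files` is hypothetical, and the `models` root with `checkpoints` and `controlnet` subfolders is an assumption about your ComfyUI layout that you should adapt to your own installation.

```python
from pathlib import Path


def missing_model_files(models_root, checkpoints, controlnets):
    """Return the expected model files that are not present on disk.

    Hypothetical helper: the folder layout below is an assumption.
    """
    root = Path(models_root)
    expected = [root / "checkpoints" / name for name in checkpoints]
    # Skip ControlNet slots that are left as None (e.g. an unused controlnet_3)
    expected += [root / "controlnet" / name for name in controlnets if name]
    return [str(path) for path in expected if not path.is_file()]


missing = missing_model_files(
    "models",  # assumed to be relative to the ComfyUI folder
    checkpoints=["dreamshaper_8.safetensors"],
    controlnets=[
        "control_v11f1e_sd15_tile.pth",
        "control_v1p_sd15_brightness.safetensors",
        None,
    ],
)
for path in missing:
    print(f"missing model file: {path}")
```

An empty result means every file you listed was found.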

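Encode the positive and negative prompts with the CLIP model from the loaded checkpoint. The positive prompt describes the image to generate and the negative prompt lists the effects to avoid. The QR code image is loaded in the same step.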

In [None]:

def load_text_and_image(prompt_text:str = None, prompt_text_negative:str = None, input_image_path: str = None):
  # Encode both prompts with the checkpoint's CLIP model and load the QR code image
  with torch.inference_mode():

        cliptextencode = CLIPTextEncode()
        cliptextencode_6 = cliptextencode.encode(
            text=prompt_text, clip=get_value_at_index(checkpointloadersimple_4, 1)
        )

        cliptextencode_7 = cliptextencode.encode(
            text=prompt_text_negative,
            clip=get_value_at_index(checkpointloadersimple_4, 1),
        )


        loadimage = LoadImage()
        loadimage_14 = loadimage.load_image(image=input_image_path)


        return cliptextencode_6, cliptextencode_7, loadimage_14

This is the heart of the generation, where you choose the values assigned to the different ControlNets. To see how the result changes with the parameters, adjust the `*_controlnet_values` arguments of `run_inference` or modify the `start_percent` and `end_percent` values directly in the function body.

- `strength`: strength of controlnet; 1.0 is full strength, 0.0 is no effect at all.
- `start_percent`: sampling step percentage at which controlnet should start to be applied - no matter what start_percent is set on timestep keyframes, they won't take effect until this start_percent is reached.
- `end_percent`: sampling step percentage at which controlnet should stop being applied - no matter what end_percent is set on timestep keyframes, they won't take effect once this end_percent is reached.
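
To build intuition for the start and end percentages, here is a rough sketch that assumes the percentages map linearly onto sampling step indices. This is a simplification for illustration only: the actual implementation may map percentages differently (for example through the noise schedule), and `active_steps` is a hypothetical helper, not part of ComfyUI.

```python
def active_steps(total_steps, start_percent, end_percent):
    """Approximate the sampling steps during which a ControlNet is applied.

    Assumes a linear mapping from percentage to step index (a simplification).
    """
    return [
        step
        for step in range(total_steps)
        if start_percent <= step / total_steps < end_percent
    ]


# The windows used in run_inference, with steps=20
for name, start, end in [("tile", 0.35, 0.6), ("brightness", 0.0, 1.0), ("depth", 0.0, 0.2)]:
    print(name, active_steps(20, start, end))
```

Under this assumption, the tile ControlNet only influences the middle of the sampling process, the brightness ControlNet acts throughout, and the optional third ControlNet only shapes the first few steps.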

In [None]:
def run_inference(cliptextencode_6, cliptextencode_7, loadimage_14, tile_controlnet_values: float=0.5, brightness_controlnet_values: float = 0.35, depth_controlnet_values: float = 1.0, number_of_examples=3):
  with torch.inference_mode():
          for _ in range(number_of_examples):
            controlnetapplyadvanced_28 = controlnetapplyadvanced.apply_controlnet(
                strength=tile_controlnet_values,
                start_percent=0.35,
                end_percent=0.6,
                positive=get_value_at_index(cliptextencode_6, 0),
                negative=get_value_at_index(cliptextencode_7, 0),
                control_net=get_value_at_index(controlnetloader_10, 0),
                image=get_value_at_index(loadimage_14, 0),
            )

            controlnetapplyadvanced_27 = controlnetapplyadvanced.apply_controlnet(
                strength=brightness_controlnet_values,
                start_percent=0,
                end_percent=1,
                positive=get_value_at_index(controlnetapplyadvanced_28, 0),
                negative=get_value_at_index(controlnetapplyadvanced_28, 1),
                control_net=get_value_at_index(controlnetloader_11, 0),
                image=get_value_at_index(loadimage_14, 0),
            )
            controlnetapplyadvanced_26 = None
            if controlnetloader_12 is not None:
                controlnetapplyadvanced_26 = controlnetapplyadvanced.apply_controlnet(
                    strength=depth_controlnet_values,
                    start_percent=0,
                    end_percent=0.2,
                    positive=get_value_at_index(controlnetapplyadvanced_27, 0),
                    negative=get_value_at_index(controlnetapplyadvanced_27, 1),
                    control_net=get_value_at_index(controlnetloader_12, 0),
                    image=get_value_at_index(loadimage_14, 0),
                )

            # Use the conditioning produced by the last ControlNet that was applied
            final_conditioning = (
                controlnetapplyadvanced_26
                if controlnetapplyadvanced_26 is not None
                else controlnetapplyadvanced_27
            )

            # Sampling, decoding and saving stay inside the loop so that
            # every iteration produces and saves a new batch of images
            ksampler_17 = ksampler.sample(
                seed=random.randint(1, 2**64 - 1),
                steps=20,
                cfg=8,
                sampler_name="euler",
                scheduler="normal",
                denoise=1,
                model=get_value_at_index(checkpointloadersimple_4, 0),
                positive=get_value_at_index(final_conditioning, 0),
                negative=get_value_at_index(final_conditioning, 1),
                latent_image=get_value_at_index(emptylatentimage_5, 0),
            )

            vaedecode_18 = vaedecode.decode(
                samples=get_value_at_index(ksampler_17, 0),
                vae=get_value_at_index(vaeloader_24, 0),
            )

            saveimage_29 = saveimage.save_images(
                filename_prefix="ComfyUI", images=get_value_at_index(vaedecode_18, 0)
            )

In [None]:
%cd ComfyUI	

In [None]:
# If you are in Colab, it may be quicker to use the following paths
#OUTPUT_DIR = "/content/ComfyUI/output/"
#OUTPUT_DIR_NR = "/content/ComfyUI/output/non_readable/"

# If you are not in Colab
OUTPUT_DIR = "output/"
OUTPUT_DIR_NR = "output/non_readable/"

os.makedirs(OUTPUT_DIR_NR, exist_ok = True)

Choose the prompt from the first step based on your chosen company, or write a prompt from scratch

In [None]:
print(image_prompt)
print(chain_answer)
print(answer)

In [None]:
prompt_text = "a Deep Lake in the forest"
prompt_text_negative = "ugly, bad, artifacts"
# must be in the ComfyUI/input/ folder
input_image_path = "activeloop_qr.jpg"

cliptextencode_6, cliptextencode_7, loadimage_14 = load_text_and_image(prompt_text, prompt_text_negative, input_image_path)

Run the inference function and check the generated images in the output folder

In [None]:
run_inference(cliptextencode_6, cliptextencode_7, loadimage_14, tile_controlnet_values=0.5, brightness_controlnet_values=0.35, depth_controlnet_values=1.0, number_of_examples=1)
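
Because the balance between readability and aesthetics depends heavily on the ControlNet strengths, it can be convenient to try several combinations in one go. The following is a minimal sketch, not part of the original notebook: `strength_grid` is a hypothetical helper and the candidate values are arbitrary starting points. The call to `run_inference` is left commented out so the snippet runs on its own.

```python
from itertools import product


def strength_grid(tile_values, brightness_values):
    """Build keyword arguments for every combination of ControlNet strengths."""
    return [
        {"tile_controlnet_values": tile, "brightness_controlnet_values": brightness}
        for tile, brightness in product(tile_values, brightness_values)
    ]


# Arbitrary example values around the defaults used above
for params in strength_grid([0.4, 0.5, 0.6], [0.25, 0.35]):
    print(params)
    # run_inference(cliptextencode_6, cliptextencode_7, loadimage_14,
    #               number_of_examples=1, **params)
```

Each combination triggers a full sampling run, so keep the grid small.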

Depending on the checkpoint loaded by `CheckpointLoaderSimple` (see the `load_checkpoints` function), you will get different results with the same prompt.

Here, for example, we use `a cyborg character` as the prompt text and obtain the following results:

### With v1-5-pruned-emaonly.safetensors

<img width="200" src="images/qrcode_v1.5_1.webp"> <img width="200" src="images/qrcode_v1.5_2.webp">
<img width="200" src="images/qrcode_v1.5_3.webp"> <img width="200" src="images/qrcode_v1.5_4.webp">


### With dreamshaper_8.safetensors

<img width="200" src="images/qrcode_dreamshaper_1.webp"> <img width="200" src="images/qrcode_dreamshaper_2.webp">
<img width="200" src="images/qrcode_dreamshaper_3.webp"> <img width="200" src="images/qrcode_dreamshaper_4.webp">

Each model has its pros and cons, which depend on the type of images it was trained to generate.

### Keep only truly scannable QR codes


In [None]:
import os
from qreader import QReader
import cv2
import shutil
from PIL import Image
qreader = QReader()


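def keep_readable_qrcodes(folder):
    """Move images without a decodable QR code to OUTPUT_DIR_NR and return the readable paths."""
    readable = []
    for filename in os.listdir(folder):
        path = os.path.join(folder, filename)
        img = cv2.imread(path)  # None for sub-folders and non-image files

        if img is not None:
            # QReader expects RGB, while OpenCV loads images as BGR
            img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
            decoded_text = qreader.detect_and_decode(image=img)
            # A detected but undecodable code can be reported as None, so check each entry
            if any(text for text in (decoded_text or ())):
                print(decoded_text)
                readable.append(path)
            else:
                print(f"non readable: {filename}")
                shutil.move(path, os.path.join(OUTPUT_DIR_NR, filename))

    return readable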

In [None]:
#images = keep_readable_qrcodes("/content/ComfyUI/output")
keep_readable_qrcodes(OUTPUT_DIR)
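
A code that scans is only useful if it still points to the intended address. As an optional extra check, the sketch below compares decoded texts against the expected URL. It is an illustration under assumptions: `decodes_to` is a hypothetical helper, and it assumes the decoder returns a collection of strings in which failed decodes may appear as `None`.

```python
def decodes_to(decoded_texts, expected_url):
    """Return True if any decoded text matches the expected URL.

    Hypothetical helper: ignores surrounding whitespace and a trailing slash.
    """
    def normalize(url):
        return url.strip().rstrip("/")

    target = normalize(expected_url)
    return any(
        normalize(text) == target
        for text in (decoded_texts or ())
        if text  # skip None or empty entries from failed decodes
    )


print(decodes_to(("https://www.activeloop.ai/",), "https://www.activeloop.ai"))
print(decodes_to((None,), "https://www.activeloop.ai"))
```

The comparison is deliberately strict about everything except the trailing slash, so a redirect or shortened link would need its own handling.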

### Apply logo to generated images to make them more intriguing

In [None]:
generated_image = Image.open("output/ComfyUI_00068_.png")
img_with_logo = qr_with_logo("activeloop_logo.jpg", generated_image, "generated_image_with_logo.jpg")
img_with_logo


### Limitations of Our Approach
- Overall, the ControlNet model required extensive manual tuning of parameters. There are many methods to control the QR code generation process, but none are entirely reliable. The problem intensifies when you want to account for the input product image as well. To the best of our knowledge, no other publication has found a way to generate them reliably, and we spent the majority of our time experimenting with various setups.

- Adding an image to the input might offer more control and bring about various use-cases, but it significantly restricts the possibilities of stable diffusion. This usually only results in changes to the image's style without fitting much of the QR structure. Moreover, we saw greater success with text-to-image compared to image-to-image with logo masks. However, the former wasn't as desirable because we believe logos are essential in product QR codes.

- From our examples, it's evident that the generated products don't exactly match the actual products one-to-one. If the goal is to advertise a specific product, even a minor mismatch could be misleading. Nonetheless, we believe that [LoRA](https://stable-diffusion-art.com/lora/) models or a different type of preprocessor model could address these issues.

- Automated image prompts can sometimes be confusing, drawing focus to unimportant details within the context. This is particularly problematic if we don't have enough relevant textual information to build upon. This presents an opportunity to further use the Deep Lake Vector Store to analyze ImageBind embeddings for a better understanding of the content on e-commerce websites.

- In our examples, we also encountered issues with faces, as they sometimes didn't appear human. However, this could be easily addressed with further processing. In instances where we want to preserve the face and adjust it to the QR code, there are tools like [Roop](https://github.com/s0md3v/sd-webui-roop) that can be used for a detailed face replacement.


### Conclusion: Scalable Prompt Generation Achieved, QR Code Generation Remains Unreliable
Deep Lake combined with LangChain can significantly reduce the costs of analyzing the contents of a website to provide image descriptions in a scalable way. Thanks to the Deep Lake Vector Store, we can save a large number of documents and images along with their embeddings. This allows us to iteratively adjust the image prompts and efficiently filter based on embedding similarities. However, it is very difficult to find the ControlNet sweet spot between QR readability and "cool" design. Taking into account all of the limitations we've discussed, we believe that more experimentation with ControlNet is needed in order to generate product QR codes that are reliable and applicable for real-world businesses.

I hope that you find this useful and already have many ideas on how to further build on this. Thank you for reading and I wish you a great day and see you in the next one.


## FAQs
<h2 id="faq">What is prompt engineering?</h2>
     Prompt engineering is the practice of carefully crafting inputs (prompts) to be given to AI models, particularly language models, in order to elicit the desired output. It involves understanding how the model interprets inputs and using that knowledge to achieve more accurate, relevant, or creative responses.

<h2 id="faq">What is Stable Diffusion?</h2>
     Stable Diffusion is a generative artificial intelligence model designed for text-to-image synthesis. It can generate photorealistic images from textual input, allowing users to create artwork quickly. Besides text-to-image generation, Stable Diffusion can also be used for image-to-image generation or to create videos and animations. It was first released in 2022.

<h2 id="faq">Is Stable Diffusion free to use?</h2>
     Stable Diffusion is an open-source project, which means it is freely available for anyone to use. You can access and use Stable Diffusion without any cost, subject to the terms of its open-source license.

<h2 id="faq">What is ControlNet?</h2>
     ControlNet is an innovative neural network architecture that integrates extra conditions to manage the control of diffusion models. These techniques include edge and line detection, human poses, image segmentation, depth maps, image styles, or simple user scribbles, allowing for conditioned output images.

<h2 id="faq">What is AUTOMATIC1111?</h2>
     AUTOMATIC1111 is a robust web-based user interface (WebUI) tailored for Stable Diffusion, an AI model for text-to-image generation. It provides an intuitive platform for creating remarkable images from textual prompts.

<h2 id="faq">How to use ComfyUI?</h2> 
     ComfyUI is a powerful and modular stable diffusion GUI. To use it, you need to clone the ComfyUI repository, install its requirements, and run the main.py file. It allows for the creation of schemas from the GUI and transforms these schemas into code for image generation tasks.

<h2 id="faq">What is a QR code and how does it work?</h2>
     A QR code, or Quick Response code, is a two-dimensional barcode that stores data. It works by encoding information in black squares arranged on a white grid. When scanned by a QR code reader or smartphone camera, the encoded data is decoded and can trigger actions such as opening a website or displaying text.

<h2 id="faq">How to make artistic QR codes?</h2>
     Making artistic QR codes involves incorporating design elements, colors, and sometimes logos into the QR code without compromising its scan-ability. This can be achieved through specialized software or online tools that allow for the customization of QR codes while ensuring they remain functional.

<h2 id="faq">What is image synthesis?</h2>
     Image synthesis is the process of generating new images from textual descriptions, existing images, or a combination of both, using artificial intelligence and machine learning models. It involves creating visually coherent and contextually relevant images based on the input provided.
     
<h2 id="faq">What are LoRA models?</h2>
     LoRA (Low-Rank Adaptation of Large Language Models) is a popular and lightweight training technique that significantly reduces the number of trainable parameters. It works by inserting a smaller number of new weights into the model and only these are trained. In the context of AI and machine learning, particularly concerning stable diffusion models, LoRA models are mentioned as potentially useful for addressing issues in generating product QR codes that match actual products more closely.
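
To make the parameter reduction concrete, here is a small arithmetic sketch. It assumes the common LoRA formulation in which the update to a weight matrix of shape `d_out x d_in` is factored into two matrices of rank `r`. The layer size and rank below are hypothetical values chosen only for illustration.

```python
def lora_trainable_params(d_out, d_in, rank):
    """Compare trainable weights for a full update versus a rank-`rank` LoRA update."""
    full = d_out * d_in            # every entry of the weight matrix is trained
    lora = rank * (d_out + d_in)   # two low-rank factors: (d_out x rank) and (rank x d_in)
    return full, lora


# Hypothetical 768 x 768 layer with rank 8
full, lora = lora_trainable_params(768, 768, 8)
print(f"full: {full}, LoRA: {lora}, ratio: {lora / full:.2%}")
```

For this hypothetical layer, the low-rank factors hold roughly 2% of the weights that a full update would train.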