[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/jamescalam/applied-ml-minicourse/blob/main/code/09-making-queries.ipynb)

# 09: Making Queries

Now that we have our data indexed in both Pinecone and Cloud Storage, we can move on to making queries.

<img src="https://github.com/jamescalam/applied-ml-minicourse/raw/main/images/hf-spaces-cacher-components.png" style="width:80%">

The image above shows the intended structure of our app. Every time a user makes a query, we first search for past queries that are highly similar to the new one.

If a past query closely matches the current one, we can skip the long diffusion process and simply return the images generated for a few of the most similar past queries.
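The cache-or-generate decision described above can be sketched as follows. This is only an illustration, not the app's actual code: the `choose_action` helper and the `0.8` threshold are hypothetical, and the match format (a list of dicts with `id` and `score`) is assumed to mirror what a vector search returns.

```python
def choose_action(matches, threshold=0.8):
    """Return ('cache', ids) if any past query is similar enough,
    otherwise ('generate', []).

    `threshold` is a hypothetical cut-off; a real app would tune it.
    """
    # keep only the past queries whose similarity clears the threshold
    hits = [m["id"] for m in matches if m["score"] >= threshold]
    if hits:
        return "cache", hits
    return "generate", []


# toy similarity results for a new query (made-up IDs and scores)
matches = [
    {"id": "a1", "score": 0.93},
    {"id": "b2", "score": 0.85},
    {"id": "c3", "score": 0.41},
]
action, ids = choose_action(matches)
print(action, ids)
```

When the action is `'cache'`, the returned IDs would be used to fetch stored images; when it is `'generate'`, the prompt would go through the full diffusion pipeline.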

Let's see how to perform these queries.

## Initializing Services

As usual, we must initialize our connections to Cloud Storage and Pinecone, and initialize the `StableDiffusionPipeline`.

Starting with Cloud Storage:

In [None]:
import os
from google.cloud import storage

# set credentials
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = 'cloud-storage.json'

# connect to bucket (we named it 'diffusion')
storage_client = storage.Client()
bucket = storage_client.get_bucket('diffusion')

Then Pinecone:

In [None]:
import pinecone

pinecone.init(
    api_key='<<YOUR_API_KEY>>',
    environment='us-west1-gcp'
)

# connect to index
index = pinecone.Index('diffusion')

And the `StableDiffusionPipeline`:

In [None]:
import torch
from diffusers import StableDiffusionPipeline

# set the hardware device
device = 'cuda' if torch.cuda.is_available() else 'cpu'

# init all of the pipeline models, then move them to the chosen device
pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    use_auth_token="<<ACCESS_TOKEN>>"
)
pipe.to(device)
print(device)

Now let's make some queries.

## Making Queries

When making queries, we use the first two components of the pipeline, the tokenizer and the CLIP text encoder, to create a *query prompt vector*.

In [None]:
prompt = "a person surfing"

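# encode prompt to a single pooled vector
tokens = pipe.tokenizer(
    prompt, padding='max_length',
    return_tensors='pt'
).to(device)
# the text encoder exposes its pooled vector as `pooler_output`; the pooling
# used here must match the pooling used when the index was built
xq = pipe.text_encoder(**tokens).pooler_output.detach().cpu().numpy().tolist()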

Next, we query Pinecone, returning the top `5` most similar matches *and* their prompt metadata.

<img src="https://github.com/jamescalam/applied-ml-minicourse/raw/main/images/making-queries.png" style="width:60%">

In [None]:
xc = index.query(xq, top_k=5, include_metadata=True)
xc
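To build intuition for what a top-`k` similarity query does, here is a minimal brute-force sketch in plain Python. It assumes cosine similarity, which may not be the metric the `diffusion` index was created with, and a vector database like Pinecone uses far more efficient search than this exhaustive loop. The `cosine` and `top_k` helpers and the toy vectors are purely illustrative.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query, vectors, k=5):
    """Score every stored vector against the query and keep the k best."""
    scored = [(vec_id, cosine(query, vec)) for vec_id, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# toy 3-dimensional "prompt vectors" (real CLIP vectors are much larger)
stored = {
    "surf": [0.9, 0.1, 0.0],
    "beach": [0.7, 0.3, 0.1],
    "city": [0.0, 0.2, 0.9],
}
results = top_k([1.0, 0.0, 0.0], stored, k=2)
print([vec_id for vec_id, _ in results])
```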

We extract the ID values so that we can download the images from Cloud Storage.

In [None]:
ids = [match['id'] for match in xc['matches']]
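Because the query also returns metadata, it can be handy to inspect each match's ID, score, and prompt side by side. The sketch below runs on a mock response rather than a live index: the response shape (a `matches` list of dicts) and the `prompt` metadata key are assumptions, and the `summarize` helper is hypothetical.

```python
# mock response shaped like a query result with metadata (values are made up)
response = {
    "matches": [
        {"id": "0001", "score": 0.91,
         "metadata": {"prompt": "a person surfing a wave"}},
        {"id": "0002", "score": 0.87,
         "metadata": {"prompt": "surfer at sunset"}},
    ]
}


def summarize(response):
    """Return (id, score, prompt) tuples; prompt is None if metadata is missing."""
    return [
        (m["id"], m["score"], m.get("metadata", {}).get("prompt"))
        for m in response["matches"]
    ]


for match_id, score, prompt_text in summarize(response):
    print(f"{match_id}  {score:.2f}  {prompt_text}")
```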

Then we download all of the images:

In [None]:
import io
from PIL import Image

images = []

for _id in ids:
    # reference the blob in Cloud Storage and download its raw bytes
    blob = bucket.blob(f"{_id}.png").download_as_bytes()
    # convert to 'in-memory' file
    blob_bytes = io.BytesIO(blob)
    # convert to PIL image object
    im = Image.open(blob_bytes)
    images.append(im)

Now view the images:

In [None]:
for im in images:
    im.show()

And that is how we make queries to our vector database and use the results to retrieve the most relevant images.