#Advanced Vector Search with Pinecone and CLIP Models

This notebook demonstrates how to query a large-scale Pinecone vector index with CLIP embeddings in order to run image and text searches.
#Getting Started
**Prerequisites:**


*   Python 3.8+ (the pinned `pinecone-client` 3.x release does not support older versions)
*   Google Colab, Jupyter Notebook, or JupyterLab
*   Installation of required Python libraries such as pinecone-client, transformers, and torch

#Configuration
Ensure you have your Pinecone API key available. You can obtain one from the Pinecone console at app.pinecone.io. Store the key in an environment variable rather than pasting it into the notebook, so that it is not shared along with the notebook file.
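One way to keep the key out of the notebook is to read it from an environment variable and fail early when it is missing. The sketch below assumes the variable is named `PINECONE_API_KEY` (a naming choice, not something Pinecone mandates), and `load_api_key` is a hypothetical helper rather than part of any library.

```python
import os

def load_api_key(env=None, var_name="PINECONE_API_KEY"):
    """Return the API key stored in `var_name`, or raise if it is unset or empty."""
    # Default to the real process environment; a dict can be passed for testing.
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {var_name} environment variable before running the notebook."
        )
    return key
```

The returned string can then be passed to the Pinecone client in place of a hardcoded value.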

#Usage
Open the notebook in Google Colab, Jupyter Notebook or JupyterLab. Follow the step-by-step instructions within the notebook to connect to Pinecone, select the appropriate CLIP model, and perform searches. The notebook includes comments and guidelines to help you understand each step.

#Features


*   Vector Database Connection: Connect to a scalable Pinecone database.
*   CLIP Model Integration: Utilize CLIP models for converting images and text into vectors.
*   Interactive Search: Conduct searches using a simple user interface within the notebook.
*   Results Visualization: View the outcomes of your searches directly in the notebook.
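To make the idea behind vector search concrete, the toy sketch below ranks a few hand-made vectors by their similarity to a query vector. It only illustrates the principle: Pinecone performs the real search over the full index, and the notebook does not state which similarity metric the index uses, so cosine similarity is an assumption here. The vectors and ids are made up.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query, items, k=2):
    """Return the k (id, score) pairs most similar to the query, best first."""
    scored = [(item_id, cosine_similarity(query, vec)) for item_id, vec in items.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Made-up three-dimensional "embeddings"; real CLIP vectors are much longer.
toy_index = {
    "dog": [0.9, 0.1, 0.0],
    "puppy": [0.8, 0.2, 0.1],
    "bridge": [0.0, 0.1, 0.9],
}

print(top_k([1.0, 0.0, 0.0], toy_index))
```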







#General Imports for Demo


Installs

In [2]:
!pip install torch torchvision pillow flax transformers
!pip install -qU \
  pinecone-client==3.1.0



In [3]:
import io
import os
import time
import torch
import requests
import pandas as pd
import numpy as np
from pinecone import Pinecone, ServerlessSpec
from transformers import CLIPProcessor, CLIPModel, AutoTokenizer, FlaxCLIPTextModelWithProjection
from PIL import Image
from IPython.display import display, HTML
from io import BytesIO



#Connecting to Pinecone


In [4]:
# initialize connection to pinecone (get API key at app.pinecone.io)
# read the key from the PINECONE_API_KEY environment variable; never commit a real key
api_key = os.environ.get('PINECONE_API_KEY') or 'YOUR_API_KEY'

# configure client
pc = Pinecone(api_key=api_key)

# serverless settings (the spec is only needed when creating a new index)
cloud = os.environ.get('PINECONE_CLOUD') or 'aws'
region = os.environ.get('PINECONE_REGION') or 'us-west-2'

spec = ServerlessSpec(cloud=cloud, region=region)

Connecting to the Pinecone index

In [5]:
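# name of the LAION index to query
index_name = 'laion-400m'

existing_indexes = [
    index_info["name"] for index_info in pc.list_indexes()
]

# fail early if the index does not exist in this project
if index_name not in existing_indexes:
    raise ValueError(f"Index '{index_name}' not found; available: {existing_indexes}")

# connect to index
index = pc.Index(index_name)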
time.sleep(1)
# view index stats
index.describe_index_stats()

{'dimension': 512,
 'index_fullness': 0.0,
 'namespaces': {'': {'vector_count': 278400435}},
 'total_vector_count': 278400435}
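Before querying, it can help to confirm that the index dimension matches the embedding size of the model you plan to use. The sketch below works on a plain dictionary shaped like the output above; `check_dimension` is a hypothetical helper. The expected value of 512 matches the embeddings produced by the CLIP checkpoint used later in this notebook.

```python
def check_dimension(stats, expected_dim):
    """Raise if the index dimension differs from the model's embedding size.

    Returns the total vector count so the caller can log it.
    """
    actual = stats["dimension"]
    if actual != expected_dim:
        raise ValueError(
            f"Index dimension is {actual}, but the model produces {expected_dim}-d vectors."
        )
    return stats["total_vector_count"]

# Values copied from the stats output above.
stats = {
    "dimension": 512,
    "index_fullness": 0.0,
    "namespaces": {"": {"vector_count": 278400435}},
    "total_vector_count": 278400435,
}
print(check_dimension(stats, expected_dim=512))
```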

#Searching with Images

##Creating an image vector

In [7]:
# Load the pre-trained CLIP model and processor from Hugging Face
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def get_image_embedding(image_path):
    # Load the image
    image = Image.open(image_path)

    # Preprocess the image and return PyTorch tensor
    inputs = processor(images=image, return_tensors="pt")

    # Generate the image embedding
    with torch.no_grad():
        image_embeddings = model.get_image_features(**inputs)

    # Convert the PyTorch tensor to a nested Python list (one inner list per image)
    return image_embeddings.cpu().numpy().tolist()

#Give it an image
image_path = "./boxer.jpeg"  # Replace with your image path
image_embedding = get_image_embedding(image_path)

In [8]:
# Now 'image_embedding' is a list containing the vector representation of your image
print(image_embedding)

[[-0.11499130725860596, -0.2915298044681549, -0.3722771406173706, -0.06986269354820251, -0.039647843688726425, -0.8347188830375671, -0.14220204949378967, -0.3288537263870239, 0.6156586408615112, 0.05957019329071045, -0.312560498714447, 0.21596963703632355, 0.18527407944202423, -0.022201940417289734, 0.8037807941436768, 0.12518104910850525, 1.2980550527572632, 0.04160517454147339, -0.28027772903442383, 0.12055625021457672, -0.5168699026107788, 0.477660596370697, 0.12324009835720062, -0.40923774242401123, -0.44150203466415405, 0.08682079613208771, 0.3866202235221863, 0.2852776050567627, 0.10224686563014984, 0.4253156781196594, 0.18553447723388672, -0.18726889789104462, 0.2171315848827362, 0.38283175230026245, -0.2633007764816284, 0.10114464908838272, -0.07161823660135269, 0.24232521653175354, -0.20367690920829773, 1.0719773769378662, -0.6910799145698547, 0.1403619945049286, 0.06283770501613617, 0.26932865381240845, 0.14178681373596191, 1.3033376932144165, -0.22856053709983826, 0.11312726
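The printed value is a nested list: one inner list per input image. If you prefer to pass a flat list of floats as the query vector, a small helper can unwrap it. `to_query_vector` is a hypothetical name, and the length check assumes the 512-dimensional index shown earlier.

```python
def to_query_vector(embedding, expected_dim=512):
    """Unwrap a batch holding exactly one embedding into a flat list of floats."""
    # A batch of one looks like [[v0, v1, ...]]; take the single inner list.
    if len(embedding) == 1 and isinstance(embedding[0], (list, tuple)):
        embedding = embedding[0]
    vector = [float(x) for x in embedding]
    if len(vector) != expected_dim:
        raise ValueError(f"Expected {expected_dim} values, got {len(vector)}.")
    return vector
```

For example, `to_query_vector(image_embedding)` would return the inner list of the value printed above.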

##Setting up the query

In [11]:
query = image_embedding
# now query
xc = index.query(vector=query, top_k=25, include_metadata=True)

In [12]:
xc

{'matches': [{'id': '36199-64',
              'metadata': {'caption': 'boxer dog with tail boxer puppies for '
                                      'sale akc puppyfinder',
                           'url': 'http://t0.gstatic.com/images?q=tbn:ANd9GcSW_plCoiEmW5am7GnV4IR9FJdRPndOroFhGcMk0FF-W_byp29y'},
              'score': 0.909388,
              'values': []},
             {'id': '4736-2965',
              'metadata': {'caption': 'Barney the Bulldog by gisondan',
                           'url': 'http://ih1.redbubble.net/image.3890825.8486/flat,220x200,075,t.jpg'},
              'score': 0.908618927,
              'values': []},
             {'id': '30898-6124',
              'metadata': {'caption': 'English Bulldog',
                           'url': 'https://puppiesforsaleinsandiego.com/wp-content/uploads/2018/11/EnglishBulldog-300x300.jpg'},
              'score': 0.904597163,
              'values': []},
             {'id': '40998-3652',
              'metadata': {'caption': 'Bo
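The raw response is easier to scan once it is reduced to the fields you care about. The sketch below assumes each match behaves like a dictionary with `id`, `score`, and `metadata` keys, as in the output above; `summarize_matches` and the `min_score` threshold are illustrative, not part of the Pinecone client. The sample data reuses three matches from the output above.

```python
def summarize_matches(matches, min_score=0.0):
    """Return (id, rounded score, caption) tuples for matches at or above min_score."""
    rows = []
    for match in matches:
        if match["score"] < min_score:
            continue
        metadata = match.get("metadata") or {}
        rows.append((match["id"], round(match["score"], 4), metadata.get("caption", "")))
    return rows

sample = [
    {"id": "36199-64", "score": 0.909388,
     "metadata": {"caption": "boxer dog with tail boxer puppies for sale akc puppyfinder"}},
    {"id": "4736-2965", "score": 0.908618927,
     "metadata": {"caption": "Barney the Bulldog by gisondan"}},
    {"id": "30898-6124", "score": 0.904597163,
     "metadata": {"caption": "English Bulldog"}},
]

# Keep only the matches scoring at least 0.905.
for row in summarize_matches(sample, min_score=0.905):
    print(row)
```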

##Display Data

In [13]:
# 'xc' is the Pinecone query result from the previous cell
matches = xc['matches']

html_content = "<h2>Image Results</h2>"
for match in matches:
    caption = match['metadata'].get('caption', 'No caption available')
    url = match['metadata'].get('url', '')

    try:
        # Download the image to check that the URL still resolves to a valid image
        image_response = requests.get(url, timeout=10)
        image = Image.open(BytesIO(image_response.content))
        img_tag = f"<img src='{url}' width='300'>"
    except requests.exceptions.SSLError as e:
        print(f"SSL Error for URL {url}: {e}")
        img_tag = "<p>Image not available due to SSL error.</p>"
    except Exception as e:
        print(f"General Error for URL {url}: {e}")
        img_tag = "<p>Image not available due to general error.</p>"

    html_content += f"<div style='margin-bottom: 20px;'><h4>{caption}</h4>{img_tag}<br>Score: {match['score']:.4f}</div>"

General Error for URL https://puppiesforsaleinsandiego.com/wp-content/uploads/2018/11/EnglishBulldog-300x300.jpg: cannot identify image file <_io.BytesIO object at 0x16c252110>
General Error for URL https://photos.smugmug.com/Galleries/Pets/i-gpG9HgV/2/9ad183f7/S/Tesa_29apr2008_0013-Edit-S.jpg: cannot identify image file <_io.BytesIO object at 0x16c252110>
General Error for URL https://www.infinitypups.com/wp-content/uploads/2019/02/english-bulldog-copy.jpg: cannot identify image file <_io.BytesIO object at 0x2825e6ca0>
General Error for URL https://www.mundoanimalia.com/images/cachorros/0c/2f/44/dd14656b73fce3fc005bc64da1926cc4/thumbm__mg_9462__7782.jpg: cannot identify image file <_io.BytesIO object at 0x2b80a7a10>
General Error for URL https://www.naturepl.com/cache/mcache/01457134.jpg: cannot identify image file <_io.BytesIO object at 0x28257ce00>


In [14]:
# Display the HTML content
display(HTML(html_content))
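Captions and URLs in the index are arbitrary strings, so interpolating them directly into HTML can break the markup when they contain characters such as `<`, `&`, or quotes. Below is a hedged sketch of a safer card builder that uses only the standard library; `build_result_card` is a hypothetical helper that mirrors the markup used in the loop above.

```python
import html

def build_result_card(caption, url, score):
    """Build one result card, escaping the caption and URL before embedding them."""
    safe_caption = html.escape(caption)
    # quote=True also escapes single and double quotes, which matters inside src='...'.
    safe_url = html.escape(url, quote=True)
    return (
        "<div style='margin-bottom: 20px;'>"
        f"<h4>{safe_caption}</h4>"
        f"<img src='{safe_url}' width='300'>"
        f"<br>Score: {score:.4f}</div>"
    )
```

Inside the loop, `html_content += build_result_card(caption, url, match['score'])` would replace the f-string that assembles each card.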

#Searching With Text

##Creating a text vector

In [15]:
# Load the Flax CLIP text model and tokenizer
# (note: this rebinds `model`, replacing the PyTorch CLIPModel loaded earlier)
model = FlaxCLIPTextModelWithProjection.from_pretrained("openai/clip-vit-base-patch32")
tokenizer = AutoTokenizer.from_pretrained("openai/clip-vit-base-patch32")

Some weights of the model checkpoint at openai/clip-vit-base-patch32 were not used when initializing FlaxCLIPTextModelWithProjection: {('vision_model', 'encoder', 'layers', '8', 'self_attn', 'k_proj', 'kernel'), ('vision_model', 'encoder', 'layers', '4', 'self_attn', 'v_proj', 'kernel'), ('vision_model', 'encoder', 'layers', '8', 'self_attn', 'v_proj', 'kernel'), ('vision_model', 'encoder', 'layers', '6', 'self_attn', 'v_proj', 'kernel'), ('vision_model', 'encoder', 'layers', '1', 'mlp', 'fc1', 'kernel'), ('vision_model', 'encoder', 'layers', '1', 'mlp', 'fc1', 'bias'), ('visual_projection', 'kernel'), ('vision_model', 'encoder', 'layers', '5', 'self_attn', 'out_proj', 'bias'), ('vision_model', 'pre_layrnorm', 'bias'), ('vision_model', 'encoder', 'layers', '10', 'layer_norm1', 'bias'), ('vision_model', 'encoder', 'layers', '11', 'layer_norm1', 'scale'), ('vision_model', 'encoder', 'layers', '2', 'mlp', 'fc1', 'kern

In [24]:
# Tokenize input text
inputs = tokenizer(["White boxer puppy"], padding=True, return_tensors="np")

In [25]:
# Get model outputs
outputs = model(**inputs)
text_embeds = outputs.text_embeds  # Projected text embedding in the shared image-text space (512 values per input)

In [26]:
# Convert JAX array to numpy array and then to list for Pinecone query
query = np.array(text_embeds).tolist()
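The three cells above can be folded into one function that embeds several queries at once. This is a sketch under assumptions: `embed_texts` is a hypothetical helper, and it takes the tokenizer and model as arguments so it makes the same calls as the cells above without relying on global names.

```python
def embed_texts(texts, tokenizer, model):
    """Return one embedding (a flat list of floats) per input string."""
    # Same calls as in the cells above, applied to a batch of strings.
    inputs = tokenizer(list(texts), padding=True, return_tensors="np")
    text_embeds = model(**inputs).text_embeds
    # Iterate row by row so each query gets its own flat vector.
    return [[float(x) for x in row] for row in text_embeds]
```

With the objects loaded above, `embed_texts(["White boxer puppy", "Black labrador"], tokenizer, model)` would return two vectors, each usable as a separate query.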

##Setting up the query

In [27]:
# Now query Pinecone
xc = index.query(vector=query, top_k=10, include_metadata=True)

In [28]:
xc

{'matches': [{'id': '7331-1173',
              'metadata': {'caption': 'Blue / Eyed / Babe / Dog / Animal / '
                                      'Cute / Puppies / Puppy / Douceur / '
                                      'Bestiole / Mignon / Little Dog / '
                                      'Portrait',
                           'url': 'https://i.pinimg.com/564x/4c/47/f0/4c47f05c6460f0541cd661584bd3d89e.jpg'},
              'score': 0.342332,
              'values': []},
             {'id': '40834-8397',
              'metadata': {'caption': 'We love our new white boxer puppy!',
                           'url': 'https://i.pinimg.com/736x/a0/f7/6d/a0f76da678ec389cb728d10800a75f26--white-boxer-puppies-white-boxers.jpg'},
              'score': 0.342070848,
              'values': []},
             {'id': '36425-3168',
              'metadata': {'caption': '365 11web Cute Animals Puppies Animals '
                                      'Beautiful',
                           'url': 

##Display Text Search

In [29]:
# 'xc' now holds the results of the text query from the previous cell
html_content = "<h2>Text Results</h2>"
for match in xc['matches']:
    caption = match['metadata'].get('caption', 'No caption available')
    url = match['metadata'].get('url', '')

    try:
        # Download the image to check that the URL still resolves to a valid image
        image_response = requests.get(url, timeout=10)
        image = Image.open(BytesIO(image_response.content))
        img_tag = f"<img src='{url}' width='300'>"
    except requests.exceptions.SSLError as e:
        print(f"SSL Error for URL {url}: {e}")
        img_tag = "<p>Image not available due to SSL error.</p>"
    except Exception as e:
        print(f"General Error for URL {url}: {e}")
        img_tag = "<p>Image not available due to general error.</p>"

    html_content += f"<div style='margin-bottom: 20px;'><h4>{caption}</h4>{img_tag}<br>Score: {match['score']:.4f}</div>"

In [30]:
# Display the HTML content
display(HTML(html_content))

#Using This for Day-to-Day Work

In [31]:
import pandas as pd

# Matches from the most recent Pinecone query
matches = xc['matches']

# Prompt user to input a label
input_label = input("Please enter a label for these images: ")

# Convert each entry to include label
data = []
for match in matches:
    entry = {
        "id": match['id'],
        "metadata": match['metadata'],
        "vector": match.get('values', []),  # Use .get() to avoid KeyError if 'values' is absent
        "label": input_label
    }
    data.append(entry)

# Create a DataFrame
df = pd.DataFrame(data)

# Display the DataFrame
print(df)

           id                                           metadata vector   
0   7331-1173  {'caption': 'Blue / Eyed / Babe / Dog / Animal...     []  \
1  40834-8397  {'caption': 'We love our new white boxer puppy...     []   
2  36425-3168  {'caption': '365 11web Cute Animals Puppies An...     []   
3  34477-4097  {'caption': 'American Bulldog Puppy Is Eating ...     []   
4   6857-4548  {'caption': 'Dogo Argentino Puppy stock image ...     []   
5   2352-2876  {'caption': 'free bulldog puppies bulldog pupp...     []   
6  37260-5811  {'caption': 'bulldog puppie bulldog puppy for ...     []   
7   9470-9945  {'caption': 'Image Gallery white pitbull', 'ur...     []   
8    407-6069  {'caption': 'Staffordshire Bull Terrier lookin...     []   
9  31393-9978  {'caption': 'how much are bulldog puppies bull...     []   

         label  
0  white boxer  
1  white boxer  
2  white boxer  
3  white boxer  
4  white boxer  
5  white boxer  
6  white boxer  
7  white boxer  
8  white boxer  
9  w
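Because `metadata` is stored as a nested dictionary, the printed table truncates the captions and hides the URLs. One option is to flatten each match into a single-level record before building the DataFrame. The sketch below uses only plain Python; `flatten_match` is a hypothetical helper, and the resulting list can be passed to `pd.DataFrame` in the same way as `data` in the cell above. The sample match is copied from the text search output shown earlier.

```python
def flatten_match(match, label):
    """Turn one match into a flat record with caption and url as top-level keys."""
    metadata = match.get("metadata") or {}
    return {
        "id": match["id"],
        "caption": metadata.get("caption", ""),
        "url": metadata.get("url", ""),
        "score": match.get("score"),
        "label": label,
    }

sample_matches = [
    {
        "id": "40834-8397",
        "score": 0.342070848,
        "metadata": {
            "caption": "We love our new white boxer puppy!",
            "url": "https://i.pinimg.com/736x/a0/f7/6d/a0f76da678ec389cb728d10800a75f26--white-boxer-puppies-white-boxers.jpg",
        },
    }
]

records = [flatten_match(m, "white boxer") for m in sample_matches]
print(records[0]["caption"])  # prints: We love our new white boxer puppy!
```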

In [32]:
from transformers import pipeline

classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

def auto_label_image(caption, candidate_labels):
    result = classifier(caption, candidate_labels)
    return result['labels'][0]

# Sample usage within batch processing
def process_batch_with_autolabel(matches, candidate_labels):
    data = []
    for match in matches:
        auto_label = auto_label_image(match['metadata']['caption'], candidate_labels)
        entry = {
            "id": match['id'],
            "caption": match['metadata']['caption'],
            "url": match['metadata']['url'],
            "vector": match.get('values', []),
            "label": auto_label
        }
        data.append(entry)
    return data
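The batch function above is defined but not called in this notebook. A usage sketch follows, with one change: the classifier is passed in as an argument so the logic can be tried without downloading the model. `label_matches` and `fake_classifier` are hypothetical; in real use you would pass the `classifier` pipeline created above, whose result contains a `labels` list sorted from most to least likely. The second sample id is made up to show the missing-caption case.

```python
def label_matches(matches, candidate_labels, classify):
    """Assign the top-ranked candidate label to each match, based on its caption."""
    rows = []
    for match in matches:
        caption = (match.get("metadata") or {}).get("caption", "")
        # Skip classification when there is no caption to classify.
        label = classify(caption, candidate_labels)["labels"][0] if caption else None
        rows.append({"id": match["id"], "caption": caption, "label": label})
    return rows

def fake_classifier(text, candidate_labels):
    """Stand-in for the real pipeline: ranks labels by naive substring match."""
    ranked = sorted(
        candidate_labels,
        key=lambda lab: lab.lower() in text.lower(),
        reverse=True,
    )
    return {"labels": ranked}

sample_matches = [
    {"id": "25912-8282", "metadata": {"caption": "San Francisco Skyline"}},
    {"id": "example-no-caption", "metadata": {}},  # made-up id for illustration
]

for row in label_matches(sample_matches, ["dog", "San Francisco"], fake_classifier):
    print(row)
```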


In [None]:
print(data)

[{'id': '31631-5278', 'metadata': {'caption': 'san-francisco-bay-bridge-2-2', 'url': 'https://photos.smugmug.com/SanFrancisco/The-Bay-Bridge/i-RFTJrtm/0/5584af6b/S/san-francisco-bay-bridge-2-2-S.jpg'}, 'vector': [], 'label': 'San Francisco'}, {'id': '32767-7382', 'metadata': {'caption': 'San Francisco in de mist', 'url': 'https://thumbs.werkaandemuur.nl/1/5ef236a7da08566a2c4519ac887096c1/568x550/thumbnail/fit.jpg'}, 'vector': [], 'label': 'San Francisco'}, {'id': '34611-9061', 'metadata': {'caption': 'View to San Francisco from north side of Golden Gate Bridge', 'url': 'https://photos.smugmug.com/Holidays/USA/San-Francisco-October-2017/i-5B6zZ6B/0/135bb2cf/L/P1260486-2-L.jpg'}, 'vector': [], 'label': 'San Francisco'}, {'id': '25912-8282', 'metadata': {'caption': 'San Francisco Skyline', 'url': 'https://images.alexinwanderland.com/wp-content/uploads/2014/05/SanFran_026.jpg.optimal.jpg'}, 'vector': [], 'label': 'San Francisco'}, {'id': '30381-9768', 'metadata': {'caption': 'Bay Bridge', 