In [1]:
pip install sentence-transformers faiss-cpu



Text Data

In [2]:
text = """
Pinecone is a vector database that lets you search and store embeddings efficiently.
It is often used in semantic search and AI applications like chatbots.
Embeddings are numerical vectors that represent the meaning of text or data.
...
"""


Split Text into Chunks

In [3]:
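import re

def split_text(text, max_words=50):
    """Group sentences into chunks of roughly max_words words each."""
    # Split on periods and newlines; the periods themselves are dropped.
    sentences = re.split(r'\.|\n', text)
    chunks = []
    chunk = []

    for sentence in sentences:
        words = sentence.strip().split()
        if not words:
            continue
        chunk.extend(words)
        # Close the chunk once it reaches the limit; whole sentences are kept
        # together, so a chunk can run slightly over max_words.
        if len(chunk) >= max_words:
            chunks.append(" ".join(chunk))
            chunk = []

    # Keep any leftover words as a final, shorter chunk.
    if chunk:
        chunks.append(" ".join(chunk))

    return chunks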

chunks = split_text(text)
print(f"Total Chunks: {len(chunks)}")


Total Chunks: 1
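
The count is 1 because the three sample sentences together stay under the default limit of 50 words. As a minimal sketch, assuming a hypothetical smaller limit of 12 words, the same logic yields one chunk per sentence. The function is repeated in compact form so the block runs on its own, and the sample text omits the "..." line.

```python
import re

def split_text(text, max_words=50):
    # Same logic as the cell above, condensed.
    sentences = re.split(r'\.|\n', text)
    chunks, chunk = [], []
    for sentence in sentences:
        words = sentence.strip().split()
        if not words:
            continue
        chunk.extend(words)
        if len(chunk) >= max_words:
            chunks.append(" ".join(chunk))
            chunk = []
    if chunk:
        chunks.append(" ".join(chunk))
    return chunks

sample = """
Pinecone is a vector database that lets you search and store embeddings efficiently.
It is often used in semantic search and AI applications like chatbots.
Embeddings are numerical vectors that represent the meaning of text or data.
"""

default_chunks = split_text(sample)              # default limit of 50 words
small_chunks = split_text(sample, max_words=12)  # assumed smaller limit
print(len(default_chunks), len(small_chunks))
```

Smaller chunks give the index more than one vector to choose from, which matters once a query asks for several results.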


Embed Text Using Local Model

In [4]:
from sentence_transformers import SentenceTransformer

model = SentenceTransformer('all-MiniLM-L6-v2')
embeddings = model.encode(chunks, show_progress_bar=True)



Store Vectors in FAISS

In [5]:
import faiss
import numpy as np

dimension = embeddings.shape[1]  # 384 for all-MiniLM-L6-v2
index = faiss.IndexFlatL2(dimension)  # exact search by L2 (Euclidean) distance; smaller = closer
index.add(np.asarray(embeddings, dtype="float32"))  # FAISS expects float32 vectors
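
As a rough illustration of what a flat L2 index does at query time, the sketch below compares a query against every stored vector and keeps the k smallest squared Euclidean distances (IndexFlatL2 reports squared distances). It is a pure-Python stand-in written for this explanation, not FAISS code; `flat_l2_search` and the toy 2-D vectors are hypothetical.

```python
def squared_l2(a, b):
    # Sum of squared coordinate differences between two vectors.
    return sum((x - y) ** 2 for x, y in zip(a, b))

def flat_l2_search(vectors, query, k):
    # Score every stored vector, then keep the k closest (smallest distance).
    scored = sorted((squared_l2(v, query), i) for i, v in enumerate(vectors))
    top = scored[:k]
    return [d for d, _ in top], [i for _, i in top]

vectors = [[0.0, 0.0], [1.0, 0.0], [3.0, 4.0]]  # toy "embeddings"
distances, ids = flat_l2_search(vectors, [1.0, 1.0], k=2)
print(distances, ids)
```

Because every vector is scored, results are exact, at the cost of query time that grows linearly with the number of stored vectors.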


Create a Search Function

In [6]:
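def search(query, k=3):
    """Return the chunks whose embeddings are closest to the query."""
    query_vec = model.encode([query])
    # D holds distances, I holds row ids into the index (and into chunks).
    D, I = index.search(query_vec, k)
    # Caveat: if k exceeds the number of stored chunks, FAISS pads I with -1,
    # and chunks[-1] then returns the last chunk again.
    return [chunks[i] for i in I[0]]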
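
When the index holds fewer than k vectors, FAISS fills the unused result slots with -1, and Python's negative indexing turns each -1 into the last chunk, so the same text can show up more than once. A minimal sketch of a guard, using a hypothetical helper `ids_to_chunks` and a hand-written id row in place of a real `index.search` result:

```python
def ids_to_chunks(chunks, ids):
    """Map result ids to chunk texts, skipping -1 padding and out-of-range ids."""
    return [chunks[i] for i in ids if 0 <= i < len(chunks)]

chunks = ["only chunk"]   # a one-chunk corpus, as in this notebook
ids = [0, -1, -1]         # what a k=3 search returns when one vector is stored
print(ids_to_chunks(chunks, ids))
```

Inside `search`, the last line would then read `return ids_to_chunks(chunks, I[0])`.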


Build a Simple Q&A Loop

In [None]:
while True:
    user_input = input("\nAsk a question (or type 'exit'): ")
    if user_input.lower() == "exit":
        break

    results = search(user_input)
    print("\n📚 Top relevant texts:")
    for i, res in enumerate(results):
        print(f"{i+1}. {res}")



Ask a question (or type 'exit'): which data base

📚 Top relevant texts:
1. Pinecone is a vector database that lets you search and store embeddings efficiently It is often used in semantic search and AI applications like chatbots Embeddings are numerical vectors that represent the meaning of text or data
2. Pinecone is a vector database that lets you search and store embeddings efficiently It is often used in semantic search and AI applications like chatbots Embeddings are numerical vectors that represent the meaning of text or data
3. Pinecone is a vector database that lets you search and store embeddings efficiently It is often used in semantic search and AI applications like chatbots Embeddings are numerical vectors that represent the meaning of text or data

Ask a question (or type 'exit'): pinecone is what

📚 Top relevant texts:
1. Pinecone is a vector database that lets you search and store embeddings efficiently It is often used in semantic search and AI applications like chatbo

The chatbot now:
Stores and searches our own data
Finds the passages most relevant to a question
Can be improved later by passing the retrieved passages to a simple LLM
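
As a sketch of that later step, the retrieved passages could be packed into a prompt for whichever LLM is chosen. `build_prompt` and its wording are assumptions made for illustration, no specific LLM API is implied, and the `passages` list stands in for the output of `search()`.

```python
def build_prompt(question, passages):
    # Number the passages so an answer can point back to its source.
    context = "\n".join(f"[{n}] {p}" for n, p in enumerate(passages, start=1))
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

# Stand-in for search("pinecone is what")
passages = [
    "Pinecone is a vector database that lets you search and store embeddings efficiently"
]
prompt = build_prompt("pinecone is what", passages)
print(prompt)
```

The resulting string would be sent to the model in place of the raw question, so the answer is grounded in the retrieved text.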