# Hands-on: Create Embeddings and Store Vectors Locally

This notebook demonstrates how to create text embeddings, store them locally, and perform semantic similarity search.

## 1. Install Dependencies
Run this cell once.

In [11]:
%pip install sentence-transformers numpy

## 2. Sample Data

In [None]:
documents = [
    "Vector databases store embeddings",
    "Semantic search focuses on meaning",
    "RAG improves LLM accuracy",
    "Keyword search matches exact words",
    "Embeddings represent text as vectors"
]

## 3. Create Embeddings

In [None]:
from sentence_transformers import SentenceTransformer
import numpy as np

model = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = model.encode(documents)

print("Embedding shape:", embeddings.shape)

Embedding shape: (5, 384)


## 4. Store Embeddings Locally

In [None]:
np.save("embeddings.npy", embeddings)
np.save("documents.npy", np.array(documents))

print("Embeddings stored locally.")

Embeddings stored locally.
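
As an alternative to two separate `.npy` files, both arrays could be bundled into one archive with `np.savez`, which keeps vectors and texts from drifting apart. The sketch below is only an illustration: the file name `store.npz`, the temporary directory, and the stand-in data are assumptions, not part of the notebook above.

```python
import os
import tempfile

import numpy as np

# Stand-in data so the sketch runs without the model; in the notebook these
# would be the `documents` list and the `embeddings` array created above.
docs = ["first text", "second text"]
vecs = np.random.default_rng(0).random((2, 384), dtype=np.float32)

# Hypothetical file name; a temp directory keeps the example side-effect free.
path = os.path.join(tempfile.mkdtemp(), "store.npz")

# Save both arrays under named keys in a single archive.
np.savez(path, embeddings=vecs, documents=np.array(docs))

# Load them back by key; the context manager closes the file afterwards.
with np.load(path) as store:
    loaded_vecs = store["embeddings"]
    loaded_docs = store["documents"].tolist()

print(loaded_vecs.shape, loaded_docs)
```

A single archive is easier to move around, while separate `.npy` files are simpler to inspect one at a time; either approach works at this scale.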


## 5. Load Stored Embeddings

In [None]:
stored_embeddings = np.load("embeddings.npy")
stored_documents = np.load("documents.npy")

print("Loaded embeddings:", stored_embeddings.shape)

Loaded embeddings: (5, 384)
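
After loading, it can be worth confirming that the vectors and documents still line up row by row before searching. The helper below is a hypothetical sanity check (the name `check_store` and the stand-in arrays are assumptions for illustration), not something the notebook requires.

```python
import numpy as np

def check_store(vectors, texts):
    """Raise ValueError if vectors and texts cannot be paired row by row."""
    if vectors.ndim != 2:
        raise ValueError(f"expected a 2-D array, got {vectors.ndim}-D")
    if vectors.shape[0] != len(texts):
        raise ValueError(f"{vectors.shape[0]} vectors but {len(texts)} documents")
    if not np.isfinite(vectors).all():
        raise ValueError("embeddings contain NaN or infinite values")
    return vectors.shape

# Stand-in arrays with the same shape as the notebook's stored data.
demo_vectors = np.zeros((5, 384), dtype=np.float32)
demo_texts = np.array(["a", "b", "c", "d", "e"])

print(check_store(demo_vectors, demo_texts))
```

In the notebook itself, the call would presumably be `check_store(stored_embeddings, stored_documents)`.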


## 6. Similarity Search

In [None]:
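from numpy.linalg import norm

def cosine_similarity(a, b):
    # Cosine of the angle between a and b: 1.0 means same direction
    return np.dot(a, b) / (norm(a) * norm(b))

# Embed the query with the same model used for the documents
query = "How do embeddings help search?"
query_embedding = model.encode(query)

# Score the query against every stored embedding and pick the highest
scores = [cosine_similarity(query_embedding, emb) for emb in stored_embeddings]
best_match_index = np.argmax(scores)

print("Query:", query)
print("Best match:", stored_documents[best_match_index])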

Query: How do embeddings help search?
Best match: Vector databases store embeddings
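
The search above returns only the single best match. A minimal sketch of a top-k variant follows, assuming the same cosine-similarity scoring but computed with one matrix product instead of a Python loop. The function name `top_k` and the two-dimensional toy vectors are illustrative assumptions; real embeddings from the model would have 384 dimensions.

```python
import numpy as np

def top_k(query_vec, doc_vecs, k=3):
    """Return (index, cosine score) pairs for the k most similar rows."""
    # Normalize to unit length so the dot product equals cosine similarity.
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    scores = d @ q
    # Sort ascending, reverse for descending order, then keep the first k.
    order = np.argsort(scores)[::-1][:k]
    return [(int(i), float(scores[i])) for i in order]

# Toy vectors standing in for stored embeddings and a query embedding.
toy_docs = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
toy_query = np.array([1.0, 0.1])

results = top_k(toy_query, toy_docs, k=2)
for idx, score in results:
    print(idx, round(score, 3))
```

With the notebook's variables, a call such as `top_k(query_embedding, stored_embeddings, k=3)` would return indices that can be used to look up entries in `stored_documents`.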
