# Hands-on: Create Embeddings and Store Vectors Locally

This notebook shows how to create text embeddings, save them to local `.npy` files, **check where those files were written**, and run a semantic similarity search over them.

## 1. Install Dependencies
Run this cell once.

In [1]:
!pip install sentence-transformers numpy

Defaulting to user installation because normal site-packages is not writeable


## 2. Sample Data

In [2]:
documents = [
    "Vector databases store embeddings",
    "Semantic search focuses on meaning",
    "RAG improves LLM accuracy",
    "Keyword search matches exact words",
    "Embeddings represent text as vectors"
]

## 3. Create Embeddings

In [3]:
from sentence_transformers import SentenceTransformer
import numpy as np

model = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = model.encode(documents)

print("Embedding shape:", embeddings.shape)


Embedding shape: (5, 384)


## 4. Store Embeddings Locally

In [4]:
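# Save the vectors and their source texts side by side:
# row i of embeddings.npy corresponds to entry i of documents.npy.
np.save("embeddings.npy", embeddings)
np.save("documents.npy", np.array(documents))

print("Embeddings stored locally.")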

Embeddings stored locally.
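
As an alternative to two separate `.npy` files, the vectors and their texts can be bundled into one archive so the two cannot drift out of sync. The sketch below is only an illustration and not part of the original notebook: the file name `store.npz`, the temporary directory, and the random stand-in vectors (used so the cell runs without downloading the model) are all assumptions.

```python
import os
import tempfile

import numpy as np

# Stand-in data: random vectors with the same shape as the notebook's
# embeddings (5 documents x 384 dimensions). These are NOT real embeddings.
rng = np.random.default_rng(0)
demo_embeddings = rng.normal(size=(5, 384)).astype(np.float32)
demo_documents = np.array([f"document {i}" for i in range(5)])

# Hypothetical location; a temp directory keeps the demo away from real files.
store_path = os.path.join(tempfile.mkdtemp(), "store.npz")

# np.savez writes several named arrays into a single .npz archive.
np.savez(store_path, embeddings=demo_embeddings, documents=demo_documents)

# Loading returns a dict-like object keyed by the names used above.
with np.load(store_path) as store:
    loaded_embeddings = store["embeddings"]
    loaded_documents = store["documents"]

print(loaded_embeddings.shape, len(loaded_documents))
```

With the notebook's own variables, the equivalent call would be `np.savez("store.npz", embeddings=embeddings, documents=np.array(documents))`.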


## 5. Check Local File Path

In [5]:
import os

print("Current working directory:")
print(os.getcwd())

print("\nFiles in current directory:")
print(os.listdir())

print("\nAbsolute path of embeddings file:")
print(os.path.abspath("embeddings.npy"))

print("\nDoes embeddings file exist?")
print(os.path.exists("embeddings.npy"))

Current working directory:
/Users/immaculate.x/Downloads

Files in current directory:
['1 Introduction to AI.pdf', 'ds6-1_1764522876207 (1).zip', 'OnVUE 2.app', 'Ollama.dmg', 'documents.npy', 'CamScanner 12-01-2025 20.58.28_2 (1).pdf', 'DS_6.ipynb', 'QC-template- April 2023 - March 2024.xls', 'HousePrices.csv', 'C1M1_Assignment.ipynb', 'loan.csv', 'RAG_Assignment_Submission.ipynb', '7thDec.ipynb', '.DS_Store', '9579817_15881981765634509338.pdf', 'certificate.pdf', 'Bishop-Pattern-Recognition-and-Machine-Learning-2006.pdf', 'embeddings_local_demo_with_path.ipynb', '.localized', 'ds_10 (2).ipynb', 'RAG_M5.pdf', 'data_1765040901049.zip', 'd10fcfd6-105c-4383-b82d-b8d754260bd4.pdf', 'amazon_sales_2025_INR.csv', 'RAG_M4.pdf', 'hypo (1).ipynb', 'houseprices_1763830945485.zip', 'data_1765040901049 (1).zip', 'Ex_Files_Python_EssT.zip', 'xavier.zip', 'C1M1_Assignment_filled.ipynb', 'Points to Remember - AWS - Cloud Practioner.pdf', 'Braze Certification AI Fundamentals Study Guide.pdf', 'hypo_2 (
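
Listing an entire directory can bury the two files of interest. A narrower check, sketched here under the assumption that only `.npy` files matter, uses `pathlib` to report each file's absolute path and size. The helper name `describe_npy_files` is hypothetical, and the demo runs in a temporary directory rather than the notebook's working directory.

```python
import tempfile
from pathlib import Path

import numpy as np


def describe_npy_files(directory):
    """Return (absolute path, size in bytes) for each .npy file in directory."""
    folder = Path(directory)
    return [(str(p.resolve()), p.stat().st_size) for p in sorted(folder.glob("*.npy"))]


# Demo in a throwaway directory with a small stand-in array (not real embeddings).
demo_dir = tempfile.mkdtemp()
np.save(Path(demo_dir) / "embeddings.npy", np.zeros((5, 384), dtype=np.float32))

for path, size in describe_npy_files(demo_dir):
    print(path, size, "bytes")
```

In the notebook itself, `describe_npy_files(".")` would report only the `.npy` files in the current working directory.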

## 6. Load Stored Embeddings

In [6]:
stored_embeddings = np.load("embeddings.npy")
stored_documents = np.load("documents.npy")

print("Loaded embeddings shape:", stored_embeddings.shape)

Loaded embeddings shape: (5, 384)
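
For five vectors, loading the whole file is fine. For much larger collections, `np.load` also accepts `mmap_mode`, which maps the file instead of reading it all into memory. The following is a minimal sketch under stated assumptions: the temporary path and the `np.arange` stand-in values are illustrative and are not the notebook's real embeddings.

```python
import os
import tempfile

import numpy as np

# Stand-in array shaped like the notebook's embeddings; values are arbitrary.
demo_path = os.path.join(tempfile.mkdtemp(), "embeddings.npy")
np.save(demo_path, np.arange(5 * 384, dtype=np.float32).reshape(5, 384))

# mmap_mode="r" maps the file read-only instead of loading it fully.
mapped = np.load(demo_path, mmap_mode="r")

# Slicing and copying pulls only the requested rows into memory.
first_two = np.asarray(mapped[:2])
print(type(mapped).__name__, mapped.shape, first_two.shape)
```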


## 7. Similarity Search

In [7]:
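from numpy.linalg import norm

def cosine_similarity(a, b):
    """Cosine of the angle between a and b (1.0 means same direction)."""
    return np.dot(a, b) / (norm(a) * norm(b))

query = "How do embeddings help search?"
# Encode the query with the same model that produced the stored embeddings.
query_embedding = model.encode(query)

# Score every stored vector against the query and keep the highest-scoring one.
scores = [cosine_similarity(query_embedding, emb) for emb in stored_embeddings]
best_match_index = np.argmax(scores)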

print("Query:", query)
print("Best match:", stored_documents[best_match_index])

Query: How do embeddings help search?
Best match: Vector databases store embeddings
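
The loop above returns a single best match. A common variation, sketched here rather than taken from the notebook, computes all cosine scores in one matrix product and returns the top `k` results. The function name `top_k_cosine` and the toy 2-D vectors are assumptions chosen so the ranking is easy to check by hand.

```python
import numpy as np


def top_k_cosine(query_vec, matrix, k=3):
    """Return (indices, scores) of the k rows of matrix most similar to query_vec."""
    # Normalize rows and the query so a dot product equals cosine similarity.
    matrix_unit = matrix / np.linalg.norm(matrix, axis=1, keepdims=True)
    query_unit = query_vec / np.linalg.norm(query_vec)
    scores = matrix_unit @ query_unit
    # argsort is ascending, so reverse it and keep the first k entries.
    order = np.argsort(scores)[::-1][:k]
    return order, scores[order]


# Toy 2-D vectors (not real embeddings).
toy_matrix = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]])
toy_query = np.array([2.0, 0.1])

indices, top_scores = top_k_cosine(toy_query, toy_matrix, k=2)
print(indices, top_scores)
```

With the notebook's variables, `top_k_cosine(query_embedding, stored_embeddings, k=3)` would return indices that can be looked up in `stored_documents`, which makes it easier to see how close the runner-up documents are to the top result.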
