# Simple Encoder+FAISS
Embeddings are vector representations of words or sentences. They are used in LLMs to represent language in a format that the model can learn from and work with. Since they are vectors that hold contextual information, many interesting operations can be performed on them, such as measuring how close two sentences are in meaning.
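To make "operations on vectors" concrete, here is a minimal sketch of two common comparisons, cosine similarity and Euclidean distance. The 3-dimensional vectors are made up for illustration (real embeddings from the encoder used below have 384 dimensions), and the helper names are my own.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def euclidean_distance(a, b):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical toy "embeddings", NOT produced by any real model.
toy = {
    "cave":   [0.9, 0.1, 0.0],
    "cavern": [0.8, 0.2, 0.1],
    "potion": [0.0, 0.2, 0.9],
}

for other in ("cavern", "potion"):
    sim = cosine_similarity(toy["cave"], toy[other])
    dist = euclidean_distance(toy["cave"], toy[other])
    print(f"cave vs {other}: cosine={sim:.2f}, distance={dist:.2f}")
```

In this toy setup the related pair scores a higher cosine similarity and a smaller distance than the unrelated pair, which is the property the rest of the notebook relies on.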

For instance, in this notebook I’ve encoded a set of 5 events and 5 queries about these events into embeddings using the lightweight all-MiniLM-L6-v2 encoder model.

In [None]:
!pip install faiss-cpu sentence-transformers

In [4]:
from sentence_transformers import SentenceTransformer
import faiss

In [5]:
events = [
    "The player entered the dark cave.",
    "The dragon sleeps on a pile of gold.",
    "The merchant offered a healing potion.",
    "The knight swore loyalty to the kings.",
    "The village was attacked by bandits."
]

In [6]:
model = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = model.encode(events, convert_to_numpy=True)
print("Embedding shape:", embeddings.shape)

Embedding shape: (5, 384)


Now we need a way to store these vectors and query them. FAISS is a library for fast similarity search over dense vectors: we add the vectors to an index and ask it for the nearest neighbours of a query, instead of looping through them manually.

In [7]:
d = embeddings.shape[1]
index = faiss.IndexFlatL2(d)
index.add(embeddings)
print("Number of vectors indexed:", index.ntotal)

Number of vectors indexed: 5
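For intuition, a flat L2 index can be thought of as comparing the query against every stored vector and keeping the closest ones. The sketch below is a plain-Python approximation of that idea, not FAISS's actual implementation; `flat_l2_search` and the 2-dimensional vectors are hypothetical names and data chosen for illustration.

```python
def flat_l2_search(stored, query, k):
    """Exhaustive nearest-neighbour search using squared L2 distance.

    Returns (distances, indices) for the k closest stored vectors,
    ordered from closest to farthest.
    """
    scored = []
    for idx, vec in enumerate(stored):
        # Squared Euclidean distance between the query and this vector
        dist = sum((q - v) ** 2 for q, v in zip(query, vec))
        scored.append((dist, idx))
    scored.sort()  # smallest distance first
    top = scored[:k]
    return [d for d, _ in top], [i for _, i in top]

# Hypothetical 2-dimensional vectors standing in for real embeddings.
stored = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 3.0]]
distances, indices = flat_l2_search(stored, query=[0.9, 0.1], k=2)
print(indices, distances)
```

The return shape mirrors the distances-then-indices pair that `index.search` gives back later in the notebook, just for a single query.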


Since they are vectors, I can use FAISS to find the events closest to each query by Euclidean distance (IndexFlatL2 reports squared L2 distances, so smaller means more similar). Similar sentences end up close to each other in this vector space, so the nearest event is usually the one that answers the query!

In [8]:
queries = [
    "Who attacked the town?",
    "Who owns the treasure?",
    "Where did the player go?",
    "Who helps with healing?",
    "Who did the knight swore loyalty to?"
]

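for q in queries:
    # Encode the query with the same model that encoded the events
    query_vec = model.encode([q], convert_to_numpy=True)
    # D holds the distances and I the indices of the 2 nearest events
    D, I = index.search(query_vec, k=2)
    print("Query:", q)
    for rank, idx in enumerate(I[0]):
        print(f"Match {rank+1}: {events[idx]} (distance={D[0][rank]:.2f})")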


Query: Who attacked the town?
Match 1: The village was attacked by bandits. (distance=0.65)
Match 2: The player entered the dark cave. (distance=1.54)
Query: Who owns the treasure?
Match 1: The player entered the dark cave. (distance=1.52)
Match 2: The knight swore loyalty to the kings. (distance=1.61)
Query: Where did the player go?
Match 1: The player entered the dark cave. (distance=1.15)
Match 2: The merchant offered a healing potion. (distance=1.60)
Query: Who helps with healing?
Match 1: The merchant offered a healing potion. (distance=0.94)
Match 2: The dragon sleeps on a pile of gold. (distance=1.79)
Query: Who did the knight swore loyalty to?
Match 1: The knight swore loyalty to the kings. (distance=0.26)
Match 2: The merchant offered a healing potion. (distance=1.52)
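The raw distances can be hard to read on their own. If the embeddings are unit length (an assumption here; I have not checked the vector norms in this notebook), a squared L2 distance d corresponds to a cosine similarity of 1 - d/2. The sketch below checks that identity on two made-up unit vectors and then applies it to a few of the distances printed above; `squared_l2_to_cosine` is a hypothetical helper name.

```python
import math

def squared_l2_to_cosine(sq_dist):
    """Cosine similarity implied by a squared L2 distance.

    Only valid when both vectors have length 1.
    """
    return 1.0 - sq_dist / 2.0

def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

# Check the identity on two made-up unit vectors.
a = normalize([3.0, 4.0])
b = normalize([4.0, 3.0])
sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
cosine = sum(x * y for x, y in zip(a, b))
print(f"from distance: {squared_l2_to_cosine(sq_dist):.4f}, direct: {cosine:.4f}")

# Apply it to distances copied from the output above
# (meaningful only under the unit-length assumption).
for d in (0.26, 0.65, 1.52):
    print(f"distance={d:.2f} -> cosine ~ {squared_l2_to_cosine(d):.2f}")
```

Read this way, the knight match looks very strong, while the top match for the treasure query looks weak.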


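As we can see, the results are not bad: four of the five queries retrieve the right event as the top match! The exception is the dragon-treasure query, where the dragon sentence does not even make the top two. The query says "treasure" while the event says "pile of gold", so linking them takes a small inference, and that's a clear indication that this simple setup mostly captures surface similarity, not reasoning. Still, it is impressive that all of the tokenizing, embedding lookup, attention, and pooling happens inside a single model.encode call.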

In conclusion, we translated sentences into vectors and used distances in that vector space to match queries to events, which is pretty awesome!