# Vector Search via Embeddings

First, we need to load all the Markdown files and split them into chunks, so that each text is small enough to embed.


In [15]:
from langchain_community.document_loaders import DirectoryLoader, TextLoader
from langchain_text_splitters import MarkdownTextSplitter

loader = DirectoryLoader(
    "./data",
    glob="**/*.md",
    loader_cls=TextLoader,
)

documents = loader.load()

# Split the markdown documents into chunks
text_splitter = MarkdownTextSplitter()

split_documents = text_splitter.split_documents(documents)
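
To make the chunking step less of a black box, here is a rough pure-Python sketch of the idea: split right before each Markdown heading, then pack neighbouring sections together until a character budget is reached. `split_markdown`, its `chunk_size` parameter, and the sample text are hypothetical names I made up for illustration. This is not how `MarkdownTextSplitter` is implemented; as far as I know, the real splitter works through a longer list of Markdown separators and also overlaps chunks.

```python
import re


def split_markdown(text: str, chunk_size: int = 200) -> list[str]:
    """Split before Markdown headings, then pack sections into chunks
    of at most `chunk_size` characters (hypothetical helper)."""
    # Zero-width split right before every heading line, so each section
    # keeps its own heading.
    sections = [s.strip() for s in re.split(r"(?m)^(?=#{1,6} )", text) if s.strip()]

    chunks: list[str] = []
    current = ""
    for section in sections:
        candidate = f"{current}\n\n{section}" if current else section
        if len(candidate) <= chunk_size:
            current = candidate  # still within budget, keep packing
        else:
            if current:
                chunks.append(current)
            current = section  # an oversized section stays whole in this sketch
    if current:
        chunks.append(current)
    return chunks


# Made-up sample text, not taken from the SRD files.
sample = (
    "# Elf\n\nElves are graceful.\n\n"
    "## Elf Traits\n\nDexterity +2.\n\n"
    "# Dwarf\n\nDwarves are sturdy."
)
chunks = split_markdown(sample, chunk_size=60)
for chunk in chunks:
    print(repr(chunk))
```

The budget matters because an embedding model only reads a limited number of tokens per input, so a chunk that is too long may be truncated before it is embedded.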

## Embedding Model

Oh dear lord, there are a lot of embedding models to choose from.
I guess I could spend days figuring out which one has the best performance for my use case.
If you are interested, here is the [MTEB Leaderboard](https://huggingface.co/spaces/mteb/leaderboard).

I would have liked to choose a bigger model, but for now let's stick with something super simple.


In [16]:
from langchain_huggingface import HuggingFaceEmbeddings

embedding_model = HuggingFaceEmbeddings(model_name="BAAI/bge-small-en-v1.5")

## VectorDB

Let's embed those chunks and stuff them in a vector database locally.

I chose LanceDB because it is written in Rust and I'm a sucker for the crabs.


In [17]:
from langchain_community.vectorstores import LanceDB

db_path = "./db/srd.lancedb"

vectorstore = LanceDB(uri=db_path, embedding=embedding_model, table_name="srd")

In [None]:
vectorstore.add_documents(split_documents)

## Retrieval

### Vector Similarity

Now it is time to search our documents.


In [20]:
# Simple similarity search
vectorstore.similarity_search_with_score(query="What is an Elf?", k=5)

[(Document(metadata={'source': 'data/DND.SRD.Wiki-0.5.2/Races/Elf.md'}, page_content='# Elf\n\n### Elf Traits\n\nYour elf character has a variety of natural abilities, the result of thousands of years of elven refinement.\n\n***Ability Score Increase***. Your Dexterity score increases by 2.\n\n***Age***. Although elves reach physical maturity at about the same age as humans, the elven understanding of adulthood goes beyond physical growth to encompass worldly experience. An elf typically claims adulthood and an adult name around the age of 100 and can live to be 750 years old.\n\n***Alignment***. Elves love freedom, variety, and self- expression, so they lean strongly toward the gentler aspects of chaos. They value and protect others\' freedom as well as their own, and they are more often good than not. The drow are an exception; their exile has made them vicious and dangerous. Drow are more often evil than not.\n\n***Size***. Elves range from under 5 to over 6 feet tall and have slend
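
For intuition, this is roughly what a similarity search does under the hood: embed the query, compare it with every stored chunk vector, and return the `k` closest ones. The sketch below uses cosine similarity on made-up 3-dimensional vectors; `cosine_similarity`, `top_k`, and the toy vectors are assumptions for illustration only. The real store works on 384-dimensional vectors and may use a different metric (for example L2 distance, where a lower score is better) or an approximate index.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec: list[float], doc_vecs: dict[str, list[float]], k: int = 2):
    """Brute-force nearest neighbours: score every document, keep the best k."""
    scored = [(doc_id, cosine_similarity(query_vec, vec)) for doc_id, vec in doc_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Toy embeddings, invented for this example.
toy_vectors = {
    "Elf.md": [0.9, 0.1, 0.0],
    "Half-Elf.md": [0.7, 0.3, 0.1],
    "Fireball.md": [0.0, 0.2, 0.9],
}
toy_query = [1.0, 0.0, 0.0]

results = top_k(toy_query, toy_vectors, k=2)
for doc_id, score in results:
    print(doc_id, round(score, 3))
```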

### Hybrid Retrieval

Now let's do a hybrid search that combines dense vector search with BM25 full-text search.


In [25]:
import lancedb

db = lancedb.connect(db_path)

table = db.open_table("srd")
table.create_fts_index(field_names=["text"], replace=True)

query = "What is an Elf?"
query_vector = embedding_model.embed_query(query)

table.search(query_type="hybrid").vector(query_vector).text(query).limit(5).to_polars()

| vector | id | text | metadata | _relevance_score |
| --- | --- | --- | --- | --- |
| array[f32, 384] | str | str | struct[1] | f32 |
| `[-0.069095, 0.058212, … -0.034371]` | `"0e546ea8-5352-488b-9390-1cb38f…"` | `"# Elf ### Elf Traits Your el…"` | `{"data/DND.SRD.Wiki-0.5.2/Races/Elf.md"}` | 0.032522 |
| `[-0.078247, 0.056938, … -0.061199]` | `"1f179c99-c1d0-4a8e-b43e-b48e73…"` | `"# Half-Elf ### Half-Elf Trait…"` | `{"data/DND.SRD.Wiki-0.5.2/Races/Half-Elf.md"}` | 0.032002 |
| `[-0.1178, 0.026662, … 0.046217]` | `"738b8007-dfbb-42df-8ed4-b47bbe…"` | `"## Elf, Drow *Medium humanoid…"` | `{"data/DND.SRD.Wiki-0.5.2/Monsters/Elf, Drow.md"}` | 0.03101 |
| `[-0.068049, -0.006061, … -0.014065]` | `"fb5c5096-7353-49b0-8e72-3f0d80…"` | `"### Reincarnate *5th-level tr…"` | `{"data/DND.SRD.Wiki-0.5.2/Spells/Reincarnate.md"}` | 0.016393 |
| `[-0.051547, 0.01843, … 0.030391]` | `"588f1d62-44fa-4751-b338-5360a7…"` | `"**Elementals** are creatures n…"` | `{"data/DND.SRD.Wiki-0.5.2/Monsters (Alt)/# Monster Statistics.md"}` | 0.015873 |

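A note on those relevance scores: they look consistent with reciprocal rank fusion (RRF) with a constant of 60. The top score, 0.032522, equals 1/61 + 1/62, which is what RRF gives a document ranked first in one result list and second in the other. The sketch below shows the fusion step in plain Python. `reciprocal_rank_fusion` and the two toy rankings are hypothetical; I did not inspect the actual per-retriever rankings, and this is my reading of the numbers rather than a statement about LanceDB's internals.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse several ranked lists: each list contributes 1 / (k + rank)
    for every document it contains, with ranks starting at 1."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)


# Hypothetical rankings, chosen only to illustrate the arithmetic.
vector_ranking = ["elf", "half-elf", "elementals"]
keyword_ranking = ["reincarnate", "elf", "half-elf"]

fused = reciprocal_rank_fusion([vector_ranking, keyword_ranking])
for doc_id, score in fused:
    print(f"{doc_id}: {score:.6f}")
```

A document that shows up in both lists beats one that tops only a single list, which is why the spell and the monster statistics page land well below the three elf entries.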

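### Hybrid Retrieval with Reranker

Finally, let's put a cross-encoder reranker on top: it reads the query together with each candidate chunk and rescores the hybrid results.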

In [26]:
from lancedb.rerankers import CrossEncoderReranker

reranker = CrossEncoderReranker(model_name="cross-encoder/ms-marco-MiniLM-L6-v2")

table.search(query_type="hybrid").vector(query_vector).text(query).limit(5).rerank(
    reranker
).to_polars()

| vector | id | text | metadata | _relevance_score |
| --- | --- | --- | --- | --- |
| array[f32, 384] | str | str | struct[1] | f32 |
| `[-0.069095, 0.058212, … -0.034371]` | `"0e546ea8-5352-488b-9390-1cb38f…"` | `"# Elf ### Elf Traits Your el…"` | `{"data/DND.SRD.Wiki-0.5.2/Races/Elf.md"}` | 2.008377 |
| `[-0.078247, 0.056938, … -0.061199]` | `"1f179c99-c1d0-4a8e-b43e-b48e73…"` | `"# Half-Elf ### Half-Elf Trait…"` | `{"data/DND.SRD.Wiki-0.5.2/Races/Half-Elf.md"}` | -1.293953 |
| `[-0.1178, 0.026662, … 0.046217]` | `"738b8007-dfbb-42df-8ed4-b47bbe…"` | `"## Elf, Drow *Medium humanoid…"` | `{"data/DND.SRD.Wiki-0.5.2/Monsters/Elf, Drow.md"}` | -3.574201 |
| `[-0.068587, 0.002545, … -0.012316]` | `"03a1d37a-42f5-4381-8232-610d87…"` | `"**Table- Reincarnate Race** \|…"` | `{"data/DND.SRD.Wiki-0.5.2/Spells (Alt)/Spells R.md"}` | -4.437528 |
| `[-0.051547, 0.01843, … 0.030391]` | `"588f1d62-44fa-4751-b338-5360a7…"` | `"**Elementals** are creatures n…"` | `{"data/DND.SRD.Wiki-0.5.2/Monsters (Alt)/# Monster Statistics.md"}` | -5.166459 |

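The reranked scores are unbounded and mostly negative, which I read as raw cross-encoder outputs (logits) rather than probabilities. Only their order matters for ranking. If you prefer values between 0 and 1, a sigmoid maps them without changing the order. The sketch below applies it to the scores from the table above; `sigmoid` and the dictionary layout are my own illustration, and treating the scores as logits is an assumption.

```python
import math


def sigmoid(x: float) -> float:
    """Map an unbounded score to the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


# Scores copied from the reranked result table, keyed by source file.
reranker_scores = {
    "Races/Elf.md": 2.008377,
    "Races/Half-Elf.md": -1.293953,
    "Monsters/Elf, Drow.md": -3.574201,
    "Spells (Alt)/Spells R.md": -4.437528,
    "Monsters (Alt)/# Monster Statistics.md": -5.166459,
}

# The sigmoid is monotonic, so the ranking stays exactly the same.
probabilities = {doc: sigmoid(score) for doc, score in reranker_scores.items()}
for doc, prob in probabilities.items():
    print(f"{doc}: {prob:.3f}")
```

Seen this way, only the Elf page ends up above 0.5, which matches the intuition that it is the one chunk that directly answers the question.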