### 4a_query_search.ipynb  
Takes a raw user query.  
Performs a top-k similarity search in the vector database.  
Returns the top chunks as preliminary matches.

In [9]:
from langchain.vectorstores import FAISS
from langchain.embeddings import HuggingFaceEmbeddings
import os

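First we create an embedder object, then use it to load the locally saved FAISS vectorstore.

The embedding model has to be the same one that was used to build the vectorstore, otherwise the query vector and the stored chunk vectors would not be comparable.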

In [10]:
# initialize the embedding model
embedder = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")

# load the previously saved vector database
# load_local unpickles part of the saved store, so LangChain requires an explicit opt-in
vectorstore = FAISS.load_local(
    "vectorstore/",
    embeddings=embedder,
    allow_dangerous_deserialization=True  # acceptable here only because I created this vector store myself
)


Create an example query.

In [11]:
# define the question or phrase the user wants to semantically search
query = "battery duration of hip and ankle configurations"

print("query:", query)

query: battery duration of hip and ankle configurations


Run a similarity search for this query against the chunks in the vector DB. The chunks were already embedded (vectorized) when the store was built, so only the query gets embedded here.

I decreased the chunk size to 200 since 500 seemed a bit too large.
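To make the idea of a top-k search concrete, here is a minimal pure-Python sketch. It is not how FAISS is implemented internally (FAISS uses optimized indexes); it only illustrates the concept of ranking stored vectors by distance to the query vector. The function name `top_k_by_l2` and the toy 3-dimensional vectors are made up for illustration, and Euclidean (L2) distance is assumed as the metric.

```python
import math

def top_k_by_l2(query_vec, chunk_vecs, k=3):
    """Return (index, distance) pairs for the k chunk vectors nearest to query_vec."""
    scored = [
        (idx, math.dist(query_vec, vec))  # Euclidean distance, smaller = more similar
        for idx, vec in enumerate(chunk_vecs)
    ]
    scored.sort(key=lambda pair: pair[1])
    return scored[:k]

# toy "embeddings" (hypothetical values, real ones have hundreds of dimensions)
toy_chunks = [
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.8, 0.2, 0.1],
    [0.0, 0.0, 1.0],
]
toy_query = [1.0, 0.0, 0.0]

nearest = top_k_by_l2(toy_query, toy_chunks, k=2)
print(nearest)  # indices 0 and 2 are the closest to the query
```

The real search below does the same kind of ranking, just over the embedded text chunks instead of toy vectors.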

In [12]:
# retrieve the 3 chunks whose embeddings are closest to the query embedding
matches = vectorstore.similarity_search(query, k=3)

In [13]:
for i, match in enumerate(matches):
    print(f"\nchunk {i+1}:\n")
    print(match.page_content)


chunk 1:

and hip-and-ankle configurations were able to operate for 35, 25, and 15 min, respectively, before reaching 50% of the voltage capacity of the battery (~900 mAh). Similarly, the elbow configuration

chunk 2:

increase interest in this joint. We evaluated the life span of each configuration when operating on a 22.2-V, 1800-mAh LiPo battery. The hip, ankle, and hip-and-ankle configurations were able to

chunk 3:

whereas the battery decreased until reaching the manufacturer’s recommended limit. When configured to provide simultaneous assistance to the hip and ankle joints, the device was able to operate for
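The printout above shows the order of the matches but not how close each one is. A small sketch of how the scores could be inspected too, using LangChain's `similarity_search_with_score`, which returns `(Document, score)` pairs. The helper name `search_with_scores` is hypothetical. I am assuming the store uses the default L2 distance, in which case a lower score means a closer match; that is worth checking for your LangChain version and index type.

```python
def search_with_scores(store, query, k=3):
    """Return (score, text) pairs for the top-k matches of `query`.

    `store` is assumed to expose LangChain's
    similarity_search_with_score(query, k=...) method.
    """
    results = store.similarity_search_with_score(query, k=k)
    return [(float(score), doc.page_content) for doc, score in results]

# usage with the vectorstore loaded earlier in this notebook:
# for score, text in search_with_scores(vectorstore, query, k=3):
#     print(f"{score:.3f}  {text[:60]}")
```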


In [14]:
# save each retrieved chunk to its own text file so later steps can reuse them
os.makedirs("data", exist_ok=True)

for i, match in enumerate(matches):
    with open(f"data/chunk_{i+1}.txt", "w", encoding="utf-8") as f:
        f.write(match.page_content)
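A later notebook will probably want to read these files back in the same order. A minimal sketch of one way to do that, assuming the `chunk_<n>.txt` naming used above. The helper `load_chunks` is hypothetical and not part of the original pipeline. It sorts by the numeric part of the filename so that `chunk_10.txt` comes after `chunk_2.txt`, which plain alphabetical sorting would get wrong.

```python
import re
from pathlib import Path

def load_chunks(directory="data"):
    """Read chunk_<n>.txt files from `directory`, ordered by <n>."""
    pattern = re.compile(r"chunk_(\d+)\.txt")
    numbered = []
    for path in Path(directory).iterdir():
        found = pattern.fullmatch(path.name)
        if found:  # skip any other files in the folder
            numbered.append((int(found.group(1)), path))
    numbered.sort(key=lambda pair: pair[0])
    return [path.read_text(encoding="utf-8") for _, path in numbered]

# usage:
# chunks = load_chunks("data")
```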