# 🤖 Thesisaurus: RAG System for Robotics Research
## End-to-End Demo

**Requirements:**
- GPU runtime (T4 or better)
- Hugging Face token with access to the gated Llama 3.1 models
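
The notebook does not show how the token is supplied. As a minimal sketch, assuming the token is stored in the `HF_TOKEN` environment variable (the helper name `get_hf_token` is hypothetical and not part of the repo), a cell like the following fails early with a readable message when the token is missing:

```python
import os


def get_hf_token(env_var: str = "HF_TOKEN") -> str:
    """Return the Hugging Face token stored in an environment variable.

    Assumption: the token was exported as HF_TOKEN before the notebook started.
    Raises RuntimeError if the variable is unset or empty.
    """
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(
            f"{env_var} is not set; create a token with Llama access and export it first."
        )
    return token
```

With the token in hand, `from huggingface_hub import login` followed by `login(token=get_hf_token())` should authenticate the session before the model is downloaded.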


In [None]:
# Check GPU
!nvidia-smi
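
If a programmatic check is preferred over reading the `nvidia-smi` table, the sketch below queries only the GPU name and total memory. The helper name `gpu_summary` is illustrative; it uses the standard library only and returns a notice instead of raising when no NVIDIA driver is present.

```python
import shutil
import subprocess


def gpu_summary() -> str:
    """Return 'name, total memory' per GPU as reported by nvidia-smi, or a notice."""
    if shutil.which("nvidia-smi") is None:
        return "nvidia-smi not found: no NVIDIA GPU runtime detected"
    proc = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
        capture_output=True,
        text=True,
    )
    if proc.returncode != 0:
        return f"nvidia-smi failed: {proc.stderr.strip()}"
    return proc.stdout.strip()


print(gpu_summary())
```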


In [None]:
# Install dependencies
%pip install -q transformers accelerate bitsandbytes torch sentence-transformers faiss-cpu


In [None]:
# Clone repo
!git clone https://github.com/khushnaidu/Thesisaurus.git
%cd Thesisaurus


## Setup: Load LLM and Initialize System


In [None]:
from phase3_llm.llm_wrapper import LLMWrapper
from phase3_llm.pipeline import ResearchAssistant

# Load LLM (takes 2-3 minutes)
print("Loading Llama-3.1-8B...")
llm = LLMWrapper(model_size="8b")
llm.load()
print("✓ LLM loaded!")
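
The assistant initialized in the next cell reads three files from `phase1_data_preparation/outputs/`. A small sketch, assuming the working directory is the repo root after the `%cd` above, can confirm they exist before construction. The helper name `missing_artifacts` is hypothetical.

```python
from pathlib import Path

# Paths copied from the ResearchAssistant call in this notebook.
ARTIFACTS = {
    "db_path": "phase1_data_preparation/outputs/papers.db",
    "index_path": "phase1_data_preparation/outputs/faiss_index.bin",
    "metadata_path": "phase1_data_preparation/outputs/chunk_metadata.json",
}


def missing_artifacts(paths: dict[str, str], root: str = ".") -> list[str]:
    """Return the argument names whose files do not exist under root."""
    return [name for name, rel in paths.items() if not (Path(root) / rel).is_file()]


missing = missing_artifacts(ARTIFACTS)
if missing:
    print("Missing artifacts:", ", ".join(missing))
else:
    print("All artifacts found.")
```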


In [None]:
# Initialize Research Assistant
assistant = ResearchAssistant(
    llm=llm,
    db_path="phase1_data_preparation/outputs/papers.db",
    index_path="phase1_data_preparation/outputs/faiss_index.bin",
    metadata_path="phase1_data_preparation/outputs/chunk_metadata.json"
)
print("✓ Research Assistant ready!")


## Demo Queries

Run these cells to see the system in action!

**The queries below exercise three tool types:**
- Database queries (datasets, models, training)
- Semantic search (paper content, concepts)
- Web search (arXiv API)
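
This notebook does not show how `ResearchAssistant` chooses among these tools, and the real system may well let the LLM decide. Purely to illustrate the idea of routing a question to one of the three tool types, here is a toy keyword-based sketch. The function name, the tool labels, and the keyword lists are all hypothetical and are not taken from the repo.

```python
def guess_tool(question: str) -> str:
    """Toy router: map a question to one of three illustrative tool labels."""
    q = question.lower()
    # Explicit mentions of arXiv or recency suggest a live web search.
    if "arxiv" in q or "recent papers" in q:
        return "web_search"
    # Aggregate questions about structured fields suggest the database.
    if any(keyword in q for keyword in ("dataset", "model", "training")):
        return "database"
    # Everything else falls back to semantic search over paper chunks.
    return "semantic_search"


print(guess_tool("Tell me about OpenVLA"))
```

A keyword router like this is brittle; it is shown only to make the three categories concrete.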


In [None]:
# Query 1: Database query - common datasets
result = assistant.answer("What are the most common evaluation datasets?")


In [None]:
# Query 2: Database query - vision models
result = assistant.answer("What vision models are used across papers?")


In [None]:
# Query 3: Semantic search - specific paper
result = assistant.answer("Tell me about OpenVLA")


In [None]:
# Query 4: Semantic search - conceptual question
result = assistant.answer("How does sim-to-real transfer work in robot learning?")


In [None]:
# Query 5: Database query - training setups
result = assistant.answer("What training setups were used?")


In [None]:
# Query 6: Web search - arXiv recent papers
result = assistant.answer("Find recent papers on robot manipulation on arXiv")
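
Each cell above overwrites `result`. To keep all six answers for later comparison, the queries can be run in a loop. The sketch below assumes only that `assistant.answer` accepts a question string; its return type is not shown in this notebook, so results are stored as-is. The helper name `run_queries` is hypothetical.

```python
from typing import Any, Callable

DEMO_QUERIES = [
    "What are the most common evaluation datasets?",
    "What vision models are used across papers?",
    "Tell me about OpenVLA",
    "How does sim-to-real transfer work in robot learning?",
    "What training setups were used?",
    "Find recent papers on robot manipulation on arXiv",
]


def run_queries(answer_fn: Callable[[str], Any], queries: list[str]) -> dict[str, Any]:
    """Call answer_fn on each query and collect results keyed by question.

    If a query raises, the exception object is stored so the loop continues.
    """
    results: dict[str, Any] = {}
    for question in queries:
        try:
            results[question] = answer_fn(question)
        except Exception as err:
            results[question] = err
    return results


# In this notebook: results = run_queries(assistant.answer, DEMO_QUERIES)
```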


## Try Your Own Query!


In [None]:
# Custom query
my_question = "What robots are used in these papers?"
result = assistant.answer(my_question)
