# RAG FAQ Assistant - Quickstart Guide

This notebook walks through creating embeddings for your documentation and then querying it with the RAG FAQ Assistant.

## Setup

Before running the cells below, install the dependencies and place your documentation files in the `raw_docs` directory.

In [None]:
# Install dependencies into the notebook kernel's environment if needed
%pip install -r ../requirements.txt

## Legal Disclaimer

**Important:** This project is not affiliated with, endorsed by, or connected to Amazon Web Services. The documentation you process should be obtained according to the documentation provider's terms of service.

For AWS documentation, please review the official AWS Documentation Terms: https://aws.amazon.com/terms/

## 1. Creating Embeddings

Import the project modules, confirm that documentation files are present, and then create embeddings for them.

In [None]:
import sys
from pathlib import Path

# Make the project modules in the parent directory importable
sys.path.append('..')

# Make sure the raw_docs directory exists and contains .txt documentation files
raw_docs_dir = Path('../raw_docs')
doc_files = sorted(raw_docs_dir.glob('*.txt')) if raw_docs_dir.is_dir() else []
if not doc_files:
    print("Error: No .txt documentation files found in the raw_docs directory.")
    print("Please add documentation files to the raw_docs directory first.")
else:
    print(f"Found {len(doc_files)} documentation files.")

In [None]:
# Import the embedding creation module
from create_embeddings import create_embeddings

# Create embeddings (this might take a few minutes depending on your documentation size)
create_embeddings()

## 2. Querying the Documentation

Now that we have embeddings, let's ask some questions about the documentation.

In [None]:
from query_assistant import answer_question

# Ask a question
question = "What are the pillars of the AWS Well-Architected Framework?"
result = answer_question(question)

print(f"Question: {question}\n")
print(f"Answer:\n{result['answer']}\n")
print("Sources:")
for i, source in enumerate(result['sources'], start=1):
    print(f"  {i}. {source}")

## 3. Advanced Usage: Customizing the Retrieval

You can change how many passages are retrieved and inspect the raw results, including their scores, yourself.

In [None]:
from query_assistant import create_retriever

# Create a custom retriever with more results
retriever = create_retriever(k=5)  # Get the top 5 results instead of the default 3

# Ask a different question
question = "How does the Operational Excellence pillar improve workload quality?"

# Get results manually
docs_and_scores = retriever.similarity_search_with_score(question)

# Display results
print(f"Question: {question}\n")
print(f"Found {len(docs_and_scores)} relevant passages:\n")

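for i, (doc, score) in enumerate(docs_and_scores, start=1):
    # Depending on the vector store, the score may be a similarity
    # (higher is better) or a distance (lower is better)
    print(f"Result {i} (Score: {score:.4f})")
    print(f"Content: {doc.page_content}")
    print(f"Source: {doc.metadata.get('source', 'Unknown')}")
    print("-" * 80)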
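To make the effect of `k` concrete, here is a minimal, self-contained sketch of top-k retrieval by cosine similarity. It is an illustration only, not the project's implementation: the helper names `cosine_similarity` and `top_k` and the three-dimensional toy vectors are assumptions made for this example, and a real retriever would use vectors produced by an embedding model.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # Treat a zero vector as unrelated to everything
    return dot / (norm_a * norm_b)


def top_k(query_vec, passages, k=3):
    """Return the k (text, score) pairs most similar to query_vec."""
    scored = [(text, cosine_similarity(query_vec, vec)) for text, vec in passages]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy data: (passage text, embedding vector)
toy_passages = [
    ("Operational Excellence", [1.0, 0.0, 0.0]),
    ("Security", [0.0, 1.0, 0.0]),
    ("Reliability", [0.7, 0.7, 0.0]),
    ("Cost Optimization", [0.0, 0.0, 1.0]),
]
toy_query = [1.0, 0.1, 0.0]

for text, score in top_k(toy_query, toy_passages, k=2):
    print(f"{score:.4f}  {text}")
```

Raising `k` returns more passages at the cost of including lower-scoring, possibly less relevant ones.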

## 4. Using LLM-based Answering (Optional)

If you have OpenAI API access configured, you can use the LLM-based answering feature.

Note: This feature requires an `OPENAI_API_KEY` entry in your .env file.

In [None]:
# Check if OpenAI API key is available
import os
from dotenv import load_dotenv
load_dotenv()

if os.getenv("OPENAI_API_KEY"):
    from query_assistant import answer_question_with_llm
    
    question = "What is the AWS Well-Architected Framework?"
    result = answer_question_with_llm(question)
    
    print(f"Question: {question}\n")
    print(f"LLM Answer:\n{result['answer']}\n")
    print("Sources:")
    for i, source in enumerate(result['sources'], start=1):
        print(f"  {i}. {source}")
else:
    print("OpenAI API key not found in .env file. LLM-based answering is not available.")

## Next Steps

- Try it with your own documentation
- Experiment with different similarity thresholds
- Contribute improvements to the project
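As a starting point for the threshold experiments mentioned above, the sketch below filters `(doc, score)` pairs, shaped like the results shown in section 3, by a cutoff. The function `filter_by_score`, its `higher_is_better` flag, and the sample data are hypothetical and not part of the project. Whether a higher or lower score means a better match depends on your vector store, so check that before picking a threshold.

```python
def filter_by_score(docs_and_scores, threshold, higher_is_better=True):
    """Keep only the (doc, score) pairs that pass the threshold."""
    if higher_is_better:
        # Similarity-style scores: keep results at or above the cutoff
        return [(doc, score) for doc, score in docs_and_scores if score >= threshold]
    # Distance-style scores: keep results at or below the cutoff
    return [(doc, score) for doc, score in docs_and_scores if score <= threshold]


# Hypothetical sample results; plain strings stand in for document objects
sample = [("passage A", 0.91), ("passage B", 0.42), ("passage C", 0.77)]

kept = filter_by_score(sample, threshold=0.75)
print([doc for doc, _ in kept])
```

Try a few thresholds on real queries and compare which passages survive before settling on a value.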

Thank you for using the RAG FAQ Assistant!