RAG, RAG with Memory, Adaptive RAG, Corrective RAG, Self-RAG, Agentic RAG... are you lost? Let me help you with this guide.

1/ Simple RAG
Retrieves relevant documents based on the query and uses them to generate an answer.

2/ Simple RAG with Memory
Extends Simple RAG by maintaining context from previous interactions.

3/ Branched RAG
Performs multiple retrieval steps, refining the search based on intermediate results.

4/ HyDE (Hypothetical Document Embedding)
Generates a hypothetical ideal document before retrieval to improve search relevance.

5/ Adaptive RAG
Dynamically adjusts retrieval and generation strategies based on the query type or difficulty.

6/ Corrective RAG (CRAG)
Iteratively refines generated responses by fact-checking against retrieved information.

7/ Self-RAG
The model critiques and improves its own responses using self-reflection and retrieval.

8/ Agentic RAG
Combines RAG with agentic behavior, allowing for more complex, multi-step problem-solving.
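
As a rough illustration of item 1 (Simple RAG), here is a minimal sketch that needs no libraries. The word-overlap scorer only stands in for a real embedding search, and `tokenize`, `retrieve`, `build_prompt`, and the sample corpus are hypothetical names made up for this example. The final LLM call is left out.

```python
import re

def tokenize(text):
    """Lowercase a string and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query, corpus, k=1):
    """Return the k documents sharing the most words with the query."""
    query_words = tokenize(query)
    ranked = sorted(corpus, key=lambda doc: len(query_words & tokenize(doc)), reverse=True)
    return ranked[:k]

def build_prompt(query, docs):
    """Place the retrieved documents in front of the question."""
    context = "\n".join(docs)
    return f"Context: {context}\n\nQuestion: {query}\n\nAnswer:"

# Tiny made-up corpus, for illustration only
corpus = [
    "The Boston Celtics won the NBA Finals in 2024.",
    "Ollama runs large language models on a local machine.",
]
question = "Who won the NBA Finals in 2024?"
top_docs = retrieve(question, corpus)
prompt = build_prompt(question, top_docs)
print(prompt)
```

The other variants in the list mostly change what happens around these two steps: what text is used for the search, how often retrieval runs, and whether the answer is checked afterwards.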


https://python.langchain.com/v0.1/docs/get_started/quickstart/

langchain quick start ^


https://python.langchain.com/docs/integrations/providers/ollama/

Ollama integrations ^

Tool calling:
https://ollama.com/blog/tool-support
https://python.langchain.com/docs/how_to/tool_calling/


- Easy example:
https://github.com/Shubhamsaboo/awesome-llm-apps/blob/main/llama3.1_local_rag/llama3.1_local_rag.py

In [2]:
import torch

# Print the PyTorch version
print(f"PyTorch version: {torch.__version__}")

# Check if CUDA (GPU support) is available
if torch.cuda.is_available():
    print("CUDA is available! GPU is ready to be used.")
    print(f"Number of GPUs available: {torch.cuda.device_count()}")
    print(f"Current GPU: {torch.cuda.get_device_name(torch.cuda.current_device())}")
else:
    print("CUDA is not available. GPU is not set up correctly.")

# Print additional GPU details
if torch.cuda.is_available():
    for i in range(torch.cuda.device_count()):
        print(f"GPU {i}: {torch.cuda.get_device_name(i)}")
        print(f"  - Total Memory: {torch.cuda.get_device_properties(i).total_memory / 1e9} GB")
        print(f"  - Compute Capability: {torch.cuda.get_device_capability(i)}")

if torch.cuda.is_available():
    # Create a random tensor and move it to the GPU
    tensor = torch.rand(3, 3).cuda()
    print("Tensor on GPU:", tensor)
else:
    print("GPU is not available, cannot move tensor to GPU.")


PyTorch version: 2.4.1+cu121
CUDA is available! GPU is ready to be used.
Number of GPUs available: 1
Current GPU: NVIDIA GeForce RTX 4090
GPU 0: NVIDIA GeForce RTX 4090
  - Total Memory: 25.756696576 GB
  - Compute Capability: (8, 9)
Tensor on GPU: tensor([[0.7830, 0.3323, 0.5044],
        [0.1345, 0.2114, 0.7690],
        [0.1381, 0.9195, 0.4452]], device='cuda:0')
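
The total memory above is printed in decimal gigabytes (bytes divided by 1e9), which is why a card sold as 24 GB shows up as roughly 25.76 GB. A small sketch of the two conversions, using the byte count implied by the output above; the helper names are made up for illustration.

```python
def bytes_to_gb(n_bytes):
    """Decimal gigabytes: 1 GB = 10**9 bytes."""
    return n_bytes / 1e9

def bytes_to_gib(n_bytes):
    """Binary gibibytes: 1 GiB = 2**30 bytes."""
    return n_bytes / 2**30

total_memory = 25_756_696_576  # bytes, taken from the output above
print(f"{bytes_to_gb(total_memory):.2f} GB")
print(f"{bytes_to_gib(total_memory):.2f} GiB")
```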


In [3]:
import os
from dotenv import load_dotenv


# Print the current working directory (optional for debugging)
print(os.getcwd())

# Set the path to your .env file relative to the current working directory
dotenv_path = os.path.join(os.getcwd(), '../../.env')
load_dotenv(dotenv_path)


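# Set up API keys (fail early with a clear message if the key is missing,
# since assigning None to os.environ raises a TypeError)
tavily_api_key = os.getenv("TAVILY_API_KEY")
if tavily_api_key is None:
    raise RuntimeError(f"TAVILY_API_KEY not found; check {dotenv_path}")
os.environ["TAVILY_API_KEY"] = tavily_api_key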


/workspaces/custom_ollama_docker/notebooks/sports_news_rag


# Possible idea: pull in the current repo's files and recommend fixes

In [9]:
import git
import os
import mimetypes
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores import Chroma
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough

# Function to extract the Git tree structure, ignoring dotfiles if specified
def extract_git_tree(repo_path=".", include_dotfiles=False):
    """
    Extracts the git tree structure including branch names, file paths, and commit history.

    Args:
    - repo_path (str): The path to the repository.
    - include_dotfiles (bool): Whether to include dotfiles in the extracted file paths.

    Returns:
    - dict: A dictionary with keys "branches", "files", and "commits".
    """
    repo = git.Repo(repo_path)
    branches = [branch.name for branch in repo.branches]
    files = repo.git.ls_tree("-r", "--name-only", "HEAD").splitlines()

    # Filter out dotfiles if include_dotfiles is False
    if not include_dotfiles:
        files = [f for f in files if not os.path.basename(f).startswith(".")]

    commit_history = list(repo.iter_commits(max_count=5))

    # Convert relative paths to absolute paths using the repo's working directory
    abs_files = [os.path.join(repo.working_dir, file) for file in files]
    return {"branches": branches, "files": abs_files, "commits": commit_history}

# Function to check if a file is a text file
def is_text_file(file_path):
    """
    Determines if a given file is a text file by checking its MIME type.

    Args:
    - file_path (str): The path to the file.

    Returns:
    - bool: True if the file is a text file, False otherwise.
    """
    mime_type, _ = mimetypes.guess_type(file_path)
    # Only files with a 'text/*' MIME type count as text; files whose type
    # cannot be guessed (mime_type is None) are treated as non-text
    return bool(mime_type and mime_type.startswith("text"))

# Create embeddings for code files and store them in a vector store
def create_vectorstore_from_repo(repo_info, local_llm_model="llama3.2"):
    """
    Create embeddings for code files and store them in a Chroma vector store.

    Args:
    - repo_info (dict): The repository information containing file paths.
    - local_llm_model (str): The local LLM model to use for creating embeddings.

    Returns:
    - retriever (object): A retriever for retrieving documents based on queries.
    """
    file_contents = []
    valid_files = []  # Keep track of successfully read files
    
    for file_path in repo_info["files"]:
        if is_text_file(file_path):
            try:
                # Attempt to open and read the file using the absolute path
                with open(file_path, 'r', encoding='utf-8') as f:
                    content = f.read()
                    file_contents.append(content)
                    valid_files.append(file_path)  # Only add to valid_files if read is successful
            except FileNotFoundError:
                print(f"File not found: {file_path}. Skipping.")
            except UnicodeDecodeError:
                print(f"Error decoding file {file_path}. Skipping.")
            except Exception as e:
                print(f"Error reading file {file_path}: {e}. Skipping.")
        else:
            print(f"Skipping binary or non-text file: {file_path}")
    
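    # Create Document objects only for valid files; split_documents() reads
    # .page_content and .metadata as attributes, so plain dicts raise AttributeError
    from langchain_core.documents import Document
    documents = [
        Document(page_content=content, metadata={"source": file_path})
        for content, file_path in zip(file_contents, valid_files)
    ]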
    
    # Print the number of valid documents to ensure we have content
    print(f"Number of valid documents created: {len(documents)}")
    
    # Check if there are any documents to process
    if not documents:
        raise ValueError("No valid documents were found. Ensure that the repository files are accessible and readable.")
    
    # Use RecursiveCharacterTextSplitter to split code into smaller chunks
    text_splitter = RecursiveCharacterTextSplitter()
    split_docs = text_splitter.split_documents(documents)
    
    # Create embeddings using Ollama's local model
    embeddings = OllamaEmbeddings(model=local_llm_model)
    
    # Create a Chroma vector store from the split document chunks
    vectorstore = Chroma.from_documents(documents=split_docs, embedding=embeddings)
    return vectorstore.as_retriever()

# Example usage: Initialize retriever for a repository, ignoring dotfiles
repo_info = extract_git_tree("../../", include_dotfiles=False)
print(f"Branches: {repo_info['branches']}")
print(f"Files: {repo_info['files'][:5]}")  # Print first 5 files for brevity

# Create a vectorstore retriever for the filtered repository information
try:
    retriever = create_vectorstore_from_repo(repo_info)
except ValueError as ve:
    print(f"Error creating vector store: {ve}")


Branches: ['main']
Files: ['/workspaces/custom_ollama_docker/.devcontainer/Dockerfile', '/workspaces/custom_ollama_docker/.devcontainer/devcontainer.env', '/workspaces/custom_ollama_docker/.devcontainer/devcontainer.json', '/workspaces/custom_ollama_docker/.devcontainer/environment.yml', '/workspaces/custom_ollama_docker/.devcontainer/install_dependencies.sh']
Skipping binary or non-text file: /workspaces/custom_ollama_docker/.devcontainer/Dockerfile
Skipping binary or non-text file: /workspaces/custom_ollama_docker/.devcontainer/devcontainer.env
Skipping binary or non-text file: /workspaces/custom_ollama_docker/.devcontainer/devcontainer.json
Skipping binary or non-text file: /workspaces/custom_ollama_docker/.devcontainer/environment.yml
Skipping binary or non-text file: /workspaces/custom_ollama_docker/chroma_db/chroma.sqlite3
Skipping binary or non-text file: /workspaces/custom_ollama_docker/custom_ollama_docker.code-workspace
Skipping binary or non-text file: /workspaces/custom_oll

AttributeError: 'dict' object has no attribute 'page_content'
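
The AttributeError above is consistent with the text splitter reading `page_content` as an attribute rather than as a dictionary key. The sketch below reproduces that pattern without LangChain. `FakeDocument` and `collect_texts` are hypothetical stand-ins, not LangChain APIs.

```python
from dataclasses import dataclass, field

@dataclass
class FakeDocument:
    """Stand-in for a document object that exposes attributes."""
    page_content: str
    metadata: dict = field(default_factory=dict)

def collect_texts(documents):
    """Mimic a splitter that reads doc.page_content from each document."""
    return [doc.page_content for doc in documents]

as_dicts = [{"page_content": "print('hi')", "metadata": {"source": "a.py"}}]
as_objects = [
    FakeDocument(page_content=d["page_content"], metadata=d["metadata"])
    for d in as_dicts
]

# Attribute access on a dict fails, which is the error shown above
try:
    collect_texts(as_dicts)
    dict_error = None
except AttributeError as exc:
    dict_error = str(exc)

print(dict_error)
print(collect_texts(as_objects))
```

In the real pipeline the equivalent step would be wrapping each file's text in a LangChain `Document` before calling `split_documents`.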

In [10]:
import os
import ast
import mimetypes
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores import Chroma
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Define a function to parse Python code and extract its structure
def extract_python_code_structure(file_path):
    """
    Extracts structure from a Python code file, including functions, classes, and docstrings.

    Args:
    - file_path (str): The path to the Python code file.

    Returns:
    - str: A structured representation of the code in natural language.
    """
    with open(file_path, "r", encoding="utf-8") as file:
        code = file.read()
    
    # Parse the code using AST
    tree = ast.parse(code)
    
    # Initialize an empty list to hold code descriptions
    code_structure = []
    
    for node in ast.walk(tree):
        # Extract function definitions
        if isinstance(node, ast.FunctionDef):
            func_info = f"Function: {node.name}\n"
            if ast.get_docstring(node):
                func_info += f"Docstring: {ast.get_docstring(node)}\n"
            code_structure.append(func_info)
        
        # Extract class definitions
        elif isinstance(node, ast.ClassDef):
            class_info = f"Class: {node.name}\n"
            if ast.get_docstring(node):
                class_info += f"Docstring: {ast.get_docstring(node)}\n"
            code_structure.append(class_info)
    
    # Join all code descriptions into a single string
    return "\n".join(code_structure)

# Define a function to determine if a file is a code file (e.g., .py, .js, .java)
def is_code_file(file_path):
    """
    Determines if a given file is a code file based on its extension.

    Args:
    - file_path (str): The path to the file.

    Returns:
    - bool: True if the file is a code file, False otherwise.
    """
    code_extensions = {".py", ".js", ".java", ".cpp", ".c", ".ts", ".go", ".rb"}
    return os.path.splitext(file_path)[1] in code_extensions

# Create embeddings for code structures and store them in a vector store
def create_vectorstore_from_code(repo_info, local_llm_model="llama3.2"):
    """
    Create embeddings for code structures and store them in a Chroma vector store.

    Args:
    - repo_info (dict): The repository information containing file paths.
    - local_llm_model (str): The local LLM model to use for creating embeddings.

    Returns:
    - retriever (object): A retriever for retrieving documents based on queries.
    """
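    from langchain_core.documents import Document

    code_descriptions = []
    for file_path in repo_info["files"]:
        # ast can only parse Python source, so other code files are skipped here
        if is_code_file(file_path) and file_path.endswith(".py"):
            try:
                # Extract the code structure from the file
                code_structure = extract_python_code_structure(file_path)
                if code_structure:
                    # split_documents() expects Document objects, not dicts
                    code_descriptions.append(Document(page_content=code_structure, metadata={"source": file_path}))
            except Exception as e:
                print(f"Error extracting code structure from {file_path}: {e}. Skipping.")

    if not code_descriptions:
        raise ValueError("No Python code structures were extracted from the repository.")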
    
    # Use RecursiveCharacterTextSplitter to split code descriptions into smaller chunks
    text_splitter = RecursiveCharacterTextSplitter()
    split_docs = text_splitter.split_documents(code_descriptions)
    
    # Create embeddings using Ollama's local model
    embeddings = OllamaEmbeddings(model=local_llm_model)
    
    # Create a Chroma vector store from the split document chunks
    vectorstore = Chroma.from_documents(documents=split_docs, embedding=embeddings)
    return vectorstore.as_retriever()

# Define a function to provide code recommendations using an LLM
def provide_code_recommendations(retriever, query):
    """
    Provide recommendations on code structure and content based on a query.

    Args:
    - retriever (object): A retriever for retrieving relevant code documents.
    - query (str): The query or question related to code recommendations.

    Returns:
    - str: The LLM-generated recommendation or response.
    """
    # Retrieve relevant documents using the query (retrievers expose invoke(), not query())
    retrieved_docs = retriever.invoke(query)
    
    # Combine the retrieved content for context
    context = "\n".join([doc.page_content for doc in retrieved_docs])
    
    # Generate a recommendation using a local chat model served by Ollama
    # (OllamaEmbeddings only produces vectors and cannot generate text)
    from langchain_ollama import ChatOllama
    llm = ChatOllama(model="llama3.2")
    response = llm.invoke([{"role": "user", "content": f"Question: {query}\n\nContext: {context}"}])
    return response.content

# Example usage: Initialize retriever for a repository's code structure and provide recommendations
repo_info = extract_git_tree("../../", include_dotfiles=False)  # Extract repo structure
print(f"Branches: {repo_info['branches']}")
print(f"Files: {repo_info['files'][:5]}")  # Print first 5 files for brevity

# Create a vectorstore retriever for the code structure information
try:
    retriever = create_vectorstore_from_code(repo_info)
    print("Code vector store created successfully.")
    
    # Provide a sample code recommendation
    query = "How can I improve the function structures in the repository?"
    recommendation = provide_code_recommendations(retriever, query)
    print(f"Recommendation:\n{recommendation}")
except ValueError as ve:
    print(f"Error creating vector store: {ve}")


Branches: ['main']
Files: ['/workspaces/custom_ollama_docker/.devcontainer/Dockerfile', '/workspaces/custom_ollama_docker/.devcontainer/devcontainer.env', '/workspaces/custom_ollama_docker/.devcontainer/devcontainer.json', '/workspaces/custom_ollama_docker/.devcontainer/environment.yml', '/workspaces/custom_ollama_docker/.devcontainer/install_dependencies.sh']
Error extracting code structure from /workspaces/custom_ollama_docker/tests/test3_files/libs/bootstrap/bootstrap.min.js: closing parenthesis ']' does not match opening parenthesis '(' (<unknown>, line 6). Skipping.
Error extracting code structure from /workspaces/custom_ollama_docker/tests/test3_files/libs/clipboard/clipboard.min.js: invalid character '©' (U+00A9) (<unknown>, line 5). Skipping.
Error extracting code structure from /workspaces/custom_ollama_docker/tests/test3_files/libs/quarto-html/anchor.min.js: leading zeros in decimal integer literals are not permitted; use an 0o prefix for octal integers (<unknown>, line 3). S

AttributeError: 'dict' object has no attribute 'page_content'
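
The parse errors above all come from `.min.js` files being handed to Python's `ast` module, which only understands Python source. A small sketch of filtering by extension first; `python_files_only`, `list_definitions`, and the sample paths are hypothetical and only illustrate the idea.

```python
import ast
import os

def python_files_only(paths):
    """Keep only paths ending in .py, since ast.parse handles Python alone."""
    return [p for p in paths if os.path.splitext(p)[1] == ".py"]

def list_definitions(source):
    """Return the names of functions and classes found in Python source."""
    tree = ast.parse(source)
    return [
        node.name
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
    ]

paths = ["src/app.py", "libs/anchor.min.js", "README.md"]
print(python_files_only(paths))

python_source = "class Crawler:\n    def run(self):\n        pass\n"
print(list_definitions(python_source))

# JavaScript is not valid Python, so ast.parse raises SyntaxError
javascript_source = "function add(a, b) { return a + b; }"
try:
    list_definitions(javascript_source)
    js_parsed = True
except SyntaxError:
    js_parsed = False
print(js_parsed)
```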

In [15]:
%%writefile ../../src/sports_news_rag/modules/data_crawling.py
from langchain_ollama import ChatOllama, OllamaEmbeddings
from langchain_community.document_loaders import WebBaseLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain.schema import Document
import concurrent.futures  # For parallel processing

def crawl_and_ingest(url, debug=False):
    """
    Crawls a given URL, splits the document, generates propositions, runs quality checks, and returns the processed documents.
    """
    if debug:
        print(f"Crawling data from: {url}")

    # Load documents from the web URL
    loader = WebBaseLoader(url)
    docs = loader.load()

    # Split the documents into smaller chunks
    text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)  # Adjust these values
    document_chunks = text_splitter.split_documents(docs)
    if debug:
        print(f"Number of document chunks crawled and ingested: {len(document_chunks)}")

    # Generate propositions from each document chunk and perform quality checks
    proposition_documents = []
    with concurrent.futures.ThreadPoolExecutor(max_workers=10) as executor:
        futures = [executor.submit(process_chunk, chunk, debug) for chunk in document_chunks]
        for future in concurrent.futures.as_completed(futures):
            proposition_documents.extend(future.result())

    if debug:
        print(f"Total number of high-quality propositions generated: {len(proposition_documents)}")

    return proposition_documents

def process_chunk(chunk, debug=False):
    """
    Generates and quality checks propositions for a given chunk.
    """
    propositions = generate_propositions(chunk.page_content, debug)
    high_quality_propositions = quality_check_propositions(propositions, debug)
    return [Document(page_content=prop) for prop in high_quality_propositions]

def generate_propositions(text, debug=False):
    """
    Generates propositions from the given text using an LLM.
    """
    llm = ChatOllama(model="llama3.2", temperature=0)
    max_length = 2000
    text = text[:max_length]  # Slicing leaves shorter texts unchanged

    proposition_prompt = (
        f"Break down the following text into concise, complete, and meaningful factual statements:\n\n{text}\n\n"
        "Provide each proposition as a separate statement."
    )
    response = llm.invoke([{"role": "user", "content": proposition_prompt}]).content

    propositions = [prop.strip() for prop in response.split('\n') if prop.strip()]

    if debug:
        print(f"Generated propositions: {propositions}")

    return propositions

def quality_check_propositions(propositions, debug=False):
    """
    Checks the quality of the propositions for accuracy, clarity, completeness, and conciseness.
    """
    llm = ChatOllama(model="llama3.2", temperature=0)
    high_quality_propositions = []

    batch_size = 5
    for i in range(0, len(propositions), batch_size):
        batch = propositions[i:i + batch_size]
        # Number the propositions so each verdict can be matched to its statement
        numbered = "\n".join(f"{n}. {prop}" for n, prop in enumerate(batch, start=1))
        quality_prompt = (
            "Evaluate the following propositions for accuracy, clarity, completeness, and conciseness. "
            "Reply with exactly one line per proposition in the form '<number>: pass' or '<number>: fail':\n\n"
            f"{numbered}"
        )
        response = llm.invoke([{"role": "user", "content": quality_prompt}]).content

        results = [line.strip() for line in response.lower().split('\n') if line.strip()]

        if debug:
            print(f"Batch being processed: {batch}")
            print(f"LLM Response: {response}")
            print(f"Number of results received: {len(results)}, Number of propositions in batch: {len(batch)}")

        # Keep a proposition only when the line carrying its number says 'pass'
        for line in results:
            number, _, verdict = line.partition(':')
            number = number.strip().rstrip('.')
            if number.isdigit() and 1 <= int(number) <= len(batch) and 'pass' in verdict:
                high_quality_propositions.append(batch[int(number) - 1])

    return high_quality_propositions



def main(debug=False):
    # Sample sites for testing
    sports_sites = ["https://www.nba.com/", "https://www.espn.com/"]
    all_documents = []
    for site in sports_sites:
        documents = crawl_and_ingest(site, debug)
        all_documents.extend(documents)
    if debug:
        print(f"Total documents ingested: {len(all_documents)}")

if __name__ == "__main__":
    main(debug=True)


Overwriting ../../src/sports_news_rag/modules/data_crawling.py
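
To show what `chunk_size=500, chunk_overlap=50` means in the module above, here is a simplified fixed-window sketch. It is not how `RecursiveCharacterTextSplitter` works internally (that class splits on separators first); `sliding_chunks` is a hypothetical helper that only illustrates the overlap idea.

```python
def sliding_chunks(text, chunk_size=500, chunk_overlap=50):
    """Cut text into fixed-size windows where each window repeats the
    last chunk_overlap characters of the previous one."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks

# 1200 characters of dummy text
text = "".join(str(i % 10) for i in range(1200))
chunks = sliding_chunks(text)
print([len(c) for c in chunks])
```

The overlap means a sentence cut at a chunk boundary still appears whole in at least one neighbouring chunk, at the cost of embedding some text twice.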


In [16]:
%%writefile ../../src/sports_news_rag/modules/vector_store.py
from langchain_community.vectorstores import Chroma
from langchain_ollama import OllamaEmbeddings, ChatOllama
from langchain.schema import Document  # Import the Document class
import os 

def create_vectorstore(documents, persist_directory='../../data/chroma_dbs', debug=False):
    """
    Creates a vector store from the provided documents, embedding them for later retrieval.
    """
    # Ensure each document is a Document object
    if not all(isinstance(doc, Document) for doc in documents):
        documents = [Document(page_content=doc["page_content"]) for doc in documents]

    embeddings = OllamaEmbeddings(model="llama3.2")
    if debug:
        print(f"Creating vector store with {len(documents)} high-quality propositions...")

    # Create Chroma vector store from documents
    vectorstore = Chroma.from_documents(
        documents=documents,
        embedding=embeddings,
        persist_directory=persist_directory
    )
    if debug:
        print(f"Vector store created at {persist_directory}")

    return vectorstore

def create_pre_ingested_vectorstore(site_name, documents):
    # Create directory if it doesn't exist
    directory = f"../../data/vectorstores/{site_name.lower()}"
    os.makedirs(directory, exist_ok=True)
    
    # Create the vector store
    embeddings = OllamaEmbeddings(model="llama3.2")
    vectorstore = Chroma.from_documents(documents, embedding=embeddings, persist_directory=directory)
    print(f"Vector store for {site_name} created and saved at {directory}")
    return vectorstore

def main(debug=False):
    # Use a list of high-quality Document objects instead of dictionaries
    sample_docs = [Document(page_content="This is a high-quality sample document for testing.")]
    vectorstore = create_vectorstore(sample_docs, debug=debug)
    if debug:
        print("Vector store successfully created.")
        
    # Example usage:
    site_name = "ESPN"
    documents = [Document(page_content="This is a sample document for NFL data.")]
    create_pre_ingested_vectorstore(site_name, documents)

if __name__ == "__main__":
    main(debug=True)


Overwriting ../../src/sports_news_rag/modules/vector_store.py
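
To make the "embedding them for later retrieval" step less opaque, here is a toy in-memory version with hand-written vectors. Real embeddings have far more dimensions, and Chroma's default distance metric and internals may differ from the cosine similarity used here. `cosine_similarity`, `most_similar`, and the sample vectors are made up for illustration.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def most_similar(query_vector, store, k=1):
    """Return the k texts whose vectors are closest to the query vector."""
    ranked = sorted(
        store,
        key=lambda item: cosine_similarity(query_vector, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]

# Made-up 3-dimensional "embeddings" (basketball, football, cooking)
store = [
    ("NBA Finals recap", [0.9, 0.1, 0.0]),
    ("NFL draft rumors", [0.1, 0.9, 0.0]),
    ("Pasta recipe", [0.0, 0.0, 1.0]),
]
query_vector = [0.8, 0.2, 0.0]  # pretend embedding of a basketball question
print(most_similar(query_vector, store))
```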


In [17]:
%%writefile ../../src/sports_news_rag/modules/contextual_retrieval.py

import copy
from langchain_community.vectorstores import Chroma
from langchain_ollama import OllamaEmbeddings, ChatOllama
from langchain.schema import Document  # Import the Document class

def create_contextual_nodes(documents, debug=False):
    """
    Creates contextual nodes by enriching each document with additional context.
    
    Parameters:
    - documents (List[Document]): List of LangChain Document objects.
    - debug (bool): Flag for printing debug information.
    
    Returns:
    - List[Document]: List of contextually enriched Document objects.
    """
    # Initialize the LLM
    llm = ChatOllama(model="llama3.2", temperature=0)
    
    contextual_documents = []
    for doc in documents:
        # Generate contextual information using LLM
        context_prompt = (
            f"Given the following document, generate contextual information that would help better understand its content:\n\n{doc.page_content}\n\n"
            "Contextual information:"
        )
        context = llm.invoke([{"role": "user", "content": context_prompt}]).content
        
        # Append the context to the document's metadata
        enriched_doc = copy.deepcopy(doc)
        enriched_doc.metadata["context"] = context
        contextual_documents.append(enriched_doc)
        
        if debug:
            print(f"Generated context for document: {context}")

    return contextual_documents

def create_embedding_retriever(documents, persist_directory='../../data/chroma_dbs', debug=False):
    """
    Creates a Chroma vector store retriever using contextual nodes.
    
    Parameters:
    - documents (List[Document]): List of contextually enriched Document objects.
    - persist_directory (str): Directory to persist the Chroma database.
    - debug (bool): Flag for printing debug information.
    
    Returns:
    - Chroma: Chroma vector store; call .as_retriever() on it to obtain a retriever.
    """
    # Create embeddings with Ollama
    embeddings = OllamaEmbeddings(model="llama3.2")
    
    # Create the Chroma vector store
    if debug:
        print(f"Creating vector store with {len(documents)} contextually enriched documents...")
        
    vectorstore = Chroma.from_documents(
        documents=documents,
        embedding=embeddings,
        persist_directory=persist_directory
    )
    
    if debug:
        print(f"Vector store created at {persist_directory}")
    
    return vectorstore

def main(debug=True):
    """
    Main function to test the contextual retrieval pipeline.
    """
    # Sample documents for testing
    sample_docs = [Document(page_content="The Boston Celtics won the NBA Finals in 2024.")]
    
    # Create contextual nodes
    contextual_docs = create_contextual_nodes(sample_docs, debug=debug)
    
    # Create and test the vector store
    vectorstore = create_embedding_retriever(contextual_docs, debug=debug)
    
    # Output a message indicating successful creation of contextual retriever
    if debug:
        print(f"Successfully created contextual retriever with {len(contextual_docs)} contextually enriched documents.")

if __name__ == "__main__":
    main(debug=True)


Overwriting ../../src/sports_news_rag/modules/contextual_retrieval.py
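
In the module above the generated context is stored in `metadata`. As far as I can tell, the vector store embeds only `page_content`, so context kept in metadata would not influence similarity search. If it should, one option is to prepend it to the text before embedding. The sketch below uses a stand-in document class; `SimpleDoc` and `prepend_context` are hypothetical names, not part of LangChain.

```python
from dataclasses import dataclass, field

@dataclass
class SimpleDoc:
    """Stand-in for a LangChain Document."""
    page_content: str
    metadata: dict = field(default_factory=dict)

def prepend_context(doc, context):
    """Return a new document whose text starts with the context, keeping
    the context and the original text in metadata for later display."""
    metadata = dict(doc.metadata, context=context, original_text=doc.page_content)
    return SimpleDoc(page_content=f"{context}\n\n{doc.page_content}", metadata=metadata)

doc = SimpleDoc("The Celtics won the title.", {"source": "example"})
enriched = prepend_context(doc, "This sentence is about the 2024 NBA Finals.")
print(enriched.page_content)
```

Keeping `original_text` in metadata lets the answer step show the untouched chunk while the search still benefits from the added context.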


In [18]:
%%writefile ../../src/sports_news_rag/modules/hyde_rag.py

from langchain_ollama import OllamaEmbeddings, ChatOllama
# main() below calls both helpers, so this import must be active
from modules.contextual_retrieval import create_contextual_nodes, create_embedding_retriever
from langchain.schema import Document

def contextual_retrieval(question, retriever, debug=False):
    """
    Performs contextual retrieval based on a given question and contextually enriched documents.
    
    Parameters:
    - question (str): The query or question to retrieve documents for.
    - retriever: The retriever object created from the contextual vector store.
    - debug (bool): Flag for printing debug information.
    
    Returns:
    - List[Document]: List of retrieved documents based on the contextual retriever.
    """
    # Generate a hypothetical answer to enrich the retrieval process
    llm = ChatOllama(model="llama3.2", temperature=0)
    hypo_prompt = f"Generate a detailed answer to the following question:\n\n{question}\n\nAnswer:"
    hypo_answer = llm.invoke([{"role": "user", "content": hypo_prompt}]).content

    if debug:
        print(f"Hypothetical answer generated: {hypo_answer}")

    # Retrieve documents using the contextual retriever
    retrieved_docs = retriever.invoke(hypo_answer)
    
    if debug:
        print(f"Number of documents retrieved based on hypothetical answer: {len(retrieved_docs)}")
        
    return retrieved_docs

def main(debug=False):
    """
    Main function to test the contextual retrieval.
    """
    question = "What are the recent updates in the NBA?"
    
    # Create a sample document
    sample_docs = [Document(page_content="The Boston Celtics won the NBA Finals in 2024.")]
    
    # Create contextual nodes and retriever
    contextual_docs = create_contextual_nodes(sample_docs, debug=debug)
    vectorstore = create_embedding_retriever(contextual_docs, debug=debug)
    retriever = vectorstore.as_retriever()
    
    # Test the contextual retrieval
    contextual_retrieval(question, retriever, debug)

if __name__ == "__main__":
    main(debug=True)


Overwriting ../../src/sports_news_rag/modules/hyde_rag.py
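
The HyDE idea in the module above (search with a hypothetical answer instead of the raw question) can be seen in a toy setting. The sketch below swaps the embedding search for word overlap and the LLM for a canned function, so `overlap_retrieve`, `hyde_retrieve`, `fake_generate`, and the corpus are all hypothetical.

```python
import re

def tokenize(text):
    """Lowercase a string and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def overlap_retrieve(query_text, corpus):
    """Toy retriever: pick the document sharing the most words with query_text."""
    words = tokenize(query_text)
    return max(corpus, key=lambda doc: len(words & tokenize(doc)))

def hyde_retrieve(question, corpus, generate):
    """HyDE flow: generate a hypothetical answer, then search with it."""
    hypothetical = generate(question)
    return overlap_retrieve(hypothetical, corpus)

def fake_generate(question):
    """Stand-in for an LLM call; returns a canned hypothetical answer."""
    return "The Boston Celtics beat the Dallas Mavericks to win the NBA Finals."

corpus = [
    "Who is on the injury report this week?",
    "Boston Celtics win the NBA Finals over the Dallas Mavericks.",
]
question = "Who is the reigning champion?"
print(overlap_retrieve(question, corpus))              # search with the raw question
print(hyde_retrieve(question, corpus, fake_generate))  # search with the hypothetical answer
```

In this toy case the raw question matches the wrong document on surface words, while the hypothetical answer resembles the document that actually holds the information. Whether the same gain appears with real embeddings depends on the model and the data.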


In [19]:
%%writefile ../../src/sports_news_rag/modules/corrective_rag.py
from langchain_ollama import OllamaEmbeddings, ChatOllama
from langchain.schema import Document  # Import the Document class

def corrective_rag(question, retrieved_docs, debug=False):
    # Convert the list of dicts to Document objects if necessary
    if not all(isinstance(doc, Document) for doc in retrieved_docs):
        retrieved_docs = [Document(page_content=doc["page_content"]) for doc in retrieved_docs]

    llm = ChatOllama(model="llama3.2", temperature=0)
    context = "\n\n".join(doc.page_content for doc in retrieved_docs)

    initial_prompt = f"Context: {context}\n\nQuestion: {question}\n\nAnswer:"
    initial_answer = llm.invoke([{"role": "user", "content": initial_prompt}]).content

    if debug:
        print(f"Initial answer generated: {initial_answer}")

    max_iterations = 2
    for i in range(max_iterations):
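        # Ask for an explicit stop phrase so the check below can detect a verified answer
        verify_prompt = (
            f"Context: {context}\n\nAnswer: {initial_answer}\n\n"
            "Is the answer fully supported by the context? Identify any inaccuracies. "
            "If there are none, reply with exactly 'no inaccuracies'."
        )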
        verification = llm.invoke([{"role": "user", "content": verify_prompt}]).content

        if "no inaccuracies" in verification.lower():
            if debug:
                print(f"No inaccuracies found. Answer is verified on iteration {i + 1}.")
            break
        else:
            refine_prompt = f"Context: {context}\n\nThe initial answer may have inaccuracies: {verification}\n\nQuestion: {question}\n\nProvide a corrected answer:"
            initial_answer = llm.invoke([{"role": "user", "content": refine_prompt}]).content

    return initial_answer

def main(debug=False):
    # Sample usage for testing
    question = "Who won the NBA Finals in 2023?"
    # Use a list of Document objects instead of dictionaries for the retrieved documents
    retrieved_docs = [Document(page_content="The Boston Celtics won the NBA Finals in 2024.")]
    answer = corrective_rag(question, retrieved_docs, debug=debug)
    if debug:
        print(f"Final corrected answer: {answer}")

if __name__ == "__main__":
    main(debug=True)


Overwriting ../../src/sports_news_rag/modules/corrective_rag.py
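
The control flow of the verify-and-refine loop above can be tested without a model by passing the three LLM steps in as functions. This is only a sketch under that assumption; `corrective_loop` and the `fake_*` functions are hypothetical stand-ins for the real calls.

```python
def corrective_loop(question, context, generate, verify, refine, max_iterations=2):
    """Generate an answer, then alternate verification and refinement until
    the verifier reports 'no inaccuracies' or the iteration budget runs out."""
    answer = generate(question, context)
    for _ in range(max_iterations):
        verdict = verify(answer, context)
        if "no inaccuracies" in verdict.lower():
            break
        answer = refine(question, context, verdict)
    return answer

# Canned stand-ins for the three LLM calls
def fake_generate(question, context):
    return "The Celtics won in 2023."

def fake_verify(answer, context):
    if answer in context:
        return "No inaccuracies."
    return "The year is not supported by the context."

def fake_refine(question, context, verdict):
    return "The Boston Celtics won the NBA Finals in 2024."

context = "The Boston Celtics won the NBA Finals in 2024."
final = corrective_loop(
    "Who won the NBA Finals in 2024?", context, fake_generate, fake_verify, fake_refine
)
print(final)
```

Separating the loop from the model calls also makes the stop condition easy to check: the loop ends only when the verifier's reply contains the agreed phrase.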


In [20]:
%%writefile ../../src/sports_news_rag/modules/self_rag.py
from langchain_ollama import OllamaEmbeddings, ChatOllama

def self_rag(question, initial_answer, debug=False):
    """Refine an initial answer by performing self-reflection and improvements."""
    llm = ChatOllama(model="llama3.2", temperature=0)
    if debug:
        print(f"Initial answer before self-refinement: {initial_answer}")
    
    max_reflections = 2  # Number of self-reflection iterations
    for i in range(max_reflections):
        # Self-reflection step
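        # Ask for an explicit stop phrase so the check below can detect a finished answer
        reflect_prompt = (
            f"Answer: {initial_answer}\n\n"
            "Reflect on the answer and identify any areas for improvement. "
            "If none are needed, reply with exactly 'no improvements'."
        )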
        reflection = llm.invoke([{"role": "user", "content": reflect_prompt}]).content

        if debug:
            print(f"Reflection result for iteration {i+1}: {reflection}")

        # If no improvements are needed, break out of the loop
        if "no improvements" in reflection.lower():
            if debug:
                print(f"No further improvements suggested after {i+1} iterations.")
            break
        else:
            # Improve the answer based on the reflection
            improve_prompt = f"Based on the reflection: {reflection}\n\nProvide an improved answer to the question: {question}"
            initial_answer = llm.invoke([{"role": "user", "content": improve_prompt}]).content

            if debug:
                print(f"Improved answer after iteration {i+1}: {initial_answer}")

    return initial_answer

def main(debug=False):
    # Sample usage for testing
    question = "Who won the NBA Finals in 2023?"
    initial_answer = "The winner of the 2023 NBA Finals is unknown to me as my knowledge cutoff is December 2023."
    refined_answer = self_rag(question, initial_answer, debug=debug)
    if debug:
        print(f"Final refined answer: {refined_answer}")

if __name__ == "__main__":
    main(debug=True)


Overwriting ../../src/sports_news_rag/modules/self_rag.py


In [21]:
%%writefile ../../src/sports_news_rag/modules/web_search.py
from langchain_community.retrievers import TavilySearchAPIRetriever

tavily_retriever = TavilySearchAPIRetriever(k=3)

def tavily_search(question, debug=False):
    docs = tavily_retriever.invoke(question)
    context = "\n\n".join(f"Source {i+1} ({doc.metadata.get('source')}):\n{doc.page_content}" for i, doc in enumerate(docs))
    if debug:
        print(f"Web search context retrieved: {context[:500]}...")  # Display first 500 chars
    return context

def main(debug=False):
    question = "Who was the first pick in the 2024 NBA Draft?"
    context = tavily_search(question, debug)
    if debug:
        print(f"Retrieved context from Tavily search: {context}")

if __name__ == "__main__":
    main(debug=True)


Overwriting ../../src/sports_news_rag/modules/web_search.py
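
To see what the context string built in `tavily_search` looks like without calling the Tavily API, here is a small sketch. `Doc` is a hypothetical stand-in for a LangChain `Document`, and `format_context` is an illustrative helper that is not in the module; the `'unknown'` fallback for a missing source is also an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical stand-in for a LangChain Document."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def format_context(docs):
    """Join documents into one numbered, source-labelled context string."""
    return "\n\n".join(
        f"Source {i} ({doc.metadata.get('source', 'unknown')}):\n{doc.page_content}"
        for i, doc in enumerate(docs, start=1)
    )


docs = [
    Doc("First snippet.", {"source": "https://example.com/a"}),
    Doc("Second snippet."),  # No source metadata on purpose
]
context = format_context(docs)
print(context)
```

Numbering the sources makes it easier for the model, and for anyone reading the debug output, to tell which snippet a claim came from.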


In [22]:
%%writefile ../../src/sports_news_rag/modules/decision_mechanism.py
from modules.hyde_rag import contextual_retrieval
from modules.corrective_rag import corrective_rag
from modules.web_search import tavily_search
from modules.self_rag import self_rag  # Include the self_rag module for refinement
import re

from langchain_ollama import ChatOllama


def evaluate_confidence(answer, debug=False):
    """Evaluate the confidence of an answer using a language model."""
    llm = ChatOllama(model="llama3.2", temperature=0)
    eval_prompt = (
        f"Evaluate the confidence level (on a scale of 1-10) of the following answer being correct, "
        f"fully supported by reliable sources, and free from contradictions or inaccuracies:\n\n{answer}\n\n"
        "Reply with a single integer from 1 to 10.\n\n"
        "Confidence Score:"
    )
    confidence_score = llm.invoke([{"role": "user", "content": eval_prompt}]).content
    # The model often wraps the number in text (e.g. "8/10"), so take the first standalone 1-10 value
    match = re.search(r"\b(10|[1-9])\b", confidence_score)
    # Default to medium confidence if no score can be parsed from the reply
    score = int(match.group(1)) if match else 5
    if debug:
        print(f"Confidence score evaluated: {score}")
    return score

def decide_and_answer(question, retriever, progress_bar=None, progress_status=None, debug=False):
    """Generate answers using RAG and Tavily, and decide the best answer with self-refinement."""
    progress_step = 0.25

    # Step 1: Use contextual retrieval to get documents and generate an initial RAG-based answer
    if progress_status:
        progress_status.text("Step 1/4: Running HyDE retrieval...")
    retrieved_docs = contextual_retrieval(question, retriever, debug)
    if progress_bar:
        progress_bar.progress(progress_step)

    # Step 2: Generate a corrective RAG-based answer
    if progress_status:
        progress_status.text("Step 2/4: Generating a corrective RAG answer...")
    rag_answer = corrective_rag(question, retrieved_docs, debug)
    rag_refined_answer = self_rag(question, rag_answer, debug)  # Refine RAG answer with self-rag
    rag_confidence = evaluate_confidence(rag_refined_answer, debug)
    progress_step += 0.25
    if progress_bar:
        progress_bar.progress(progress_step)

    # Step 3: Use Tavily search to generate an answer
    if progress_status:
        progress_status.text("Step 3/4: Running Tavily search for additional context...")
    tavily_context = tavily_search(question, debug)
    tavily_prompt = f"Context: {tavily_context}\n\nQuestion: {question}\n\nAnswer:"
    llm = ChatOllama(model="llama3.2", temperature=0)
    tavily_initial_answer = llm.invoke([{"role": "user", "content": tavily_prompt}]).content
    tavily_refined_answer = self_rag(question, tavily_initial_answer, debug)  # Refine Tavily answer with self-rag
    tavily_confidence = evaluate_confidence(tavily_refined_answer, debug)
    progress_step += 0.25
    if progress_bar:
        progress_bar.progress(progress_step)

    # Step 4: Decision mechanism to choose the final answer based on confidence scores
    if progress_status:
        progress_status.text("Step 4/4: Making the final decision...")
    if rag_confidence > tavily_confidence:
        final_answer = rag_refined_answer
        source = "RAG-based response"
    elif tavily_confidence > rag_confidence:
        final_answer = tavily_refined_answer
        source = "Tavily-based response"
    else:
        # Combine answers if the confidence scores are equal
        combined_prompt = (
            f"Here are two potential answers to the question:\n\n"
            f"Answer 1 (RAG-based):\n{rag_refined_answer}\n\n"
            f"Answer 2 (Tavily-based):\n{tavily_refined_answer}\n\n"
            f"Based on these, provide the best possible answer to the question: {question}"
        )
        final_answer = llm.invoke([{"role": "user", "content": combined_prompt}]).content
        source = "Combined response"

    if debug:
        print(f"Selected final answer from: {source}")
    return final_answer



import streamlit as st
from langchain.schema import Document
from modules.vector_store import create_vectorstore

def main(debug=False):
    """Main function to test the decision mechanism."""
    question = "What pick in the draft was Bronny James?"
    sample_docs = [Document(page_content="This is a sample document for testing.")]
    vectorstore = create_vectorstore(sample_docs, debug=debug)
    retriever = vectorstore.as_retriever()

    # Create Streamlit progress bar and status
    progress_bar = st.progress(0)  # Creates a Streamlit progress bar
    progress_status = st.empty()  # Placeholder for status messages

    # Pass these objects when calling decide_and_answer
    final_answer = decide_and_answer(question, retriever, progress_bar, progress_status, debug)
    st.write(f"Final answer selected: {final_answer}")  # Display the final answer

if __name__ == "__main__":
    main(debug=True)



Overwriting ../../src/sports_news_rag/modules/decision_mechanism.py
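
The decision step depends on turning a free-text model reply into a number and then comparing two scores. The sketch below isolates those two pieces so they can be checked without an LLM. `parse_confidence` and `choose` are hypothetical helper names, not functions from the module, and the regex is one possible way to pull a 1-10 score out of a reply.

```python
import re


def parse_confidence(raw: str, default: int = 5) -> int:
    """Return the first standalone integer from 1 to 10 found in raw, else default."""
    match = re.search(r"\b(10|[1-9])\b", raw)
    return int(match.group(1)) if match else default


def choose(rag_score: int, web_score: int) -> str:
    """Name the branch the decision step would take for a pair of scores."""
    if rag_score > web_score:
        return "RAG-based response"
    if web_score > rag_score:
        return "Tavily-based response"
    return "Combined response"  # Equal scores: both answers get merged


print(parse_confidence("Confidence Score: 7"))
print(parse_confidence("8/10"))
print(parse_confidence("I cannot tell."))
print(choose(7, 8))
```

Since an LLM rarely answers with a bare digit, a tolerant parser means the fallback value of 5 is used only when the reply contains no score at all.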


In [23]:
%%writefile ../../src/sports_news_rag/modules/fact_checker.py
from langchain_ollama import OllamaEmbeddings, ChatOllama
from modules.hyde_rag import contextual_retrieval  # Import contextual_retrieval from the hyde_rag module
from modules.web_search import tavily_search  # Import tavily_search from the web_search module
from langchain.schema import Document

def final_fact_check(question, answer, retriever, debug=False):
    """
    Perform a final fact-check of the answer based on a combined context from retrieved documents and web search results.

    Parameters:
    question (str): The question asked by the user.
    answer (str): The initial answer generated by the RAG or web search.
    retriever: The retriever object created from the vector store.
    debug (bool): If True, print debug information.

    Returns:
    str: The fact-checked and potentially corrected answer.
    """
    # Initialize the LLM for fact-checking
    llm = ChatOllama(model="llama3.2", temperature=0)

    # Retrieve documents using HyDE
    retrieved_docs = contextual_retrieval(question, retriever, debug=debug)
    context = "\n\n".join(doc.page_content for doc in retrieved_docs) if retrieved_docs else ""

    # Retrieve web context using Tavily search
    tavily_context = tavily_search(question, debug=debug)

    # Combine both contexts
    combined_context = context + "\n\n" + tavily_context

    # Debug output for context combination
    if debug:
        print(f"Combined context for fact-checking:\n{combined_context}")

    # Create the fact-checking prompt
    fact_check_prompt = (
        f"Context: {combined_context}\n\nQuestion: {question}\n\nAnswer: {answer}\n\n"
        "Verify the accuracy of the answer to the question based on the context. "
        "Provide a corrected answer if necessary."
    )

    # Generate the fact-checked answer using the LLM
    final_answer = llm.invoke([{"role": "user", "content": fact_check_prompt}]).content

    # Debug output for final answer
    if debug:
        print(f"Fact-checked answer: {final_answer}")

    return final_answer

def main(debug=False):
    """
    Test the final_fact_check function with sample input.
    """
    # Sample question and answer for testing
    question = "Who won the NBA Finals in 2023?"
    initial_answer = "The Los Angeles Lakers won the NBA Finals in 2023."  # Sample incorrect answer

    # Create a sample retriever for testing
    from modules.vector_store import create_vectorstore  # Imported here because only this test needs it
    sample_docs = [Document(page_content="The Denver Nuggets won the NBA Finals in 2023.")]
    vectorstore = create_vectorstore(sample_docs, debug=debug)
    retriever = vectorstore.as_retriever()

    # Run the final_fact_check function
    corrected_answer = final_fact_check(question, initial_answer, retriever, debug=debug)
    if debug:
        print(f"Corrected answer after final fact-check: {corrected_answer}")

if __name__ == "__main__":
    main(debug=True)


Overwriting ../../src/sports_news_rag/modules/fact_checker.py
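
When the vector store returns nothing, the combined context in `final_fact_check` starts with blank lines. A small helper can skip empty parts before building the prompt. This is only a sketch: `build_fact_check_prompt` is a hypothetical name, and including the question in the prompt is an assumption about what helps the model, not something the module requires.

```python
def build_fact_check_prompt(question: str, answer: str, *contexts: str) -> str:
    """Combine the non-empty contexts and wrap them in a fact-check prompt."""
    combined = "\n\n".join(c.strip() for c in contexts if c and c.strip())
    return (
        f"Context: {combined}\n\nQuestion: {question}\n\nAnswer: {answer}\n\n"
        "Verify the accuracy of the answer based on the context. "
        "Provide a corrected answer if necessary."
    )


prompt = build_fact_check_prompt(
    "Who won the NBA Finals in 2023?",
    "The Los Angeles Lakers won the NBA Finals in 2023.",
    "",  # Empty vector-store context is dropped
    "The Denver Nuggets won the NBA Finals in 2023.",
)
print(prompt)
```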


In [24]:
%%writefile ../../src/sports_news_rag/main.py

from modules.data_crawling import crawl_and_ingest
from modules.vector_store import create_vectorstore
from modules.decision_mechanism import decide_and_answer
from modules.fact_checker import final_fact_check
from modules.hyde_rag import contextual_retrieval  # Use the new contextual retrieval function

def main(debug=False):
    """
    Main function to run the entire RAG bot pipeline, integrating all modules.
    """
    # Step 1: Crawl and ingest data from sample sports sites
    sports_sites = ["https://www.nba.com/", "https://www.espn.com/"]
    all_documents = []
    for site in sports_sites:
        documents = crawl_and_ingest(site, debug)
        all_documents.extend(documents)

    # Step 2: Create the vector store and retriever from the ingested documents
    vectorstore = create_vectorstore(all_documents, debug=debug)
    retriever = vectorstore.as_retriever()

    # Step 3: Use the decision mechanism to generate an answer to a sample question
    question = "What pick in the draft was Bronny James Jr.?"
    initial_answer = decide_and_answer(question, retriever, debug=debug)

    # Step 4: Perform a final fact-check on the selected answer
    final_answer = final_fact_check(question, initial_answer, retriever, debug)
    print(f"Final answer for the question '{question}': {final_answer}")

if __name__ == "__main__":
    main(debug=True)


Overwriting ../../src/sports_news_rag/main.py


In [26]:
%%writefile ../../src/sports_news_rag/app.py
import streamlit as st
from modules.decision_mechanism import decide_and_answer
from modules.vector_store import create_vectorstore
from modules.data_crawling import crawl_and_ingest
from modules.fact_checker import final_fact_check
import os
from dotenv import load_dotenv
from langchain_chroma import Chroma

# Load environment variables
dotenv_path = os.path.join(os.getcwd(), '../../.env')
load_dotenv(dotenv_path)

# Set up the Streamlit app title and description
st.set_page_config(page_title="Advanced Sports News RAG Bot", layout="wide")
st.title("Advanced Sports News RAG Bot")
st.write("Get the most up-to-date sports news using advanced RAG techniques. This bot can combine information from various sources and fact-check responses to ensure you get the most reliable information.")

# Adding the introduction tab
tabs = st.tabs(["Introduction", "Ask a Question"])

# Introduction tab content
with tabs[0]:
    st.header("Approaches Used in Advanced Versatile RAG Bot")
    st.write("""
    This project leverages a variety of Retrieval-Augmented Generation (RAG) strategies to create an interactive assistant capable of providing reliable, up-to-date information for any type of website, though it has been initially applied to sports news. Below, we detail the approaches utilized, how they contribute to the quality of answers, and the innovative combination of different RAG methodologies.
    """)

    st.subheader("Simple RAG")
    st.write("""
    This is the foundation of our bot. It involves retrieving relevant documents based on a user query and generating answers using a large language model (LLM). This approach ensures that the generated responses are grounded in relevant information, minimizing the hallucination issues typical of generative models.
    """)

    st.subheader("Branched RAG")
    st.write("""
    To improve the quality of our retrieval, we implemented a Branched RAG strategy that performs multiple retrieval steps. By refining the search based on intermediate results, we are able to narrow down to the most contextually appropriate documents. This iterative narrowing process results in more specific and higher quality answers for complex queries.
    """)

    st.subheader("Contextual Retrieval (Replacing HyDE)")
    st.write("""
    Contextual Retrieval enhances the retrieval process by adding contextual knowledge to each document before performing the search. Instead of generating a hypothetical ideal document as in HyDE, the contextual retrieval process first creates enriched contextual nodes for each document by using a large language model (LLM). These contextual nodes contain additional relevant information, context, or explanations that provide deeper insight into the document's content.

    The enriched contextual nodes are then stored in a vector database, making them more informative and aligned with user queries. This means that when a user asks a question, the retrieval process can better understand and match the query with documents that have been contextually enhanced, resulting in higher precision and recall.

    This approach is particularly beneficial for scenarios where direct matches may not yield good results due to lack of context or nuanced query intent. By performing contextual retrieval, we ensure that the documents retrieved are not only relevant based on keyword matching but also contextually aligned, providing better support for the generated answers.
    """)

    st.subheader("Corrective RAG (CRAG)")
    st.write("""
    Corrective RAG plays an important role in ensuring the quality of our generated responses. By iteratively refining the response and checking it against the retrieved documents, CRAG helps ensure the answer is not only relevant but also factually correct. This approach is critical for maintaining trustworthiness, particularly when accuracy is key.
    """)

    st.subheader("Self-RAG")
    st.write("""
    Once an answer is generated, our system uses Self-RAG to critique and improve its own response through self-reflection. This means the language model can re-evaluate the initial answer and refine it to be clearer, more concise, or more accurate. This layer of self-assessment adds robustness to the overall response quality.
    """)

    st.subheader("Agentic RAG")
    st.write("""
    For more complex multi-step queries, our implementation uses Agentic RAG, allowing the bot to behave more like an autonomous agent capable of executing several steps to reach an answer. This might involve retrieving documents, performing corrective checks, synthesizing information, and finally fact-checking, all in a cohesive process. The agentic approach gives the bot the ability to intelligently navigate between information sources and make informed decisions about how to assemble an answer.
    """)

    st.subheader("Tavily Search for Web Information")
    st.write("""
    To complement our RAG methods, we employ Tavily Search, which allows the bot to dynamically search the web to provide additional context or verify information when the retrieved documents do not contain sufficient detail. This external search capability helps the assistant provide more recent information than the ingested documents alone.
    """)

    st.header("How These Approaches Work Together")
    st.write("""
    In creating this bot, we found that no single RAG strategy could solve all the complexities inherent in answering dynamic questions from different types of websites. Therefore, combining these RAG approaches has allowed us to create a more reliable and versatile assistant. Here is how they come together:

    First, the bot initiates a Simple RAG to gather relevant documents from the specified websites. This can be adapted for any type of domain—whether it's sports, finance, education, or other topics.

    Second, it applies Branched RAG and Contextual Retrieval to refine the search and create more robust retrieval pathways, particularly if the initial results are sparse or irrelevant.

    Third, once a preliminary answer is generated, Corrective RAG is used to verify the facts, cross-referencing them with additional sources to ensure correctness.

    Fourth, the Self-RAG component refines the response further, evaluating its clarity and precision before delivering it to the user.

    Fifth, Tavily Search is used when necessary to provide additional up-to-date information, especially when initial documents do not fully address the query.

    Finally, in scenarios that require multiple interconnected steps, Agentic RAG allows the bot to navigate complex information spaces, ensuring that all retrieved information is used cohesively.
    """)

    st.header("The Value of Combined RAG Approaches")
    st.write("""
    By integrating these methods, we achieve a system capable of:

    - High accuracy: Through corrective checks and iterative refining.
    - Adaptability: Using contextual document enhancement to bridge knowledge gaps and improve retrieval accuracy.
    - Depth in retrieval: Thanks to branched and agentic RAG approaches, which enable deeper contextual understanding.
    - Versatility across domains: This system is designed to work with various types of websites, making it applicable to numerous domains beyond just sports.
    - Real-time information: With Tavily Search integration, the bot can dynamically pull the latest information from the web.
    - Context retention: Enabling the assistant to carry information from prior questions when beneficial, creating more interactive and insightful conversations.

    These combined efforts make the Versatile RAG Bot not only capable of answering questions reliably but also explaining its sources, refining its own outputs intelligently, and adapting across various domains with ease.
    """)


# Ask a Question tab content
with tabs[1]:
    # Sidebar configuration options
    st.sidebar.title("Configuration")
    enable_debug = st.sidebar.checkbox("Enable Debugging", value=False)
    include_fact_check = st.sidebar.checkbox("Include Final Fact-Check", value=True)

    # Allow the user to choose between pre-ingested data or dynamic ingestion
    use_pre_ingested_data = st.sidebar.checkbox("Use Pre-Ingested Data", value=True)

    # Ensure the session state attributes are initialized
    if "vectorstore" not in st.session_state:
        st.session_state.vectorstore = None
    if "retriever" not in st.session_state:
        st.session_state.retriever = None
    if "all_documents" not in st.session_state:
        st.session_state.all_documents = []


    # Pre-Ingested Data Loading Section
    if use_pre_ingested_data:
        st.sidebar.subheader("Pre-Ingested Data Loading")

        # Predefined sites with pre-ingested data available
        known_sites = ["NBA", "ESPN", "NFL"]
        selected_site = st.sidebar.selectbox("Select Pre-Ingested Site:", known_sites)

        # Button to load pre-ingested data
        if st.sidebar.button("Load Pre-Ingested Data"):
            with st.spinner(f"Loading pre-ingested data for {selected_site}..."):
                # Define the path to pre-ingested vector store based on site selection
                pre_ingested_vectorstore_path = f"/data/vectorstores/{selected_site.lower()}"

                # Debug: Print the path being accessed
                st.sidebar.write(f"Looking for pre-ingested data at: {pre_ingested_vectorstore_path}")

                # Check if the pre-ingested vector store directory exists
                if os.path.exists(pre_ingested_vectorstore_path):
                    try:
                        # Instantiate the Chroma vector store from the persist_directory.
                        # collection_name is omitted so the library default is used;
                        # pass it explicitly if the store was created under another name.
                        st.session_state.vectorstore = Chroma(
                            persist_directory=pre_ingested_vectorstore_path
                        )

                        # Ensure that the vectorstore is not None before accessing its properties
                        if st.session_state.vectorstore is None:
                            raise ValueError("Chroma vector store initialization returned None.")

                        # Debug: Print the number of documents in the loaded collection
                        if hasattr(st.session_state.vectorstore, '_collection'):
                            total_documents = st.session_state.vectorstore._collection.count()
                            st.sidebar.write(f"Total documents loaded: {total_documents}")

                        # Set up retriever
                        st.session_state.retriever = st.session_state.vectorstore.as_retriever()
                        st.sidebar.success(f"Loaded pre-ingested data for {selected_site}.")
                    except Exception as e:
                        # Handle loading errors and provide feedback
                        st.sidebar.error(f"Error loading pre-ingested data for {selected_site}: {str(e)}")
                else:
                    # Provide feedback if the directory is not found
                    st.sidebar.error(f"Pre-ingested data for {selected_site} not found at path: {pre_ingested_vectorstore_path}")

    # Step 2: Handle Dynamic Data Ingestion
    else:
        # Allow the user to select new websites to dynamically ingest data
        sports_sites = st.sidebar.multiselect(
            "Select sports websites to crawl for data",
            ["https://www.nba.com/", "https://www.espn.com/", "https://www.nfl.com/"],
            default=["https://www.nba.com/", "https://www.espn.com/"]
        )

        # Step 2.1: Crawl and ingest data from selected sites
        st.sidebar.subheader("Data Ingestion for New Websites")
        if st.sidebar.button("Ingest Data"):
            with st.spinner("Crawling and ingesting data..."):
                # Reset existing data if new ingestion is requested
                st.session_state.all_documents = None
                st.session_state.vectorstore = None
                st.session_state.retriever = None

                # Crawl and ingest data from selected sites
                all_documents = []
                for site in sports_sites:
                    documents = crawl_and_ingest(site, debug=enable_debug)
                    all_documents.extend(documents)

                # Store new documents in session state
                st.session_state.all_documents = all_documents
                st.sidebar.success(f"Data ingested from {len(sports_sites)} sites. Total documents: {len(st.session_state.all_documents)}")

        # Step 2.2: Create vector store from dynamically ingested documents
        st.sidebar.subheader("Create Vector Store from Dynamic Data")
        if st.sidebar.button("Create Vector Store"):
            with st.spinner("Creating vector store from dynamically ingested data..."):
                # Create a vector store if there are any ingested documents
                if st.session_state.all_documents:
                    st.session_state.vectorstore = create_vectorstore(st.session_state.all_documents, debug=enable_debug)
                    st.session_state.retriever = st.session_state.vectorstore.as_retriever()
                    st.sidebar.success("Vector store created successfully and retriever is set up.")
                else:
                    st.sidebar.error("No documents available for vector store creation. Please ingest data first.")

    # Step 4: Input field for user question
    st.subheader("Ask a Sports-Related Question")
    user_question = st.text_input("Enter your question about sports news or events:")

    # Step 5: Generate the answer using the decision mechanism
    if st.button("Get Answer"):
        # Check if 'retriever' is properly initialized
        if "retriever" not in st.session_state or st.session_state.retriever is None:
            st.error("Vector store and retriever not set up. Please load or create the vector store first.")
        elif user_question:
            # Initialize progress bar
            progress_bar = st.progress(0)
            progress_status = st.empty()

            with st.spinner("Generating answer..."):
                # Step 5.1: Use decision mechanism to get an answer
                progress_status.text("Step 1/4: Running Contextual retrieval...")
                answer = decide_and_answer(user_question, st.session_state.retriever, progress_bar, progress_status, enable_debug)

                # Step 5.2: Optional final fact-check if enabled
                if include_fact_check:
                    progress_status.text("Final step: Performing final fact-check...")
                    answer = final_fact_check(user_question, answer, st.session_state.retriever, debug=enable_debug)

                # Mark the run complete whether or not the fact-check ran
                progress_bar.progress(100)

                # Display the final answer
                st.subheader("Answer")
                st.write(answer)
        else:
            st.error("Please enter a question to get an answer.")



Overwriting ../../src/sports_news_rag/app.py
