# Augmented Self-RAG for interview preparation

In the first stage of AutoInterview, we aim to build a system that helps the candidate craft strong responses to interview questions.

For this, we built a RAG system that answers each interview question using the most appropriate documents from the candidate's database.

The candidate's database consists of several documents, for example:
* CV
* Cover Letter
* Description of personal projects
* Publications

To create a strong baseline for our interview generating system, we combined self-reflection with multi-query retrieval.
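As a rough illustration of multi-query retrieval, the sketch below runs several rephrasings of a question and merges the hits into one de-duplicated list. The function names, the keyword-based `toy_search`, and the three-line corpus (paraphrased from the sample outputs later in this notebook) are hypothetical stand-ins for the real vector store, not the code in `pipelines`.

```python
from typing import Callable, Iterable, List


def multi_query_retrieve(
    queries: Iterable[str],
    search: Callable[[str], List[str]],
) -> List[str]:
    """Run every query variant and return the de-duplicated union of hits."""
    seen = set()
    merged: List[str] = []
    for query in queries:
        for chunk in search(query):
            if chunk not in seen:  # keep the first occurrence, preserve order
                seen.add(chunk)
                merged.append(chunk)
    return merged


# Toy corpus and keyword search standing in for the vector store (hypothetical).
CORPUS = [
    "Led the Brain-Machine Interface development team",
    "PhD in machine learning at KU Leuven",
    "Developed CNN models during the PhD",
]


def toy_search(query: str) -> List[str]:
    words = query.lower().split()
    return [doc for doc in CORPUS if any(w in doc.lower() for w in words)]


variants = ["phd research", "team leadership", "cnn models"]
results = multi_query_retrieve(variants, toy_search)
print(results)
```

Each variant surfaces chunks the others may miss, and the merge step keeps a chunk only once even when several variants return it.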

Self-reflection can enhance RAG by detecting and correcting poor-quality retrievals or generations.
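The retrieval half of that idea can be sketched as a plain loop: grade what was retrieved, and if nothing relevant survives, rewrite the query and try again. This is a simplified sketch under assumed names (`self_reflective_answer` and the `toy_*` stubs are hypothetical); in the actual system the grading and rewriting steps are LLM calls.

```python
from typing import Callable, List


def self_reflective_answer(
    question: str,
    retrieve: Callable[[str], List[str]],
    is_relevant: Callable[[str, str], bool],
    rewrite: Callable[[str], str],
    generate: Callable[[str, List[str]], str],
    max_rounds: int = 3,
) -> str:
    """Retrieve, grade, and either generate or rewrite the query and retry."""
    query = question
    for _ in range(max_rounds):
        docs = [d for d in retrieve(query) if is_relevant(question, d)]
        if docs:
            return generate(question, docs)
        query = rewrite(query)  # no relevant document: reformulate and retry
    return ""  # give up after max_rounds without relevant context


# Toy stand-ins for the retriever and the LLM-based steps (hypothetical).
def toy_retrieve(query: str) -> List[str]:
    return ["Quick learner; led a team"] if "skills" in query else ["Lives in Leuven"]


def toy_is_relevant(question: str, doc: str) -> bool:
    return "learner" in doc


def toy_rewrite(query: str) -> str:
    return query + " skills"


def toy_generate(question: str, docs: List[str]) -> str:
    return "Based on: " + "; ".join(docs)


answer = self_reflective_answer(
    "What are your biggest strengths?",
    toy_retrieve, toy_is_relevant, toy_rewrite, toy_generate,
)
print(answer)
```

The `max_rounds` cap is a design choice of this sketch: it keeps the loop from running indefinitely when no rewrite ever yields a relevant document.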

We implemented self-reflective RAG using `HuggingFace`, `Mistral`, and `LangGraph`.

We focus on ideas from one paper, [Self-RAG](https://arxiv.org/abs/2310.11511).

This can run fully locally (e.g., on a laptop).

![image.png](attachment:a0576fff-c4b8-4323-b342-c649535a2898.png)


## Set-up

In [1]:
from dotenv import load_dotenv
import os

load_dotenv()

langchain_api_key = os.getenv('LANGCHAIN_API_KEY', 'YourAPIKey')

In [2]:
os.environ['LANGCHAIN_TRACING_V2'] = 'true'
os.environ['LANGCHAIN_ENDPOINT'] = 'https://api.smith.langchain.com'
os.environ['LANGCHAIN_API_KEY'] = langchain_api_key  # reuse the key loaded in the previous cell
os.environ['LANGCHAIN_PROJECT'] = "AutoInterview"

In [3]:
# Ollama model name
local_llm = "mistral:instruct"

## Index (HF embeddings)

In [4]:
from langchain_community.document_loaders import PyPDFDirectoryLoader
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import Chroma
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Load
profile_dir = "Profiles/Jesus"
loader = PyPDFDirectoryLoader(profile_dir)
docs = loader.load()

# Split into token-based chunks
text_splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
    chunk_size=500,
    chunk_overlap=100,
)

# Make splits
splits = text_splitter.split_documents(docs)

# Define the path to the pre-trained model you want to use
modelPath = "sentence-transformers/all-mpnet-base-v2"

# Create a dictionary with model configuration options, specifying to use the CPU for computations
model_kwargs = {'device':'cpu'}

# Create a dictionary with encoding options, specifically setting 'normalize_embeddings' to False
encode_kwargs = {'normalize_embeddings': False}

# Initialize an instance of HuggingFaceEmbeddings with the specified parameters
embeddings = HuggingFaceEmbeddings(
    model_name=modelPath,     # Provide the pre-trained model's path
    model_kwargs=model_kwargs, # Pass the model configuration options
    encode_kwargs=encode_kwargs # Pass the encoding options
)

# Index
vectorstore = Chroma.from_documents(
    documents=splits,
    persist_directory="db/chroma/RAG",
    collection_name="system_test",
    embedding=embeddings,
)
retriever = vectorstore.as_retriever()
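To see what `chunk_size=500` and `chunk_overlap=100` imply, the sketch below uses a fixed sliding window over a token list. This is a deliberate simplification: `RecursiveCharacterTextSplitter` splits on separators first and then merges pieces, so its real chunk boundaries will differ. The helper name `sliding_chunks` and the synthetic tokens are hypothetical.

```python
from typing import List, Sequence


def sliding_chunks(
    tokens: Sequence[str], chunk_size: int, chunk_overlap: int
) -> List[List[str]]:
    """Cut tokens into windows of chunk_size that share chunk_overlap tokens."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks: List[List[str]] = []
    for start in range(0, len(tokens), step):
        chunks.append(list(tokens[start:start + chunk_size]))
        if start + chunk_size >= len(tokens):
            break  # the last window already reached the end
    return chunks


# Synthetic 1200-token document (illustrative only).
tokens = [f"t{i}" for i in range(1200)]
chunks = sliding_chunks(tokens, chunk_size=500, chunk_overlap=100)
print([len(c) for c in chunks])
```

With these settings each window advances by 400 tokens, so consecutive chunks share 100 tokens of context, which reduces the chance that a sentence relevant to a question is cut in half at a chunk boundary.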



## Build Graph

The graph follows the flow outlined in the figure above.

In [6]:
import pprint

from langgraph.graph import END, StateGraph
from pipelines import (
    retrieve, grade_documents, transform_query, generate, prepare_for_final_grade,
    decide_to_generate, grade_generation_v_documents, grade_generation_v_question,
    GraphState,
)

workflow = StateGraph(GraphState)

# Define the nodes
workflow.add_node("retrieve", retrieve)  # retrieve
workflow.add_node("grade_documents", grade_documents)  # grade documents
workflow.add_node("generate", generate)  # generate
workflow.add_node("transform_query", transform_query)  # transform_query
workflow.add_node("prepare_for_final_grade", prepare_for_final_grade)  # passthrough

# Build graph
workflow.set_entry_point("retrieve")
workflow.add_edge("retrieve", "grade_documents")
workflow.add_conditional_edges(
    "grade_documents",
    decide_to_generate,
    {
        "transform_query": "transform_query",
        "generate": "generate",
    },
)
workflow.add_edge("transform_query", "retrieve")
workflow.add_conditional_edges(
    "generate",
    grade_generation_v_documents,
    {
        "supported": "prepare_for_final_grade",
        "not supported": "generate",
    },
)
workflow.add_conditional_edges(
    "prepare_for_final_grade",
    grade_generation_v_question,
    {
        "useful": END,
        "not useful": "transform_query",
    },
)

# Compile
app = workflow.compile()
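The `pipelines` module is not shown in this notebook, so the following is only a guess at the shape of two of its pieces. The single `keys` dictionary matches the inputs and outputs used in the run cells, and the return labels match the conditional-edge mapping above; the `documents` key and the `_sketch`/`Sketch` names are assumptions made for illustration.

```python
from typing import Any, Dict, TypedDict


class SketchState(TypedDict):
    """Hypothetical state shape: one 'keys' dict shared by every node."""
    keys: Dict[str, Any]


def decide_to_generate_sketch(state: SketchState) -> str:
    """Pick the next node after document grading.

    Returns "generate" if at least one graded document survived,
    otherwise "transform_query" so the question is rewritten first.
    """
    filtered_docs = state["keys"].get("documents", [])  # assumed key name
    return "generate" if filtered_docs else "transform_query"


print(decide_to_generate_sketch({"keys": {"documents": ["CV chunk"]}}))
print(decide_to_generate_sketch({"keys": {"documents": []}}))
```

A function passed to `add_conditional_edges` only has to return one of the labels in the mapping; LangGraph then follows the edge to the node registered under that label.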

## Run

In [7]:
question = "What are your biggest strengths?"

In [8]:
# Run
inputs = {"keys": {"question": question, "model": local_llm, "retriever": retriever}}
for output in app.stream(inputs):
    for key, value in output.items():
        # Node
        pprint.pprint(f"Node '{key}':")
        # Optional: print full state at each node
        # pprint.pprint(value["keys"], indent=2, width=80, depth=None)
    pprint.pprint("\n---\n")

# Final generation
pprint.pprint(value['keys']['generation'])

---RETRIEVE---


"Node 'retrieve':"
'\n---\n'
---CHECK RELEVANCE---
---GRADE: DOCUMENT NOT RELEVANT---
---GRADE: DOCUMENT NOT RELEVANT---
---GRADE: DOCUMENT RELEVANT---
---GRADE: DOCUMENT NOT RELEVANT---
---GRADE: DOCUMENT NOT RELEVANT---
---GRADE: DOCUMENT NOT RELEVANT---
"Node 'grade_documents':"
'\n---\n'
---DECIDE TO GENERATE---
---DECISION: GENERATE---
---GENERATE---
"Node 'generate':"
'\n---\n'
---GRADE GENERATION vs DOCUMENTS---
---DECISION: SUPPORTED, MOVE TO FINAL GRADE---
---FINAL GRADE---
"Node 'prepare_for_final_grade':"
'\n---\n'
---GRADE GENERATION vs QUESTION---
---DECISION: USEFUL---
"Node '__end__':"
'\n---\n'
(" I'm glad you asked about my biggest strengths. As a quick learner, I've "
 'always been passionate about absorbing new knowledge and applying it to '
 'real-world situations. During my PhD, I took the initiative to develop CNN '
 'models for precise neuron responses predictions, leading the Brain-Machine '
 'Interface development team. This experience allowed me to seamlessly 

In [9]:
question = "Tell me about yourself"
# Run
inputs = {"keys": {"question": question, "model": local_llm, "retriever": retriever}}
for output in app.stream(inputs):
    for key, value in output.items():
        # Node
        pprint.pprint(f"Node '{key}':")
        # Optional: print full state at each node
        # pprint.pprint(value["keys"], indent=2, width=80, depth=None)
    pprint.pprint("\n---\n")

# Final generation
pprint.pprint(value['keys']['generation'])

---RETRIEVE---
"Node 'retrieve':"
'\n---\n'
---CHECK RELEVANCE---
---GRADE: DOCUMENT RELEVANT---
---GRADE: DOCUMENT NOT RELEVANT---
---GRADE: DOCUMENT NOT RELEVANT---
---GRADE: DOCUMENT NOT RELEVANT---
---GRADE: DOCUMENT RELEVANT---
---GRADE: DOCUMENT NOT RELEVANT---
"Node 'grade_documents':"
'\n---\n'
---DECIDE TO GENERATE---
---DECISION: GENERATE---
---GENERATE---
"Node 'generate':"
'\n---\n'
---GRADE GENERATION vs DOCUMENTS---
---DECISION: SUPPORTED, MOVE TO FINAL GRADE---
---FINAL GRADE---
"Node 'prepare_for_final_grade':"
'\n---\n'
---GRADE GENERATION vs QUESTION---
---DECISION: USEFUL---
"Node '__end__':"
'\n---\n'
(" I'm Jesus Garcia Ramirez, currently based in Leuven, Belgium. I hold a "
 'B.Sc. in Industrial Engineering with a major in Systems Control from the '
 'University of Seville, and I recently completed my PhD at KU Leuven. During '
 'my time at KU Leuven, I specialized in machine learning and developed a '
 'strong foundation in deep learning using Python, Pytorch, Pa
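Both run cells above repeat the same streaming loop, so it could be wrapped in a small helper. The sketch below assumes only what the cells already use: an object with a `stream(inputs)` method that yields `{node_name: state}` dictionaries, with the final state holding `keys["generation"]`. `run_question` and `FakeApp` are hypothetical names; `FakeApp` exists only so the sketch runs without the compiled graph.

```python
import pprint
from typing import Any, Dict


def run_question(app: Any, question: str, model: str, retriever: Any) -> str:
    """Stream the graph for one question and return the final generation."""
    inputs = {"keys": {"question": question, "model": model, "retriever": retriever}}
    final_state: Dict[str, Any] = {}
    for output in app.stream(inputs):
        for node_name, state in output.items():
            pprint.pprint(f"Node '{node_name}':")
            final_state = state  # remember the most recent node state
        pprint.pprint("\n---\n")
    return final_state["keys"]["generation"]


class FakeApp:
    """Stand-in for the compiled graph (hypothetical, for illustration)."""

    def stream(self, inputs: Dict[str, Any]):
        yield {"retrieve": {"keys": {"question": inputs["keys"]["question"]}}}
        yield {"__end__": {"keys": {"generation": "stub answer"}}}


result = run_question(FakeApp(), "Tell me about yourself", "mistral:instruct", None)
print(result)
```

In the notebook, the same call would take the compiled `app`, `local_llm`, and `retriever` in place of the stand-ins.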

In [None]:
# Delete the collection
vectorstore.delete_collection()