# Advanced RAG Pipeline with CrewAI: Analyzing Research Papers
 
This notebook demonstrates an advanced Retrieval-Augmented Generation (RAG) pipeline that uses CrewAI to analyze research papers. I am not a scientist, but I love science and want to understand notable newly published papers. The example used here is the influential "Attention Is All You Need" paper.

## Import Necessary Libraries

In [1]:
import os
import warnings
from uuid import uuid4
from dotenv import load_dotenv

from langchain_community.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain_community.vectorstores import Qdrant
from langchain import hub
from langchain.chains import RetrievalQA
from langchain.prompts import ChatPromptTemplate
from langchain.callbacks import StdOutCallbackHandler

from qdrant_client import QdrantClient
from qdrant_client.models import Distance, VectorParams, PointStruct

from crewai import Agent, Task, Crew
from crewai_tools import BaseTool

## Load Environment Variables and Set Up Configurations

In [3]:
load_dotenv()

# Set up API keys and configurations
OPENAI_API_KEY = os.getenv('OPENAI_API_KEY')
QDRANT_API_KEY = os.getenv('QDRANT_API_KEY')
QDRANT_URL = os.getenv('QDRANT_URL')
COLLECTION_NAME = "attention_pdf_paper_1"

# Ensure all necessary environment variables are set
assert all([OPENAI_API_KEY, QDRANT_API_KEY, QDRANT_URL]), "Please set all required environment variables."
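
The `assert` above does not say which variable is missing, and assertions are skipped when Python runs with `-O`. A minimal sketch of a more explicit check follows; the helper names `missing_env_vars` and `require_env_vars` are hypothetical and not part of the notebook.

```python
import os

# Names of the variables the notebook reads from the environment.
REQUIRED_VARS = ("OPENAI_API_KEY", "QDRANT_API_KEY", "QDRANT_URL")


def missing_env_vars(env=None, required=REQUIRED_VARS):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


def require_env_vars(env=None, required=REQUIRED_VARS):
    """Raise a descriptive error listing every missing variable."""
    missing = missing_env_vars(env, required)
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
```

Calling `require_env_vars()` right after `load_dotenv()` would fail early with the names of the missing variables.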

## Document Loading and Processing

In [4]:
def load_and_split_document(file_path: str, chunk_size: int = 512, chunk_overlap: int = 50) -> list:
    """
    Load a PDF document and split it into chunks for processing.
    
    Args:
    file_path (str): Path to the PDF file.
    chunk_size (int): Size of each text chunk.
    chunk_overlap (int): Overlap between chunks.
    
    Returns:
    list: List of document chunks.
    """
    loader = PyPDFLoader(file_path)
    text_splitter = RecursiveCharacterTextSplitter(chunk_size=chunk_size, chunk_overlap=chunk_overlap)
    return loader.load_and_split(text_splitter)

# Load and split the document
document = load_and_split_document("attention.pdf")
print(f"Document loaded and split into {len(document)} chunks.")

Document loaded and split into 92 chunks.
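
To build intuition for `chunk_size` and `chunk_overlap`, here is a simplified fixed-width sketch. It is not how `RecursiveCharacterTextSplitter` works internally (that splitter prefers to break on paragraph, line, and word boundaries); the function name `sliding_window_chunks` is hypothetical.

```python
def sliding_window_chunks(text: str, chunk_size: int = 512, chunk_overlap: int = 50) -> list:
    """Split text into fixed-width character windows that overlap by chunk_overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


# Each chunk repeats the final character of the previous one (overlap of 1).
print(sliding_window_chunks("abcdefghij", chunk_size=4, chunk_overlap=1))
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of storing some text twice.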


## Set Up Qdrant Vector Store

In [5]:
def setup_qdrant_client():
    """
    Set up and return a Qdrant client for vector storage.
    """
    return QdrantClient(url=QDRANT_URL, api_key=QDRANT_API_KEY)

def create_qdrant_collection(client: QdrantClient, collection_name: str, vector_size: int = 1536):
    """
    Create a new collection in Qdrant for storing document vectors.
    
    Args:
    client (QdrantClient): Initialized Qdrant client.
    collection_name (str): Name of the collection to create.
    vector_size (int): Size of the vector embeddings.
    """

    # Use cosine similarity, which compares the direction of vectors rather than their magnitude.
    # This suits text embeddings, where the angle between vectors matters more than their length.
    client.create_collection(
        collection_name=collection_name,
        vectors_config={
            "content": VectorParams(size=vector_size, distance=Distance.COSINE)
        }
    )
    print(f"Collection '{collection_name}' created successfully.")

# Set up Qdrant client and create collection
qdrant_client = setup_qdrant_client()
create_qdrant_collection(qdrant_client, COLLECTION_NAME)

Collection 'attention_pdf_paper_1' created successfully.
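
The reason for choosing `Distance.COSINE` can be illustrated with a small pure-Python sketch. This is only the textbook formula, not Qdrant's implementation, and the vectors are made-up toy values.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


v = [1.0, 2.0, 3.0]
scaled = [2.0, 4.0, 6.0]  # same direction as v, twice the length
# Scaling a vector leaves the score essentially unchanged (up to rounding).
print(cosine_similarity(v, scaled))
```

Because scaling does not change the score, two chunks with similar meaning but different embedding magnitudes still rank as close neighbors.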


## Document Embedding and Storage

In [6]:
def embed_and_store_documents(documents: list, client: QdrantClient, collection_name: str):
    """
    Embed documents and store them in the Qdrant collection.
    
    Args:
    documents (list): List of document chunks to embed and store.
    client (QdrantClient): Initialized Qdrant client.
    collection_name (str): Name of the collection to store embeddings.
    """
    embedding = OpenAIEmbeddings(openai_api_key=OPENAI_API_KEY)
    
    chunked_metadata = []
    for doc in documents:
        point_id = str(uuid4())  # avoid shadowing the built-in id()
        content_vector = embedding.embed_documents([doc.page_content])[0]
        
        metadata = PointStruct(
            id=point_id,
            vector={"content": content_vector},
            payload={
                "page_content": doc.page_content,
                "metadata": {
                    "id": point_id,
                    "source": doc.metadata["source"],
                    "page": doc.metadata["page"],
                }
            }
        )
        chunked_metadata.append(metadata)
    
    client.upsert(collection_name=collection_name, wait=True, points=chunked_metadata)
    print(f"{len(chunked_metadata)} document chunks embedded and stored in Qdrant.")

# Embed and store documents
embed_and_store_documents(document, qdrant_client, COLLECTION_NAME)

92 document chunks embedded and stored in Qdrant.
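
The function above makes one embedding call per chunk. Since `embed_documents` accepts a list of texts, the calls could be grouped instead. The sketch below shows the batching idea with a stand-in embedding function; `batched`, `embed_in_batches`, and the batch size of 32 are illustrative assumptions, not values from the notebook.

```python
def batched(items, batch_size):
    """Yield consecutive slices of items, each with at most batch_size elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def embed_in_batches(texts, embed_fn, batch_size=32):
    """Call embed_fn once per batch and return the vectors in input order."""
    vectors = []
    for batch in batched(texts, batch_size):
        vectors.extend(embed_fn(batch))
    return vectors


# Stand-in for a real embedding call: one tiny vector per text.
def fake_embed(batch):
    return [[float(len(text))] for text in batch]


vectors = embed_in_batches(["chunk %d" % i for i in range(92)], fake_embed)
print(len(vectors))
```

In the notebook, `embedding.embed_documents` would take the place of `fake_embed`, with `[doc.page_content for doc in documents]` as the input list.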


## Set Up RAG Pipeline

In [7]:
def setup_rag_pipeline():
    """
    Set up the Retrieval-Augmented Generation (RAG) pipeline.
    
    Returns:
    RetrievalQA: Initialized RetrievalQA object for question answering.
    """
    embedding = OpenAIEmbeddings(openai_api_key=OPENAI_API_KEY)
    vectorstore = Qdrant(client=qdrant_client,
                         collection_name=COLLECTION_NAME,
                         embeddings=embedding,
                         vector_name="content")
    
    retriever = vectorstore.as_retriever()
    
    template = """
    You are an AI assistant specializing in analyzing research papers.
    Use the following retrieved context to answer the question.
    If you can't answer the question based on the context, say you don't know.
    
    Question: {question}
    Context: {context}
    
    Answer:
    """
    
    prompt = ChatPromptTemplate.from_template(template)
    
    llm4o = ChatOpenAI(openai_api_key=OPENAI_API_KEY,
                     temperature=0.0,
                     model="gpt-4o",
                     max_tokens=512)
    
    return RetrievalQA.from_chain_type(
        llm=llm4o,
        chain_type="stuff",
        retriever=retriever,
        return_source_documents=True,
        chain_type_kwargs={"prompt": prompt}
    )

# Set up RAG pipeline
qa_chain = setup_rag_pipeline()
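
The `"stuff"` chain type places all retrieved chunks into a single prompt. A simplified illustration of that idea, using the same template wording, is sketched below; `build_stuffed_prompt` is a hypothetical helper and the chunk texts are placeholders, not LangChain internals or real retrieval results.

```python
# Same wording as the notebook's template, repeated so the sketch is self-contained.
TEMPLATE = """You are an AI assistant specializing in analyzing research papers.
Use the following retrieved context to answer the question.
If you can't answer the question based on the context, say you don't know.

Question: {question}
Context: {context}

Answer:"""


def build_stuffed_prompt(question, chunks, separator="\n\n"):
    """Join every retrieved chunk into one context string and fill the template."""
    context = separator.join(chunks)
    return TEMPLATE.format(question=question, context=context)


prompt_text = build_stuffed_prompt(
    "What replaces recurrence in the Transformer?",
    ["Chunk A (placeholder text).", "Chunk B (placeholder text)."],
)
print(prompt_text)
```

Since every chunk ends up in one prompt, the number of retrieved chunks times the chunk size has to fit within the model's context window.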


In [12]:
print(qa_chain.invoke({"query": "In a short, direct, jargonless, and structured manner, explain what is the self attention mechanism?"})["result"])

Self-attention, also known as intra-attention, is a mechanism that relates different positions within a single sequence to compute a representation of that sequence. It has been effectively used in tasks like reading comprehension, summarization, and sentence representation.
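
To make "relates different positions within a single sequence" concrete, here is a minimal pure-Python sketch of scaled dot-product self-attention. It is a deliberate simplification: queries, keys, and values are all the raw input vectors, whereas the paper uses learned projections and multiple heads. The toy input is made up.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x):
    """Scaled dot-product self-attention with Q = K = V = x (no learned weights).

    x is a list of n token vectors, each of dimension d.
    Returns (outputs, weights): n output vectors and the n-by-n attention weights.
    """
    d = len(x[0])
    outputs, weights = [], []
    for query in x:
        # Score this position against every position, scaled by sqrt(d).
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in x]
        row = softmax(scores)
        weights.append(row)
        # Each output is a weighted average of all token vectors.
        outputs.append([sum(w * value[i] for w, value in zip(row, x)) for i in range(d)])
    return outputs, weights


outputs, weights = self_attention([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
for row in weights:
    print([round(w, 3) for w in row])
```

Each row of `weights` sums to 1 and describes how much one position draws on every other position when building its new representation.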


This approach can be adapted to analyze various types of research papers or documents, providing both detailed analysis and accessible summaries.

## CrewAI Setup

In [10]:
class ResearchPaperAnalysisTool(BaseTool):
    """Custom tool for analyzing research papers using the RAG pipeline."""
    
    name: str = "Research Paper Analysis"
    description: str = "Analyzes the content of a research paper using a RAG pipeline."

    def _run(self, query: str) -> str:
        """Run the research paper analysis tool."""
        # Use .invoke(); calling the chain directly is deprecated in recent LangChain releases.
        result = qa_chain.invoke({"query": query})
        return result["result"]

# Create an instance of the custom tool
research_tool = ResearchPaperAnalysisTool()

# Define CrewAI agents
researcher = Agent(
    role="Senior Computer Science Researcher",
    goal="Conduct a comprehensive analysis of the provided research paper",
    backstory="""You are a renowned scientist specializing in Machine Learning and Artificial Intelligence.
    Your expertise allows you to dissect complex research papers and extract key insights.""",
    verbose=True,
    allow_delegation=False,
    tools=[research_tool]
)

writer = Agent(
    role="Technical Writer and Educator",
    goal="Create an accessible summary of the research paper for graduate students",
    backstory="""You are an experienced technical writer with a knack for explaining complex concepts in simple engaging terms.
    Your goal is to make cutting-edge research accessible to a wider audience.""",
    verbose=True,
    allow_delegation=False
)

# Define CrewAI tasks
analysis_task = Task(
    description="""Conduct an in-depth analysis of the 'Attention Is All You Need' paper.
    Focus on the key contributions, architectural details, and potential impacts on the field of NLP.""",
    expected_output="""A comprehensive analysis report covering:
    1. Key innovations of the Transformer architecture
    2. Detailed explanation of the self-attention mechanism
    3. Comparison with previous architectures (e.g., RNNs, LSTMs)
    4. Potential implications for future NLP research and applications""",
    agent=researcher
)

summary_task = Task(
    description="""Using the analysis provided by the researcher, create a concise yet comprehensive summary
    of the 'Attention Is All You Need' paper. The summary should be suitable for graduate students in computer science.""",
    expected_output="""A clear, concise summary of the paper, including:
    1. Main idea and significance of the Transformer architecture
    2. Key components of the model (e.g., self-attention, positional encoding)
    3. Advantages over previous approaches
    4. Potential applications and impact on the field
    The summary should be accessible to graduate students with basic knowledge of deep learning.""",
    agent=writer
)

# Set up the crew
crew = Crew(
    agents=[researcher, writer],
    tasks=[analysis_task, summary_task],
    verbose=2
)
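
The crew above lists the analysis task before the summary task, and the summary task's description relies on the researcher's analysis. A plain-Python sketch of that hand-off pattern follows. It does not use CrewAI and is not its implementation; `run_sequential` and the two handler functions are hypothetical stand-ins for agents.

```python
def run_sequential(tasks):
    """Run (description, handler) pairs in order, passing each output to the next.

    Each handler takes (description, context) and returns a string.
    """
    context = ""
    outputs = []
    for description, handler in tasks:
        output = handler(description, context)
        outputs.append(output)
        context = output  # the next task sees the previous result
    return outputs


def analyst(description, context):
    # Stand-in for the researcher agent.
    return "analysis of " + description


def summarizer(description, context):
    # Stand-in for the writer agent, which builds on the analysis.
    return "summary based on [" + context + "]"


results = run_sequential([("the paper", analyst), ("the summary", summarizer)])
print(results[-1])
```

The ordering matters: swapping the two tasks would leave the writer with no analysis to summarize.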

## Run CrewAI Analysis

In [11]:
result = crew.kickoff()

print("CrewAI Analysis Result:")
print(result)

[1m[95m [2024-07-31 10:12:50][DEBUG]: == Working Agent: Senior Computer Science Researcher[00m
[1m[95m [2024-07-31 10:12:50][INFO]: == Starting Task: Conduct an in-depth analysis of the 'Attention Is All You Need' paper.
    Focus on the key contributions, architectural details, and potential impacts on the field of NLP.[00m


[1m> Entering new CrewAgentExecutor chain...[0m
[32;1m[1;3mTo conduct a comprehensive analysis of the 'Attention Is All You Need' paper, I will utilize the Research Paper Analysis tool to extract detailed insights on the specified criteria. The first step is to obtain an analysis of the key contributions and innovations of the Transformer architecture.

Action: Research Paper Analysis
Action Input: {"query": "Key innovations of the Transformer architecture in 'Attention Is All You Need' paper"}[0m

  warn_deprecated(


[95m 

The key innovations of the Transformer architecture in the "Attention Is All You Need" paper include:

1. **Attention Mechanism**: The Transformer is the first sequence transduction model based entirely on attention mechanisms, replacing the recurrent layers commonly used in encoder-decoder architectures with multi-headed self-attention.

2. **Multi-Headed Self-Attention**: This allows the model to focus on different parts of the input sequence simultaneously, capturing various aspects of the data.

3. **Elimination of Recurrence**: By eschewing recurrence, the Transformer can model dependencies without regard to their distance in the input or output sequences, which is a significant departure from traditional recurrent neural networks (RNNs).

4. **Stacked Self-Attention and Fully Connected Layers**: Both the encoder and decoder in the Transformer architecture use stacked self-attention and point-wise, fully connected layers, enhancing the model's ability to process sequences 


## Conclusion

This notebook demonstrates an advanced RAG pipeline using CrewAI to analyze research papers. The pipeline includes:

1. Loading and processing PDF documents
2. Embedding and storing document chunks in a Qdrant vector database
3. Setting up a RAG pipeline with OpenAI's GPT-4o
4. Using CrewAI to orchestrate a detailed analysis and summary of the paper